Sample records for small feature sets

  1. A fuzzy-based data transformation for feature extraction to increase classification performance with small medical data sets.

    PubMed

    Li, Der-Chiang; Liu, Chiao-Wen; Hu, Susan C

    2011-05-01

    Medical data sets are usually small and have very high dimensionality. Too many attributes will make the analysis less efficient and will not necessarily increase accuracy, while too few data will decrease the modeling stability. Consequently, the main objective of this study is to extract the optimal subset of features to increase analytical performance when the data set is small. This paper proposes a fuzzy-based non-linear transformation method to extend classification related information from the original data attribute values for a small data set. Based on the new transformed data set, this study applies principal component analysis (PCA) to extract the optimal subset of features. Finally, we use the transformed data with these optimal features as the input data for a learning tool, a support vector machine (SVM). Six medical data sets: Pima Indians' diabetes, Wisconsin diagnostic breast cancer, Parkinson disease, echocardiogram, BUPA liver disorders dataset, and bladder cancer cases in Taiwan, are employed to illustrate the approach presented in this paper. This research uses the t-test to evaluate the classification accuracy for a single data set; and uses the Friedman test to show the proposed method is better than other methods over the multiple data sets. The experiment results indicate that the proposed method has better classification performance than either PCA or kernel principal component analysis (KPCA) when the data set is small, and suggest creating new purpose-related information to improve the analysis performance. This paper has shown that feature extraction is important as a function of feature selection for efficient data analysis. When the data set is small, using the fuzzy-based transformation method presented in this work to increase the information available produces better results than the PCA and KPCA approaches. Copyright © 2011 Elsevier B.V. All rights reserved.
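
    The pipeline outlined above (nonlinear transformation, then PCA feature extraction, then SVM classification) can be sketched in outline. The snippet below is a minimal illustration, assuming scikit-learn and using the bundled Wisconsin breast-cancer data as a stand-in for the paper's data sets; the fuzzy transformation step itself is not reproduced, since the abstract does not specify it in detail.

    ```python
    # Minimal sketch of the PCA -> SVM stage of the pipeline; the paper's
    # fuzzy transformation step is omitted. scikit-learn is assumed.
    from sklearn.datasets import load_breast_cancer
    from sklearn.decomposition import PCA
    from sklearn.model_selection import cross_val_score, train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X, y = load_breast_cancer(return_X_y=True)

    # Simulate the small-sample regime with a stratified 60-sample subset.
    X_small, _, y_small, _ = train_test_split(X, y, train_size=60,
                                              stratify=y, random_state=0)

    pipe = make_pipeline(StandardScaler(), PCA(n_components=5), SVC(kernel="rbf"))
    print("mean 5-fold CV accuracy:",
          cross_val_score(pipe, X_small, y_small, cv=5).mean())
    ```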

  2. The Model-Based Study of the Effectiveness of Reporting Lists of Small Feature Sets Using RNA-Seq Data.

    PubMed

    Kim, Eunji; Ivanov, Ivan; Hua, Jianping; Lampe, Johanna W; Hullar, Meredith Aj; Chapkin, Robert S; Dougherty, Edward R

    2017-01-01

    Ranking feature sets for phenotype classification based on gene expression is a challenging issue in cancer bioinformatics. When the number of samples is small, all feature selection algorithms are known to be unreliable, producing significant error, and error estimators suffer from different degrees of imprecision. The problem is compounded by the fact that the accuracy of classification depends on the manner in which the phenomena are transformed into data by the measurement technology. Because next-generation sequencing technologies amount to a nonlinear transformation of the actual gene or RNA concentrations, they can potentially produce less discriminative data relative to the actual gene expression levels. In this study, we compare the performance of ranking feature sets derived from a model of RNA-Seq data with that of a multivariate normal model of gene concentrations using 3 measures: (1) ranking power, (2) length of extensions, and (3) Bayes features. This model-based study examines the effectiveness of reporting lists of small feature sets using RNA-Seq data and the effects of different model parameters and error estimators. The results demonstrate that the general trends of the parameter effects on the ranking power of the underlying gene concentrations are preserved in the RNA-Seq data, whereas the power of finding a good feature set becomes weaker when gene concentrations are transformed by the sequencing machine.

  3. Human immunophenotyping via low-variance, low-bias, interpretive regression modeling of small, wide data sets: Application to aging and immune response to influenza vaccination.

    PubMed

    Holmes, Tyson H; He, Xiao-Song

    2016-10-01

    Small, wide data sets are commonplace in human immunophenotyping research. As defined here, a small, wide data set is constructed by sampling a small to modest quantity n, 1 < n < 50, of human participants for the purpose of estimating many parameters p, such that n < p < 1,000.

  4. Human Immunophenotyping via Low-Variance, Low-Bias, Interpretive Regression Modeling of Small, Wide Data Sets: Application to Aging and Immune Response to Influenza Vaccination

    PubMed Central

    Holmes, Tyson H.; He, Xiao-Song

    2016-01-01

    Small, wide data sets are commonplace in human immunophenotyping research. As defined here, a small, wide data set is constructed by sampling a small to modest quantity n, 1 < n < 50, of human participants for the purpose of estimating many parameters p, such that n < p < 1,000. We offer a set of prescriptions that are designed to facilitate low-variance (i.e. stable), low-bias, interpretive regression modeling of small, wide data sets. These prescriptions are distinctive in their especially heavy emphasis on minimizing use of out-of-sample information for conducting statistical inference. That allows the working immunologist to proceed without being encumbered by imposed and often untestable statistical assumptions. Problems of unmeasured confounders, confidence-interval coverage, feature selection, and shrinkage/denoising are defined clearly and treated in detail. We propose an extension of an existing nonparametric technique for improved small-sample confidence-interval tail coverage from the univariate case (single immune feature) to the multivariate (many, possibly correlated immune features). An important role for derived features in the immunological interpretation of regression analyses is stressed. Areas of further research are discussed. Presented principles and methods are illustrated through application to a small, wide data set of adults spanning a wide range in ages and multiple immunophenotypes that were assayed before and after immunization with inactivated influenza vaccine (IIV). Our regression modeling prescriptions identify some potentially important topics for future immunological research. 1) Immunologists may wish to distinguish age-related differences in immune features from changes in immune features caused by aging. 2) A form of the bootstrap that employs linear extrapolation may prove to be an invaluable analytic tool because it allows the working immunologist to obtain accurate estimates of the stability of immune parameter estimates with a bare minimum of imposed assumptions. 3) Liberal inclusion of immune features in phenotyping panels can facilitate accurate separation of biological signal of interest from noise. In addition, through a combination of denoising and potentially improved confidence interval coverage, we identify some candidate immune correlates (frequency of cell subset and concentration of cytokine) with B cell response as measured by quantity of IIV-specific IgA antibody-secreting cells and quantity of IIV-specific IgG antibody-secreting cells. PMID:27196789

  5. Using listener-based perceptual features as intermediate representations in music information retrieval.

    PubMed

    Friberg, Anders; Schoonderwaldt, Erwin; Hedblad, Anton; Fabiani, Marco; Elowsson, Anders

    2014-10-01

    The notion of perceptual features is introduced for describing general music properties based on human perception. This is an attempt at rethinking the concept of features, aiming to approach the underlying human perception mechanisms. Instead of using concepts from music theory such as tones, pitches, and chords, a set of nine features describing overall properties of the music was selected. They were chosen from qualitative measures used in psychology studies and motivated from an ecological approach. The perceptual features were rated in two listening experiments using two different data sets. They were modeled both from symbolic and audio data using different sets of computational features. Ratings of emotional expression were predicted using the perceptual features. The results indicate that (1) at least some of the perceptual features are reliable estimates; (2) emotion ratings could be predicted by a small combination of perceptual features with an explained variance from 75% to 93% for the emotional dimensions activity and valence; (3) the perceptual features could only to a limited extent be modeled using existing audio features. Results clearly indicated that a small number of dedicated features were superior to a "brute force" model using a large number of general audio features.
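
    The regression step described above (predicting emotion ratings from a small number of perceptual features and reporting explained variance) is easy to illustrate. The sketch below uses synthetic placeholder data rather than the study's listener ratings, and assumes scikit-learn; the nine-feature setup mirrors the abstract only in spirit.

    ```python
    # Minimal sketch: predict an emotion rating (e.g., valence) from a small
    # set of perceptual-feature ratings and report explained variance (R^2).
    # The data are synthetic placeholders, not the study's listener ratings.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import r2_score

    rng = np.random.default_rng(0)
    n_excerpts, n_features = 100, 9          # nine perceptual features, as in the study
    X = rng.normal(size=(n_excerpts, n_features))
    valence = X[:, :3] @ np.array([0.6, -0.4, 0.3]) + 0.1 * rng.normal(size=n_excerpts)

    model = LinearRegression().fit(X, valence)
    print("explained variance (R^2):", r2_score(valence, model.predict(X)))
    ```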

  6. Model-Based Learning of Local Image Features for Unsupervised Texture Segmentation

    NASA Astrophysics Data System (ADS)

    Kiechle, Martin; Storath, Martin; Weinmann, Andreas; Kleinsteuber, Martin

    2018-04-01

    Features that capture well the textural patterns of a certain class of images are crucial for the performance of texture segmentation methods. The manual selection of features or designing new ones can be a tedious task. Therefore, it is desirable to automatically adapt the features to a certain image or class of images. Typically, this requires a large set of training images with similar textures and ground truth segmentation. In this work, we propose a framework to learn features for texture segmentation when no such training data is available. The cost function for our learning process is constructed to match a commonly used segmentation model, the piecewise constant Mumford-Shah model. This means that the features are learned such that they provide an approximately piecewise constant feature image with a small jump set. Based on this idea, we develop a two-stage algorithm which first learns suitable convolutional features and then performs a segmentation. We note that the features can be learned from a small set of images, from a single image, or even from image patches. The proposed method achieves a competitive rank in the Prague texture segmentation benchmark, and it is effective for segmenting histological images.

  7. Detection of explosive cough events in audio recordings by internal sound analysis.

    PubMed

    Rocha, B M; Mendes, L; Couceiro, R; Henriques, J; Carvalho, P; Paiva, R P

    2017-07-01

    We present a new method for the discrimination of explosive cough events, which is based on a combination of spectral content descriptors and pitch-related features. After the removal of near-silent segments, a vector of event boundaries is obtained and a proposed set of 9 features is extracted for each event. Two data sets, recorded using electronic stethoscopes and comprising a total of 46 healthy subjects and 13 patients, were employed to evaluate the method. The proposed feature set is compared to three other sets of descriptors: a baseline, a combination of both sets, and an automatic selection of the best 10 features from both sets. The combined feature set yields good results on the cross-validated database, attaining a sensitivity of 92.3±2.3% and a specificity of 84.7±3.3%. Besides, this feature set seems to generalize well when it is trained on a small data set of patients, with a variety of respiratory and cardiovascular diseases, and tested on a bigger data set of mostly healthy subjects: a sensitivity of 93.4% and a specificity of 83.4% are achieved in those conditions. These results demonstrate that complementing the proposed feature set with a baseline set is a promising approach.

  8. Developing a radiomics framework for classifying non-small cell lung carcinoma subtypes

    NASA Astrophysics Data System (ADS)

    Yu, Dongdong; Zang, Yali; Dong, Di; Zhou, Mu; Gevaert, Olivier; Fang, Mengjie; Shi, Jingyun; Tian, Jie

    2017-03-01

    Patient-targeted treatment of non-small cell lung carcinoma (NSCLC) has been well documented according to the histologic subtypes over the past decade. In parallel, the recent development of quantitative image biomarkers has been highlighted as providing important diagnostic tools to facilitate histological subtype classification. In this study, we present a radiomics analysis that classifies adenocarcinoma (ADC) and squamous cell carcinoma (SqCC). We extract 52-dimensional, CT-based features (7 statistical features and 45 image texture features) to represent each nodule. We evaluate our approach on a clinical dataset including 324 ADC and 110 SqCC patients with CT image scans. Classification of these features is performed with four different machine-learning classifiers including Support Vector Machines with Radial Basis Function kernel (RBF-SVM), Random Forest (RF), K-nearest neighbor (KNN), and RUSBoost algorithms. To improve the classifiers' performance, an optimal feature subset is selected from the original feature set by using an iterative forward inclusion and backward elimination algorithm. Extensive experimental results demonstrate that radiomics features achieve encouraging classification results on both the complete feature set (AUC=0.89) and the optimal feature subset (AUC=0.91).
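
    The feature-subset search described above (iterative forward inclusion with evaluation by AUC) can be approximated with off-the-shelf tools. The sketch below is a hedged illustration using scikit-learn's SequentialFeatureSelector as a stand-in for the paper's exact algorithm; the data are synthetic placeholders, not the clinical CT features.

    ```python
    # Sketch of forward feature selection followed by AUC evaluation, in the
    # spirit of the iterative inclusion/elimination search described above.
    # Data and settings are illustrative placeholders.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_selection import SequentialFeatureSelector
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=434, n_features=52, n_informative=10,
                               random_state=0)
    rf = RandomForestClassifier(n_estimators=100, random_state=0)

    # Greedy forward search for a small, discriminative feature subset.
    selector = SequentialFeatureSelector(rf, n_features_to_select=5,
                                         direction="forward",
                                         scoring="roc_auc", cv=5)
    X_subset = selector.fit_transform(X, y)

    auc_full = cross_val_score(rf, X, y, scoring="roc_auc", cv=5).mean()
    auc_subset = cross_val_score(rf, X_subset, y, scoring="roc_auc", cv=5).mean()
    print("AUC, full feature set:", auc_full)
    print("AUC, selected subset:", auc_subset)
    ```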

  9. A new computer approach to mixed feature classification for forestry application

    NASA Technical Reports Server (NTRS)

    Kan, E. P.

    1976-01-01

    A computer approach for mapping mixed forest features (i.e., types, classes) from computer classification maps is discussed. Mixed features such as mixed softwood/hardwood stands are treated as admixtures of softwood and hardwood areas. Large-area mixed features are identified and small-area features neglected when the nominal size of a mixed feature can be specified. The computer program merges small isolated areas into surrounding areas by the iterative manipulation of the postprocessing algorithm that eliminates small connected sets. For a forestry application, computer-classified LANDSAT multispectral scanner data of the Sam Houston National Forest were used to demonstrate the proposed approach. The technique was successful in cleaning the salt-and-pepper appearance of multiclass classification maps and in mapping admixtures of softwood areas and hardwood areas. However, the computer-mapped mixed areas matched very poorly with the ground truth because of inadequate resolution and inappropriate definition of mixed features.

  10. Testing of Haar-Like Feature in Region of Interest Detection for Automated Target Recognition (ATR) System

    NASA Technical Reports Server (NTRS)

    Zhang, Yuhan; Lu, Dr. Thomas

    2010-01-01

    The objectives of this project were to develop an ROI (Region of Interest) detector using Haar-like features, similar to the face detection in Intel's OpenCV library, to implement it in Matlab code, and to test the performance of the new ROI detector against the existing ROI detector, which uses an Optimal Trade-off Maximum Average Correlation Height (OTMACH) filter. The ROI detector comprised three parts: (1) automated Haar-like feature selection, which finds a small set of the most relevant Haar-like features for detecting ROIs that contain a target; (2) given the small set of Haar-like features from the previous step, training a neural network to recognize ROIs with targets by taking the Haar-like features as inputs; (3) using the trained neural network, developing a filtering method that processes the neural network responses into a small set of regions of interest. All three parts were coded in Matlab. The parameters in the detector were trained by machine learning and tested on specific datasets. Since the OpenCV library and Haar-like features were not available in Matlab, the Haar-like feature calculation had to be implemented in Matlab. Code for Adaptive Boosting and max/min filters in Matlab could be found on the Internet but had to be integrated to serve the purpose of this project. The performance of the new detector was tested by comparing its accuracy and speed against the existing OTMACH detector: speed referred to the average speed of finding the regions of interest in an image, and accuracy was measured by the number of false positives (false alarms) at the same detection rate for the two detectors.
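
    A central ingredient above is the Haar-like feature computed from an integral image. The sketch below shows, in Python rather than Matlab, a minimal two-rectangle Haar-like feature of the kind re-implemented in the project; the image and window coordinates are illustrative assumptions.

    ```python
    # Minimal sketch of a two-rectangle Haar-like feature computed with an
    # integral image; NumPy only, values are illustrative.
    import numpy as np

    def integral_image(img):
        """Cumulative sum over rows and columns, so any rectangle sum is O(1)."""
        return img.cumsum(axis=0).cumsum(axis=1)

    def rect_sum(ii, top, left, height, width):
        """Sum of pixels in a rectangle using four integral-image lookups."""
        total = ii[top + height - 1, left + width - 1]
        if top > 0:
            total -= ii[top - 1, left + width - 1]
        if left > 0:
            total -= ii[top + height - 1, left - 1]
        if top > 0 and left > 0:
            total += ii[top - 1, left - 1]
        return total

    def haar_two_rect_vertical(ii, top, left, height, width):
        """Left-half minus right-half response (an edge-like Haar feature)."""
        half = width // 2
        return (rect_sum(ii, top, left, height, half)
                - rect_sum(ii, top, left + half, height, half))

    img = np.random.default_rng(0).random((64, 64))
    ii = integral_image(img)
    print(haar_two_rect_vertical(ii, top=10, left=10, height=24, width=24))
    ```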

  11. Textural features for radar image analysis

    NASA Technical Reports Server (NTRS)

    Shanmugan, K. S.; Narayanan, V.; Frost, V. S.; Stiles, J. A.; Holtzman, J. C.

    1981-01-01

    Texture is seen as an important spatial feature useful for identifying objects or regions of interest in an image. While textural features have been widely used in analyzing a variety of photographic images, they have not been used in processing radar images. A procedure for extracting a set of textural features for characterizing small areas in radar images is presented, and it is shown that these features can be used in classifying segments of radar images corresponding to different geological formations.

  12. "Learning to Work" in Small Businesses: Learning and Training for Young Adults with Learning Disabilities

    ERIC Educational Resources Information Center

    Ruggeri-Stevens, Geoff; Goodwin, Susan

    2007-01-01

    Purpose: The paper alerts small business employers to new dictates of the Disability Discrimination Act (2005) as it applies to learning disabilities. Then the "Learning to Work" project featured in the paper offers small business employers a set of approaches and methods for the identification of a learning-disabled young adult…

  13. FuzzObserver

    NASA Technical Reports Server (NTRS)

    Howard, Ayanna; Bayard, David

    2006-01-01

    Fuzzy Feature Observation Planner for Small Body Proximity Observations (FuzzObserver) is a developmental computer program, to be used along with other software, for autonomous planning of maneuvers of a spacecraft near an asteroid, comet, or other small astronomical body. Selection of terrain features and estimation of the position of the spacecraft relative to these features is an essential part of such planning. FuzzObserver contributes to the selection and estimation by generating recommendations for spacecraft trajectory adjustments to maintain the spacecraft's ability to observe sufficient terrain features for estimating position. The input to FuzzObserver consists of data from terrain images, including sets of data on features acquired during descent toward, or traversal of, a body of interest. The name of this program reflects its use of fuzzy logic to reason about the terrain features represented by the data and extract corresponding trajectory-adjustment rules. Linguistic fuzzy sets and conditional statements enable fuzzy systems to make decisions based on heuristic rule-based knowledge derived by engineering experts. A major advantage of using fuzzy logic is that it involves simple arithmetic calculations that can be performed rapidly enough to be useful for planning within the short times typically available for spacecraft maneuvers.

  14. Roles and Responsibilities in Feature Teams

    NASA Astrophysics Data System (ADS)

    Eckstein, Jutta

    Agile development requires self-organizing teams. The set-up of a (feature) team has to enable self-organization. Special care has to be taken if the project is not only distributed, but also large and more than one feature team is involved. In such a setting, every feature team needs a product owner who ensures a continuous focus on business delivery. The product owners collaborate by working together in a virtual team. Each feature team is supported by a coach who safeguards the agile process not only within the individual feature team but also across all feature teams. An architect (or, if necessary, a team of architects) takes care that the system is technically sound. In contrast to small co-located projects, large global projects require a project manager who deals with, among other things, internal and especially external politics.

  15. Feature extraction using convolutional neural network for classifying breast density in mammographic images

    NASA Astrophysics Data System (ADS)

    Thomaz, Ricardo L.; Carneiro, Pedro C.; Patrocinio, Ana C.

    2017-03-01

    Breast cancer is the leading cause of death for women in most countries. The high levels of mortality relate mostly to late diagnosis and to the directly proportional relationship between breast density and breast cancer development. Therefore, the correct assessment of breast density is important to provide better screening for higher-risk patients. However, in modern digital mammography the discrimination among breast densities is highly complex due to increased contrast and visual information for all densities. Thus, a computational system for classifying breast density might be a useful tool for aiding medical staff. Several machine-learning algorithms are already capable of classifying a small number of classes with good accuracy. However, machine-learning algorithms' main constraint relates to the set of features extracted and used for classification. Although well-known feature extraction techniques might provide a good set of features, it is a complex task to select an initial set during the design of a classifier. Thus, we propose feature extraction using a Convolutional Neural Network (CNN) for classifying breast density by a usual machine-learning classifier. We used 307 mammographic images downsampled to 260x200 pixels to train a CNN and extract features from a deep layer. After training, the activations of 8 neurons from a deep fully connected layer are extracted and used as features. Then, these features are fed forward to a single-hidden-layer neural network that is cross-validated using 10 folds to classify among four classes of breast density. The global accuracy of this method is 98.4%, presenting only 1.6% of misclassification. However, the small set of samples and memory constraints required the reuse of data in both the CNN and the MLP-NN; therefore, overfitting might have influenced the results even though we cross-validated the network. Thus, although we presented a promising method for extracting features and classifying breast density, a larger database is still required for evaluating the results.
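
    The core idea above, using activations from a trained network's deep layer as an 8-dimensional feature vector for a second classifier, can be sketched without the original mammography data. Below, a small scikit-learn MLP stands in for the CNN and the bundled digits data stand in for the mammograms; this is an illustrative sketch, not the paper's architecture.

    ```python
    # Sketch of the "activations as features" idea: train a small network,
    # read out its 8 hidden-unit activations, and feed them to a second
    # cross-validated classifier.
    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)

    # "Feature extractor": one hidden layer of 8 ReLU units.
    extractor = MLPClassifier(hidden_layer_sizes=(8,), activation="relu",
                              max_iter=2000, random_state=0).fit(X, y)

    # Hidden-layer activations, computed from the learned weights.
    hidden = np.maximum(0, X @ extractor.coefs_[0] + extractor.intercepts_[0])

    # Downstream classifier on the 8 extracted features, 10-fold cross-validated.
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
    print("10-fold CV accuracy:", cross_val_score(clf, hidden, y, cv=10).mean())
    ```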

  16. Impact of experimental design on PET radiomics in predicting somatic mutation status.

    PubMed

    Yip, Stephen S F; Parmar, Chintan; Kim, John; Huynh, Elizabeth; Mak, Raymond H; Aerts, Hugo J W L

    2017-12-01

    PET-based radiomic features have demonstrated great promise in predicting genetic data. However, various experimental parameters can influence the feature extraction pipeline and, hence, the resulting features. Here, we investigated how experimental settings affect the performance of radiomic features in predicting somatic mutation status in non-small cell lung cancer (NSCLC) patients. 348 NSCLC patients with somatic mutation testing and diagnostic PET images were included in our analysis. Radiomic feature extraction was analyzed for varying voxel sizes, filters and bin widths. 66 radiomic features were evaluated. The performance of features in predicting mutation status was assessed using the area under the receiver-operating-characteristic curve (AUC). The influence of experimental parameters on feature predictability was quantified as the relative difference between the minimum and maximum AUC (δ). The large majority of features (n=56, 85%) were significantly predictive for EGFR mutation status (AUC≥0.61). 29 radiomic features significantly predicted EGFR mutations and were robust to experimental settings, with δOverall < 5%. The overall influence (δOverall) of the voxel size, filter and bin width across all features ranged from 5% to 15%. For all features, none of the experimental designs could discriminate KRAS+ from KRAS- (AUC≤0.56). The predictability of 29 radiomic features was robust to the choice of experimental settings; however, these settings need to be carefully chosen for all other features. The combined effect of the investigated processing methods could be substantial and must be considered. Optimized settings that will maximize the predictive performance of individual radiomic features should be investigated in the future. Copyright © 2017 Elsevier B.V. All rights reserved.
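
    The robustness measure used above, the relative difference δ between the minimum and maximum AUC of a feature across preprocessing settings, is straightforward to compute. The sketch below assumes scikit-learn and uses synthetic feature values and mutation labels as placeholders.

    ```python
    # Sketch of the robustness measure: compute the AUC of one feature under
    # several hypothetical preprocessing settings and report the relative
    # difference between the minimum and maximum AUC. Values are synthetic.
    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    labels = rng.integers(0, 2, size=348)              # placeholder mutation status

    # The same feature extracted under three hypothetical settings
    # (e.g., different voxel sizes or bin widths).
    aucs = []
    for noise in (0.5, 0.8, 1.2):
        feature = labels + noise * rng.normal(size=labels.size)
        aucs.append(roc_auc_score(labels, feature))

    delta = (max(aucs) - min(aucs)) / max(aucs) * 100  # relative difference, in %
    print("AUCs:", np.round(aucs, 3), "delta (%):", round(delta, 1))
    ```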

  17. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Honorio, J.; Goldstein, R.

    We propose a simple, well-grounded classification technique which is suited for group classification on brain fMRI data sets that have high dimensionality, a small number of subjects, high noise level, high subject variability, imperfect registration, and that capture subtle cognitive effects. We propose threshold-split region as a new feature selection method and majority vote as the classification technique. Our method does not require a predefined set of regions of interest. We use averages across sessions, only one feature per experimental condition, a feature independence assumption, and simple classifiers. The seemingly counter-intuitive approach of using a simple design is supported by signal processing and statistical theory. Experimental results in two block-design data sets that capture brain function under distinct monetary rewards for cocaine-addicted and control subjects show that our method exhibits increased generalization accuracy compared to commonly used feature selection and classification techniques.

  18. MO-AB-BRA-10: Cancer Therapy Outcome Prediction Based On Dempster-Shafer Theory and PET Imaging

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lian, C; University of Rouen, QuantIF - EA 4108 LITIS, 76000 Rouen; Li, H

    2015-06-15

    Purpose: In cancer therapy, utilizing FDG-18 PET image-based features for accurate outcome prediction is challenging because of 1) limited discriminative information within a small number of PET image sets, and 2) fluctuant feature characteristics caused by the inferior spatial resolution and system noise of PET imaging. In this study, we proposed a new Dempster-Shafer theory (DST) based approach, evidential low-dimensional transformation with feature selection (ELT-FS), to accurately predict cancer therapy outcome with both PET imaging features and clinical characteristics. Methods: First, a specific loss function with sparse penalty was developed to learn an adaptive low-rank distance metric for representing the dissimilarity between different patients' feature vectors. By minimizing this loss function, a linear low-dimensional transformation of input features was achieved. Also, imprecise features were excluded simultaneously by applying an l2,1-norm regularization of the learnt dissimilarity metric in the loss function. Finally, the learnt dissimilarity metric was applied in an evidential K-nearest-neighbor (EK-NN) classifier to predict treatment outcome. Results: Twenty-five patients with stage II–III non-small-cell lung cancer and thirty-six patients with esophageal squamous cell carcinomas treated with chemo-radiotherapy were collected. For the two groups of patients, 52 and 29 features, respectively, were utilized. The leave-one-out cross-validation (LOOCV) protocol was used for evaluation. Compared to three existing linear transformation methods (PCA, LDA, NCA), the proposed ELT-FS leads to higher prediction accuracy for the training and testing sets both for lung-cancer patients (100±0.0, 88.0±33.17) and for esophageal-cancer patients (97.46±1.64, 83.33±37.8). The ELT-FS also provides superior class separation in both test data sets. Conclusion: A novel DST-based approach has been proposed to predict cancer treatment outcome using PET image features and clinical characteristics. A specific loss function has been designed for robust accommodation of feature set incertitude and imprecision, facilitating adaptive learning of the dissimilarity metric for the EK-NN classifier.

  19. Quality assessment of data discrimination using self-organizing maps.

    PubMed

    Mekler, Alexey; Schwarz, Dmitri

    2014-10-01

    One of the important aspects of the data classification problem lies in making the most appropriate selection of features. The set of variables should be small and, at the same time, should provide reliable discrimination of the classes. A method for evaluating discriminating power that enables a comparison between different sets of variables is therefore useful in the search for such a set. A new approach to feature selection is presented. Two methods of evaluation of the data discriminating power of a feature set are suggested. Both of the methods implement self-organizing maps (SOMs) and the newly introduced exponents of the degree of data clusterization on the SOM. The first method is based on the comparison of intraclass and interclass distances on the map. The second method concerns the evaluation of the relative number of a best matching unit's (BMU's) nearest neighbors of the same class. Both methods make it possible to evaluate the discriminating power of a feature set in cases when this set provides nonlinear discrimination of the classes. Current algorithms in program code can be downloaded for free at http://mekler.narod.ru/Science/Articles_support.html, as well as the supporting data files. Copyright © 2014 Elsevier Inc. All rights reserved.

  20. Texture-based approach to palmprint retrieval for personal identification

    NASA Astrophysics Data System (ADS)

    Li, Wenxin; Zhang, David; Xu, Z.; You, J.

    2000-12-01

    This paper presents a new approach to palmprint retrieval for personal identification. Three key issues in image retrieval are considered - feature selection, similarity measures and dynamic search for the best matching of the sample in the image database. We propose a texture-based method for palmprint feature representation. The concept of texture energy is introduced to define a palmprint's global and local features, which are characterized with high convergence of inner-palm similarities and good dispersion of inter-palm discrimination. The search is carried out in a layered fashion: first, global features are used to guide the fast selection of a small set of similar candidates from the database, and then local features are used to decide the final output within the candidate set. The experimental results demonstrate the effectiveness and accuracy of the proposed method.

  1. Texture-based approach to palmprint retrieval for personal identification

    NASA Astrophysics Data System (ADS)

    Li, Wenxin; Zhang, David; Xu, Z.; You, J.

    2001-01-01

    This paper presents a new approach to palmprint retrieval for personal identification. Three key issues in image retrieval are considered - feature selection, similarity measures and dynamic search for the best matching of the sample in the image database. We propose a texture-based method for palmprint feature representation. The concept of texture energy is introduced to define a palmprint's global and local features, which are characterized with high convergence of inner-palm similarities and good dispersion of inter-palm discrimination. The search is carried out in a layered fashion: first, global features are used to guide the fast selection of a small set of similar candidates from the database, and then local features are used to decide the final output within the candidate set. The experimental results demonstrate the effectiveness and accuracy of the proposed method.

  2. A bootstrap based Neyman-Pearson test for identifying variable importance.

    PubMed

    Ditzler, Gregory; Polikar, Robi; Rosen, Gail

    2015-04-01

    Selecting the most informative features, so as to achieve a small loss on future data, is arguably one of the most important steps in classification, data analysis and model selection. Several feature selection (FS) algorithms are available; however, due to noise present in any data set, FS algorithms are typically accompanied by an appropriate cross-validation scheme. In this brief, we propose a statistical hypothesis test derived from the Neyman-Pearson lemma for determining if a feature is statistically relevant. The proposed approach can be applied as a wrapper to any FS algorithm, regardless of the FS criteria used by that algorithm, to determine whether a feature belongs in the relevant set. Perhaps more importantly, this procedure efficiently determines the number of relevant features given an initial starting point. We provide freely available software implementations of the proposed methodology.
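
    The wrapper idea described above, testing whether a feature is genuinely relevant rather than an artifact of one run of a feature-selection algorithm, can be illustrated with a plain bootstrap. The sketch below counts how often each feature is selected across bootstrap resamples; it illustrates the wrapper concept only and is not the Neyman-Pearson test derived in the paper.

    ```python
    # Hedged sketch: wrap a feature-selection algorithm in a bootstrap loop
    # and record how often each feature is selected, so the selection
    # frequency can be compared against a threshold. Data are synthetic.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, f_classif

    X, y = make_classification(n_samples=200, n_features=30, n_informative=5,
                               random_state=0)
    rng = np.random.default_rng(0)
    n_boot, k = 200, 10
    counts = np.zeros(X.shape[1])

    for _ in range(n_boot):
        idx = rng.integers(0, X.shape[0], size=X.shape[0])   # bootstrap resample
        selected = SelectKBest(f_classif, k=k).fit(X[idx], y[idx]).get_support()
        counts += selected

    frequency = counts / n_boot
    print("features selected in >90% of bootstrap replicates:",
          np.where(frequency > 0.9)[0])
    ```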

  3. Differential diagnosis of CT focal liver lesions using texture features, feature selection and ensemble driven classifiers.

    PubMed

    Mougiakakou, Stavroula G; Valavanis, Ioannis K; Nikita, Alexandra; Nikita, Konstantina S

    2007-09-01

    The aim of the present study is to define an optimally performing computer-aided diagnosis (CAD) architecture for the classification of liver tissue from non-enhanced computed tomography (CT) images into normal liver (C1), hepatic cyst (C2), hemangioma (C3), and hepatocellular carcinoma (C4). To this end, various CAD architectures, based on texture features and ensembles of classifiers (ECs), are comparatively assessed. A number of regions of interest (ROIs) corresponding to C1-C4 were defined by experienced radiologists in non-enhanced liver CT images. For each ROI, five distinct sets of texture features were extracted using first order statistics, spatial gray level dependence matrix, gray level difference method, Laws' texture energy measures, and fractal dimension measurements. Two different ECs were constructed and compared. The first one consists of five multilayer perceptron neural networks (NNs), each using as input one of the computed texture feature sets or its reduced version after genetic algorithm-based feature selection. The second EC comprised five different primary classifiers, namely one multilayer perceptron NN, one probabilistic NN, and three k-nearest neighbor classifiers, each fed with the combination of the five texture feature sets or their reduced versions. The final decision of each EC was extracted by using appropriate voting schemes, while bootstrap re-sampling was utilized in order to estimate the generalization ability of the CAD architectures based on the available relatively small-sized data set. The best mean classification accuracy (84.96%) is achieved by the second EC using a fused feature set and the weighted voting scheme. The fused feature set was obtained after appropriate feature selection applied to specific subsets of the original feature set. The comparative assessment of the various CAD architectures shows that combining three types of classifiers with a voting scheme, fed with identical feature sets obtained after appropriate feature selection and fusion, may result in an accurate system able to assist differential diagnosis of focal liver lesions from non-enhanced CT images.

  4. Enhanced flyby science with onboard computer vision: Tracking and surface feature detection at small bodies

    NASA Astrophysics Data System (ADS)

    Fuchs, Thomas J.; Thompson, David R.; Bue, Brian D.; Castillo-Rogez, Julie; Chien, Steve A.; Gharibian, Dero; Wagstaff, Kiri L.

    2015-10-01

    Spacecraft autonomy is crucial to increase the science return of optical remote sensing observations at distant primitive bodies. To date, most small-body exploration has involved short-timescale flybys that execute pre-scripted data collection sequences. Light time delay means that the spacecraft must operate completely autonomously without direct control from the ground, but in most cases the physical properties and morphologies of prospective targets are unknown before the flyby. Surface features of interest are highly localized, and successful observations must account for geometry and illumination constraints. Under these circumstances onboard computer vision can improve science yield by responding immediately to collected imagery. It can reacquire bad data or identify features of opportunity for additional targeted measurements. We present a comprehensive framework for onboard computer vision for flyby missions at small bodies. We introduce novel algorithms for target tracking, target segmentation, surface feature detection, and anomaly detection. The performance and generalization power are evaluated in detail using expert annotations on data sets from previous encounters with primitive bodies.

  5. Automated detection of pulmonary nodules in CT images with support vector machines

    NASA Astrophysics Data System (ADS)

    Liu, Lu; Liu, Wanyu; Sun, Xiaoming

    2008-10-01

    Many methods have been proposed to help radiologists avoid failing to diagnose small pulmonary nodules. Recently, support vector machines (SVMs) have received increasing attention for pattern recognition. In this paper, we present a computerized system aimed at pulmonary nodule detection; it identifies the lung field, extracts a set of candidate regions with a high sensitivity ratio and then classifies candidates by the use of SVMs. The Computer Aided Diagnosis (CAD) system presented in this paper supports the diagnosis of pulmonary nodules from Computed Tomography (CT) images as inflammation, tuberculoma, granuloma, sclerosing hemangioma, and malignant tumor. Five texture feature sets were extracted for each lesion, while a genetic algorithm-based feature selection method was applied to identify the most robust features. The selected feature set was fed into an ensemble of SVM classifiers. The achieved classification performance was 100%, 92.75% and 90.23% in the training, validation and testing set, respectively. It is concluded that computerized analysis of medical images in combination with artificial intelligence can be used in clinical practice and may contribute to more efficient diagnosis.

  6. Combined rule extraction and feature elimination in supervised classification.

    PubMed

    Liu, Sheng; Patel, Ronak Y; Daga, Pankaj R; Liu, Haining; Fu, Gang; Doerksen, Robert J; Chen, Yixin; Wilkins, Dawn E

    2012-09-01

    There are a vast number of biology-related research problems that involve combining multiple sources of data to achieve a better understanding of the underlying problems. It is important to select and interpret the most important information from these sources. Thus it will be beneficial to have a good algorithm to simultaneously extract rules and select features for better interpretation of the predictive model. We propose an efficient algorithm, Combined Rule Extraction and Feature Elimination (CRF), based on 1-norm regularized random forests. CRF simultaneously extracts a small number of rules generated by random forests and selects important features. We applied CRF to several drug activity prediction and microarray data sets. CRF is capable of producing performance comparable with state-of-the-art prediction algorithms using a small number of decision rules. Some of the decision rules are biologically significant.

  7. Filter Bank Regularized Common Spatial Pattern Ensemble for Small Sample Motor Imagery Classification.

    PubMed

    Park, Sang-Hoon; Lee, David; Lee, Sang-Goog

    2018-02-01

    For the last few years, many feature extraction methods have been proposed based on biological signals. Among these, brain signals have the advantage that they can be obtained even from people with peripheral nervous system damage. Motor imagery electroencephalograms (EEG) are inexpensive to measure, offer a high temporal resolution, and are intuitive. Therefore, they have received a significant amount of attention in various fields, including signal processing, cognitive science, and medicine. The common spatial pattern (CSP) algorithm is a useful method for feature extraction from motor imagery EEG. However, performance degradation occurs in a small-sample setting (SSS), because the CSP depends on sample-based covariance. Since the active frequency range is different for each subject, it is also inconvenient to set the frequency range differently every time. In this paper, we propose a feature extraction method based on a filter bank to solve these problems. The proposed method consists of five steps. First, the motor imagery EEG is divided by using a filter bank. Second, the regularized CSP (R-CSP) is applied to the divided EEG. Third, we select the features according to mutual information based on the individual feature algorithm. Fourth, parameter sets are selected for the ensemble. Finally, we classify using an ensemble based on these features. The brain-computer interface competition III data set IVa is used to evaluate the performance of the proposed method. The proposed method improves the mean classification accuracy by 12.34%, 11.57%, 9%, 4.95%, and 4.47% compared with CSP, SR-CSP, R-CSP, filter bank CSP (FBCSP), and SR-FBCSP. Compared with the filter bank R-CSP, which is a parameter-selection version of the proposed method, the classification accuracy is improved by 3.49%. In particular, the proposed method shows a large improvement in performance in the SSS.
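
    The building block common to all the variants compared above is the CSP computation itself. The sketch below shows a minimal CSP: spatial filters from a generalized eigendecomposition of the two classes' covariance matrices, followed by log-variance features. The EEG trials are synthetic placeholders, and the regularization and filter-bank machinery of the paper is omitted.

    ```python
    # Minimal common spatial pattern (CSP) sketch: spatial filters from a
    # generalized eigendecomposition of class covariance matrices, followed
    # by log-variance features. EEG trials are synthetic placeholders.
    import numpy as np
    from scipy.linalg import eigh

    rng = np.random.default_rng(0)
    n_trials, n_channels, n_samples = 20, 8, 250
    trials_a = rng.normal(size=(n_trials, n_channels, n_samples))
    trials_b = 1.5 * rng.normal(size=(n_trials, n_channels, n_samples))

    def mean_cov(trials):
        """Average normalized spatial covariance over trials."""
        covs = [t @ t.T / np.trace(t @ t.T) for t in trials]
        return np.mean(covs, axis=0)

    Ca, Cb = mean_cov(trials_a), mean_cov(trials_b)

    # Generalized eigenproblem: Ca w = lambda (Ca + Cb) w.
    eigvals, eigvecs = eigh(Ca, Ca + Cb)
    order = np.argsort(eigvals)
    # Keep the filters at both ends of the spectrum (most discriminative).
    W = np.concatenate([eigvecs[:, order[:2]], eigvecs[:, order[-2:]]], axis=1)

    def csp_features(trial):
        """Log-variance of the spatially filtered trial."""
        z = W.T @ trial
        var = z.var(axis=1)
        return np.log(var / var.sum())

    print(csp_features(trials_a[0]))
    ```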

  8. Ion microprobe elemental analyses of impact features on interplanetary dust experiment sensor surfaces

    NASA Technical Reports Server (NTRS)

    Simon, Charles G.; Hunter, Jerry L.; Wortman, Jim J.; Griffis, Dieter P.

    1992-01-01

    Hypervelocity impact features from very small particles (less than 3 microns in diameter) on several of the electro-active dust sensors used in the Interplanetary Dust Experiment (IDE) were subjected to elemental analysis using an ion microscope. The same analytical techniques were applied to impact and containment features on a set of ultra-pure, highly polished single crystal germanium wafer witness plates that were mounted on tray B12. Very little unambiguously identifiable impactor debris was found in the central craters or shatter zones of small impacts in this crystalline surface. The surface contamination, ubiquitous on the surface of the Long Duration Exposure Facility, has greatly complicated data collection and interpretation from microparticle impacts on all surfaces.

  9. Data survey on the effect of product features on competitive advantage of selected firms in Nigeria.

    PubMed

    Olokundun, Maxwell; Iyiola, Oladele; Ibidunni, Stephen; Falola, Hezekiah; Salau, Odunayo; Amaihian, Augusta; Peter, Fred; Borishade, Taiye

    2018-06-01

    The main objective of this study was to present a data article that investigates the effect of product features on a firm's competitive advantage. Few studies have examined how the features of a product could help in driving the competitive advantage of a firm. A descriptive research method was used. The Statistical Package for Social Sciences (SPSS 22) was used for the analysis of one hundred and fifty (150) valid questionnaires completed by small business owners registered under the Small and Medium Enterprises Development Agency of Nigeria (SMEDAN). Stratified and simple random sampling techniques were employed; reliability and validity procedures were also confirmed. The field data set is made publicly available to enable critical or extended analysis.

  10. Predicting a small molecule-kinase interaction map: A machine learning approach

    PubMed Central

    2011-01-01

    Background We present a machine learning approach to the problem of protein ligand interaction prediction. We focus on a set of binding data obtained from 113 different protein kinases and 20 inhibitors. It was attained through ATP site-dependent binding competition assays and constitutes the first available dataset of this kind. We extract information about the investigated molecules from various data sources to obtain an informative set of features. Results A Support Vector Machine (SVM) as well as a decision tree algorithm (C5/See5) is used to learn models based on the available features which in turn can be used for the classification of new kinase-inhibitor pair test instances. We evaluate our approach using different feature sets and parameter settings for the employed classifiers. Moreover, the paper introduces a new way of evaluating predictions in such a setting, where different amounts of information about the binding partners can be assumed to be available for training. Results on an external test set are also provided. Conclusions In most of the cases, the presented approach clearly outperforms the baseline methods used for comparison. Experimental results indicate that the applied machine learning methods are able to detect a signal in the data and predict binding affinity to some extent. For SVMs, the binding prediction can be improved significantly by using features that describe the active site of a kinase. For C5, besides diversity in the feature set, alignment scores of conserved regions turned out to be very useful. PMID:21708012

  11. Hyperspectral data discrimination methods

    NASA Astrophysics Data System (ADS)

    Casasent, David P.; Chen, Xuewen

    2000-12-01

    Hyperspectral data provides spectral response information that provides detailed chemical, moisture, and other description of constituent parts of an item. These new sensor data are useful in USDA product inspection. However, such data introduce problems such as the curse of dimensionality, the need to reduce the number of features used to accommodate realistic small training set sizes, and the need to employ discriminatory features and still achieve good generalization (comparable training and test set performance). Several two-step methods are compared to a new and preferable single-step spectral decomposition algorithm. Initial results on hyperspectral data for good/bad almonds and for good/bad (aflatoxin infested) corn kernels are presented. The hyperspectral application addressed differs greatly from prior USDA work (PLS) in which the level of a specific channel constituent in food was estimated. A validation set (separate from the test set) is used in selecting algorithm parameters. Threshold parameters are varied to select the best Pc operating point. Initial results show that nonlinear features yield improved performance.

  12. Higher criticism thresholding: Optimal feature selection when useful features are rare and weak.

    PubMed

    Donoho, David; Jin, Jiashun

    2008-09-30

    In important application fields today - genomics and proteomics are examples - selecting a small subset of useful features is crucial for success of Linear Classification Analysis. We study feature selection by thresholding of feature Z-scores and introduce a principle of threshold selection, based on the notion of higher criticism (HC). For i = 1, 2, ..., p, let πi denote the two-sided P-value associated with the ith feature Z-score and π(i) denote the ith order statistic of the collection of P-values. The HC threshold is the absolute Z-score corresponding to the P-value maximizing the HC objective (i/p − π(i))/√((i/p)(1 − i/p)). We consider a rare/weak (RW) feature model, where the fraction of useful features is small and the useful features are each too weak to be of much use on their own. HC thresholding (HCT) has interesting behavior in this setting, with an intimate link between maximizing the HC objective and minimizing the error rate of the designed classifier, and very different behavior from popular threshold selection procedures such as false discovery rate thresholding (FDRT). In the most challenging RW settings, HCT uses an unconventionally low threshold; this keeps the missed-feature detection rate under better control than FDRT and yields a classifier with improved misclassification performance. Replacing cross-validated threshold selection in the popular Shrunken Centroid classifier with the computationally less expensive and simpler HCT reduces the variance of the selected threshold and the error rate of the constructed classifier. Results on standard real datasets and in asymptotic theory confirm the advantages of HCT.
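
    The HC threshold is defined explicitly in the abstract, so it can be computed directly from a vector of feature Z-scores. The sketch below assumes NumPy and SciPy and uses simulated Z-scores with a few rare, weak useful features; the maximization is restricted to the smaller half of the P-values to avoid the degenerate endpoint.

    ```python
    # Sketch of HC thresholding: sort the two-sided P-values, evaluate the
    # HC objective (i/p - pi_(i)) / sqrt((i/p)(1 - i/p)), and take the |Z|
    # at the maximizing P-value as the threshold. Z-scores are simulated.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    p = 1000
    z = rng.normal(size=p)
    z[:20] += 3.0                            # a few rare, weak useful features

    pvals = 2 * norm.sf(np.abs(z))           # two-sided P-values
    order = np.argsort(pvals)
    sorted_p = pvals[order]

    i = np.arange(1, p // 2 + 1)             # maximize over the smaller P-values
    hc = (i / p - sorted_p[: p // 2]) / np.sqrt((i / p) * (1 - i / p))
    i_star = np.argmax(hc)
    threshold = np.abs(z[order[i_star]])

    print("HC threshold on |Z|:", round(threshold, 3))
    print("features kept:", int((np.abs(z) >= threshold).sum()))
    ```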

  13. Higher criticism thresholding: Optimal feature selection when useful features are rare and weak

    PubMed Central

    Donoho, David; Jin, Jiashun

    2008-01-01

    In important application fields today—genomics and proteomics are examples—selecting a small subset of useful features is crucial for success of Linear Classification Analysis. We study feature selection by thresholding of feature Z-scores and introduce a principle of threshold selection, based on the notion of higher criticism (HC). For i = 1, 2, …, p, let πi denote the two-sided P-value associated with the ith feature Z-score and π(i) denote the ith order statistic of the collection of P-values. The HC threshold is the absolute Z-score corresponding to the P-value maximizing the HC objective (i/p − π(i))/√((i/p)(1 − i/p)). We consider a rare/weak (RW) feature model, where the fraction of useful features is small and the useful features are each too weak to be of much use on their own. HC thresholding (HCT) has interesting behavior in this setting, with an intimate link between maximizing the HC objective and minimizing the error rate of the designed classifier, and very different behavior from popular threshold selection procedures such as false discovery rate thresholding (FDRT). In the most challenging RW settings, HCT uses an unconventionally low threshold; this keeps the missed-feature detection rate under better control than FDRT and yields a classifier with improved misclassification performance. Replacing cross-validated threshold selection in the popular Shrunken Centroid classifier with the computationally less expensive and simpler HCT reduces the variance of the selected threshold and the error rate of the constructed classifier. Results on standard real datasets and in asymptotic theory confirm the advantages of HCT. PMID:18815365

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Potash, Peter J.; Bell, Eric B.; Harrison, Joshua J.

    Predictive models for tweet deletion have been a relatively unexplored area of Twitter-related computational research. We first approach the deletion of tweets as a spam detection problem, applying a small set of handcrafted features to improve upon the current state-of-the-art in predicting deleted tweets. Next, we apply our approach to a dataset of deleted tweets that better reflects the current deletion rate. Since tweets are deleted for reasons beyond just the presence of spam, we apply topic modeling and text embeddings in order to capture the semantic content of tweets that can lead to tweet deletion. Our goal is to create an effective model that has a low-dimensional feature space and is also language-independent. A lean model would be computationally advantageous for processing high volumes of Twitter data, which can reach 9,885 tweets per second. Our results show that a small set of spam-related features combined with word topics and character-level text embeddings provides the best F1 score when trained with a random forest model. The highest precision on the deleted-tweet class is achieved by a modification of paragraph2vec to capture author identity.

  15. Reproducibility and Prognosis of Quantitative Features Extracted from CT Images

    PubMed Central

    Balagurunathan, Yoganand; Gu, Yuhua; Wang, Hua; Kumar, Virendra; Grove, Olya; Hawkins, Sam; Kim, Jongphil; Goldgof, Dmitry B; Hall, Lawrence O; Gatenby, Robert A; Gillies, Robert J

    2014-01-01

    We study the reproducibility of quantitative imaging features that are used to describe tumor shape, size, and texture from computed tomography (CT) scans of non-small cell lung cancer (NSCLC). CT images are dependent on various scanning factors. We focus on characterizing image features that are reproducible in the presence of variations due to patient factors and segmentation methods. Thirty-two NSCLC nonenhanced lung CT scans were obtained from the Reference Image Database to Evaluate Response data set. The tumors were segmented using both manual (radiologist expert) and ensemble (software-automated) methods. A set of features (219 three-dimensional and 110 two-dimensional) was computed, and quantitative image features were statistically filtered to identify a subset of reproducible and nonredundant features. The variability in the repeated experiment was measured by the test-retest concordance correlation coefficient (CCC_TreT). The natural range in the features, normalized to variance, was measured by the dynamic range (DR). In this study, there were 29 features across segmentation methods found with CCC_TreT and DR ≥ 0.9 and R2_Bet ≥ 0.95. These reproducible features were tested for predicting radiologist prognostic score; some texture features (run-length and Laws kernels) had an area under the curve of 0.9. The representative features were tested for their prognostic capabilities using an independent NSCLC data set (59 lung adenocarcinomas), where one of the texture features, run-length gray-level nonuniformity, was statistically significant in separating the samples into survival groups (P ≤ .046). PMID:24772210
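
    The screening statistic above, the test-retest concordance correlation coefficient, has a closed form (Lin's CCC) and is simple to compute. The sketch below uses synthetic test and retest feature values as placeholders.

    ```python
    # Sketch of the test-retest concordance correlation coefficient (CCC)
    # used to screen reproducible features; the two measurement vectors are
    # synthetic placeholders for a feature computed on test and retest scans.
    import numpy as np

    def concordance_cc(x, y):
        """Lin's concordance correlation coefficient between two measurements."""
        x, y = np.asarray(x, float), np.asarray(y, float)
        cov = np.mean((x - x.mean()) * (y - y.mean()))
        return 2 * cov / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

    rng = np.random.default_rng(0)
    test = rng.normal(size=32)
    retest = test + 0.1 * rng.normal(size=32)      # highly reproducible feature
    print("CCC (test-retest):", round(concordance_cc(test, retest), 3))
    ```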

  16. Large scale analysis of protein-binding cavities using self-organizing maps and wavelet-based surface patches to describe functional properties, selectivity discrimination, and putative cross-reactivity.

    PubMed

    Kupas, Katrin; Ultsch, Alfred; Klebe, Gerhard

    2008-05-15

    A new method to discover similar substructures in protein binding pockets, independently of sequence and folding patterns or secondary structure elements, is introduced. The solvent-accessible surface of a binding pocket, automatically detected as a depression on the protein surface, is divided into a set of surface patches. Each surface patch is characterized by its shape as well as by its physicochemical characteristics. Wavelets defined on surfaces are used for the description of the shape, as they have the great advantage of allowing a comparison at different resolutions. The number of coefficients to describe the wavelets can be chosen with respect to the size of the considered data set. The physicochemical characteristics of the patches are described by the assignment of the exposed amino acid residues to one or more of five different properties determinant for molecular recognition. A self-organizing neural network is used to project the high-dimensional feature vectors onto a two-dimensional layer of neurons, called a map. To find similarities between the binding pockets, in both geometrical and physicochemical features, a clustering of the projected feature vector is performed using an automatic distance- and density-based clustering algorithm. The method was validated with a small training data set of 109 binding cavities originating from a set of enzymes covering 12 different EC numbers. A second test data set of 1378 binding cavities, extracted from enzymes of 13 different EC numbers, was then used to prove the discriminating power of the algorithm and to demonstrate its applicability to large scale analyses. In all cases, members of the data set with the same EC number were placed into coherent regions on the map, with small distances between them. Different EC numbers are separated by large distances between the feature vectors. A third data set comprising three subfamilies of endopeptidases is used to demonstrate the ability of the algorithm to detect similar substructures between functionally related active sites. The algorithm can also be used to predict the function of novel proteins not considered in training data set. 2007 Wiley-Liss, Inc.

  17. Optimal number of features as a function of sample size for various classification rules.

    PubMed

    Hua, Jianping; Xiong, Zixiang; Lowey, James; Suh, Edward; Dougherty, Edward R

    2005-04-15

    Given the joint feature-label distribution, increasing the number of features always results in decreased classification error; however, this is not the case when a classifier is designed via a classification rule from sample data. Typically (but not always), for fixed sample size, the error of a designed classifier decreases and then increases as the number of features grows. The potential downside of using too many features is most critical for small samples, which are commonplace for gene-expression-based classifiers for phenotype discrimination. For fixed sample size and feature-label distribution, the issue is to find an optimal number of features. Since only in rare cases is there a known distribution of the error as a function of the number of features and sample size, this study employs simulation for various feature-label distributions and classification rules, and across a wide range of sample and feature-set sizes. To achieve the desired end, finding the optimal number of features as a function of sample size, it employs massively parallel computation. Seven classifiers are treated: 3-nearest-neighbor, Gaussian kernel, linear support vector machine, polynomial support vector machine, perceptron, regular histogram and linear discriminant analysis. Three Gaussian-based models are considered: linear, nonlinear and bimodal. In addition, real patient data from a large breast-cancer study is considered. To mitigate the combinatorial search for finding optimal feature sets, and to model the situation in which subsets of genes are co-regulated and correlation is internal to these subsets, we assume that the covariance matrix of the features is blocked, with each block corresponding to a group of correlated features. Altogether there are a large number of error surfaces for the many cases. These are provided in full on a companion website, which is meant to serve as resource for those working with small-sample classification. For the companion website, please visit http://public.tgen.org/tamu/ofs/ e-dougherty@ee.tamu.edu.
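
    The peaking phenomenon described above (error decreasing and then increasing as features are added at fixed sample size) can be reproduced in miniature. The sketch below runs LDA on synthetic Gaussian data for several feature counts; the settings are illustrative and far smaller than the study's simulations.

    ```python
    # Small simulation in the spirit of the study above: for a fixed sample
    # size, estimate classification error as the number of features grows,
    # using LDA on a synthetic Gaussian model. Settings are illustrative only.
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n_samples, max_features = 40, 30

    # Class means differ slightly in every feature; covariance is identity.
    y = np.repeat([0, 1], n_samples // 2)
    X = rng.normal(size=(n_samples, max_features)) + 0.3 * y[:, None]

    for d in (2, 5, 10, 20, 30):
        err = 1 - cross_val_score(LinearDiscriminantAnalysis(),
                                  X[:, :d], y, cv=5).mean()
        print(f"features: {d:2d}  estimated error: {err:.3f}")
    ```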

  18. Fuzzy feature selection based on interval type-2 fuzzy sets

    NASA Astrophysics Data System (ADS)

    Cherif, Sahar; Baklouti, Nesrine; Alimi, Adel; Snasel, Vaclav

    2017-03-01

    When dealing with real-world data, noise, complexity, dimensionality, uncertainty and irrelevance can lead to low performance and insignificant judgment. Fuzzy logic is a powerful tool for controlling conflicting attributes which can have similar effects and close meanings. In this paper, an interval type-2 fuzzy feature selection is presented as a new approach for removing irrelevant features and reducing complexity. We demonstrate how Feature Selection can be joined with Interval Type-2 Fuzzy Logic for keeping significant features and hence reducing time complexity. The proposed method is compared with some other approaches. The results show that the number of selected attributes remains proportionally small.

  19. Form drag in rivers due to small-scale natural topographic features: 2. Irregular sequences

    USGS Publications Warehouse

    Kean, J.W.; Smith, J.D.

    2006-01-01

    The size, shape, and spacing of small-scale topographic features found on the boundaries of natural streams, rivers, and floodplains can be quite variable. Consequently, a procedure for determining the form drag on irregular sequences of different-sized topographic features is essential for calculating near-boundary flows and sediment transport. A method for carrying out such calculations is developed in this paper. This method builds on the work of Kean and Smith (2006), which describes the flow field for the simpler case of a regular sequence of identical topographic features. Both approaches model topographic features as two-dimensional elements with Gaussian-shaped cross sections defined in terms of three parameters. Field measurements of bank topography are used to show that (1) the magnitude of these shape parameters can vary greatly between adjacent topographic features and (2) the variability of these shape parameters follows a lognormal distribution. Simulations using an irregular set of topographic roughness elements show that the drag on an individual element is primarily controlled by the size and shape of the feature immediately upstream and that the spatial average of the boundary shear stress over a large set of randomly ordered elements is relatively insensitive to the sequence of the elements. In addition, a method to transform the topography of irregular surfaces into an equivalently rough surface of regularly spaced, identical topographic elements also is given. The methods described in this paper can be used to improve predictions of flow resistance in rivers as well as quantify bank roughness.

  20. A single-layer network unsupervised feature learning method for white matter hyperintensity segmentation

    NASA Astrophysics Data System (ADS)

    Vijverberg, Koen; Ghafoorian, Mohsen; van Uden, Inge W. M.; de Leeuw, Frank-Erik; Platel, Bram; Heskes, Tom

    2016-03-01

    Cerebral small vessel disease (SVD) is a disorder frequently found in elderly people and is associated with deterioration in cognitive performance, parkinsonism, motor and mood impairments. White matter hyperintensities (WMH) as well as lacunes, microbleeds and subcortical brain atrophy are part of the spectrum of image findings related to SVD. Accurate segmentation of WMHs is important for prognosis and diagnosis of multiple neurological disorders such as MS and SVD. Almost all of the published (semi-)automated WMH detection models employ multiple complex hand-crafted features, which require in-depth domain knowledge. In this paper we propose to apply a single-layer network unsupervised feature learning (USFL) method that avoids hand-crafted features and instead automatically learns a more efficient set of features. Experimental results show that a computer aided detection system with a USFL system outperforms a hand-crafted approach. Moreover, since the two feature sets have complementary properties, a hybrid system that makes use of both hand-crafted and unsupervised learned features shows a significant performance boost compared to each system separately, getting close to the performance of an independent human expert.
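
    Single-layer unsupervised feature learning is often implemented by clustering whitened image patches and encoding new patches against the learned centroids (in the style of Coates et al.); the record gives no implementation details, so the sketch below is only a generic illustration of that pattern. The patch size, number of centroids, whitening step, and the random array standing in for an MR slice are all assumptions.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

def extract_patches(image, patch=7, n_patches=2000):
    """Sample random square patches from a 2-D image and flatten them."""
    h, w = image.shape
    ys = rng.integers(0, h - patch, n_patches)
    xs = rng.integers(0, w - patch, n_patches)
    return np.stack([image[y:y + patch, x:x + patch].ravel() for y, x in zip(ys, xs)])

# toy stand-in for a FLAIR slice; in practice this would be an MR image array
image = rng.normal(size=(128, 128))
patches = extract_patches(image)

# per-patch normalization followed by PCA whitening
patches = (patches - patches.mean(1, keepdims=True)) / (patches.std(1, keepdims=True) + 1e-8)
patches = PCA(n_components=30, whiten=True).fit_transform(patches)

# the "single-layer network": a dictionary of patch features learned with k-means
kmeans = MiniBatchKMeans(n_clusters=64, n_init=3, random_state=0).fit(patches)

# "triangle" encoding: activation is how much closer a patch is to a centroid
# than the average centroid distance (Coates-style soft assignment)
dists = np.linalg.norm(patches[:, None, :] - kmeans.cluster_centers_[None], axis=2)
codes = np.maximum(0.0, dists.mean(axis=1, keepdims=True) - dists)
print(codes.shape)  # (2000, 64) feature vectors usable by a downstream classifier
```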

  1. Resolving Transition Metal Chemical Space: Feature Selection for Machine Learning and Structure-Property Relationships.

    PubMed

    Janet, Jon Paul; Kulik, Heather J

    2017-11-22

    Machine learning (ML) of quantum mechanical properties shows promise for accelerating chemical discovery. For transition metal chemistry where accurate calculations are computationally costly and available training data sets are small, the molecular representation becomes a critical ingredient in ML model predictive accuracy. We introduce a series of revised autocorrelation functions (RACs) that encode relationships of the heuristic atomic properties (e.g., size, connectivity, and electronegativity) on a molecular graph. We alter the starting point, scope, and nature of the quantities evaluated in standard ACs to make these RACs amenable to inorganic chemistry. On an organic molecule set, we first demonstrate superior standard AC performance to other presently available topological descriptors for ML model training, with mean unsigned errors (MUEs) for atomization energies on set-aside test molecules as low as 6 kcal/mol. For inorganic chemistry, our RACs yield 1 kcal/mol ML MUEs on set-aside test molecules in spin-state splitting in comparison to 15-20× higher errors for feature sets that encode whole-molecule structural information. Systematic feature selection methods including univariate filtering, recursive feature elimination, and direct optimization (e.g., random forest and LASSO) are compared. Random-forest- or LASSO-selected subsets 4-5× smaller than the full RAC set produce sub- to 1 kcal/mol spin-splitting MUEs, with good transferability to metal-ligand bond length prediction (0.004-0.005 Å MUE) and redox potential on a smaller data set (0.2-0.3 eV MUE). Evaluation of feature selection results across property sets reveals the relative importance of local, electronic descriptors (e.g., electronegativity, atomic number) in spin-splitting and distal, steric effects in redox potential and bond lengths.
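
    The feature-selection-plus-regression workflow mentioned here (a LASSO-selected descriptor subset feeding a downstream model) can be illustrated generically; the synthetic matrix below merely stands in for RAC descriptors, and the kernel ridge regressor and all hyperparameters are assumptions rather than the authors' setup.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import LassoCV
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# synthetic stand-in for RAC-style descriptors: 150 complexes x 120 descriptors,
# with only a handful of truly informative columns
X = rng.normal(size=(150, 120))
y = X[:, :6] @ rng.normal(size=6) + 0.1 * rng.normal(size=150)  # e.g. a spin-splitting energy

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

# LASSO keeps a sparse descriptor subset
lasso = LassoCV(cv=5).fit(X_tr, y_tr)
selected = np.flatnonzero(lasso.coef_)
print(f"{selected.size} of {X.shape[1]} descriptors kept")

# train a kernel ridge model on the reduced feature set only
krr = KernelRidge(kernel="rbf", alpha=1e-2, gamma=0.05).fit(X_tr[:, selected], y_tr)
mue = np.mean(np.abs(krr.predict(X_te[:, selected]) - y_te))
print(f"test MUE: {mue:.3f}")
```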

  2. A unified framework for image retrieval using keyword and visual features.

    PubMed

    Jing, Feng; Li, Mingling; Zhang, Hong-Jiang; Zhang, Bo

    2005-07-01

    In this paper, a unified image retrieval framework based on both keyword annotations and visual features is proposed. In this framework, a set of statistical models are built based on visual features of a small set of manually labeled images to represent semantic concepts and used to propagate keywords to other unlabeled images. These models are updated periodically when more images implicitly labeled by users become available through relevance feedback. In this sense, the keyword models serve the function of accumulation and memorization of knowledge learned from user-provided relevance feedback. Furthermore, two sets of effective and efficient similarity measures and relevance feedback schemes are proposed for query by keyword scenario and query by image example scenario, respectively. Keyword models are combined with visual features in these schemes. In particular, a new, entropy-based active learning strategy is introduced to improve the efficiency of relevance feedback for query by keyword. Furthermore, a new algorithm is proposed to estimate the keyword features of the search concept for query by image example. It is shown to be more appropriate than two existing relevance feedback algorithms. Experimental results demonstrate the effectiveness of the proposed framework.

  3. Shift-invariant discrete wavelet transform analysis for retinal image classification.

    PubMed

    Khademi, April; Krishnan, Sridhar

    2007-12-01

    This work involves retinal image classification, for which a novel analysis system was developed. From the compressed domain, the proposed scheme extracts textural features from wavelet coefficients, which describe the relative homogeneity of localized areas of the retinal images. Since the discrete wavelet transform (DWT) is shift-variant, a shift-invariant DWT was explored to ensure that a robust feature set was extracted. To combat the small database size, linear discriminant analysis classification was used with the leave-one-out method. 38 normal and 48 abnormal images (exudates, large drusens, fine drusens, choroidal neovascularization, central vein and artery occlusion, histoplasmosis, arteriosclerotic retinopathy, hemi-central retinal vein occlusion and more) were used, and a specificity of 79% and sensitivity of 85.4% were achieved (the average classification rate is 82.2%). The success of the system can be attributed to the highly robust feature set, which included translation-, scale- and semi-rotation-invariant features. Additionally, this technique is database independent since the features were specifically tuned to the pathologies of the human eye.
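
    A generic version of this pipeline (shift-invariant wavelet features followed by leave-one-out linear discriminant analysis) can be sketched with PyWavelets' stationary wavelet transform; the random arrays below stand in for retinal images, and the wavelet, decomposition level, and energy features are assumptions rather than the paper's exact choices.

```python
import numpy as np
import pywt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)

def swt_energy_features(image, wavelet="db2", level=2):
    """Subband energies of an undecimated (shift-invariant) wavelet transform."""
    feats = []
    for approx, details in pywt.swt2(image, wavelet, level=level):
        for band in (approx, *details):
            feats.append(np.mean(band ** 2))
    return np.array(feats)

# toy surrogates for retinal images: 20 "normal" and 20 "abnormal" 64x64 arrays
images = rng.normal(size=(40, 64, 64))
images[20:] += rng.normal(scale=0.5, size=(20, 64, 64))  # crude texture difference
labels = np.r_[np.zeros(20), np.ones(20)]

X = np.stack([swt_energy_features(im) for im in images])
acc = cross_val_score(LinearDiscriminantAnalysis(), X, labels, cv=LeaveOneOut()).mean()
print(f"leave-one-out accuracy: {acc:.2f}")
```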

  4. The Paragon Algorithm, a next generation search engine that uses sequence temperature values and feature probabilities to identify peptides from tandem mass spectra.

    PubMed

    Shilov, Ignat V; Seymour, Sean L; Patel, Alpesh A; Loboda, Alex; Tang, Wilfred H; Keating, Sean P; Hunter, Christie L; Nuwaysir, Lydia M; Schaeffer, Daniel A

    2007-09-01

    The Paragon Algorithm, a novel database search engine for the identification of peptides from tandem mass spectrometry data, is presented. Sequence Temperature Values are computed using a sequence tag algorithm, allowing the degree of implication by an MS/MS spectrum of each region of a database to be determined on a continuum. Counter to conventional approaches, features such as modifications, substitutions, and cleavage events are modeled with probabilities rather than by discrete user-controlled settings to consider or not consider a feature. The use of feature probabilities in conjunction with Sequence Temperature Values allows for a very large increase in the effective search space with only a very small increase in the actual number of hypotheses that must be scored. The algorithm has a new kind of user interface that removes the user expertise requirement, presenting control settings in the language of the laboratory that are translated to optimal algorithmic settings. To validate this new algorithm, a comparison with Mascot is presented for a series of analogous searches to explore the relative impact of increasing search space probed with Mascot by relaxing the tryptic digestion conformance requirements from trypsin to semitrypsin to no enzyme and with the Paragon Algorithm using its Rapid mode and Thorough mode with and without tryptic specificity. Although they performed similarly for small search space, dramatic differences were observed in large search space. With the Paragon Algorithm, hundreds of biological and artifact modifications, all possible substitutions, and all levels of conformance to the expected digestion pattern can be searched in a single search step, yet the typical cost in search time is only 2-5 times that of conventional small search space. Despite this large increase in effective search space, there is no drastic loss of discrimination that typically accompanies the exploration of large search space.

  5. Assessing future vent opening locations at the Somma-Vesuvio volcanic complex: 1. A new information geodatabase with uncertainty characterizations

    NASA Astrophysics Data System (ADS)

    Tadini, A.; Bisson, M.; Neri, A.; Cioni, R.; Bevilacqua, A.; Aspinall, W. P.

    2017-06-01

    This study presents new and revised data sets about the spatial distribution of past volcanic vents, eruptive fissures, and regional/local structures of the Somma-Vesuvio volcanic system (Italy). The innovative features of the study are the identification and quantification of important sources of uncertainty affecting interpretations of the data sets. In this regard, the spatial uncertainty of each feature is modeled by an uncertainty area, i.e., a geometric element typically represented by a polygon drawn around points or lines. The new data sets have been assembled as an updatable geodatabase that integrates and complements existing databases for Somma-Vesuvio. The data are organized into 4 data sets and stored as 11 feature classes (points and lines for feature locations and polygons for the associated uncertainty areas), totaling more than 1700 elements. More specifically, volcanic vent and eruptive fissure elements are subdivided into feature classes according to their associated eruptive styles: (i) Plinian and sub-Plinian eruptions (i.e., large- or medium-scale explosive activity); (ii) violent Strombolian and continuous ash emission eruptions (i.e., small-scale explosive activity); and (iii) effusive eruptions (including eruptions from both parasitic vents and eruptive fissures). Regional and local structures (i.e., deep faults) are represented as linear feature classes. To support interpretation of the eruption data, additional data sets are provided for Somma-Vesuvio geological units and caldera morphological features. In the companion paper, the data presented here, and the associated uncertainties, are used to develop a first vent opening probability map for the Somma-Vesuvio caldera, with specific attention focused on large or medium explosive events.

  6. Comparison of machine learning methods for classifying mediastinal lymph node metastasis of non-small cell lung cancer from 18F-FDG PET/CT images.

    PubMed

    Wang, Hongkai; Zhou, Zongwei; Li, Yingci; Chen, Zhonghua; Lu, Peiou; Wang, Wenzhi; Liu, Wanyu; Yu, Lijuan

    2017-12-01

    This study aimed to compare one state-of-the-art deep learning method and four classical machine learning methods for classifying mediastinal lymph node metastasis of non-small cell lung cancer (NSCLC) from 18 F-FDG PET/CT images. Another objective was to compare the discriminative power of the recently popular PET/CT texture features with the widely used diagnostic features such as tumor size, CT value, SUV, image contrast, and intensity standard deviation. The four classical machine learning methods included random forests, support vector machines, adaptive boosting, and artificial neural network. The deep learning method was the convolutional neural networks (CNN). The five methods were evaluated using 1397 lymph nodes collected from PET/CT images of 168 patients, with corresponding pathology analysis results as gold standard. The comparison was conducted using 10 times 10-fold cross-validation based on the criterion of sensitivity, specificity, accuracy (ACC), and area under the ROC curve (AUC). For each classical method, different input features were compared to select the optimal feature set. Based on the optimal feature set, the classical methods were compared with CNN, as well as with human doctors from our institute. For the classical methods, the diagnostic features resulted in 81-85% ACC and 0.87-0.92 AUC, which were significantly higher than the results of texture features. CNN's sensitivity, specificity, ACC, and AUC were 84%, 88%, 86%, and 0.91, respectively. There was no significant difference between the results of CNN and the best classical method. The sensitivity, specificity, and ACC of human doctors were 73%, 90%, and 82%, respectively. All the five machine learning methods had higher sensitivities but lower specificities than human doctors. The present study shows that the performance of CNN is not significantly different from the best classical methods and human doctors for classifying mediastinal lymph node metastasis of NSCLC from PET/CT images. Because CNN does not need tumor segmentation or feature calculation, it is more convenient and more objective than the classical methods. However, CNN does not make use of the important diagnostic features, which have been proved more discriminative than the texture features for classifying small-sized lymph nodes. Therefore, incorporating the diagnostic features into CNN is a promising direction for future research.
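
    The evaluation protocol described here, 10 times 10-fold cross-validation of several classical classifiers on per-node features, is straightforward to reproduce generically with scikit-learn; the synthetic features below are stand-ins for the diagnostic features, and the model choices and hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# synthetic stand-in for per-node diagnostic features (size, SUV, contrast, ...)
X, y = make_classification(n_samples=600, n_features=12, n_informative=5,
                           weights=[0.7, 0.3], random_state=0)

models = {
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM (RBF)": SVC(probability=True, random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "neural net": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
}

# 10 repeats of stratified 10-fold cross-validation, scored by AUC
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    print(f"{name}: AUC = {auc.mean():.3f} +/- {auc.std():.3f}")
```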

  7. Decorrelation of the true and estimated classifier errors in high-dimensional settings.

    PubMed

    Hanczar, Blaise; Hua, Jianping; Dougherty, Edward R

    2007-01-01

    The aim of many microarray experiments is to build discriminatory diagnosis and prognosis models. Given the huge number of features and the small number of examples, model validity which refers to the precision of error estimation is a critical issue. Previous studies have addressed this issue via the deviation distribution (estimated error minus true error), in particular, the deterioration of cross-validation precision in high-dimensional settings where feature selection is used to mitigate the peaking phenomenon (overfitting). Because classifier design is based upon random samples, both the true and estimated errors are sample-dependent random variables, and one would expect a loss of precision if the estimated and true errors are not well correlated, so that natural questions arise as to the degree of correlation and the manner in which lack of correlation impacts error estimation. We demonstrate the effect of correlation on error precision via a decomposition of the variance of the deviation distribution, observe that the correlation is often severely decreased in high-dimensional settings, and show that the effect of high dimensionality on error estimation tends to result more from its decorrelating effects than from its impact on the variance of the estimated error. We consider the correlation between the true and estimated errors under different experimental conditions using both synthetic and real data, several feature-selection methods, different classification rules, and three error estimators commonly used (leave-one-out cross-validation, k-fold cross-validation, and .632 bootstrap). Moreover, three scenarios are considered: (1) feature selection, (2) known-feature set, and (3) all features. Only the first is of practical interest; however, the other two are needed for comparison purposes. We will observe that the true and estimated errors tend to be much more correlated in the case of a known feature set than with either feature selection or using all features, with the better correlation between the latter two showing no general trend, but differing for different models.
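
    The deviation-distribution question posed in this record can be probed with a toy Monte Carlo: repeatedly draw a small training sample, record the classifier's cross-validated error estimate and its true error approximated on a very large hold-out sample, then compute the correlation between the two across repetitions. The Gaussian model, sample sizes, and classifier below are illustrative assumptions, not the paper's design.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_features, delta = 20, 0.25
mu1 = np.full(n_features, delta)

def sample(n):
    """Draw a balanced two-class Gaussian sample."""
    X0 = rng.normal(size=(n // 2, n_features))
    X1 = rng.normal(loc=mu1, size=(n // 2, n_features))
    return np.vstack([X0, X1]), np.r_[np.zeros(n // 2), np.ones(n // 2)]

X_test, y_test = sample(10000)   # large sample approximates the true error
true_err, est_err = [], []
for _ in range(200):             # 200 independent small training sets
    X, y = sample(40)
    clf = LinearDiscriminantAnalysis().fit(X, y)
    true_err.append(1 - clf.score(X_test, y_test))
    est_err.append(1 - cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5).mean())

print("corr(true error, estimated error):", round(np.corrcoef(true_err, est_err)[0, 1], 3))
```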

  8. Constant size descriptors for accurate machine learning models of molecular properties

    NASA Astrophysics Data System (ADS)

    Collins, Christopher R.; Gordon, Geoffrey J.; von Lilienfeld, O. Anatole; Yaron, David J.

    2018-06-01

    Two different classes of molecular representations for use in machine learning of thermodynamic and electronic properties are studied. The representations are evaluated by monitoring the performance of linear and kernel ridge regression models on well-studied data sets of small organic molecules. One class of representations studied here counts the occurrence of bonding patterns in the molecule. These require only the connectivity of atoms in the molecule as may be obtained from a line diagram or a SMILES string. The second class utilizes the three-dimensional structure of the molecule. These include the Coulomb matrix and Bag of Bonds, which list the inter-atomic distances present in the molecule, and Encoded Bonds, which encode such lists into a feature vector whose length is independent of molecular size. The Encoded Bonds features introduced here have the advantage of leading to models that may be trained on smaller molecules and then used successfully on larger molecules. A wide range of feature sets are constructed by selecting, at each rank, either a graph or geometry-based feature. Here, rank refers to the number of atoms involved in the feature, e.g., atom counts are rank 1, while Encoded Bonds are rank 2. For atomization energies in the QM7 data set, the best graph-based feature set gives a mean absolute error of 3.4 kcal/mol. Inclusion of 3D geometry substantially enhances the performance, with Encoded Bonds giving 2.4 kcal/mol, when used alone, and 1.19 kcal/mol, when combined with graph features.

  9. Compact cancer biomarkers discovery using a swarm intelligence feature selection algorithm.

    PubMed

    Martinez, Emmanuel; Alvarez, Mario Moises; Trevino, Victor

    2010-08-01

    Biomarker discovery is a typical application from functional genomics. Due to the large number of genes studied simultaneously in microarray data, feature selection is a key step. Swarm intelligence has emerged as a solution for the feature selection problem. However, swarm intelligence settings for feature selection fail to select small feature subsets. We have proposed a swarm intelligence feature selection algorithm based on the initialization and update of only a subset of particles in the swarm. In this study, we tested our algorithm in 11 microarray datasets for brain, leukemia, lung, prostate, and others. We show that the proposed swarm intelligence algorithm successfully increases the classification accuracy and decreases the number of selected features compared to other swarm intelligence methods. Copyright © 2010 Elsevier Ltd. All rights reserved.
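
    A plain binary particle-swarm feature selection loop conveys the general mechanism behind such methods; this is not the authors' modified initialization-and-update scheme, and the sigmoid transfer rule, fitness penalty, swarm size, and synthetic data are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=100, n_features=60, n_informative=6, random_state=0)

def fitness(mask):
    """Cross-validated accuracy of the subset, lightly penalized by subset size."""
    if mask.sum() == 0:
        return 0.0
    acc = cross_val_score(SVC(), X[:, mask.astype(bool)], y, cv=3).mean()
    return acc - 0.01 * mask.sum()

n_particles, n_iter, dim = 15, 20, X.shape[1]
pos = (rng.random((n_particles, dim)) < 0.1).astype(float)   # sparse initialization
vel = np.zeros((n_particles, dim))
pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[np.argmax(pbest_val)].copy()

for _ in range(n_iter):
    r1, r2 = rng.random((2, n_particles, dim))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    prob = 1 / (1 + np.exp(-vel))          # sigmoid transfer for binary PSO
    pos = (rng.random((n_particles, dim)) < prob).astype(float)
    vals = np.array([fitness(p) for p in pos])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[np.argmax(pbest_val)].copy()

print("selected features:", np.flatnonzero(gbest))
```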

  10. Small Business Grants at the National Cancer Institute and National Institutes of Health

    NASA Astrophysics Data System (ADS)

    Baker, Houston

    2002-10-01

    Ten Federal Agencies set aside 2.5% of their external research budget for US small businesses—mainly for technology research and development, including radiation sensor system developments. Five agencies also set aside another 0.15% for the Small Business Technology Transfer Program, which is intended to facilitate technology transfers from research laboratories to public use through small businesses. The second largest of these agencies is the Department of Health and Human Services, and almost all of its extramural research funds flow through the 28 Institutes and Centers of the National Institutes of Health. For information, instructions, and application forms, visit the NIH website's Omnibus Solicitation for SBIR and STTR applications. The National Cancer Institute is the largest NIH research unit and SBIR/STTR participant. NCI also issues SBIR and STTR Program Announcements of its own that feature details modified to better support its initiatives and objectives in cancer prevention, detection, diagnosis, treatment, and monitoring.

  11. The Distribution and Behaviour of Photospheric Magnetic Features

    NASA Astrophysics Data System (ADS)

    Parnell, C. E.; Lamb, D. A.; DeForest, C. E.

    2014-12-01

    Over the past two decades enormous amounts of data on the magnetic fields of the solar photosphere have been produced by both ground-based (Kitt Peak & SOLIS), as well as space-based instruments (MDI, Hinode & HMI). In order to study the behaviour and distribution of photospheric magnetic features, efficient automated detection routines need to be utilised to identify and track magnetic features. In this talk, I will discuss the pros and cons of different automated magnetic feature identification and tracking routines with a special focus on the requirements of these codes to deal with the large data sets produced by HMI. By patching together results from Hinode and MDI (high-res & full-disk), the fluxes of magnetic features were found to follow a power-law over 5 orders of magnitude. At the strong flux tail of this distribution, the power law was found to fall off at solar minimum, but was maintained over all fluxes during solar maximum. However, the point of deflection in the power-law distribution occurs at a patching point between instruments and so questions remain over the reasons for the deflection. The feature fluxes determined from the superb high-resolution HMI data covers almost all of the 5 orders of magnitude. Considering both solar minimum and solar maximum HMI data sets, we investigate whether the power-law over 5 orders of magnitude in flux still holds. Furthermore, we investigate the behaviour of magnetic features in order to probe the nature of their origin. In particular, we analyse small-scale flux emergence events using HMI data to investigate the existence of a small-scale dynamo just below the solar photosphere.

  12. xMSanalyzer: automated pipeline for improved feature detection and downstream analysis of large-scale, non-targeted metabolomics data.

    PubMed

    Uppal, Karan; Soltow, Quinlyn A; Strobel, Frederick H; Pittard, W Stephen; Gernert, Kim M; Yu, Tianwei; Jones, Dean P

    2013-01-16

    Detection of low abundance metabolites is important for de novo mapping of metabolic pathways related to diet, microbiome or environmental exposures. Multiple algorithms are available to extract m/z features from liquid chromatography-mass spectral data in a conservative manner, which tends to preclude detection of low abundance chemicals and chemicals found in small subsets of samples. The present study provides software to enhance such algorithms for feature detection, quality assessment, and annotation. xMSanalyzer is a set of utilities for automated processing of metabolomics data. The utilities can be classified into four main modules to: 1) improve feature detection for replicate analyses by systematic re-extraction with multiple parameter settings and data merger to optimize the balance between sensitivity and reliability, 2) evaluate sample quality and feature consistency, 3) detect feature overlap between datasets, and 4) characterize high-resolution m/z matches to small molecule metabolites and biological pathways using multiple chemical databases. The package was tested with plasma samples and shown to more than double the number of features extracted while improving quantitative reliability of detection. MS/MS analysis of a random subset of peaks that were exclusively detected using xMSanalyzer confirmed that the optimization scheme improves detection of real metabolites. xMSanalyzer is a package of utilities for data extraction, quality control assessment, detection of overlapping and unique metabolites in multiple datasets, and batch annotation of metabolites. The program was designed to integrate with existing packages such as apLCMS and XCMS, but the framework can also be used to enhance data extraction for other LC/MS data software.

  13. NAVSUP Global Logistics Support

    DTIC Science & Technology

    2012-08-01

    Excerpt from a NAVAL SUPPLY SYSTEMS COMMAND briefing: Small Business support of $3.5 M; SB contracting actions; SB value; 35% of total spend to Small Business. Awarded Aug 2011, Features: 100% Small Business Set Aside; 25 multiple award task order contracts to 8...; other services now using as well. Fiscal Year 2011 Small Business Contracting Spend: 28,000 actions.

  14. Neuron’s eye view: Inferring features of complex stimuli from neural responses

    PubMed Central

    Chen, Xin; Beck, Jeffrey M.

    2017-01-01

    Experiments that study neural encoding of stimuli at the level of individual neurons typically choose a small set of features present in the world—contrast and luminance for vision, pitch and intensity for sound—and assemble a stimulus set that systematically varies along these dimensions. Subsequent analysis of neural responses to these stimuli typically focuses on regression models, with experimenter-controlled features as predictors and spike counts or firing rates as responses. Unfortunately, this approach requires knowledge in advance about the relevant features coded by a given population of neurons. For domains as complex as social interaction or natural movement, however, the relevant feature space is poorly understood, and an arbitrary a priori choice of features may give rise to confirmation bias. Here, we present a Bayesian model for exploratory data analysis that is capable of automatically identifying the features present in unstructured stimuli based solely on neuronal responses. Our approach is unique within the class of latent state space models of neural activity in that it assumes that firing rates of neurons are sensitive to multiple discrete time-varying features tied to the stimulus, each of which has Markov (or semi-Markov) dynamics. That is, we are modeling neural activity as driven by multiple simultaneous stimulus features rather than intrinsic neural dynamics. We derive a fast variational Bayesian inference algorithm and show that it correctly recovers hidden features in synthetic data, as well as ground-truth stimulus features in a prototypical neural dataset. To demonstrate the utility of the algorithm, we also apply it to cluster neural responses and demonstrate successful recovery of features corresponding to monkeys and faces in the image set. PMID:28827790

  15. Association of high proliferation marker Ki-67 expression with DCEMR imaging features of breast: a large scale evaluation

    NASA Astrophysics Data System (ADS)

    Saha, Ashirbani; Harowicz, Michael R.; Grimm, Lars J.; Kim, Connie E.; Ghate, Sujata V.; Walsh, Ruth; Mazurowski, Maciej A.

    2018-02-01

    One of the methods widely used to measure the proliferative activity of cells in breast cancer patients is the immunohistochemical (IHC) measurement of the percentage of cells stained for nuclear antigen Ki-67. Use of Ki-67 expression as a prognostic marker is still under investigation. However, numerous clinical studies have reported an association between a high Ki-67 and overall survival (OS) and disease free survival (DFS). On the other hand, to offer a non-invasive alternative for determining Ki-67 expression, researchers have made recent attempts to study the association of Ki-67 expression with magnetic resonance (MR) imaging features of breast cancer in small cohorts (<30). Here, we present a large scale evaluation of the relationship between imaging features and Ki-67 score as follows: (a) we used a set of 450 invasive breast cancer patients, (b) we extracted a set of 529 imaging features of shape and enhancement from breast, tumor and fibroglandular tissue of the patients, (c) we used a subset of patients as the training set to select features and trained a multivariate logistic regression model to predict high versus low Ki-67 values, and (d) we validated the performance of the trained model in an independent test set using the area under the receiver operating characteristic (ROC) curve (AUC) of the predicted values. Our model was able to predict high versus low Ki-67 in the test set with an AUC of 0.67 (95% CI: 0.58-0.75, p<1.1e-04). Thus, a moderate strength of association between Ki-67 values and MR-extracted imaging features was demonstrated in our experiments.
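
    The train/select/test workflow summarized in (a)-(d) can be mocked up generically: select features on the training split only, fit a multivariate logistic regression, and report AUC on the held-out set. The synthetic feature matrix, the univariate selection step, and the number of kept features below are assumptions, not the study's actual pipeline.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# stand-in for 529 shape/enhancement features from 450 patients
X = rng.normal(size=(450, 529))
coef = np.zeros(529); coef[:10] = 0.4
y = (X @ coef + rng.normal(size=450) > 0).astype(int)   # "high" vs "low" Ki-67

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

model = make_pipeline(
    StandardScaler(),
    SelectKBest(f_classif, k=20),        # feature selection on the training set only
    LogisticRegression(max_iter=1000),   # multivariate logistic regression
)
model.fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"held-out AUC: {auc:.2f}")
```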

  16. Arabic OCR: toward a complete system

    NASA Astrophysics Data System (ADS)

    El-Bialy, Ahmed M.; Kandil, Ahmed H.; Hashish, Mohamed; Yamany, Sameh M.

    1999-12-01

    Latin and Chinese OCR systems have been studied extensively in the literature. Yet little work has been performed on Arabic character recognition. This is due to the technical challenges found in the Arabic text. Due to its cursive nature, a powerful and stable text segmentation is needed. Also, features capturing the characteristics of the rich Arabic character representation are needed to build the Arabic OCR. In this paper a novel segmentation technique which is font and size independent is introduced. This technique can segment the cursive written text line even if the line suffers from small skewness. The technique is not sensitive to the location of the centerline of the text line and can segment different font sizes and types (for different character sets) occurring on the same line. Feature extraction is considered one of the most important phases of a text reading system. Ideally, the features extracted from a character image should capture the essential characteristics of this character that are independent of the font type and size. In such an ideal case, the classifier stores a single prototype per character. However, it is practically challenging to find such an ideal set of features. In this paper, a set of features that reflect the topological aspects of Arabic characters is proposed. These proposed features, integrated with a topological matching technique, introduce an Arabic text reading system that is semi-omnifont.

  17. Close-range sensors for small unmanned bottom vehicles: update

    NASA Astrophysics Data System (ADS)

    Bernstein, Charles L.

    2000-07-01

    The Surf Zone Reconnaissance Project is developing sensors for small, autonomous, Underwater Bottom-crawling Vehicles. The objective is to enable small, crawling robots to autonomously detect and classify mines and obstacles on the ocean bottom in depths between 0 and 10 feet. We have identified a promising set of techniques that will exploit the electromagnetic, shape, texture, image, and vibratory-modal features of these targets. During FY99 and FY00 we have worked toward refining these techniques. Signature data sets have been collected for a standard target set to facilitate the development of sensor fusion and target detection and classification algorithms. Specific behaviors, termed microbehaviors, are developed to utilize the robot's mobility to position and operate the sensors. A first generation, close-range sensor suite, composed of 5 sensors, will be completed and tested on a crawling platform in FY00, and will be further refined and demonstrated in FY01 as part of the Mine Countermeasures 6.3 core program sponsored by the Office of Naval Research.

  18. Predicting hot spots in protein interfaces based on protrusion index, pseudo hydrophobicity and electron-ion interaction pseudopotential features

    PubMed Central

    Xia, Junfeng; Yue, Zhenyu; Di, Yunqiang; Zhu, Xiaolei; Zheng, Chun-Hou

    2016-01-01

    The identification of hot spots, a small subset of protein interfaces that accounts for the majority of binding free energy, is becoming more important for the research of drug design and cancer development. Based on our previous methods (APIS and KFC2), here we proposed a novel hot spot prediction method. For each hot spot residue, we first constructed a wide variety of 108 sequence, structural, and neighborhood features to characterize potential hot spot residues, including conventional ones and a new one (pseudo hydrophobicity) exploited in this study. We then selected 3 top-ranking features that contribute the most in the classification by a two-step feature selection process consisting of a minimal-redundancy-maximal-relevance algorithm and an exhaustive search method. We used support vector machines to build our final prediction model. When testing our model on an independent test set, our method showed the highest F1-score of 0.70 and MCC of 0.46 compared with the existing state-of-the-art hot spot prediction methods. Our results indicate that these features are more effective than the conventional features considered previously, and that the combination of our and traditional features may support the creation of a discriminative feature set for efficient prediction of hot spots in protein interfaces. PMID:26934646

  19. Non-specific filtering of beta-distributed data.

    PubMed

    Wang, Xinhui; Laird, Peter W; Hinoue, Toshinori; Groshen, Susan; Siegmund, Kimberly D

    2014-06-19

    Non-specific feature selection is a dimension reduction procedure performed prior to cluster analysis of high dimensional molecular data. Not all measured features are expected to show biological variation, so only the most varying are selected for analysis. In DNA methylation studies, DNA methylation is measured as a proportion, bounded between 0 and 1, with variance a function of the mean. Filtering on standard deviation biases the selection of probes to those with mean values near 0.5. We explore the effect this has on clustering, and develop alternate filter methods that utilize a variance stabilizing transformation for Beta distributed data and do not share this bias. We compared results for 11 different non-specific filters on eight Infinium HumanMethylation data sets, selected to span a variety of biological conditions. We found that for data sets having a small fraction of samples showing abnormal methylation of a subset of normally unmethylated CpGs, a characteristic of the CpG island methylator phenotype in cancer, a novel filter statistic that utilized a variance-stabilizing transformation for Beta distributed data outperformed the common filter of using standard deviation of the DNA methylation proportion, or its log-transformed M-value, in its ability to detect the cancer subtype in a cluster analysis. However, the standard deviation filter always performed among the best for distinguishing subgroups of normal tissue. The novel filter and standard deviation filter tended to favour features in different genome contexts; for the same data set, the novel filter always selected more features from CpG island promoters and the standard deviation filter always selected more features from non-CpG island intergenic regions. Interestingly, despite selecting largely non-overlapping sets of features, the two filters did find sample subsets that overlapped for some real data sets. We found two different filter statistics that tended to prioritize features with different characteristics, each performed well for identifying clusters of cancer and non-cancer tissue, and identifying a cancer CpG island hypermethylation phenotype. Since cluster analysis is for discovery, we would suggest trying both filters on any new data sets, evaluating the overlap of features selected and clusters discovered.
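
    One common variance-stabilizing transformation for proportions is the arcsine square root; the record does not specify the novel filter statistic, so the sketch below simply contrasts the ranking of features by plain standard deviation of beta values, by standard deviation after arcsine square root stabilization, and by standard deviation of M-values, on a simulated beta-distributed methylation matrix. All distribution parameters are assumptions.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
# toy beta-distributed methylation matrix: 5000 CpGs x 60 samples,
# with per-CpG means spread across (0, 1)
a = rng.uniform(0.5, 5.0, size=(5000, 1))
b = rng.uniform(0.5, 5.0, size=(5000, 1))
betas = rng.beta(a, b, size=(5000, 60))

def top_features(values, k=500):
    """Indices of the k rows with the largest standard deviation."""
    return set(np.argsort(values.std(axis=1))[::-1][:k])

filters = {
    "beta SD": top_features(betas),                              # plain SD filter
    "arcsine-sqrt SD": top_features(np.arcsin(np.sqrt(betas))),  # variance stabilized
    "M-value SD": top_features(np.log2(betas / (1 - betas))),    # logit-like transform
}

# non-specific filters can prioritize largely non-overlapping feature sets
for (n1, s1), (n2, s2) in combinations(filters.items(), 2):
    print(f"overlap {n1} vs {n2}: {len(s1 & s2) / 500:.2f}")
```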

  20. Linguistic feature analysis for protein interaction extraction

    PubMed Central

    2009-01-01

    Background The rapid growth of the amount of publicly available reports on biomedical experimental results has recently caused a boost of text mining approaches for protein interaction extraction. Most approaches rely implicitly or explicitly on linguistic, i.e., lexical and syntactic, data extracted from text. However, only few attempts have been made to evaluate the contribution of the different feature types. In this work, we contribute to this evaluation by studying the relative importance of deep syntactic features, i.e., grammatical relations, shallow syntactic features (part-of-speech information) and lexical features. For this purpose, we use a recently proposed approach that uses support vector machines with structured kernels. Results Our results reveal that the contribution of the different feature types varies for the different data sets on which the experiments were conducted. The smaller the training corpus compared to the test data, the more important the role of grammatical relations becomes. Moreover, deep syntactic information based classifiers prove to be more robust on heterogeneous texts where no or only limited common vocabulary is shared. Conclusion Our findings suggest that grammatical relations play an important role in the interaction extraction task. Moreover, the net advantage of adding lexical and shallow syntactic features is small related to the number of added features. This implies that efficient classifiers can be built by using only a small fraction of the features that are typically being used in recent approaches. PMID:19909518

  1. UArizona at the CLEF eRisk 2017 Pilot Task: Linear and Recurrent Models for Early Depression Detection

    PubMed Central

    Sadeque, Farig; Xu, Dongfang; Bethard, Steven

    2017-01-01

    The 2017 CLEF eRisk pilot task focuses on automatically detecting depression as early as possible from a user's posts to Reddit. In this paper we present the techniques employed for the University of Arizona team's participation in this early risk detection shared task. We leveraged external information beyond the small training set, including a preexisting depression lexicon and concepts from the Unified Medical Language System as features. For prediction, we used both sequential (recurrent neural network) and non-sequential (support vector machine) models. Our models perform decently on the test data, and the recurrent neural models perform better than the non-sequential support vector machines while using the same feature sets. PMID:29075167

  2. Metastable structures and size effects in small group dynamics

    PubMed Central

    Lauro Grotto, Rosapia; Guazzini, Andrea; Bagnoli, Franco

    2014-01-01

    In his seminal works on group dynamics Bion defined a specific therapeutic setting allowing psychoanalytic observations on group phenomena. In describing the setting he proposed that the group was where his voice arrived. This physical limit was later made operative by assuming that the natural dimension of a therapeutic group is around 12 people. Bion introduced a theory of the group aspects of the mind in which proto-mental individual states spontaneously evolve into shared psychological states that are characterized by a series of features: (1) they emerge as a consequence of the natural tendency of (both conscious and unconscious) emotions to combine into structured group patterns; (2) they have a certain degree of stability in time; (3) they tend to alternate so that the dissolution of one is rapidly followed by the emergence of another; (4) they can be described in qualitative terms according to the nature of the emotional mix that dominates the state, in structural terms by a kind of typical “leadership” pattern, and in “cognitive” terms by a set of implicit expectations that are helpful in explaining the group behavior (i.e., the group behaves “as if” it was assuming that). Here we adopt a formal approach derived from Socio-physics in order to explore some of the structural and dynamic properties of this small group dynamics. We will describe data from an analytic DS model simulating small group interactions of agents endowed with a very simplified emotional and cognitive dynamic in order to assess the following main points: (1) are metastable collective states allowed to emerge in the model and if so, under which conditions in the parameter space? (2) can these states be differentiated in structural terms? (3) to what extent are the emergent dynamic features of the systems dependent on the system size? We will finally discuss possible future applications of the quantitative descriptions of the interaction structure in the small group clinical setting. PMID:25071665

  3. MS2Analyzer: A Software for Small Molecule Substructure Annotations from Accurate Tandem Mass Spectra

    PubMed Central

    2015-01-01

    Systematic analysis and interpretation of the large number of tandem mass spectra (MS/MS) obtained in metabolomics experiments is a bottleneck in discovery-driven research. MS/MS mass spectral libraries are small compared to all known small molecule structures and are often not freely available. MS2Analyzer was therefore developed to enable user-defined searches of thousands of spectra for mass spectral features such as neutral losses, m/z differences, and product and precursor ions from MS/MS spectra in MSP/MGF files. The software is freely available at http://fiehnlab.ucdavis.edu/projects/MS2Analyzer/. As the reference query set, 147 literature-reported neutral losses and their corresponding substructures were collected. This set was tested for accuracy of linking neutral loss analysis to substructure annotations using 19 329 accurate mass tandem mass spectra of structurally known compounds from the NIST11 MS/MS library. Validation studies showed that 92.1 ± 6.4% of 13 typical neutral losses such as acetylations, cysteine conjugates, or glycosylations correctly annotate the associated substructures, while the absence of mass spectral features does not necessarily imply the absence of such substructures. Use of this tool has been successfully demonstrated for complex lipids in microalgae. PMID:25263576
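
    The core operation described here, scanning MS/MS peak lists for user-defined neutral losses, reduces to a tolerance comparison between mass differences and a lookup table. The sketch below is not MS2Analyzer itself; the loss table is a small subset of standard monoisotopic values and the spectrum is invented for illustration.

```python
# Minimal neutral-loss search over an MS/MS peak list.
# The loss masses are standard monoisotopic values; the spectrum is made up.
NEUTRAL_LOSSES = {
    "H2O": 18.0106,
    "CO2": 43.9898,
    "C2H2O (acetylation)": 42.0106,
    "C6H10O5 (hexose)": 162.0528,
}

def find_neutral_losses(precursor_mz, fragment_mzs, tol=0.01):
    """Report which neutral losses connect the precursor to any fragment ion."""
    hits = []
    for name, loss in NEUTRAL_LOSSES.items():
        for mz in fragment_mzs:
            if abs((precursor_mz - mz) - loss) <= tol:
                hits.append((name, round(mz, 4)))
    return hits

precursor = 611.1607
fragments = [593.1502, 449.1079, 431.0973, 287.0550, 269.0444, 153.0182]
for name, mz in find_neutral_losses(precursor, fragments):
    print(f"{name}: fragment at m/z {mz}")
```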

  4. Learning in data-limited multimodal scenarios: Scandent decision forests and tree-based features.

    PubMed

    Hor, Soheil; Moradi, Mehdi

    2016-12-01

    Incomplete and inconsistent datasets often pose difficulties in multimodal studies. We introduce the concept of scandent decision trees to tackle these difficulties. Scandent trees are decision trees that optimally mimic the partitioning of the data determined by another decision tree, and crucially, use only a subset of the feature set. We show how scandent trees can be used to enhance the performance of decision forests trained on a small number of multimodal samples when we have access to larger datasets with vastly incomplete feature sets. Additionally, we introduce the concept of tree-based feature transforms in the decision forest paradigm. When combined with scandent trees, the tree-based feature transforms enable us to train a classifier on a rich multimodal dataset, and use it to classify samples with only a subset of features of the training data. Using this methodology, we build a model trained on MRI and PET images of the ADNI dataset, and then test it on cases with only MRI data. We show that this is significantly more effective in staging of cognitive impairments compared to a similar decision forest model trained and tested on MRI only, or one that uses other kinds of feature transform applied to the MRI data. Copyright © 2016. Published by Elsevier B.V.

  5. Effect of Written Presentation on Performance in Introductory Physics

    ERIC Educational Resources Information Center

    Stewart, John; Ballard, Shawn

    2010-01-01

    This study examined the written work of students in the introductory calculus-based electricity and magnetism course at the University of Arkansas. The students' solutions to hourly exams were divided into a small set of countable features organized into three major categories, mathematics, language, and graphics. Each category was further divided…

  6. An Analysis of Testing Time within a Mastery-Based Medical School Course.

    ERIC Educational Resources Information Center

    Wade, David R.; Williams, Reed G.

    1979-01-01

    Southern Illinois University School of Medicine's personalized teaching system has the following features: students are provided with behavioral objectives prior to instruction; passing levels for tests are set in advance and are independent of class performance; and the program is divided into small units and students are tested frequently. (LBH)

  7. Simulation of millimeter-wave body images and its application to biometric recognition

    NASA Astrophysics Data System (ADS)

    Moreno-Moreno, Miriam; Fierrez, Julian; Vera-Rodriguez, Ruben; Parron, Josep

    2012-06-01

    One of the emerging applications of the millimeter-wave imaging technology is its use in biometric recognition. This is mainly due to some properties of the millimeter-waves such as their ability to penetrate through clothing and other occlusions, their low obtrusiveness when collecting the image and the fact that they are harmless to health. In this work we first describe the generation of a database comprising 1200 synthetic images at 94 GHz obtained from the body of 50 people. Then we extract a small set of distance-based features from each image and select the best feature subsets for person recognition using the SFFS feature selection algorithm. Finally these features are used in body geometry authentication obtaining promising results.

  8. Sparse Substring Pattern Set Discovery Using Linear Programming Boosting

    NASA Astrophysics Data System (ADS)

    Kashihara, Kazuaki; Hatano, Kohei; Bannai, Hideo; Takeda, Masayuki

    In this paper, we consider finding a small set of substring patterns which classifies the given documents well. We formulate the problem as a 1-norm soft margin optimization problem where each dimension corresponds to a substring pattern. Then we solve this problem by using LPBoost and an optimal substring discovery algorithm. Since the problem is a linear program, the resulting solution is likely to be sparse, which is useful for feature selection. We evaluate the proposed method for real data such as movie reviews.
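
    LPBoost itself is not available in common Python libraries, so the sketch below substitutes another 1-norm-regularized linear model (L1-penalized logistic regression) over character n-gram features to illustrate the same point: the 1-norm drives most substring-pattern weights to exactly zero, yielding a small pattern set. The toy documents and all parameters are assumptions.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# toy "documents" with two classes
docs = ["great movie, wonderful acting", "terrible plot, awful pacing",
        "wonderful and moving", "awful, boring, terrible",
        "great direction and acting", "boring and predictable plot"]
labels = [1, 0, 1, 0, 1, 0]

# substring-like features: binary character n-grams of length 3-5
vec = CountVectorizer(analyzer="char_wb", ngram_range=(3, 5), binary=True)
X = vec.fit_transform(docs)

# 1-norm regularization keeps only a sparse set of pattern weights
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, labels)
kept = np.flatnonzero(clf.coef_[0])
print(f"{len(kept)} of {X.shape[1]} substring patterns kept")
print([vec.get_feature_names_out()[i] for i in kept[:10]])
```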

  9. Multi-class computational evolution: development, benchmark evaluation and application to RNA-Seq biomarker discovery.

    PubMed

    Crabtree, Nathaniel M; Moore, Jason H; Bowyer, John F; George, Nysia I

    2017-01-01

    A computational evolution system (CES) is a knowledge discovery engine that can identify subtle, synergistic relationships in large datasets. Pareto optimization allows CESs to balance accuracy with model complexity when evolving classifiers. Using Pareto optimization, a CES is able to identify a very small number of features while maintaining high classification accuracy. A CES can be designed for various types of data, and the user can exploit expert knowledge about the classification problem in order to improve discrimination between classes. These characteristics give CES an advantage over other classification and feature selection algorithms, particularly when the goal is to identify a small number of highly relevant, non-redundant biomarkers. Previously, CESs have been developed only for binary class datasets. In this study, we developed a multi-class CES. The multi-class CES was compared to three common feature selection and classification algorithms: support vector machine (SVM), random k-nearest neighbor (RKNN), and random forest (RF). The algorithms were evaluated on three distinct multi-class RNA sequencing datasets. The comparison criteria were run-time, classification accuracy, number of selected features, and stability of selected feature set (as measured by the Tanimoto distance). The performance of each algorithm was data-dependent. CES performed best on the dataset with the smallest sample size, indicating that CES has a unique advantage since the accuracy of most classification methods suffers when sample size is small. The multi-class extension of CES increases the appeal of its application to complex, multi-class datasets in order to identify important biomarkers and features.
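
    The stability criterion mentioned here (Tanimoto similarity of selected feature sets) is easy to compute for any selector: repeat the selection on different training folds and compare the resulting index sets pairwise. The sketch below uses a simple univariate selector on synthetic data purely as a stand-in; the selector, fold scheme, and subset size are assumptions.

```python
import numpy as np
from itertools import combinations
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(n_samples=120, n_features=200, n_informative=8, random_state=0)

def tanimoto(a, b):
    """Jaccard/Tanimoto similarity between two feature index sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# run feature selection repeatedly on different training folds
subsets = []
for train_idx, _ in StratifiedKFold(n_splits=5, shuffle=True, random_state=0).split(X, y):
    sel = SelectKBest(f_classif, k=15).fit(X[train_idx], y[train_idx])
    subsets.append(np.flatnonzero(sel.get_support()))

scores = [tanimoto(a, b) for a, b in combinations(subsets, 2)]
print(f"mean pairwise Tanimoto similarity: {np.mean(scores):.2f}")
```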

  10. Deep-learning derived features for lung nodule classification with limited datasets

    NASA Astrophysics Data System (ADS)

    Thammasorn, P.; Wu, W.; Pierce, L. A.; Pipavath, S. N.; Lampe, P. D.; Houghton, A. M.; Haynor, D. R.; Chaovalitwongse, W. A.; Kinahan, P. E.

    2018-02-01

    Only a few percent of indeterminate nodules found in lung CT images are cancer. However, enabling earlier diagnosis is important to avoid invasive procedures or long-term surveillance for benign nodules. We are evaluating a classification framework using radiomics features derived with a machine learning approach from a small data set of indeterminate CT lung nodule images. We used a retrospective analysis of 194 cases with pulmonary nodules in the CT images with or without contrast enhancement from lung cancer screening clinics. The nodules were contoured by a radiologist and texture features of the lesion were calculated. In addition, semantic features describing shape were categorized. We also explored a Multiband network, a feature derivation path that uses a modified convolutional neural network (CNN) with a Triplet Network. This was trained to create discriminative feature representations useful for variable-sized nodule classification. The diagnostic accuracy was evaluated for multiple machine learning algorithms using texture, shape, and CNN features. In the CT contrast-enhanced group, the texture or semantic shape features yielded an overall diagnostic accuracy of 80%. Use of a standard deep learning network in the framework for feature derivation yielded features that substantially underperformed compared to texture and/or semantic features. However, the proposed Multiband approach of feature derivation produced results similar in diagnostic accuracy to the texture and semantic features. While the Multiband feature derivation approach did not outperform the texture and/or semantic features, its equivalent performance indicates promise for future improvements to increase diagnostic accuracy. Importantly, the Multiband approach adapts readily to different size lesions without interpolation, and performed well with a relatively small amount of training data.

  11. Martian North Polar Impacts and Volcanoes: Feature Discrimination and Comparisons to Global Trends

    NASA Technical Reports Server (NTRS)

    Sakimoto, E. H.; Weren, S. L.

    2003-01-01

    The recent Mars Global Surveyor and Mars Odyssey Missions have greatly improved our available data for the north polar region of Mars. Pre-MGS and MO studies proposed possible volcanic features, and the new data have revealed numerous volcanoes and impact craters in a range of weathering states that were poorly visible or not visible in prior data sets. This new data has helped in the reassessment of the polar deposits. From images or shaded Mars Orbiter Laser Altimeter (MOLA) topography grids alone, it has proved to be difficult to differentiate cratered cones of probable volcanic origins from impact craters that appear to have been filled. It is important that the distinction is made if possible, as the relative ages of the polar deposits hinge on small numbers of craters, and the local volcanic regime originally only proposed small numbers of volcanoes. Therefore, we have expanded prior work on detailed topographic parameter measurements and modeling for the polar volcanic landforms and mapped and measured all of the probable volcanic and impact features for the north polar region as well as other midlatitude fields, and suggest that: 1) The polar volcanic edifices are significantly different topographically from midlatitude edifices, and have steeper slopes and larger craters as a group; 2) The impact craters are actually distinct from the volcanoes in terms of the feature volume that is cavity compared to feature volume that is positive relief; 3) There are actually several distinct types of volcanic edifices present; 4) These types tend to be spatially grouped by edifice. This is a contrast to many of the other small volcanic fields around Mars, where small edifices tend to be mixed types within a field.

  12. Genetic Programming and Frequent Itemset Mining to Identify Feature Selection Patterns of iEEG and fMRI Epilepsy Data

    PubMed Central

    Smart, Otis; Burrell, Lauren

    2014-01-01

    Pattern classification for intracranial electroencephalogram (iEEG) and functional magnetic resonance imaging (fMRI) signals has furthered epilepsy research toward understanding the origin of epileptic seizures and localizing dysfunctional brain tissue for treatment. Prior research has demonstrated that implicitly selecting features with a genetic programming (GP) algorithm more effectively determined the proper features to discern biomarker and non-biomarker interictal iEEG and fMRI activity than conventional feature selection approaches. However for each the iEEG and fMRI modalities, it is still uncertain whether the stochastic properties of indirect feature selection with a GP yield (a) consistent results within a patient data set and (b) features that are specific or universal across multiple patient data sets. We examined the reproducibility of implicitly selecting features to classify interictal activity using a GP algorithm by performing several selection trials and subsequent frequent itemset mining (FIM) for separate iEEG and fMRI epilepsy patient data. We observed within-subject consistency and across-subject variability with some small similarity for selected features, indicating a clear need for patient-specific features and possible need for patient-specific feature selection or/and classification. For the fMRI, using nearest-neighbor classification and 30 GP generations, we obtained over 60% median sensitivity and over 60% median selectivity. For the iEEG, using nearest-neighbor classification and 30 GP generations, we obtained over 65% median sensitivity and over 65% median selectivity except one patient. PMID:25580059

  13. Regional climate model sensitivity to domain size

    NASA Astrophysics Data System (ADS)

    Leduc, Martin; Laprise, René

    2009-05-01

    Regional climate models are increasingly used to add small-scale features that are not present in their lateral boundary conditions (LBC). It is well known that the limited area over which a model is integrated must be large enough to allow the full development of small-scale features. On the other hand, integrations on very large domains have shown important departures from the driving data, unless large scale nudging is applied. The issue of domain size is studied here by using the “perfect model” approach. This method consists first of generating a high-resolution climatic simulation, nicknamed big brother (BB), over a large domain of integration. The next step is to degrade this dataset with a low-pass filter emulating the usual coarse-resolution LBC. The filtered nesting data (FBB) are hence used to drive a set of four simulations (LBs for Little Brothers), with the same model, but on progressively smaller domain sizes. The LB statistics for a climate sample of four winter months are compared with BB over a common region. The time average (stationary) and transient-eddy standard deviation patterns of the LB atmospheric fields generally improve in terms of spatial correlation with the reference (BB) when domain gets smaller. The extraction of the small-scale features by using a spectral filter allows detecting important underestimations of the transient-eddy variability in the vicinity of the inflow boundary, which can penalize the use of small domains (less than 100 × 100 grid points). The permanent “spatial spin-up” corresponds to the characteristic distance that the large-scale flow needs to travel before developing small-scale features. The spin-up distance tends to grow in size at higher levels in the atmosphere.

  14. A Realistic Data Warehouse Project: An Integration of Microsoft Access[R] and Microsoft Excel[R] Advanced Features and Skills

    ERIC Educational Resources Information Center

    King, Michael A.

    2009-01-01

    Business intelligence derived from data warehousing and data mining has become one of the most strategic management tools today, providing organizations with long-term competitive advantages. Business school curriculums and popular database textbooks cover data warehousing, but the examples and problem sets typically are small and unrealistic. The…

  15. Automated Scoring of L2 Spoken English with Random Forests

    ERIC Educational Resources Information Center

    Kobayashi, Yuichiro; Abe, Mariko

    2016-01-01

    The purpose of the present study is to assess second language (L2) spoken English using automated scoring techniques. Automated scoring aims to classify a large set of learners' oral performance data into a small number of discrete oral proficiency levels. In automated scoring, objectively measurable features such as the frequencies of lexical and…

  16. The Department of Defense Very High Speed Integrated Circuit (VHSIC) Technology Availability Program Plan for the Committees on Armed Services United States Congress.

    DTIC Science & Technology

    1986-06-30

    features of computer aided design systems and statistical quality control procedures that are generic to chip sets and processes. RADIATION HARDNESS - The... System; PSP Programmable Signal Processor; SSI Small Scale Integration; TOW Tube Launched, Optically Tracked, Wire Guided; TTL Transistor-Transistor Logic

  17. Impacting Readiness: Nature and Nurture

    ERIC Educational Resources Information Center

    Healy, Jane M.

    2011-01-01

    Whereas some four year olds could draw a person with five fingers on each hand and a full set of facial features, others could barely hold a pencil. Some sat quietly in a small group, intently listening to and understanding a story, while others wiggled, fidgeted, and couldn't focus their attention. In those days, before the explosion of…

  18. Benefits and Pitfalls of Multimedia and Interactive Features in Technology-Enhanced Storybooks: A Meta-Analysis

    ERIC Educational Resources Information Center

    Takacs, Zsofia K.; Swart, Elise K.; Bus, Adriana G.

    2015-01-01

    A meta-analysis was conducted on the effects of technology-enhanced stories for young children's literacy development when compared to listening to stories in more traditional settings like storybook reading. A small but significant additional benefit of technology was found for story comprehension (g+ = 0.17) and expressive vocabulary (g+ =…

  19. Enabling Quantitative Optical Imaging for In-die-capable Critical Dimension Targets

    PubMed Central

    Barnes, B.M.; Henn, M.-A.; Sohn, M. Y.; Zhou, H.; Silver, R. M.

    2017-01-01

    Dimensional scaling trends will eventually bring semiconductor critical dimensions (CDs) down to only a few atoms in width. New optical techniques are required to address the measurement and variability of these CDs using sufficiently small in-die metrology targets. Recently, Qin et al. [Light Sci Appl, 5, e16038 (2016)] demonstrated quantitative model-based measurements of finite sets of lines with features as small as 16 nm using 450 nm wavelength light. This paper uses simulation studies, augmented with experiments at 193 nm wavelength, to adapt and optimize the finite sets of features that work as in-die-capable metrology targets with minimal increases in parametric uncertainty. A finite-element-based solver for time-harmonic Maxwell's equations yields two- and three-dimensional simulations of the electromagnetic scattering for optimizing the design of such targets as functions of reduced line lengths, fewer lines, fewer focal positions, smaller critical dimensions, and shorter illumination wavelength. Metrology targets that exceeded performance requirements are as short as 3 μm for 193 nm light, feature as few as eight lines, and are extensible to sub-10 nm CDs. Targets measured at 193 nm can be fifteen times smaller in area than current state-of-the-art scatterometry targets described in the literature. This new methodology is demonstrated to be a promising alternative for optical model-based in-die CD metrology. PMID:28757674

  20. The value of teaching about geomorphology in non-traditional settings

    NASA Astrophysics Data System (ADS)

    Davis, R. Laurence

    2002-10-01

    Academics usually teach about geomorphology in the classroom, where the audience is enthusiastic, but generally small. Less traditional settings offer opportunities to reach a wider audience, one that is equally enthusiastic, given its love of geomorphic features in the National Parks, but one which has little knowledge of the science behind what they are seeing. I have "taught" geomorphology in four non-traditional settings: at a summer camp, a state wildlife refuge, on community field trips, and at meetings for clubs and government boards. This paper discusses my experiences and offers suggestions to others who may wish to follow this less-traveled educational path. As Head of Nature Programs at Camp Pemigewassett in New Hampshire, I have worked, over the last 33 years, with thousands of campers ranging from 8 to 15 years old. Our setting, in a glaciated valley on a small lake, exhibits a wide range of geomorphic features and offers many opportunities for direct learning through field investigations. I have found that even 8-year-olds can do real science, if we avoid the jargon. Once "taught," they carry their knowledge about landforms and processes with them and eagerly share it with their friends and family on outings and trips, thus reaching an even wider public. Parks, wildlife refuges, nature preserves, and other similar areas generally have nature trails, often with educational information about the environment. Generally, interpretive signs are prepared by biologists and the content ignores the site's physical features, as well as the connections between ecological communities and the underlying geology and geomorphology. My students and I have addressed this situation at two places in Connecticut, one a state wildlife management area, also used for training teachers to teach Environmental Education, and the other, a town recreation area. We catalogued the geomorphic features, looked at relationships of the community-level ecology to those features, and prepared interpretive signs that added this perspective to the trails. The public response has been extremely favorable. Geomorphology can also be taught by leading field trips for community organizations. I have done this twice, once for the Manchester (NH) Historical Society and once for a small watershed association. The attendance and interest surprised me. We finally had to limit the Manchester trip to one full busload (~45) and the watershed trip, which was part of a "trails day," drew over 90 people. Finally, I have found that organizations such as Sierra Club chapters and town conservation boards are frequently looking for speakers for their periodic meetings. Why not a geomorphologist? After all, much of what conservationists do is related to what geomorphologists do. I have given several of these presentations and the receptions have always been enthusiastic. While the work involved in preparing to teach in one of these non-traditional settings is frequently substantial, the rewards are equally large. It is a way to reach masses of people who know little about the science of geomorphology and to demonstrate its importance to them. Taking our message directly to the public in these settings is an effective way to put geomorphology in the public eye.

  1. JEMRMS Small Satellite Deployment Observation

    NASA Image and Video Library

    2012-10-04

    ISS033-E-009334 (4 Oct. 2012) --- Several tiny satellites are featured in this image photographed by an Expedition 33 crew member on the International Space Station. The satellites were released outside the Kibo laboratory using a Small Satellite Orbital Deployer attached to the Japanese module’s robotic arm on Oct. 4, 2012. Japan Aerospace Exploration Agency astronaut Aki Hoshide, flight engineer, set up the satellite deployment gear inside the lab and placed it in the Kibo airlock. The Japanese robotic arm then grappled the deployment system and its satellites from the airlock for deployment.

  2. JEMRMS Small Satellite Deployment Observation

    NASA Image and Video Library

    2012-10-04

    ISS033-E-009458 (4 Oct. 2012) --- Several tiny satellites are featured in this image photographed by an Expedition 33 crew member on the International Space Station. The satellites were released outside the Kibo laboratory using a Small Satellite Orbital Deployer attached to the Japanese module’s robotic arm on Oct. 4, 2012. Japan Aerospace Exploration Agency astronaut Aki Hoshide, flight engineer, set up the satellite deployment gear inside the lab and placed it in the Kibo airlock. The Japanese robotic arm then grappled the deployment system and its satellites from the airlock for deployment.

  3. Genetic Algorithms and Classification Trees in Feature Discovery: Diabetes and the NHANES database

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Heredia-Langner, Alejandro; Jarman, Kristin H.; Amidan, Brett G.

    2013-09-01

    This paper presents a feature selection methodology that can be applied to datasets containing a mixture of continuous and categorical variables. Using a Genetic Algorithm (GA), this method explores a dataset and selects a small set of features relevant for the prediction of a binary (1/0) response. Binary classification trees and an objective function based on conditional probabilities are used to measure the fitness of a given subset of features. The method is applied to health data in order to find factors useful for the prediction of diabetes. Results show that our algorithm is capable of narrowing down the set of predictors to around 8 factors that can be validated using reputable medical and public health resources.
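
    A minimal sketch of this kind of GA-based wrapper selection is shown below, assuming a generic tabular data set (a scikit-learn toy set stands in for the NHANES tables) and a shallow classification tree as the fitness model; the population size, penalty weight and rates are illustrative, not the paper's settings.

    import numpy as np
    from sklearn.datasets import load_breast_cancer   # stand-in for the NHANES tables
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X, y = load_breast_cancer(return_X_y=True)
    n_features = X.shape[1]

    def fitness(mask):
        """Cross-validated accuracy of a shallow classification tree on the selected
        columns, lightly penalised by subset size so the GA prefers small sets."""
        if not mask.any():
            return 0.0
        acc = cross_val_score(DecisionTreeClassifier(max_depth=3, random_state=0),
                              X[:, mask], y, cv=5).mean()
        return acc - 0.002 * mask.sum()

    def run_ga(pop_size=24, generations=20, mutation_rate=0.03):
        population = rng.random((pop_size, n_features)) < 0.2   # sparse random masks
        for _ in range(generations):
            scores = np.array([fitness(m) for m in population])
            children = []
            for _ in range(pop_size):
                # Binary tournament selection for each parent
                a, b = rng.integers(pop_size, size=2)
                p1 = population[a] if scores[a] >= scores[b] else population[b]
                a, b = rng.integers(pop_size, size=2)
                p2 = population[a] if scores[a] >= scores[b] else population[b]
                # Uniform crossover followed by bit-flip mutation
                cross = rng.random(n_features) < 0.5
                child = np.where(cross, p1, p2)
                child ^= rng.random(n_features) < mutation_rate
                children.append(child)
            population = np.array(children)
        scores = np.array([fitness(m) for m in population])
        return population[scores.argmax()], scores.max()

    best_mask, best_score = run_ga()
    print("selected features:", np.flatnonzero(best_mask), "fitness:", round(best_score, 3))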

  4. Benefits and Pitfalls of Multimedia and Interactive Features in Technology-Enhanced Storybooks

    PubMed Central

    Takacs, Zsofia K.; Swart, Elise K.; Bus, Adriana G.

    2015-01-01

    A meta-analysis was conducted on the effects of technology-enhanced stories for young children’s literacy development when compared to listening to stories in more traditional settings like storybook reading. A small but significant additional benefit of technology was found for story comprehension (g+ = 0.17) and expressive vocabulary (g+ = 0.20), based on data from 2,147 children in 43 studies. When investigating the different characteristics of technology-enhanced stories, multimedia features like animated pictures, music, and sound effects were found beneficial. In contrast, interactive elements like hotspots, games, and dictionaries were found to be distracting. Especially for children disadvantaged because of less stimulating family environments, multimedia features were helpful and interactive features were detrimental. Findings are discussed from the perspective of cognitive processing theories. PMID:26640299

  5. Benefits and Pitfalls of Multimedia and Interactive Features in Technology-Enhanced Storybooks: A Meta-Analysis.

    PubMed

    Takacs, Zsofia K; Swart, Elise K; Bus, Adriana G

    2015-12-01

    A meta-analysis was conducted on the effects of technology-enhanced stories for young children's literacy development when compared to listening to stories in more traditional settings like storybook reading. A small but significant additional benefit of technology was found for story comprehension (g+ = 0.17) and expressive vocabulary (g+ = 0.20), based on data from 2,147 children in 43 studies. When investigating the different characteristics of technology-enhanced stories, multimedia features like animated pictures, music, and sound effects were found beneficial. In contrast, interactive elements like hotspots, games, and dictionaries were found to be distracting. Especially for children disadvantaged because of less stimulating family environments, multimedia features were helpful and interactive features were detrimental. Findings are discussed from the perspective of cognitive processing theories.

  6. A novel scheme for abnormal cell detection in Pap smear images

    NASA Astrophysics Data System (ADS)

    Zhao, Tong; Wachman, Elliot S.; Farkas, Daniel L.

    2004-07-01

    Finding malignant cells in Pap smear images is a "needle in a haystack"-type problem, tedious, labor-intensive and error-prone. It is therefore desirable to have an automatic screening tool in order that human experts can concentrate on the evaluation of the more difficult cases. Most research on automatic cervical screening tries to extract morphometric and texture features at the cell level, in accordance with the NIH "The Bethesda System" rules. Due to variances in image quality and features, such as brightness, magnification and focus, morphometric and texture analysis is insufficient to provide robust cervical cancer detection. Using a microscopic spectral imaging system, we have produced a set of multispectral Pap smear images with wavelengths from 400 nm to 690 nm, containing both spectral signatures and spatial attributes. We describe a novel scheme that combines spatial information (including texture and morphometric features) with spectral information to significantly improve abnormal cell detection. Three kinds of wavelet features, orthogonal, bi-orthogonal and non-orthogonal, are carefully chosen to optimize recognition performance. Multispectral feature sets are then extracted in the wavelet domain. Using a Back-Propagation Neural Network classifier that greatly decreases the influence of spurious events, we obtain a classification error rate of 5%. Cell morphometric features, such as area and shape, are then used to eliminate most remaining small artifacts. We report initial results from 149 cells from 40 separate image sets, in which only one abnormal cell was missed (TPR = 97.6%) and one normal cell was falsely classified as cancerous (FPR = 1%).
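
    The wavelet-feature plus back-propagation-network stage can be sketched as follows, assuming single-channel patches and one bi-orthogonal wavelet family (the paper combines several wavelet types and multispectral bands); the patch data below are synthetic.

    import numpy as np
    import pywt                                  # PyWavelets
    from sklearn.neural_network import MLPClassifier

    def wavelet_energy_features(image, wavelet="bior2.2", level=2):
        """Energy of each wavelet subband of a single-channel image patch."""
        coeffs = pywt.wavedec2(image, wavelet, level=level)
        feats = [np.mean(coeffs[0] ** 2)]                   # approximation energy
        for (cH, cV, cD) in coeffs[1:]:                     # detail subbands
            feats += [np.mean(cH ** 2), np.mean(cV ** 2), np.mean(cD ** 2)]
        return np.array(feats)

    # Toy training run on synthetic patches standing in for normal vs. abnormal
    # cells; a back-propagation network is a multilayer perceptron here.
    rng = np.random.default_rng(1)
    normal = [rng.normal(0, 1, (32, 32)) for _ in range(40)]
    abnormal = [rng.normal(0, 1, (32, 32)) + np.outer(np.hanning(32), np.hanning(32)) * 3
                for _ in range(40)]
    X = np.array([wavelet_energy_features(p) for p in normal + abnormal])
    y = np.array([0] * 40 + [1] * 40)
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X, y)
    print("training accuracy:", clf.score(X, y))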

  7. Deep 3D Convolutional Encoder Networks With Shortcuts for Multiscale Feature Integration Applied to Multiple Sclerosis Lesion Segmentation.

    PubMed

    Brosch, Tom; Tang, Lisa Y W; Youngjin Yoo; Li, David K B; Traboulsee, Anthony; Tam, Roger

    2016-05-01

    We propose a novel segmentation approach based on deep 3D convolutional encoder networks with shortcut connections and apply it to the segmentation of multiple sclerosis (MS) lesions in magnetic resonance images. Our model is a neural network that consists of two interconnected pathways, a convolutional pathway, which learns increasingly more abstract and higher-level image features, and a deconvolutional pathway, which predicts the final segmentation at the voxel level. The joint training of the feature extraction and prediction pathways allows for the automatic learning of features at different scales that are optimized for accuracy for any given combination of image types and segmentation task. In addition, shortcut connections between the two pathways allow high- and low-level features to be integrated, which enables the segmentation of lesions across a wide range of sizes. We have evaluated our method on two publicly available data sets (MICCAI 2008 and ISBI 2015 challenges) with the results showing that our method performs comparably to the top-ranked state-of-the-art methods, even when only relatively small data sets are available for training. In addition, we have compared our method with five freely available and widely used MS lesion segmentation methods (EMS, LST-LPA, LST-LGA, Lesion-TOADS, and SLS) on a large data set from an MS clinical trial. The results show that our method consistently outperforms these other methods across a wide range of lesion sizes.

  8. Munitions related feature extraction from LIDAR data.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Roberts, Barry L.

    2010-06-01

    The characterization of former military munitions ranges is critical in the identification of areas likely to contain residual unexploded ordnance (UXO). Although these ranges are large, often covering tens-of-thousands of acres, the actual target areas represent only a small fraction of the sites. The challenge is that many of these sites do not have records indicating locations of former target areas. The identification of target areas is critical in the characterization and remediation of these sites. The Strategic Environmental Research and Development Program (SERDP) and Environmental Security Technology Certification Program (ESTCP) of the DoD have been developing and implementing techniques for the efficient characterization of large munitions ranges. As part of this process, high-resolution LIDAR terrain data sets have been collected over several former ranges. These data sets have been shown to contain information relating to former munitions usage at these ranges, specifically terrain cratering due to high-explosives detonations. The location and relative intensity of crater features can provide information critical in reconstructing the usage history of a range, and indicate areas most likely to contain UXO. We have developed an automated procedure using an adaptation of the Circular Hough Transform for the identification of crater features in LIDAR terrain data. The Circular Hough Transform is highly adept at finding circular features (craters) in noisy terrain data sets. This technique has the ability to find features of a specific radius providing a means of filtering features based on expected scale and providing additional spatial characterization of the identified feature. This method of automated crater identification has been applied to several former munitions ranges with positive results.
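
    A compact illustration of the crater-detection idea, using the Circular Hough Transform as implemented in scikit-image on a synthetic elevation grid; the thresholds and radii below are placeholders, not the authors' settings.

    import numpy as np
    from skimage.draw import disk
    from skimage.feature import canny
    from skimage.transform import hough_circle, hough_circle_peaks

    # Synthetic "terrain" with two crater-like depressions standing in for a
    # gridded LIDAR elevation raster.
    terrain = np.zeros((200, 200))
    for center, radius in [((60, 70), 12), ((140, 120), 18)]:
        rr, cc = disk(center, radius, shape=terrain.shape)
        terrain[rr, cc] -= 1.0                     # crater = local depression

    edges = canny(terrain, sigma=2)                # rim edges of the depressions
    candidate_radii = np.arange(8, 25)             # expected crater scale (in cells)
    hough = hough_circle(edges, candidate_radii)
    accums, cx, cy, radii = hough_circle_peaks(hough, candidate_radii,
                                               total_num_peaks=2)
    for x, y, r, a in zip(cx, cy, radii, accums):
        print(f"crater candidate at ({y}, {x}), radius {r} cells, response {a:.2f}")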

  9. Young children's use of features to reorient is more than just associative: further evidence against a modular view of spatial processing.

    PubMed

    Newcombe, Nora S; Ratliff, Kristin R; Shallcross, Wendy L; Twyman, Alexandra D

    2010-01-01

    Proponents of a geometric module have argued that instances of young children's use of features as well as geometry to reorient can be explained by a two-stage process. In this model, only the first stage is a true reorientation, accomplished by using geometric information alone; features are considered in a second stage using association (Lee, Shusterman & Spelke, 2006). This account is contradicted by the data from two experiments. Experiment 1a sets the stage for Experiment 1b by showing that young children use geometric information to reorient in a complex geometric figure without a single principal axis of symmetry (an octagon). In such a figure, there are two sets of geometrically congruent corners, with four corners in each set. The addition of a colored wall leads to the existence of three geometrically congruent but, crucially, all unmarked corners; using the colored wall to distinguish among them could not be done associatively. In Experiment 1b, both 3- and 5-year-old children showed true non-associative reorientation using features by performing at above-chance levels on all-white trials. Experiment 2 used a paradigm without distinctive geometry, modeled on Lee et al. (2006), involving an equilateral triangle of hiding places located within a circular enclosure, but with a large stable feature rather than a small moveable one. Four-year-olds (the age group studied by Lee et al.) used features at above-chance levels. Thus, features can be used to reorient, in a way not dependent on association, in contradiction to the two-stage version of the modular view.

  10. Classification of health webpages as expert and non expert with a reduced set of cross-language features.

    PubMed

    Grabar, Natalia; Krivine, Sonia; Jaulent, Marie-Christine

    2007-10-11

    Making the distinction between expert and non-expert health documents can help users to select the information that is most suitable for them, according to whether or not they are familiar with medical terminology. This issue is particularly important for the information retrieval area. In our work we address this purpose through stylistic corpus analysis and the application of machine learning algorithms. Our hypothesis is that this distinction can be performed on the basis of a small number of features and that such features can be language- and domain-independent. The features were acquired from a source corpus (Russian language, diabetes topic) and then tested on both the target (French language, pneumology topic) and source corpora. These cross-language features show 90% precision and 93% recall for non-expert documents in the source language, and 85% precision and 74% recall for expert documents in the target language.

  11. Binary classification of aqueous solubility using support vector machines with reduction and recombination feature selection.

    PubMed

    Cheng, Tiejun; Li, Qingliang; Wang, Yanli; Bryant, Stephen H

    2011-02-28

    Aqueous solubility is recognized as a critical parameter in both early- and late-stage drug discovery. Therefore, in silico modeling of solubility has attracted extensive interest in recent years. Most previous studies have been limited to relatively small data sets with limited diversity, which in turn limits the predictability of the derived models. In this work, we present a support vector machine model for the binary classification of solubility by taking advantage of the largest known public data set, which contains over 46 000 compounds with experimental solubility. Our model was optimized in combination with a reduction and recombination feature selection strategy. The best model demonstrated robust performance in both cross-validation and prediction of two independent test sets, indicating it could be a practical tool to select soluble compounds for screening, purchasing, and synthesizing. Moreover, because it relies entirely on public resources, our work may be used for the comparative evaluation of solubility classification studies.
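
    A hedged sketch of the modeling setup: the paper's reduction-and-recombination feature selection is approximated here by a simple univariate filter feeding an RBF-kernel SVM, and a synthetic descriptor matrix stands in for the public solubility data set.

    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Synthetic stand-in for a descriptor matrix of compounds labelled
    # soluble (1) / insoluble (0); the real data set has >46,000 compounds.
    X, y = make_classification(n_samples=2000, n_features=200, n_informative=25,
                               random_state=0)

    # Univariate filter + RBF-kernel SVM (an approximation of the paper's pipeline).
    model = make_pipeline(StandardScaler(),
                          SelectKBest(f_classif, k=50),
                          SVC(kernel="rbf", C=10, gamma="scale"))
    print("5-fold CV accuracy:", cross_val_score(model, X, y, cv=5).mean().round(3))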

  12. You Know You Have a Rockin' Artroom When...

    ERIC Educational Resources Information Center

    Stevens, Lori

    2006-01-01

    Lori Stevens teaches in the art department of Orland High School, a small high school of 600 students in Orland, California. Her program includes Art 1, Studio Art, and Advanced Placement all in one room, with one budget, and one teacher. She has been teaching art in this setting for twenty years. This article features a color photo of her class…

  13. Acetabular rim and surface segmentation for hip surgery planning and dysplasia evaluation

    NASA Astrophysics Data System (ADS)

    Tan, Sovira; Yao, Jianhua; Yao, Lawrence; Summers, Ronald M.; Ward, Michael M.

    2008-03-01

    Knowledge of the acetabular rim and surface can be invaluable for hip surgery planning and dysplasia evaluation. The acetabular rim can also be used as a landmark for registration purposes. At present, acetabular features are mostly extracted manually at great cost in time and human labor. Using a recent level set algorithm that can evolve on the surface of a 3D object represented by a triangular mesh, we automatically extracted the rims and surfaces of acetabula. The level set is guided by curvature features on the mesh. It can segment portions of a surface that are bounded by a line of extremal curvature (ridgeline or crestline). The rim of the acetabulum is such an extremal curvature line. Our material consists of eight hemi-pelvis surfaces. The algorithm is initiated by putting a small circle (level set seed) at the center of the acetabular surface. Because this surface distinctively has the form of a cup, we were able to use the Shape Index feature to automatically extract an approximate center. The circle then expands and deforms so as to take the shape of the acetabular rim. The results were visually inspected. Only minor errors were detected. The algorithm also proved to be robust. Seed placement was satisfactory for the eight hemi-pelvis surfaces without changing any parameters. For the level set evolution we were able to use a single set of parameters for seven out of eight surfaces.

  14. Evolution properties of the community members for dynamic networks

    NASA Astrophysics Data System (ADS)

    Yang, Kai; Guo, Qiang; Li, Sheng-Nan; Han, Jing-Ti; Liu, Jian-Guo

    2017-03-01

    The collective behaviors of community members for dynamic social networks are significant for understanding the evolution features of communities. In this Letter, we empirically investigate the evolution properties of new community members for dynamic networks. Firstly, we separate the data sets into different slices, and analyze the statistical properties of new members as well as the communities they joined for these data sets. Then we introduce a parameter φ to describe community evolution between different slices and investigate the dynamic community properties of the new community members. The empirical analyses for the Facebook, APS, Enron and Wiki data sets indicate that both the number of new members and the number of joined communities increase, the ratio declines rapidly and then becomes stable over time, and most of the new members join small communities (size s ≤ 10). Furthermore, the proportion of new members in existing communities decreases at first and then becomes stable and relatively small for these data sets. Our work may be helpful for deeply understanding the evolution properties of community members for social networks.

  15. Supervised neural network classification of pre-sliced cooked pork ham images using quaternionic singular values.

    PubMed

    Valous, Nektarios A; Mendoza, Fernando; Sun, Da-Wen; Allen, Paul

    2010-03-01

    The quaternionic singular value decomposition is a technique to decompose a quaternion matrix (representation of a colour image) into quaternion singular vector and singular value component matrices exposing useful properties. The objective of this study was to use a small portion of uncorrelated singular values, as robust features for the classification of sliced pork ham images, using a supervised artificial neural network classifier. Images were acquired from four qualities of sliced cooked pork ham typically consumed in Ireland (90 slices per quality), having similar appearances. Mahalanobis distances and Pearson product moment correlations were used for feature selection. Six highly discriminating features were used as input to train the neural network. An adaptive feedforward multilayer perceptron classifier was employed to obtain a suitable mapping from the input dataset. The overall correct classification performance for the training, validation and test set were 90.3%, 94.4%, and 86.1%, respectively. The results confirm that the classification performance was satisfactory. Extracting the most informative features led to the recognition of a set of different but visually quite similar textural patterns based on quaternionic singular values. Copyright 2009 Elsevier Ltd. All rights reserved.

  16. Process-based interpretation of conceptual hydrological model performance using a multinational catchment set

    NASA Astrophysics Data System (ADS)

    Poncelet, Carine; Merz, Ralf; Merz, Bruno; Parajka, Juraj; Oudin, Ludovic; Andréassian, Vazken; Perrin, Charles

    2017-08-01

    Most previous assessments of hydrologic model performance are fragmented, based on small numbers of catchments and different methods or time periods, and do not link the results to landscape or climate characteristics. This study uses large-sample hydrology to identify major catchment controls on daily runoff simulations. It is based on a conceptual lumped hydrological model (GR6J), a collection of 29 catchment characteristics, a multinational set of 1103 catchments located in Austria, France, and Germany and four runoff model efficiency criteria. Two analyses are conducted to assess how features and criteria are linked: (i) a one-dimensional analysis based on the Kruskal-Wallis test and (ii) a multidimensional analysis based on regression trees and investigating the interplay between features. The catchment features most affecting model performance are the flashiness of precipitation and streamflow (computed as the ratio of absolute day-to-day fluctuations to the total amount in a year), the seasonality of evaporation, the catchment area, and the catchment aridity. Nonflashy, nonseasonal, large, and nonarid catchments show the best performance for all the tested criteria. We argue that this higher performance is due to fewer nonlinear responses (higher correlation between precipitation and streamflow) and lower input and output variability for such catchments. Finally, we show that, compared to national sets, multinational sets increase results transferability because they explore a wider range of hydroclimatic conditions.
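
    The one-dimensional analysis can be illustrated as below: group catchments by quartiles of one candidate control (here, synthetic flashiness values) and apply the Kruskal-Wallis test to the corresponding efficiency scores. The data and effect sizes are invented for the example.

    import numpy as np
    from scipy.stats import kruskal

    rng = np.random.default_rng(2)

    # Hypothetical per-catchment values: a model efficiency criterion (e.g., NSE)
    # and one candidate control, precipitation flashiness.
    nse = rng.normal(0.7, 0.15, 1103)
    flashiness = rng.gamma(2.0, 0.2, 1103)
    nse -= 0.3 * flashiness                      # build in a dependence for the demo

    # Split catchments into flashiness quartiles and test whether the efficiency
    # distributions differ between the groups.
    quartiles = np.quantile(flashiness, [0.25, 0.5, 0.75])
    groups = [nse[flashiness <= quartiles[0]],
              nse[(flashiness > quartiles[0]) & (flashiness <= quartiles[1])],
              nse[(flashiness > quartiles[1]) & (flashiness <= quartiles[2])],
              nse[flashiness > quartiles[2]]]
    stat, p = kruskal(*groups)
    print(f"Kruskal-Wallis H = {stat:.1f}, p = {p:.2e}")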

  17. The structure of Jupiter’s main ring from New Horizons: A comparison with other ring-moon systems

    NASA Astrophysics Data System (ADS)

    Chancia, Robert; Hedman, Matthew

    2018-04-01

    During New Horizon’s Jupiter flyby in 2007, the Long-Range Reconnaissance Imager (LORRI) took several images of the planet’s main ring. The data set contains two extended image-movies of the main ring, along with several brief observations at varying ring azimuths, and a small set of high phase angle images. Thus far, the only published work on the New Horizons Jupiter rings data set found seven bright clumps with sub-km equivalent radii embedded in the main ring (Showalter et al. 2007 Science). In this work, we searched the inner region of the main ring for any structures that might be perturbed at the 3:2 resonances with the rotation of Jupiter’s magnetic field or massive storms. We also examined the structure of the outer main ring in order to assess how it is shaped by the small moons Metis and Adrastea. Some of the features seen in Jupiter’s main ring are similar to those found in other dusty rings around Saturn, Uranus, and Neptune. By comparing these different rings, we can gain a better understanding of how small moons sculpt tenuous rings.

  18. Volatility return intervals analysis of the Japanese market

    NASA Astrophysics Data System (ADS)

    Jung, W.-S.; Wang, F. Z.; Havlin, S.; Kaizoji, T.; Moon, H.-T.; Stanley, H. E.

    2008-03-01

    We investigate scaling and memory effects in return intervals between price volatilities above a certain threshold q for the Japanese stock market using daily and intraday data sets. We find that the distribution of return intervals can be approximated by a scaling function that depends only on the ratio between the return interval τ and its mean <τ>. We also find memory effects such that a large (or small) return interval follows a large (or small) interval by investigating the conditional distribution and mean return interval. The results are similar to previous studies of other markets and indicate that similar statistical features appear in different financial markets. We also compare our results between the period before and after the big crash at the end of 1989. We find that scaling and memory effects of the return intervals show similar features although the statistical properties of the returns are different.
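
    A short sketch of the return-interval computation (illustrative only; the toy returns are i.i.d. Student-t draws, not market data): threshold the volatility at a quantile q, take the gaps between exceedances, and rescale by their mean.

    import numpy as np

    rng = np.random.default_rng(3)
    returns = rng.standard_t(df=4, size=100_000)          # heavy-tailed toy returns
    volatility = np.abs(returns)

    def scaled_return_intervals(vol, quantile=0.95):
        """Return intervals tau between exceedances of threshold q, scaled by <tau>."""
        q = np.quantile(vol, quantile)
        exceedance_times = np.flatnonzero(vol > q)
        tau = np.diff(exceedance_times)
        return tau / tau.mean()

    # If the scaling picture holds, these distributions collapse onto one curve.
    for quant in (0.90, 0.95, 0.99):
        scaled = scaled_return_intervals(volatility, quant)
        print(f"threshold quantile {quant}: mean {scaled.mean():.2f}, "
              f"P(tau/<tau> > 3) = {(scaled > 3).mean():.3f}")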

  19. Differential absorption lidar observation on small-time-scale features of water vapor in the atmospheric boundary layer

    NASA Astrophysics Data System (ADS)

    Kong, Wei; Li, Jiatang; Liu, Hao; Chen, Tao; Hong, Guanglie; Shu, Rong

    2017-11-01

    Observation of small-time-scale features of water vapor density is essential for studies of turbulence, convection and many other fast atmospheric processes. Because the elastic signal acquired by differential absorption lidar has a high signal-to-noise ratio, the technique has great potential for all-day water vapor turbulence observation. This paper presents a differential absorption lidar at 935 nm, developed by the Shanghai Institute of Technical Physics of the Chinese Academy of Sciences, for water vapor turbulence observation. A case at midday is presented to demonstrate the daytime observation ability of this system. The "autocovariance method" is used to separate the contribution of water vapor fluctuation from random error. The results show that the relative error is less than 10% at a temporal and spatial resolution of 10 seconds and 60 meters in the ABL. This indicates that the system has excellent performance for daytime water vapor turbulence observation.
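
    The separation of water-vapor fluctuations from random error can be illustrated with a Lenschow-style autocovariance extrapolation, which is one common reading of the "autocovariance method"; the signal below is synthetic and the lag choice is arbitrary.

    import numpy as np

    def autocovariance(x, max_lag):
        x = x - x.mean()
        return np.array([np.mean(x[:len(x) - k] * x[k:]) for k in range(max_lag + 1)])

    def separate_fluctuation_from_noise(series, fit_lags=(1, 2, 3)):
        """Split the lag-0 variance of a water-vapour time series into an
        atmospheric part and an uncorrelated-noise part by extrapolating the
        autocovariance from small nonzero lags back to lag 0."""
        acov = autocovariance(series, max(fit_lags))
        # Linear extrapolation of acov(lag) for the fit lags down to lag 0
        coeffs = np.polyfit(fit_lags, acov[list(fit_lags)], deg=1)
        atmospheric_var = np.polyval(coeffs, 0.0)
        noise_var = acov[0] - atmospheric_var
        return atmospheric_var, noise_var

    # Toy example: a slowly varying humidity fluctuation plus white measurement noise.
    rng = np.random.default_rng(4)
    t = np.arange(2000)
    signal = np.convolve(rng.normal(size=t.size), np.ones(20) / 20, mode="same")
    noisy = signal + rng.normal(scale=0.1, size=t.size)
    print(separate_fluctuation_from_noise(noisy))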

  20. In silico quantitative structure-toxicity relationship study of aromatic nitro compounds.

    PubMed

    Pasha, Farhan Ahmad; Neaz, Mohammad Morshed; Cho, Seung Joo; Ansari, Mohiuddin; Mishra, Sunil Kumar; Tiwari, Sharvan

    2009-05-01

    Small molecules often have toxicities that are a function of molecular structural features. Minor variations in structural features can make a large difference in such toxicity. Consequently, in silico techniques may be used to correlate such molecular toxicities with their structural features. For nine different sets of aromatic nitro compounds having known observed toxicities against different targets, we developed ligand-based 2D quantitative structure-toxicity relationship models using 20 selected topological descriptors. Topological descriptors have several advantages, such as conformational independence and fast, straightforward computation, while still yielding good results. Multiple linear regression analysis was used to correlate variations of toxicity with molecular properties. The information index on molecular size, lopping centric index and Kier flexibility index were identified as fundamental descriptors for different kinds of toxicity, and further showed that molecular size, branching and molecular flexibility might be particularly important factors in quantitative structure-toxicity relationship analysis. This study revealed that topological descriptor-guided quantitative structure-toxicity relationship provided a very useful, cost- and time-efficient, in silico tool for describing small-molecule toxicities.
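
    A minimal sketch of the regression step: ordinary multiple linear regression of a toxicity endpoint on a few topological descriptors, with synthetic values standing in for the real descriptor tables.

    import numpy as np

    rng = np.random.default_rng(5)

    # Hypothetical descriptor matrix for 60 aromatic nitro compounds: the columns
    # stand in for an information index on molecular size, a centric index and
    # the Kier flexibility index (values are synthetic).
    X = rng.normal(size=(60, 3))
    true_coeffs = np.array([0.8, -0.4, 0.3])
    log_toxicity = X @ true_coeffs + 1.5 + rng.normal(scale=0.2, size=60)

    # Ordinary multiple linear regression: toxicity ~ intercept + descriptors.
    design = np.column_stack([np.ones(60), X])
    beta, *_ = np.linalg.lstsq(design, log_toxicity, rcond=None)
    predicted = design @ beta
    r2 = 1 - np.sum((log_toxicity - predicted) ** 2) / np.sum(
        (log_toxicity - log_toxicity.mean()) ** 2)
    print("fitted coefficients:", np.round(beta, 2), " R^2:", round(r2, 3))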

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Galavis, P; Friedman, K; Chandarana, H

    Purpose: Radiomics involves the extraction of texture features from different imaging modalities with the aim of developing models to predict patient treatment outcomes. The purpose of this study is to investigate texture feature reproducibility across [18F]FDG PET/CT and [18F]FDG PET/MR imaging in patients with primary malignancies. Methods: Twenty-five prospective patients with solid tumors underwent a clinical [18F]FDG PET/CT scan followed by [18F]FDG PET/MR scans. In all patients the lesions were identified using nuclear medicine reports. The images were co-registered and segmented using an in-house auto-segmentation method. Fifty features, based on the intensity histogram and second- and high-order matrices, were extracted from the segmented regions of both image data sets. A one-way random-effects ANOVA model of the intra-class correlation coefficient (ICC) was used to establish texture feature correlations between both data sets. Results: The fifty features were classified, based on their ICC values, which ranged from 0.1 to 0.86, into three categories: high, intermediate, and low. Ten features extracted from second- and high-order matrices showed large ICC ≥ 0.70. Seventeen features presented intermediate 0.5 ≤ ICC ≤ 0.65 and the remaining twenty-three presented low ICC ≤ 0.45. Conclusion: Features with large ICC values could be reliable candidates for quantification as they lead to similar results from both imaging modalities. Features with small ICC indicate a lack of correlation. Therefore, the use of these features as a quantitative measure will lead to different assessments of the same lesion depending on the imaging modality from which they are extracted. This study shows the need for further investigation and standardization of features across multiple imaging modalities.
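
    The ICC computation from a one-way random-effects ANOVA can be sketched as follows for a single feature measured on both modalities; the lesion values below are simulated.

    import numpy as np

    def icc_one_way(measurements):
        """ICC(1) from a one-way random-effects ANOVA.

        measurements: array of shape (n_subjects, k) -- here, one texture feature's
        value for each lesion (rows) measured on PET/CT and PET/MR (columns).
        """
        measurements = np.asarray(measurements, dtype=float)
        n, k = measurements.shape
        grand_mean = measurements.mean()
        subject_means = measurements.mean(axis=1)
        ss_between = k * np.sum((subject_means - grand_mean) ** 2)
        ss_within = np.sum((measurements - subject_means[:, None]) ** 2)
        ms_between = ss_between / (n - 1)
        ms_within = ss_within / (n * (k - 1))
        return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

    # Toy example: 25 lesions, two modalities, one feature that agrees well.
    rng = np.random.default_rng(6)
    lesion_effect = rng.normal(size=25)
    pet_ct = lesion_effect + rng.normal(scale=0.3, size=25)
    pet_mr = lesion_effect + rng.normal(scale=0.3, size=25)
    print("ICC =", round(icc_one_way(np.column_stack([pet_ct, pet_mr])), 2))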

  2. Machine learning approaches to diagnosis and laterality effects in semantic dementia discourse.

    PubMed

    Garrard, Peter; Rentoumi, Vassiliki; Gesierich, Benno; Miller, Bruce; Gorno-Tempini, Maria Luisa

    2014-06-01

    Advances in automatic text classification have been necessitated by the rapid increase in the availability of digital documents. Machine learning (ML) algorithms can 'learn' from data: for instance, an ML system can be trained on a set of features derived from written texts belonging to known categories, and learn to distinguish between them. Such a trained system can then be used to classify unseen texts. In this paper, we explore the potential of the technique to classify transcribed speech samples along clinical dimensions, using vocabulary data alone. We report the accuracy with which two related ML algorithms [naive Bayes Gaussian (NBG) and naive Bayes multinomial (NBM)] categorized picture descriptions produced by: 32 semantic dementia (SD) patients versus 10 healthy, age-matched controls; and SD patients with left- (n = 21) versus right-predominant (n = 11) patterns of temporal lobe atrophy. We used information gain (IG) to identify the vocabulary features that were most informative to each of these two distinctions. In the SD versus control classification task, both algorithms achieved accuracies of greater than 90%. In the right- versus left-temporal lobe predominant classification, NBM achieved a high level of accuracy (88%), but this was achieved by both NBM and NBG when the features used in the training set were restricted to those with high values of IG. The most informative features for the patient versus control task were low-frequency content words, generic terms and components of metanarrative statements. For the right versus left task the number of informative lexical features was too small to support any specific inferences. An enriched feature set, including values derived from Quantitative Production Analysis (QPA), may shed further light on this little-understood distinction. Copyright © 2013 Elsevier Ltd. All rights reserved.
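
    A small sketch of the IG-plus-naive-Bayes pipeline, using mutual information as the information-gain ranking and a multinomial naive Bayes classifier; the toy transcripts below are invented stand-ins for the picture descriptions.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.feature_selection import SelectKBest, mutual_info_classif
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Toy picture-description transcripts standing in for patient vs. control speech.
    texts = ["the boy is taking a thing from the thing up there",
             "a stool and the mother is washing dishes at the sink",
             "the thing is falling over and stuff is going everywhere",
             "the boy on the stool reaches for the cookie jar on the shelf",
             "there is water and she does not notice it at all",
             "the girl asks for a cookie while the sink overflows"] * 5
    labels = [1, 0, 1, 0, 1, 0] * 5               # 1 = patient-like, 0 = control-like

    # Vocabulary counts, features ranked by mutual information (an IG analogue),
    # then a multinomial naive Bayes classifier, mirroring the NBM + IG pipeline.
    model = make_pipeline(CountVectorizer(),
                          SelectKBest(mutual_info_classif, k=10),
                          MultinomialNB())
    print("CV accuracy:", cross_val_score(model, texts, labels, cv=5).mean().round(2))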

  3. Small blob identification in medical images using regional features from optimum scale.

    PubMed

    Zhang, Min; Wu, Teresa; Bennett, Kevin M

    2015-04-01

    Recent advances in medical imaging technology have greatly enhanced imaging-based diagnosis, which requires computationally efficient and accurate algorithms to process the images (e.g., measure the objects) for quantitative assessment. In this research, we are interested in one type of imaging objects: small blobs. Examples of small blob objects are cells in histopathology images, glomeruli in MR images, etc. This problem is particularly challenging because the small blobs often have inhomogeneous intensity distributions and indistinct boundaries against the background. Yet, in general, these blobs have similar sizes. Motivated by this finding, we propose a novel detector termed Hessian-based Laplacian of Gaussian (HLoG) using scale space theory as the foundation. As in most imaging detectors, an image is first smoothed via LoG. Hessian analysis is then launched to identify the single optimal scale on which a presegmentation is conducted. The advantage of the Hessian process is that it is capable of delineating the blobs. As a result, regional features can be retrieved. These features enable an unsupervised clustering algorithm for postpruning, which should be more robust and sensitive than the traditional threshold-based postpruning commonly used in most imaging detectors. To test the performance of the proposed HLoG, two sets of 2-D grey medical images are studied. HLoG is compared against three state-of-the-art detectors: generalized LoG, Radial-Symmetry and LoG using precision, recall, and F-score metrics. We observe that HLoG statistically outperforms the compared detectors.
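
    The core HLoG idea (LoG smoothing at a single scale followed by Hessian analysis) can be sketched as below; this is an illustration of the principle on a synthetic image, not the published detector with its clustering-based post-pruning.

    import numpy as np
    from scipy.ndimage import gaussian_filter, gaussian_laplace

    def hessian_blob_candidates(image, sigma):
        """Smooth with a Laplacian of Gaussian at one scale, then keep pixels
        whose two Hessian eigenvalues are both negative (bright, blob-like)."""
        log_response = -gaussian_laplace(image, sigma)      # bright blobs -> positive
        smoothed = gaussian_filter(image, sigma)
        gy, gx = np.gradient(smoothed)
        hyy, hyx = np.gradient(gy)
        hxy, hxx = np.gradient(gx)
        # Eigenvalues of the 2x2 symmetric Hessian at every pixel
        trace = hxx + hyy
        det = hxx * hyy - hxy * hyx
        disc = np.sqrt(np.maximum(trace ** 2 / 4 - det, 0))
        lam1, lam2 = trace / 2 + disc, trace / 2 - disc
        return (lam1 < 0) & (lam2 < 0) & (log_response > log_response.mean())

    # Toy image with two Gaussian "blobs" of similar size on a noisy background.
    rng = np.random.default_rng(7)
    yy, xx = np.mgrid[0:100, 0:100]
    image = rng.normal(scale=0.05, size=(100, 100))
    for cy, cx in [(30, 40), (70, 60)]:
        image += np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * 4.0 ** 2))
    mask = hessian_blob_candidates(image, sigma=4.0)
    print("candidate blob pixels:", int(mask.sum()))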

  4. Object Classification With Joint Projection and Low-Rank Dictionary Learning.

    PubMed

    Foroughi, Homa; Ray, Nilanjan; Hong Zhang

    2018-02-01

    For an object classification system, the most critical obstacles toward real-world applications are often caused by large intra-class variability, arising from different lightings, occlusion, and corruption, in limited sample sets. Most methods in the literature would fail when the training samples are heavily occluded, corrupted or have significant illumination or viewpoint variations. Moreover, most existing methods, especially deep learning-based methods, need large training sets to achieve satisfactory recognition performance. Although pre-training a network on a generic large-scale data set and fine-tuning it on the small target data set is a widely used technique, this does not help when the content of the base and target data sets is very different. To address these issues simultaneously, we propose a joint projection and low-rank dictionary learning method using dual graph constraints. Specifically, a structured class-specific dictionary is learned in the low-dimensional space, and the discrimination is further improved by imposing a graph constraint on the coding coefficients that maximizes the intra-class compactness and inter-class separability. We enforce structural incoherence and low-rank constraints on sub-dictionaries to reduce the redundancy among them, and also make them robust to variations and outliers. To preserve the intrinsic structure of the data, we introduce a supervised neighborhood graph into the framework to make the proposed method robust to small-sized and high-dimensional data sets. Experimental results on several benchmark data sets verify the superior performance of our method for object classification on small-sized data sets, which include a considerable amount of different kinds of variation and may have high-dimensional feature vectors.

  5. Small-scale structural heterogeneity and well-communication problems in the Granny Creek oil field of West Virginia

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zheng, L.; Wilson, T.H.; Shumaker, R.C.

    1993-08-01

    Seismic interpretations of the Granny Creek oil field in West Virginia suggest the presence of numerous small-scale fracture zones and faults. Seismic disruptions interpreted as faults and/or fracture zones are represented by abrupt reflection offsets, local amplitude reductions, and waveform changes. These features are enhanced through reprocessing, and the majority of the improvements to the data result from the surface-consistent application of zero-phase deconvolution. Reprocessing yields a 20% improvement in resolution. Seismic interpretations of these features as small faults and fracture zones are supported by nearby offset vertical seismic profiles and by their proximity to wells between which direct communication occurs during waterflooding. Four sets of faults are interpreted based on subsurface and seismic data. Direct interwell communication is interpreted to be associated only with a northeast-trending set of faults, which are believed to have detached structural origins. Subsequent reactivation of deeper basement faults may have opened fractures along this trend. These faults have a limited effect on primary production, but cause many well-communication problems and reduce secondary production. Seismic detection of these zones is important to the economic and effective design of secondary recovery operations, because direct well communication often results in significant reduction of sweep efficiency during waterflooding. Prior information about the location of these zones would allow secondary recovery operations to avoid potential problem areas and increase oil recovery.

  6. A Temporal Pattern Mining Approach for Classifying Electronic Health Record Data

    PubMed Central

    Batal, Iyad; Valizadegan, Hamed; Cooper, Gregory F.; Hauskrecht, Milos

    2013-01-01

    We study the problem of learning classification models from complex multivariate temporal data encountered in electronic health record systems. The challenge is to define a good set of features that are able to represent well the temporal aspect of the data. Our method relies on temporal abstractions and temporal pattern mining to extract the classification features. Temporal pattern mining usually returns a large number of temporal patterns, most of which may be irrelevant to the classification task. To address this problem, we present the Minimal Predictive Temporal Patterns framework to generate a small set of predictive and non-spurious patterns. We apply our approach to the real-world clinical task of predicting patients who are at risk of developing heparin induced thrombocytopenia. The results demonstrate the benefit of our approach in efficiently learning accurate classifiers, which is a key step for developing intelligent clinical monitoring systems. PMID:25309815

  7. Two-Phase and Graph-Based Clustering Methods for Accurate and Efficient Segmentation of Large Mass Spectrometry Images.

    PubMed

    Dexter, Alex; Race, Alan M; Steven, Rory T; Barnes, Jennifer R; Hulme, Heather; Goodwin, Richard J A; Styles, Iain B; Bunch, Josephine

    2017-11-07

    Clustering is widely used in MSI to segment anatomical features and differentiate tissue types, but existing approaches are both CPU and memory-intensive, limiting their application to small, single data sets. We propose a new approach that uses a graph-based algorithm with a two-phase sampling method that overcomes this limitation. We demonstrate the algorithm on a range of sample types and show that it can segment anatomical features that are not identified using commonly employed algorithms in MSI, and we validate our results on synthetic MSI data. We show that the algorithm is robust to fluctuations in data quality by successfully clustering data with a designed-in variance using data acquired with varying laser fluence. Finally, we show that this method is capable of generating accurate segmentations of large MSI data sets acquired on the newest generation of MSI instruments and evaluate these results by comparison with histopathology.

  8. Remote Sensing Observations of Thunderstorm Features in Latvia

    NASA Astrophysics Data System (ADS)

    Avotniece, Zanita; Briede, Agrita; Klavins, Maris; Aniskevich, Svetlana

    2017-12-01

    Thunderstorms are the most hazardous meteorological phenomena in Latvia in the summer season, and the assessment of their characteristics is essential for the development of an effective national climate and weather prediction service. However, the complex nature of convective processes sets specific limitations to their observation, analysis and forecasting. Therefore, the aim of this study is to analyse thunderstorm features associated with severe thunderstorms observed in weather radar and satellite data in Latvia over the period 2006-2015. The obtained results confirm the applicability of the selected thunderstorm features for thunderstorm nowcasting and analysis in Latvia. The most frequent features observed on days with thunderstorms were maximum radar reflectivities exceeding 50 dBZ and the occurrence of overshooting tops and tilted updrafts, while the occurrence of gravity waves, V-shaped storm structures and small ice particles has been found to be a useful indicator of increased thunderstorm severity potential.

  9. Improved initial guess with semi-subpixel level accuracy in digital image correlation by feature-based method

    NASA Astrophysics Data System (ADS)

    Zhang, Yunlu; Yan, Lei; Liou, Frank

    2018-05-01

    The quality of the initial guess of deformation parameters in digital image correlation (DIC) has a serious impact on the convergence, robustness, and efficiency of the subsequent subpixel-level searching stage. In this work, an improved feature-based initial guess (FB-IG) scheme is presented to provide initial guesses for points of interest (POIs) inside a large region. Oriented FAST and Rotated BRIEF (ORB) features are semi-uniformly extracted from the region of interest (ROI) and matched to provide initial deformation information. False matches are eliminated by the novel feature-guided Gaussian mixture model (FG-GMM) point set registration algorithm, and nonuniform deformation parameters of the versatile reproducing kernel Hilbert space (RKHS) function are calculated simultaneously. Validations on simulated images and a real-world mini tensile test verify that this scheme can robustly and accurately compute initial guesses with semi-subpixel level accuracy in cases with small or large translation, deformation, or rotation.
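
    The feature-based initial guess can be illustrated with OpenCV's ORB detector and brute-force Hamming matching; the sketch below returns only a robust global translation, whereas the full FB-IG scheme additionally runs the FG-GMM registration and fits an RKHS deformation field.

    import cv2
    import numpy as np

    def coarse_displacement_from_orb(ref_img, def_img, max_features=500):
        """Match ORB keypoints between reference and deformed images and return a
        robust (median) translation to seed the subpixel search."""
        orb = cv2.ORB_create(nfeatures=max_features)
        kp1, des1 = orb.detectAndCompute(ref_img, None)
        kp2, des2 = orb.detectAndCompute(def_img, None)
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = matcher.match(des1, des2)
        shifts = np.array([np.subtract(kp2[m.trainIdx].pt, kp1[m.queryIdx].pt)
                           for m in matches])
        return np.median(shifts, axis=0)            # (dx, dy) initial guess

    # Toy usage: a speckle-like pattern shifted by a few pixels.
    rng = np.random.default_rng(8)
    ref = (rng.random((256, 256)) > 0.5).astype(np.uint8) * 255
    ref = cv2.GaussianBlur(ref, (5, 5), 0)
    deformed = np.roll(ref, shift=(3, 5), axis=(0, 1))
    print("estimated (dx, dy):", coarse_displacement_from_orb(ref, deformed))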

  10. Probabilistic graphlet transfer for photo cropping.

    PubMed

    Zhang, Luming; Song, Mingli; Zhao, Qi; Liu, Xiao; Bu, Jiajun; Chen, Chun

    2013-02-01

    As one of the most basic photo manipulation processes, photo cropping is widely used in the printing, graphic design, and photography industries. In this paper, we introduce graphlets (i.e., small connected subgraphs) to represent a photo's aesthetic features, and propose a probabilistic model to transfer aesthetic features from the training photo onto the cropped photo. In particular, by segmenting each photo into a set of regions, we construct a region adjacency graph (RAG) to represent the global aesthetic feature of each photo. Graphlets are then extracted from the RAGs, and these graphlets capture the local aesthetic features of the photos. Finally, we cast photo cropping as a candidate-searching procedure on the basis of a probabilistic model, and infer the parameters of the cropped photos using Gibbs sampling. The proposed method is fully automatic. Subjective evaluations have shown that it is preferred over a number of existing approaches.

  11. Decoding memory features from hippocampal spiking activities using sparse classification models.

    PubMed

    Dong Song; Hampson, Robert E; Robinson, Brian S; Marmarelis, Vasilis Z; Deadwyler, Sam A; Berger, Theodore W

    2016-08-01

    To understand how memory information is encoded in the hippocampus, we build classification models to decode memory features from hippocampal CA3 and CA1 spatio-temporal patterns of spikes recorded from epilepsy patients performing a memory-dependent delayed match-to-sample task. The classification model consists of a set of B-spline basis functions for extracting memory features from the spike patterns, and a sparse logistic regression classifier for generating binary categorical output of memory features. Results show that classification models can extract a significant amount of memory information with respect to types of memory tasks and categories of sample images used in the task, despite the high level of variability in prediction accuracy due to the small sample size. These results support the hypothesis that memories are encoded in hippocampal activity and have important implications for the development of hippocampal memory prostheses.
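
    A sketch of the decoding model's two ingredients, a B-spline basis projection of spike times and an L1-penalized logistic regression; the spike trains and basis settings are synthetic placeholders.

    import numpy as np
    from scipy.interpolate import BSpline
    from sklearn.linear_model import LogisticRegression

    def bspline_design_matrix(spike_times, window=2.0, n_basis=8, degree=3):
        """Project a trial's spike times onto cubic B-spline basis functions
        spanning the trial window, giving a fixed-length feature vector."""
        # Knot vector with repeated boundary knots -> exactly n_basis basis functions
        inner = np.linspace(0, window, n_basis - degree + 1)
        knots = np.concatenate([[0] * degree, inner, [window] * degree])
        feats = np.zeros(n_basis)
        for j in range(n_basis):
            coeff = np.zeros(n_basis)
            coeff[j] = 1.0
            feats[j] = BSpline(knots, coeff, degree, extrapolate=False)(spike_times).sum()
        return feats

    # Toy decoding task: early-firing vs. late-firing "trials" standing in for
    # two memory categories, classified with a sparse (L1) logistic regression.
    rng = np.random.default_rng(9)
    early = [rng.uniform(0.0, 0.8, size=20) for _ in range(30)]
    late = [rng.uniform(1.2, 2.0, size=20) for _ in range(30)]
    X = np.array([bspline_design_matrix(t) for t in early + late])
    y = np.array([0] * 30 + [1] * 30)
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
    print("training accuracy:", clf.score(X, y),
          " nonzero weights:", int((clf.coef_ != 0).sum()))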

  12. A Semisupervised Support Vector Machines Algorithm for BCI Systems

    PubMed Central

    Qin, Jianzhao; Li, Yuanqing; Sun, Wei

    2007-01-01

    As an emerging technology, brain-computer interfaces (BCIs) bring us new communication interfaces which translate brain activities into control signals for devices like computers, robots, and so forth. In this study, we propose a semisupervised support vector machine (SVM) algorithm for brain-computer interface (BCI) systems, aiming at reducing the time-consuming training process. In this algorithm, we apply a semisupervised SVM for translating the features extracted from the electrical recordings of the brain into control signals. This SVM classifier is built from a small labeled data set and a large unlabeled data set. Meanwhile, to reduce the time for training the semisupervised SVM, we propose a batch-mode incremental learning method, which can also be easily applied to online BCI systems. Additionally, it is suggested in many studies that common spatial pattern (CSP) is very effective in discriminating two different brain states. However, CSP needs sufficient labeled data. In order to overcome this drawback of CSP, we suggest a two-stage feature extraction method for the semisupervised learning algorithm. We apply our algorithm to two BCI experimental data sets. The offline data analysis results demonstrate the effectiveness of our algorithm. PMID:18368141
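
    A hedged sketch of the semisupervised idea using scikit-learn's generic self-training wrapper around an SVM (not the authors' batch-mode incremental algorithm); unlabeled samples carry the label -1.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.semi_supervised import SelfTrainingClassifier
    from sklearn.svm import SVC

    # Toy stand-in for BCI feature vectors: a small labeled set plus a large
    # unlabeled set, as in the semisupervised SVM setting.
    X, y = make_classification(n_samples=600, n_features=10, n_informative=6,
                               random_state=0)
    y_partial = y.copy()
    rng = np.random.default_rng(10)
    unlabeled = rng.choice(600, size=550, replace=False)   # keep only 50 labels
    y_partial[unlabeled] = -1

    # Self-training around an SVM: iteratively pseudo-label confident unlabeled
    # samples and retrain, a batch-style scheme in spirit.
    model = SelfTrainingClassifier(SVC(probability=True, gamma="scale"), threshold=0.9)
    model.fit(X, y_partial)
    print("accuracy on true labels:", round(model.score(X, y), 3))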

  13. Modified kernel-based nonlinear feature extraction.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ma, J.; Perkins, S. J.; Theiler, J. P.

    2002-01-01

    Feature Extraction (FE) techniques are widely used in many applications to pre-process data in order to reduce the complexity of subsequent processes. A group of kernel-based nonlinear FE (KFE) algorithms has attracted much attention due to their high performance. However, a serious limitation that is inherent in these algorithms -- the maximal number of features extracted by them is limited by the number of classes involved -- dramatically degrades their flexibility. Here we propose a modified version of those KFE algorithms (MKFE). This algorithm is developed from a special form of scatter-matrix, whose rank is not determined by the number of classes involved, and thus breaks the inherent limitation in those KFE algorithms. Experimental results suggest that the MKFE algorithm is especially useful when the training set is small.

  14. Revealing metabolite biomarkers for acupuncture treatment by linear programming based feature selection.

    PubMed

    Wang, Yong; Wu, Qiao-Feng; Chen, Chen; Wu, Ling-Yun; Yan, Xian-Zhong; Yu, Shu-Guang; Zhang, Xiang-Sun; Liang, Fan-Rong

    2012-01-01

    Acupuncture has been practiced in China for thousands of years as part of the Traditional Chinese Medicine (TCM) and has gradually been accepted in western countries as an alternative or complementary treatment. However, the underlying mechanism of acupuncture, especially whether there exists any difference between various acupoints, remains largely unknown, which hinders its widespread use. In this study, we develop a novel Linear Programming based Feature Selection method (LPFS) to understand the mechanism of the acupuncture effect, at the molecular level, by revealing the metabolite biomarkers for acupuncture treatment. Specifically, we generate and investigate the high-throughput metabolic profiles of acupuncture treatment at several acupoints in humans. To select the subsets of metabolites that best characterize the acupuncture effect for each meridian point, an optimization model is proposed to identify biomarkers from high-dimensional metabolic data from case and control samples. Importantly, we use the nearest centroid as the prototype to simultaneously minimize the number of selected features and the leave-one-out cross validation error of the classifier. We compared the performance of LPFS to several state-of-the-art methods, such as SVM recursive feature elimination (SVM-RFE) and the sparse multinomial logistic regression approach (SMLR). We find that our LPFS method tends to reveal a small set of metabolites with small standard deviation and large shifts, which exactly serves our requirement for a good biomarker. Biologically, several metabolite biomarkers for acupuncture treatment are revealed and serve as candidates for further mechanism investigation. Also, biomarkers derived from five meridian points, Zusanli (ST36), Liangmen (ST21), Juliao (ST3), Yanglingquan (GB34), and Weizhong (BL40), are compared for their similarity and difference, which provides evidence for the specificity of acupoints. Our result demonstrates that metabolic profiling might be a promising method to investigate the molecular mechanism of acupuncture. Compared with other existing methods, LPFS shows better performance in selecting a small set of key molecules. In addition, LPFS is a general methodology and can be applied to other high-dimensional data analyses, for example cancer genomics.
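
    The nearest-centroid, leave-one-out evaluation that the LPFS objective balances against subset size can be sketched as follows (the LP optimization itself is omitted); the metabolic profiles are simulated.

    import numpy as np
    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.neighbors import NearestCentroid

    def loocv_error(X, y, feature_idx):
        """Leave-one-out error of a nearest-centroid classifier restricted to a
        candidate metabolite subset -- the quantity traded off against the
        number of selected features in an LPFS-style objective."""
        scores = cross_val_score(NearestCentroid(), X[:, feature_idx], y,
                                 cv=LeaveOneOut())
        return 1.0 - scores.mean()

    # Toy metabolic profiles: 20 case and 20 control samples, 50 metabolites,
    # of which only the first three actually shift between groups.
    rng = np.random.default_rng(11)
    X = rng.normal(size=(40, 50))
    X[:20, :3] += 1.5
    y = np.array([1] * 20 + [0] * 20)
    print("error with informative subset:", loocv_error(X, y, [0, 1, 2]))
    print("error with random subset:     ", loocv_error(X, y, [10, 20, 30]))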

  15. Revealing metabolite biomarkers for acupuncture treatment by linear programming based feature selection

    PubMed Central

    2012-01-01

    Background Acupuncture has been practiced in China for thousands of years as part of the Traditional Chinese Medicine (TCM) and has gradually been accepted in western countries as an alternative or complementary treatment. However, the underlying mechanism of acupuncture, especially whether there exists any difference between various acupoints, remains largely unknown, which hinders its widespread use. Results In this study, we develop a novel Linear Programming based Feature Selection method (LPFS) to understand the mechanism of the acupuncture effect, at the molecular level, by revealing the metabolite biomarkers for acupuncture treatment. Specifically, we generate and investigate the high-throughput metabolic profiles of acupuncture treatment at several acupoints in humans. To select the subsets of metabolites that best characterize the acupuncture effect for each meridian point, an optimization model is proposed to identify biomarkers from high-dimensional metabolic data from case and control samples. Importantly, we use the nearest centroid as the prototype to simultaneously minimize the number of selected features and the leave-one-out cross validation error of the classifier. We compared the performance of LPFS to several state-of-the-art methods, such as SVM recursive feature elimination (SVM-RFE) and the sparse multinomial logistic regression approach (SMLR). We find that our LPFS method tends to reveal a small set of metabolites with small standard deviation and large shifts, which exactly serves our requirement for a good biomarker. Biologically, several metabolite biomarkers for acupuncture treatment are revealed and serve as candidates for further mechanism investigation. Also, biomarkers derived from five meridian points, Zusanli (ST36), Liangmen (ST21), Juliao (ST3), Yanglingquan (GB34), and Weizhong (BL40), are compared for their similarity and difference, which provides evidence for the specificity of acupoints. Conclusions Our result demonstrates that metabolic profiling might be a promising method to investigate the molecular mechanism of acupuncture. Compared with other existing methods, LPFS shows better performance in selecting a small set of key molecules. In addition, LPFS is a general methodology and can be applied to other high-dimensional data analyses, for example cancer genomics. PMID:23046877

  16. Phoenix Interferometer

    DTIC Science & Technology

    1975-06-01

    proportioning circuit, Triac, and heater blankets. The significant features of the temperature controllers are small size, less than one half per...interferometer. The only change to the Firebird system needed to accommodate the new sensor is the replacement of several circuit boards. No hard wiring or...temperature at altitude (220 K). In addition to the sensor head, the Phoenix system also includes a set of plug-in printed circuit cards which

  17. Mining and Querying Multimedia Data

    DTIC Science & Technology

    2011-09-29

    able to capture more subtle spatial variations such as repetitiveness. Local feature descriptors such as SIFT [74] and SURF [12] have also been widely...empirically set to s = 90%, r = 50%, K = 20, where small variations lead to little perturbation of the output. The pseudo-code of the algorithm is...by constructing a three-layer graph based on clustering outputs, and executing a slight variation of random walk with restart algorithm. It provided

  18. Occurrence and significance of stalactites within the epithermal deposits at Creede, Colorado

    USGS Publications Warehouse

    Campbell, W.R.; Barton, P.B.

    1996-01-01

    In addition to the common and abundant features in karst terranes, stalactites involving a wide variety of minerals have also been found in other settings, including epigenetic mineral deposits, but these are almost always associated with supergene stages. Here we describe a different mode of occurrence from the Creede epithermal ore deposits, in Colorado, wherein stalactites of silica, sphalerite, galena, or pyrite formed in a vapor-dominated setting, below the paleo-water table, and except possibly for pyrite, as part of the hypogene mineralization. Axial cavities may, or may not, be present. No stalagmites have been recognized. The stalactites are small, from a few millimeters to a few centimeters long and a few millimeters in outer diameter. They represent only a small fraction of one percent of the total mineralization, and are covered by later crystals. Their growth orientation usually is unobservable; however, the parallel arrangement of all stalactites in a given specimen, consistency with indicators of gravitational settling, and the common presence of axial structures make the stalactitic interpretation almost unavoidable. In contrast with common carbonate stalactites, the growth mechanism for the sulfide and silica stalactites requires extensive evaporation. Stalactitic forms have also been reported from other deposits, mostly epithermal or Mississippi-Valley-type occurrences, but we caution that stalactite-like features can form by alternative processes.

  19. Regulation of podocalyxin trafficking by Rab small GTPases in epithelial cells

    PubMed Central

    Mrozowska, Paulina S.; Fukuda, Mitsunori

    2016-01-01

    ABSTRACT The characteristic feature of polarity establishment in MDCK II cells is transcytosis of apical glycoprotein podocalyxin (PCX) from the outer plasma membrane to the newly formed apical domain. This transcytotic event consists of multiple steps, including internalization from the plasma membrane, transport through early endosomes and Rab11-positive recycling endosomes, and delivery to the apical membrane. These steps are known to be tightly coordinated by Rab small GTPases, which act as molecular switches cycling between active GTP-bound and inactive GDP-bound states. However, our knowledge regarding which sets of Rabs regulate particular steps of PCX trafficking was rather limited. Recently, we have performed a comprehensive analysis of Rab GTPase engagement in the transcytotic pathway of PCX during polarity establishment in 2-dimensional (2D) and 3-dimensional (3D) MDCK II cell cultures. In this Commentary we summarize our findings and set them in the context of previous reports. PMID:27463697

  20. Use of Slopes of Small Martian Edifices to Discriminate Between Formation Mechanisms

    NASA Astrophysics Data System (ADS)

    Glaze, L. S.; Sakimoto, S. E.

    2001-05-01

    We have looked at Mars Orbiter Laser Altimeter (MOLA) topographic profiles of several small Martian edifices (3 - 50 km in size) in a variety of volcanic regions from the mid-latitudes to the poles. Viking and Mars Observer Camera (MOC) images and recent MOLA gridded topography data reveal a wide range of small edifice geometries (e.g., Garvin et al., 2000; Won et al., 2001), and a larger number of edifices than previously detected (e.g., Sakimoto, et al., 2001). We have attempted to characterize the average slopes of these edifices using a variety of statistics. Because of the curvature of many of the slopes, simple unweighted and weighted averages are not adequate for characterization. However, most of the flanks can be well described by a parabolic regression (R squared values greater than 90%). As a starting point, we have used the 'slope' term from the parabolic regression for comparison between the various features. The parabolic regression has the form: elevation = a - b sqrt(distance), where the constant 'a' is a vertical offset and 'b' is analogous to the slope. The true instantaneous slope at any point on the flank is found by taking the derivative of the expression above and is necessarily a function of location on the flank. Values of 'b' for the South- and North-facing flanks of several volcanic features found in different geologic settings are: polar moderate cratered cone (large crater) B1: South 8.588, North 7.46; polar steep cratered cone (small crater) B5: South 9.90, North 10.613; mid-latitude Tempe Terra shield TS1: South 2.158, North 1.964; mid-latitude Tempe Terra cone TC1: South 4.934, North 4.591. As can be seen from these values, the individual features are very consistent between their South- and North-facing flanks. There is also a clear distinction between B5, TS1 and TC1. The uncertainty (standard error) in the 'b' values given above is typically less than 1, suggesting the possibility of at least three separate feature types represented above. In addition to this simple comparison between parabolic slopes, we can also compare the actual shapes of the features. For example, the TS1 shield-type feature has less curvature than the others and may be better characterized by a linear fit. This also distinguishes it from the other features purely by the shape of its flanks. These comparisons allow us to quantitatively document the differences between the small Martian shield volcanoes as a feature class from their more explosive counterparts. Garvin, J.B., et al., Icarus, 145, 648-652, 2000. Wong, M.P., et al., LPSC XXXII, CDROM, abstract #1563, 2001. Sakimoto, S.E.H., et al., LPSC XXXII, CDROM, abstract #1808, 2001.
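
    The parabolic flank model above, elevation = a - b sqrt(distance), is linear in sqrt(distance), so 'a' and 'b' can be fit by ordinary least squares, and the instantaneous slope follows from the derivative -b / (2 sqrt(distance)). A minimal sketch with a synthetic profile (not MOLA data) is shown below.

```python
# Hedged sketch of the parabolic flank fit: elevation = a - b * sqrt(distance).
# The fit is linear in sqrt(distance); the instantaneous slope is -b / (2*sqrt(d)).
import numpy as np

def fit_parabolic_flank(distance_km, elevation_m):
    """Return (a, b) for elevation = a - b*sqrt(distance), plus R^2 of the fit."""
    x = np.sqrt(distance_km)
    A = np.column_stack([np.ones_like(x), -x])      # design matrix for [a, b]
    (a, b), _, _, _ = np.linalg.lstsq(A, elevation_m, rcond=None)
    pred = A @ np.array([a, b])
    r2 = 1 - np.sum((elevation_m - pred) ** 2) / np.sum((elevation_m - elevation_m.mean()) ** 2)
    return a, b, r2

def instantaneous_slope(b, distance_km):
    return -b / (2.0 * np.sqrt(distance_km))

# Synthetic flank profile with a 'b' near 9 (illustrative only, not real data).
d = np.linspace(0.1, 4.0, 50)
z = 1000.0 - 9.0 * np.sqrt(d) + np.random.default_rng(1).normal(0, 0.5, d.size)
a, b, r2 = fit_parabolic_flank(d, z)
print(f"a={a:.1f}, b={b:.2f}, R^2={r2:.3f}, slope at 1 km = {instantaneous_slope(b, 1.0):.2f}")
```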

  1. Robust k-mer frequency estimation using gapped k-mers

    PubMed Central

    Ghandi, Mahmoud; Mohammad-Noori, Morteza

    2013-01-01

    Oligomers of fixed length, k, commonly known as k-mers, are often used as fundamental elements in the description of DNA sequence features of diverse biological function, or as intermediate elements in the construction of more complex descriptors of sequence features such as position weight matrices. k-mers are very useful as general sequence features because they constitute a complete and unbiased feature set, and do not require parameterization based on incomplete knowledge of biological mechanisms. However, a fundamental limitation in the use of k-mers as sequence features is that as k is increased, larger spatial correlations in DNA sequence elements can be described, but the frequency of observing any specific k-mer becomes very small, and rapidly approaches a sparse matrix of binary counts. Thus any statistical learning approach using k-mers will be susceptible to noisy estimation of k-mer frequencies once k becomes large. Because all molecular DNA interactions have limited spatial extent, gapped k-mers often carry the relevant biological signal. Here we use gapped k-mer counts to more robustly estimate the ungapped k-mer frequencies, by deriving an equation for the minimum norm estimate of k-mer frequencies given an observed set of gapped k-mer frequencies. We demonstrate that this approach provides a more accurate estimate of the k-mer frequencies in real biological sequences using a sample of CTCF binding sites in the human genome. PMID:23861010
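
    To make the gapped k-mer idea concrete, the sketch below counts gapped k-mers directly: for each length-l window, every choice of k informative positions (the rest treated as gaps) contributes one count. It illustrates only the feature construction, not the minimum norm estimator derived in the paper; the sequence, l, and k are placeholder values.

```python
# Hedged sketch of gapped k-mer counting: k fixed positions within a window of
# length l; the remaining positions are treated as gaps.
from collections import Counter
from itertools import combinations

def gapped_kmer_counts(seq, l=6, k=4):
    """Count gapped k-mers (k informative positions per length-l window)."""
    counts = Counter()
    position_sets = list(combinations(range(l), k))
    for i in range(len(seq) - l + 1):
        window = seq[i:i + l]
        for positions in position_sets:
            key = tuple((p, window[p]) for p in positions)
            counts[key] += 1
    return counts

counts = gapped_kmer_counts("ACGTACGTGGCTA", l=6, k=4)
print(len(counts), counts.most_common(3))
```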

  2. Robust k-mer frequency estimation using gapped k-mers.

    PubMed

    Ghandi, Mahmoud; Mohammad-Noori, Morteza; Beer, Michael A

    2014-08-01

    Oligomers of fixed length, k, commonly known as k-mers, are often used as fundamental elements in the description of DNA sequence features of diverse biological function, or as intermediate elements in the construction of more complex descriptors of sequence features such as position weight matrices. k-mers are very useful as general sequence features because they constitute a complete and unbiased feature set, and do not require parameterization based on incomplete knowledge of biological mechanisms. However, a fundamental limitation in the use of k-mers as sequence features is that as k is increased, larger spatial correlations in DNA sequence elements can be described, but the frequency of observing any specific k-mer becomes very small, and rapidly approaches a sparse matrix of binary counts. Thus any statistical learning approach using k-mers will be susceptible to noisy estimation of k-mer frequencies once k becomes large. Because all molecular DNA interactions have limited spatial extent, gapped k-mers often carry the relevant biological signal. Here we use gapped k-mer counts to more robustly estimate the ungapped k-mer frequencies, by deriving an equation for the minimum norm estimate of k-mer frequencies given an observed set of gapped k-mer frequencies. We demonstrate that this approach provides a more accurate estimate of the k-mer frequencies in real biological sequences using a sample of CTCF binding sites in the human genome.

  3. Enhanced Regulatory Sequence Prediction Using Gapped k-mer Features

    PubMed Central

    Mohammad-Noori, Morteza; Beer, Michael A.

    2014-01-01

    Abstract Oligomers of length k, or k-mers, are convenient and widely used features for modeling the properties and functions of DNA and protein sequences. However, k-mers suffer from the inherent limitation that if the parameter k is increased to resolve longer features, the probability of observing any specific k-mer becomes very small, and k-mer counts approach a binary variable, with most k-mers absent and a few present once. Thus, any statistical learning approach using k-mers as features becomes susceptible to noisy training set k-mer frequencies once k becomes large. To address this problem, we introduce alternative feature sets using gapped k-mers, a new classifier, gkm-SVM, and a general method for robust estimation of k-mer frequencies. To make the method applicable to large-scale genome wide applications, we develop an efficient tree data structure for computing the kernel matrix. We show that compared to our original kmer-SVM and alternative approaches, our gkm-SVM predicts functional genomic regulatory elements and tissue specific enhancers with significantly improved accuracy, increasing the precision by up to a factor of two. We then show that gkm-SVM consistently outperforms kmer-SVM on human ENCODE ChIP-seq datasets, and further demonstrate the general utility of our method using a Naïve-Bayes classifier. Although developed for regulatory sequence analysis, these methods can be applied to any sequence classification problem. PMID:25033408

  4. Enhanced regulatory sequence prediction using gapped k-mer features.

    PubMed

    Ghandi, Mahmoud; Lee, Dongwon; Mohammad-Noori, Morteza; Beer, Michael A

    2014-07-01

    Oligomers of length k, or k-mers, are convenient and widely used features for modeling the properties and functions of DNA and protein sequences. However, k-mers suffer from the inherent limitation that if the parameter k is increased to resolve longer features, the probability of observing any specific k-mer becomes very small, and k-mer counts approach a binary variable, with most k-mers absent and a few present once. Thus, any statistical learning approach using k-mers as features becomes susceptible to noisy training set k-mer frequencies once k becomes large. To address this problem, we introduce alternative feature sets using gapped k-mers, a new classifier, gkm-SVM, and a general method for robust estimation of k-mer frequencies. To make the method applicable to large-scale genome wide applications, we develop an efficient tree data structure for computing the kernel matrix. We show that compared to our original kmer-SVM and alternative approaches, our gkm-SVM predicts functional genomic regulatory elements and tissue specific enhancers with significantly improved accuracy, increasing the precision by up to a factor of two. We then show that gkm-SVM consistently outperforms kmer-SVM on human ENCODE ChIP-seq datasets, and further demonstrate the general utility of our method using a Naïve-Bayes classifier. Although developed for regulatory sequence analysis, these methods can be applied to any sequence classification problem.

  5. Application of LANDSAT system for improving methodology for inventory and classification of wetlands

    NASA Technical Reports Server (NTRS)

    Gilmer, D. S. (Principal Investigator)

    1976-01-01

    The author has identified the following significant results. A newly developed software system for generating statistics on surface water features was tested using LANDSAT data acquired prior to 1975. This software test provided a satisfactory evaluation of the system and also allowed expansion of the database on prairie water features. The software system recognizes water on the basis of a classification algorithm. This classification is accomplished by level thresholding a single near infrared data channel. After each pixel is classified as water or nonwater, the software system then recognizes ponds or lakes as sets of contiguous pixels or as single isolated pixels in the case of very small ponds. Pixels are considered to be contiguous if they are adjacent between successive scan lines. After delineating each water feature, the software system then assigns the feature a position based upon a geographic grid system and calculates the feature's planimetric area, its perimeter, and a parameter known as the shape factor.
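
    A minimal sketch of the described pond-delineation logic is given below: threshold a single near-infrared band, label contiguous water pixels, and report each feature's planimetric area, perimeter, and a shape factor. The threshold, pixel size, connectivity, and shape-factor definition are illustrative assumptions rather than the system's actual parameters.

```python
# Hedged sketch: threshold one near-infrared band, label contiguous water
# pixels, and compute per-feature area, perimeter, and a shape factor
# (perimeter^2 / (4*pi*area)).  All parameters here are assumptions.
import numpy as np
from scipy import ndimage

def delineate_water(nir_band, threshold=0.05, pixel_area=4500.0):
    water = nir_band < threshold                    # water is dark in the near-IR
    labels, n = ndimage.label(water)                # contiguous water pixels
    features = []
    for lab in range(1, n + 1):
        mask = labels == lab
        area = mask.sum() * pixel_area              # planimetric area (m^2, assumed)
        # crude perimeter: water pixels bordering non-water, times a pixel edge length
        eroded = ndimage.binary_erosion(mask)
        perimeter = (mask & ~eroded).sum() * np.sqrt(pixel_area)
        shape_factor = perimeter ** 2 / (4 * np.pi * area)
        features.append({"label": lab, "area_m2": area,
                         "perimeter_m": perimeter, "shape_factor": shape_factor})
    return features

nir = np.random.default_rng(2).random((50, 50))     # placeholder near-IR band
print(delineate_water(nir)[:2])
```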

  6. Fine-tuning convolutional deep features for MRI based brain tumor classification

    NASA Astrophysics Data System (ADS)

    Ahmed, Kaoutar B.; Hall, Lawrence O.; Goldgof, Dmitry B.; Liu, Renhao; Gatenby, Robert A.

    2017-03-01

    Prediction of survival time from brain tumor magnetic resonance images (MRI) is not commonly performed and would ordinarily be a time-consuming process. However, current cross-sectional imaging techniques, particularly MRI, can be used to generate many features that may provide information on the patient's prognosis, including survival. This information can potentially be used to identify individuals who would benefit from more aggressive therapy. Rather than using pre-defined and hand-engineered features as with current radiomics methods, we investigated the use of deep features extracted from pre-trained convolutional neural networks (CNNs) in predicting survival time. We also provide evidence for the power of domain-specific fine-tuning in improving the performance of a pre-trained CNN, even though our data set is small. We fine-tuned a CNN initially trained on a large natural image recognition dataset (Imagenet ILSVRC) and transferred the learned feature representations to the survival time prediction task, obtaining over 81% accuracy in a leave-one-out cross-validation.
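
    A hedged sketch of this kind of fine-tuning is shown below, using a generic ImageNet-pretrained backbone from torchvision (ResNet-18 as a stand-in; the paper's architecture, the layers unfrozen, and the training schedule are not reproduced). Only the new classification head is trained here.

```python
# Hedged sketch of domain-specific fine-tuning of an ImageNet-pretrained CNN.
# ResNet-18 is an assumed stand-in backbone; downloading the weights requires
# torchvision >= 0.13 and network access.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")    # ImageNet-pretrained backbone
for param in model.parameters():                     # freeze the generic features
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)        # new head, e.g. short vs long survival
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One fine-tuning step on a batch of 3-channel 224x224 inputs."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy batch just to show the call signature (random tensors, not MRI data).
print(train_step(torch.randn(4, 3, 224, 224), torch.tensor([0, 1, 0, 1])))
```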

  7. Early melanoma diagnosis with mobile imaging.

    PubMed

    Do, Thanh-Toan; Zhou, Yiren; Zheng, Haitian; Cheung, Ngai-Man; Koh, Dawn

    2014-01-01

    We research a mobile imaging system for early diagnosis of melanoma. Different from previous work, we focus on smartphone-captured images, and propose a detection system that runs entirely on the smartphone. Smartphone-captured images taken under loosely-controlled conditions introduce new challenges for melanoma detection, while processing performed on the smartphone is subject to computation and memory constraints. To address these challenges, we propose to localize the skin lesion by combining fast skin detection and fusion of two fast segmentation results. We propose new features to capture color variation and border irregularity which are useful for smartphone-captured images. We also propose a new feature selection criterion to select a small set of good features used in the final lightweight system. Our evaluation confirms the effectiveness of proposed algorithms and features. In addition, we present our system prototype which computes selected visual features from a user-captured skin lesion image, and analyzes them to estimate the likelihood of malignance, all on an off-the-shelf smartphone.

  8. Development and validation of a radiomics nomogram for progression-free survival prediction in stage IV EGFR-mutant non-small cell lung cancer

    NASA Astrophysics Data System (ADS)

    Song, Jiangdian; Zang, Yali; Li, Weimin; Zhong, Wenzhao; Shi, Jingyun; Dong, Di; Fang, Mengjie; Liu, Zaiyi; Tian, Jie

    2017-03-01

    Accurately predicting the risk of disease progression and the benefit of tyrosine kinase inhibitor (TKI) therapy for stage IV non-small cell lung cancer (NSCLC) patients with activating epidermal growth factor receptor (EGFR) mutations is challenging with current staging methods. We postulated that integrating a classifier consisting of multiple computed tomography (CT) phenotypic features with other clinicopathological risk factors into a single model could improve risk stratification and prediction of progression-free survival (PFS) under EGFR TKIs for these patients. Patients confirmed as having stage IV EGFR-mutant NSCLC who received EGFR TKIs without resection, and who underwent pretreatment contrast-enhanced CT approximately 2 weeks before treatment, were enrolled. A six-CT-phenotypic-feature-based classifier constructed with the LASSO Cox regression model, together with three clinicopathological factors (pathologic N category, performance status (PS) score, and intrapulmonary metastasis status), was used to construct a nomogram in a training set of 115 patients. The prognostic and predictive accuracy of this nomogram was then subjected to an external independent validation in 107 patients. PFS did not differ significantly between the training and independent validation sets (Mann-Whitney U test, P = 0.2670). PFS of the patients could be predicted with good consistency compared with the actual survival. The C-index of the proposed individualized nomogram in the training set (0.707, 95% CI: 0.643, 0.771) and the independent validation set (0.715, 95% CI: 0.650, 0.780) showed its potential to predict the PFS of stage IV EGFR-mutant NSCLC patients under EGFR TKIs. The individualized nomogram might facilitate patient counselling and individualise management of patients with this disease.

  9. Interobserver Agreement on Endoscopic Classification of Oesophageal Varices in Children.

    PubMed

    D'Antiga, Lorenzo; Betalli, Pietro; De Angelis, Paola; Davenport, Mark; Di Giorgio, Angelo; McKiernan, Patrick J; McLin, Valerie; Ravelli, Paolo; Durmaz, Ozlem; Talbotec, Cecile; Sturm, Ekkehard; Woynarowski, Marek; Burroughs, Andrew K

    2015-08-01

    Data regarding agreement on endoscopic features of oesophageal varices in children with portal hypertension (PH) are scant. The aim of this study was to evaluate endoscopic visualisation and classification of oesophageal varices in children by several European clinicians, to build a rational basis for future multicentre trials. Endoscopic pictures of the distal oesophagus of 100 children with a clinical diagnosis of PH were distributed to 10 endoscopists. Observers were requested to classify variceal size according to a 3-degree scale (small, medium, and large, class A), a 2-degree scale (small and large, class B), and to recognise red wales (presence or absence, class Red). Overall agreement was considered fair if Fleiss and Cohen κ test was ≥0.30, good if ≥0.40, excellent if ≥0.60, and perfect if ≥0.80. Agreement between observers was fair with class A (κ = 0.34) and class B (κ = 0.38), and good with class Red (κ = 0.49). The agreement was good on presence versus absence of varices (class A = 0.53, class B = 0.48). The agreement among the observers was good in class A when endoscopic features of severe PH (medium and large sizes, red marks) were grouped and compared with mild features (absent and small varices) (κ = 0.58). Experts working in different centres show a fairly good agreement on endoscopic features of PH in children, although a better training of paediatric endoscopists may improve the agreement in grading severity of varices in this setting.
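
    The agreement statistic itself is standard; a minimal sketch of computing Fleiss' kappa for 10 observers grading each image on a 3-degree scale is shown below, assuming the statsmodels implementation and using random placeholder gradings rather than the study data.

```python
# Hedged sketch of the agreement computation: Fleiss' kappa over 10 observers
# grading 100 images on a 3-degree scale (0=small, 1=medium, 2=large).
# Gradings are random placeholders, not the study data.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

rng = np.random.default_rng(3)
gradings = rng.integers(0, 3, size=(100, 10))      # 100 images x 10 endoscopists
table, _ = aggregate_raters(gradings)              # images x categories count table
kappa = fleiss_kappa(table)
print(f"Fleiss kappa = {kappa:.2f}")               # close to 0 for random gradings
```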

  10. Neuroimaging standards for research into small vessel disease and its contribution to ageing and neurodegeneration

    PubMed Central

    Wardlaw, Joanna M; Smith, Eric E; Biessels, Geert J; Cordonnier, Charlotte; Fazekas, Franz; Frayne, Richard; Lindley, Richard I; O'Brien, John T; Barkhof, Frederik; Benavente, Oscar R; Black, Sandra E; Brayne, Carol; Breteler, Monique; Chabriat, Hugues; DeCarli, Charles; de Leeuw, Frank-Erik; Doubal, Fergus; Duering, Marco; Fox, Nick C; Greenberg, Steven; Hachinski, Vladimir; Kilimann, Ingo; Mok, Vincent; Oostenbrugge, Robert van; Pantoni, Leonardo; Speck, Oliver; Stephan, Blossom C M; Teipel, Stefan; Viswanathan, Anand; Werring, David; Chen, Christopher; Smith, Colin; van Buchem, Mark; Norrving, Bo; Gorelick, Philip B; Dichgans, Martin

    2013-01-01

    Summary Cerebral small vessel disease (SVD) is a common accompaniment of ageing. Features seen on neuroimaging include recent small subcortical infarcts, lacunes, white matter hyperintensities, perivascular spaces, microbleeds, and brain atrophy. SVD can present as a stroke or cognitive decline, or can have few or no symptoms. SVD frequently coexists with neurodegenerative disease, and can exacerbate cognitive deficits, physical disabilities, and other symptoms of neurodegeneration. Terminology and definitions for imaging the features of SVD vary widely, which is also true for protocols for image acquisition and image analysis. This lack of consistency hampers progress in identifying the contribution of SVD to the pathophysiology and clinical features of common neurodegenerative diseases. We are an international working group from the Centres of Excellence in Neurodegeneration. We completed a structured process to develop definitions and imaging standards for markers and consequences of SVD. We aimed to achieve the following: first, to provide a common advisory about terms and definitions for features visible on MRI; second, to suggest minimum standards for image acquisition and analysis; third, to agree on standards for scientific reporting of changes related to SVD on neuroimaging; and fourth, to review emerging imaging methods for detection and quantification of preclinical manifestations of SVD. Our findings and recommendations apply to research studies, and can be used in the clinical setting to standardise image interpretation, acquisition, and reporting. This Position Paper summarises the main outcomes of this international effort to provide the STandards for ReportIng Vascular changes on nEuroimaging (STRIVE). PMID:23867200

  11. Three small deployed satellites

    NASA Image and Video Library

    2012-10-04

    ISS033-E-009282 (4 Oct. 2012) --- Several tiny satellites are featured in this image photographed by an Expedition 33 crew member on the International Space Station. The satellites were released outside the Kibo laboratory using a Small Satellite Orbital Deployer attached to the Japanese module’s robotic arm on Oct. 4, 2012. Japan Aerospace Exploration Agency astronaut Aki Hoshide, flight engineer, set up the satellite deployment gear inside the lab and placed it in the Kibo airlock. The Japanese robotic arm then grappled the deployment system and its satellites from the airlock for deployment. Earth’s horizon and the blackness of space provide the backdrop for the scene.

  12. JEMRMS Small Satellite Deployment Observation

    NASA Image and Video Library

    2012-10-04

    ISS033-E-009315 (4 Oct. 2012) --- Several tiny satellites are featured in this image photographed by an Expedition 33 crew member on the International Space Station. The satellites were released outside the Kibo laboratory using a Small Satellite Orbital Deployer attached to the Japanese module’s robotic arm on Oct. 4, 2012. Japan Aerospace Exploration Agency astronaut Aki Hoshide, flight engineer, set up the satellite deployment gear inside the lab and placed it in the Kibo airlock. The Japanese robotic arm then grappled the deployment system and its satellites from the airlock for deployment. A blue and white part of Earth provides the backdrop for the scene.

  13. Architecture of marine food webs: To be or not be a 'small-world'.

    PubMed

    Marina, Tomás Ignacio; Saravia, Leonardo A; Cordone, Georgina; Salinas, Vanesa; Doyle, Santiago R; Momo, Fernando R

    2018-01-01

    The search for general properties in network structure has been a central issue for food web studies in recent years. One such property is the small-world topology, which combines high clustering with a small distance between nodes of the network. This property may increase food web resilience but also make these webs more sensitive to the extinction of connected species. Food web theory has been developed principally from freshwater and terrestrial ecosystems, largely omitting marine habitats. Whether theory needs to be modified to accommodate observations from marine ecosystems, given major differences in several topological characteristics, is still under debate. Here we investigated whether the small-world topology is a common structural pattern in marine food webs. We developed a novel, simple and statistically rigorous method to examine the largest set of complex marine food webs to date. More than half of the analyzed marine networks exhibited a characteristic path length similar to or lower than the random expectation, whereas 39% of the webs presented significantly higher clustering than their random counterparts. Our method showed that 5 out of 28 networks fulfilled both features of the small-world topology: short path length and high clustering. This work represents the first rigorous analysis of the small-world topology and its associated features in high-quality marine networks. We conclude that such topology is a structural pattern that is not maximized in marine food webs; thus it is probably not an effective model to study the robustness, stability and feasibility of marine ecosystems.
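
    The two ingredients checked above, characteristic path length and clustering relative to a random expectation, can be sketched as follows for an undirected graph, with an Erdos-Renyi ensemble as the null model and a synthetic stand-in for a food web; the authors' statistically rigorous test is not reproduced here.

```python
# Hedged sketch of the two small-world ingredients: characteristic path length
# and clustering of an observed network, compared with same-size random graphs.
import numpy as np
import networkx as nx

def small_world_stats(G, n_random=100, seed=0):
    n, m = G.number_of_nodes(), G.number_of_edges()
    L_obs = nx.average_shortest_path_length(G)
    C_obs = nx.average_clustering(G)
    rng = np.random.default_rng(seed)
    L_rand, C_rand = [], []
    for _ in range(n_random):
        R = nx.gnm_random_graph(n, m, seed=int(rng.integers(1_000_000_000)))
        if nx.is_connected(R):                      # path length needs connectivity
            L_rand.append(nx.average_shortest_path_length(R))
            C_rand.append(nx.average_clustering(R))
    return L_obs, np.mean(L_rand), C_obs, np.mean(C_rand)

G = nx.connected_watts_strogatz_graph(60, 6, 0.1, seed=1)   # stand-in "food web"
L, L0, C, C0 = small_world_stats(G)
print(f"path length {L:.2f} vs random {L0:.2f}; clustering {C:.2f} vs random {C0:.2f}")
```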

  14. a Fully Automated Pipeline for Classification Tasks with AN Application to Remote Sensing

    NASA Astrophysics Data System (ADS)

    Suzuki, K.; Claesen, M.; Takeda, H.; De Moor, B.

    2016-06-01

    Nowadays deep learning has been intensely in the spotlight owing to its great victories at major competitions, which has undeservedly pushed 'shallow' machine learning methods, relatively simple and handy algorithms commonly used by industrial engineers, into the background in spite of their advantages, such as the small amount of time and training data they require. Taking a practical point of view, we utilized shallow learning algorithms to construct a learning pipeline such that operators can utilize machine learning without any special knowledge, an expensive computation environment, or a large amount of labelled data. The proposed pipeline automates the whole classification process, namely feature selection, feature weighting, and the selection of the most suitable classifier with optimized hyperparameters. The configuration relies on particle swarm optimization, a well-known metaheuristic algorithm chosen for its generally fast and fine optimization, which enables us not only to optimize (hyper)parameters but also to determine appropriate features and classifiers for the problem; this choice has conventionally been made a priori from domain knowledge and left untouched, or dealt with using naïve algorithms such as grid search. Through experiments with the MNIST and CIFAR-10 datasets, common datasets in the computer vision field for character recognition and object recognition problems respectively, our automated learning approach provides high performance considering its simple setting (i.e. not specialized to the dataset), small amount of training data, and practical learning time. Moreover, compared to deep learning, the performance stays robust almost without any modification even on a remote sensing object recognition problem, which in turn indicates that there is a high possibility that our approach contributes to general classification problems.
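
    A minimal sketch of the optimization idea, not the authors' pipeline, is given below: a bare-bones particle swarm search over two SVM hyperparameters on a log scale, scored by cross-validated accuracy on a small digits subset. The swarm size, inertia, and acceleration constants are illustrative choices.

```python
# Hedged sketch: particle swarm optimisation over SVM hyperparameters
# (log10 C and log10 gamma), scored by cross-validated accuracy.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X, y = X[:500], y[:500]                              # keep the toy run fast

def score(params):                                   # params = [log10 C, log10 gamma]
    clf = SVC(C=10 ** params[0], gamma=10 ** params[1])
    return cross_val_score(clf, X, y, cv=3).mean()

rng = np.random.default_rng(4)
n_particles, n_iter, w, c1, c2 = 8, 10, 0.6, 1.5, 1.5
pos = rng.uniform([-2, -5], [3, 0], size=(n_particles, 2))
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), np.array([score(p) for p in pos])
gbest = pbest[pbest_val.argmax()].copy()

for _ in range(n_iter):
    r1, r2 = rng.random((n_particles, 2)), rng.random((n_particles, 2))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    vals = np.array([score(p) for p in pos])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmax()].copy()

print("best (log10 C, log10 gamma):", gbest, "CV accuracy:", pbest_val.max())
```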

  15. Utility of Ward-Based Retinal Photography in Stroke Patients.

    PubMed

    Frost, Shaun; Brown, Michael; Stirling, Verity; Vignarajan, Janardhan; Prentice, David; Kanagasingam, Yogesan

    2017-03-01

    Improvements in acute care of stroke patients have decreased mortality, but survivors are still at increased risk of future vascular events and mitigation of this risk requires thorough assessment of the underlying factors leading to the stroke. The brain and eye share a common embryological origin and numerous similarities exist between the small vessels of the retina and brain. Recent population-based studies have demonstrated a close link between retinal vascular changes and stroke, suggesting that retinal photography could have utility in assessing underlying stroke risk factors and prognosis after stroke. Modern imaging equipment can facilitate precise measurement and monitoring of vascular features. However, use of this equipment is a challenge in the stroke ward setting as patients are frequently unable to maintain the required seated position, and pupil dilatation is often not feasible as it could potentially obscure important neurological signs of stroke progression. This small study investigated the utility of a novel handheld, nonmydriatic retinal camera in the stroke ward and explored associations between retinal vascular features and stroke risk factors. This camera circumvented the practical limitations of conducting retinal photography in the stroke ward setting. A positive correlation was found between carotid disease and both mean width of arterioles (r = .40, P = .00571) and venules (r = .30, P = .0381). The results provide further evidence that retinal vascular features are clinically informative about underlying stroke risk factors and demonstrate the utility of handheld retinal photography in the stroke ward. Copyright © 2017 National Stroke Association. Published by Elsevier Inc. All rights reserved.

  16. Comparative Study of SVM Methods Combined with Voxel Selection for Object Category Classification on fMRI Data

    PubMed Central

    Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li

    2011-01-01

    Background The support vector machine (SVM) has been widely used as an accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus the linear one. Here, a more effective non-linear SVM using a radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies, which focused either merely on the evaluation of different types of SVM or on voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes, in terms of classification accuracy and time consumption. Methodology/Principal Findings Six different voxel selection methods were employed to decide which voxels of the fMRI data would be included in SVM classifiers with linear and RBF kernels for classifying 4-category objects. Then the overall performances of the voxel selection and classification methods were compared. Results showed that: (1) voxel selection had an important impact on the classification accuracy of the classifiers: in a relatively low-dimensional feature space, RBF SVM significantly outperformed linear SVM; in a relatively high-dimensional space, linear SVM performed better than its counterpart; (2) considering classification accuracy and time consumption holistically, linear SVM with relatively more voxels as features and RBF SVM with a small set of voxels (after PCA) could achieve better accuracy in less time. Conclusions/Significance The present work provides the first empirical result on linear and RBF SVM in the classification of fMRI data combined with voxel selection methods. Based on the findings, if only classification accuracy is of concern, RBF SVM with an appropriately small set of voxels and linear SVM with relatively more voxels are two suggested solutions; if users are more concerned about computational time, RBF SVM with a relatively small set of voxels, when part of the principal components are kept as features, is a better choice. PMID:21359184
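
    The comparison can be sketched with scikit-learn pipelines as below: two stand-in voxel-selection schemes (univariate F-score selection and PCA) combined with linear and RBF SVMs, scored by cross-validated accuracy on random data that substitutes for the fMRI voxels; the six selection methods and the timing analysis of the study are not reproduced.

```python
# Hedged sketch: linear vs RBF SVM after two illustrative voxel-selection schemes.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(5)
X = rng.normal(size=(80, 2000))                     # 80 trials x 2000 "voxels"
y = np.repeat(np.arange(4), 20)                     # 4 object categories
X[:, :20] += y[:, None] * 0.5                       # a few informative voxels

for name, reducer in [("SelectKBest(200)", SelectKBest(f_classif, k=200)),
                      ("PCA(30)", PCA(n_components=30))]:
    for kernel in ("linear", "rbf"):
        pipe = make_pipeline(reducer, SVC(kernel=kernel, C=1.0))
        acc = cross_val_score(pipe, X, y, cv=5).mean()
        print(f"{name:>16s} + {kernel:>6s} SVM: {acc:.2f}")
```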

  17. Comparative study of SVM methods combined with voxel selection for object category classification on fMRI data.

    PubMed

    Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li

    2011-02-16

    The support vector machine (SVM) has been widely used as an accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus the linear one. Here, a more effective non-linear SVM using a radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies, which focused either merely on the evaluation of different types of SVM or on voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes, in terms of classification accuracy and time consumption. Six different voxel selection methods were employed to decide which voxels of the fMRI data would be included in SVM classifiers with linear and RBF kernels for classifying 4-category objects. Then the overall performances of the voxel selection and classification methods were compared. Results showed that: (1) voxel selection had an important impact on the classification accuracy of the classifiers: in a relatively low-dimensional feature space, RBF SVM significantly outperformed linear SVM; in a relatively high-dimensional space, linear SVM performed better than its counterpart; (2) considering classification accuracy and time consumption holistically, linear SVM with relatively more voxels as features and RBF SVM with a small set of voxels (after PCA) could achieve better accuracy in less time. The present work provides the first empirical result on linear and RBF SVM in the classification of fMRI data combined with voxel selection methods. Based on the findings, if only classification accuracy is of concern, RBF SVM with an appropriately small set of voxels and linear SVM with relatively more voxels are two suggested solutions; if users are more concerned about computational time, RBF SVM with a relatively small set of voxels, when part of the principal components are kept as features, is a better choice.

  18. SU-F-R-22: Malignancy Classification for Small Pulmonary Nodules with Radiomics and Logistic Regression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huang, W; Tu, S

    Purpose: We conducted a retrospective study of Radiomics research for classifying malignancy of small pulmonary nodules. A machine learning algorithm of logistic regression and open research platform of Radiomics, IBEX (Imaging Biomarker Explorer), were used to evaluate the classification accuracy. Methods: The training set included 100 CT image series from cancer patients with small pulmonary nodules where the average diameter is 1.10 cm. These patients registered at Chang Gung Memorial Hospital and received a CT-guided operation of lung cancer lobectomy. The specimens were classified by experienced pathologists with a B (benign) or M (malignant). CT images with slice thickness of 0.625 mm were acquired from a GE BrightSpeed 16 scanner. The study was formally approved by our institutional internal review board. Nodules were delineated and 374 feature parameters were extracted from IBEX. We first used the t-test and p-value criteria to study which feature can differentiate between group B and M. Then we implemented a logistic regression algorithm to perform nodule malignancy classification. 10-fold cross-validation and the receiver operating characteristic curve (ROC) were used to evaluate the classification accuracy. Finally hierarchical clustering analysis, Spearman rank correlation coefficient, and clustering heat map were used to further study correlation characteristics among different features. Results: 238 features were found differentiable between group B and M based on whether their statistical p-values were less than 0.05. A forward search algorithm was used to select an optimal combination of features for the best classification and 9 features were identified. Our study found the best accuracy of classifying malignancy was 0.79±0.01 with the 10-fold cross-validation. The area under the ROC curve was 0.81±0.02. Conclusion: Benign nodules may be treated as a malignant tumor in low-dose CT and patients may undergo unnecessary surgeries or treatments. Our study may help radiologists to differentiate nodule malignancy for low-dose CT.
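
    A hedged sketch of the analysis flow described above (t-test filtering followed by logistic regression scored with 10-fold cross-validated ROC AUC) is given below, with random values standing in for the 374 IBEX features; the forward search and clustering steps are omitted.

```python
# Hedged sketch: keep features with two-sample t-test p < 0.05, then score a
# logistic-regression classifier with 10-fold cross-validated ROC AUC.
import numpy as np
from scipy import stats
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(6)
X = rng.normal(size=(100, 374))                     # 100 nodules x 374 placeholder features
y = np.repeat([0, 1], 50)                           # 0 = benign, 1 = malignant
X[y == 1, :5] += 0.8                                # a few discriminative features

_, pvals = stats.ttest_ind(X[y == 0], X[y == 1], axis=0)
selected = np.where(pvals < 0.05)[0]
print(f"{selected.size} features pass the t-test filter")

auc = cross_val_score(LogisticRegression(max_iter=1000),
                      X[:, selected], y, cv=10, scoring="roc_auc").mean()
print(f"10-fold cross-validated AUC = {auc:.2f}")
```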

  19. Effect of finite sample size on feature selection and classification: a simulation study.

    PubMed

    Way, Ted W; Sahiner, Berkman; Hadjiiski, Lubomir M; Chan, Heang-Ping

    2010-02-01

    The small number of samples available for training and testing is often the limiting factor in finding the most effective features and designing an optimal computer-aided diagnosis (CAD) system. Training on a limited set of samples introduces bias and variance in the performance of a CAD system relative to that trained with an infinite sample size. In this work, the authors conducted a simulation study to evaluate the performances of various combinations of classifiers and feature selection techniques and their dependence on the class distribution, dimensionality, and the training sample size. The understanding of these relationships will facilitate development of effective CAD systems under the constraint of limited available samples. Three feature selection techniques, the stepwise feature selection (SFS), sequential floating forward search (SFFS), and principal component analysis (PCA), and two commonly used classifiers, Fisher's linear discriminant analysis (LDA) and support vector machine (SVM), were investigated. Samples were drawn from multidimensional feature spaces of multivariate Gaussian distributions with equal or unequal covariance matrices and unequal means, and with equal covariance matrices and unequal means estimated from a clinical data set. Classifier performance was quantified by the area under the receiver operating characteristic curve Az. The mean Az values obtained by resubstitution and hold-out methods were evaluated for training sample sizes ranging from 15 to 100 per class. The number of simulated features available for selection was chosen to be 50, 100, and 200. It was found that the relative performance of the different combinations of classifier and feature selection method depends on the feature space distributions, the dimensionality, and the available training sample sizes. The LDA and SVM with radial kernel performed similarly for most of the conditions evaluated in this study, although the SVM classifier showed a slightly higher hold-out performance than LDA for some conditions and vice versa for other conditions. PCA was comparable to or better than SFS and SFFS for LDA at small sample sizes, but inferior for SVM with polynomial kernel. For the class distributions simulated from clinical data, PCA did not show advantages over the other two feature selection methods. Under this condition, the SVM with radial kernel performed better than the LDA when few training samples were available, while LDA performed better when a large number of training samples were available. None of the investigated feature selection-classifier combinations provided consistently superior performance under the studied conditions for different sample sizes and feature space distributions. In general, the SFFS method was comparable to the SFS method while PCA may have an advantage for Gaussian feature spaces with unequal covariance matrices. The performance of the SVM with radial kernel was better than, or comparable to, that of the SVM with polynomial kernel under most conditions studied.

  20. Constraints on inflation with LSS surveys: features in the primordial power spectrum

    NASA Astrophysics Data System (ADS)

    Palma, Gonzalo A.; Sapone, Domenico; Sypsas, Spyros

    2018-06-01

    We analyse the efficiency of future large scale structure surveys to unveil the presence of scale dependent features in the primordial spectrum—resulting from cosmic inflation—imprinted in the distribution of galaxies. Features may appear as a consequence of non-trivial dynamics during cosmic inflation, in which one or more background quantities experienced small but rapid deviations from their characteristic slow-roll evolution. We consider two families of features: localised features and oscillatory extended features. To characterise them we employ various possible templates parametrising their scale dependence and provide forecasts on the constraints on these parametrisations for LSST like surveys. We perform a Fisher matrix analysis for three observables: cosmic microwave background (CMB), galaxy clustering and weak lensing. We find that the combined data set of these observables will be able to limit the presence of features down to levels that are more restrictive than current constraints coming from CMB observations only. In particular, we address the possibility of gaining information on currently known deviations from scale invariance inferred from CMB data, such as the feature appearing at the l ~ 20 multipole (which is the main contribution to the low-l deficit) and another one around l ~ 800.

  1. A combined pharmacophore modeling, 3D-QSAR and molecular docking study of substituted bicyclo-[3.3.0]oct-2-enes as liver receptor homolog-1 (LRH-1) agonists

    NASA Astrophysics Data System (ADS)

    Lalit, Manisha; Gangwal, Rahul P.; Dhoke, Gaurao V.; Damre, Mangesh V.; Khandelwal, Kanchan; Sangamwar, Abhay T.

    2013-10-01

    A combined pharmacophore modelling, 3D-QSAR and molecular docking approach was employed to reveal structural and chemical features essential for the development of small molecules as LRH-1 agonists. The best HypoGen pharmacophore hypothesis (Hypo1) consists of one hydrogen-bond donor (HBD), two general hydrophobic (H), one hydrophobic aromatic (HYAr) and one hydrophobic aliphatic (HYA) feature. It exhibited a high correlation coefficient of 0.927, a cost difference of 85.178 bits and a low RMS value of 1.411. This pharmacophore hypothesis was cross-validated using a test set, a decoy set and the Cat-Scramble methodology. Subsequently, the validated pharmacophore hypothesis was used in the screening of small chemical databases. Further, 3D-QSAR models were developed based on the alignment obtained using substructure alignment. The best CoMFA and CoMSIA models exhibited excellent r^2_ncv values of 0.991 and 0.987, and r^2_cv values of 0.767 and 0.703, respectively. A CoMFA-predicted r^2_pred of 0.87 and a CoMSIA-predicted r^2_pred of 0.78 showed that the predicted values were in good agreement with the experimental values. Molecular docking analysis reveals that π-π interaction with His390 and hydrogen bond interaction with His390/Arg393 are essential for LRH-1 agonistic activity. The results from pharmacophore modelling, 3D-QSAR and molecular docking are complementary to each other and could serve as a powerful tool for the discovery of potent small molecules as LRH-1 agonists.

  2. The geometry of distributional preferences and a non-parametric identification approach: The Equality Equivalence Test.

    PubMed

    Kerschbamer, Rudolf

    2015-05-01

    This paper proposes a geometric delineation of distributional preference types and a non-parametric approach for their identification in a two-person context. It starts with a small set of assumptions on preferences and shows that this set (i) naturally results in a taxonomy of distributional archetypes that nests all empirically relevant types considered in previous work; and (ii) gives rise to a clean experimental identification procedure - the Equality Equivalence Test - that discriminates between archetypes according to core features of preferences rather than properties of specific modeling variants. As a by-product the test yields a two-dimensional index of preference intensity.

  3. Emotion-independent face recognition

    NASA Astrophysics Data System (ADS)

    De Silva, Liyanage C.; Esther, Kho G. P.

    2000-12-01

    Current face recognition techniques tend to work well when recognizing faces under small variations in lighting, facial expression and pose, but deteriorate under more extreme conditions. In this paper, a face recognition system to recognize faces of known individuals, despite variations in facial expression due to different emotions, is developed. The eigenface approach is used for feature extraction. Classification methods include Euclidean distance, back propagation neural network and generalized regression neural network. These methods yield 100% recognition accuracy when the training database is representative, containing one image representing the peak expression for each emotion of each person apart from the neutral expression. The feature vectors used for comparison in the Euclidean distance method and for training the neural network must be all the feature vectors of the training set. These results are obtained for a face database consisting of only four persons.
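
    A minimal eigenface-style sketch under these assumptions is shown below: faces are projected onto a handful of principal components and a probe image is assigned to the identity of its nearest training projection in Euclidean distance. Random vectors stand in for the four-person face database.

```python
# Hedged sketch of the eigenface pipeline: PCA projection followed by a
# nearest-neighbour (Euclidean distance) identity assignment.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(7)
n_people, imgs_per_person, img_dim = 4, 7, 32 * 32
X = rng.normal(size=(n_people * imgs_per_person, img_dim))   # placeholder "face images"
y = np.repeat(np.arange(n_people), imgs_per_person)

pca = PCA(n_components=10).fit(X)                   # "eigenfaces"
train_proj = pca.transform(X)
clf = KNeighborsClassifier(n_neighbors=1, metric="euclidean").fit(train_proj, y)

probe = X[5] + rng.normal(0, 0.1, img_dim)          # a slightly perturbed known face
print("predicted identity:", clf.predict(pca.transform(probe[None, :]))[0])
```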

  4. Quantitative Comparison of Minimum Inductance and Minimum Power Algorithms for the Design of Shim Coils for Small Animal Imaging

    PubMed Central

    HUDSON, PARISA; HUDSON, STEPHEN D.; HANDLER, WILLIAM B.; SCHOLL, TIMOTHY J.; CHRONIK, BLAINE A.

    2010-01-01

    High-performance shim coils are required for high-field magnetic resonance imaging and spectroscopy. Complete sets of high-power and high-performance shim coils were designed using two different methods: the minimum inductance and the minimum power target field methods. A quantitative comparison of shim performance in terms of merit of inductance (ML) and merit of resistance (MR) was made for shim coils designed using the minimum inductance and the minimum power design algorithms. In each design case, the difference in ML and the difference in MR given by the two design methods was <15%. Comparison of wire patterns obtained using the two design algorithms show that minimum inductance designs tend to feature oscillations within the current density; while minimum power designs tend to feature less rapidly varying current densities and lower power dissipation. Overall, the differences in coil performance obtained by the two methods are relatively small. For the specific case of shim systems customized for small animal imaging, the reduced power dissipation obtained when using the minimum power method is judged to be more significant than the improvements in switching speed obtained from the minimum inductance method. PMID:20411157

  5. Approximate maximum likelihood decoding of block codes

    NASA Technical Reports Server (NTRS)

    Greenberger, H. J.

    1979-01-01

    Approximate maximum likelihood decoding algorithms, based upon selecting a small set of candidate code words with the aid of the estimated probability of error of each received symbol, can give performance close to optimum with a reasonable amount of computation. By combining the best features of various algorithms and taking care to perform each step as efficiently as possible, a decoding scheme was developed which can decode codes which have better performance than those presently in use and yet not require an unreasonable amount of computation. The discussion of the details and tradeoffs of presently known efficient optimum and near optimum decoding algorithms leads, naturally, to the one which embodies the best features of all of them.

  6. Partial Least Squares Regression Can Aid in Detecting Differential Abundance of Multiple Features in Sets of Metagenomic Samples

    PubMed Central

    Libiger, Ondrej; Schork, Nicholas J.

    2015-01-01

    It is now feasible to examine the composition and diversity of microbial communities (i.e., “microbiomes”) that populate different human organs and orifices using DNA sequencing and related technologies. To explore the potential links between changes in microbial communities and various diseases in the human body, it is essential to test associations involving different species within and across microbiomes, environmental settings and disease states. Although a number of statistical techniques exist for carrying out relevant analyses, it is unclear which of these techniques exhibit the greatest statistical power to detect associations given the complexity of most microbiome datasets. We compared the statistical power of principal component regression, partial least squares regression, regularized regression, distance-based regression, Hill's diversity measures, and a modified test implemented in the popular and widely used microbiome analysis methodology “Metastats” across a wide range of simulated scenarios involving changes in feature abundance between two sets of metagenomic samples. For this purpose, simulation studies were used to change the abundance of microbial species in a real dataset from a published study examining human hands. Each technique was applied to the same data, and its ability to detect the simulated change in abundance was assessed. We hypothesized that a small subset of methods would outperform the rest in terms of the statistical power. Indeed, we found that the Metastats technique modified to accommodate multivariate analysis and partial least squares regression yielded high power under the models and data sets we studied. The statistical power of diversity measure-based tests, distance-based regression and regularized regression was significantly lower. Our results provide insight into powerful analysis strategies that utilize information on species counts from large microbiome data sets exhibiting skewed frequency distributions obtained on a small to moderate number of samples. PMID:26734061
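
    As a small illustration of the PLS idea, the sketch below fits partial least squares regression of relative abundances against the group label and flags the species with the largest absolute first-component weights; the simulation design, competing methods, and power comparison of the study are not reproduced.

```python
# Hedged sketch: PLS regression of relative abundances on the group label;
# species with the largest absolute first-component weights are candidates
# for differential abundance.  Counts are simulated placeholders.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(8)
n_per_group, n_species = 30, 200
counts = rng.poisson(5, size=(2 * n_per_group, n_species)).astype(float)
counts[n_per_group:, :3] *= 3                       # species 0-2 shifted in group 2
group = np.repeat([0.0, 1.0], n_per_group)
rel_abund = counts / counts.sum(axis=1, keepdims=True)

pls = PLSRegression(n_components=2).fit(rel_abund, group)
weights = np.abs(pls.x_weights_[:, 0])              # first-component feature weights
print("top candidate species:", np.argsort(weights)[::-1][:5])
```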

  7. What Can We Learn from a Simple Physics-Based Earthquake Simulator?

    NASA Astrophysics Data System (ADS)

    Artale Harris, Pietro; Marzocchi, Warner; Melini, Daniele

    2018-03-01

    Physics-based earthquake simulators are becoming a popular tool to investigate the earthquake occurrence process. So far, the development of earthquake simulators has commonly been led by the approach "the more physics, the better". However, this approach may hamper the comprehension of the outcomes of the simulator; in fact, within complex models, it may be difficult to understand which physical parameters are the most relevant to the features of the seismic catalog in which we are interested. For this reason, here, we take an opposite approach and analyze the behavior of a purposely simple earthquake simulator applied to a set of California faults. The idea is that a simple simulator may be more informative than a complex one for some specific scientific objectives, because it is more understandable. Our earthquake simulator has three main components: the first one is a realistic tectonic setting, i.e., a fault data set of California; the second is the application of quantitative laws for earthquake generation on each single fault, and the last is the fault interaction modeling through the Coulomb Failure Function. The analysis of this simple simulator shows that: (1) the short-term clustering can be reproduced by a set of faults with an almost periodic behavior, which interact according to a Coulomb failure function model; (2) a long-term behavior showing supercycles of the seismic activity exists only in a markedly deterministic framework, and quickly disappears introducing a small degree of stochasticity on the recurrence of earthquakes on a fault; (3) faults that are strongly coupled in terms of Coulomb failure function model are synchronized in time only in a marked deterministic framework, and as before, such a synchronization disappears introducing a small degree of stochasticity on the recurrence of earthquakes on a fault. Overall, the results show that even in a simple and perfectly known earthquake occurrence world, introducing a small degree of stochasticity may blur most of the deterministic time features, such as long-term trend and synchronization among nearby coupled faults.

  8. Recovery of sparse translation-invariant signals with continuous basis pursuit

    PubMed Central

    Ekanadham, Chaitanya; Tranchina, Daniel; Simoncelli, Eero

    2013-01-01

    We consider the problem of decomposing a signal into a linear combination of features, each a continuously translated version of one of a small set of elementary features. Although these constituents are drawn from a continuous family, most current signal decomposition methods rely on a finite dictionary of discrete examples selected from this family (e.g., shifted copies of a set of basic waveforms), and apply sparse optimization methods to select and solve for the relevant coefficients. Here, we generate a dictionary that includes auxiliary interpolation functions that approximate translates of features via adjustment of their coefficients. We formulate a constrained convex optimization problem, in which the full set of dictionary coefficients represents a linear approximation of the signal, the auxiliary coefficients are constrained so as to only represent translated features, and sparsity is imposed on the primary coefficients using an L1 penalty. The basis pursuit denoising (BP) method may be seen as a special case, in which the auxiliary interpolation functions are omitted, and we thus refer to our methodology as continuous basis pursuit (CBP). We develop two implementations of CBP for a one-dimensional translation-invariant source, one using a first-order Taylor approximation, and another using a form of trigonometric spline. We examine the tradeoff between sparsity and signal reconstruction accuracy in these methods, demonstrating empirically that trigonometric CBP substantially outperforms Taylor CBP, which in turn offers substantial gains over ordinary BP. In addition, the CBP bases can generally achieve equally good or better approximations with much coarser sampling than BP, leading to a reduction in dictionary dimensionality. PMID:24352562

  9. Differential prioritization between relevance and redundancy in correlation-based feature selection techniques for multiclass gene expression data.

    PubMed

    Ooi, Chia Huey; Chetty, Madhu; Teng, Shyh Wei

    2006-06-23

    Due to the large number of genes in a typical microarray dataset, feature selection looks set to play an important role in reducing noise and computational cost in gene expression-based tissue classification while improving accuracy at the same time. Surprisingly, this does not appear to be the case for all multiclass microarray datasets. The reason is that many feature selection techniques applied on microarray datasets are either rank-based and hence do not take into account correlations between genes, or are wrapper-based, which require high computational cost, and often yield difficult-to-reproduce results. In studies where correlations between genes are considered, attempts to establish the merit of the proposed techniques are hampered by evaluation procedures which are less than meticulous, resulting in overly optimistic estimates of accuracy. We present two realistically evaluated correlation-based feature selection techniques which incorporate, in addition to the two existing criteria involved in forming a predictor set (relevance and redundancy), a third criterion called the degree of differential prioritization (DDP). DDP functions as a parameter to strike the balance between relevance and redundancy, providing our techniques with the novel ability to differentially prioritize the optimization of relevance against redundancy (and vice versa). This ability proves useful in producing optimal classification accuracy while using reasonably small predictor set sizes for nine well-known multiclass microarray datasets. For multiclass microarray datasets, especially the GCM and NCI60 datasets, DDP enables our filter-based techniques to produce accuracies better than those reported in previous studies which employed similarly realistic evaluation procedures.
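
    The relevance-redundancy trade-off can be sketched with a simple greedy filter in which a DDP-like exponent alpha shifts weight between the two criteria; the scoring formula below is an illustrative construction, not the authors' exact measure, and the data are random placeholders.

```python
# Hedged sketch of a DDP-style greedy filter: each candidate gene is scored by
# relevance**alpha / redundancy**(1 - alpha), where relevance is |correlation
# with the class| and redundancy is the mean |correlation| with genes already
# selected.  Illustrative only; not the authors' exact formulation.
import numpy as np

def ddp_select(X, y, n_select=10, alpha=0.7, eps=1e-6):
    rel = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    selected = [int(np.argmax(rel))]
    while len(selected) < n_select:
        best_j, best_score = None, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            red = np.mean([abs(np.corrcoef(X[:, j], X[:, s])[0, 1]) for s in selected])
            score = rel[j] ** alpha / (red + eps) ** (1 - alpha)
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

rng = np.random.default_rng(9)
X = rng.normal(size=(60, 100))
y = rng.integers(0, 3, size=60)                     # three tissue classes
X[:, :5] += y[:, None]                              # five relevant, mutually redundant genes
print(ddp_select(X, y, n_select=5, alpha=0.7))
```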

  10. Linear and Non-Linear Visual Feature Learning in Rat and Humans

    PubMed Central

    Bossens, Christophe; Op de Beeck, Hans P.

    2016-01-01

    The visual system processes visual input in a hierarchical manner in order to extract relevant features that can be used in tasks such as invariant object recognition. Although typically investigated in primates, recent work has shown that rats can be trained in a variety of visual object and shape recognition tasks. These studies did not pinpoint the complexity of the features used by these animals. Many tasks might be solved by using a combination of relatively simple features which tend to be correlated. Alternatively, rats might extract complex features or feature combinations which are nonlinear with respect to those simple features. In the present study, we address this question by starting from a small stimulus set for which one stimulus-response mapping involves a simple linear feature to solve the task while another mapping needs a well-defined nonlinear combination of simpler features related to shape symmetry. We verified computationally that the nonlinear task cannot be trivially solved by a simple V1-model. We show how rats are able to solve the linear feature task but are unable to acquire the nonlinear feature. In contrast, humans are able to use the nonlinear feature and are even faster in uncovering this solution as compared to the linear feature. The implications for the computational capabilities of the rat visual system are discussed. PMID:28066201

  11. Impacts of uncertainties in European gridded precipitation observations on regional climate analysis.

    PubMed

    Prein, Andreas F; Gobiet, Andreas

    2017-01-01

    Gridded precipitation data sets are frequently used to evaluate climate models or to remove model output biases. Although precipitation data are error-prone due to the high spatio-temporal variability of precipitation and due to considerable measurement errors, relatively few attempts have been made to account for observational uncertainty in model evaluation or in bias correction studies. In this study, we compare three types of European daily data sets featuring two Pan-European data sets and a set that combines eight very high-resolution station-based regional data sets. Furthermore, we investigate seven widely used, larger scale global data sets. Our results demonstrate that the differences between these data sets have the same magnitude as precipitation errors found in regional climate models. Therefore, including observational uncertainties is essential for climate studies, climate model evaluation, and statistical post-processing. Based on these results, we suggest the following guidelines for regional precipitation assessments. (1) Include multiple observational data sets from different sources (e.g. station, satellite, reanalysis based) to estimate observational uncertainties. (2) Use data sets with high station densities to minimize the effect of precipitation undersampling (may induce about 60% error in data-sparse regions). The information content of a gridded data set is mainly related to its underlying station density and not to its grid spacing. (3) Consider undercatch errors of up to 80% in high latitudes and mountainous regions. (4) Analyses of small-scale features and extremes are especially uncertain in gridded data sets. For higher confidence, use climate-mean and larger scale statistics. In conclusion, neglecting observational uncertainties potentially misguides climate model development and can severely affect the results of climate change impact assessments.

  12. When will Low-Contrast Features be Visible in a STEM X-Ray Spectrum Image?

    PubMed

    Parish, Chad M

    2015-06-01

    When will a small or low-contrast feature, such as an embedded second-phase particle, be visible in a scanning transmission electron microscopy (STEM) X-ray map? This work illustrates a computationally inexpensive method to simulate X-ray maps and spectrum images (SIs), based upon the equations of X-ray generation and detection. To particularize the general procedure, an example of a nanostructured ferritic alloy (NFA) containing nm-sized Y2Ti2O7 precipitates embedded in a ferritic stainless steel matrix is chosen. The proposed model produces physically plausible simulated SI data sets, which can either be reduced to X-ray dot maps or analyzed via multivariate statistical analysis. NFA X-ray maps acquired using three different STEM instruments match the generated simulations quite well, despite the large number of simplifying assumptions used. A figure of merit, electron dose multiplied by X-ray collection solid angle, is proposed to compare feature detectability from one data set (simulated or experimental) to another. The proposed method can be used to scope which experiments are feasible under specific analysis conditions on a given microscope. Future applications, such as spallation proton-neutron irradiations, core-shell nanoparticles, or dopants in polycrystalline photovoltaic solar cells, are proposed.

  13. Density-Dependent Quantized Least Squares Support Vector Machine for Large Data Sets.

    PubMed

    Nan, Shengyu; Sun, Lei; Chen, Badong; Lin, Zhiping; Toh, Kar-Ann

    2017-01-01

    Based on the knowledge that input data distribution is important for learning, a data density-dependent quantization scheme (DQS) is proposed for sparse input data representation. The usefulness of the representation scheme is demonstrated by using it as a data preprocessing unit attached to the well-known least squares support vector machine (LS-SVM) for application on big data sets. Essentially, the proposed DQS adopts a single shrinkage threshold to obtain a simple quantization scheme, which adapts its outputs to input data density. With this quantization scheme, a large data set is quantized to a small subset where considerable sample size reduction is generally obtained. In particular, the sample size reduction can save significant computational cost when using the quantized subset for feature approximation via the Nyström method. Based on the quantized subset, the approximated features are incorporated into LS-SVM to develop a data density-dependent quantized LS-SVM (DQLS-SVM), where an analytic solution is obtained in the primal solution space. The developed DQLS-SVM is evaluated on synthetic and benchmark data with particular emphasis on large data sets. Extensive experimental results show that the learning machine incorporating DQS attains not only high computational efficiency but also good generalization performance.
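
    A rough sketch of the pipeline described above follows: quantize the data to a small landmark subset, build Nyström features from the landmarks, and solve a least-squares (LS-SVM-style) problem in the primal. The simple distance-threshold quantizer stands in for the paper's density-dependent DQS, and the RBF kernel, threshold, and regularization values are illustrative.

    # Sketch: landmark quantization -> Nystrom feature approximation ->
    # primal least-squares solution. Illustrative only.
    import numpy as np

    def quantize(X, threshold):
        """Keep a point as a new landmark only if it is farther than
        `threshold` from all landmarks kept so far, so dense regions
        contribute fewer landmarks."""
        landmarks = [X[0]]
        for x in X[1:]:
            if np.min(np.linalg.norm(np.array(landmarks) - x, axis=1)) > threshold:
                landmarks.append(x)
        return np.array(landmarks)

    def nystrom_features(X, landmarks, gamma=0.5):
        """Approximate RBF kernel features using the landmark subset."""
        def rbf(A, B):
            d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * d2)
        W = rbf(landmarks, landmarks)
        U, s, _ = np.linalg.svd(W)
        M = U / np.sqrt(np.maximum(s, 1e-12))      # W^(-1/2) factor
        return rbf(X, landmarks) @ M

    def lssvm_primal(Phi, y, lam=1e-2):
        """Regularized least-squares (ridge) solution in the primal feature space."""
        A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
        return np.linalg.solve(A, Phi.T @ y)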

  14. AVC: Selecting discriminative features on basis of AUC by maximizing variable complementarity.

    PubMed

    Sun, Lei; Wang, Jun; Wei, Jinmao

    2017-03-14

    The Receiver Operator Characteristic (ROC) curve is well known for evaluating classification performance in the biomedical field. Owing to its superiority in dealing with imbalanced and cost-sensitive data, the ROC curve has been exploited as a popular metric to evaluate and identify disease-related genes (features). The existing ROC-based feature selection approaches are simple and effective in evaluating individual features. However, these approaches may fail to find the true target feature subset because they lack effective means to reduce the redundancy between features, which is essential in machine learning. In this paper, we propose to assess feature complementarity by measuring the distances between misclassified instances and their nearest misses on the dimensions of pairwise features. If a misclassified instance and its nearest miss on one feature dimension are far apart on another feature dimension, the two features are regarded as complementary to each other. Subsequently, we propose a novel filter feature selection approach on the basis of ROC analysis. The new approach employs an efficient heuristic search strategy to select optimal features with the highest complementarity. The experimental results on a broad range of microarray data sets validate that classifiers built on the feature subset selected by our approach achieve a minimal balanced error rate with a small number of significant features. Compared with other ROC-based feature selection approaches, our new approach selects fewer features and effectively improves the classification performance.
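
    One way to read the complementarity measure described above is sketched below: for instances that a single feature cannot separate from their nearest opposite-class neighbors (nearest misses) along that feature, measure how far apart the pair sits along a second feature; a large average separation marks the two features as complementary. This is a simplified interpretation; the AUC-based relevance term and the heuristic search of the actual method are omitted.

    # Rough sketch of a pairwise feature complementarity score; the exact
    # definitions used in the paper are simplified here.
    import numpy as np

    def complementarity(X, y, i, j):
        """Average distance on feature j between an instance and its nearest
        miss on feature i, over instances whose nearest neighbor on feature i
        carries the other class label (feature i alone fails on them)."""
        fi, fj = X[:, i], X[:, j]
        total, count = 0.0, 0
        for k in range(len(y)):
            others = np.arange(len(y)) != k
            nn = np.where(others)[0][np.argmin(np.abs(fi[others] - fi[k]))]
            if y[nn] != y[k]:                       # feature i fails on instance k
                misses = np.where(y != y[k])[0]
                nm = misses[np.argmin(np.abs(fi[misses] - fi[k]))]  # nearest miss on i
                total += abs(fj[nm] - fj[k])
                count += 1
        return total / count if count else 0.0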

  15. MR Image Analytics to Characterize the Upper Airway Structure in Obese Children with Obstructive Sleep Apnea Syndrome

    PubMed Central

    Tong, Yubing; Udupa, Jayaram K.; Sin, Sanghun; Liu, Zhengbing; Wileyto, E. Paul; Torigian, Drew A.; Arens, Raanan

    2016-01-01

    Purpose Quantitative image analysis in previous research in obstructive sleep apnea syndrome (OSAS) has focused on the upper airway or several objects in its immediate vicinity and measures of object size. In this paper, we take a more general approach of considering all major objects in the upper airway region and measures pertaining to their individual morphological properties, their tissue characteristics revealed by image intensities, and the 3D architecture of the object assembly. We propose a novel methodology to select a small set of salient features from this large collection of measures and demonstrate the ability of these features to discriminate with very high prediction accuracy between obese OSAS and obese non-OSAS groups. Materials and Methods Thirty children were involved in this study, 15 in the obese OSAS group (apnea-hypopnea index (AHI) = 14.4 ± 10.7) and 15 in the obese non-OSAS group (AHI = 1.0 ± 1.0; p < 0.001). Subjects were between 8 and 17 years of age and underwent T1- and T2-weighted magnetic resonance imaging (MRI) of the upper airway during wakefulness. Fourteen objects in the vicinity of the upper airways were segmented in these images, and a total of 159 measurements were derived from each subject's images, including object size, surface area, volume, sphericity, standardized T2-weighted image intensity value, and inter-object distances. A small set of discriminating features was identified from this set in several steps. First, a subset of measures with a low level of correlation among the measures was determined. A heat map visualization technique that allows grouping of parameters based on correlations among them was used for this purpose. Then, through t-tests, another subset of measures capable of separating the two groups was identified. The intersection of these subsets yielded the final feature set. The ability of these features to classify unseen images into the two patient groups was tested using logistic regression and multi-fold cross-validation. Results A set of 16 features identified with low inter-feature correlation (< 0.36) yielded a high classification accuracy of 96%, with sensitivity and specificity of 97.8% and 94.4%, respectively. In addition to the previously observed increase in linear size, surface area, and volume of the adenoid, tonsils, and fat pad in OSAS, the following new markers have been found. Standardized T2-weighted image intensities differed between the two groups for the entire neck body region, pharynx, and nasopharynx, possibly indicating changes in object tissue characteristics. The fat pad and oropharynx become less round or more complex in shape in OSAS. The fat pad and tongue move closer together in OSAS, as do the oropharynx and tonsils and the fat pad and tonsils. In contrast, the fat pad and oropharynx move farther away from the skin object. Conclusions The study has found several new anatomic biomarkers of OSAS. Changes in standardized T2-weighted image intensities in objects may imply that intrinsic tissue composition undergoes changes in OSAS. The results on inter-object distances imply that treatment methods should respect the relationships that exist among objects and not just their size. The proposed method of analysis may lead to an improved understanding of the mechanisms underlying OSAS. PMID:27487240
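
    A minimal sketch of the feature-screening steps described above is given below: keep measures with low mutual correlation, keep measures that separate the two groups by a t-test, intersect the two subsets, and evaluate the result with logistic regression under cross-validation. The thresholds, the greedy correlation filter, and the evaluation protocol are illustrative, not the study's exact procedure.

    # Sketch of the correlation-filter + t-test + intersection screening,
    # followed by cross-validated logistic regression. Illustrative values.
    import numpy as np
    from scipy.stats import ttest_ind
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    def select_features(X, y, corr_thresh=0.36, p_thresh=0.05):
        corr = np.abs(np.corrcoef(X, rowvar=False))
        low_corr = []
        for j in range(X.shape[1]):       # greedily keep weakly correlated measures
            if all(corr[j, k] < corr_thresh for k in low_corr):
                low_corr.append(j)
        discriminative = [j for j in range(X.shape[1])
                          if ttest_ind(X[y == 0, j], X[y == 1, j]).pvalue < p_thresh]
        return sorted(set(low_corr) & set(discriminative))

    def evaluate(X, y, features):
        clf = LogisticRegression(max_iter=1000)
        return cross_val_score(clf, X[:, features], y, cv=5).mean()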

  16. Spectral-Spatial Shared Linear Regression for Hyperspectral Image Classification.

    PubMed

    Haoliang Yuan; Yuan Yan Tang

    2017-04-01

    Classification of the pixels in a hyperspectral image (HSI) is an important task that has been popularly applied in many practical applications. Its major challenge is the high-dimensional, small-sample-size problem. To deal with this problem, many subspace learning (SL) methods have been developed to reduce the dimension of the pixels while preserving the important discriminant information. Motivated by the ridge linear regression (RLR) framework for SL, we propose a spectral-spatial shared linear regression method (SSSLR) for extracting the feature representation. Compared with RLR, our proposed SSSLR has the following two advantages. First, we utilize a convex set to explore the spatial structure for computing the linear projection matrix. Second, we utilize a shared structure learning model, formed by the original data space and a hidden feature space, to learn a more discriminative linear projection matrix for classification. To optimize our proposed method, an efficient iterative algorithm is proposed. Experimental results on two popular HSI data sets, i.e., Indian Pines and Salinas, demonstrate that our proposed methods outperform many SL methods.

  17. Feature singletons attract spatial attention independently of feature priming

    PubMed Central

    Yashar, Amit; White, Alex L.; Fang, Wanghaoming; Carrasco, Marisa

    2017-01-01

    People perform better in visual search when the target feature repeats across trials (intertrial feature priming [IFP]). Here, we investigated whether repetition of a feature singleton's color modulates stimulus-driven shifts of spatial attention by presenting a probe stimulus immediately after each singleton display. The task alternated every two trials between a probe discrimination task and a singleton search task. We measured both stimulus-driven spatial attention (via the distance between the probe and singleton) and IFP (via repetition of the singleton's color). Color repetition facilitated search performance (IFP effect) when the set size was small. When the probe appeared at the singleton's location, performance was better than at the opposite location (stimulus-driven attention effect). The magnitude of this attention effect increased with the singleton's set size (which increases its saliency) but did not depend on whether the singleton's color repeated across trials, even when the previous singleton had been attended as a search target. Thus, our findings show that repetition of a salient singleton's color affects performance when the singleton is task relevant and voluntarily attended (as in search trials). However, color repetition does not affect performance when the singleton becomes irrelevant to the current task, even though the singleton does capture attention (as in probe trials). Therefore, color repetition per se does not make a singleton more salient for stimulus-driven attention. Rather, we suggest that IFP requires voluntary selection of color singletons in each consecutive trial. PMID:28800369

  18. Feature singletons attract spatial attention independently of feature priming.

    PubMed

    Yashar, Amit; White, Alex L; Fang, Wanghaoming; Carrasco, Marisa

    2017-08-01

    People perform better in visual search when the target feature repeats across trials (intertrial feature priming [IFP]). Here, we investigated whether repetition of a feature singleton's color modulates stimulus-driven shifts of spatial attention by presenting a probe stimulus immediately after each singleton display. The task alternated every two trials between a probe discrimination task and a singleton search task. We measured both stimulus-driven spatial attention (via the distance between the probe and singleton) and IFP (via repetition of the singleton's color). Color repetition facilitated search performance (IFP effect) when the set size was small. When the probe appeared at the singleton's location, performance was better than at the opposite location (stimulus-driven attention effect). The magnitude of this attention effect increased with the singleton's set size (which increases its saliency) but did not depend on whether the singleton's color repeated across trials, even when the previous singleton had been attended as a search target. Thus, our findings show that repetition of a salient singleton's color affects performance when the singleton is task relevant and voluntarily attended (as in search trials). However, color repetition does not affect performance when the singleton becomes irrelevant to the current task, even though the singleton does capture attention (as in probe trials). Therefore, color repetition per se does not make a singleton more salient for stimulus-driven attention. Rather, we suggest that IFP requires voluntary selection of color singletons in each consecutive trial.

  19. Incorporation of local structure into kriging models for the prediction of atomistic properties in the water decamer.

    PubMed

    Davie, Stuart J; Di Pasquale, Nicodemo; Popelier, Paul L A

    2016-10-15

    Machine learning algorithms have been demonstrated to predict atomistic properties approaching the accuracy of quantum chemical calculations at significantly less computational cost. Difficulties arise, however, when attempting to apply these techniques to large systems, or systems possessing excessive conformational freedom. In this article, the machine learning method kriging is applied to predict both the intra-atomic and interatomic energies, as well as the electrostatic multipole moments, of the atoms of a water molecule at the center of a 10 water molecule (decamer) cluster. Unlike previous work, where the properties of small water clusters were predicted using a molecular local frame, and where training set inputs (features) were based on atomic index, a variety of feature definitions and coordinate frames are considered here to increase prediction accuracy. It is shown that, for a water molecule at the center of a decamer, no single method of defining features or coordinate schemes is optimal for every property. However, explicitly accounting for the structure of the first solvation shell in the definition of the features of the kriging training set, and centring the coordinate frame on the atom-of-interest will, in general, return better predictions than models that apply the standard methods of feature definition, or a molecular coordinate frame. © 2016 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.

  20. htsint: a Python library for sequencing pipelines that combines data through gene set generation.

    PubMed

    Richards, Adam J; Herrel, Anthony; Bonneaud, Camille

    2015-09-24

    Sequencing technologies provide a wealth of details in terms of genes, expression, splice variants, polymorphisms, and other features. A standard for sequencing analysis pipelines is to put genomic or transcriptomic features into a context of known functional information, but the relationships between ontology terms are often ignored. For RNA-Seq, considering genes and their genetic variants at the group level enables a convenient way to both integrate annotation data and detect small coordinated changes between experimental conditions, a known caveat of gene level analyses. We introduce the high throughput data integration tool, htsint, as an extension to the commonly used gene set enrichment frameworks. The central aim of htsint is to compile annotation information from one or more taxa in order to calculate functional distances among all genes in a specified gene space. Spectral clustering is then used to partition the genes, thereby generating functional modules. The gene space can range from a targeted list of genes, like a specific pathway, all the way to an ensemble of genomes. Given a collection of gene sets and a count matrix of transcriptomic features (e.g. expression, polymorphisms), the gene sets produced by htsint can be tested for 'enrichment' or conditional differences using one of a number of commonly available packages. The database and bundled tools to generate functional modules were designed with sequencing pipelines in mind, but the toolkit nature of htsint allows it to also be used in other areas of genomics. The software is freely available as a Python library through GitHub at https://github.com/ajrichards/htsint.

  1. Principal component analysis-based unsupervised feature extraction applied to in silico drug discovery for posttraumatic stress disorder-mediated heart disease.

    PubMed

    Taguchi, Y-h; Iwadate, Mitsuo; Umeyama, Hideaki

    2015-04-30

    Feature extraction (FE) is difficult, particularly if there are more features than samples, as small sample numbers often result in biased outcomes or overfitting. Furthermore, multiple sample classes often complicate FE because evaluating performance, as is usual in supervised FE, is generally harder than in the two-class problem. Developing unsupervised methods that are independent of sample classification would solve many of these problems. Two principal component analysis (PCA)-based FE methods were tested as sample-classification-independent unsupervised FE methods: variational Bayes PCA (VBPCA), which was extended to perform unsupervised FE, and conventional PCA (CPCA)-based unsupervised FE. VBPCA- and CPCA-based unsupervised FE both performed well when applied to simulated data and to a posttraumatic stress disorder (PTSD)-mediated heart disease data set that had multiple categorical class observations of mRNA/microRNA expression in stressed mouse heart. A critical set of PTSD miRNAs/mRNAs was identified that shows aberrant expression between treatment and control samples, and significant, negative correlation with one another. Moreover, greater stability and biological feasibility than conventional supervised FE were also demonstrated. Based on the results obtained, in silico drug discovery was performed as translational validation of the methods. Our two proposed unsupervised FE methods (CPCA- and VBPCA-based) worked well on simulated data, and outperformed two conventional supervised FE methods on a real data set. Thus, the two methods appear equally well suited for FE on categorical multiclass data sets, with potential translational utility for in silico drug discovery.
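
    A highly simplified sketch of PCA-based unsupervised feature extraction is shown below: compute principal components across samples and keep the genes that contribute most strongly to the leading components. The published method, including the variational Bayes variant and its P-value-based gene selection, is more involved; the component count and gene count here are illustrative.

    # Simplified PCA-based unsupervised feature (gene) selection.
    import numpy as np

    def pca_unsupervised_fe(X, n_components=2, n_genes=50):
        """X: samples x genes expression matrix. Returns indices of selected genes."""
        Xc = X - X.mean(axis=0)                        # center each gene
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        loadings = Vt[:n_components]                   # components x genes
        score = np.sqrt((loadings ** 2).sum(axis=0))   # contribution of each gene
        return np.argsort(score)[::-1][:n_genes]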

  2. Training of polyp staging systems using mixed imaging modalities.

    PubMed

    Wimmer, Georg; Gadermayr, Michael; Kwitt, Roland; Häfner, Michael; Tamaki, Toru; Yoshida, Shigeto; Tanaka, Shinji; Merhof, Dorit; Uhl, Andreas

    2018-05-04

    In medical image data sets, the number of images is usually quite small. The small number of training samples does not allow classifiers to be trained properly, which leads to massive overfitting to the training data. In this work, we investigate whether increasing the number of training samples by merging datasets from different imaging modalities can effectively improve predictive performance. Further, we investigate whether the features extracted from the employed image representations differ between imaging modalities and whether domain adaptation helps to overcome these differences. We employ twelve feature extraction methods to differentiate between non-neoplastic and neoplastic lesions. Experiments are performed using four different classifier training strategies, each with a different combination of training data. The specifically designed setup for these experiments enables a fair comparison between the four training strategies. Combining high definition with high magnification training data and chromoscopic with non-chromoscopic training data partly improved the results. Using domain adaptation has only a small effect on the results compared to just using non-adapted training data. Merging datasets from different imaging modalities turned out to be partially beneficial for the case of combining high definition endoscopic data with high magnification endoscopic data and for combining chromoscopic with non-chromoscopic data. NBI and chromoendoscopy, on the other hand, are mostly too different with respect to the extracted features to combine images of these two modalities for classifier training. Copyright © 2018 Elsevier Ltd. All rights reserved.

  3. Three small deployed satellites

    NASA Image and Video Library

    2012-10-04

    ISS033-E-009286 (4 Oct. 2012) --- Several tiny satellites are featured in this image photographed by an Expedition 33 crew member on the International Space Station. The satellites were released outside the Kibo laboratory using a Small Satellite Orbital Deployer attached to the Japanese module’s robotic arm on Oct. 4, 2012. Japan Aerospace Exploration Agency astronaut Aki Hoshide, flight engineer, set up the satellite deployment gear inside the lab and placed it in the Kibo airlock. The Japanese robotic arm then grappled the deployment system and its satellites from the airlock for deployment. A portion of the station’s solar array panels and a blue and white part of Earth provide the backdrop for the scene.

  4. Three small deployed satellites

    NASA Image and Video Library

    2012-10-04

    ISS033-E-009285 (4 Oct. 2012) --- Several tiny satellites are featured in this image photographed by an Expedition 33 crew member on the International Space Station. The satellites were released outside the Kibo laboratory using a Small Satellite Orbital Deployer attached to the Japanese module’s robotic arm on Oct. 4, 2012. Japan Aerospace Exploration Agency astronaut Aki Hoshide, flight engineer, set up the satellite deployment gear inside the lab and placed it in the Kibo airlock. The Japanese robotic arm then grappled the deployment system and its satellites from the airlock for deployment. A portion of the station’s solar array panels and a blue and white part of Earth provide the backdrop for the scene.

  5. Driving on the surface of Mars with the rover sequencing and visualization program

    NASA Technical Reports Server (NTRS)

    Wright, J.; Hartman, F.; Cooper, B.; Maxwell, S.; Yen, J.; Morrison, J.

    2005-01-01

    Operating a rover on Mars is not possible using teleoperation because of the distance involved and the bandwidth limitations. Operating these rovers requires sophisticated tools that make operators aware of the terrain, hazards, features of interest, and rover state and limitations, and that support building command sequences and rehearsing expected operations. This paper discusses how the Rover Sequencing and Visualization program and a small set of associated tools support this requirement.

  6. A new radar determination of the spin vector of Venus

    NASA Technical Reports Server (NTRS)

    Zohar, S.; Goldstein, R. M.; Rumsey, H. C.

    1980-01-01

    Two radar observations of a set of three relatively small features on the surface of Venus have facilitated a refined determination of the spin vector of Venus. The period is found to be 243.019 ± 0.014 days, while the obliquity is 177.22 ± 0.18 deg. The effects of deviations from exact sphericity on the interpretation of the measurements are discussed at length and the question of resonance with Earth is reexamined.

  7. Combining image processing and modeling to generate traces of beta-strands from cryo-EM density images of beta-barrels.

    PubMed

    Si, Dong; He, Jing

    2014-01-01

    The electron cryo-microscopy (cryo-EM) technique produces 3-dimensional (3D) density images of proteins. When the resolution of the images is not high enough to resolve the molecular details, it is challenging for image processing methods to enhance the molecular features. A β-barrel is a particular structural feature that is formed by multiple β-strands in a barrel shape. There is no existing method to derive β-strands from the 3D image of a β-barrel at medium resolutions. We propose a new method, StrandRoller, to generate a small set of possible β-traces from density images at medium resolutions of 5-10 Å. StrandRoller has been tested using eleven β-barrel images simulated to 10 Å resolution and one image isolated from an experimentally derived cryo-EM density image at 6.7 Å resolution. StrandRoller was able to detect 81.84% of the β-strands with an overall 1.5 Å two-way distance between the detected and the observed β-traces, if the best of fifteen detections is considered. Our results suggest that it is possible to derive a small set of possible β-traces from a β-barrel cryo-EM image at medium resolutions, even when no separation of the β-strands is visible in the images.

  8. ToxiM: A Toxicity Prediction Tool for Small Molecules Developed Using Machine Learning and Chemoinformatics Approaches.

    PubMed

    Sharma, Ashok K; Srivastava, Gopal N; Roy, Ankita; Sharma, Vineet K

    2017-01-01

    The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews's correlation coefficient (0.84). The performances of all the three models were comparable (Matthews's correlation coefficient = 0.84-0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better (R2 = 0.84) than the multi-linear regression (MLR) and partial least square regression (PLSR) models, whereas, the partial least squares based regression model for the prediction of permeability (caco-2) performed better (R2 = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy. The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules.

  9. ToxiM: A Toxicity Prediction Tool for Small Molecules Developed Using Machine Learning and Chemoinformatics Approaches

    PubMed Central

    Sharma, Ashok K.; Srivastava, Gopal N.; Roy, Ankita; Sharma, Vineet K.

    2017-01-01

    The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews's correlation coefficient (0.84). The performances of all the three models were comparable (Matthews's correlation coefficient = 0.84–0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better (R2 = 0.84) than the multi-linear regression (MLR) and partial least square regression (PLSR) models, whereas, the partial least squares based regression model for the prediction of permeability (caco-2) performed better (R2 = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy. The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules. PMID:29249969
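
    The following bare-bones sketch illustrates a descriptor-based toxicity classifier in the spirit described above, assuming RDKit for descriptor calculation and a random forest classifier; the handful of descriptors, the placeholder training lists, and the model settings are illustrative and far simpler than what the actual ToxiM server uses.

    # Sketch: molecular descriptors -> random forest toxicity classifier.
    from rdkit import Chem
    from rdkit.Chem import Descriptors
    from sklearn.ensemble import RandomForestClassifier

    def featurize(smiles_list):
        feats = []
        for smi in smiles_list:
            mol = Chem.MolFromSmiles(smi)          # assumes valid SMILES strings
            feats.append([Descriptors.MolWt(mol),
                          Descriptors.MolLogP(mol),
                          Descriptors.TPSA(mol),
                          Descriptors.NumHDonors(mol),
                          Descriptors.NumHAcceptors(mol)])
        return feats

    def train(toxic_smiles, nontoxic_smiles):
        # toxic_smiles / nontoxic_smiles are placeholder training lists
        X = featurize(toxic_smiles) + featurize(nontoxic_smiles)
        y = [1] * len(toxic_smiles) + [0] * len(nontoxic_smiles)
        return RandomForestClassifier(n_estimators=200).fit(X, y)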

  10. Operator for object recognition and scene analysis by estimation of set occupancy with noisy and incomplete data sets

    NASA Astrophysics Data System (ADS)

    Rees, S. J.; Jones, Bryan F.

    1992-11-01

    Once feature extraction has occurred in a processed image, the recognition problem becomes one of defining a set of features which maps sufficiently well onto one of the defined shape/object models to permit a claimed recognition. This process is usually handled by aggregating features until a large enough weighting is obtained to claim membership, or an adequate number of located features are matched to the reference set. A requirement has existed for an operator or measure capable of a more direct assessment of membership/occupancy between feature sets, particularly where the feature sets may be defective representations. Such feature set errors may be caused by noise, by overlapping of objects, and by partial obscuration of features. These problems occur at the point of acquisition: repairing the data would then assume a priori knowledge of the solution. The technique described in this paper offers a set theoretical measure for partial occupancy defined in terms of the set of minimum additions to permit full occupancy and the set of locations of occupancy if such additions are made. As is shown, this technique permits recognition of partial feature sets with quantifiable degrees of uncertainty. A solution to the problems of obscuration and overlapping is therefore available.

  11. YSAR: a compact low-cost synthetic aperture radar

    NASA Astrophysics Data System (ADS)

    Thompson, Douglas G.; Arnold, David V.; Long, David G.; Miner, Gayle F.; Karlinsey, Thomas W.; Robertson, Adam E.

    1997-09-01

    The Brigham Young University Synthetic Aperture Radar (YSAR) is a compact, inexpensive SAR system which can be flown on a small aircraft. The system has exhibited a resolution of approximately 0.8 m by 0.8 m in test flights in calm conditions. YSAR has been used to collect data over archeological sites in Israel. Using a relatively low frequency (2.1 GHz), we hope to be able to identify walls or other archeological features to assist in excavation. A large data set of radar and photographic data has been collected over sites at Tel Safi, Qumran, Tel Micnah, and the Zippori National Forest in Israel. We show sample images from the archeological data. We are currently working on improved autofocus algorithms for this data and are developing a small, low-cost interferometric SAR system (YINSAR) for operation from a small aircraft.

  12. Characterization of cervigram image sharpness using multiple self-referenced measurements and random forest classifiers

    NASA Astrophysics Data System (ADS)

    Jaiswal, Mayoore; Horning, Matt; Hu, Liming; Ben-Or, Yau; Champlin, Cary; Wilson, Benjamin; Levitz, David

    2018-02-01

    Cervical cancer is the fourth most common cancer among women worldwide and is especially prevalent in low-resource settings due to lack of screening and treatment options. Visual inspection with acetic acid (VIA) is a widespread and cost-effective screening method for cervical pre-cancer lesions, but accuracy depends on the experience level of the health worker. Digital cervicography, capturing images of the cervix, enables review by an off-site expert or potentially a machine learning algorithm. These reviews require images of sufficient quality. However, image quality varies greatly across users. A novel algorithm was developed to evaluate the sharpness of images captured with MobileODT's digital cervicography device (EVA System), in order to eventually provide feedback to the health worker. The key challenges are that the algorithm evaluates only a single image of each cervix; it must be robust to the variability in cervix images and fast enough to run in real time on a mobile device; and the machine learning model must be small enough to fit in a mobile device's memory, train on a small imbalanced dataset, and run in real time. In this paper, the focus scores of a preprocessed image and a Gaussian-blurred version of the image are calculated using established methods and used as features. A feature selection metric is proposed to select the top features, which are then used in a random forest classifier to produce the final focus score. The resulting model, based on nine calculated focus scores, achieved significantly better accuracy than any single focus measure when tested on a holdout set of images. The area under the receiver operating characteristic curve was 0.9459.
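
    The self-referenced sharpness idea can be sketched as follows: compute focus measures on the image and on a Gaussian-blurred copy of it, and feed the resulting features to a random forest. The two focus measures shown (variance of the Laplacian and gradient energy) are placeholders for the nine measures and the feature-selection step used in the paper.

    # Sketch: self-referenced focus features -> random forest sharpness classifier.
    import cv2
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def focus_features(gray):
        """gray: 2-D grayscale image array."""
        blurred = cv2.GaussianBlur(gray, (9, 9), 0)
        feats = []
        for img in (gray, blurred):
            lap_var = cv2.Laplacian(img, cv2.CV_64F).var()    # variance of Laplacian
            gy, gx = np.gradient(img.astype(np.float64))
            tenengrad = np.mean(gx ** 2 + gy ** 2)            # gradient energy
            feats += [lap_var, tenengrad]
        return feats

    def train_sharpness_classifier(images, labels):
        X = [focus_features(img) for img in images]
        return RandomForestClassifier(n_estimators=100, class_weight="balanced").fit(X, labels)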

  13. Improved multi-stage neonatal seizure detection using a heuristic classifier and a data-driven post-processor.

    PubMed

    Ansari, A H; Cherian, P J; Dereymaeker, A; Matic, V; Jansen, K; De Wispelaere, L; Dielman, C; Vervisch, J; Swarte, R M; Govaert, P; Naulaers, G; De Vos, M; Van Huffel, S

    2016-09-01

    After identifying the most seizure-relevant characteristics by a previously developed heuristic classifier, a data-driven post-processor using a novel set of features is applied to improve the performance. The main characteristics of the outputs of the heuristic algorithm are extracted by five sets of features including synchronization, evolution, retention, segment, and signal features. Then, a support vector machine and a decision making layer remove the falsely detected segments. Four datasets including 71 neonates (1023h, 3493 seizures) recorded in two different university hospitals, are used to train and test the algorithm without removing the dubious seizures. The heuristic method resulted in a false alarm rate of 3.81 per hour and good detection rate of 88% on the entire test databases. The post-processor, effectively reduces the false alarm rate by 34% while the good detection rate decreases by 2%. This post-processing technique improves the performance of the heuristic algorithm. The structure of this post-processor is generic, improves our understanding of the core visually determined EEG features of neonatal seizures and is applicable for other neonatal seizure detectors. The post-processor significantly decreases the false alarm rate at the expense of a small reduction of the good detection rate. Copyright © 2016 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

  14. Predicting tumor hypoxia in non-small cell lung cancer by combining CT, FDG PET and dynamic contrast-enhanced CT.

    PubMed

    Even, Aniek J G; Reymen, Bart; La Fontaine, Matthew D; Das, Marco; Jochems, Arthur; Mottaghy, Felix M; Belderbos, José S A; De Ruysscher, Dirk; Lambin, Philippe; van Elmpt, Wouter

    2017-11-01

    Most solid tumors contain inadequately oxygenated (i.e., hypoxic) regions, which tend to be more aggressive and treatment resistant. Hypoxia PET allows visualization of hypoxia and may enable treatment adaptation. However, hypoxia PET imaging is expensive, time-consuming and not widely available. We aimed to predict hypoxia levels in non-small cell lung cancer (NSCLC) using more easily available imaging modalities: FDG-PET/CT and dynamic contrast-enhanced CT (DCE-CT). For 34 NSCLC patients, included in two clinical trials, hypoxia HX4-PET/CT, planning FDG-PET/CT and DCE-CT scans were acquired before radiotherapy. Scans were non-rigidly registered to the planning CT. Tumor blood flow (BF) and blood volume (BV) were calculated by kinetic analysis of DCE-CT images. Within the gross tumor volume, independent clusters, i.e., supervoxels, were created based on FDG-PET/CT. For each supervoxel, tumor-to-background ratios (TBR) were calculated (median SUV/aorta SUVmean) for HX4-PET/CT and supervoxel features (median, SD, entropy) for the other modalities. Two random forest models (cross-validated: 10 folds, five repeats) were trained to predict the hypoxia TBR; one based on CT, FDG, BF and BV, and one with only CT and FDG features. Patients were split into a training set (trial NCT01024829) and an independent test set (trial NCT01210378). For each patient, predicted and observed hypoxic volumes (HV) (TBR > 1.2) were compared. Fifteen patients (3291 supervoxels) were used for training and 19 patients (1502 supervoxels) for testing. The model with all features (RMSE training: 0.19 ± 0.01, test: 0.27) outperformed the model with only CT and FDG-PET features (RMSE training: 0.20 ± 0.01, test: 0.29). All tumors of the test set were correctly classified as normoxic or hypoxic (HV > 1 cm3) by the best performing model. We created a data-driven methodology to predict hypoxia levels and hypoxia spatial patterns using CT, FDG-PET and DCE-CT features in NSCLC. The model correctly classifies all tumors and could, therefore, aid tumor hypoxia classification and patient stratification.

  15. Robust Learning of High-dimensional Biological Networks with Bayesian Networks

    NASA Astrophysics Data System (ADS)

    Nägele, Andreas; Dejori, Mathäus; Stetter, Martin

    Structure learning of Bayesian networks applied to gene expression data has become a potentially useful method to estimate interactions between genes. However, the NP-hardness of Bayesian network structure learning renders the reconstruction of the full genetic network with thousands of genes infeasible. Consequently, the maximal network size is usually restricted dramatically to a small set of genes (corresponding to variables in the Bayesian network). Although this feature reduction step makes structure learning computationally tractable, on the downside, the learned structure might be adversely affected by the introduction of missing genes. Additionally, gene expression data are usually very sparse with respect to the number of samples, i.e., the number of genes is much greater than the number of different observations. Given these problems, learning robust network features from microarray data is a challenging task. This chapter presents several approaches tackling the robustness issue in order to obtain a more reliable estimation of learned network features.

  16. Fractal based modelling and analysis of electromyography (EMG) to identify subtle actions.

    PubMed

    Arjunan, Sridhar P; Kumar, Dinesh K

    2007-01-01

    The paper reports the use of fractal theory and fractal dimension to study the non-linear properties of the surface electromyogram (sEMG) and to use these properties to classify subtle hand actions. The paper reports the identification of a new feature of the fractal analysis, the bias, which has been found to be useful in modelling muscle activity from sEMG. Experimental results demonstrate that the feature set consisting of bias values and the fractal dimension of the recordings is suitable for classification of sEMG against the different hand gestures. The scatter plots demonstrate the presence of simple relationships between these features and the four hand gestures. The results indicate that there is small inter-experimental variation but large inter-subject variation. This may be due to differences in the size and shape of muscles for different subjects. The possible applications of this research include use in developing prosthetic hands, controlling machines and computers.
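
    As an illustration of a fractal-dimension feature for an sEMG segment, the sketch below implements a common Higuchi-style estimate; whether this matches the exact fractal estimator (or the bias feature) used in the paper is an assumption, and the kmax value is illustrative.

    # Sketch of a Higuchi fractal dimension estimate for a 1-D signal segment.
    import numpy as np

    def higuchi_fd(x, kmax=8):
        x = np.asarray(x, dtype=float)
        n = len(x)
        lk = []
        for k in range(1, kmax + 1):
            lengths = []
            for m in range(k):
                idx = np.arange(m, n, k)
                if len(idx) < 2:
                    continue
                # normalized curve length for this offset m and scale k
                length = np.abs(np.diff(x[idx])).sum() * (n - 1) / ((len(idx) - 1) * k * k)
                lengths.append(length)
            lk.append(np.mean(lengths))
        # slope of log L(k) against log(1/k) estimates the fractal dimension
        coeffs = np.polyfit(np.log(1.0 / np.arange(1, kmax + 1)), np.log(lk), 1)
        return coeffs[0]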

  17. A time-series method for automated measurement of changes in mitotic and interphase duration from time-lapse movies.

    PubMed

    Sigoillot, Frederic D; Huckins, Jeremy F; Li, Fuhai; Zhou, Xiaobo; Wong, Stephen T C; King, Randall W

    2011-01-01

    Automated time-lapse microscopy can visualize proliferation of large numbers of individual cells, enabling accurate measurement of the frequency of cell division and the duration of interphase and mitosis. However, extraction of quantitative information by manual inspection of time-lapse movies is too time-consuming to be useful for analysis of large experiments. Here we present an automated time-series approach that can measure changes in the duration of mitosis and interphase in individual cells expressing fluorescent histone 2B. The approach requires analysis of only 2 features, nuclear area and average intensity. Compared to supervised learning approaches, this method reduces processing time and does not require generation of training data sets. We demonstrate that this method is as sensitive as manual analysis in identifying small changes in interphase or mitotic duration induced by drug or siRNA treatment. This approach should facilitate automated analysis of high-throughput time-lapse data sets to identify small molecules or gene products that influence timing of cell division.

  18. Multi-level basis selection of wavelet packet decomposition tree for heart sound classification.

    PubMed

    Safara, Fatemeh; Doraisamy, Shyamala; Azman, Azreen; Jantan, Azrul; Abdullah Ramaiah, Asri Ranga

    2013-10-01

    Wavelet packet transform decomposes a signal into a set of orthonormal bases (nodes) and provides opportunities to select an appropriate set of these bases for feature extraction. In this paper, multi-level basis selection (MLBS) is proposed to preserve the most informative bases of a wavelet packet decomposition tree by removing less informative bases through three exclusion criteria: frequency range, noise frequency, and energy threshold. MLBS achieved an accuracy of 97.56% for classifying normal heart sound, aortic stenosis, mitral regurgitation, and aortic regurgitation. MLBS is a promising basis selection approach for signals with a small range of frequencies. Copyright © 2013 The Authors. Published by Elsevier Ltd. All rights reserved.
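
    A sketch of wavelet-packet basis pruning in the spirit of MLBS is shown below, assuming the PyWavelets package: the signal is decomposed to a fixed level and leaf nodes are dropped when their approximate frequency band falls outside a band of interest or their energy falls below a threshold. The paper's three criteria and exact thresholds are not reproduced; the band and energy fraction here are illustrative.

    # Sketch: wavelet packet decomposition + basis pruning by band and energy.
    import numpy as np
    import pywt

    def select_bases(signal, fs, wavelet="db4", level=4,
                     band=(20.0, 600.0), energy_frac=0.01):
        wp = pywt.WaveletPacket(data=signal, wavelet=wavelet, maxlevel=level)
        nodes = wp.get_level(level, order="freq")     # leaves ordered by frequency
        energies = np.array([np.sum(np.asarray(n.data) ** 2) for n in nodes])
        total = energies.sum()
        width = (fs / 2.0) / len(nodes)               # approximate band per node
        kept = []
        for i, n in enumerate(nodes):
            lo, hi = i * width, (i + 1) * width
            if hi < band[0] or lo > band[1]:
                continue                              # outside the band of interest
            if energies[i] < energy_frac * total:
                continue                              # below the energy threshold
            kept.append(n.path)
        return kept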

  19. X-ray EM simulation tool for ptychography dataset construction

    NASA Astrophysics Data System (ADS)

    Stoevelaar, L. Pjotr; Gerini, Giampiero

    2018-03-01

    In this paper, we present an electromagnetic full-wave modeling framework as a supporting EM tool that provides data sets for X-ray ptychographic imaging. Modeling the entire scattering problem with Finite Element Method (FEM) tools is, in fact, a prohibitive task, because of the large area illuminated by the beam (due to the poor focusing power at these wavelengths) and the very small features to be imaged. To overcome this problem, the spectrum of the illumination beam is decomposed into a discrete set of plane waves. This allows the electromagnetic modeling volume to be reduced to the one enclosing the area to be imaged. The total scattered field is reconstructed by superimposing the solutions for each plane wave illumination.

  20. Demographic and clinical features of patients with fibromyalgia syndrome of different settings: a gender comparison.

    PubMed

    Häuser, Winfried; Kühn-Becker, Hedi; von Wilmoswky, Hubertus; Settan, Margit; Brähler, Elmar; Petzke, Frank

    2011-04-01

    It has been suggested that there are well-established gender differences in the clinical picture of fibromyalgia syndrome (FMS). However, studies on gender differences in demographic and clinical features of FMS have yielded contradictory results. Their significance is limited by the small number of patients included and by the selection bias of single settings. The purpose of this study was to compare demographic characteristics (age, family status) and clinical variables (duration of chronic pain and FMS diagnosis, tender point count, number of pain sites, and somatic and depressive symptoms) of male and female patients in different settings (general population, FMS self-help organization, and different clinical settings). FMS was diagnosed according to survey criteria in the general population and in the self-help organization setting and by 1990 criteria of the American College of Rheumatology in the clinical settings. Tender point examination was performed according to the manual tender point survey protocol in clinical settings. Somatic and depressive symptoms were assessed by validated questionnaires. A total of 1023 patients (885 female, 138 male) were included in the analysis. Compared with male participants, female participants reported a longer duration of chronic widespread pain (P = 0.009) and time since FMS diagnosis (P = 0.05), and they had a higher tender point count (P = 0.04). There were no gender differences in age, family status, number of pain sites, or somatic and depressive symptoms. We found no relevant gender differences in the clinical picture of FMS. The assumption of well-established gender differences in the clinical picture of FMS could not be supported. Copyright © 2011 Elsevier HS Journals, Inc. All rights reserved.

  1. Automatic Identification of Messages Related to Adverse Drug Reactions from Online User Reviews using Feature-based Classification.

    PubMed

    Liu, Jingfang; Zhang, Pengzhu; Lu, Yingjie

    2014-11-01

    User-generated medical messages on the Internet contain extensive information related to adverse drug reactions (ADRs) and are known to be valuable resources for post-marketing drug surveillance. The aim of this study was to find an effective method to identify messages related to ADRs automatically from online user reviews. We conducted experiments on online user reviews using different feature sets and different classification techniques. Firstly, messages from three communities (allergy, schizophrenia, and pain management) were collected, and 3000 messages were annotated. Secondly, an n-gram-based feature set and a medical domain-specific feature set were generated. Thirdly, three classification techniques, SVM, C4.5 and Naïve Bayes, were used to perform the classification tasks separately. Finally, we evaluated the performance of each combination of feature set and classification technique by comparing metrics including accuracy and F-measure. In terms of accuracy, the SVM classifier exceeded 0.8 while the C4.5 and Naïve Bayes classifiers remained below 0.8; meanwhile, the combined feature set (n-gram-based plus domain-specific features) consistently outperformed either single feature set. In terms of F-measure, the highest value, 0.895, was achieved by using the combined feature sets with an SVM classifier. Overall, the best classification performance was obtained with the combined feature sets and an SVM classifier, which provides an effective method to identify ADR-related messages automatically from online user reviews.
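
    A minimal sketch of the best-performing setup reported above (combined feature sets with an SVM) is shown below, using word n-grams plus a short, hypothetical list of domain-specific ADR terms as additional features; the study's actual domain-specific feature set and preprocessing are not reproduced.

    # Sketch: n-gram features + domain-term features -> linear SVM.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import FeatureUnion, Pipeline
    from sklearn.svm import LinearSVC

    ADR_TERMS = ["rash", "dizziness", "nausea", "insomnia"]   # illustrative only

    ngram_features = TfidfVectorizer(ngram_range=(1, 2))      # word 1- and 2-grams
    domain_features = TfidfVectorizer(vocabulary=ADR_TERMS)   # weights of ADR terms

    classifier = Pipeline([
        ("features", FeatureUnion([("ngrams", ngram_features),
                                   ("domain", domain_features)])),
        ("svm", LinearSVC()),
    ])
    # classifier.fit(train_messages, train_labels) would train the model,
    # where train_labels marks whether a message mentions an ADR.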

  2. Comprehensive Computational Pathological Image Analysis Predicts Lung Cancer Prognosis.

    PubMed

    Luo, Xin; Zang, Xiao; Yang, Lin; Huang, Junzhou; Liang, Faming; Rodriguez-Canales, Jaime; Wistuba, Ignacio I; Gazdar, Adi; Xie, Yang; Xiao, Guanghua

    2017-03-01

    Pathological examination of histopathological slides is a routine clinical procedure for lung cancer diagnosis and prognosis. Although the classification of lung cancer has been updated to become more specific, only a small subset of the total morphological features are taken into consideration. The vast majority of the detailed morphological features of tumor tissues, particularly tumor cells' surrounding microenvironment, are not fully analyzed. The heterogeneity of tumor cells and close interactions between tumor cells and their microenvironments are closely related to tumor development and progression. The goal of this study is to develop morphological feature-based prediction models for the prognosis of patients with lung cancer. We developed objective and quantitative computational approaches to analyze the morphological features of pathological images for patients with NSCLC. Tissue pathological images were analyzed for 523 patients with adenocarcinoma (ADC) and 511 patients with squamous cell carcinoma (SCC) from The Cancer Genome Atlas lung cancer cohorts. The features extracted from the pathological images were used to develop statistical models that predict patients' survival outcomes in ADC and SCC, respectively. We extracted 943 morphological features from pathological images of hematoxylin and eosin-stained tissue and identified morphological features that are significantly associated with prognosis in ADC and SCC, respectively. Statistical models based on these extracted features stratified NSCLC patients into high-risk and low-risk groups. The models were developed from training sets and validated in independent testing sets: a predicted high-risk group versus a predicted low-risk group (for patients with ADC: hazard ratio = 2.34, 95% confidence interval: 1.12-4.91, p = 0.024; for patients with SCC: hazard ratio = 2.22, 95% confidence interval: 1.15-4.27, p = 0.017) after adjustment for age, sex, smoking status, and pathologic tumor stage. The results suggest that the quantitative morphological features of tumor pathological images predict prognosis in patients with lung cancer. Copyright © 2016 International Association for the Study of Lung Cancer. Published by Elsevier Inc. All rights reserved.

  3. A Ground-Based 2-Micron DIAL System to Profile Tropospheric CO2 and Aerosol Distributions for Atmospheric Studies

    NASA Technical Reports Server (NTRS)

    Ismail, Syed; Koch, Grady; Abedin, Nurul; Refaat, Tamer; Rubio, Manuel; Davis, Kenneth; Miller, Charles; Singh, Upendra

    2006-01-01

    The system will operate at a temperature-insensitive CO2 line (2050.967 nm) with side-line tuning and offset locking. An order-of-magnitude improvement has been demonstrated in the laser line locking needed for high-precision measurements, in side-line operation, and in simultaneous double pulsing and line locking. Detector testing of a phototransistor has demonstrated sensitivity to aerosol features over long distances in the atmosphere and the ability to resolve features of approximately 100 m. Optical systems that collect light onto small-area detectors work well. Receiver optical designs are being optimized and data acquisition systems developed. CO2 line parameter characterization is in progress, and in situ sensor calibration is in progress for validation of the DIAL CO2 system.

  4. Land use classification using texture information in ERTS-A MSS imagery

    NASA Technical Reports Server (NTRS)

    Haralick, R. M. (Principal Investigator); Shanmugam, K. S.; Bosley, R.

    1973-01-01

    The author has identified the following significant results. Preliminary digital analysis of ERTS-1 MSS imagery reveals that the textural features of the imagery are very useful for land use classification. A procedure for extracting the textural features of ERTS-1 imagery is presented and the results of a land use classification scheme based on the textural features are also presented. The land use classification algorithm using textural features was tested on a 5100 square mile area covered by part of an ERTS-1 MSS band 5 image over the California coastline. The image covering this area was blocked into 648 subimages of size 8.9 square miles each. Based on a color composite of the image set, a total of 7 land use categories were identified. These land use categories are: coastal forest, woodlands, annual grasslands, urban areas, large irrigated fields, small irrigated fields, and water. The automatic classifier was trained to identify the land use categories using only the textural characteristics of the subimages; 75 percent of the subimages were assigned correct identifications. Since texture and spectral features provide completely different kinds of information, a significant increase in identification accuracy will take place when both features are used together.
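
    The sketch below shows one way to extract co-occurrence-based textural features from subimages and train a classifier on them, assuming scikit-image's GLCM utilities (graycomatrix/graycoprops in recent releases) and a nearest-neighbor classifier; the original study used Haralick's own textural feature definitions and its own classifier, so this is only illustrative.

    # Sketch: gray-level co-occurrence texture features per subimage -> classifier.
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops
    from sklearn.neighbors import KNeighborsClassifier

    def texture_features(block):
        """block: 2-D uint8 subimage."""
        glcm = graycomatrix(block, distances=[1], angles=[0, np.pi / 2],
                            levels=256, symmetric=True, normed=True)
        return [graycoprops(glcm, p).mean()
                for p in ("contrast", "correlation", "energy", "homogeneity")]

    def train_land_use(blocks, labels):
        X = [texture_features(b) for b in blocks]     # one feature vector per subimage
        return KNeighborsClassifier(n_neighbors=3).fit(X, labels)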

  5. AlphaSpace: Fragment-Centric Topographical Mapping To Target Protein–Protein Interaction Interfaces

    PubMed Central

    2016-01-01

    Inhibition of protein–protein interactions (PPIs) is emerging as a promising therapeutic strategy despite the difficulty in targeting such interfaces with drug-like small molecules. PPIs generally feature large and flat binding surfaces as compared to typical drug targets. These features pose a challenge for structural characterization of the surface using geometry-based pocket-detection methods. An attractive mapping strategy—that builds on the principles of fragment-based drug discovery (FBDD)—is to detect the fragment-centric modularity at the protein surface and then characterize the large PPI interface as a set of localized, fragment-targetable interaction regions. Here, we introduce AlphaSpace, a computational analysis tool designed for fragment-centric topographical mapping (FCTM) of PPI interfaces. Our approach uses the alpha sphere construct, a geometric feature of a protein’s Voronoi diagram, to map out concave interaction space at the protein surface. We introduce two new features—alpha-atom and alpha-space—and the concept of the alpha-atom/alpha-space pair to rank pockets for fragment-targetability and to facilitate the evaluation of pocket/fragment complementarity. The resulting high-resolution interfacial map of targetable pocket space can be used to guide the rational design and optimization of small molecule or biomimetic PPI inhibitors. PMID:26225450
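
    An alpha sphere can be viewed as the circumsphere of a Delaunay tetrahedron of atom centers (a sphere touching four atoms with none inside). The sketch below, which is not the AlphaSpace code, computes candidate alpha spheres with SciPy and filters them by a radius window; the window values are illustrative assumptions.

    ```python
    # Sketch (not AlphaSpace itself): candidate alpha spheres as circumspheres of
    # Delaunay tetrahedra over atom coordinates, filtered by a radius window.
    import numpy as np
    from scipy.spatial import Delaunay

    def alpha_spheres(coords, r_min=3.2, r_max=5.4):
        """coords: (N, 3) atom positions. The radius window is illustrative."""
        tri = Delaunay(coords)
        centers, radii = [], []
        for simplex in tri.simplices:          # each simplex: indices of 4 atoms
            p = coords[simplex]
            # Circumcenter c solves 2*(p_i - p_0) . c = |p_i|^2 - |p_0|^2, i = 1..3
            a = 2.0 * (p[1:] - p[0])
            b = np.sum(p[1:] ** 2, axis=1) - np.sum(p[0] ** 2)
            try:
                c = np.linalg.solve(a, b)
            except np.linalg.LinAlgError:      # degenerate (flat) tetrahedron
                continue
            r = np.linalg.norm(c - p[0])
            if r_min <= r <= r_max:
                centers.append(c)
                radii.append(r)
        return np.array(centers), np.array(radii)

    atoms = np.random.rand(50, 3) * 20.0       # stand-in for interface atom positions
    centers, radii = alpha_spheres(atoms)
    print(len(centers), "candidate alpha spheres")
    ```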

  6. An analysis of a digital variant of the Trail Making Test using machine learning techniques.

    PubMed

    Dahmen, Jessamyn; Cook, Diane; Fellows, Robert; Schmitter-Edgecombe, Maureen

    2017-01-01

    The goal of this work is to develop a digital version of a standard cognitive assessment, the Trail Making Test (TMT), and assess its utility. This paper introduces a novel digital version of the TMT and introduces a machine learning based approach to assess its capabilities. Using digital Trail Making Test (dTMT) data collected from (N = 54) older adult participants as feature sets, we use machine learning techniques to analyze the utility of the dTMT and evaluate the insights provided by the digital features. Predicted TMT scores correlate well with clinical digital test scores (r = 0.98) and paper time to completion scores (r = 0.65). Predicted TICS exhibited a small correlation with clinically derived TICS scores (r = 0.12 Part A, r = 0.10 Part B). Predicted FAB scores exhibited a small correlation with clinically derived FAB scores (r = 0.13 Part A, r = 0.29 for Part B). Digitally derived features were also used to predict diagnosis (AUC of 0.65). Our findings indicate that the dTMT is capable of measuring the same aspects of cognition as the paper-based TMT. Furthermore, the dTMT's additional data may be able to help monitor other cognitive processes not captured by the paper-based TMT alone.

  7. SING: Subgraph search In Non-homogeneous Graphs

    PubMed Central

    2010-01-01

    Background Finding the subgraphs of a graph database that are isomorphic to a given query graph has practical applications in several fields, from cheminformatics to image understanding. Since subgraph isomorphism is a computationally hard problem, indexing techniques have been intensively exploited to speed up the process. Such systems filter out those graphs which cannot contain the query, and apply a subgraph isomorphism algorithm to each residual candidate graph. The applicability of such systems is limited to databases of small graphs, because their filtering power degrades on large graphs. Results In this paper, SING (Subgraph search In Non-homogeneous Graphs), a novel indexing system able to cope with large graphs, is presented. The method uses the notion of feature, which can be a small subgraph, subtree or path. Each graph in the database is annotated with the set of all its features. The key point is to make use of feature locality information. This idea is used to both improve the filtering performance and speed up the subgraph isomorphism task. Conclusions Extensive tests on chemical compounds, biological networks and synthetic graphs show that the proposed system outperforms the most popular systems in query time over databases of medium and large graphs. Other specific tests show that the proposed system is effective for single large graphs. PMID:20170516

  8. The Nonlinear Magnetosphere: Expressions in MHD and in Kinetic Models

    NASA Technical Reports Server (NTRS)

    Hesse, Michael; Birn, Joachim

    2011-01-01

    Like most plasma systems, the magnetosphere of the Earth is governed by nonlinear dynamic evolution equations. The impact of nonlinearities ranges from large scales, where overall dynamics features are exhibiting nonlinear behavior, to small scale, kinetic, processes, where nonlinear behavior governs, among others, energy conversion and dissipation. In this talk we present a select set of examples of such behavior, with a specific emphasis on how nonlinear effects manifest themselves in MHD and in kinetic models of magnetospheric plasma dynamics.

  9. Simulation Studies as Designed Experiments: The Comparison of Penalized Regression Models in the “Large p, Small n” Setting

    PubMed Central

    Chaibub Neto, Elias; Bare, J. Christopher; Margolin, Adam A.

    2014-01-01

    New algorithms are continuously proposed in computational biology. Performance evaluation of novel methods is important in practice. Nonetheless, the field experiences a lack of rigorous methodology aimed to systematically and objectively evaluate competing approaches. Simulation studies are frequently used to show that a particular method outperforms another. Often times, however, simulation studies are not well designed, and it is hard to characterize the particular conditions under which different methods perform better. In this paper we propose the adoption of well established techniques in the design of computer and physical experiments for developing effective simulation studies. By following best practices in planning of experiments we are better able to understand the strengths and weaknesses of competing algorithms leading to more informed decisions about which method to use for a particular task. We illustrate the application of our proposed simulation framework with a detailed comparison of the ridge-regression, lasso and elastic-net algorithms in a large scale study investigating the effects on predictive performance of sample size, number of features, true model sparsity, signal-to-noise ratio, and feature correlation, in situations where the number of covariates is usually much larger than sample size. Analysis of data sets containing tens of thousands of features but only a few hundred samples is nowadays routine in computational biology, where “omics” features such as gene expression, copy number variation and sequence data are frequently used in the predictive modeling of complex phenotypes such as anticancer drug response. The penalized regression approaches investigated in this study are popular choices in this setting and our simulations corroborate well established results concerning the conditions under which each one of these methods is expected to perform best while providing several novel insights. PMID:25289666
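
    As a toy illustration of a "large p, small n" comparison of this kind (not the authors' designed simulation framework), the sketch below generates data from a sparse linear model with far more features than samples and compares the held-out error of ridge, lasso, and elastic-net fits with scikit-learn; all sizes and penalty strengths are arbitrary.

    ```python
    # Toy "large p, small n" comparison (not the paper's designed experiment):
    # sparse linear truth, n << p, compare penalized regressions on held-out data.
    import numpy as np
    from sklearn.linear_model import Ridge, Lasso, ElasticNet
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(1)
    n, p, k = 150, 5000, 20                      # samples, features, true nonzeros
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:k] = rng.standard_normal(k) * 2.0      # sparse true coefficients
    y = X @ beta + rng.standard_normal(n)        # noise level sets the SNR

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.33, random_state=0)

    models = {
        "ridge": Ridge(alpha=10.0),
        "lasso": Lasso(alpha=0.1, max_iter=10000),
        "elastic-net": ElasticNet(alpha=0.1, l1_ratio=0.5, max_iter=10000),
    }
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        print(name, mean_squared_error(y_te, model.predict(X_te)))
    ```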

  10. Unsupervised image matching based on manifold alignment.

    PubMed

    Pei, Yuru; Huang, Fengchun; Shi, Fuhao; Zha, Hongbin

    2012-08-01

    This paper challenges the issue of automatic matching between two image sets with similar intrinsic structures and different appearances, especially when there is no prior correspondence. An unsupervised manifold alignment framework is proposed to establish correspondence between data sets by a mapping function in the mutual embedding space. We introduce a local similarity metric based on parameterized distance curves to represent the connection of one point with the rest of the manifold. A small set of valid feature pairs can be found without manual interactions by matching the distance curve of one manifold with the curve cluster of the other manifold. To avoid potential confusions in image matching, we propose an extended affine transformation to solve the nonrigid alignment in the embedding space. The comparatively tight alignments and the structure preservation can be obtained simultaneously. The point pairs with the minimum distance after alignment are viewed as the matchings. We apply manifold alignment to image set matching problems. The correspondence between image sets of different poses, illuminations, and identities can be established effectively by our approach.

  11. Effective traffic features selection algorithm for cyber-attacks samples

    NASA Astrophysics Data System (ADS)

    Li, Yihong; Liu, Fangzheng; Du, Zhenyu

    2018-05-01

    By studying defense schemes against network attacks, this paper proposes an effective traffic feature selection algorithm based on k-means++ clustering to deal with the high dimensionality of the traffic features extracted from cyber-attack samples. First, the algorithm divides the original feature set into an attack traffic feature set and a background traffic feature set by clustering. Then, it calculates the variation in clustering performance after removing a given feature. Finally, it evaluates the degree of distinctiveness of each feature vector according to this result; a feature vector is considered effective when its degree of distinctiveness exceeds a set threshold. The purpose of this paper is to select the effective features from the extracted original feature set. In this way, the dimensionality of the features is reduced, and so is the space-time overhead of subsequent detection. The experimental results show that the proposed algorithm is feasible and has some advantages over other selection algorithms.
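
    A minimal sketch of the clustering-based selection idea described above: remove one feature at a time, re-cluster with k-means++, and keep the features whose removal changes the clustering quality most. The silhouette score, the threshold, and the toy data are assumptions, not necessarily the paper's exact criterion.

    ```python
    # Sketch of clustering-based feature selection: remove each feature in turn and
    # measure how much the k-means++ clustering quality changes (silhouette score
    # is used here as the quality measure; the paper's exact criterion may differ).
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score

    def effective_features(X, n_clusters=2, threshold=0.02, random_state=0):
        def quality(data):
            labels = KMeans(n_clusters=n_clusters, init="k-means++",
                            n_init=10, random_state=random_state).fit_predict(data)
            return silhouette_score(data, labels)

        baseline = quality(X)
        selected = []
        for j in range(X.shape[1]):
            reduced = np.delete(X, j, axis=1)
            # A large drop in quality means the feature helps separate attack
            # traffic from background traffic, so it is kept.
            if baseline - quality(reduced) > threshold:
                selected.append(j)
        return selected

    # Toy traffic-feature matrix: 200 samples, 10 features, 3 of them informative.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 10))
    X[:100, :3] += 3.0                      # "attack" samples shifted on 3 features
    print("effective feature indices:", effective_features(X))
    ```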

  12. Computer-Aided Breast Cancer Diagnosis with Optimal Feature Sets: Reduction Rules and Optimization Techniques.

    PubMed

    Mathieson, Luke; Mendes, Alexandre; Marsden, John; Pond, Jeffrey; Moscato, Pablo

    2017-01-01

    This chapter introduces a new method for knowledge extraction from databases for the purpose of finding a discriminative set of features that is also a robust set for within-class classification. Our method is generic and we introduce it here in the field of breast cancer diagnosis from digital mammography data. The mathematical formalism is based on a generalization of the k-Feature Set problem called (α, β)-k-Feature Set problem, introduced by Cotta and Moscato (J Comput Syst Sci 67(4):686-690, 2003). This method proceeds in two steps: first, an optimal (α, β)-k-feature set of minimum cardinality is identified and then, a set of classification rules using these features is obtained. We obtain the (α, β)-k-feature set in two phases; first a series of extremely powerful reduction techniques, which do not lose the optimal solution, are employed; and second, a metaheuristic search to identify the remaining features to be considered or disregarded. Two algorithms were tested with a public domain digital mammography dataset composed of 71 malignant and 75 benign cases. Based on the results provided by the algorithms, we obtain classification rules that employ only a subset of these features.

  13. 13 CFR 121.406 - How does a small business concern qualify to provide manufactured products or other supply items...

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ...-disabled veteran-owned small business set-aside, WOSB or EDWOSB set-aside, or 8(a) contract? 121.406... items under a small business set-aside, service-disabled veteran-owned small business set-aside, WOSB or... small business set-aside, service-disabled veteran-owned small business set-aside, WOSB or EDWOSB set...

  14. 13 CFR 121.406 - How does a small business concern qualify to provide manufactured products or other supply items...

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ...-disabled veteran-owned small business set-aside, WOSB or EDWOSB set-aside, or 8(a) contract? 121.406... items under a small business set-aside, service-disabled veteran-owned small business set-aside, WOSB or... small business set-aside, service-disabled veteran-owned small business set-aside, WOSB or EDWOSB set...

  15. 13 CFR 121.406 - How does a small business concern qualify to provide manufactured products or other supply items...

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ...-disabled veteran-owned small business set-aside, WOSB or EDWOSB set-aside, or 8(a) contract? 121.406... items under a small business set-aside, service-disabled veteran-owned small business set-aside, WOSB or... small business set-aside, service-disabled veteran-owned small business set-aside, WOSB or EDWOSB set...

  16. Classification of small lesions on dynamic breast MRI: Integrating dimension reduction and out-of-sample extension into CADx methodology

    PubMed Central

    Nagarajan, Mahesh B.; Huber, Markus B.; Schlossbauer, Thomas; Leinsinger, Gerda; Krol, Andrzej; Wismüller, Axel

    2014-01-01

    Objective While dimension reduction has been previously explored in computer aided diagnosis (CADx) as an alternative to feature selection, previous implementations of its integration into CADx do not ensure strict separation between training and test data required for the machine learning task. This compromises the integrity of the independent test set, which serves as the basis for evaluating classifier performance. Methods and Materials We propose, implement and evaluate an improved CADx methodology where strict separation is maintained. This is achieved by subjecting the training data alone to dimension reduction; the test data is subsequently processed with out-of-sample extension methods. Our approach is demonstrated in the research context of classifying small diagnostically challenging lesions annotated on dynamic breast magnetic resonance imaging (MRI) studies. The lesions were dynamically characterized through topological feature vectors derived from Minkowski functionals. These feature vectors were then subject to dimension reduction with different linear and non-linear algorithms applied in conjunction with out-of-sample extension techniques. This was followed by classification through supervised learning with support vector regression. Area under the receiver-operating characteristic curve (AUC) was evaluated as the metric of classifier performance. Results Of the feature vectors investigated, the best performance was observed with Minkowski functional ’perimeter’ while comparable performance was observed with ’area’. Of the dimension reduction algorithms tested with ’perimeter’, the best performance was observed with Sammon’s mapping (0.84 ± 0.10) while comparable performance was achieved with exploratory observation machine (0.82 ± 0.09) and principal component analysis (0.80 ± 0.10). Conclusions The results reported in this study with the proposed CADx methodology present a significant improvement over previous results reported with such small lesions on dynamic breast MRI. In particular, non-linear algorithms for dimension reduction exhibited better classification performance than linear approaches, when integrated into our CADx methodology. We also note that while dimension reduction techniques may not necessarily provide an improvement in classification performance over feature selection, they do allow for a higher degree of feature compaction. PMID:24355697
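
    The central point, fitting the dimension-reduction mapping on the training split only and extending it to the test split, can be illustrated with PCA, whose learned projection provides a natural out-of-sample extension. The sketch below is not the authors' pipeline; the lesion feature matrix is synthetic and SVR stands in for the support vector regression step.

    ```python
    # Sketch of the strict train/test separation described above: the dimension
    # reduction is fitted on training data only; test data are projected with the
    # learned mapping (PCA's built-in out-of-sample extension). Data are synthetic.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.svm import SVR
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.standard_normal((120, 200))          # e.g., topological feature vectors
    y = rng.integers(0, 2, 120)                  # benign (0) vs malignant (1) labels
    X[y == 1, :10] += 0.8                        # inject a weak class signal

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    pca = PCA(n_components=10).fit(X_tr)         # fitted on the training split only
    Z_tr, Z_te = pca.transform(X_tr), pca.transform(X_te)

    svr = SVR(kernel="rbf").fit(Z_tr, y_tr)      # regression output used as a score
    print("test AUC:", roc_auc_score(y_te, svr.predict(Z_te)))
    ```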

  17. Driver face recognition as a security and safety feature

    NASA Astrophysics Data System (ADS)

    Vetter, Volker; Giefing, Gerd-Juergen; Mai, Rudolf; Weisser, Hubert

    1995-09-01

    We present a driver face recognition system for comfortable access control and individual settings of automobiles. The primary goals are the prevention of car thefts and heavy accidents caused by unauthorized use (joy-riders), as well as the increase of safety through optimal settings, e.g. of the mirrors and the seat position. The person sitting on the driver's seat is observed automatically by a small video camera in the dashboard. All he has to do is to behave cooperatively, i.e. to look into the camera. A classification system validates his access. Only after a positive identification, the car can be used and the driver-specific environment (e.g. seat position, mirrors, etc.) may be set up to ensure the driver's comfort and safety. The driver identification system has been integrated in a Volkswagen research car. Recognition results are presented.

  18. Performance comparison of phenomenology-based features to generic features for false alarm reduction in UWB SAR imagery

    NASA Astrophysics Data System (ADS)

    Marble, Jay A.; Gorman, John D.

    1999-08-01

    A feature based approach is taken to reduce the occurrence of false alarms in foliage penetrating, ultra-wideband, synthetic aperture radar data. A set of 'generic' features is defined based on target size, shape, and pixel intensity. A second set of features is defined that contains generic features combined with features based on scattering phenomenology. Each set is combined using a quadratic polynomial discriminant (QPD), and performance is characterized by generating a receiver operating characteristic (ROC) curve. Results show that the feature set containing phenomenological features improves performance against both broadside and end-on targets. Performance against end-on targets, however, is especially pronounced.

  19. The feature-weighted receptive field: an interpretable encoding model for complex feature spaces.

    PubMed

    St-Yves, Ghislain; Naselaris, Thomas

    2017-06-20

    We introduce the feature-weighted receptive field (fwRF), an encoding model designed to balance expressiveness, interpretability and scalability. The fwRF is organized around the notion of a feature map-a transformation of visual stimuli into visual features that preserves the topology of visual space (but not necessarily the native resolution of the stimulus). The key assumption of the fwRF model is that activity in each voxel encodes variation in a spatially localized region across multiple feature maps. This region is fixed for all feature maps; however, the contribution of each feature map to voxel activity is weighted. Thus, the model has two separable sets of parameters: "where" parameters that characterize the location and extent of pooling over visual features, and "what" parameters that characterize tuning to visual features. The "where" parameters are analogous to classical receptive fields, while "what" parameters are analogous to classical tuning functions. By treating these as separable parameters, the fwRF model complexity is independent of the resolution of the underlying feature maps. This makes it possible to estimate models with thousands of high-resolution feature maps from relatively small amounts of data. Once a fwRF model has been estimated from data, spatial pooling and feature tuning can be read-off directly with no (or very little) additional post-processing or in-silico experimentation. We describe an optimization algorithm for estimating fwRF models from data acquired during standard visual neuroimaging experiments. We then demonstrate the model's application to two distinct sets of features: Gabor wavelets and features supplied by a deep convolutional neural network. We show that when Gabor feature maps are used, the fwRF model recovers receptive fields and spatial frequency tuning functions consistent with known organizational principles of the visual cortex. We also show that a fwRF model can be used to regress entire deep convolutional networks against brain activity. The ability to use whole networks in a single encoding model yields state-of-the-art prediction accuracy. Our results suggest a wide variety of uses for the feature-weighted receptive field model, from retinotopic mapping with natural scenes, to regressing the activities of whole deep neural networks onto measured brain activity. Copyright © 2017. Published by Elsevier Inc.
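
    In rough terms, the fwRF predicts a voxel's response as a weighted sum, over feature maps, of the activity pooled under a shared 2-D Gaussian "where" field. The toy sketch below implements only that forward prediction; the estimation algorithm and the actual Gabor or CNN feature maps are not reproduced, and all values are synthetic.

    ```python
    # Toy forward pass of a feature-weighted receptive field: one shared Gaussian
    # pooling field ("where") and one weight per feature map ("what"). This is an
    # illustration of the model form only, not the authors' estimation code.
    import numpy as np

    def gaussian_field(size, cx, cy, sigma):
        y, x = np.mgrid[0:size, 0:size]
        g = np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
        return g / g.sum()

    def fwrf_predict(feature_maps, weights, cx, cy, sigma):
        """feature_maps: (K, S, S) stack; weights: (K,). Returns predicted response."""
        g = gaussian_field(feature_maps.shape[-1], cx, cy, sigma)
        pooled = (feature_maps * g).sum(axis=(1, 2))   # pool each map under the field
        return float(weights @ pooled)                 # weighted sum across maps

    rng = np.random.default_rng(0)
    K, S = 8, 32                                       # number of maps, map resolution
    maps = rng.standard_normal((K, S, S))              # stand-in Gabor/CNN feature maps
    w = rng.standard_normal(K)
    print(fwrf_predict(maps, w, cx=16, cy=16, sigma=4.0))
    ```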

  20. KFC2: a knowledge-based hot spot prediction method based on interface solvation, atomic density, and plasticity features.

    PubMed

    Zhu, Xiaolei; Mitchell, Julie C

    2011-09-01

    Hot spots constitute a small fraction of protein-protein interface residues, yet they account for a large fraction of the binding affinity. Based on our previous method (KFC), we present two new methods (KFC2a and KFC2b) that outperform other methods at hot spot prediction. A number of improvements were made in developing these new methods. First, we created a training data set that contained a similar number of hot spot and non-hot spot residues. In addition, we generated 47 different features, and different numbers of features were used to train the models to avoid over-fitting. Finally, two feature combinations were selected: One (used in KFC2a) is composed of eight features that are mainly related to solvent accessible surface area and local plasticity; the other (KFC2b) is composed of seven features, only two of which are identical to those used in KFC2a. The two models were built using support vector machines (SVM). The two KFC2 models were then tested on a mixed independent test set, and compared with other methods such as Robetta, FOLDEF, HotPoint, MINERVA, and KFC. KFC2a showed the highest predictive accuracy for hot spot residues (True Positive Rate: TPR = 0.85); however, the false positive rate was somewhat higher than for other models. KFC2b showed the best predictive accuracy for hot spot residues (True Positive Rate: TPR = 0.62) among all methods other than KFC2a, and the False Positive Rate (FPR = 0.15) was comparable with other highly predictive methods. Copyright © 2011 Wiley-Liss, Inc.

  1. The LAILAPS search engine: a feature model for relevance ranking in life science databases.

    PubMed

    Lange, Matthias; Spies, Karl; Colmsee, Christian; Flemming, Steffen; Klapperstück, Matthias; Scholz, Uwe

    2010-03-25

    Efficient and effective information retrieval in the life sciences is one of the most pressing challenges in bioinformatics. The enormous growth of life science databases into a vast network of interconnected information systems is both a major challenge and a great opportunity for life science research. The knowledge found on the Web, in particular in life-science databases, is a valuable resource. In order to bring it to the scientist's desktop, well-performing search engines are essential. Here, neither the response time nor the number of results is the decisive factor; with millions of query results, the most crucial factor is the relevance ranking. In this paper, we present a feature model for relevance ranking in life science databases and its implementation in the LAILAPS search engine. Motivated by observations of user behavior during the inspection of search engine results, we condensed a set of 9 relevance-discriminating features. These features are intuitively used by scientists, who briefly screen database entries for potential relevance. The features are both sufficient to estimate the potential relevance and efficiently quantifiable. The derivation of a relevance prediction function that computes the relevance from these features constitutes a regression problem. To solve this problem, we used artificial neural networks that were trained with a reference set of relevant database entries for 19 protein queries. Supporting a flexible text index and a simple data import format, these concepts are implemented in the LAILAPS search engine. It can easily be used both as a search engine for comprehensive integrated life science databases and for small in-house project databases. LAILAPS is publicly available for SWISSPROT data at http://lailaps.ipk-gatersleben.de.
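
    The relevance-prediction step is a small regression problem. As a stand-in for the LAILAPS implementation (not its actual code), the sketch below trains a small scikit-learn neural network to map nine per-entry feature scores to a relevance value and then ranks candidate entries; the feature values and reference relevances are invented for illustration.

    ```python
    # Stand-in for the relevance-prediction regression (not LAILAPS itself): a small
    # neural network maps 9 per-entry feature scores to a relevance value learned
    # from a reference set of rated database entries. All data here are synthetic.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    n_entries = 500
    X = rng.random((n_entries, 9))                 # 9 relevance-discriminating features
    true_w = np.array([0.3, 0.2, 0.15, 0.1, 0.1, 0.05, 0.05, 0.03, 0.02])
    y = X @ true_w + 0.02 * rng.standard_normal(n_entries)   # reference relevance

    net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
    net.fit(X, y)

    # Rank new entries by predicted relevance (highest first).
    candidates = rng.random((5, 9))
    order = np.argsort(-net.predict(candidates))
    print("ranked entry indices:", order)
    ```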

  2. Gravity waves and the LHC: towards high-scale inflation with low-energy SUSY

    NASA Astrophysics Data System (ADS)

    He, Temple; Kachru, Shamit; Westphal, Alexander

    2010-06-01

    It has been argued that rather generic features of string-inspired inflationary theories with low-energy supersymmetry (SUSY) make it difficult to achieve inflation with a Hubble scale H > m_3/2, where m_3/2 is the gravitino mass in the SUSY-breaking vacuum state. We present a class of string-inspired supergravity realizations of chaotic inflation where a simple, dynamical mechanism yields hierarchically small scales of post-inflationary supersymmetry breaking. Within these toy models we can easily achieve small ratios between m_3/2 and the Hubble scale of inflation. This is possible because the expectation value of the superpotential ⟨W⟩ relaxes from large to small values during the course of inflation. However, our toy models do not provide a reasonable fit to cosmological data if one sets the SUSY-breaking scale to m_3/2 ≤ TeV. Our work is a small step towards relieving the apparent tension between high-scale inflation and low-scale supersymmetry breaking in string compactifications.

  3. 48 CFR 19.502-2 - Total small business set-asides.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 48 Federal Acquisition Regulations System 1 2014-10-01 2014-10-01 false Total small business set... SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 19.502-2 Total small business set... exclusively for small business concerns and shall be set aside for small business unless the contracting...

  4. Computer Design Technology of the Small Thrust Rocket Engines Using CAE / CAD Systems

    NASA Astrophysics Data System (ADS)

    Ryzhkov, V.; Lapshin, E.

    2018-01-01

    The paper presents an algorithm for designing liquid-propellant small-thrust rocket engines; the process consists of five aggregated stages with feedback. Three stages of the algorithm provide engineering support for the design, and two stages cover the actual engine design. A distinctive feature of the proposed approach is a deep study of the main technical solutions at the engineering-analysis stage, together with interaction with the created knowledge (data) base, which accelerates the process and improves design quality. Using the multifunctional graphics package Siemens NX allows the final product, the rocket engine, and a set of design documentation to be obtained in a fairly short time; the engine design does not require lengthy experimental development.

  5. Microfluidic interconnects

    DOEpatents

    Benett, William J.; Krulevitch, Peter A.

    2001-01-01

    A miniature connector for introducing microliter quantities of solutions into microfabricated fluidic devices, and which incorporates a molded ring or seal set into a ferrule cartridge, with or without a compression screw. The fluidic connector, for example, joins standard high pressure liquid chromatography (HPLC) tubing to 1 mm diameter holes in silicon or glass, enabling ml-sized volumes of sample solutions to be merged with µl-sized devices. The connector has many features, including ease of connect and disconnect; a small footprint which enables numerous connectors to be located in a small area; low dead volume; helium leak-tight; and tubing does not twist during connection. Thus the connector enables easy and effective change of microfluidic devices and introduction of different solutions in the devices.

  6. Optimizing data collection for public health decisions: a data mining approach

    PubMed Central

    2014-01-01

    Background Collecting data can be cumbersome and expensive. Lack of relevant, accurate and timely data for research to inform policy may negatively impact public health. The aim of this study was to test if the careful removal of items from two community nutrition surveys guided by a data mining technique called feature selection, can (a) identify a reduced dataset, while (b) not damaging the signal inside that data. Methods The Nutrition Environment Measures Surveys for stores (NEMS-S) and restaurants (NEMS-R) were completed on 885 retail food outlets in two counties in West Virginia between May and November of 2011. A reduced dataset was identified for each outlet type using feature selection. Coefficients from linear regression modeling were used to weight items in the reduced datasets. Weighted item values were summed with the error term to compute reduced item survey scores. Scores produced by the full survey were compared to the reduced item scores using a Wilcoxon rank-sum test. Results Feature selection identified 9 store and 16 restaurant survey items as significant predictors of the score produced from the full survey. The linear regression models built from the reduced feature sets had R2 values of 92% and 94% for restaurant and grocery store data, respectively. Conclusions While there are many potentially important variables in any domain, the most useful set may only be a small subset. The use of feature selection in the initial phase of data collection to identify the most influential variables may be a useful tool to greatly reduce the amount of data needed thereby reducing cost. PMID:24919484
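
    The reduced-score construction can be sketched as follows: keep a small subset of survey items, regress the full-survey score on them, and use the fitted model as the reduced-item score. The selection step below (univariate SelectKBest) and the synthetic item data are placeholders for the paper's feature-selection procedure, not a reproduction of it.

    ```python
    # Sketch of the reduced-survey scoring idea: pick a small subset of items,
    # regress the full-survey score on them, and weight the retained items with the
    # fitted coefficients. The selection step and data below are placeholders.
    import numpy as np
    from sklearn.feature_selection import SelectKBest, f_regression
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    n_outlets, n_items = 300, 40
    items = rng.integers(0, 4, (n_outlets, n_items)).astype(float)  # survey item values
    full_score = items.sum(axis=1)                                   # full NEMS-style score

    # Keep the k items most associated with the full score (stand-in for feature selection).
    selector = SelectKBest(f_regression, k=9).fit(items, full_score)
    reduced_items = selector.transform(items)

    model = LinearRegression().fit(reduced_items, full_score)
    reduced_score = model.predict(reduced_items)      # weighted sum of retained items
    print("R^2 of reduced score vs full score:", model.score(reduced_items, full_score))
    ```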

  7. Optimizing data collection for public health decisions: a data mining approach.

    PubMed

    Partington, Susan N; Papakroni, Vasil; Menzies, Tim

    2014-06-12

    Collecting data can be cumbersome and expensive. Lack of relevant, accurate and timely data for research to inform policy may negatively impact public health. The aim of this study was to test if the careful removal of items from two community nutrition surveys guided by a data mining technique called feature selection, can (a) identify a reduced dataset, while (b) not damaging the signal inside that data. The Nutrition Environment Measures Surveys for stores (NEMS-S) and restaurants (NEMS-R) were completed on 885 retail food outlets in two counties in West Virginia between May and November of 2011. A reduced dataset was identified for each outlet type using feature selection. Coefficients from linear regression modeling were used to weight items in the reduced datasets. Weighted item values were summed with the error term to compute reduced item survey scores. Scores produced by the full survey were compared to the reduced item scores using a Wilcoxon rank-sum test. Feature selection identified 9 store and 16 restaurant survey items as significant predictors of the score produced from the full survey. The linear regression models built from the reduced feature sets had R2 values of 92% and 94% for restaurant and grocery store data, respectively. While there are many potentially important variables in any domain, the most useful set may only be a small subset. The use of feature selection in the initial phase of data collection to identify the most influential variables may be a useful tool to greatly reduce the amount of data needed thereby reducing cost.

  8. Reinforcement Learning with Autonomous Small Unmanned Aerial Vehicles in Cluttered Environments

    NASA Technical Reports Server (NTRS)

    Tran, Loc; Cross, Charles; Montague, Gilbert; Motter, Mark; Neilan, James; Qualls, Garry; Rothhaar, Paul; Trujillo, Anna; Allen, B. Danette

    2015-01-01

    We present ongoing work in the Autonomy Incubator at NASA Langley Research Center (LaRC) exploring the efficacy of a data set aggregation approach to reinforcement learning for small unmanned aerial vehicle (sUAV) flight in dense and cluttered environments with reactive obstacle avoidance. The goal is to learn an autonomous flight model using training experiences from a human piloting a sUAV around static obstacles. The training approach uses video data from a forward-facing camera that records the human pilot's flight. Various computer vision based features are extracted from the video relating to edge and gradient information. The recorded human-controlled inputs are used to train an autonomous control model that correlates the extracted feature vector to a yaw command. As part of the reinforcement learning approach, the autonomous control model is iteratively updated with feedback from a human agent who corrects undesired model output. This data driven approach to autonomous obstacle avoidance is explored for simulated forest environments furthering autonomous flight under the tree canopy research. This enables flight in previously inaccessible environments which are of interest to NASA researchers in Earth and Atmospheric sciences.

  9. Video enhancement workbench: an operational real-time video image processing system

    NASA Astrophysics Data System (ADS)

    Yool, Stephen R.; Van Vactor, David L.; Smedley, Kirk G.

    1993-01-01

    Video image sequences can be exploited in real-time, giving analysts rapid access to information for military or criminal investigations. Video-rate dynamic range adjustment subdues fluctuations in image intensity, thereby assisting discrimination of small or low- contrast objects. Contrast-regulated unsharp masking enhances differentially shadowed or otherwise low-contrast image regions. Real-time removal of localized hotspots, when combined with automatic histogram equalization, may enhance resolution of objects directly adjacent. In video imagery corrupted by zero-mean noise, real-time frame averaging can assist resolution and location of small or low-contrast objects. To maximize analyst efficiency, lengthy video sequences can be screened automatically for low-frequency, high-magnitude events. Combined zoom, roam, and automatic dynamic range adjustment permit rapid analysis of facial features captured by video cameras recording crimes in progress. When trying to resolve small objects in murky seawater, stereo video places the moving imagery in an optimal setting for human interpretation.
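
    Two of the operations mentioned, unsharp masking and frame averaging, are simple to illustrate. The sketch below uses NumPy and SciPy and is not the Video Enhancement Workbench code; the blur sigma, gain, and synthetic frames are arbitrary illustrative choices.

    ```python
    # Sketches of two of the enhancements mentioned above (not the workbench code):
    # unsharp masking to boost local contrast, and frame averaging to suppress
    # zero-mean noise. Gain and blur parameters are arbitrary illustrative values.
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def unsharp_mask(frame, sigma=3.0, gain=1.5):
        """Sharpen by adding back the difference between the frame and its blur."""
        blurred = gaussian_filter(frame.astype(float), sigma)
        return np.clip(frame + gain * (frame - blurred), 0, 255).astype(np.uint8)

    def average_frames(frames):
        """Average a short sequence of co-registered frames to reduce noise."""
        return np.mean(np.stack([f.astype(float) for f in frames]), axis=0).astype(np.uint8)

    rng = np.random.default_rng(0)
    clean = np.tile(np.linspace(0, 255, 256, dtype=np.uint8), (256, 1))
    noisy = [np.clip(clean + rng.normal(0, 20, clean.shape), 0, 255).astype(np.uint8)
             for _ in range(8)]
    print(unsharp_mask(noisy[0]).shape, average_frames(noisy).shape)
    ```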

  10. Hierarchical Kohonenen net for anomaly detection in network security.

    PubMed

    Sarasamma, Suseela T; Zhu, Qiuming A; Huff, Julie

    2005-04-01

    A novel multilevel hierarchical Kohonen Net (K-Map) for an intrusion detection system is presented. Each level of the hierarchical map is modeled as a simple winner-take-all K-Map. One significant advantage of this multilevel hierarchical K-Map is its computational efficiency. Unlike other statistical anomaly detection methods such as nearest neighbor approach, K-means clustering or probabilistic analysis that employ distance computation in the feature space to identify the outliers, our approach does not involve costly point-to-point computation in organizing the data into clusters. Another advantage is the reduced network size. We use the classification capability of the K-Map on selected dimensions of data set in detecting anomalies. Randomly selected subsets that contain both attacks and normal records from the KDD Cup 1999 benchmark data are used to train the hierarchical net. We use a confidence measure to label the clusters. Then we use the test set from the same KDD Cup 1999 benchmark to test the hierarchical net. We show that a hierarchical K-Map in which each layer operates on a small subset of the feature space is superior to a single-layer K-Map operating on the whole feature space in detecting a variety of attacks in terms of detection rate as well as false positive rate.

  11. When will low-contrast features be visible in a STEM X-ray spectrum image?

    DOE PAGES

    Parish, Chad M.

    2015-04-01

    When will a small or low-contrast feature, such as an embedded second-phase particle, be visible in a scanning transmission electron microscopy (STEM) X-ray map? This work illustrates a computationally inexpensive method to simulate X-ray maps and spectrum images (SIs), based upon the equations of X-ray generation and detection. To particularize the general procedure, an example of a nanostructured ferritic alloy (NFA) containing nm-sized Y2Ti2O7 precipitates embedded in a ferritic stainless steel matrix is chosen. The proposed model produces physically plausible simulated SI data sets, which can either be reduced to X-ray dot maps or analyzed via multivariate statistical analysis. Comparisons to NFA X-ray maps acquired using three different STEM instruments match the generated simulations quite well, despite the large number of simplifying assumptions used. A figure of merit, electron dose multiplied by X-ray collection solid angle, is proposed to compare feature detectability from one data set (simulated or experimental) to another. The proposed method can scope experiments that are feasible under specific analysis conditions on a given microscope. Future applications, such as spallation proton–neutron irradiations, core-shell nanoparticles, or dopants in polycrystalline photovoltaic solar cells, are proposed.

  12. Characterizing Smartphone Engagement for Schizophrenia: Results of a Naturalist Mobile Health Study.

    PubMed

    Torous, John; Staples, Patrick; Slaters, Linda; Adams, Jared; Sandoval, Luis; Onnela, J P; Keshavan, Matcheri

    2017-08-04

    Despite growing interest in smartphone apps for schizophrenia, little is known about how these apps are utilized in the real world. Understanding how app users engage with these tools outside the confines of traditional clinical studies offers important information on who is most likely to use apps and what type of data they are willing to share. The Schizophrenia and Related Disorders Alliance of America, in partnership with Self Care Catalyst, has created a smartphone app for schizophrenia that is free and publicly available on both the Apple iTunes and Google Android Play stores. We analyzed user engagement data from this app across its medication tracking, mood tracking, and symptom tracking features from August 16th, 2015, to January 1st, 2017, using the R programming language. We included all registered app users in our analysis with reported ages less than 100. We analyzed a total of 43,451 mood, medication and symptom entries from 622 registered users, and excluded a single patient with a reported age of 114. Seventy-one percent of the 622 users tried the mood-tracking feature at least once, 49% the symptom tracking feature, and 36% the medication-tracking feature. The mean number of uses of the mood feature was two, the symptom feature 10, and the medication feature 14. However, a small subset of users were very engaged with the app, and the top 10 users for each feature accounted for 35% or more of all entries for that feature. We find that user engagement follows a power law distribution for each feature, and this fit was largely invariant when stratifying by age or gender. Engagement with this app for schizophrenia was overall low, but similar to prior naturalistic studies of mental health app use in other diseases. The low rate of engagement in naturalistic settings, compared to higher rates of use in clinical studies, suggests the importance of clinical involvement as one factor in driving engagement with mental health apps. Power law relationships suggest strongly skewed user engagement, with a small subset of users accounting for the majority of substantial engagements. There is a need for further research on app engagement in schizophrenia.

  13. Application of statistical mining in healthcare data management for allergic diseases

    NASA Astrophysics Data System (ADS)

    Wawrzyniak, Zbigniew M.; Martínez Santolaya, Sara

    2014-11-01

    The paper discusses data mining techniques based on statistical tools for medical data management in the case of long-term diseases. Data collected from a population survey serve as the source for reasoning about and identifying the disease processes responsible for a patient's illness and its symptoms, and for prescribing knowledge and decisions about the course of action to correct the patient's condition. The case considered here, as a sample of this constructive approach to data management, is the dependence of chronic allergic diseases on selected symptoms and environmental conditions. The knowledge, summarized systematically as accumulated experience, constitutes a simplified experiential model of the diseases, with a feature space constructed from a small set of indicators. We present a disease-symptom-opinion model with knowledge discovery for data management in healthcare. Notably, the model is purely data-driven in evaluating knowledge of disease processes and the probabilistic dependence of future disease events on symptoms and other attributes. The example, drawn from the outcomes of a survey of long-term (chronic) disease, shows that a small set of core indicators, such as 4 or more symptoms and opinions, can be very helpful in reflecting health-status changes over disease causes. Furthermore, this data-driven understanding of disease mechanisms gives physicians a basis for treatment choices, which underlines the need for data governance in this research domain of knowledge discovered from surveys.

  14. Small numbers are sensed directly, high numbers constructed from size and density.

    PubMed

    Zimmermann, Eckart

    2018-04-01

    Two theories compete to explain how we estimate the numerosity of visual object sets. The first suggests that apparent numerosity is derived from an analysis of more low-level features like the size and density of the set. The second suggests that numbers are sensed directly. Consistent with the latter claim is the existence of neurons in parietal cortex that are specialized for processing the numerosity of elements in the visual scene. However, recent evidence suggests that only low numbers can be sensed directly, whereas the perception of high numbers is supported by the analysis of low-level features. Processing of low and high numbers, being located at different levels of the neural hierarchy, should involve different receptive field sizes. Here, I tested this idea with visual adaptation. I measured the spatial spread of number adaptation for low and high numerosities. A focused adaptation spread for high numerosities suggested the involvement of early neural levels, where receptive fields are comparably small, and the broad spread for low numerosities was consistent with processing by number neurons, which have larger receptive fields. These results provide evidence for the claim that different mechanisms exist for generating the perception of visual numerosity. Whereas low numbers are sensed directly as a primary visual attribute, the estimation of high numbers likely depends on the size of the area over which the objects are spread. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. 48 CFR 819.502-2 - Total small business set-asides.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 48 Federal Acquisition Regulations System 5 2010-10-01 2010-10-01 false Total small business set... SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 819.502-2 Total small business set-asides. (a) When a total small business set-aside is made, one of the following statements, as applicable...

  16. 48 CFR 819.502-2 - Total small business set-asides.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 48 Federal Acquisition Regulations System 5 2011-10-01 2011-10-01 false Total small business set... SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 819.502-2 Total small business set-asides. (a) When a total small business set-aside is made, one of the following statements, as applicable...

  17. 48 CFR 819.502-2 - Total small business set-asides.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 48 Federal Acquisition Regulations System 5 2013-10-01 2013-10-01 false Total small business set... SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 819.502-2 Total small business set-asides. (a) When a total small business set-aside is made, one of the following statements, as applicable...

  18. 48 CFR 19.502-2 - Total small business set-asides.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 48 Federal Acquisition Regulations System 1 2012-10-01 2012-10-01 false Total small business set... SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 19.502-2 Total small business set... contracting officer does not proceed with the small business set-aside and purchases on an unrestricted basis...

  19. 48 CFR 19.502-2 - Total small business set-asides.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 48 Federal Acquisition Regulations System 1 2011-10-01 2011-10-01 false Total small business set... SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 19.502-2 Total small business set... contracting officer does not proceed with the small business set-aside and purchases on an unrestricted basis...

  20. 48 CFR 19.502-2 - Total small business set-asides.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 48 Federal Acquisition Regulations System 1 2013-10-01 2013-10-01 false Total small business set... SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 19.502-2 Total small business set... contracting officer does not proceed with the small business set-aside and purchases on an unrestricted basis...

  1. 48 CFR 819.502-2 - Total small business set-asides.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 48 Federal Acquisition Regulations System 5 2014-10-01 2014-10-01 false Total small business set... SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 819.502-2 Total small business set-asides. (a) When a total small business set-aside is made, one of the following statements, as applicable...

  2. 48 CFR 819.502-2 - Total small business set-asides.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 48 Federal Acquisition Regulations System 5 2012-10-01 2012-10-01 false Total small business set... SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 819.502-2 Total small business set-asides. (a) When a total small business set-aside is made, one of the following statements, as applicable...

  3. Checklist/Guide to Selecting a Small Computer.

    ERIC Educational Resources Information Center

    Bennett, Wilma E.

    This 322-point checklist was designed to help executives make an intelligent choice when selecting a small computer for a business. For ease of use the questions have been divided into ten categories: Display Features, Keyboard Features, Printer Features, Controller Features, Software, Word Processing, Service, Training, Miscellaneous, and Costs.…

  4. The Small Mars System

    NASA Astrophysics Data System (ADS)

    Fantino, E.; Grassi, M.; Pasolini, P.; Causa, F.; Molfese, C.; Aurigemma, R.; Cimminiello, N.; de la Torre, D.; Dell'Aversana, P.; Esposito, F.; Gramiccia, L.; Paudice, F.; Punzo, F.; Roma, I.; Savino, R.; Zuppardi, G.

    2017-08-01

    The Small Mars System is a proposed mission to Mars. Funded by the European Space Agency, the project has successfully completed Phase 0. The contractor is ALI S.c.a.r.l., and the study team includes the University of Naples "Federico II", the Astronomical Observatory of Capodimonte and the Space Studies Institute of Catalonia. The objectives of the mission are both technological and scientific, and will be achieved by delivering a small Mars lander carrying a dust particle analyser and an aerial drone. The former shall perform in situ measurements of the size distribution and abundance of dust particles suspended in the Martian atmosphere, whereas the latter shall demonstrate low-altitude flight in the rarefied planetary environment. The mission-enabling technology is an innovative umbrella-like heat shield, known as IRENE, developed and patented by ALI. The mission is also a technological demonstration of the shield in the upper atmosphere of Mars. The core characteristics of SMS are the low cost (120 M€) and the small size (320 kg of wet mass at launch, 110 kg at landing), features which stand out with respect to previous Mars landers. To comply with them is extremely challenging at all levels, and sets strict requirements on the choice of the materials, the sizing of payloads and subsystems, their arrangement inside the spacecraft and the launcher's selection. In this contribution, the mission and system concept and design are illustrated and discussed. Special emphasis is given to the innovative features and to the challenges faced in the development of the work.

  5. 3D variational brain tumor segmentation on a clustered feature set

    NASA Astrophysics Data System (ADS)

    Popuri, Karteek; Cobzas, Dana; Jagersand, Martin; Shah, Sirish L.; Murtha, Albert

    2009-02-01

    Tumor segmentation from MRI data is a particularly challenging and time consuming task. Tumors have a large diversity in shape and appearance with intensities overlapping the normal brain tissues. In addition, an expanding tumor can also deflect and deform nearby tissue. Our work addresses these last two difficult problems. We use the available MRI modalities (T1, T1c, T2) and their texture characteristics to construct a multi-dimensional feature set. Further, we extract clusters which provide a compact representation of the essential information in these features. The main idea in this paper is to incorporate these clustered features into the 3D variational segmentation framework. In contrast to the previous variational approaches, we propose a segmentation method that evolves the contour in a supervised fashion. The segmentation boundary is driven by the learned inside and outside region voxel probabilities in the cluster space. We incorporate prior knowledge about the normal brain tissue appearance, during the estimation of these region statistics. In particular, we use a Dirichlet prior that discourages the clusters in the ventricles to be in the tumor and hence better disambiguate the tumor from brain tissue. We show the performance of our method on real MRI scans. The experimental dataset includes MRI scans, from patients with difficult instances, with tumors that are inhomogeneous in appearance, small in size and in proximity to the major structures in the brain. Our method shows good results on these test cases.

  6. Classification of mass and normal breast tissue: A convolution neural network classifier with spatial domain and texture images

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sahiner, B.; Chan, H.P.; Petrick, N.

    1996-10-01

    The authors investigated the classification of regions of interest (ROIs) on mammograms as either mass or normal tissue using a convolution neural network (CNN). A CNN is a back-propagation neural network with two-dimensional (2-D) weight kernels that operate on images. A generalized, fast and stable implementation of the CNN was developed. The input images to the CNN were obtained from the ROIs using two techniques. The first technique employed averaging and subsampling. The second technique employed texture feature extraction methods applied to small subregions inside the ROI. Features computed over different subregions were arranged as texture images, which were subsequently used as CNN inputs. The effects of CNN architecture and texture feature parameters on classification accuracy were studied. Receiver operating characteristic (ROC) methodology was used to evaluate the classification accuracy. A data set consisting of 168 ROIs containing biopsy-proven masses and 504 ROIs containing normal breast tissue was extracted from 168 mammograms by radiologists experienced in mammography. This data set was used for training and testing the CNN. With the best combination of CNN architecture and texture feature parameters, the area under the test ROC curve reached 0.87, which corresponded to a true-positive fraction of 90% at a false-positive fraction of 31%. The results demonstrate the feasibility of using a CNN for classification of masses and normal tissue on mammograms.
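
    As a generic illustration of a small CNN operating on ROI-derived patches (this is not the 1996 architecture, and the patch size is an assumption), the following PyTorch sketch defines a two-convolution-layer network for the two-class mass versus normal-tissue problem and runs one training step on synthetic data.

    ```python
    # Generic small CNN for two-class ROI patches (illustrative; not the 1996 model).
    import torch
    import torch.nn as nn

    class SmallCNN(nn.Module):
        def __init__(self, in_channels=1):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(in_channels, 8, kernel_size=5, padding=2), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(8, 16, kernel_size=5, padding=2), nn.ReLU(),
                nn.MaxPool2d(2),
            )
            self.classifier = nn.Sequential(
                nn.Flatten(),
                nn.Linear(16 * 16 * 16, 2),     # assumes 64x64 input patches
            )

        def forward(self, x):
            return self.classifier(self.features(x))

    # One training step on a synthetic batch of 64x64 patches (mass = 1, normal = 0).
    model = SmallCNN()
    x = torch.randn(8, 1, 64, 64)
    y = torch.randint(0, 2, (8,))
    loss = nn.CrossEntropyLoss()(model(x), y)
    loss.backward()
    print(float(loss))
    ```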

  7. Combining Accuracy and Efficiency: An Incremental Focal-Point Method Based on Pair Natural Orbitals.

    PubMed

    Fiedler, Benjamin; Schmitz, Gunnar; Hättig, Christof; Friedrich, Joachim

    2017-12-12

    In this work, we present a new pair natural orbital (PNO)-based incremental scheme to calculate CCSD(T) and CCSD(T0) reaction, interaction, and binding energies. We perform an extensive analysis, which shows small incremental errors similar to previous non-PNO calculations. Furthermore, slight PNO errors are obtained by using T_PNO = T_TNO with appropriate values of 10^-7 to 10^-8 for reactions and 10^-8 for interaction or binding energies. The combination with the efficient MP2 focal-point approach yields chemical accuracy relative to the complete basis-set (CBS) limit. In this method, small basis sets (cc-pVDZ, def2-TZVP) for the CCSD(T) part are sufficient in the case of reactions or interactions, while somewhat larger ones (e.g., (aug)-cc-pVTZ) are necessary for molecular clusters. For these larger basis sets, we show the very high efficiency of our scheme. We obtain not only tremendous decreases in wall times (i.e., factors > 10^2) due to the parallelization of the increment calculations, as well as in total times due to the application of PNOs (i.e., compared to the normal incremental scheme), but also smaller total times with respect to the standard PNO method. In this way, our new method combines excellent accuracy with very high efficiency, as well as access to larger systems, owing to the separation of the full computation into several small increments.

  8. Cassini UVIS Auroral Observations in 2016 and 2017

    NASA Astrophysics Data System (ADS)

    Pryor, Wayne R.; Esposito, Larry W.; Jouchoux, Alain; Radioti, Aikaterini; Grodent, Denis; Gustin, Jacques; Gerard, Jean-Claude; Lamy, Laurent; Badman, Sarah; Dyudina, Ulyana A.; Cassini UVIS Team, Cassini VIMS Team, Cassini ISS Team, HST Saturn Auroral Team

    2017-10-01

    In 2016 and 2017, the Cassini Saturn orbiter executed a final series of high-inclination, low-periapsis orbits ideal for studies of Saturn's polar regions. The Cassini Ultraviolet Imaging Spectrograph (UVIS) obtained an extensive set of auroral images, some at the highest spatial resolution obtained during Cassini's long orbital mission (2004-2017). In some cases, two or three spacecraft slews at right angles to the long slit of the spectrograph were required to cover the entire auroral region to form auroral images. We will present selected images from this set showing narrow arcs of emission, more diffuse auroral emissions, multiple auroral arcs in a single image, discrete spots of emission, small-scale vortices, large-scale spiral forms, and parallel linear features that appear to cross in places like twisted wires. Some shorter features are transverse to the main auroral arcs, like barbs on a wire. UVIS observations were in some cases simultaneous with auroral observations from the Cassini Imaging Science Subsystem (ISS), the Cassini Visual and Infrared Mapping Spectrometer (VIMS), and the Hubble Space Telescope's Space Telescope Imaging Spectrograph (STIS), which will also be presented.

  9. AIS Spectra for Stressed and Unstressed Plant Communities in the Carolina Slate Belt

    NASA Technical Reports Server (NTRS)

    Wickland, D. E.

    1985-01-01

    Airborne imaging spectrometer (AIS) data were collected over a number of derelict heavy metal mine sites in the Carolina slate belt of North Carolina. A 32 channel (1156 to 1456 nm) data set was acquired in October, 1983 at the time of peak fall foliage display, and a 128 channel (1220 to 2420 nm) data set was acquired near the end of the spring leaf flush in May, 1984. Spectral curves were extracted from the AIS data for differing ground cover types (e.g., pine forests, mixed deciduous forests, mine sites, and pastures). Variation in the width of an absorption feature located at approximately 1190 nm has been related to differences in forest type. Small differences in the location and shape of features in the near infrared plateau (1156 to 1300 nm) and the region 2000 to 2420 nm have yet to be evaluated. Because these variations were subtle, and because atmospheric effects were apparent in the data, high priority must be assigned to devising a means of removing atmospheric effects from AIS spectra.

  10. Performance Analysis of Continuous Black-Box Optimization Algorithms via Footprints in Instance Space.

    PubMed

    Muñoz, Mario A; Smith-Miles, Kate A

    2017-01-01

    This article presents a method for the objective assessment of an algorithm's strengths and weaknesses. Instead of examining the performance of only one or more algorithms on a benchmark set, or generating custom problems that maximize the performance difference between two algorithms, our method quantifies both the nature of the test instances and the algorithm performance. Our aim is to gather information about possible phase transitions in performance, that is, the points in which a small change in problem structure produces algorithm failure. The method is based on the accurate estimation and characterization of the algorithm footprints, that is, the regions of instance space in which good or exceptional performance is expected from an algorithm. A footprint can be estimated for each algorithm and for the overall portfolio. Therefore, we select a set of features to generate a common instance space, which we validate by constructing a sufficiently accurate prediction model. We characterize the footprints by their area and density. Our method identifies complementary performance between algorithms, quantifies the common features of hard problems, and locates regions where a phase transition may lie.
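
    To make the footprint idea concrete, the sketch below estimates a footprint's area and density as the convex hull of instances where an algorithm performed well. This is a simplification under assumed 2-D instance coordinates; the coordinates, the performance values, and the 0.7 threshold are hypothetical stand-ins, not the authors' actual construction:

```python
# Minimal sketch: estimate an algorithm "footprint" in a 2-D instance space as
# the convex hull of instances with good performance, then report area/density.
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(0)
instance_coords = rng.uniform(0, 1, size=(500, 2))   # projected instance features
performance = rng.uniform(0, 1, size=500)            # e.g., normalized success rate

good = instance_coords[performance > 0.7]            # instances with good performance
hull = ConvexHull(good)
area = hull.volume                                    # in 2-D, .volume is the area
density = len(good) / area
print(f"footprint area={area:.3f}, density={density:.1f} instances per unit area")
```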

  11. Phosphaturic mesenchymal tumor, mixed connective tissue variant, of the mandible: report of a case and review of the literature.

    PubMed

    Woo, Victoria L; Landesberg, Regina; Imel, Erik A; Singer, Steven R; Folpe, Andrew L; Econs, Michael J; Kim, Taeyun; Harik, Lara R; Jacobs, Thomas P

    2009-12-01

    Tumor-induced osteomalacia (TIO) is a rare paraneoplastic syndrome that results in renal phosphate wasting with hypophosphatemia. In most cases, the underlying cause of TIO is a small mesenchymal neoplasm that is often difficult to detect, resulting in delayed diagnosis. One such neoplasm is the phosphaturic mesenchymal tumor, mixed connective tissue variant (PMTMCT), an unusual entity with unique morphologic and biochemical features. Most of these tumors are found at appendicular sites with only rare cases reported in the jaws. We describe a PMTMCT involving the mandible in a patient with a protracted history of osteomalacia. A review of the current literature is provided with emphasis on the clinical and histologic features, etiopathogenesis, and management of PMTMCT in the setting of TIO.

  12. Phosphaturic mesenchymal tumor, mixed connective tissue variant, of the mandible: Report of a case and review of the literature

    PubMed Central

    Woo, Victoria L.; Landesberg, Regina; Imel, Erik A.; Singer, Steven R.; Folpe, Andrew L.; Econs, Michael J.; Kim, Taeyun; Harik, Lara R.; Jacobs, Thomas P.

    2009-01-01

    Tumor-induced osteomalacia (TIO) is a rare paraneoplastic syndrome that results in renal phosphate wasting with hypophosphatemia. In most cases, the underlying cause of TIO is a small mesenchymal neoplasm that is often difficult to detect, resulting in delayed diagnosis. One such neoplasm is the phosphaturic mesenchymal tumor, mixed connective tissue variant (PMTMCT), an unusual entity with unique morphologic and biochemical features. The majority of these tumors are found at appendicular sites with only rare cases reported in the jaws. We describe a PMTMCT involving the mandible in a patient with a protracted history of osteomalacia. A review of the current literature is provided with emphasis on the clinical and histologic features, etiopathogenesis, and management of PMTMCT in the setting of TIO. PMID:19828339

  13. Nuclear Physics Around the Unitarity Limit.

    PubMed

    König, Sebastian; Grießhammer, Harald W; Hammer, H-W; van Kolck, U

    2017-05-19

    We argue that many features of the structure of nuclei emerge from a strictly perturbative expansion around the unitarity limit, where the two-nucleon S waves have bound states at zero energy. In this limit, the gross features of states in the nuclear chart are correlated to only one dimensionful parameter, which is related to the breaking of scale invariance to a discrete scaling symmetry and set by the triton binding energy. Observables are moved to their physical values by small perturbative corrections, much like in descriptions of the fine structure of atomic spectra. We provide evidence in favor of the conjecture that light, and possibly heavier, nuclei are bound weakly enough to be insensitive to the details of the interactions but strongly enough to be insensitive to the exact size of the two-nucleon system.

  14. Active Contours Driven by Multi-Feature Gaussian Distribution Fitting Energy with Application to Vessel Segmentation.

    PubMed

    Wang, Lei; Zhang, Huimao; He, Kan; Chang, Yan; Yang, Xiaodong

    2015-01-01

    Active contour models are of great importance for image segmentation and can extract smooth and closed boundary contours of the desired objects with promising results. However, they do not work well in the presence of intensity inhomogeneity. Hence, in this paper a novel region-based active contour model is proposed that takes image intensities and 'vesselness values' from local phase-based vesselness enhancement into account simultaneously to define a multi-feature Gaussian distribution fitting energy. This energy is then incorporated into a level set formulation with a regularization term for accurate segmentation. Experimental results on the publicly available STructured Analysis of the Retina (STARE) database demonstrate that our model is more accurate than several existing typical methods and can successfully segment most small vessels of varying width.

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, Seyong; Vetter, Jeffrey S

    Computer architecture experts expect that non-volatile memory (NVM) hierarchies will play a more significant role in future systems including mobile, enterprise, and HPC architectures. With this expectation in mind, we present NVL-C: a novel programming system that facilitates the efficient and correct programming of NVM main memory systems. The NVL-C programming abstraction extends C with a small set of intuitive language features that target NVM main memory, and can be combined directly with traditional C memory model features for DRAM. We have designed these new features to enable compiler analyses and run-time checks that can improve performance and guard against a number of subtle programming errors, which, when left uncorrected, can corrupt NVM-stored data. Moreover, to enable recovery of data across application or system failures, these NVL-C features include a flexible directive for specifying NVM transactions. So that our implementation might be extended to other compiler front ends and languages, the majority of our compiler analyses are implemented in an extended version of LLVM's intermediate representation (LLVM IR). We evaluate NVL-C on a number of applications to show its flexibility, performance, and correctness.

  16. Geometrical constraint on the localization of deep water formation

    NASA Astrophysics Data System (ADS)

    Ferreira, D.; Marshall, J.

    2008-12-01

    That deep water formation occurs in the North Atlantic and not the North Pacific is one of the most notable features of the present climate. In an effort to build a system able to mimic such basic aspects of climate using a minimal description, we study here the influence of ocean geometry on the localization of deep water formation. Using the MIT GCM, two idealized configurations of an ocean-atmosphere-sea ice climate system are studied: Drake and Double-Drake. In Drake, one narrow barrier extends from the North Pole to 35°S, while in Double-Drake, two such barriers set 90° apart join at the North Pole to delimit a Small and a Large basin. Despite the different continental configurations, the two climates are strikingly similar in the zonal average (almost identical heat and fresh water transports, and meridional overturning circulation). However, regional circulations in the Small and Large basins exhibit distinctive Atlantic-like and Pacific-like characteristics: the Small basin is warmer and saltier than the Large one, concentrates dense water formation and the deep overturning circulation, and achieves the largest fraction of the northward ocean heat transport. We show that the warmer temperature and higher evaporation over the Small basin are not its distinguishing factors. Rather, it is the width of the basin in relation to the zonal fetch of the precipitation pattern. This generates a deficit/excess of precipitation over the Small/Large basin: a fraction of the moisture evaporated from the Small basin is transported zonally and rains out over the Large basin. This creates a salt contrast between the two basins, leading to the localization of deep convection in the salty Small basin. Finally, given the broad similarities between Double-Drake and the real world, we suggest that many gross features that define the present climate are a consequence of two asymmetries: a meridional asymmetry (a zonally unblocked southern ocean versus a blocked northern ocean) and a zonal one (a small and a large basin in the Northern Hemisphere).

  17. FEATURE 1, SMALL GUN POSITION, VIEW FACING NORTH. Naval ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    FEATURE 1, SMALL GUN POSITION, VIEW FACING NORTH. - Naval Air Station Barbers Point, Anti-Aircraft Battery Complex-Small Gun Position, East of Coral Sea Road, northwest of Hamilton Road, Ewa, Honolulu County, HI

  18. Three Small-Receptive-Field Ganglion Cells in the Mouse Retina Are Distinctly Tuned to Size, Speed, and Object Motion

    PubMed Central

    Jacoby, Jason

    2017-01-01

    Retinal ganglion cells (RGCs) are frequently divided into functional types by their ability to extract and relay specific features from a visual scene, such as the capacity to discern local or global motion, direction of motion, stimulus orientation, contrast or uniformity, or the presence of large or small objects. Here we introduce three previously uncharacterized, nondirection-selective ON–OFF RGC types that represent a distinct set of feature detectors in the mouse retina. The three high-definition (HD) RGCs possess small receptive-field centers and strong surround suppression. They respond selectively to objects of specific sizes, speeds, and types of motion. We present comprehensive morphological characterization of the HD RGCs and physiological recordings of their light responses, receptive-field size and structure, and synaptic mechanisms of surround suppression. We also explore the similarities and differences between the HD RGCs and a well characterized RGC with a comparably small receptive field, the local edge detector, in response to moving objects and textures. We model populations of each RGC type to study how they differ in their performance tracking a moving object. These results, besides introducing three new RGC types that together constitute a substantial fraction of mouse RGCs, provide insights into the role of different circuits in shaping RGC receptive fields and establish a foundation for continued study of the mechanisms of surround suppression and the neural basis of motion detection. SIGNIFICANCE STATEMENT The output cells of the retina, retinal ganglion cells (RGCs), are a diverse group of ∼40 distinct neuron types that are often assigned “feature detection” profiles based on the specific aspects of the visual scene to which they respond. Here we describe, for the first time, morphological and physiological characterization of three new RGC types in the mouse retina, substantially augmenting our understanding of feature selectivity. Experiments and modeling show that while these three “high-definition” RGCs share certain receptive-field properties, they also have distinct tuning to the size, speed, and type of motion on the retina, enabling them to occupy different niches in stimulus space. PMID:28100743

  19. Improved computer-aided detection of small polyps in CT colonography using interpolation for curvature estimation

    PubMed Central

    Liu, Jiamin; Kabadi, Suraj; Van Uitert, Robert; Petrick, Nicholas; Deriche, Rachid; Summers, Ronald M.

    2011-01-01

    Purpose: Surface curvatures are important geometric features for the computer-aided analysis and detection of polyps in CT colonography (CTC). However, the general kernel approach for curvature computation can yield erroneous results for small polyps and for polyps that lie on haustral folds. Those erroneous curvatures will reduce the performance of polyp detection. This paper presents an analysis of interpolation's effect on curvature estimation for thin structures and its application to computer-aided detection of small polyps in CTC. Methods: The authors demonstrated that a simple technique, image interpolation, can improve the accuracy of curvature estimation for thin structures and thus significantly improve the sensitivity of small polyp detection in CTC. Results: Our experiments showed that the merits of interpolation included more accurate curvature values for simulated data and the isolation of polyps near folds in clinical data. After testing on a large clinical data set, it was observed that linear, quadratic B-spline, and cubic B-spline interpolations all significantly improved the sensitivity of small polyp detection. Conclusions: Image interpolation can improve the accuracy of curvature estimation for thin structures and thus improve the computer-aided detection of small polyps in CTC. PMID:21859029
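
    As a rough 2-D illustration of the paper's idea (the paper works on 3-D CTC surfaces; the synthetic image, the zoom factor, and the isophote-curvature formula here are stand-ins), the sketch below upsamples an image with cubic B-spline interpolation before estimating curvature, so thin structures are better resolved:

```python
# Interpolate an image before estimating level-set (isophote) curvature.
import numpy as np
from scipy import ndimage

def isophote_curvature(img):
    """kappa = (Ix^2*Iyy - 2*Ix*Iy*Ixy + Iy^2*Ixx) / (Ix^2 + Iy^2)^1.5"""
    Iy, Ix = np.gradient(img)          # gradients along rows (y) and columns (x)
    Ixy, Ixx = np.gradient(Ix)
    Iyy, _ = np.gradient(Iy)
    denom = (Ix**2 + Iy**2) ** 1.5 + 1e-12
    return (Ix**2 * Iyy - 2 * Ix * Iy * Ixy + Iy**2 * Ixx) / denom

# Synthetic thin ring-like structure on a coarse grid
x, y = np.meshgrid(np.linspace(-1, 1, 32), np.linspace(-1, 1, 32))
img = np.exp(-(((x**2 + y**2) - 0.25) ** 2) / 0.001)

coarse_curv = isophote_curvature(img)
fine_img = ndimage.zoom(img, 4, order=3)     # cubic B-spline interpolation
fine_curv = isophote_curvature(fine_img)
print(coarse_curv.max(), fine_curv.max())
```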

  20. Two-way coupled SPH and particle level set fluid simulation.

    PubMed

    Losasso, Frank; Talton, Jerry; Kwatra, Nipun; Fedkiw, Ronald

    2008-01-01

    Grid-based methods have difficulty resolving features on or below the scale of the underlying grid. Although adaptive methods (e.g. RLE, octrees) can alleviate this to some degree, separate techniques are still required for simulating small-scale phenomena such as spray and foam, especially since these more diffuse materials typically behave quite differently than their denser counterparts. In this paper, we propose a two-way coupled simulation framework that uses the particle level set method to efficiently model dense liquid volumes and a smoothed particle hydrodynamics (SPH) method to simulate diffuse regions such as sprays. Our novel SPH method allows us to simulate both dense and diffuse water volumes, fully incorporates the particles that are automatically generated by the particle level set method in under-resolved regions, and allows for two way mixing between dense SPH volumes and grid-based liquid representations.

  1. Sensor-oriented feature usability evaluation in fingerprint segmentation

    NASA Astrophysics Data System (ADS)

    Li, Ying; Yin, Yilong; Yang, Gongping

    2013-06-01

    Existing fingerprint segmentation methods usually process fingerprint images captured by different sensors with the same feature or feature set. We propose to improve fingerprint segmentation in view of the important fact that images from different sensors have different characteristics for segmentation. Feature usability evaluation means assessing the usability of candidate features in order to find the personalized feature or feature set for each sensor and thereby improve segmentation performance. The need for feature usability evaluation in fingerprint segmentation is raised and analyzed as a new issue. To address it, we present a decision-tree-based feature-usability evaluation method, which utilizes the C4.5 decision tree algorithm to evaluate and pick the most suitable feature or feature set for fingerprint segmentation from a typical candidate feature set. We apply the novel method to the FVC2002 database of fingerprint images, which were acquired with four different sensors and technologies. Experimental results show that the accuracy of segmentation is improved, and the time consumed for feature extraction is dramatically reduced with the selected feature(s).
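
    As a rough illustration of this per-sensor selection step (not the authors' exact pipeline), the sketch below trains a decision tree on block-level features labelled foreground/background and keeps only the features the tree finds informative. scikit-learn's CART with an entropy criterion stands in for C4.5, and the data, feature names, and importance threshold are hypothetical:

```python
# Per-sensor feature usability via a decision tree's feature importances.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

feature_names = ["coherence", "mean", "variance", "gabor_energy"]
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, len(feature_names)))   # block features for one sensor
y = (X[:, 2] + 0.5 * X[:, 0] > 0).astype(int)     # foreground/background labels

tree = DecisionTreeClassifier(criterion="entropy", max_depth=4, random_state=0).fit(X, y)
ranked = sorted(zip(tree.feature_importances_, feature_names), reverse=True)
selected = [name for importance, name in ranked if importance > 0.1]
print("per-sensor feature set:", selected)
```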

  2. 48 CFR 19.506 - Withdrawing or modifying small business set-asides.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... a withdrawal of an individual small business set-aside by giving written notice to the agency small... small business set-asides. 19.506 Section 19.506 Federal Acquisition Regulations System FEDERAL ACQUISITION REGULATION SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 19.506...

  3. 48 CFR 19.506 - Withdrawing or modifying small business set-asides.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... small business set-asides. 19.506 Section 19.506 Federal Acquisition Regulations System FEDERAL ACQUISITION REGULATION SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 19.506 Withdrawing or modifying small business set-asides. (a) If, before award of a contract involving a small...

  4. 48 CFR 19.506 - Withdrawing or modifying small business set-asides.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... small business set-asides. 19.506 Section 19.506 Federal Acquisition Regulations System FEDERAL ACQUISITION REGULATION SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 19.506 Withdrawing or modifying small business set-asides. (a) If, before award of a contract involving a small...

  5. 48 CFR 19.506 - Withdrawing or modifying small business set-asides.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... small business set-asides. 19.506 Section 19.506 Federal Acquisition Regulations System FEDERAL ACQUISITION REGULATION SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 19.506 Withdrawing or modifying small business set-asides. (a) If, before award of a contract involving a small...

  6. 48 CFR 19.506 - Withdrawing or modifying small business set-asides.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... small business set-asides. 19.506 Section 19.506 Federal Acquisition Regulations System FEDERAL ACQUISITION REGULATION SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 19.506 Withdrawing or modifying small business set-asides. (a) If, before award of a contract involving a small...

  7. Benchmarking of protein descriptor sets in proteochemometric modeling (part 2): modeling performance of 13 amino acid descriptor sets

    PubMed Central

    2013-01-01

    Background While a large body of work exists on comparing and benchmarking descriptors of molecular structures, a similar comparison of protein descriptor sets is lacking. Hence, in the current work a total of 13 amino acid descriptor sets have been benchmarked with respect to their ability to establish bioactivity models. The descriptor sets included in the study are Z-scales (3 variants), VHSE, T-scales, ST-scales, MS-WHIM, FASGAI, BLOSUM, and a novel protein descriptor set (termed ProtFP (4 variants)); in addition we created and benchmarked three pairs of descriptor combinations. Prediction performance was evaluated in seven structure-activity benchmarks which comprise Angiotensin Converting Enzyme (ACE) dipeptidic inhibitor data and three proteochemometric data sets, namely (1) GPCR ligands modeled against a GPCR panel, (2) enzyme inhibitors (NNRTIs) with associated bioactivities against a set of HIV enzyme mutants, and (3) enzyme inhibitors (PIs) with associated bioactivities on a large set of HIV enzyme mutants. Results The amino acid descriptor sets compared here show similar performance (<0.1 log units RMSE difference and <0.1 difference in MCC), while errors for individual proteins were in some cases found to be larger than those resulting from descriptor set differences (>0.3 log units RMSE difference and >0.7 difference in MCC). Combining different descriptor sets generally leads to better modeling performance than utilizing individual sets. The best performers were Z-scales (3) combined with ProtFP (Feature), or Z-Scales (3) combined with an average Z-Scale value for each target, while ProtFP (PCA8), ST-Scales, and ProtFP (Feature) rank last. Conclusions While amino acid descriptor sets capture different aspects of amino acids, their ability to be used for bioactivity modeling is still, on average, surprisingly similar. Still, combining sets describing complementary information leads to small but consistent improvements in modeling performance (average MCC 0.01 better, average RMSE 0.01 log units lower). Finally, performance differences exist between the targets compared, thereby underlining that choosing an appropriate descriptor set is of fundamental importance for bioactivity modeling, both on the ligand side and on the protein side. PMID:24059743

  8. SFRP1 is a possible candidate for epigenetic therapy in non-small cell lung cancer.

    PubMed

    Taguchi, Y-H; Iwadate, Mitsuo; Umeyama, Hideaki

    2016-08-12

    Non-small cell lung cancer (NSCLC) remains a lethal disease despite many proposed treatments. Recent studies have indicated that epigenetic therapy, which targets epigenetic effects, might be a new therapeutic methodology for NSCLC. However, it is not clear which objects (e.g., genes) this treatment specifically targets. Secreted frizzled-related proteins (SFRPs) are promising candidates for epigenetic therapy in many cancers, but there have been no reports of SFRPs targeted by epigenetic therapy for NSCLC. This study performed a meta-analysis of reprogrammed NSCLC cell lines instead of the direct examination of epigenetic therapy treatment to identify epigenetic therapy targets. In addition, mRNA expression/promoter methylation profiles were processed by recently proposed principal component analysis based unsupervised feature extraction and categorical regression analysis based feature extraction. The Wnt/β-catenin signalling pathway was extensively enriched among 32 genes identified by feature extraction. Among the genes identified, SFRP1 was specifically indicated to target β-catenin, and thus might be targeted by epigenetic therapy in NSCLC cell lines. A histone deacetylase inhibitor might reactivate SFRP1 based upon the re-analysis of a public domain data set. Numerical computation validated the binding of SFRP1 to WNT1 to suppress Wnt signalling pathway activation in NSCLC. The meta-analysis of reprogrammed NSCLC cell lines identified SFRP1 as a promising target of epigenetic therapy for NSCLC.

  9. Robust Vehicle Detection in Aerial Images Based on Cascaded Convolutional Neural Networks.

    PubMed

    Zhong, Jiandan; Lei, Tao; Yao, Guangle

    2017-11-24

    Vehicle detection in aerial images is an important and challenging task. Traditionally, many target detection models based on sliding-window fashion were developed and achieved acceptable performance, but these models are time-consuming in the detection phase. Recently, with the great success of convolutional neural networks (CNNs) in computer vision, many state-of-the-art detectors have been designed based on deep CNNs. However, these CNN-based detectors are inefficient when applied in aerial image data due to the fact that the existing CNN-based models struggle with small-size object detection and precise localization. To improve the detection accuracy without decreasing speed, we propose a CNN-based detection model combining two independent convolutional neural networks, where the first network is applied to generate a set of vehicle-like regions from multi-feature maps of different hierarchies and scales. Because the multi-feature maps combine the advantage of the deep and shallow convolutional layer, the first network performs well on locating the small targets in aerial image data. Then, the generated candidate regions are fed into the second network for feature extraction and decision making. Comprehensive experiments are conducted on the Vehicle Detection in Aerial Imagery (VEDAI) dataset and Munich vehicle dataset. The proposed cascaded detection model yields high performance, not only in detection accuracy but also in detection speed.

  10. An Analysis of a Digital Variant of the Trail Making Test Using Machine Learning Techniques

    PubMed Central

    Dahmen, Jessamyn; Cook, Diane; Fellows, Robert; Schmitter-Edgecombe, Maureen

    2017-01-01

    BACKGROUND The goal of this work is to develop a digital version of a standard cognitive assessment, the Trail Making Test (TMT), and assess its utility. OBJECTIVE This paper introduces a novel digital version of the TMT and introduces a machine learning based approach to assess its capabilities. METHODS Using digital Trail Making Test (dTMT) data collected from (N=54) older adult participants as feature sets, we use machine learning techniques to analyze the utility of the dTMT and evaluate the insights provided by the digital features. RESULTS Predicted TMT scores correlate well with clinical digital test scores (r=0.98) and paper time to completion scores (r=0.65). Predicted TICS exhibited a small correlation with clinically-derived TICS scores (r=0.12 Part A, r=0.10 Part B). Predicted FAB scores exhibited a small correlation with clinically-derived FAB scores (r=0.13 Part A, r=0.29 for Part B). Digitally-derived features were also used to predict diagnosis (AUC of 0.65). CONCLUSION Our findings indicate that the dTMT is capable of measuring the same aspects of cognition as the paper-based TMT. Furthermore, the dTMT’s additional data may be able to help monitor other cognitive processes not captured by the paper-based TMT alone. PMID:27886019

  11. Robust Vehicle Detection in Aerial Images Based on Cascaded Convolutional Neural Networks

    PubMed Central

    Zhong, Jiandan; Lei, Tao; Yao, Guangle

    2017-01-01

    Vehicle detection in aerial images is an important and challenging task. Traditionally, many target detection models based on sliding-window fashion were developed and achieved acceptable performance, but these models are time-consuming in the detection phase. Recently, with the great success of convolutional neural networks (CNNs) in computer vision, many state-of-the-art detectors have been designed based on deep CNNs. However, these CNN-based detectors are inefficient when applied in aerial image data due to the fact that the existing CNN-based models struggle with small-size object detection and precise localization. To improve the detection accuracy without decreasing speed, we propose a CNN-based detection model combining two independent convolutional neural networks, where the first network is applied to generate a set of vehicle-like regions from multi-feature maps of different hierarchies and scales. Because the multi-feature maps combine the advantage of the deep and shallow convolutional layer, the first network performs well on locating the small targets in aerial image data. Then, the generated candidate regions are fed into the second network for feature extraction and decision making. Comprehensive experiments are conducted on the Vehicle Detection in Aerial Imagery (VEDAI) dataset and Munich vehicle dataset. The proposed cascaded detection model yields high performance, not only in detection accuracy but also in detection speed. PMID:29186756

  12. FEATURE 1, SMALL GUN POSITION, VIEW FACING NORTH, (with scale ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    FEATURE 1, SMALL GUN POSITION, VIEW FACING NORTH, (with scale stick). - Naval Air Station Barbers Point, Anti-Aircraft Battery Complex-Small Gun Position, East of Coral Sea Road, northwest of Hamilton Road, Ewa, Honolulu County, HI

  13. SSOD on JEM RMS

    NASA Image and Video Library

    2012-10-04

    ISS033-E-009269 (4 Oct. 2012) --- A Small Satellite Orbital Deployer (SSOD) attached to the Japanese module’s robotic arm is featured in this image photographed by an Expedition 33 crew member on the International Space Station. Several tiny satellites were released outside the Kibo laboratory using the SSOD on Oct. 4, 2012. Japan Aerospace Exploration Agency astronaut Aki Hoshide, flight engineer, set up the satellite deployment gear inside the lab and placed it in the Kibo airlock. The Japanese robotic arm then grappled the deployment system and its satellites from the airlock for deployment.

  14. SCP -- A Simple CCD Processing Package

    NASA Astrophysics Data System (ADS)

    Lewis, J. R.

    This note describes a small set of programs, written at RGO, which deal with basic CCD frame processing (e.g. bias subtraction, flat fielding, trimming). The need to process large numbers of CCD frames from devices such as FOS or ISIS in order to extract spectra has prompted the writing of routines which will do the basic hack-work with a minimal amount of interaction from the user. Although they were written with spectral data in mind, there are no "spectrum-specific" features in the software, which means they can be applied to any CCD data.
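
    As a minimal sketch of the kind of "hack-work" described (bias subtraction, flat-fielding, trimming), the snippet below operates on plain NumPy arrays in place of real FITS frames; file handling, overscan handling, and instrument specifics are omitted, and all values are synthetic:

```python
# Basic CCD frame reduction: trim, subtract bias, divide by a normalized flat.
import numpy as np

def reduce_frame(raw, master_bias, master_flat, trim=((0, None), (0, None))):
    """Return a bias-subtracted, flat-fielded, trimmed science frame."""
    rows, cols = trim
    frame = raw[slice(*rows), slice(*cols)].astype(float)
    bias = master_bias[slice(*rows), slice(*cols)]
    flat = master_flat[slice(*rows), slice(*cols)]
    flat_norm = flat / np.median(flat)          # normalize flat to unit median
    return (frame - bias) / flat_norm

raw = np.random.poisson(1200.0, size=(512, 512)).astype(float)
bias = np.full((512, 512), 300.0)
flat = np.random.normal(1.0, 0.02, size=(512, 512)) * 5000.0
science = reduce_frame(raw, bias, flat, trim=((10, 500), (10, 500)))
print(science.shape, round(science.mean(), 1))
```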

  15. Prevalence of neuropathic features of back pain in clinical populations: implications for the diagnostic triage paradigm.

    PubMed

    Hush, Julia M; Marcuzzi, Anna

    2012-07-01

    Contemporary clinical assessment of back pain is based on the diagnostic triage paradigm. The most common diagnostic classification is nonspecific back pain, considered to be of nociceptive etiology. A small proportion are diagnosed with radicular pain, of neuropathic origin. In this study we review the body of literature on the prevalence of neuropathic features of back pain, revealing that the point prevalence is 17% in primary care, 34% in mixed clinical settings and 53% in tertiary care. There is evidence that neuropathic features of back pain are not restricted to typical clinical radicular pain phenotypes and may be under-recognized, particularly in primary care. The consequence of this is that in the clinic, diagnostic triage may erroneously classify patients with nonspecific back pain or radicular pain. A promising alternative is the development of mechanism-based pain phenotyping in patients with back pain. Timely identification of contributory pain mechanisms may enable greater opportunity to select appropriate therapeutic targets and improve patient outcomes.

  16. ELUCIDATING BRAIN CONNECTIVITY NETWORKS IN MAJOR DEPRESSIVE DISORDER USING CLASSIFICATION-BASED SCORING.

    PubMed

    Sacchet, Matthew D; Prasad, Gautam; Foland-Ross, Lara C; Thompson, Paul M; Gotlib, Ian H

    2014-04-01

    Graph theory is increasingly used in the field of neuroscience to understand the large-scale network structure of the human brain. There is also considerable interest in applying machine learning techniques in clinical settings, for example, to make diagnoses or predict treatment outcomes. Here we used support-vector machines (SVMs), in conjunction with whole-brain tractography, to identify graph metrics that best differentiate individuals with Major Depressive Disorder (MDD) from nondepressed controls. To do this, we applied a novel feature-scoring procedure that incorporates iterative classifier performance to assess feature robustness. We found that small-worldness, a measure of the balance between global integration and local specialization, most reliably differentiated MDD from nondepressed individuals. Post-hoc regional analyses suggested that heightened connectivity of the subcallosal cingulate gyrus (SCG) in MDDs contributes to these differences. The current study provides a novel way to assess the robustness of classification features and reveals anomalies in large-scale neural networks in MDD.
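
    In the spirit of the classification-based scoring described above (a simplified sketch, not the authors' exact procedure), the snippet below repeatedly fits a linear SVM on resampled data and scores each graph metric by how often it carries a large weight in a better-than-chance classifier; the data, metric names, and thresholds are hypothetical:

```python
# Iterative classification-based feature scoring with a linear SVM.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split

metrics = ["small_worldness", "global_efficiency", "modularity", "mean_degree"]
rng = np.random.default_rng(2)
X = rng.normal(size=(60, len(metrics)))           # graph metrics per participant
y = rng.integers(0, 2, size=60)                   # MDD vs. control labels

scores = np.zeros(len(metrics))
for seed in range(100):
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25, random_state=seed)
    clf = LinearSVC(dual=False).fit(Xtr, ytr)
    if clf.score(Xte, yte) > 0.5:                 # only credit better-than-chance fits
        weights = np.abs(clf.coef_).ravel()
        scores += (weights > np.median(weights))  # credit features with large weights
print(dict(zip(metrics, scores)))
```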

  17. Identifying predictive features in drug response using machine learning: opportunities and challenges.

    PubMed

    Vidyasagar, Mathukumalli

    2015-01-01

    This article reviews several techniques from machine learning that can be used to study the problem of identifying a small number of features, from among tens of thousands of measured features, that can accurately predict a drug response. Prediction problems are divided into two categories: sparse classification and sparse regression. In classification, the clinical parameter to be predicted is binary, whereas in regression, the parameter is a real number. Well-known methods for both classes of problems are briefly discussed. These include the SVM (support vector machine) for classification and various algorithms such as ridge regression, LASSO (least absolute shrinkage and selection operator), and EN (elastic net) for regression. In addition, several well-established methods that do not directly fall into machine learning theory are also reviewed, including neural networks, PAM (pattern analysis for microarrays), SAM (significance analysis for microarrays), GSEA (gene set enrichment analysis), and k-means clustering. Several references indicative of the application of these methods to cancer biology are discussed.
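
    As a short illustration of the sparse-regression case reviewed above, the sketch below uses LASSO to drive most coefficients to zero, leaving a small predictive feature set; the expression matrix, response values, and regularization strength are synthetic stand-ins:

```python
# Sparse regression (LASSO) selecting a small feature set from many candidates.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
n_samples, n_features = 80, 5000
X = rng.normal(size=(n_samples, n_features))       # e.g., expression of 5000 genes
true_support = [10, 200, 3000]                     # only a few genes matter
y = X[:, true_support] @ np.array([2.0, -1.5, 1.0]) + 0.1 * rng.normal(size=n_samples)

model = Lasso(alpha=0.1).fit(X, y)
selected = np.flatnonzero(model.coef_)             # indices of nonzero coefficients
print(f"{len(selected)} features selected:", selected[:10])
```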

  18. Differentiation of arterioles from venules in mouse histology images using machine learning

    NASA Astrophysics Data System (ADS)

    Elkerton, J. S.; Xu, Yiwen; Pickering, J. G.; Ward, Aaron D.

    2016-03-01

    Analysis and morphological comparison of arteriolar and venular networks are essential to our understanding of multiple diseases affecting every organ system. We have developed and evaluated the first fully automatic software system for differentiation of arterioles from venules on high-resolution digital histology images of the mouse hind limb immunostained for smooth muscle α-actin. Classifiers trained on texture and morphologic features by supervised machine learning provided excellent classification accuracy for differentiation of arterioles and venules, achieving an area under the receiver operating characteristic curve of 0.90 and balanced false-positive and false-negative rates. Feature selection was consistent across cross-validation iterations, and a small set of three features was required to achieve the reported performance, suggesting potential generalizability of the system. This system eliminates the need for laborious manual classification of the hundreds of microvessels occurring in a typical sample, and paves the way for high-throughput analysis of the arteriolar and venular networks in the mouse.

  19. Vessel Classification in Cosmo-Skymed SAR Data Using Hierarchical Feature Selection

    NASA Astrophysics Data System (ADS)

    Makedonas, A.; Theoharatos, C.; Tsagaris, V.; Anastasopoulos, V.; Costicoglou, S.

    2015-04-01

    SAR based ship detection and classification are important elements of maritime monitoring applications. Recently, high-resolution SAR data have opened new possibilities to researchers for achieving improved classification results. In this work, a hierarchical vessel classification procedure is presented based on a robust feature extraction and selection scheme that utilizes scale, shape and texture features in a hierarchical way. Initially, different types of feature extraction algorithms are implemented in order to form the utilized feature pool, able to represent the structure, material, orientation and other vessel type characteristics. A two-stage hierarchical feature selection algorithm is then utilized in order to effectively discriminate civilian vessels into three distinct types in COSMO-SkyMed SAR images: cargos, small ships, and tankers. In our analysis, scale and shape features are utilized in order to discriminate smaller types of vessels present in the available SAR data, or shape-specific vessels. Then, the most informative texture and intensity features are incorporated in order to better distinguish the civilian types with high accuracy. A feature selection procedure that utilizes heuristic measures based on the features' statistical characteristics, followed by an exhaustive search over feature sets formed by the most qualified features, is carried out in order to determine the most appropriate combination of features for the final classification. In our analysis, five COSMO-SkyMed SAR images with 2.2 m x 2.2 m resolution were used to analyse the detailed characteristics of these types of ships. A total of 111 ships with available AIS data were used in the classification process. The experimental results show that this method has good performance in ship classification, with an overall accuracy reaching 83%. Further investigation of additional features and proper feature selection is currently in progress.

  20. Small-Group Technology-Assisted Instruction: Virtual Teacher and Robot Peer for Individuals with Autism Spectrum Disorder.

    PubMed

    Saadatzi, Mohammad Nasser; Pennington, Robert C; Welch, Karla C; Graham, James H

    2018-06-20

    The authors combined virtual reality technology and social robotics to develop a tutoring system that resembled a small-group arrangement. This tutoring system featured a virtual teacher instructing sight words, and included a humanoid robot emulating a peer. The authors used a multiple-probe design across word sets to evaluate the effects of the instructional package on the explicit acquisition and vicarious learning of sight words instructed to three children with autism spectrum disorder (ASD) and the robot peer. Results indicated that participants acquired, maintained, and generalized 100% of the words explicitly instructed to them, made fewer errors while learning the words common between them and the robot peer, and vicariously learned 94% of the words solely instructed to the robot.

  1. Hypoparathyroidism-retardation-Dysmorphism (HRD) syndrome--a review.

    PubMed

    Hershkovitz, Eli; Parvari, Ruti; Diaz, George A; Gorodischer, Rafael

    2004-12-01

    Hypoparathyroidism, retardation, and dysmorphism (HRD) is a newly recognized genetic syndrome, described in patients of Arab origin. The syndrome consists of permanent congenital hypoparathyroidism, severe prenatal and postnatal growth retardation, and profound global developmental delay. The patients are susceptible to severe infections including life-threatening pneumococcal infections especially during infancy. The main dysmorphic features are microcephaly, deep-set eyes or microphthalmia, ear abnormalities, depressed nasal bridge, thin upper lip, hooked small nose, micrognathia, and small hands and feet. A single 12-bp deletion (del52-55) in the second coding exon of the tubulin cofactor E (TCFE) gene, located on the long arm of chromosome 1, is the cause of HRD among Arab patients. Early recognition and therapy of hypocalcemia is important as is daily antibiotic prophylaxis against pneumococcal infections.

  2. Neural network approach to proximity effect corrections in electron-beam lithography

    NASA Astrophysics Data System (ADS)

    Frye, Robert C.; Cummings, Kevin D.; Rietman, Edward A.

    1990-05-01

    The proximity effect, caused by electron beam backscattering during resist exposure, is an important concern in writing submicron features. It can be compensated by appropriate local changes in the incident beam dose, but computation of the optimal correction usually requires a prohibitively long time. We present an example of such a computation on a small test pattern, which we performed by an iterative method. We then used this solution as a training set for an adaptive neural network. After training, the network computed the same correction as the iterative method, but in a much shorter time. Correcting the image with a software based neural network resulted in a decrease in the computation time by a factor of 30, and a hardware based network enhanced the computation speed by more than a factor of 1000. Both methods had an acceptably small error of 0.5% compared to the results of the iterative computation. Additionally, we verified that the neural network correctly generalized the solution of the problem to include patterns not contained in its training set.
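
    As a rough sketch of the surrogate idea described above, the snippet below computes dose corrections for a small pattern with a stand-in "iterative" solver and trains a small neural network to reproduce the mapping from local pattern density to corrected dose; the data, the target function, and the network size are hypothetical, not the original hardware or software implementation:

```python
# Train a small MLP to mimic an expensive iterative dose-correction computation.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)
local_density = rng.uniform(0, 1, size=(5000, 9))                 # 3x3 neighbourhood densities
iterative_dose = 1.0 / (0.6 + 0.4 * local_density.mean(axis=1))   # stand-in for the slow solver

net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
net.fit(local_density, iterative_dose)                            # learn the correction mapping
fast_dose = net.predict(local_density[:5])
print(np.round(fast_dose, 3), np.round(iterative_dose[:5], 3))
```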

  3. Comparing Pattern Recognition Feature Sets for Sorting Triples in the FIRST Database

    NASA Astrophysics Data System (ADS)

    Proctor, D. D.

    2006-07-01

    Pattern recognition techniques have been used with increasing success for coping with the tremendous amounts of data being generated by automated surveys. Usually this process involves construction of training sets, the typical examples of data with known classifications. Given a feature set, along with the training set, statistical methods can be employed to generate a classifier. The classifier is then applied to process the remaining data. Feature set selection, however, is still an issue. This paper presents techniques developed for accommodating data for which a substantial portion of the training set cannot be classified unambiguously, a typical case for low-resolution data. Significance tests on the sort-ordered, sample-size-normalized vote distribution of an ensemble of decision trees are introduced as a method of evaluating the relative quality of feature sets. The technique is applied to comparing feature sets for sorting a particular radio galaxy morphology, bent-doubles, from the Faint Images of the Radio Sky at Twenty Centimeters (FIRST) database. Also examined are alternative functional forms for feature sets. Associated standard deviations provide the means to evaluate the effect of the number of folds, the number of classifiers per fold, and the sample size on the resulting classifications. The technique may also be applied to situations in which accurate classifications are available but the feature set is clearly inadequate, and it is nonetheless desired to make the best of the available information.
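
    To illustrate the general flavor of comparing feature sets via ensemble vote distributions (a loose sketch, not the paper's actual statistic), the snippet below trains one forest per candidate feature set, collects per-object vote fractions for the target class, sorts them, and applies a two-sample test; the data, feature groupings, and choice of test are hypothetical:

```python
# Compare two candidate feature sets via sorted tree-ensemble vote distributions.
import numpy as np
from scipy.stats import ks_2samp
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(5)
X = rng.normal(size=(1000, 12))
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)          # "bent-double" vs. other, synthetic

def sorted_votes(columns):
    forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X[:, columns], y)
    votes = forest.predict_proba(X[:, columns])[:, 1]  # vote fraction per object
    return np.sort(votes)

set_a, set_b = [0, 1, 2, 3], [4, 5, 6, 7]
stat, p = ks_2samp(sorted_votes(set_a), sorted_votes(set_b))
print(f"KS statistic={stat:.3f}, p={p:.3g}")
```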

  4. Feature Selection for Ridge Regression with Provable Guarantees.

    PubMed

    Paul, Saurabh; Drineas, Petros

    2016-04-01

    We introduce single-set spectral sparsification as a deterministic sampling-based feature selection technique for regularized least-squares classification, which is the classification analog to ridge regression. The method is unsupervised and gives worst-case guarantees of the generalization power of the classification function after feature selection with respect to the classification function obtained using all features. We also introduce leverage-score sampling as an unsupervised randomized feature selection method for ridge regression. We provide risk bounds for both single-set spectral sparsification and leverage-score sampling on ridge regression in the fixed design setting and show that the risk in the sampled space is comparable to the risk in the full-feature space. We perform experiments on synthetic and real-world data sets; a subset of TechTC-300 data sets, to support our theory. Experimental results indicate that the proposed methods perform better than the existing feature selection methods.
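
    As a brief sketch of the leverage-score sampling idea described above (dimensions and sample sizes here are arbitrary), the snippet below computes leverage scores for the features from the top-k right singular vectors of the data matrix and samples features with probability proportional to those scores:

```python
# Leverage-score sampling for unsupervised feature selection.
import numpy as np

rng = np.random.default_rng(6)
A = rng.normal(size=(200, 1000))                 # n samples x d features
k, n_keep = 20, 50

_, _, Vt = np.linalg.svd(A, full_matrices=False)
Vk = Vt[:k, :]                                   # top-k right singular vectors (k x d)
leverage = (Vk ** 2).sum(axis=0)                 # one leverage score per feature
probs = leverage / leverage.sum()

kept = rng.choice(A.shape[1], size=n_keep, replace=False, p=probs)
A_reduced = A[:, kept]                           # data restricted to sampled features
print(A_reduced.shape)
```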

  5. More target features in visual working memory leads to poorer search guidance: Evidence from contralateral delay activity

    PubMed Central

    Schmidt, Joseph; MacNamara, Annmarie; Proudfit, Greg Hajcak; Zelinsky, Gregory J.

    2014-01-01

    The visual-search literature has assumed that the top-down target representation used to guide search resides in visual working memory (VWM). We directly tested this assumption using contralateral delay activity (CDA) to estimate the VWM load imposed by the target representation. In Experiment 1, observers previewed four photorealistic objects and were cued to remember the two objects appearing to the left or right of central fixation; Experiment 2 was identical except that observers previewed two photorealistic objects and were cued to remember one. CDA was measured during a delay following preview offset but before onset of a four-object search array. One of the targets was always present, and observers were asked to make an eye movement to it and press a button. We found lower magnitude CDA on trials when the initial search saccade was directed to the target (strong guidance) compared to when it was not (weak guidance). This difference also tended to be larger shortly before search-display onset and was largely unaffected by VWM item-capacity limits or number of previews. Moreover, the difference between mean strong- and weak-guidance CDA was proportional to the increase in search time between mean strong- and weak-guidance trials (as measured by time-to-target and reaction-time difference scores). Contrary to most search models, our data suggest that trials resulting in the maintenance of more target features result in poorer search guidance to the target. We interpret these counterintuitive findings as evidence for strong search guidance using a small set of highly discriminative target features that remain after pruning from a larger set of features, with the load imposed on VWM varying with this feature-consolidation process. PMID:24599946

  6. More target features in visual working memory leads to poorer search guidance: evidence from contralateral delay activity.

    PubMed

    Schmidt, Joseph; MacNamara, Annmarie; Proudfit, Greg Hajcak; Zelinsky, Gregory J

    2014-03-05

    The visual-search literature has assumed that the top-down target representation used to guide search resides in visual working memory (VWM). We directly tested this assumption using contralateral delay activity (CDA) to estimate the VWM load imposed by the target representation. In Experiment 1, observers previewed four photorealistic objects and were cued to remember the two objects appearing to the left or right of central fixation; Experiment 2 was identical except that observers previewed two photorealistic objects and were cued to remember one. CDA was measured during a delay following preview offset but before onset of a four-object search array. One of the targets was always present, and observers were asked to make an eye movement to it and press a button. We found lower magnitude CDA on trials when the initial search saccade was directed to the target (strong guidance) compared to when it was not (weak guidance). This difference also tended to be larger shortly before search-display onset and was largely unaffected by VWM item-capacity limits or number of previews. Moreover, the difference between mean strong- and weak-guidance CDA was proportional to the increase in search time between mean strong- and weak-guidance trials (as measured by time-to-target and reaction-time difference scores). Contrary to most search models, our data suggest that trials resulting in the maintenance of more target features result in poorer search guidance to the target. We interpret these counterintuitive findings as evidence for strong search guidance using a small set of highly discriminative target features that remain after pruning from a larger set of features, with the load imposed on VWM varying with this feature-consolidation process.

  7. A dual small-molecule rheostat for precise control of protein concentration in Mammalian cells.

    PubMed

    Lin, Yu Hsuan; Pratt, Matthew R

    2014-04-14

    One of the most successful strategies for controlling protein concentrations in living cells relies on protein destabilization domains (DD). Under normal conditions, a DD will be rapidly degraded by the proteasome. However, the same DD can be stabilized or "shielded" in a stoichiometric complex with a small molecule, enabling dose-dependent control of its concentration. This process has been exploited by several labs to post-translationally control the expression levels of proteins in vitro as well as in vivo, although the previous technologies resulted in permanent fusion of the protein of interest to the DD, which can affect biological activity and complicate results. We previously reported a complementary strategy, termed traceless shielding (TShld), in which the protein of interest is released in its native form. Here, we describe an optimized protein concentration control system, TTShld, which retains the traceless features of TShld but utilizes two tiers of small molecule control to set protein concentrations in living cells. These experiments provide the first protein concentration control system that results in both a wide range of protein concentrations and proteins free from engineered fusion constructs. The TTShld system has a greatly improved dynamic range compared to our previously reported system, and the traceless feature is attractive for elucidation of the consequences of protein concentration in cell biology. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. Strata-1: An International Space Station Experiment into Fundamental Regolith Processes in Microgravity

    NASA Technical Reports Server (NTRS)

    Fries, M.; Abell, P.; Brisset, J.; Britt, D.; Colwell, J.; Durda, D.; Dove, A.; Graham, L.; Hartzell, C.; John, K.; hide

    2016-01-01

    The Strata-1 experiment will study the evolution of asteroidal regolith through long-duration exposure of simulant materials to the microgravity environment on the International Space Station (ISS). Many asteroids feature low bulk densities, which implies high values of porosity and a mechanical structure composed of loosely bound particles (i.e., the "rubble pile" model), a prime example of a granular medium. Even the higher-density, mechanically coherent asteroids feature a significant surface layer of loose regolith. These bodies are subjected to a variety of forces and will evolve in response to very small perturbations such as micrometeoroid impacts, planetary flybys, and the YORP effect. Our understanding of this dynamical evolution and the inter-particle forces involved would benefit from long-term observations of granular materials exposed to small vibrations in microgravity. A detailed understanding of asteroid mechanical evolution is needed in order to predict the surface characteristics of as-yet-unvisited bodies, to understand the larger context of samples collected by missions such as OSIRIS-REx and Hayabusa 1 and 2, and to mitigate risks for both manned and unmanned missions to asteroidal bodies. Understanding regolith dynamics will inform designs of how to land and set anchors, safely sample/move material on asteroidal surfaces, process large volumes of material for in situ resource utilization (ISRU) purposes, and, in general, predict the behavior of large and small particles on disturbed asteroid surfaces.

  9. A novel feature extraction approach for microarray data based on multi-algorithm fusion

    PubMed Central

    Jiang, Zhu; Xu, Rong

    2015-01-01

    Feature extraction is one of the most important and effective methods for reducing dimensionality in data mining, given the emergence of high-dimensional data such as microarray gene expression data. Feature extraction for gene selection mainly serves two purposes. One is to identify certain disease-related genes. The other is to find a compact set of discriminative genes to build a pattern classifier with reduced complexity and improved generalization capabilities. Depending on the purpose of gene selection, two types of feature extraction algorithms, ranking-based feature extraction and set-based feature extraction, are employed in microarray gene expression data analysis. In ranking-based feature extraction, features are evaluated on an individual basis, generally without considering the inter-relationships between features, while set-based feature extraction evaluates features based on their role in a feature set, taking into account dependencies between features. Like learning methods, feature extraction faces a problem with its generalization ability, namely robustness; however, the issue of robustness is often overlooked in feature extraction. In order to improve the accuracy and robustness of feature extraction for microarray data, a novel approach based on multi-algorithm fusion is proposed. By fusing different types of feature extraction algorithms to select features from the sample set, the proposed approach is able to improve feature extraction performance. The new approach is tested on gene expression datasets including colon cancer, CNS, DLBCL, and leukemia data. The testing results show that the performance of this algorithm is better than that of existing solutions. PMID:25780277

  10. A novel feature extraction approach for microarray data based on multi-algorithm fusion.

    PubMed

    Jiang, Zhu; Xu, Rong

    2015-01-01

    Feature extraction is one of the most important and effective methods for reducing dimensionality in data mining, given the emergence of high-dimensional data such as microarray gene expression data. Feature extraction for gene selection mainly serves two purposes. One is to identify certain disease-related genes. The other is to find a compact set of discriminative genes to build a pattern classifier with reduced complexity and improved generalization capabilities. Depending on the purpose of gene selection, two types of feature extraction algorithms, ranking-based feature extraction and set-based feature extraction, are employed in microarray gene expression data analysis. In ranking-based feature extraction, features are evaluated on an individual basis, generally without considering the inter-relationships between features, while set-based feature extraction evaluates features based on their role in a feature set, taking into account dependencies between features. Like learning methods, feature extraction faces a problem with its generalization ability, namely robustness; however, the issue of robustness is often overlooked in feature extraction. In order to improve the accuracy and robustness of feature extraction for microarray data, a novel approach based on multi-algorithm fusion is proposed. By fusing different types of feature extraction algorithms to select features from the sample set, the proposed approach is able to improve feature extraction performance. The new approach is tested on gene expression datasets including colon cancer, CNS, DLBCL, and leukemia data. The testing results show that the performance of this algorithm is better than that of existing solutions.
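
    To illustrate the fusion idea in general terms (the paper's actual fusion scheme may differ), the sketch below combines a ranking-based selector (univariate F-test) with a set-based selector (recursive feature elimination on a linear SVM) and keeps the genes chosen by both; the data and the intersection rule are hypothetical stand-ins:

```python
# Fuse a ranking-based and a set-based feature selector for microarray data.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif, RFE
from sklearn.svm import LinearSVC

rng = np.random.default_rng(7)
X = rng.normal(size=(62, 2000))                  # e.g., colon-cancer-sized expression matrix
y = rng.integers(0, 2, size=62)                  # tumor vs. normal labels

ranking_based = SelectKBest(f_classif, k=100).fit(X, y).get_support()
set_based = RFE(LinearSVC(dual=False), n_features_to_select=100, step=0.2).fit(X, y).get_support()

fused = np.flatnonzero(ranking_based & set_based)    # genes selected by both algorithms
print(f"{fused.size} genes retained by the fused selector")
```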

  11. Diagnostic and prognostic histopathology system using morphometric indices

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Parvin, Bahram; Chang, Hang; Han, Ju

    Determining at least one of a prognosis or a therapy for a patient based on a stained tissue section of the patient. An image of a stained tissue section of a patient is processed by a processing device. A set of features values for a set of cell-based features is extracted from the processed image, and the processed image is associated with a particular cluster of a plurality of clusters based on the set of feature values, where the plurality of clusters is defined with respect to a feature space corresponding to the set of features.
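
    As a rough sketch of the clustering step described (the feature values, cluster count, and clustering algorithm here are hypothetical stand-ins), the snippet below learns clusters over a reference cohort of cell-based feature vectors and then assigns a new image to its nearest cluster, which would in turn be linked to a prognosis or therapy:

```python
# Cluster cell-based feature vectors and assign a new image to a cluster.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(8)
cohort_features = rng.normal(size=(300, 6))      # e.g., nuclear size, shape, texture indices
clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit(cohort_features)

new_image_features = rng.normal(size=(1, 6))     # features extracted from a new stained section
assigned = clusters.predict(new_image_features)[0]
print(f"new section falls in cluster {assigned}")
```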

  12. The Impact of Individual Differences, Types of Model and Social Settings on Block Building Performance among Chinese Preschoolers.

    PubMed

    Tian, Mi; Deng, Zhu; Meng, Zhaokun; Li, Rui; Zhang, Zhiyi; Qi, Wenhui; Wang, Rui; Yin, Tingting; Ji, Menghui

    2018-01-01

    Children's block building performances are used as indicators of other abilities in multiple domains. In the current study, we examined individual differences, types of model and social settings as influences on children's block building performance. Chinese preschoolers (N = 180) participated in a block building activity in a natural setting, and performance was assessed with multiple measures in order to identify a range of specific skills. Using scores generated across these measures, three dependent variables were analyzed: block building skills, structural balance and structural features. An overall MANOVA showed that there were significant main effects of gender and grade level across most measures. Type of model showed no significant effect on children's block building. There was a significant main effect of social settings on structural features, with the best performance in the 5-member group, followed by individual building and then 10-member group building. These findings suggest that boys performed better than girls in the block building activity. Block building performance increased significantly from the first to the second year of preschool, but not from the second to the third. The preschoolers created more representational constructions when presented with a model made of wood rather than with a picture. There was partial evidence that children performed better when working with peers in a small group than when working alone or working in a large group. It is suggested that future studies should examine modalities other than the visual one, diversify the samples and adopt a longitudinal investigation.

  13. Prediction of recombinant protein overexpression in Escherichia coli using a machine learning based model (RPOLP).

    PubMed

    Habibi, Narjeskhatoon; Norouzi, Alireza; Mohd Hashim, Siti Z; Shamsir, Mohd Shahir; Samian, Razip

    2015-11-01

    Recombinant protein overexpression, an important biotechnological process, is governed by complex and mostly unknown biological rules, so an intelligent algorithm is needed to estimate the expression level of a recombinant protein and thereby avoid resource-intensive, lab-based trial-and-error experiments. The purpose of this study is to propose, for the first time in the literature, a predictive model that estimates the level of recombinant protein overexpression using a machine learning approach based on the sequence, expression vector, and expression host. The expression host was confined to Escherichia coli, which is the most popular bacterial host for overexpressing recombinant proteins. To provide a handle on the problem, the overexpression level was categorized as low, medium or high. A set of features likely to affect the overexpression level was generated based on known facts (e.g. gene length) and knowledge gathered from the related literature. Then, a representative subset of the generated features was determined using feature selection techniques. Finally, a predictive model was developed using a random forest classifier, which was able to adequately classify the multi-class, imbalanced, small dataset constructed. The results showed that the predictive model provided a promising accuracy of 80% on average in estimating the overexpression level of a recombinant protein. Copyright © 2015 Elsevier Ltd. All rights reserved.
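
    The abstract gives no implementation details; as a minimal sketch under assumed parameters, the code below trains a random forest on a small, imbalanced three-class problem (low / medium / high expression) with balanced class weights and cross-validated accuracy, using synthetic features standing in for the sequence- and vector-derived descriptors.

        # Illustrative sketch (not the authors' RPOLP code): random forest on a
        # small, imbalanced three-class dataset with balanced class weights.
        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import cross_val_score

        # Stand-in for the sequence/vector features described only qualitatively above.
        X, y = make_classification(n_samples=150, n_features=40, n_informative=10,
                                   n_classes=3, weights=[0.6, 0.3, 0.1],
                                   random_state=0)

        clf = RandomForestClassifier(n_estimators=500, class_weight="balanced",
                                     random_state=0)
        scores = cross_val_score(clf, X, y, cv=5)
        print("mean CV accuracy: %.2f" % scores.mean())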

  14. 48 CFR 19.501 - General.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ...). (i) Except as authorized by law, a contract may not be awarded as a result of a small business set... BUSINESS PROGRAMS Set-Asides for Small Business 19.501 General. (a) The purpose of small business set-asides is to award certain acquisitions exclusively to small business concerns. A “set-aside for small...

  15. 48 CFR 1319.202-70 - Small business set-aside review form.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 48 Federal Acquisition Regulations System 5 2010-10-01 2010-10-01 false Small business set-aside... COMMERCE SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Policies. 1319.202-70 Small business set-aside review form. Form CD 570, Small Business Set-Aside Review, shall be submitted for approval to the...

  16. 48 CFR 919.502-2 - Total small business set-asides.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 48 Federal Acquisition Regulations System 5 2013-10-01 2013-10-01 false Total small business set-asides. 919.502-2 Section 919.502-2 Federal Acquisition Regulations System DEPARTMENT OF ENERGY SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 919.502-2 Total small business set...

  17. 48 CFR 1319.202-70 - Small business set-aside review form.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 48 Federal Acquisition Regulations System 5 2014-10-01 2014-10-01 false Small business set-aside... COMMERCE SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Policies. 1319.202-70 Small business set-aside review form. Form CD 570, Small Business Set-Aside Review, shall be submitted for approval to the...

  18. 48 CFR 1419.506 - Withdrawing or modifying small business set-asides.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... small business set-asides. 1419.506 Section 1419.506 Federal Acquisition Regulations System DEPARTMENT OF THE INTERIOR SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 1419.506 Withdrawing or modifying small business set-asides. The HCA is authorized, without the power of redelegation...

  19. 48 CFR 1419.506 - Withdrawing or modifying small business set-asides.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... small business set-asides. 1419.506 Section 1419.506 Federal Acquisition Regulations System DEPARTMENT OF THE INTERIOR SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 1419.506 Withdrawing or modifying small business set-asides. The HCA is authorized, without the power of redelegation...

  20. 48 CFR 1319.202-70 - Small business set-aside review form.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 48 Federal Acquisition Regulations System 5 2011-10-01 2011-10-01 false Small business set-aside... COMMERCE SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Policies. 1319.202-70 Small business set-aside review form. Form CD 570, Small Business Set-Aside Review, shall be submitted for approval to the...

  1. 48 CFR 919.502-2 - Total small business set-asides.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 48 Federal Acquisition Regulations System 5 2014-10-01 2014-10-01 false Total small business set-asides. 919.502-2 Section 919.502-2 Federal Acquisition Regulations System DEPARTMENT OF ENERGY SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 919.502-2 Total small business set...

  2. 48 CFR 1419.506 - Withdrawing or modifying small business set-asides.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... small business set-asides. 1419.506 Section 1419.506 Federal Acquisition Regulations System DEPARTMENT OF THE INTERIOR SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 1419.506 Withdrawing or modifying small business set-asides. The HCA is authorized, without the power of redelegation...

  3. 48 CFR 1319.202-70 - Small business set-aside review form.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 48 Federal Acquisition Regulations System 5 2012-10-01 2012-10-01 false Small business set-aside... COMMERCE SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Policies. 1319.202-70 Small business set-aside review form. Form CD 570, Small Business Set-Aside Review, shall be submitted for approval to the...

  4. 48 CFR 919.502-2 - Total small business set-asides.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 48 Federal Acquisition Regulations System 5 2012-10-01 2012-10-01 false Total small business set-asides. 919.502-2 Section 919.502-2 Federal Acquisition Regulations System DEPARTMENT OF ENERGY SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 919.502-2 Total small business set...

  5. 48 CFR 1419.506 - Withdrawing or modifying small business set-asides.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... small business set-asides. 1419.506 Section 1419.506 Federal Acquisition Regulations System DEPARTMENT OF THE INTERIOR SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 1419.506 Withdrawing or modifying small business set-asides. The HCA is authorized, without the power of redelegation...

  6. 48 CFR 1319.202-70 - Small business set-aside review form.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 48 Federal Acquisition Regulations System 5 2013-10-01 2013-10-01 false Small business set-aside... COMMERCE SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Policies. 1319.202-70 Small business set-aside review form. Form CD 570, Small Business Set-Aside Review, shall be submitted for approval to the...

  7. 48 CFR 919.502-2 - Total small business set-asides.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 48 Federal Acquisition Regulations System 5 2011-10-01 2011-10-01 false Total small business set-asides. 919.502-2 Section 919.502-2 Federal Acquisition Regulations System DEPARTMENT OF ENERGY SOCIOECONOMIC PROGRAMS SMALL BUSINESS PROGRAMS Set-Asides for Small Business 919.502-2 Total small business set...

  8. Anomaly Detection Using an Ensemble of Feature Models

    PubMed Central

    Noto, Keith; Brodley, Carla; Slonim, Donna

    2011-01-01

    We present a new approach to semi-supervised anomaly detection. Given a set of training examples believed to come from the same distribution or class, the task is to learn a model that will be able to distinguish examples in the future that do not belong to the same class. Traditional approaches typically compare the position of a new data point to the set of “normal” training data points in a chosen representation of the feature space. For some data sets, the normal data may not have discernible positions in feature space, but do have consistent relationships among some features that fail to appear in the anomalous examples. Our approach learns to predict the values of training set features from the values of other features. After we have formed an ensemble of predictors, we apply this ensemble to new data points. To combine the contribution of each predictor in our ensemble, we have developed a novel, information-theoretic anomaly measure that our experimental results show selects against noisy and irrelevant features. Our results on 47 data sets show that for most data sets, this approach significantly improves performance over current state-of-the-art feature space distance and density-based approaches. PMID:22020249
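
    As a hedged sketch of the "ensemble of feature models" idea, the code below trains one regressor per feature to predict that feature from the others on normal data, then scores new points by prediction error. The paper combines predictors with an information-theoretic anomaly measure; plain squared error is used here purely for illustration, and all data are synthetic.

        # Minimal sketch: learn to predict each feature from the remaining
        # features on normal training data, then flag points the ensemble
        # predicts poorly.
        import numpy as np
        from sklearn.linear_model import Ridge

        rng = np.random.default_rng(0)
        X_train = rng.normal(size=(200, 5))
        X_train[:, 1] = 2 * X_train[:, 0] + 0.1 * rng.normal(size=200)  # a consistent relationship

        models = []
        for j in range(X_train.shape[1]):
            others = np.delete(X_train, j, axis=1)
            models.append(Ridge(alpha=1.0).fit(others, X_train[:, j]))

        def anomaly_score(x):
            """Sum of squared errors of each feature model on a single point x."""
            score = 0.0
            for j, m in enumerate(models):
                pred = m.predict(np.delete(x, j).reshape(1, -1))[0]
                score += (x[j] - pred) ** 2
            return score

        normal_point = np.array([1.0, 2.0, 0.0, 0.0, 0.0])
        odd_point = np.array([1.0, -2.0, 0.0, 0.0, 0.0])   # breaks the learned relationship
        print(anomaly_score(normal_point), anomaly_score(odd_point))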

  9. DEWS (DEep White matter hyperintensity Segmentation framework): A fully automated pipeline for detecting small deep white matter hyperintensities in migraineurs.

    PubMed

    Park, Bo-Yong; Lee, Mi Ji; Lee, Seung-Hak; Cha, Jihoon; Chung, Chin-Sang; Kim, Sung Tae; Park, Hyunjin

    2018-01-01

    Migraineurs show an increased load of white matter hyperintensities (WMHs) and more rapid deep WMH progression. Previous methods for WMH segmentation have limited efficacy to detect small deep WMHs. We developed a new fully automated detection pipeline, DEWS (DEep White matter hyperintensity Segmentation framework), for small and superficially-located deep WMHs. A total of 148 non-elderly subjects with migraine were included in this study. The pipeline consists of three components: 1) white matter (WM) extraction, 2) WMH detection, and 3) false positive reduction. In WM extraction, we adjusted the WM mask to re-assign misclassified WMHs back to WM using many sequential low-level image processing steps. In WMH detection, the potential WMH clusters were detected using an intensity based threshold and region growing approach. For false positive reduction, the detected WMH clusters were classified into final WMHs and non-WMHs using the random forest (RF) classifier. Size, texture, and multi-scale deep features were used to train the RF classifier. DEWS successfully detected small deep WMHs with a high positive predictive value (PPV) of 0.98 and true positive rate (TPR) of 0.70 in the training and test sets. Similar performance of PPV (0.96) and TPR (0.68) was attained in the validation set. DEWS showed a superior performance in comparison with other methods. Our proposed pipeline is freely available online to help the research community in quantifying deep WMHs in non-elderly adults.

  10. Fast detection of vascular plaque in optical coherence tomography images using a reduced feature set

    NASA Astrophysics Data System (ADS)

    Prakash, Ammu; Ocana Macias, Mariano; Hewko, Mark; Sowa, Michael; Sherif, Sherif

    2018-03-01

    Optical coherence tomography (OCT) images are capable of detecting vascular plaque by using the full set of 26 Haralick textural features and a standard K-means clustering algorithm. However, the use of the full set of 26 textural features is computationally expensive and may not be feasible for real-time implementation. In this work, we identified a reduced set of 3 textural features which characterizes vascular plaque and used a generalized Fuzzy C-means clustering algorithm. Our work involves three steps: 1) the reduction of the full set of 26 textural features to a reduced set of 3 textural features by using a genetic algorithm (GA) optimization method, 2) the implementation of an unsupervised generalized clustering algorithm (Fuzzy C-means) on the reduced feature space, and 3) the validation of our results using histology and actual photographic images of vascular plaque. Our results show an excellent match with histology and actual photographic images of vascular tissue. Therefore, our results could provide an efficient pre-clinical tool for the detection of vascular plaque in real-time OCT imaging.
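
    Only the clustering stage is easy to sketch from the abstract. The fragment below is a compact NumPy Fuzzy C-means run on a toy 3-feature space, assuming the GA-based reduction from 26 Haralick features to 3 has already been performed (that step is not reproduced here); all values are invented.

        # Compact Fuzzy C-means on a reduced 3-feature texture space (toy data).
        import numpy as np

        def fuzzy_cmeans(X, c=2, m=2.0, iters=100, seed=0):
            rng = np.random.default_rng(seed)
            U = rng.random((X.shape[0], c))
            U /= U.sum(axis=1, keepdims=True)            # memberships sum to 1 per pixel
            for _ in range(iters):
                Um = U ** m
                centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
                d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
                U = 1.0 / (d ** (2.0 / (m - 1)))
                U /= U.sum(axis=1, keepdims=True)
            return centers, U

        # Toy stand-in for per-pixel values of the 3 selected texture features.
        rng = np.random.default_rng(1)
        X = np.vstack([rng.normal(0, 1, (100, 3)), rng.normal(4, 1, (100, 3))])
        centers, U = fuzzy_cmeans(X, c=2)
        labels = U.argmax(axis=1)                         # hard assignment, e.g. plaque vs. non-plaque
        print(centers)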

  11. Classification of independent components of EEG into multiple artifact classes.

    PubMed

    Frølich, Laura; Andersen, Tobias S; Mørup, Morten

    2015-01-01

    In this study, we aim to automatically identify multiple artifact types in EEG. We used multinomial regression to classify independent components of EEG data, selecting from 65 spatial, spectral, and temporal features of independent components using forward selection. The classifier identified neural and five nonneural types of components. Between subjects within studies, high classification performances were obtained. Between studies, however, classification was more difficult. For neural versus nonneural classifications, performance was on par with previous results obtained by others. We found that automatic separation of multiple artifact classes is possible with a small feature set. Our method can reduce manual workload and allow for the selective removal of artifact classes. Identifying artifacts during EEG recording may be used to instruct subjects to refrain from activity causing them. Copyright © 2014 Society for Psychophysiological Research.
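
    As an illustration of the classification setup described above (not the authors' code), the sketch below runs greedy forward selection around a multinomial logistic regression on synthetic data standing in for the 65 spatial, spectral, and temporal IC features; class counts and selection size are assumptions.

        # Forward selection + multinomial logistic regression, sketched on toy data.
        from sklearn.datasets import make_classification
        from sklearn.feature_selection import SequentialFeatureSelector
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        # 1 neural + 5 artifact classes, 65 candidate features (synthetic stand-ins).
        X, y = make_classification(n_samples=300, n_features=65, n_informative=12,
                                   n_classes=6, random_state=0)

        clf = LogisticRegression(max_iter=1000)           # multinomial by default
        selector = SequentialFeatureSelector(clf, n_features_to_select=5,
                                             direction="forward", cv=5)
        model = make_pipeline(StandardScaler(), selector, clf)
        model.fit(X, y)
        chosen = model.named_steps["sequentialfeatureselector"].get_support().nonzero()[0]
        print("selected features:", chosen)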

  12. Nuclear Physics Around the Unitarity Limit

    DOE PAGES

    König, Sebastian; Grießhammer, Harald W.; Hammer, H. -W.; ...

    2017-05-15

    We argue that many features of the structure of nuclei emerge from a strictly perturbative expansion around the unitarity limit, where the two-nucleon S waves have bound states at zero energy. In this limit, the gross features of states in the nuclear chart are correlated to only one dimensionful parameter, which is related to the breaking of scale invariance to a discrete scaling symmetry and set by the triton binding energy. Observables are moved to their physical values by small perturbative corrections, much like in descriptions of the fine structure of atomic spectra. We provide evidence in favor of the conjecture that light, and possibly heavier, nuclei are bound weakly enough to be insensitive to the details of the interactions but strongly enough to be insensitive to the exact size of the two-nucleon system.

  13. Small Animal Models for Evaluating Filovirus Countermeasures.

    PubMed

    Banadyga, Logan; Wong, Gary; Qiu, Xiangguo

    2018-05-11

    The development of novel therapeutics and vaccines to treat or prevent disease caused by filoviruses, such as Ebola and Marburg viruses, depends on the availability of animal models that faithfully recapitulate clinical hallmarks of disease as it is observed in humans. In particular, small animal models (such as mice and guinea pigs) are historically and frequently used for the primary evaluation of antiviral countermeasures, prior to testing in nonhuman primates, which represent the gold-standard filovirus animal model. In the past several years, however, the filovirus field has witnessed the continued refinement of the mouse and guinea pig models of disease, as well as the introduction of the hamster and ferret models. We now have small animal models for most human-pathogenic filoviruses, many of which are susceptible to wild type virus and demonstrate key features of disease, including robust virus replication, coagulopathy, and immune system dysfunction. Although none of these small animal model systems perfectly recapitulates Ebola virus disease or Marburg virus disease on its own, collectively they offer a nearly complete set of tools in which to carry out the preclinical development of novel antiviral drugs.

  14. Feature Selection Methods for Zero-Shot Learning of Neural Activity.

    PubMed

    Caceres, Carlos A; Roos, Matthew J; Rupp, Kyle M; Milsap, Griffin; Crone, Nathan E; Wolmetz, Michael E; Ratto, Christopher R

    2017-01-01

    Dimensionality poses a serious challenge when making predictions from human neuroimaging data. Across imaging modalities, large pools of potential neural features (e.g., responses from particular voxels, electrodes, and temporal windows) have to be related to typically limited sets of stimuli and samples. In recent years, zero-shot prediction models have been introduced for mapping between neural signals and semantic attributes, which allows for classification of stimulus classes not explicitly included in the training set. While choices about feature selection can have a substantial impact when closed-set accuracy, open-set robustness, and runtime are competing design objectives, no systematic study of feature selection for these models has been reported. Instead, a relatively straightforward feature stability approach has been adopted and successfully applied across models and imaging modalities. To characterize the tradeoffs in feature selection for zero-shot learning, we compared correlation-based stability to several other feature selection techniques on comparable data sets from two distinct imaging modalities: functional Magnetic Resonance Imaging and Electrocorticography. While most of the feature selection methods resulted in similar zero-shot prediction accuracies and spatial/spectral patterns of selected features, there was one exception: a novel feature/attribute correlation approach was able to achieve those accuracies with far fewer features, suggesting the potential for simpler prediction models that yield high zero-shot classification accuracy.

  15. Feature Selection with Conjunctions of Decision Stumps and Learning from Microarray Data.

    PubMed

    Shah, M; Marchand, M; Corbeil, J

    2012-01-01

    One of the objectives of designing feature selection learning algorithms is to obtain classifiers that depend on a small number of attributes and have verifiable future performance guarantees. There are few, if any, approaches that successfully address the two goals simultaneously. To the best of our knowledge, such algorithms that give theoretical bounds on the future performance have not been proposed so far in the context of the classification of gene expression data. In this work, we investigate the premise of learning a conjunction (or disjunction) of decision stumps in Occam's Razor, Sample Compression, and PAC-Bayes learning settings for identifying a small subset of attributes that can be used to perform reliable classification tasks. We apply the proposed approaches for gene identification from DNA microarray data and compare our results to those of the well-known successful approaches proposed for the task. We show that, unlike other approaches, our algorithm not only finds hypotheses with a much smaller number of genes while giving competitive classification accuracy, but also provides tight risk guarantees on future performance. The proposed approaches are general and extensible in terms of both designing novel algorithms and application to other domains.

  16. Venus small volcano classification and description

    NASA Technical Reports Server (NTRS)

    Aubele, J. C.

    1993-01-01

    The high resolution and global coverage of the Magellan radar image data set allows detailed study of the smallest volcanoes on the planet. A modified classification scheme for volcanoes less than 20 km in diameter is shown and described. It is based on observations of all members of the 556 significant clusters or fields of small volcanoes located and described by this author during data collection for the Magellan Volcanic and Magmatic Feature Catalog. This global study of approximately 10^4 volcanoes provides new information for refining small volcano classification based on individual characteristics. The total number of these volcanoes was estimated to be 10^5 to 10^6 planetwide based on pre-Magellan analysis of Venera 15/16, and during preparation of the global catalog, small volcanoes were identified individually or in clusters in every C1-MIDR mosaic of the Magellan data set. Basal diameter (based on 1000 measured edifices) generally ranges from 2 to 12 km with a mode of 3-4 km, and follows an exponential distribution similar to the size-frequency distribution of seamounts as measured from GLORIA sonar images. This is a typical distribution for most size-limited natural phenomena, unlike impact craters, which follow a power-law distribution and continue to increase in number with decreasing size. Using an exponential distribution calculated from measured small volcanoes selected globally at random, we can calculate the total number possible given a minimum size. The paucity of edifice diameters less than 2 km may be due to the inability to identify very small volcanic edifices in this data set; however, summit pits are recognizable at smaller diameters, and 2 km may represent a significant minimum diameter related to style of volcanic eruption. Guest et al. discussed four general types of small volcanic edifices on Venus: (1) small lava shields; (2) small volcanic cones; (3) small volcanic domes; and (4) scalloped margin domes ('ticks'). Steep-sided domes or 'pancake domes', larger than 20 km in diameter, were included with the small volcanic domes. For the purposes of this study, only volcanic edifices less than 20 km in diameter are discussed. This forms a convenient cutoff since most of the steep-sided domes ('pancake domes') and scalloped margin domes ('ticks') are 20 to 100 km in diameter, are much less numerous globally than are the smaller diameter volcanic edifices (2 to 3 orders of magnitude lower in total global number), and do not commonly occur in large clusters or fields of large numbers of edifices.
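
    As a back-of-the-envelope illustration of the extrapolation mentioned above, the sketch below fits an exponential size-frequency distribution to simulated basal diameters and estimates how many edifices exceed a chosen minimum size; the scale parameter, counts, and cutoff are made-up stand-ins rather than Magellan measurements.

        # Toy exponential size-frequency extrapolation (all numbers invented).
        import numpy as np

        rng = np.random.default_rng(0)
        measured = 2.0 + rng.exponential(scale=2.5, size=1000)   # diameters (km) above the 2 km cutoff

        scale = (measured - 2.0).mean()                          # MLE of the exponential scale
        def n_above(d_min, n_total):
            """Expected number of edifices larger than d_min km, out of n_total above 2 km."""
            return n_total * np.exp(-(d_min - 2.0) / scale)

        print(n_above(5.0, n_total=1e5))                         # extrapolate a 10^5 population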

  17. Automated identification of diagnosis and co-morbidity in clinical records.

    PubMed

    Cano, C; Blanco, A; Peshkin, L

    2009-01-01

    Automated understanding of clinical records is a challenging task involving various legal and technical difficulties. Clinical free text is inherently redundant, unstructured, and full of acronyms, abbreviations and domain-specific language which make it challenging to mine automatically. There is much effort in the field focused on creating specialized ontologies, lexicons and heuristics based on expert knowledge of the domain. However, ad-hoc solutions poorly generalize across diseases or diagnoses. This paper presents a successful approach for rapid prototyping of a diagnosis classifier based on a popular computational linguistics platform. The corpus consists of several hundred full-length discharge summaries provided by Partners Healthcare. The goal is to identify a diagnosis and assign co-morbidity. Our approach is based on the rapid implementation of a logistic regression classifier using an existing toolkit: LingPipe (http://alias-i.com/lingpipe). We implement and compare three different classifiers. The baseline approach uses character 5-grams as features. The second approach uses a bag-of-words representation enriched with a small additional set of features. The third approach reduces a feature set to the most informative features according to the information content. The proposed systems achieve high performance (average F-micro 0.92) for the task. We discuss the relative merit of the three classifiers. Supplementary material with detailed results is available at: http://decsai.ugr.es/~ccano/LR/supplementary_material/ We show that our methodology for rapid prototyping of a domain-unaware system is effective for building an accurate classifier for clinical records.
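
    A minimal stand-in for the baseline classifier described above (character 5-grams feeding a logistic regression), built with scikit-learn rather than LingPipe; the two toy "discharge summaries" and labels are invented examples.

        # Character 5-gram logistic regression baseline, sketched on toy text.
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        docs = ["patient admitted with chest pain and elevated troponin ...",
                "history of type 2 diabetes mellitus, poorly controlled ..."]
        labels = ["myocardial_infarction", "diabetes"]

        model = make_pipeline(
            CountVectorizer(analyzer="char", ngram_range=(5, 5)),  # character 5-grams
            LogisticRegression(max_iter=1000),
        )
        model.fit(docs, labels)
        print(model.predict(["admitted with crushing chest pain ..."]))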

  18. 3-D seismic study into the origin of a large seafloor depression on the Chatham Rise, New Zealand

    NASA Astrophysics Data System (ADS)

    Pecher, I. A.; Waghorn, K. A.; Strachan, L. J.; Crutchley, G. J.; Bialas, J.; Sarkar, S.; Davy, B. W.; Papenberg, C. A.; Koch, S.; Eckardt, T.; Kroeger, K. F.; Rose, P. S.; Coffin, R. B.

    2014-12-01

    Vast areas of the Chatham Rise, east of New Zealand's South Island, are covered by circular to elliptical seafloor depressions. Distribution and size of these seafloor depressions appear to be linked to bathymetry: Small depressions several hundred meters in diameter are found in a depth range of ~500-800 m while two types of larger depressions with 2-5 km and >10 km in diameter, respectively, are present in water depths of 800-1100 m. Here we evaluate 3-D seismic reflection data acquired off the R/V Sonne in 2013 over one of the 2-5 km large depressions. We interpret that the seafloor bathymetry associated with the 2-5 km depressions was most likely created by contour current erosion and deposition. These contourite features are underlain by structures that indicate upward fluid flow, including polygonal fault networks and a conical feature that we interpret to result from sediment re-mobilization. We also discovered a set of smaller buried depressions immediately beneath the contourites. These features are directly connected to the stratigraphy containing the conical feature through sets of polygonal faults which truncate against the base of the paleo-depressions. We interpret these depressions as paleo-pockmarks resulting from fluid expulsion, presumably including gas. Based on interpretation and age correlation of a regional-scale seismic line, the paleo-pockmarks could be as old as 5.5 Ma. We suggest the resulting paleo-topography provided the initial roughness required to form mounded contourite deposits that lead to depressions in seafloor bathymetry.

  19. A dataset of images and morphological profiles of 30 000 small-molecule treatments using the Cell Painting assay

    PubMed Central

    Bray, Mark-Anthony; Gustafsdottir, Sigrun M; Rohban, Mohammad H; Singh, Shantanu; Ljosa, Vebjorn; Sokolnicki, Katherine L; Bittker, Joshua A; Bodycombe, Nicole E; Dančík, Vlado; Hasaka, Thomas P; Hon, Cindy S; Kemp, Melissa M; Li, Kejie; Walpita, Deepika; Wawer, Mathias J; Golub, Todd R; Schreiber, Stuart L; Clemons, Paul A; Shamji, Alykhan F

    2017-01-01

    Background: Large-scale image sets acquired by automated microscopy of perturbed samples enable a detailed comparison of cell states induced by each perturbation, such as a small molecule from a diverse library. Highly multiplexed measurements of cellular morphology can be extracted from each image and subsequently mined for a number of applications. Findings: This microscopy dataset includes 919,265 five-channel fields of view, representing 30,616 tested compounds, available at “The Cell Image Library” (CIL) repository. It also includes data files containing morphological features derived from each cell in each image, both at the single-cell level and population-averaged (i.e., per-well) level; the image analysis workflows that generated the morphological features are also provided. Quality-control metrics are provided as metadata, indicating fields of view that are out-of-focus or contain highly fluorescent material or debris. Lastly, chemical annotations are supplied for the compound treatments applied. Conclusions: Because computational algorithms and methods for handling single-cell morphological measurements are not yet routine, the dataset serves as a useful resource for the wider scientific community applying morphological (image-based) profiling. The dataset can be mined for many purposes, including small-molecule library enrichment and chemical mechanism-of-action studies, such as target identification. Integration with genetically perturbed datasets could enable identification of small-molecule mimetics of particular disease- or gene-related phenotypes that could be useful as probes or potential starting points for development of future therapeutics. PMID:28327978

  20. Martian cratering 11. Utilizing decameter scale crater populations to study Martian history

    NASA Astrophysics Data System (ADS)

    Hartmann, W. K.; Daubar, I. J.

    2017-03-01

    New information has been obtained in recent years regarding formation rates and the production size-frequency distribution (PSFD) of decameter-scale primary Martian craters formed during recent orbiter missions. Here we compare the PSFD of the currently forming small primaries (P) with new data on the PSFD of the total small crater population that includes primaries and field secondaries (P + fS), which represents an average over longer time periods. The two data sets, if used in a combined manner, have extraordinary potential for clarifying not only the evolutionary history and resurfacing episodes of small Martian geological formations (as small as one or few km2) but also possible episodes of recent climatic change. In response to recent discussions of statistical methodologies, we point out that crater counts do not produce idealized statistics, and that inherent uncertainties limit improvements that can be made by more sophisticated statistical analyses. We propose three mutually supportive procedures for interpreting crater counts of small craters in this context. Applications of these procedures support suggestions that topographic features in upper meters of mid-latitude ice-rich areas date only from the last few periods of extreme Martian obliquity, and associated predicted climate excursions.

  1. Robust Statistical Fusion of Image Labels

    PubMed Central

    Landman, Bennett A.; Asman, Andrew J.; Scoggins, Andrew G.; Bogovic, John A.; Xing, Fangxu; Prince, Jerry L.

    2011-01-01

    Image labeling and parcellation (i.e. assigning structure to a collection of voxels) are critical tasks for the assessment of volumetric and morphometric features in medical imaging data. The process of image labeling is inherently error prone as images are corrupted by noise and artifacts. Even expert interpretations are subject to the subjectivity and limited precision of the individual raters. Hence, all labels must be considered imperfect with some degree of inherent variability. One may seek multiple independent assessments to both reduce this variability and quantify the degree of uncertainty. Existing techniques have exploited maximum a posteriori statistics to combine data from multiple raters and simultaneously estimate rater reliabilities. Although quite successful, wide-scale application has been hampered by unstable estimation with practical datasets, for example, with label sets with small or thin objects to be labeled or with partial or limited datasets. Moreover, these approaches have required each rater to generate a complete dataset, which is often impossible given both human foibles and the typical turnover rate of raters in a research or clinical environment. Herein, we propose a robust approach to improve estimation performance with small anatomical structures, allow for missing data, account for repeated label sets, and utilize training/catch trial data. With this approach, numerous raters can label small, overlapping portions of a large dataset, and rater heterogeneity can be robustly controlled while simultaneously estimating a single, reliable label set and characterizing uncertainty. The proposed approach enables many individuals to collaborate in the construction of large datasets for labeling tasks (e.g., human parallel processing) and reduces the otherwise detrimental impact of rater unavailability. PMID:22010145

  2. High-density stretchable microelectrode arrays: An integrated technology platform for neural and muscular surface interfacing

    NASA Astrophysics Data System (ADS)

    Guo, Liang

    2011-12-01

    Numerous applications in neuroscience research and neural prosthetics, such as retinal prostheses, spinal-cord surface stimulation for prosthetics, electrocorticogram (ECoG) recording for epilepsy detection, etc., involve electrical interaction with soft excitable tissues using a surface stimulation and/or recording approach. These applications require an interface that is able to set up electrical communications with a high throughput between electronics and the excitable tissue and that can dynamically conform to the shape of the soft tissue. Being a compliant and biocompatible material with mechanical impedance close to that of soft tissues, polydimethylsiloxane (PDMS) offers excellent potential as the substrate material for such neural interfaces. However, fabrication of electrical functionalities on PDMS has long been very challenging. This thesis work has successfully overcome many challenges associated with PDMS-based microfabrication and achieved an integrated technology platform for PDMS-based stretchable microelectrode arrays (sMEAs). This platform features a set of technological advances: (1) we have fabricated uniform current density profile microelectrodes as small as 10 μm in diameter; (2) we have patterned high-resolution (feature as small as 10 μm), high-density (pitch as small as 20 μm) thin-film gold interconnects on PDMS substrate; (3) we have developed a multilayer wiring interconnect technology within the PDMS substrate to further boost the achievable integration density of such sMEA; and (4) we have invented a bonding technology, via-bonding, to facilitate high-resolution, high-density integration of the sMEA with integrated circuits (ICs) to form a compact implant. Taken together, this platform provides a high-resolution, high-density integrated system solution for neural and muscular surface interfacing. sMEAs of example designs are evaluated through in vitro and in vivo experimentations on their biocompatibility, surface conformability, and surface recording/stimulation capabilities, with a focus on epimysial (i.e. on the surface of muscle) applications. Finally, as an example medical application, we investigate a prosthesis for unilateral vocal cord paralysis (UVCP) based on simultaneous multichannel epimysial recording and stimulation.

  3. SU-E-T-630: Predictive Modeling of Mortality, Tumor Control, and Normal Tissue Complications After Stereotactic Body Radiotherapy for Stage I Non-Small Cell Lung Cancer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lindsay, WD; Oncora Medical, LLC, Philadelphia, PA; Berlind, CG

    Purpose: While rates of local control have been well characterized after stereotactic body radiotherapy (SBRT) for stage I non-small cell lung cancer (NSCLC), less data are available characterizing survival and normal tissue toxicities, and no validated models exist assessing these parameters after SBRT. We evaluate the reliability of various machine learning techniques when applied to radiation oncology datasets to create predictive models of mortality, tumor control, and normal tissue complications. Methods: A dataset of 204 consecutive patients with stage I non-small cell lung cancer (NSCLC) treated with stereotactic body radiotherapy (SBRT) at the University of Pennsylvania between 2009 and 2013 was used to create predictive models of tumor control, normal tissue complications, and mortality in this IRB-approved study. Nearly 200 data fields of detailed patient- and tumor-specific information, radiotherapy dosimetric measurements, and clinical outcomes data were collected. Predictive models were created for local tumor control, 1- and 3-year overall survival, and nodal failure using 60% of the data (leaving the remainder as a test set). After applying feature selection and dimensionality reduction, nonlinear support vector classification was applied to the resulting features. Models were evaluated for accuracy and area under ROC curve on the 81-patient test set. Results: Models for common events in the dataset (such as mortality at one year) had the highest predictive power (AUC = .67, p < 0.05). For rare occurrences such as radiation pneumonitis and local failure (each occurring in less than 10% of patients), too few events were present to create reliable models. Conclusion: Although this study demonstrates the validity of predictive analytics using information extracted from patient medical records and can most reliably predict survival after SBRT, larger sample sizes are needed to develop predictive models for normal tissue toxicities, and more advanced machine learning methodologies need to be considered in the future.

  4. Simulating the water budget of a Prairie Potholes complex from LiDAR and hydrological models in North Dakota, USA

    USGS Publications Warehouse

    Huang, Shengli; Young, Claudia; Abdul-Aziz, Omar I.; Dahal, Devendra; Feng, Min; Liu, Shuguang

    2013-01-01

    Hydrological processes of the wetland complex in the Prairie Pothole Region (PPR) are difficult to model, partly due to a lack of wetland morphology data. We used Light Detection And Ranging (LiDAR) data sets to derive wetland features; we then modelled rainfall, snowfall, snowmelt, runoff, evaporation, the “fill-and-spill” mechanism, shallow groundwater loss, and the effect of wet and dry conditions. For large wetlands with a volume greater than thousands of cubic metres (e.g. about 3000 m3), the modelled water volume agreed fairly well with observations; however, it did not succeed for small wetlands (e.g. volume less than 450 m3). Despite the failure for small wetlands, the modelled water area of the wetland complex coincided well with interpretation of aerial photographs, showing a linear regression with R2 of around 0.80 and a mean average error of around 0.55 km2. The next step is to improve the water budget modelling for small wetlands.

  5. The NST observation of a small loop eruption in He I D3 line on 2016 May 30

    NASA Astrophysics Data System (ADS)

    Kim, Yeon-Han; Xu, Yan; Bong, Su-Chan; Lim, Eunkyung; Yang, Heesu; Park, Young-Deuk; Yurchyshyn, Vasyl B.; Ahn, Kwangsu; Goode, Philip R.

    2017-08-01

    Since the He I D3 line has a unique response to a flare impact on the low solar atmosphere, it can be a powerful diagnostic tool for energy transport processes. In order to obtain comprehensive data sets for studying solar flare activities in the D3 spectral line, we performed observations for several days using the 1.6m New Solar Telescope of Big Bear Solar Observatory (BBSO) in 2015 and 2016, equipped with the He I D3 filter, the photospheric broadband filter, and the Near IR imaging spectrograph (NIRIS). On 2016 May 30, we observed a small loop eruption in He I D3 images associated with a B-class brightening, which occurred around 17:10 UT in a small active region, along with dynamic variations of photospheric features in G-band images. Accordingly, the cause of the loop eruption may be magnetic reconnection driven by photospheric plasma motions. In this presentation, we will give the observational results and their interpretation.

  6. Is performance in task-cuing experiments mediated by task set selection or associative compound retrieval?

    PubMed

    Forrest, Charlotte L D; Monsell, Stephen; McLaren, Ian P L

    2014-07-01

    Task-cuing experiments are usually intended to explore control of task set. But when small stimulus sets are used, they plausibly afford learning of the response associated with a combination of cue and stimulus, without reference to tasks. In 3 experiments we presented the typical trials of a task-cuing experiment: a cue (colored shape) followed, after a short or long interval, by a digit to which 1 of 2 responses was required. In a tasks condition, participants were (as usual) directed to interpret the cue as an instruction to perform either an odd/even or a high/low classification task. In a cue + stimulus → response (CSR) condition, to induce learning of mappings between cue-stimulus compound and response, participants were, in Experiment 1, given standard task instructions and additionally encouraged to learn the CSR mappings; in Experiment 2, informed of all the CSR mappings and asked to learn them, without standard task instructions; in Experiment 3, required to learn the mappings by trial and error. The effects of a task switch, response congruence, preparation, and transfer to a new set of stimuli differed substantially between the conditions in ways indicative of classification according to task rules in the tasks condition, and retrieval of responses specific to stimulus-cue combinations in the CSR conditions. Qualitative features of the latter could be captured by an associative learning network. Hence associatively based compound retrieval can serve as the basis for performance with a small stimulus set. But when organization by tasks is apparent, control via task set selection is the natural and efficient strategy. PsycINFO Database Record (c) 2014 APA, all rights reserved.

  7. Machine learning-based quantitative texture analysis of CT images of small renal masses: Differentiation of angiomyolipoma without visible fat from renal cell carcinoma.

    PubMed

    Feng, Zhichao; Rong, Pengfei; Cao, Peng; Zhou, Qingyu; Zhu, Wenwei; Yan, Zhimin; Liu, Qianyun; Wang, Wei

    2018-04-01

    To evaluate the diagnostic performance of machine-learning based quantitative texture analysis of CT images to differentiate small (≤ 4 cm) angiomyolipoma without visible fat (AMLwvf) from renal cell carcinoma (RCC). This single-institutional retrospective study included 58 patients with pathologically proven small renal mass (17 in AMLwvf and 41 in RCC groups). Texture features were extracted from the largest possible tumorous regions of interest (ROIs) by manual segmentation in preoperative three-phase CT images. Interobserver reliability and the Mann-Whitney U test were applied to select features preliminarily. Then support vector machine with recursive feature elimination (SVM-RFE) and synthetic minority oversampling technique (SMOTE) were adopted to establish discriminative classifiers, and the performance of classifiers was assessed. Of the 42 extracted features, 16 candidate features showed significant intergroup differences (P < 0.05) and had good interobserver agreement. An optimal feature subset including 11 features was further selected by the SVM-RFE method. The SVM-RFE+SMOTE classifier achieved the best performance in discriminating between small AMLwvf and RCC, with the highest accuracy, sensitivity, specificity and AUC of 93.9 %, 87.8 %, 100 % and 0.955, respectively. Machine learning analysis of CT texture features can facilitate the accurate differentiation of small AMLwvf from RCC. • Although conventional CT is useful for diagnosis of SRMs, it has limitations. • Machine-learning based CT texture analysis facilitate differentiation of small AMLwvf from RCC. • The highest accuracy of SVM-RFE+SMOTE classifier reached 93.9 %. • Texture analysis combined with machine-learning methods might spare unnecessary surgery for AMLwvf.
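
    As a hedged sketch of the SVM-RFE + SMOTE pipeline on synthetic data standing in for the 42 CT texture features (17 AMLwvf vs. 41 RCC is a small, imbalanced problem), the code below uses scikit-learn for RFE and the imbalanced-learn package for SMOTE; feature counts and parameters mirror the abstract but are otherwise assumptions.

        # SVM-RFE feature selection followed by SMOTE oversampling and an RBF SVM.
        from imblearn.over_sampling import SMOTE
        from sklearn.datasets import make_classification
        from sklearn.feature_selection import RFE
        from sklearn.model_selection import train_test_split
        from sklearn.svm import SVC, LinearSVC

        X, y = make_classification(n_samples=58, n_features=42, n_informative=10,
                                   weights=[0.3, 0.7], random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

        # Recursive feature elimination with a linear SVM keeps the 11 best features.
        rfe = RFE(LinearSVC(dual=False, max_iter=5000), n_features_to_select=11).fit(X_tr, y_tr)
        X_tr_sel, X_te_sel = rfe.transform(X_tr), rfe.transform(X_te)

        # Oversample the minority class, then train the final classifier.
        X_bal, y_bal = SMOTE(random_state=0, k_neighbors=3).fit_resample(X_tr_sel, y_tr)
        clf = SVC(kernel="rbf").fit(X_bal, y_bal)
        print("test accuracy: %.2f" % clf.score(X_te_sel, y_te))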

  8. 36 CFR 223.103 - Award of small business set-aside sales.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 36 Parks, Forests, and Public Property 2 2010-07-01 2010-07-01 false Award of small business set....103 Award of small business set-aside sales. If timber is advertised as set aside for competitive bidding by small business concerns, award will be made to the highest bidder who qualifies as a small...

  9. 48 CFR 5119.1070-2 - Emerging small business set-aside.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 48 Federal Acquisition Regulations System 7 2013-10-01 2012-10-01 true Emerging small business set... ACQUISITION REGULATIONS SMALL BUSINESS AND SMALL DISADVANTAGED BUSINESS CONCERNS Small Business Competitiveness Demonstration Program 5119.1070-2 Emerging small business set-aside. (a)(S-90) Solicitations for...

  10. Prediction of protein interaction hot spots using rough set-based multiple criteria linear programming.

    PubMed

    Chen, Ruoying; Zhang, Zhiwang; Wu, Di; Zhang, Peng; Zhang, Xinyang; Wang, Yong; Shi, Yong

    2011-01-21

    Protein-protein interactions are fundamentally important in many biological processes, and there is a pressing need to understand the principles of protein-protein interactions. Mutagenesis studies have found that only a small fraction of surface residues, known as hot spots, are responsible for the physical binding in protein complexes. However, revealing hot spots by mutagenesis experiments is usually time-consuming and expensive. In order to complement the experimental efforts, we propose a new computational approach in this paper to predict hot spots. Our method, Rough Set-based Multiple Criteria Linear Programming (RS-MCLP), integrates rough set theory and multiple criteria linear programming to choose dominant features and computationally predict hot spots. Our approach is benchmarked by a dataset of 904 alanine-mutated residues and the results show that our RS-MCLP method performs better than other methods, e.g., MCLP, Decision Tree, Bayes Net, and the existing HotSprint database. In addition, we reveal several biological insights based on our analysis. We find that four features (the change of accessible surface area, percentage of the change of accessible surface area, size of a residue, and atomic contacts) are critical in predicting hot spots. Furthermore, we find that three residues (Tyr, Trp, and Phe) are abundant in hot spots through analyzing the distribution of amino acids. Copyright © 2010 Elsevier Ltd. All rights reserved.

  11. Galactic Abundance Patterns via Peimbert Types I & II Planetary Nebulae

    NASA Astrophysics Data System (ADS)

    Milingo, J. B.; Barnes, K. L.; Kwitter, K. B.; Souza, S. P.; Henry, R. B. C.; Skinner, J. N.

    2005-12-01

    Planetary Nebulae (PNe) are well-known sources of information about both stellar evolution and galactic chemical evolution. Abundance patterns in PNe are used to note signatures and constraints of nuclear processing, and as tracers of the distribution of metals throughout galaxies. In this poster, abundance gradients and heavy element ratios are presented based upon newly acquired spectrophotometry of a sample of Galactic Peimbert Type I PNe. This new data set is extracted from spectra that extend from λ 3600-9600 Å, allowing the use of [S III] features at λ 9069 and 9532 Å. Since a significant portion of S in PNe resides in S+2 and higher ionization stages, including these features improves the extrapolation from observed ion abundances to total element abundance. As an alternative metallicity tracer, sulfur is precluded from enhancement and depletion across the range of PN progenitor masses. Its stability in intermediate-mass stars makes it a useful tool to probe the natal conditions as well as the evolution of PN progenitors. This is a continuation of our Type II PNe work, the impetus being to compile a relatively large set of line strengths and abundances with internally consistent observation, reduction, calibration, and abundance determination, minimizing systematic effects that come from compiling various data sets. This research is supported by the AAS Small Research Grants program, the Franklin & Marshall Committee on Grants, and NSF grant AST-0307118.

  12. Casting a Wider Net: Data Driven Discovery of Proxies for Target Diagnoses

    PubMed Central

    Ramljak, Dusan; Davey, Adam; Uversky, Alexey; Roychoudhury, Shoumik; Obradovic, Zoran

    2015-01-01

    Background: The Hospital Readmissions Reduction Program (HRRP), introduced in October 2012 as part of the Affordable Care Act (ACA), ties hospital reimbursement rates to adjusted 30-day readmissions and mortality performance for a small set of target diagnoses. There is growing concern and emerging evidence that use of a small set of target diagnoses to establish reimbursement rates can lead to unstable results that are susceptible to manipulation (gaming) by hospitals. Methods: We propose a novel approach to identifying co-occurring diagnoses and procedures that can themselves serve as a proxy indicator of the target diagnosis. The proposed approach constructs a Markov Blanket that allows a high level of performance, in terms of predictive accuracy and scalability, along with interpretability of obtained results. In order to scale to a large number of co-occurring diagnoses (features) and hospital discharge records (samples), our approach begins with Google’s PageRank algorithm and exploits the stability of obtained results to rank the contribution of each diagnosis/procedure in terms of presence in a Markov Blanket for outcome prediction. Results: Presence of the target diagnoses acute myocardial infarction (AMI), congestive heart failure (CHF), pneumonia (PN), and Sepsis in hospital discharge records for Medicare and Medicaid patients in California and New York state hospitals (2009–2011) was predicted using models trained on a subset of California state hospitals (2003–2008). Using repeated holdout evaluation, we used ~30,000,000 hospital discharge records and analyzed the stability of the proposed approach. Model performance was measured using the Area Under the ROC Curve (AUC) metric, and the importance and contribution of single features to the final result. The results varied from AUC=0.68 (with SE<1e-4) for PN on cross validation datasets to AUC=0.94 (with SE<1e-7) for Sepsis on California hospitals (2009–2011), while the stability of features was consistently better with more training data for each target diagnosis. Prediction accuracy for the considered target diagnoses approaches or exceeds accuracy estimates for discharge record data. Conclusions: This paper presents a novel approach to identifying a small subset of relevant diagnoses and procedures that approximate the Markov Blanket for target diagnoses. Accuracy and interpretability of results demonstrate the potential of our approach. PMID:26958243

  13. Landscape of Fluid Sets of Hairpin-Derived 21-/24-nt-Long Small RNAs at Seed Set Uncovers Special Epigenetic Features in Picea glauca.

    PubMed

    Liu, Yang; El-Kassaby, Yousry A

    2017-01-01

    Conifers' exceptionally large genome (20-30 Gb) is scattered with 60% retrotransposon (RT) components, and we have little knowledge of their origin and evolutionary implications. RTs may impede the expression of flanking genes and provide sources for the formation of novel small RNA (sRNA) populations to constrain events of transposon (TE) proliferation/transposition. Here we show a declining expression of 24-nt-long sRNAs and low expression levels of their key processing gene, pgRTL2 (RNASE THREE LIKE 2), at seed set in Picea glauca. The sRNAs in the 24-nt size class are significantly less enriched in type and read number than 21-nt sRNAs and have not been documented in other species. The architecture of MIR loci generating highly expressed 24-/21-nt sRNAs is featured by long terminal repeat-retrotransposons (LTR-RTs) in families of Ty3/Gypsy and Ty1/Copia elements. This implies that the production of sRNAs may predominantly originate from TE fragments on chromosomes. Furthermore, a large proportion of highly expressed 24-nt sRNAs does not have predictable targets against unique genes in Picea, suggestive of a potential role in DNA methylation modifications of, for instance, TEs. Additionally, the classification of computationally predicted sRNAs suggests that 24-nt sRNA targets may bear particular functions in metabolic processes while 21-nt sRNAs target genes involved in many different biological processes. This study, therefore, directs our attention to a possible extrapolation that a lack of 24-nt sRNAs at the late conifer seed developmental phase may result in fewer constraints on TE activities, thus contributing to the massive expansion of genome size. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  14. Improving the Accuracy and Training Speed of Motor Imagery Brain-Computer Interfaces Using Wavelet-Based Combined Feature Vectors and Gaussian Mixture Model-Supervectors.

    PubMed

    Lee, David; Park, Sang-Hoon; Lee, Sang-Goog

    2017-10-07

    In this paper, we propose a set of wavelet-based combined feature vectors and a Gaussian mixture model (GMM)-supervector to enhance training speed and classification accuracy in motor imagery brain-computer interfaces. The proposed method is configured as follows: first, wavelet transforms are applied to extract the feature vectors for identification of motor imagery electroencephalography (EEG) and principal component analyses are used to reduce the dimensionality of the feature vectors and linearly combine them. Subsequently, the GMM universal background model is trained by the expectation-maximization (EM) algorithm to purify the training data and reduce its size. Finally, a purified and reduced GMM-supervector is used to train the support vector machine classifier. The performance of the proposed method was evaluated for three different motor imagery datasets in terms of accuracy, kappa, mutual information, and computation time, and compared with the state-of-the-art algorithms. The results from the study indicate that the proposed method achieves high accuracy with a small amount of training data compared with the state-of-the-art algorithms in motor imagery EEG classification.
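
    As a rough sketch of the pipeline's main ingredients (not the authors' code), the fragment below computes wavelet-based frame features per EEG epoch, fits a GMM "universal background model" on pooled training frames, builds per-epoch supervectors by simple relevance-MAP adaptation of the GMM means, and trains an SVM on the supervectors; the data, window sizes, and relevance factor are all assumed toy values.

        # Wavelet features -> GMM universal background model -> supervectors -> SVM.
        import numpy as np
        import pywt
        from sklearn.mixture import GaussianMixture
        from sklearn.svm import SVC

        rng = np.random.default_rng(0)
        n_epochs, n_samples = 40, 256
        epochs = rng.normal(size=(n_epochs, n_samples))          # toy single-channel EEG epochs
        labels = np.tile([0, 1], n_epochs // 2)                  # toy motor-imagery labels

        def frame_features(epoch):
            """Per-epoch frame features: energies of db4 wavelet sub-bands over short windows."""
            frames = epoch.reshape(4, -1)                        # 4 windows of 64 samples
            feats = []
            for w in frames:
                coeffs = pywt.wavedec(w, "db4", level=3)
                feats.append([np.sum(c ** 2) for c in coeffs])   # 4 sub-band energies
            return np.array(feats)                               # shape (4, 4)

        all_frames = np.vstack([frame_features(e) for e in epochs])
        ubm = GaussianMixture(n_components=4, covariance_type="diag", random_state=0).fit(all_frames)

        def supervector(epoch, relevance=4.0):
            F = frame_features(epoch)
            gamma = ubm.predict_proba(F)                         # frame responsibilities
            n_k = gamma.sum(axis=0)
            F_k = (gamma.T @ F) / np.maximum(n_k[:, None], 1e-8)
            alpha = (n_k / (n_k + relevance))[:, None]
            adapted = alpha * F_k + (1 - alpha) * ubm.means_     # MAP-adapted means
            return adapted.ravel()

        X = np.vstack([supervector(e) for e in epochs])
        clf = SVC(kernel="linear").fit(X, labels)
        print("training accuracy:", clf.score(X, labels))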

  15. Feature reduction and payload location with WAM steganalysis

    NASA Astrophysics Data System (ADS)

    Ker, Andrew D.; Lubenko, Ivans

    2009-02-01

    WAM steganalysis is a feature-based classifier for detecting LSB matching steganography, presented in 2006 by Goljan et al. and demonstrated to be sensitive even to small payloads. This paper makes three contributions to the development of the WAM method. First, we benchmark some variants of WAM in a number of sets of cover images, and we are able to quantify the significance of differences in results between different machine learning algorithms based on WAM features. It turns out that, like many of its competitors, WAM is not effective in certain types of cover, and furthermore it is hard to predict which types of cover are suitable for WAM steganalysis. Second, we demonstrate that only a few of the features used in WAM steganalysis do almost all of the work, so that a simplified WAM steganalyser can be constructed in exchange for a little less detection power. Finally, we demonstrate how the WAM method can be extended to provide forensic tools to identify the location (and potentially content) of LSB matching payload, given a number of stego images with payload placed in the same locations. Although easily evaded, this is a plausible situation if the same stego key is mistakenly re-used for embedding in multiple images.

  16. Anonymization of electronic medical records for validating genome-wide association studies

    PubMed Central

    Loukides, Grigorios; Gkoulalas-Divanis, Aris; Malin, Bradley

    2010-01-01

    Genome-wide association studies (GWAS) facilitate the discovery of genotype–phenotype relations from population-based sequence databases, which is an integral facet of personalized medicine. The increasing adoption of electronic medical records allows large amounts of patients’ standardized clinical features to be combined with the genomic sequences of these patients and shared to support validation of GWAS findings and to enable novel discoveries. However, disseminating these data “as is” may lead to patient reidentification when genomic sequences are linked to resources that contain the corresponding patients’ identity information based on standardized clinical features. This work proposes an approach that provably prevents this type of data linkage and furnishes a result that helps support GWAS. Our approach automatically extracts potentially linkable clinical features and modifies them in such a way that they can no longer be used to link a genomic sequence to a small number of patients, while preserving the associations between genomic sequences and specific sets of clinical features corresponding to GWAS-related diseases. Extensive experiments with real patient data derived from the Vanderbilt University Medical Center verify that our approach generates data that eliminate the threat of individual reidentification, while supporting GWAS validation and clinical case analysis tasks. PMID:20385806

  17. True polar wander on Europa from global-scale small-circle depressions.

    PubMed

    Schenk, Paul; Matsuyama, Isamu; Nimmo, Francis

    2008-05-15

    The tectonic patterns and stress history of Europa are exceedingly complex and many large-scale features remain unexplained. True polar wander, involving reorientation of Europa's floating outer ice shell about the tidal axis with Jupiter, has been proposed as a possible explanation for some of the features. This mechanism is possible if the icy shell is latitudinally variable in thickness and decoupled from the rocky interior. It would impose high stress levels on the shell, leading to predictable fracture patterns. No satisfactory match to global-scale features has hitherto been found for polar wander stress patterns. Here we describe broad arcuate troughs and depressions on Europa that do not fit other proposed stress mechanisms in their current position. Using imaging from three spacecraft, we have mapped two global-scale organized concentric antipodal sets of arcuate troughs up to hundreds of kilometres long and 300 m to approximately 1.5 km deep. An excellent match to these features is found with stresses caused by an episode of approximately 80 degrees true polar wander. These depressions also appear to be geographically related to other large-scale bright and dark lineaments, suggesting that many of Europa's tectonic patterns may also be related to true polar wander.

  18. Business grants

    NASA Astrophysics Data System (ADS)

    Twelve small businesses that are developing equipment and computer programs for geophysics have won Small Business Innovative Research (SBIR) grants from the National Science Foundation for their 1989 proposals. The SBIR program was set up to encourage the private sector to undertake costly, advanced experimental work that has potential for great benefit. The geophysical research projects are a long-path intracavity laser spectrometer for measuring atmospheric trace gases, optimizing a local weather forecast model, a new platform for high-altitude atmospheric science, an advanced density logging tool, a deep-Earth sampling system, superconducting seismometers, a phased-array Doppler current profiler, monitoring mesoscale surface features of the ocean through automated analysis, krypton-81 dating in polar ice samples, discrete stochastic modeling of thunderstorm winds, a layered soil-synthetic liner base system to isolate buildings from earthquakes, and a low-cost continuous on-line organic-content monitor for water-quality determination.

  19. Brain tumor detection and segmentation in a CRF (conditional random fields) framework with pixel-pairwise affinity and superpixel-level features.

    PubMed

    Wu, Wei; Chen, Albert Y C; Zhao, Liang; Corso, Jason J

    2014-03-01

    Detection and segmentation of a brain tumor such as glioblastoma multiforme (GBM) in magnetic resonance (MR) images are often challenging due to its intrinsically heterogeneous signal characteristics. A robust segmentation method for brain tumor MRI scans was developed and tested. Simple thresholds and statistical methods are unable to adequately segment the various elements of the GBM, such as local contrast enhancement, necrosis, and edema. Most voxel-based methods cannot achieve satisfactory results in larger data sets, and the methods based on generative or discriminative models have intrinsic limitations during application, such as small sample set learning and transfer. A new method was developed to overcome these challenges. Multimodal MR images were first segmented into superpixels to alleviate the sampling issue and to improve sample representativeness. Next, features were extracted from the superpixels using multi-level Gabor wavelet filters. Based on these features, a support vector machine (SVM) model and an affinity metric model for tumors were trained to overcome the limitations of previous generative models. Based on the output of the SVM and spatial affinity models, conditional random field theory was applied to segment the tumor in a maximum a posteriori fashion given the smoothness prior defined by our affinity model. Finally, labeling noise was removed using "structural knowledge" such as the symmetrical and continuous characteristics of the tumor in the spatial domain. The system was evaluated with 20 GBM cases and the BraTS challenge data set. Dice coefficients were computed, and the results were highly consistent with those reported by Zikic et al. (MICCAI 2012, Lecture notes in computer science. vol 7512, pp 369-376, 2012). A brain tumor segmentation method using model-aware affinity demonstrates performance comparable with other state-of-the-art algorithms.

  20. Accurate prediction of personalized olfactory perception from large-scale chemoinformatic features.

    PubMed

    Li, Hongyang; Panwar, Bharat; Omenn, Gilbert S; Guan, Yuanfang

    2018-02-01

    The olfactory stimulus-percept problem has been studied for more than a century, yet it is still hard to precisely predict the odor given the large-scale chemoinformatic features of an odorant molecule. A major challenge is that the perceived qualities vary greatly among individuals due to different genetic and cultural backgrounds. Moreover, the combinatorial interactions between multiple odorant receptors and diverse molecules significantly complicate the olfaction prediction. Many attempts have been made to establish structure-odor relationships for intensity and pleasantness, but no models are available to predict the personalized multi-odor attributes of molecules. In this study, we describe our winning algorithm for predicting individual and population perceptual responses to various odorants in the DREAM Olfaction Prediction Challenge. We find that a random forest model consisting of multiple decision trees is well suited to this prediction problem, given the large feature spaces and high variability of perceptual ratings among individuals. Integrating both population and individual perceptions into our model effectively reduces the influence of noise and outliers. By analyzing the importance of each chemical feature, we find that a small set of low- and nondegenerative features is sufficient for accurate prediction. Our random forest model successfully predicts personalized odor attributes of structurally diverse molecules. This model together with the top discriminative features has the potential to extend our understanding of olfactory perception mechanisms and provide an alternative for rational odorant design.
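
    A minimal sketch of the kind of model described above (not the authors' winning pipeline): a random forest regressor is fit to placeholder chemoinformatic-style features, and its impurity-based importances are used to pick out a small predictive feature subset. All sizes and feature indices below are hypothetical.

      import numpy as np
      from sklearn.ensemble import RandomForestRegressor

      rng = np.random.default_rng(0)
      n_molecules, n_features = 200, 1000          # hypothetical data set size
      X = rng.normal(size=(n_molecules, n_features))
      # Synthetic perceptual rating driven by a small subset of the features.
      informative = [3, 42, 137, 256, 512]
      y = X[:, informative] @ rng.normal(size=len(informative)) + 0.1 * rng.normal(size=n_molecules)

      forest = RandomForestRegressor(n_estimators=500, random_state=0)
      forest.fit(X, y)

      # Rank features by impurity-based importance and keep a small set.
      top = np.argsort(forest.feature_importances_)[::-1][:10]
      print("top-ranked features:", sorted(top.tolist()))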

  1. Integrating dimension reduction and out-of-sample extension in automated classification of ex vivo human patellar cartilage on phase contrast X-ray computed tomography.

    PubMed

    Nagarajan, Mahesh B; Coan, Paola; Huber, Markus B; Diemoz, Paul C; Wismüller, Axel

    2015-01-01

    Phase contrast X-ray computed tomography (PCI-CT) has been demonstrated as a novel imaging technique that can visualize human cartilage with high spatial resolution and soft tissue contrast. Different textural approaches have been previously investigated for characterizing chondrocyte organization on PCI-CT to enable classification of healthy and osteoarthritic cartilage. However, the large size of feature sets extracted in such studies motivates an investigation into algorithmic feature reduction for computing efficient feature representations without compromising their discriminatory power. For this purpose, geometrical feature sets derived from the scaling index method (SIM) were extracted from 1392 volumes of interest (VOI) annotated on PCI-CT images of ex vivo human patellar cartilage specimens. The extracted feature sets were subject to linear and non-linear dimension reduction techniques as well as feature selection based on evaluation of mutual information criteria. The reduced feature set was subsequently used in a machine learning task with support vector regression to classify VOIs as healthy or osteoarthritic; classification performance was evaluated using the area under the receiver-operating characteristic (ROC) curve (AUC). Our results show that the classification performance achieved by 9-D SIM-derived geometric feature sets (AUC: 0.96 ± 0.02) can be maintained with 2-D representations computed from both dimension reduction and feature selection (AUC values as high as 0.97 ± 0.02). Thus, such feature reduction techniques can offer a high degree of compaction to large feature sets extracted from PCI-CT images while maintaining their ability to characterize the underlying chondrocyte patterns.
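
    The dimension-reduction-plus-classification step described above can be sketched as follows, using placeholder data rather than the SIM-derived features or PCI-CT volumes of the study: PCA compresses a 9-D feature set to a 2-D representation, support vector regression produces continuous scores for the two classes, and the AUC is computed from those scores.

      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.svm import SVR
      from sklearn.model_selection import train_test_split
      from sklearn.metrics import roc_auc_score

      rng = np.random.default_rng(1)
      n_voi, n_dim = 300, 9                         # placeholder for 9-D geometric features
      X = rng.normal(size=(n_voi, n_dim))
      y = (X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.normal(size=n_voi) > 0).astype(float)

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

      # Reduce the feature set to 2-D, then score VOIs with support vector regression.
      pca = PCA(n_components=2).fit(X_tr)
      svr = SVR(kernel="rbf").fit(pca.transform(X_tr), y_tr)
      scores = svr.predict(pca.transform(X_te))     # continuous scores for ROC analysis
      print("AUC:", roc_auc_score(y_te, scores))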

  2. Joint classification and contour extraction of large 3D point clouds

    NASA Astrophysics Data System (ADS)

    Hackel, Timo; Wegner, Jan D.; Schindler, Konrad

    2017-08-01

    We present an effective and efficient method for point-wise semantic classification and extraction of object contours of large-scale 3D point clouds. What makes point cloud interpretation challenging is the sheer size of several millions of points per scan and the non-grid, sparse, and uneven distribution of points. Standard image processing tools like texture filters, for example, cannot handle such data efficiently, which calls for dedicated point cloud labeling methods. It turns out that one of the major drivers for efficient computation and handling of strong variations in point density is a careful formulation of per-point neighborhoods at multiple scales. This allows us both to define an expressive feature set and to extract topologically meaningful object contours. Semantic classification and contour extraction are interlaced problems. Point-wise semantic classification enables extracting a meaningful candidate set of contour points while contours help generate a rich feature representation that benefits point-wise classification. These methods are tailored to have fast run time and small memory footprint for processing large-scale, unstructured, and inhomogeneous point clouds, while still achieving high classification accuracy. We evaluate our methods on the semantic3d.net benchmark for terrestrial laser scans with >10^9 points.

  3. Pin stripe lamination: A distinctive feature of modern and ancient eolian sediments

    USGS Publications Warehouse

    Fryberger, S.G.; Schenk, C.J.

    1988-01-01

    Pin stripe laminations are a distinctive feature of modern and ancient eolian sediments. In sets of eolian ripple (or translatent) strata they represent deposition of silt and very fine sand in the troughs of the advancing wind ripples. In sets of avalanche strata they probably result from the downward settling of fine sand and silt within the moving avalanche to the interface of moving and unmoving sands. Wind tunnel experiments suggest that pin stripe laminations can also form in grainfall deposits. The textural segregation associated with deposition of the fine layers in most cases leads to early cementation along and near the finest sand and silt comprising the pin stripe lamination. The pin stripe effect seen in outcrops is usually due to resistance to weathering along such cemented zones. The cementation of the pin stripe laminations can occur early in the history of diagenesis and thus may provide clues to the post-depositional history of the rock. Pin stripe laminations in many instances represent the sequestering of the small population of ultrafine sediment present in most eolian depositional systems. They may prove useful in the recognition of ancient eolian sediments. © 1988.

  4. Feature Selection Methods for Zero-Shot Learning of Neural Activity

    PubMed Central

    Caceres, Carlos A.; Roos, Matthew J.; Rupp, Kyle M.; Milsap, Griffin; Crone, Nathan E.; Wolmetz, Michael E.; Ratto, Christopher R.

    2017-01-01

    Dimensionality poses a serious challenge when making predictions from human neuroimaging data. Across imaging modalities, large pools of potential neural features (e.g., responses from particular voxels, electrodes, and temporal windows) have to be related to typically limited sets of stimuli and samples. In recent years, zero-shot prediction models have been introduced for mapping between neural signals and semantic attributes, which allows for classification of stimulus classes not explicitly included in the training set. While choices about feature selection can have a substantial impact when closed-set accuracy, open-set robustness, and runtime are competing design objectives, no systematic study of feature selection for these models has been reported. Instead, a relatively straightforward feature stability approach has been adopted and successfully applied across models and imaging modalities. To characterize the tradeoffs in feature selection for zero-shot learning, we compared correlation-based stability to several other feature selection techniques on comparable data sets from two distinct imaging modalities: functional Magnetic Resonance Imaging and Electrocorticography. While most of the feature selection methods resulted in similar zero-shot prediction accuracies and spatial/spectral patterns of selected features, there was one exception: a novel feature/attribute correlation approach was able to achieve those accuracies with far fewer features, suggesting the potential for simpler prediction models that yield high zero-shot classification accuracy. PMID:28690513
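
    The feature/attribute correlation idea lends itself to a compact sketch. The code below assumes placeholder neural features and a single synthetic semantic attribute, and simply keeps the features with the largest absolute Pearson correlation with that attribute; it is not the authors' implementation.

      import numpy as np

      rng = np.random.default_rng(2)
      n_trials, n_features = 120, 5000              # hypothetical neural data set
      X = rng.normal(size=(n_trials, n_features))   # e.g. voxel or electrode responses
      attribute = X[:, 7] - X[:, 91] + rng.normal(scale=2.0, size=n_trials)  # one semantic attribute

      # Score each feature by |Pearson r| with the attribute and keep the top k.
      Xc = X - X.mean(axis=0)
      ac = attribute - attribute.mean()
      r = (Xc * ac[:, None]).sum(axis=0) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(ac) + 1e-12)
      selected = np.argsort(np.abs(r))[::-1][:50]   # a small feature set
      print("ten highest-scoring features:", np.sort(selected[:10]))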

  5. Stabilizing l1-norm prediction models by supervised feature grouping.

    PubMed

    Kamkar, Iman; Gupta, Sunil Kumar; Phung, Dinh; Venkatesh, Svetha

    2016-02-01

    Emerging Electronic Medical Records (EMRs) have reformed modern healthcare. These records have great potential to be used for building clinical prediction models. However, a problem in using them is their high dimensionality. Since a lot of information may not be relevant for prediction, the underlying complexity of the prediction models may not be high. A popular way to deal with this problem is to employ feature selection. Lasso and l1-norm based feature selection methods have shown promising results. But, in the presence of correlated features, these methods select features that change considerably with small changes in data. This prevents clinicians from obtaining a stable feature set, which is crucial for clinical decision making. Grouping correlated variables together can improve the stability of feature selection; however, such grouping is usually not known and needs to be estimated for optimal performance. Addressing this problem, we propose a new model that can simultaneously learn the grouping of correlated features and perform stable feature selection. We formulate the model as a constrained optimization problem and provide an efficient solution with guaranteed convergence. Our experiments with both synthetic and real-world datasets show that the proposed model is significantly more stable than Lasso and many existing state-of-the-art shrinkage and classification methods. We further show that in terms of prediction performance, the proposed method consistently outperforms Lasso and other baselines. Our model can be used for selecting stable risk factors for a variety of healthcare problems, so it can assist clinicians toward accurate decision making. Copyright © 2015 Elsevier Inc. All rights reserved.
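
    The instability of Lasso selection under correlated features, which motivates the grouping model above, can be illustrated with a small bootstrap experiment on synthetic data (a sketch, not the authors' method): the members of a correlated block are selected only sporadically across resamples.

      import numpy as np
      from sklearn.linear_model import Lasso

      rng = np.random.default_rng(3)
      n, p, n_boot = 100, 50, 100
      # A block of 10 strongly correlated predictors -- the setting where Lasso is unstable.
      z = rng.normal(size=(n, 1))
      X = rng.normal(size=(n, p))
      X[:, :10] = z + 0.05 * rng.normal(size=(n, 10))
      y = z.ravel() + 0.5 * rng.normal(size=n)

      # How often is each feature selected across bootstrap resamples of the data?
      freq = np.zeros(p)
      for _ in range(n_boot):
          idx = rng.integers(0, n, n)
          coef = Lasso(alpha=0.1).fit(X[idx], y[idx]).coef_
          freq += coef != 0
      print("selection frequency, correlated block:", np.round(freq[:10] / n_boot, 2))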

  6. Reproducibility of radiomics for deciphering tumor phenotype with imaging

    NASA Astrophysics Data System (ADS)

    Zhao, Binsheng; Tan, Yongqiang; Tsai, Wei-Yann; Qi, Jing; Xie, Chuanmiao; Lu, Lin; Schwartz, Lawrence H.

    2016-03-01

    Radiomics (radiogenomics) characterizes tumor phenotypes based on quantitative image features derived from routine radiologic imaging to improve cancer diagnosis, prognosis, prediction and response to therapy. Although radiomic features must be reproducible to qualify as biomarkers for clinical care, little is known about how routine imaging acquisition techniques/parameters affect reproducibility. To begin to fill this knowledge gap, we assessed the reproducibility of a comprehensive, commonly-used set of radiomic features using a unique, same-day repeat computed tomography data set from lung cancer patients. Each scan was reconstructed at 6 imaging settings, varying slice thicknesses (1.25 mm, 2.5 mm and 5 mm) and reconstruction algorithms (sharp, smooth). Reproducibility was assessed using the repeat scans reconstructed at identical imaging setting (6 settings in total). In separate analyses, we explored differences in radiomic features due to different imaging parameters by assessing the agreement of these radiomic features extracted from the repeat scans reconstructed at the same slice thickness but different algorithms (3 settings in total). Our data suggest that radiomic features are reproducible over a wide range of imaging settings. However, smooth and sharp reconstruction algorithms should not be used interchangeably. These findings will raise awareness of the importance of properly setting imaging acquisition parameters in radiomics/radiogenomics research.

  7. Segmentation of magnetic resonance images using fuzzy algorithms for learning vector quantization.

    PubMed

    Karayiannis, N B; Pai, P I

    1999-02-01

    This paper evaluates a segmentation technique for magnetic resonance (MR) images of the brain based on fuzzy algorithms for learning vector quantization (FALVQ). These algorithms perform vector quantization by updating all prototypes of a competitive network through an unsupervised learning process. Segmentation of MR images is formulated as an unsupervised vector quantization process, where the local values of different relaxation parameters form the feature vectors which are represented by a relatively small set of prototypes. The experiments evaluate a variety of FALVQ algorithms in terms of their ability to identify different tissues and discriminate between normal tissues and abnormalities.

  8. Letter processing and font information during reading: beyond distinctiveness, where vision meets design.

    PubMed

    Sanocki, Thomas; Dyson, Mary C

    2012-01-01

    Letter identification is a critical front end of the reading process. In general, conceptualizations of the identification process have emphasized arbitrary sets of distinctive features. However, a richer view of letter processing incorporates principles from the field of type design, including an emphasis on uniformities across letters within a font. The importance of uniformities is supported by a small body of research indicating that consistency of font increases letter identification efficiency. We review design concepts and the relevant literature, with the goal of stimulating further thinking about letter processing during reading.

  9. How many breaks do we need to CATCH on 22q11?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dallapiccola, B.; Pizzuti, A.; Novelli, G.

    1996-07-01

    The major clinical manifestations of DiGeorge syndrome (DGS; MIM 188400), which reflect developmental abnormalities of the 3rd and 4th pharyngeal pouch derivatives, include thymus- and parathyroid-gland aplasia or hypoplasia and conotruncal cardiac malformations. The additional dysmorphic facial features, such as hypertelorism, cleft lip and palate, bifid uvula, and small/low-set ears, which are also common, presumably reflect the same defect. The DGS phenotype has been associated with chromosome abnormalities and, sometimes, is the effect of teratogenic agents such as retinoic acid and alcohol. 53 refs., 1 fig.

  10. A new autosomal dominant syndrome of distinctive face showing ptosis and prominent eyes associated with cleft palate, ear anomalies, and learning disability.

    PubMed

    Tyshchenko, N; Neuhann, T M; Gerlach, E; Hahn, G; Heisch, K; Rump, A; Schrock, E; Tinschert, S; Hackmann, K

    2011-09-01

    We report on three patients from two families with an apparently novel clinical entity, the main features of which include unusual craniofacial dysmorphism with ptosis, prominent eyes, flat midface, Cupid's bow configuration of the upper lip, low-set, posteriorly rotated small ears, as well as conductive hearing loss, cleft palate, heart defect, and mild developmental delay. We suggest that this entity is an autosomal dominant disorder given the occurrence in a mother and daughter as well as in an unrelated boy. Copyright © 2011 Wiley-Liss, Inc.

  11. Feature Detection in SAR Interferograms With Missing Data Displays Fault Slip Near El Mayor-Cucapah and South Napa Earthquakes

    NASA Astrophysics Data System (ADS)

    Parker, J. W.; Donnellan, A.; Glasscoe, M. T.; Stough, T.

    2015-12-01

    Edge detection identifies seismic or aseismic fault motion, as demonstrated in repeat-pass interferograms obtained by the Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR) program. But this identification, demonstrated in 2010, was not robust: for best results, it requires a flattened background image, interpolation into missing data (holes) and outliers, and background noise that is either sufficiently small or roughly white Gaussian. Proper treatment of missing data, bursting noise patches, and tiny noise differences at short distances apart from bursts are essential to creating an acceptably reliable method sensitive to small near-surface fractures. Clearly a robust method is needed for machine scanning of the thousands of UAVSAR repeat-pass interferograms for evidence of fault slip, landslides, and other local features: hand-crafted intervention will not do. Effective methods of identifying, removing and filling in bad pixels reveal significant features of surface fractures. A rich network of edges (probably fractures and subsidence) in difference images spanning the South Napa earthquake gives way to a simple set of postseismically slipping faults. Coseismic El Mayor-Cucapah interferograms compared to post-seismic difference images show nearly disjoint patterns of surface fractures in California's Sonoran Desert; the combined pattern reveals a network of near-perpendicular, probably conjugate faults not mapped before the earthquake. The current algorithms for UAVSAR interferogram edge detection are shown to be effective in difficult environments, including agricultural (Napa, Imperial Valley) and difficult urban areas (Orange County).

  12. Computer aided detection of clusters of microcalcifications on full field digital mammograms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ge Jun; Sahiner, Berkman; Hadjiiski, Lubomir M.

    2006-08-15

    We are developing a computer-aided detection (CAD) system to identify microcalcification clusters (MCCs) automatically on full field digital mammograms (FFDMs). The CAD system includes six stages: preprocessing; image enhancement; segmentation of microcalcification candidates; false positive (FP) reduction for individual microcalcifications; regional clustering; and FP reduction for clustered microcalcifications. At the stage of FP reduction for individual microcalcifications, a truncated sum-of-squares error function was used to improve the efficiency and robustness of the training of an artificial neural network in our CAD system for FFDMs. At the stage of FP reduction for clustered microcalcifications, morphological features and features derived from the artificial neural network outputs were extracted from each cluster. Stepwise linear discriminant analysis (LDA) was used to select the features. An LDA classifier was then used to differentiate clustered microcalcifications from FPs. A data set of 96 cases with 192 images was collected at the University of Michigan. This data set contained 96 MCCs, of which 28 clusters were proven by biopsy to be malignant and 68 were proven to be benign. The data set was separated into two independent data sets for training and testing of the CAD system in a cross-validation scheme. When one data set was used to train and validate the convolution neural network (CNN) in our CAD system, the other data set was used to evaluate the detection performance. With the use of a truncated error metric, the training of CNN could be accelerated and the classification performance was improved. The CNN in combination with an LDA classifier could substantially reduce FPs with a small tradeoff in sensitivity. By using the free-response receiver operating characteristic methodology, it was found that our CAD system can achieve a cluster-based sensitivity of 70, 80, and 90% at 0.21, 0.61, and 1.49 FPs/image, respectively. For case-based performance evaluation, a sensitivity of 70, 80, and 90% can be achieved at 0.07, 0.17, and 0.65 FPs/image, respectively. We also used a data set of 216 mammograms negative for clustered microcalcifications to further estimate the FP rate of our CAD system. The corresponding FP rates were 0.15, 0.31, and 0.86 FPs/image for cluster-based detection when negative mammograms were used for estimation of FP rates.
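
    The truncated sum-of-squares error mentioned above caps the contribution of any single training sample, limiting the influence of outliers or mislabeled candidates. One plausible form of such a loss is sketched below; the exact truncation rule and threshold used in the CAD system are assumptions here.

      import numpy as np

      def truncated_sse(predictions, targets, threshold):
          """Sum-of-squares error with each per-sample squared error capped at threshold**2."""
          sq_err = (predictions - targets) ** 2
          return np.sum(np.minimum(sq_err, threshold ** 2))

      # A grossly wrong prediction contributes at most threshold**2 to the loss.
      pred = np.array([0.1, 0.9, 0.8, 5.0])
      true = np.array([0.0, 1.0, 1.0, 0.0])
      print(truncated_sse(pred, true, threshold=1.0))   # 0.01 + 0.01 + 0.04 + 1.0 = 1.06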

  13. A global/local affinity graph for image segmentation.

    PubMed

    Xiaofang Wang; Yuxing Tang; Masnou, Simon; Liming Chen

    2015-04-01

    Construction of a reliable graph capturing perceptual grouping cues of an image is fundamental for graph-cut based image segmentation methods. In this paper, we propose a novel sparse global/local affinity graph over superpixels of an input image to capture both short- and long-range grouping cues, thereby enabling perceptual grouping laws, including proximity, similarity, and continuity, to enter into action through a suitable graph-cut algorithm. Moreover, we evaluate three major visual features, namely, color, texture, and shape, for their effectiveness in perceptual segmentation and propose a simple graph fusion scheme to implement some recent findings from psychophysics, which suggest combining these visual features with different emphases for perceptual grouping. In particular, an input image is first oversegmented into superpixels at different scales. We postulate a gravitation law based on empirical observations and divide superpixels adaptively into small-, medium-, and large-sized sets. Global grouping is achieved using medium-sized superpixels through a sparse representation of superpixels' features by solving an ℓ0-minimization problem, thereby enabling continuity or propagation of local smoothness over long-range connections. Small- and large-sized superpixels are then used to achieve local smoothness through an adjacent graph in a given feature space, thus implementing perceptual laws, for example, similarity and proximity. Finally, a bipartite graph is also introduced to enable propagation of grouping cues between superpixels of different scales. Extensive experiments are carried out on the Berkeley segmentation database in comparison with several state-of-the-art graph constructions. The results show the effectiveness of the proposed approach, which outperforms state-of-the-art graphs using four different objective criteria, namely, the probabilistic rand index, the variation of information, the global consistency error, and the boundary displacement error.

  14. Polycyclic Aromatic Hydrocarbons in Protoplanetary Disks around Herbig Ae/Be and T Tauri Stars

    NASA Astrophysics Data System (ADS)

    Seok, Ji Yeon; Li, Aigen

    2017-02-01

    A distinct set of broad emission features at 3.3, 6.2, 7.7, 8.6, 11.3, and 12.7 μm is often detected in protoplanetary disks (PPDs). These features are commonly attributed to polycyclic aromatic hydrocarbons (PAHs). We model these emission features in the infrared spectra of 69 PPDs around 14 T Tauri and 55 Herbig Ae/Be stars in terms of astronomical PAHs. For each PPD, we derive the size distribution and the charge state of the PAHs. We then examine the correlations of the PAH properties (i.e., sizes and ionization fractions) with the stellar properties (e.g., stellar effective temperature, luminosity, and mass). We find that the characteristic size of the PAHs tends to correlate with the stellar effective temperature (T_eff) and interpret this as the preferential photodissociation of small PAHs in systems with higher T_eff, whose stellar photons are more energetic. In addition, the PAH size shows a moderate correlation with the redward wavelength shift of the 7.7 μm PAH feature that is commonly observed in disks around cool stars. The ionization fraction of PAHs does not seem to correlate with any stellar parameters. This is because the charging of PAHs depends not only on the stellar properties (e.g., T_eff, luminosity) but also on their spatial distribution in the disks. The marginally negative correlation between PAH size and stellar age suggests that continuous replenishment of PAHs via the outgassing of cometary bodies and/or the collisional grinding of planetesimals and asteroids is required to maintain the abundance of small PAHs against complete destruction by photodissociation.

  15. 48 CFR 2919.502 - Setting aside acquisitions.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... whether procurements should be conducted via 8(a) procedures, HUBZone procedures or as small business set-asides. If a reasonable expectation exists that at least two responsible small businesses may submit... PROGRAMS SMALL BUSINESS AND SMALL DISADVANTAGED BUSINESS CONCERNS Set-Asides for Small Business 2919.502...

  16. Probabilistic hazard assessment for skin sensitization potency by dose–response modeling using feature elimination instead of quantitative structure–activity relationships

    PubMed Central

    McKim, James M.; Hartung, Thomas; Kleensang, Andre; Sá-Rocha, Vanessa

    2016-01-01

    Supervised learning methods promise to improve integrated testing strategies (ITS), but must be adjusted to handle high dimensionality and dose–response data. ITS approaches are currently fueled by the increasing mechanistic understanding of adverse outcome pathways (AOP) and the development of tests reflecting these mechanisms. Simple approaches to combine skin sensitization data sets, such as weight of evidence, fail due to problems in information redundancy and high dimensionality. The problem is further amplified when potency information (dose/response) of hazards would be estimated. Skin sensitization currently serves as the foster child for AOP and ITS development, as legislative pressures combined with a very good mechanistic understanding of contact dermatitis have led to test development and relatively large high-quality data sets. We curated such a data set and combined a recursive variable selection algorithm to evaluate the information available through in silico, in chemico and in vitro assays. Chemical similarity alone could not cluster chemicals’ potency, and in vitro models consistently ranked high in recursive feature elimination. This allows reducing the number of tests included in an ITS. Next, we analyzed the data with a hidden Markov model that takes advantage of an intrinsic inter-relationship among the local lymph node assay classes, i.e. the monotonous connection between local lymph node assay and dose. The dose-informed random forest/hidden Markov model was superior to the dose-naive random forest model on all data sets. Although balanced accuracy improvement may seem small, this obscures the actual improvement in misclassifications as the dose-informed hidden Markov model strongly reduced "false-negatives" (i.e. extreme sensitizers as non-sensitizer) on all data sets. PMID:26046447

  17. Probabilistic hazard assessment for skin sensitization potency by dose-response modeling using feature elimination instead of quantitative structure-activity relationships.

    PubMed

    Luechtefeld, Thomas; Maertens, Alexandra; McKim, James M; Hartung, Thomas; Kleensang, Andre; Sá-Rocha, Vanessa

    2015-11-01

    Supervised learning methods promise to improve integrated testing strategies (ITS), but must be adjusted to handle high dimensionality and dose-response data. ITS approaches are currently fueled by the increasing mechanistic understanding of adverse outcome pathways (AOP) and the development of tests reflecting these mechanisms. Simple approaches to combine skin sensitization data sets, such as weight of evidence, fail due to problems in information redundancy and high dimensionality. The problem is further amplified when potency information (dose/response) of hazards would be estimated. Skin sensitization currently serves as the foster child for AOP and ITS development, as legislative pressures combined with a very good mechanistic understanding of contact dermatitis have led to test development and relatively large high-quality data sets. We curated such a data set and combined a recursive variable selection algorithm to evaluate the information available through in silico, in chemico and in vitro assays. Chemical similarity alone could not cluster chemicals' potency, and in vitro models consistently ranked high in recursive feature elimination. This allows reducing the number of tests included in an ITS. Next, we analyzed the data with a hidden Markov model that takes advantage of an intrinsic inter-relationship among the local lymph node assay classes, i.e. the monotonous connection between local lymph node assay and dose. The dose-informed random forest/hidden Markov model was superior to the dose-naive random forest model on all data sets. Although balanced accuracy improvement may seem small, this obscures the actual improvement in misclassifications as the dose-informed hidden Markov model strongly reduced "false-negatives" (i.e. extreme sensitizers as non-sensitizer) on all data sets. Copyright © 2015 John Wiley & Sons, Ltd.

  18. Feature Selection for Chemical Sensor Arrays Using Mutual Information

    PubMed Central

    Wang, X. Rosalind; Lizier, Joseph T.; Nowotny, Thomas; Berna, Amalia Z.; Prokopenko, Mikhail; Trowell, Stephen C.

    2014-01-01

    We address the problem of feature selection for classifying a diverse set of chemicals using an array of metal oxide sensors. Our aim is to evaluate a filter approach to feature selection with reference to previous work, which used a wrapper approach on the same data set, and established best features and upper bounds on classification performance. We selected feature sets that exhibit the maximal mutual information with the identity of the chemicals. The selected features closely match those found to perform well in the previous study using a wrapper approach to conduct an exhaustive search of all permitted feature combinations. By comparing the classification performance of support vector machines (using features selected by mutual information) with the performance observed in the previous study, we found that while our approach does not always give the maximum possible classification performance, it always selects features that achieve classification performance approaching the optimum obtained by exhaustive search. We performed further classification using the selected feature set with some common classifiers and found that, for the selected features, Bayesian Networks gave the best performance. Finally, we compared the observed classification performances with the performance of classifiers using randomly selected features. We found that the selected features consistently outperformed randomly selected features for all tested classifiers. The mutual information filter approach is therefore a computationally efficient method for selecting near optimal features for chemical sensor arrays. PMID:24595058
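
    A minimal sketch of the mutual-information filter approach on synthetic stand-in data (not the metal oxide sensor data set of the study): the k features with maximal mutual information with the class label are kept, and an SVM trained on them is compared against one trained on randomly chosen features.

      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.feature_selection import SelectKBest, mutual_info_classif
      from sklearn.model_selection import cross_val_score
      from sklearn.svm import SVC

      # Synthetic stand-in for a sensor-array data set (placeholder sizes).
      X, y = make_classification(n_samples=300, n_features=40, n_informative=6,
                                 n_redundant=4, n_classes=3, n_clusters_per_class=1,
                                 random_state=0)

      # Filter approach: keep the k features with maximal mutual information with the labels.
      X_mi = SelectKBest(mutual_info_classif, k=8).fit_transform(X, y)

      # Compare an SVM on MI-selected features against one on randomly chosen features.
      rng = np.random.default_rng(0)
      X_rand = X[:, rng.choice(X.shape[1], 8, replace=False)]
      print("MI-selected:", cross_val_score(SVC(), X_mi, y, cv=5).mean())
      print("random     :", cross_val_score(SVC(), X_rand, y, cv=5).mean())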

  19. A High-Resolution Tile-Based Approach for Classifying Biological Regions in Whole-Slide Histopathological Images

    PubMed Central

    Hoffman, R.A.; Kothari, S.; Phan, J.H.; Wang, M.D.

    2016-01-01

    Computational analysis of histopathological whole slide images (WSIs) has emerged as a potential means for improving cancer diagnosis and prognosis. However, an open issue relating to the automated processing of WSIs is the identification of biological regions such as tumor, stroma, and necrotic tissue on the slide. We develop a method for classifying WSI portions (512x512-pixel tiles) into biological regions by (1) extracting a set of 461 image features from each WSI tile, (2) optimizing tile-level prediction models using nested cross-validation on a small (600 tile) manually annotated tile-level training set, and (3) validating the models against a much larger (1.7x10^6 tile) data set for which ground truth was available on the whole-slide level. We calculated the predicted prevalence of each tissue region and compared this prevalence to the ground truth prevalence for each image in an independent validation set. Results show significant correlation between the predicted (using automated system) and reported biological region prevalences with p < 0.001 for eight of nine cases considered. PMID:27532012
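
    The slide-level validation described above reduces to correlating predicted and reported region prevalences. A small sketch with hypothetical per-tile predictions, error rates, and slide counts is given below; it mirrors the evaluation logic only, not the tile classifier itself.

      import numpy as np
      from scipy.stats import pearsonr

      rng = np.random.default_rng(4)
      n_slides, tiles_per_slide = 9, 2000            # placeholder sizes

      # Hypothetical per-tile predictions (True = tumor tile) and slide-level ground truth.
      true_prevalence = rng.uniform(0.1, 0.7, size=n_slides)
      pred_prevalence = np.empty(n_slides)
      for i, p in enumerate(true_prevalence):
          tile_labels = rng.random(tiles_per_slide) < p       # ground-truth tumor tiles
          errors = rng.random(tiles_per_slide) < 0.1          # 10% tile-level mistakes
          predictions = tile_labels ^ errors
          pred_prevalence[i] = predictions.mean()             # predicted region prevalence

      r, p_value = pearsonr(pred_prevalence, true_prevalence)
      print(f"Pearson r = {r:.3f}, p = {p_value:.4f}")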

  20. A High-Resolution Tile-Based Approach for Classifying Biological Regions in Whole-Slide Histopathological Images.

    PubMed

    Hoffman, R A; Kothari, S; Phan, J H; Wang, M D

    Computational analysis of histopathological whole slide images (WSIs) has emerged as a potential means for improving cancer diagnosis and prognosis. However, an open issue relating to the automated processing of WSIs is the identification of biological regions such as tumor, stroma, and necrotic tissue on the slide. We develop a method for classifying WSI portions (512x512-pixel tiles) into biological regions by (1) extracting a set of 461 image features from each WSI tile, (2) optimizing tile-level prediction models using nested cross-validation on a small (600 tile) manually annotated tile-level training set, and (3) validating the models against a much larger (1.7x10^6 tile) data set for which ground truth was available on the whole-slide level. We calculated the predicted prevalence of each tissue region and compared this prevalence to the ground truth prevalence for each image in an independent validation set. Results show significant correlation between the predicted (using automated system) and reported biological region prevalences with p < 0.001 for eight of nine cases considered.

  1. 48 CFR 19.507 - Automatic dissolution of a small business set-aside.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 48 Federal Acquisition Regulations System 1 2011-10-01 2011-10-01 false Automatic dissolution of a small business set-aside. 19.507 Section 19.507 Federal Acquisition Regulations System FEDERAL... Automatic dissolution of a small business set-aside. (a) If a small business set-aside acquisition or...

  2. 48 CFR 19.507 - Automatic dissolution of a small business set-aside.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 48 Federal Acquisition Regulations System 1 2010-10-01 2010-10-01 false Automatic dissolution of a small business set-aside. 19.507 Section 19.507 Federal Acquisition Regulations System FEDERAL... Automatic dissolution of a small business set-aside. (a) If a small business set-aside acquisition or...

  3. 48 CFR 19.507 - Automatic dissolution of a small business set-aside.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 48 Federal Acquisition Regulations System 1 2014-10-01 2014-10-01 false Automatic dissolution of a small business set-aside. 19.507 Section 19.507 Federal Acquisition Regulations System FEDERAL... Automatic dissolution of a small business set-aside. (a) If a small business set-aside acquisition or...

  4. 48 CFR 19.507 - Automatic dissolution of a small business set-aside.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 48 Federal Acquisition Regulations System 1 2012-10-01 2012-10-01 false Automatic dissolution of a small business set-aside. 19.507 Section 19.507 Federal Acquisition Regulations System FEDERAL... Automatic dissolution of a small business set-aside. (a) If a small business set-aside acquisition or...

  5. 48 CFR 19.507 - Automatic dissolution of a small business set-aside.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 48 Federal Acquisition Regulations System 1 2013-10-01 2013-10-01 false Automatic dissolution of a small business set-aside. 19.507 Section 19.507 Federal Acquisition Regulations System FEDERAL... Automatic dissolution of a small business set-aside. (a) If a small business set-aside acquisition or...

  6. Studies of electron-molecule collisions - Applications to e-H2O

    NASA Technical Reports Server (NTRS)

    Brescansin, L. M.; Lima, M. A. P.; Gibson, T. L.; Mckoy, V.; Huo, W. M.

    1986-01-01

    Elastic differential and momentum transfer cross sections for the elastic scattering of electrons by H2O are reported for collision energies from 2 to 20 eV. These fixed-nuclei static-exchange cross sections were obtained using the Schwinger variational approach. In these studies the exchange potential is directly evaluated and not approximated by local models. The calculated differential cross sections, obtained with a basis set expansion of the scattering wave function, agree well with available experimental data at intermediate and larger angles. As used here, the results cannot adequately describe the divergent cross sections at small angles. An interesting feature of the calculated cross sections, particularly at 15 and 20 eV, is their significant backward peaking. This peaking occurs in the experimentally inaccessible region beyond a scattering angle of 120 deg. The implication of this feature for the determination of momentum transfer cross sections is described.

  7. Evolution, functions, and mysteries of plant ARGONAUTE proteins.

    PubMed

    Zhang, Han; Xia, Rui; Meyers, Blake C; Walbot, Virginia

    2015-10-01

    ARGONAUTE (AGO) proteins bind small RNAs (sRNAs) to form RNA-induced silencing complexes for transcriptional and post-transcriptional gene silencing. Genomes of primitive plants encode only a few AGO proteins. The Arabidopsis thaliana genome encodes ten AGO proteins, designated AGO1 to AGO10. Most early studies focused on these ten proteins and their interacting sRNAs. AGOs in other flowering plant species have duplicated and diverged from this set, presumably corresponding to new, diverged or specific functions. Among these, the grass-specific AGO18 family has been discovered and implicated as playing important roles during plant reproduction and viral defense. This review covers our current knowledge about functions and features of AGO proteins in both eudicots and monocots and compares their similarities and differences. On the basis of these features, we propose a new nomenclature for some plant AGOs. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. A Distributed Wireless Camera System for the Management of Parking Spaces.

    PubMed

    Vítek, Stanislav; Melničuk, Petr

    2017-12-28

    The importance of detection of parking space availability is still growing, particularly in major cities. This paper deals with the design of a distributed wireless camera system for the management of parking spaces, which can determine occupancy of the parking space based on the information from multiple cameras. The proposed system uses small camera modules based on Raspberry Pi Zero and computationally efficient algorithm for the occupancy detection based on the histogram of oriented gradients (HOG) feature descriptor and support vector machine (SVM) classifier. We have included information about the orientation of the vehicle as a supporting feature, which has enabled us to achieve better accuracy. The described solution can deliver occupancy information at the rate of 10 parking spaces per second with more than 90% accuracy in a wide range of conditions. Reliability of the implemented algorithm is evaluated with three different test sets which altogether contain over 700,000 samples of parking spaces.
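
    A minimal sketch of the HOG-plus-SVM occupancy classifier on synthetic grayscale patches; the camera pipeline, vehicle-orientation feature, and Raspberry Pi deployment described in the paper are not reproduced, and all patch sizes and labels below are placeholders.

      import numpy as np
      from skimage.feature import hog
      from sklearn.svm import LinearSVC
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(5)

      def synthetic_patch(occupied):
          """64x64 grayscale stand-in for a parking-space crop; occupied spaces get a bright block."""
          img = rng.normal(0.5, 0.05, size=(64, 64))
          if occupied:
              img[16:48, 16:48] += 0.4
          return np.clip(img, 0, 1)

      labels = rng.integers(0, 2, size=200)
      features = np.array([hog(synthetic_patch(bool(lab)), orientations=9,
                               pixels_per_cell=(8, 8), cells_per_block=(2, 2))
                           for lab in labels])

      X_tr, X_te, y_tr, y_te = train_test_split(features, labels, test_size=0.3, random_state=5)
      clf = LinearSVC().fit(X_tr, y_tr)
      print("occupancy accuracy:", clf.score(X_te, y_te))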

  9. Three-dimensional object recognition using similar triangles and decision trees

    NASA Technical Reports Server (NTRS)

    Spirkovska, Lilly

    1993-01-01

    A system, TRIDEC, that is capable of distinguishing between a set of objects despite changes in the objects' positions in the input field, their size, or their rotational orientation in 3D space is described. TRIDEC combines very simple yet effective features with the classification capabilities of inductive decision tree methods. The feature vector is a list of all similar triangles defined by connecting all combinations of three pixels in a coarse coded 127 x 127 pixel input field. The classification is accomplished by building a decision tree using the information provided from a limited number of translated, scaled, and rotated samples. Simulation results are presented which show that TRIDEC achieves 94 percent recognition accuracy in the 2D invariant object recognition domain and 98 percent recognition accuracy in the 3D invariant object recognition domain after training on only a small sample of transformed views of the objects.

  10. Ultrahigh-Dimensional Multiclass Linear Discriminant Analysis by Pairwise Sure Independence Screening

    PubMed Central

    Pan, Rui; Wang, Hansheng; Li, Runze

    2016-01-01

    This paper is concerned with the problem of feature screening for multi-class linear discriminant analysis under ultrahigh dimensional setting. We allow the number of classes to be relatively large. As a result, the total number of relevant features is larger than usual. This makes the related classification problem much more challenging than the conventional one, where the number of classes is small (very often two). To solve the problem, we propose a novel pairwise sure independence screening method for linear discriminant analysis with an ultrahigh dimensional predictor. The proposed procedure is directly applicable to the situation with many classes. We further prove that the proposed method is screening consistent. Simulation studies are conducted to assess the finite sample performance of the new procedure. We also demonstrate the proposed methodology via an empirical analysis of a real life example on handwritten Chinese character recognition. PMID:28127109

  11. Preliminary Planck constant measurements via UME oscillating magnet Kibble balance

    NASA Astrophysics Data System (ADS)

    Ahmedov, H.; Babayiğit Aşkın, N.; Korutlu, B.; Orhan, R.

    2018-06-01

    The UME Kibble balance project was initiated in the second half of 2014. During this period we have studied the theoretical aspects of Kibble balances, in which an oscillating magnet generates AC Faraday's voltage in a stationary coil, and constructed a trial version to implement this idea. The remarkable feature of this approach is that it can establish the link between the Planck constant and a macroscopic mass by one single experiment in the most natural way. Weak dependences on variations of environmental and experimental conditions, small size, and other useful features offered by this novel approach reduce the complexity of the experimental set-up. This paper describes the principles of the oscillating magnet Kibble balance and gives details of the preliminary Planck constant measurements. The value of the Planck constant determined with our apparatus is h/h_90 = 1.000 004, with a relative standard uncertainty of 6 ppm.
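
    For context, the conventional reference value h_90 in the ratio above is fixed by the 1990 conventional Josephson and von Klitzing constants; a standard way of writing this relation (general metrology convention, not a result of this paper) is h_90 = 4 / (K_{J-90}^2 R_{K-90}), so the reported ratio corresponds to h = 1.000 004 x h_90 with a 6 ppm relative standard uncertainty.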

  12. The Apollo passive seismic experiment

    NASA Technical Reports Server (NTRS)

    Latham, G. V.; Dorman, H. J.; Horvath, P.; Ibrahim, A. K.; Koyama, J.; Nakamura, Y.

    1979-01-01

    The completed data set obtained from the 4-station Apollo seismic network includes signals from approximately 11,800 events of various types. Four data sets for use by other investigators, through the NSSDC, are in preparation. Some refinement of the lunar model based on seismic data can be expected, but its gross features remain as presented two years ago. The existence of a small, molten core remains dependent upon the analysis of signals from a single, far-side impact. Analysis of secondary arrivals from other sources may eventually resolve this issue, as well as continued refinement of the magnetic field measurements. Evidence of considerable lateral heterogeneity within the moon continues to build. The mystery of the discrepancy between the meteoroid flux estimate derived from lunar seismic measurements and earth-based estimates remains, although significant correlations between terrestrial and lunar observations are beginning to emerge.

  13. Universal Hitting Time Statistics for Integrable Flows

    NASA Astrophysics Data System (ADS)

    Dettmann, Carl P.; Marklof, Jens; Strömbergsson, Andreas

    2017-02-01

    The perceived randomness in the time evolution of "chaotic" dynamical systems can be characterized by universal probabilistic limit laws, which do not depend on the fine features of the individual system. One important example is the Poisson law for the times at which a particle with random initial data hits a small set. This was proved in various settings for dynamical systems with strong mixing properties. The key result of the present study is that, despite the absence of mixing, the hitting times of integrable flows also satisfy universal limit laws which are, however, not Poisson. We describe the limit distributions for "generic" integrable flows and a natural class of target sets, and illustrate our findings with two examples: the dynamics in central force fields and ellipse billiards. The convergence of the hitting time process follows from a new equidistribution theorem in the space of lattices, which is of independent interest. Its proof exploits Ratner's measure classification theorem for unipotent flows, and extends earlier work of Elkies and McMullen.

  14. Logic Learning Machine and standard supervised methods for Hodgkin's lymphoma prognosis using gene expression data and clinical variables.

    PubMed

    Parodi, Stefano; Manneschi, Chiara; Verda, Damiano; Ferrari, Enrico; Muselli, Marco

    2018-03-01

    This study evaluates the performance of a set of machine learning techniques in predicting the prognosis of Hodgkin's lymphoma using clinical factors and gene expression data. Analysed samples from 130 Hodgkin's lymphoma patients included a small set of clinical variables and more than 54,000 gene features. Machine learning classifiers included three black-box algorithms (k-nearest neighbour, Artificial Neural Network, and Support Vector Machine) and two methods based on intelligible rules (Decision Tree and the innovative Logic Learning Machine method). Support Vector Machine clearly outperformed any of the other methods. Among the two rule-based algorithms, Logic Learning Machine performed better and identified a set of simple intelligible rules based on a combination of clinical variables and gene expressions. Decision Tree identified a non-coding gene (XIST) involved in the early phases of X chromosome inactivation that was overexpressed in females and in non-relapsed patients. XIST expression might be responsible for the better prognosis of female Hodgkin's lymphoma patients.

  15. Microcomputer based controller for the Langley 0.3-meter Transonic Cryogenic Tunnel

    NASA Technical Reports Server (NTRS)

    Balakrishna, S.; Kilgore, W. Allen

    1989-01-01

    Flow control of the Langley 0.3-meter Transonic Cryogenic Tunnel (TCT) is a multivariable nonlinear control problem. Globally stable control laws were generated to hold tunnel conditions in the presence of geometrical disturbances in the test section and precisely control the tunnel states for small and large set point changes. The control laws are mechanized as four inner control loops for tunnel pressure, temperature, fan speed, and liquid nitrogen supply pressure, and two outer loops for Mach number and Reynolds number. These integrated control laws have been mechanized on a 16-bit microcomputer working on DOS. This document details the model of the 0.3-m TCT, control laws, microcomputer realization, and its performance. The tunnel closed loop responses to small and large set point changes are presented. The controller incorporates safe thermal management of the tunnel cooldown based on thermal restrictions. The controller was shown to provide control of temperature to ±0.2 K, pressure to ±0.07 psia, and Mach number to ±0.002 of a given set point during aerodynamic data acquisition in the presence of intrusive geometrical changes like flexwall movement, angle-of-attack changes, and drag rake traverse. The controller also provides a new feature of Reynolds number control. The controller provides a safe, reliable, and economical control of the 0.3-m TCT.

  16. A fuzzy neural network for intelligent data processing

    NASA Astrophysics Data System (ADS)

    Xie, Wei; Chu, Feng; Wang, Lipo; Lim, Eng Thiam

    2005-03-01

    In this paper, we describe an incrementally generated fuzzy neural network (FNN) for intelligent data processing. This FNN combines the features of initial fuzzy model self-generation, fast input selection, partition validation, parameter optimization and rule-base simplification. A small FNN is created from scratch -- there is no need to specify the initial network architecture, initial membership functions, or initial weights. Fuzzy IF-THEN rules are constantly combined and pruned to minimize the size of the network while maintaining accuracy; irrelevant inputs are detected and deleted, and membership functions and network weights are trained with a gradient descent algorithm, i.e., error backpropagation. Experimental studies on synthesized data sets demonstrate that the proposed fuzzy neural network is able to achieve accuracy comparable to or higher than both a feedforward crisp neural network, i.e., NeuroRule, and a decision tree, i.e., C4.5, with more compact rule bases for most of the data sets used in our experiments. The FNN has achieved outstanding results for cancer classification based on microarray data. The excellent classification result for the Small Round Blue Cell Tumors (SRBCT) data set is shown. Compared with other published methods, we used far fewer genes to achieve perfect classification, which will help researchers focus their attention directly on specific genes and may lead to discovery of the underlying mechanisms of cancer development and of new drugs.

  17. Integrating Dimension Reduction and Out-of-Sample Extension in Automated Classification of Ex Vivo Human Patellar Cartilage on Phase Contrast X-Ray Computed Tomography

    PubMed Central

    Nagarajan, Mahesh B.; Coan, Paola; Huber, Markus B.; Diemoz, Paul C.; Wismüller, Axel

    2015-01-01

    Phase contrast X-ray computed tomography (PCI-CT) has been demonstrated as a novel imaging technique that can visualize human cartilage with high spatial resolution and soft tissue contrast. Different textural approaches have been previously investigated for characterizing chondrocyte organization on PCI-CT to enable classification of healthy and osteoarthritic cartilage. However, the large size of feature sets extracted in such studies motivates an investigation into algorithmic feature reduction for computing efficient feature representations without compromising their discriminatory power. For this purpose, geometrical feature sets derived from the scaling index method (SIM) were extracted from 1392 volumes of interest (VOI) annotated on PCI-CT images of ex vivo human patellar cartilage specimens. The extracted feature sets were subject to linear and non-linear dimension reduction techniques as well as feature selection based on evaluation of mutual information criteria. The reduced feature set was subsequently used in a machine learning task with support vector regression to classify VOIs as healthy or osteoarthritic; classification performance was evaluated using the area under the receiver-operating characteristic (ROC) curve (AUC). Our results show that the classification performance achieved by 9-D SIM-derived geometric feature sets (AUC: 0.96 ± 0.02) can be maintained with 2-D representations computed from both dimension reduction and feature selection (AUC values as high as 0.97 ± 0.02). Thus, such feature reduction techniques can offer a high degree of compaction to large feature sets extracted from PCI-CT images while maintaining their ability to characterize the underlying chondrocyte patterns. PMID:25710875

  18. Speech recognition features for EEG signal description in detection of neonatal seizures.

    PubMed

    Temko, A; Boylan, G; Marnane, W; Lightbody, G

    2010-01-01

    In this work, features which are usually employed in automatic speech recognition (ASR) are used for the detection of neonatal seizures in newborn EEG. Three conventional ASR feature sets are compared to the feature set which has been previously developed for this task. The results indicate that the thoroughly-studied spectral envelope based ASR features perform reasonably well on their own. Additionally, the SVM Recursive Feature Elimination routine is applied to all extracted features pooled together. It is shown that ASR features consistently appear among the top-rank features.
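
    A sketch of the SVM Recursive Feature Elimination routine mentioned above, on synthetic stand-in data rather than newborn EEG: a linear SVM is refit while the lowest-weighted features are dropped until a small top-ranked set remains. Feature names and data sizes are placeholders.

      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.feature_selection import RFE
      from sklearn.svm import SVC

      # Stand-in for pooled EEG/ASR features (placeholder sizes and names).
      X, y = make_classification(n_samples=400, n_features=60, n_informative=8, random_state=0)
      feature_names = [f"feat_{i}" for i in range(X.shape[1])]

      # SVM-RFE: repeatedly drop the lowest-weighted features of a linear SVM
      # until only a small set remains.
      rfe = RFE(estimator=SVC(kernel="linear"), n_features_to_select=10, step=5).fit(X, y)
      top_features = [name for name, keep in zip(feature_names, rfe.support_) if keep]
      print("top-ranked features:", top_features)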

  19. The Impact of Individual Differences, Types of Model and Social Settings on Block Building Performance among Chinese Preschoolers

    PubMed Central

    Tian, Mi; Deng, Zhu; Meng, Zhaokun; Li, Rui; Zhang, Zhiyi; Qi, Wenhui; Wang, Rui; Yin, Tingting; Ji, Menghui

    2018-01-01

    Children’s block building performances are used as indicators of other abilities in multiple domains. In the current study, we examined individual differences, types of model and social settings as influences on children’s block building performance. Chinese preschoolers (N = 180) participated in a block building activity in a natural setting, and performance was assessed with multiple measures in order to identify a range of specific skills. Using scores generated across these measures, three dependent variables were analyzed: block building skills, structural balance and structural features. An overall MANOVA showed that there were significant main effects of gender and grade level across most measures. Types of model showed no significant effect in children’s block building. There was a significant main effect of social settings on structural features, with the best performance in the 5-member group, followed by individual and then the 10-member block building. These findings suggest that boys performed better than girls in block building activity. Block building performance increased significantly from 1st to 2nd year of preschool, but not from second to third. The preschoolers created more representational constructions when presented with a model made of wooden rather than with a picture. There was partial evidence that children performed better when working with peers in a small group than when working alone or working in a large group. It is suggested that future study should examine other modalities rather than the visual one, diversify the samples and adopt a longitudinal investigation. PMID:29441031

  20. Murine glomerulotropic monoclonal antibodies are highly oligoclonal and exhibit distinctive molecular features.

    PubMed

    Lefkowith, J B; Di Valerio, R; Norris, J; Glick, G D; Alexander, A L; Jackson, L; Gilkeson, G S

    1996-08-01

    We recently produced a panel of seven glomerular-binding mAbs from a nephritic MRL-lpr mouse that bind to histones/nucleosomes (group I) or DNA (group II) adherent to glomerular basement membrane. To elucidate the molecular basis of their binding and ontogeny, we sequenced their variable (V) regions, analyzed the apparent somatic mutations, and predicted their three-dimensional structures. There were two clonally related sets (3 of 4 in group I, 3 of 3 in group II), both of the VHJ1558 family, and one mAb of the VH 7183 family. V region somatic mutations within clonally related sets had little effect on glomerular binding and did not appear to be selected for based on glomerular binding. The VH regions were most homologous with those from autoantibodies to histones, DNA, or IgG (i.e., rheumatoid factors); the Vkappa regions were most homologous with those from autoantibodies to small nuclear ribonucleoproteins (snRNP). The VH regions also exhibited an unusual VD junction (in the group I clonally related set) and an overall high content of charged amino acids (arginine, aspartic acid) in complementarity-determining regions (CDRs), particularly in CDR3. Molecular modeling studies suggested that the Fv regions of these mAbs converge to form a flat, open surface with a net positive charge. The CDR arginines in group I mAbs appear to be located in Ag contact regions of the binding cleft. In sum, these data suggest that glomerulotropic mAbs are a highly restricted set of Abs with distinctive molecular features that may mediate their binding to glomeruli.

  1. Automatic feature design for optical character recognition using an evolutionary search procedure.

    PubMed

    Stentiford, F W

    1985-03-01

    An automatic evolutionary search is applied to the problem of feature extraction in an OCR application. A performance measure based on feature independence is used to generate features which do not appear to suffer from peaking effects [17]. Features are extracted from a training set of 30,600 machine-printed alphanumeric characters (34 classes) derived from British mail. Classification results on the training set and a test set of 10,200 characters are reported for an increasing number of features. A 1.01 percent forced-decision error rate is obtained on the test data using 316 features. The hardware implementation should be cheap and fast to operate. The performance compares favorably with current low-cost OCR page readers.
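
    The 1985 search procedure itself is not reproduced here, but the idea of evolving a feature subset against a performance measure can be sketched as a simple (1+1) evolutionary search over a binary feature mask, scored by hold-out accuracy on synthetic stand-in data.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(2)
        X = rng.normal(size=(1000, 50))       # stand-in character features
        y = rng.integers(0, 5, size=1000)     # stand-in class labels
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

        def score(mask):
            if not mask.any():
                return 0.0
            clf = LogisticRegression(max_iter=1000).fit(X_tr[:, mask], y_tr)
            return clf.score(X_te[:, mask], y_te)

        # (1+1) evolutionary search: mutate the feature mask, keep changes that do not hurt accuracy.
        mask = rng.random(X.shape[1]) < 0.5
        best = score(mask)
        for _ in range(100):
            child = mask.copy()
            flip = rng.integers(0, X.shape[1], size=3)   # mutate a few feature bits
            child[flip] = ~child[flip]
            s = score(child)
            if s >= best:
                mask, best = child, s
        print("selected features:", int(mask.sum()), "hold-out accuracy:", round(best, 3))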

  2. Stratification of risk to the surgical team in removal of small arms ammunition implanted in the craniofacial region: case report.

    PubMed

    Forbes, Jonathan A; Laughlin, Ian; Newberry, Shane; Ryhn, Michael; Pasley, Jason; Newberry, Travis

    2016-09-01

    In cases of penetrating injury with implantation of small arms ammunition, it can often be difficult to tell the difference between simple ballistics and ballistics associated with unexploded ordnances (UXOs). In the operative environment, where highly flammable substances are often close to the surgical site, detonation of UXOs could have catastrophic consequences for both the patient and surgical team. There is a paucity of information in the literature regarding how to evaluate whether an implanted munition contains explosive material. This report describes a patient who presented during Operation Enduring Freedom with an implanted munition suspicious for a UXO and the subsequent workup organized by Explosive Ordnance Disposal (EOD) Company prior to surgical removal. Clinical risk factors for UXOs include assassination attempts and/or wartime settings. Specific radiological features suggestive of a UXO include projectile size greater than 7.62-mm caliber, alterations in density of the tip, as well as radiological evidence of a hollowed-out core. If an implanted UXO is suspected, risks to the surgical and anesthesia teams can be minimized by notifying the nearest military installation with EOD capabilities and following clinical practice guidelines set forth by the Joint Theater Trauma System.

  3. Semi-automatic mapping of linear-trending bedforms using 'Self-Organizing Maps' algorithm

    NASA Astrophysics Data System (ADS)

    Foroutan, M.; Zimbelman, J. R.

    2017-09-01

    Increased application of high resolution spatial data, such as high resolution satellite or Unmanned Aerial Vehicle (UAV) images from Earth, as well as High Resolution Imaging Science Experiment (HiRISE) images from Mars, increases the need for automated techniques capable of extracting detailed geomorphologic elements from such large data sets. Model validation by repeat imagery in environmental management studies, such as those of climate-related change, together with increasing access to high-resolution satellite images, underlines the demand for detailed automatic image-processing techniques in remote sensing. This study presents a methodology based on an unsupervised Artificial Neural Network (ANN) algorithm, known as Self-Organizing Maps (SOM), to achieve the semi-automatic extraction of linear features with small footprints on satellite images. SOM is based on competitive learning and is efficient for handling huge data sets. We applied the SOM algorithm to high resolution satellite images of Earth and Mars (Quickbird, Worldview and HiRISE) in order to facilitate and speed up image analysis along with improving the accuracy of the results. About 98% overall accuracy and a 0.001 quantization error in the recognition of small linear-trending bedforms demonstrate a promising framework.
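
    A minimal numpy sketch of the competitive-learning idea behind a Self-Organizing Map is given below; the grid size, learning-rate schedule and random input vectors are illustrative assumptions, not the settings used in the study.

        import numpy as np

        rng = np.random.default_rng(3)
        X = rng.normal(size=(2000, 3))        # stand-in feature vectors (e.g. image bands per pixel)

        # A 10 x 10 grid of prototype vectors trained by competitive learning.
        grid = 10
        W = rng.normal(size=(grid, grid, X.shape[1]))
        coords = np.stack(np.meshgrid(np.arange(grid), np.arange(grid), indexing="ij"), axis=-1)

        for t, x in enumerate(X):
            lr = 0.5 * np.exp(-t / len(X))                    # decaying learning rate
            sigma = 3.0 * np.exp(-t / len(X))                 # decaying neighbourhood radius
            d = np.linalg.norm(W - x, axis=-1)
            bmu = np.unravel_index(np.argmin(d), d.shape)     # best-matching unit
            h = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=-1) / (2 * sigma ** 2))
            W += lr * h[..., None] * (x - W)                  # pull the neighbourhood toward the sample

        nodes = W.reshape(-1, X.shape[1])
        labels = np.array([np.argmin(np.linalg.norm(nodes - x, axis=1)) for x in X])
        print("samples per SOM node (first 10 nodes):", np.bincount(labels, minlength=grid * grid)[:10])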

  4. FSR: feature set reduction for scalable and accurate multi-class cancer subtype classification based on copy number.

    PubMed

    Wong, Gerard; Leckie, Christopher; Kowalczyk, Adam

    2012-01-15

    Feature selection is a key concept in machine learning for microarray datasets, where features represented by probesets are typically several orders of magnitude larger than the available sample size. Computational tractability is a key challenge for feature selection algorithms in handling very high-dimensional datasets beyond a hundred thousand features, such as in datasets produced on single nucleotide polymorphism microarrays. In this article, we present a novel feature set reduction approach that enables scalable feature selection on datasets with hundreds of thousands of features and beyond. Our approach enables more efficient handling of higher resolution datasets to achieve better disease subtype classification of samples for potentially more accurate diagnosis and prognosis, which allows clinicians to make more informed decisions in regards to patient treatment options. We applied our feature set reduction approach to several publicly available cancer single nucleotide polymorphism (SNP) array datasets and evaluated its performance in terms of its multiclass predictive classification accuracy over different cancer subtypes, its speedup in execution as well as its scalability with respect to sample size and array resolution. Feature Set Reduction (FSR) was able to reduce the dimensions of an SNP array dataset by more than two orders of magnitude while achieving at least equal, and in most cases superior predictive classification performance over that achieved on features selected by existing feature selection methods alone. An examination of the biological relevance of frequently selected features from FSR-reduced feature sets revealed strong enrichment in association with cancer. FSR was implemented in MATLAB R2010b and is available at http://ww2.cs.mu.oz.au/~gwong/FSR.

  5. 36 CFR 223.103 - Award of small business set-aside sales.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 36 Parks, Forests, and Public Property 2 2013-07-01 2013-07-01 false Award of small business set... PRODUCTS Timber Sale Contracts Award of Contracts § 223.103 Award of small business set-aside sales. If timber is advertised as set aside for competitive bidding by small business concerns, award will be made...

  6. 36 CFR 223.103 - Award of small business set-aside sales.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 36 Parks, Forests, and Public Property 2 2011-07-01 2011-07-01 false Award of small business set... PRODUCTS Timber Sale Contracts Award of Contracts § 223.103 Award of small business set-aside sales. If timber is advertised as set aside for competitive bidding by small business concerns, award will be made...

  7. 36 CFR 223.103 - Award of small business set-aside sales.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 36 Parks, Forests, and Public Property 2 2012-07-01 2012-07-01 false Award of small business set... PRODUCTS Timber Sale Contracts Award of Contracts § 223.103 Award of small business set-aside sales. If timber is advertised as set aside for competitive bidding by small business concerns, award will be made...

  8. 36 CFR 223.103 - Award of small business set-aside sales.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 36 Parks, Forests, and Public Property 2 2014-07-01 2014-07-01 false Award of small business set... PRODUCTS Timber Sale Contracts Award of Contracts § 223.103 Award of small business set-aside sales. If timber is advertised as set aside for competitive bidding by small business concerns, award will be made...

  9. Computer-aided diagnosis of psoriasis skin images with HOS, texture and color features: A first comparative study of its kind.

    PubMed

    Shrivastava, Vimal K; Londhe, Narendra D; Sonawane, Rajendra S; Suri, Jasjit S

    2016-04-01

    Psoriasis is an autoimmune skin disease with red and scaly plaques on the skin, affecting about 125 million people worldwide. Currently, dermatologists use visual and haptic methods for diagnosing disease severity. This does not help them in stratification and risk assessment of the lesion stage and grade. Further, current methods add complexity during the monitoring and follow-up phase. The current diagnostic tools lead to subjectivity in decision making and are unreliable and laborious. This paper presents a first comparative performance study of its kind using a principal component analysis (PCA) based CADx system for psoriasis risk stratification and image classification utilizing: (i) 11 higher order spectra (HOS) features, (ii) 60 texture features, and (iii) 86 color features, and their seven combinations. A total of 540 image samples (270 healthy and 270 diseased) from 30 psoriasis patients of Indian ethnic origin are used in our database. Machine learning using PCA is used for dominant feature selection, and the selected features are then fed to a support vector machine (SVM) classifier to obtain optimized performance. Three different protocols are implemented using the three kinds of feature sets. A reliability index of the CADx is computed. Among all feature combinations, the CADx system shows optimal performance of 100% accuracy, 100% sensitivity and specificity when all three feature sets are combined. Further, our experimental results with increasing data size show that all feature combinations yield a high reliability index throughout the PCA cutoffs, except the color feature set and the combination of color and texture feature sets. HOS features are powerful in psoriasis disease classification and stratification. Although all three feature sets (HOS, texture, and color) perform competitively on their own, the machine learning system performs best when they are combined. The system is fully automated, reliable and accurate. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
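
    A compact sketch of the feature-combination protocol described above: stand-in HOS, texture and color feature matrices are concatenated, dominant PCA components are kept, and an SVM classifies the result in scikit-learn. The dimensions follow the abstract but the data are synthetic.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        rng = np.random.default_rng(4)
        n = 540                                   # 270 healthy + 270 diseased samples (stand-in data)
        hos = rng.normal(size=(n, 11))            # 11 higher order spectra features
        texture = rng.normal(size=(n, 60))        # 60 texture features
        color = rng.normal(size=(n, 86))          # 86 color features
        y = np.repeat([0, 1], n // 2)

        # Combine the three feature sets, keep the dominant PCA components, classify with an SVM.
        X = np.hstack([hos, texture, color])
        pipe = make_pipeline(StandardScaler(), PCA(n_components=20), SVC())
        print("cross-validated accuracy:", cross_val_score(pipe, X, y, cv=5).mean())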

  10. 3D for Geosciences: Interactive Tangibles and Virtual Models

    NASA Astrophysics Data System (ADS)

    Pippin, J. E.; Matheney, M.; Kitsch, N.; Rosado, G.; Thompson, Z.; Pierce, S. A.

    2016-12-01

    Point cloud processing provides a method of studying and modelling geologic features relevant to geoscience systems and processes. Here, software including Skanect, MeshLab, Blender, PDAL, and PCL is used in conjunction with 3D scanning hardware, including a Structure scanner and a Kinect camera, to create and analyze point cloud images of small scale topography, karst features, tunnels, and structures at high resolution. This project successfully scanned internal karst features ranging from small stalactites to large rooms, as well as an external waterfall feature. For comparison purposes, multiple scans of the same object were merged into single object files both automatically, using commercial software, and manually, using open source libraries and code. Files in .ply format were manually converted into numeric data sets and analyzed for similar regions between files in order to match them together. We assume a numeric process would be more powerful and efficient than the manual method; however, it could lack other useful features that GUIs may have. The digital models have applications in mining as an efficient means of performing topographic tasks such as measuring distances and areas. Additionally, it is possible to make simulation models, such as drilling templates, and calculations related to 3D spaces. Advantages of the methods described here include the relatively quick time to obtain data and the easy transport of the equipment. With regard to open-pit mining, obtaining precise 3D images of large surfaces and georeferencing the scan data to interactive maps would be a high-value tool. The digital 3D images obtained from scans may be saved as printable files to create tangible, 3D-printed models based on scientific information, as well as digital "worlds" that can be navigated virtually. The data, models, and algorithms explored here can be used to convey complex scientific ideas to a range of professionals and audiences.

  11. No-Reference Video Quality Assessment Based on Statistical Analysis in 3D-DCT Domain.

    PubMed

    Li, Xuelong; Guo, Qun; Lu, Xiaoqiang

    2016-05-13

    It is an important task to design models for universal no-reference video quality assessment (NR-VQA) in multiple video processing and computer vision applications. However, most existing NR-VQA metrics are designed for specific distortion types, which are often not known in practical applications. A further deficiency is that the spatial and temporal information of videos is hardly considered simultaneously. In this paper, we propose a new NR-VQA metric based on spatiotemporal natural video statistics (NVS) in the 3D discrete cosine transform (3D-DCT) domain. In the proposed method, a set of features is first extracted based on the statistical analysis of 3D-DCT coefficients to characterize the spatiotemporal statistics of videos in different views. These features are then used to predict perceived video quality via an efficient linear support vector regression (SVR) model. The contributions of this paper are: 1) we explore the spatiotemporal statistics of videos in the 3D-DCT domain, which has an inherent spatiotemporal encoding advantage over other widely used 2D transformations; 2) we extract a small set of simple but effective statistical features for video visual quality prediction; 3) the proposed method is universal for multiple types of distortions and robust to different databases. The proposed method is tested on four widely used video databases. Extensive experimental results demonstrate that the proposed method is competitive with state-of-the-art NR-VQA metrics and the top-performing FR-VQA and RR-VQA metrics.
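
    A rough sketch of the feature-extraction-plus-regression idea: summary statistics of 3D-DCT coefficients computed block-wise over a clip, fed to a linear support vector regressor. The block size, the three summary statistics and the synthetic clips are illustrative assumptions, not the NVS features of the paper.

        import numpy as np
        from scipy.fft import dctn
        from sklearn.svm import LinearSVR

        rng = np.random.default_rng(5)

        def block_features(video, size=4):
            # Illustrative spatiotemporal features: statistics of 3D-DCT coefficients per block.
            feats = []
            T, H, W = video.shape
            for t in range(0, T - size + 1, size):
                for i in range(0, H - size + 1, size):
                    for j in range(0, W - size + 1, size):
                        c = dctn(video[t:t + size, i:i + size, j:j + size], norm="ortho").ravel()[1:]
                        feats.append([c.mean(), c.std(), np.abs(c).max()])
            return np.mean(feats, axis=0)

        videos = [rng.normal(size=(16, 32, 32)) for _ in range(30)]   # stand-in distorted clips
        mos = rng.uniform(1, 5, size=30)                              # stand-in subjective quality scores
        X = np.array([block_features(v) for v in videos])
        model = LinearSVR(max_iter=10000).fit(X, mos)
        print("predicted quality of the first clip:", model.predict(X[:1]))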

  12. SOLAR FLARE PREDICTION USING SDO/HMI VECTOR MAGNETIC FIELD DATA WITH A MACHINE-LEARNING ALGORITHM

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bobra, M. G.; Couvidat, S., E-mail: couvidat@stanford.edu

    2015-01-10

    We attempt to forecast M- and X-class solar flares using a machine-learning algorithm, called support vector machine (SVM), and four years of data from the Solar Dynamics Observatory's Helioseismic and Magnetic Imager, the first instrument to continuously map the full-disk photospheric vector magnetic field from space. Most flare forecasting efforts described in the literature use either line-of-sight magnetograms or a relatively small number of ground-based vector magnetograms. This is the first time a large data set of vector magnetograms has been used to forecast solar flares. We build a catalog of flaring and non-flaring active regions sampled from a database of 2071 active regions, comprised of 1.5 million active region patches of vector magnetic field data, and characterize each active region by 25 parameters. We then train and test the machine-learning algorithm and we estimate its performance using forecast verification metrics with an emphasis on the true skill statistic (TSS). We obtain relatively high TSS scores and overall predictive abilities. We surmise that this is partly due to fine-tuning the SVM for this purpose and also to an advantageous set of features that can only be calculated from vector magnetic field data. We also apply a feature selection algorithm to determine which of our 25 features are useful for discriminating between flaring and non-flaring active regions and conclude that only a handful are needed for good predictive abilities.
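
    The true skill statistic used for forecast verification above is simple to compute from the 2 x 2 contingency table; a minimal illustration with toy labels:

        import numpy as np

        def true_skill_statistic(y_true, y_pred):
            # TSS = hit rate - false alarm rate, from the 2 x 2 contingency table.
            y_true, y_pred = np.asarray(y_true, bool), np.asarray(y_pred, bool)
            tp = np.sum(y_true & y_pred)
            fn = np.sum(y_true & ~y_pred)
            fp = np.sum(~y_true & y_pred)
            tn = np.sum(~y_true & ~y_pred)
            return tp / (tp + fn) - fp / (fp + tn)

        # Toy example: 1 = flaring active region, 0 = non-flaring.
        print(true_skill_statistic([1, 1, 0, 0, 0, 1], [1, 0, 0, 0, 1, 1]))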

  13. A primitive study of voxel feature generation by multiple stacked denoising autoencoders for detecting cerebral aneurysms on MRA

    NASA Astrophysics Data System (ADS)

    Nemoto, Mitsutaka; Hayashi, Naoto; Hanaoka, Shouhei; Nomura, Yukihiro; Miki, Soichiro; Yoshikawa, Takeharu; Ohtomo, Kuni

    2016-03-01

    The purpose of this study is to evaluate the feasibility of a novel feature generation approach, based on multiple deep neural networks (DNNs) with boosting, for computer-assisted detection (CADe). It is hard and time-consuming to optimize the hyperparameters for DNNs such as the stacked denoising autoencoder (SdA). The proposed method allows SdA-based features to be used without the burden of hyperparameter tuning. The proposed method was evaluated in an application for detecting cerebral aneurysms on magnetic resonance angiograms (MRA). A baseline CADe process included four components: scaling, candidate area limitation, candidate detection, and candidate classification. The proposed feature generation method was applied to extract the optimal features for candidate classification. The proposed method only required setting the range of the SdA hyperparameters. The optimal feature set was selected from a large quantity of SdA-based features produced by multiple SdAs, each of which was trained using a different hyperparameter set. Feature selection was performed through the AdaBoost ensemble learning method. Training of the baseline CADe process and of the proposed feature generation was performed with 200 MRA cases, and the evaluation was performed with 100 MRA cases. The proposed method successfully provided SdA-based features by setting only the range of some SdA hyperparameters. The CADe process using both the previous voxel features and the SdA-based features had the best performance, with an area under the ROC curve of 0.838 and an ANODE score of 0.312. The results show that the proposed method is effective in the application of detecting cerebral aneurysms on MRA.

  14. Variability of textural features in FDG PET images due to different acquisition modes and reconstruction parameters.

    PubMed

    Galavis, Paulina E; Hollensen, Christian; Jallow, Ngoneh; Paliwal, Bhudatt; Jeraj, Robert

    2010-10-01

    Characterization of textural features (spatial distributions of image intensity levels) has been considered as a tool for automatic tumor segmentation. The purpose of this work is to study the variability of textural features in PET images due to different acquisition modes and reconstruction parameters. Twenty patients with solid tumors underwent PET/CT scans on a GE Discovery VCT scanner, 45-60 minutes post-injection of 10 mCi of [(18)F]FDG. Scans were acquired in both 2D and 3D modes. For each acquisition, the raw PET data were reconstructed using five different reconstruction parameters. Lesions were segmented on a default image using the threshold of 40% of maximum SUV. Fifty different texture features were calculated inside the tumors. The range of variation of each feature was calculated with respect to its average value. The fifty textural features were classified based on the range of variation into three categories: small, intermediate and large variability. Features with small variability (range ≤ 5%) were entropy-first order, energy, maximal correlation coefficient (second order feature) and low-gray level run emphasis (high-order feature). The features with intermediate variability (10% ≤ range ≤ 25%) were entropy-GLCM, sum entropy, high gray level run emphasis, gray level non-uniformity, small number emphasis, and entropy-NGL. The forty remaining features presented large variations (range > 30%). Textural features such as entropy-first order, energy, maximal correlation coefficient, and low-gray level run emphasis exhibited small variations due to different acquisition modes and reconstruction parameters. Features with low levels of variation are better candidates for reproducible tumor segmentation. Even though features such as contrast-NGTD, coarseness, homogeneity, and busyness have been previously used, our data indicated that these features presented large variations; therefore they cannot be considered good candidates for tumor segmentation.
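
    The range-of-variation analysis described above reduces to a few lines of numpy; the values below are synthetic stand-ins for one feature computed under several acquisition and reconstruction settings, and the 5%/25% cut-offs follow the abstract.

        import numpy as np

        rng = np.random.default_rng(6)
        # Stand-in: one texture feature computed for 20 lesions under 10 acquisition/reconstruction settings.
        values = rng.normal(loc=1.0, scale=0.03, size=(20, 10))

        # Range of variation with respect to the average value, then the small/intermediate/large grouping.
        spread = 100 * (values.max(axis=1) - values.min(axis=1)) / values.mean(axis=1)
        category = np.select([spread <= 5, spread <= 25], ["small", "intermediate"], default="large")
        print(list(zip(np.round(spread[:5], 1), category[:5])))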

  15. Variability of textural features in FDG PET images due to different acquisition modes and reconstruction parameters

    PubMed Central

    GALAVIS, PAULINA E.; HOLLENSEN, CHRISTIAN; JALLOW, NGONEH; PALIWAL, BHUDATT; JERAJ, ROBERT

    2014-01-01

    Background Characterization of textural features (spatial distributions of image intensity levels) has been considered as a tool for automatic tumor segmentation. The purpose of this work is to study the variability of textural features in PET images due to different acquisition modes and reconstruction parameters. Material and methods Twenty patients with solid tumors underwent PET/CT scans on a GE Discovery VCT scanner, 45–60 minutes post-injection of 10 mCi of [18F]FDG. Scans were acquired in both 2D and 3D modes. For each acquisition, the raw PET data were reconstructed using five different reconstruction parameters. Lesions were segmented on a default image using the threshold of 40% of maximum SUV. Fifty different texture features were calculated inside the tumors. The range of variation of each feature was calculated with respect to its average value. Results The fifty textural features were classified based on the range of variation into three categories: small, intermediate and large variability. Features with small variability (range ≤ 5%) were entropy-first order, energy, maximal correlation coefficient (second order feature) and low-gray level run emphasis (high-order feature). The features with intermediate variability (10% ≤ range ≤ 25%) were entropy-GLCM, sum entropy, high gray level run emphasis, gray level non-uniformity, small number emphasis, and entropy-NGL. The forty remaining features presented large variations (range > 30%). Conclusion Textural features such as entropy-first order, energy, maximal correlation coefficient, and low-gray level run emphasis exhibited small variations due to different acquisition modes and reconstruction parameters. Features with low levels of variation are better candidates for reproducible tumor segmentation. Even though features such as contrast-NGTD, coarseness, homogeneity, and busyness have been previously used, our data indicated that these features presented large variations; therefore they cannot be considered good candidates for tumor segmentation. PMID:20831489

  16. Working memory for visual features and conjunctions in schizophrenia.

    PubMed

    Gold, James M; Wilk, Christopher M; McMahon, Robert P; Buchanan, Robert W; Luck, Steven J

    2003-02-01

    The visual working memory (WM) storage capacity of patients with schizophrenia was investigated using a change detection paradigm. Participants were presented with 2, 3, 4, or 6 colored bars with testing of both single feature (color, orientation) and feature conjunction conditions. Patients performed significantly worse than controls at all set sizes but demonstrated normal feature binding. Unlike controls, patient WM capacity declined at set size 6 relative to set size 4. Impairments with subcapacity arrays suggest a deficit in task set maintenance: Greater impairment for supercapacity set sizes suggests a deficit in the ability to selectively encode information for WM storage. Thus, the WM impairment in schizophrenia appears to be a consequence of attentional deficits rather than a reduction in storage capacity.

  17. Visual homing with a pan-tilt based stereo camera

    NASA Astrophysics Data System (ADS)

    Nirmal, Paramesh; Lyons, Damian M.

    2013-01-01

    Visual homing is a navigation method based on comparing a stored image of the goal location and the current image (current view) to determine how to navigate to the goal location. It is theorized that insects, such as ants and bees, employ visual homing methods to return to their nest. Visual homing has been applied to autonomous robot platforms using two main approaches: holistic and feature-based. Both methods aim at determining the distance and direction to the goal location. Navigational algorithms using Scale Invariant Feature Transforms (SIFT) have gained great popularity in recent years due to the robustness of the feature operator. Churchill and Vardy have developed a visual homing method using scale change information (Homing in Scale Space, HiSS) from SIFT. HiSS uses SIFT feature scale change information to determine the distance between the robot and the goal location. Since the scale component is discrete with a small range of values, the result is a rough measurement with limited accuracy. We have developed a method that uses stereo data, resulting in better homing performance. Our approach utilizes a pan-tilt based stereo camera, which is used to build composite wide-field images. We use the wide-field images combined with stereo data obtained from the stereo camera to extend the keypoint vector described in that work to include a new parameter, depth (z). Using this information, our algorithm determines the distance and orientation from the robot to the goal location. We compare our method with HiSS in a set of indoor trials using a Pioneer 3-AT robot equipped with a BumbleBee2 stereo camera. We evaluate the performance of both methods using a set of performance measures described in this paper.

  18. Open-source sea ice drift algorithm for Sentinel-1 SAR imagery using a combination of feature-tracking and pattern-matching

    NASA Astrophysics Data System (ADS)

    Muckenhuber, Stefan; Sandven, Stein

    2017-04-01

    An open-source sea ice drift algorithm for Sentinel-1 SAR imagery is introduced based on the combination of feature-tracking and pattern-matching. A computational efficient feature-tracking algorithm produces an initial drift estimate and limits the search area for the pattern-matching, that provides small to medium scale drift adjustments and normalised cross correlation values as quality measure. The algorithm is designed to utilise the respective advantages of the two approaches and allows drift calculation at user defined locations. The pre-processing of the Sentinel-1 data has been optimised to retrieve a feature distribution that depends less on SAR backscatter peak values. A recommended parameter set for the algorithm has been found using a representative image pair over Fram Strait and 350 manually derived drift vectors as validation. Applying the algorithm with this parameter setting, sea ice drift retrieval with a vector spacing of 8 km on Sentinel-1 images covering 400 km x 400 km, takes less than 3.5 minutes on a standard 2.7 GHz processor with 8 GB memory. For validation, buoy GPS data, collected in 2015 between 15th January and 22nd April and covering an area from 81° N to 83.5° N and 12° E to 27° E, have been compared to calculated drift results from 261 corresponding Sentinel-1 image pairs. We found a logarithmic distribution of the error with a peak at 300 m. All software requirements necessary for applying the presented sea ice drift algorithm are open-source to ensure free implementation and easy distribution.
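
    The pattern-matching stage rests on normalised cross correlation between an image patch and candidate patches in the second image; a brute-force sketch on synthetic images is shown below (the actual algorithm restricts the search area using the feature-tracking estimate).

        import numpy as np

        def ncc(template, window):
            # Normalised cross correlation between two equally sized patches.
            t = template - template.mean()
            w = window - window.mean()
            return float((t * w).sum() / (np.linalg.norm(t) * np.linalg.norm(w) + 1e-12))

        rng = np.random.default_rng(7)
        img1 = rng.normal(size=(64, 64))
        img2 = np.roll(img1, shift=(3, 5), axis=(0, 1))     # second image shifted by a known drift

        # Slide the template over the second image and keep the offset with the best correlation.
        template = img1[16:32, 16:32]
        best, best_off = -2.0, (0, 0)
        for di in range(-8, 9):
            for dj in range(-8, 9):
                score = ncc(template, img2[16 + di:32 + di, 16 + dj:32 + dj])
                if score > best:
                    best, best_off = score, (di, dj)
        print("estimated drift (rows, cols):", best_off, "NCC:", round(best, 3))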

  19. General Approach for Rock Classification Based on Digital Image Analysis of Electrical Borehole Wall Images

    NASA Astrophysics Data System (ADS)

    Linek, M.; Jungmann, M.; Berlage, T.; Clauser, C.

    2005-12-01

    Within the Ocean Drilling Program (ODP), image logging tools such as the Formation MicroScanner (FMS) or the Resistivity-At-Bit (RAB) tool have been routinely deployed. Both logging methods are based on resistivity measurements at the borehole wall and therefore are sensitive to conductivity contrasts, which are mapped in color scale images. These images are commonly used to study the structure of the sedimentary rocks and the oceanic crust (petrologic fabric, fractures, veins, etc.). So far, mapping of lithology from electrical images has been based purely on visual inspection and subjective interpretation. We apply digital image analysis to electrical borehole wall images in order to develop a method which supports objective rock identification. We focus on supervised textural pattern recognition, which studies the spatial gray level distribution with respect to certain rock types. FMS image intervals of rock classes known from core data are taken in order to train the textural characteristics of each class. A so-called gray level co-occurrence matrix is computed by counting the occurrences of pairs of gray levels that are a certain distance apart. Once the matrix for an image interval is computed, we calculate the image contrast, homogeneity, energy, and entropy. We assign characteristic textural features to different rock types by reducing the image information to a small set of descriptive features. Once a discriminating set of texture features for each rock type is found, we are able to classify entire FMS images with respect to the trained rock types. A rock classification based on texture features enables quantitative lithology mapping and is characterized by high repeatability, in contrast to a purely visual, subjective image interpretation. We show examples of the rock classification between breccias, pillows, massive units, and horizontally bedded tuffs based on ODP image data.
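
    The co-occurrence features named above (contrast, homogeneity, energy, entropy) can be sketched with scikit-image; note that graycomatrix/graycoprops are spelled greycomatrix/greycoprops in older releases, and the image here is a random stand-in for an FMS interval.

        import numpy as np
        from skimage.feature import graycomatrix, graycoprops

        rng = np.random.default_rng(8)
        patch = rng.integers(0, 32, size=(64, 64), dtype=np.uint8)   # stand-in borehole-wall image interval

        # Gray level co-occurrence matrix for pixel pairs one step apart in four directions.
        glcm = graycomatrix(patch, distances=[1], angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                            levels=32, symmetric=True, normed=True)

        features = {prop: float(graycoprops(glcm, prop).mean()) for prop in ("contrast", "homogeneity", "energy")}
        p = glcm.mean(axis=(2, 3))                                   # average over distances and angles
        features["entropy"] = float(-(p[p > 0] * np.log2(p[p > 0])).sum())
        print(features)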

  20. Dose-volume histogram prediction using density estimation.

    PubMed

    Skarpman Munter, Johanna; Sjölund, Jens

    2015-09-07

    Knowledge of what dose-volume histograms can be expected for a previously unseen patient could increase consistency and quality in radiotherapy treatment planning. We propose a machine learning method that uses previous treatment plans to predict such dose-volume histograms. The key to the approach is the framing of dose-volume histograms in a probabilistic setting. The training consists of estimating, from the patients in the training set, the joint probability distribution of some predictive features and the dose. The joint distribution immediately provides an estimate of the conditional probability of the dose given the values of the predictive features. The prediction consists of estimating, from the new patient, the distribution of the predictive features and marginalizing the conditional probability from the training over this. Integrating the resulting probability distribution for the dose yields an estimate of the dose-volume histogram. To illustrate how the proposed method relates to previously proposed methods, we use the signed distance to the target boundary as a single predictive feature. As a proof-of-concept, we predicted dose-volume histograms for the brainstems of 22 acoustic schwannoma patients treated with stereotactic radiosurgery, and for the lungs of 9 lung cancer patients treated with stereotactic body radiation therapy. Comparing with two previous attempts at dose-volume histogram prediction we find that, given the same input data, the predictions are similar. In summary, we propose a method for dose-volume histogram prediction that exploits the intrinsic probabilistic properties of dose-volume histograms. We argue that the proposed method makes up for some deficiencies in previously proposed methods, thereby potentially increasing ease of use, flexibility and ability to perform well with small amounts of training data.
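
    A toy numpy version of this idea, with the signed distance to the target boundary as the single predictive feature; the dose model and bin counts are invented for illustration, and plain histograms stand in for the density estimation.

        import numpy as np

        rng = np.random.default_rng(9)
        # Training voxels from previous plans: signed distance to the target boundary and delivered dose.
        dist = rng.uniform(-2, 30, size=20000)
        dose = np.clip(60 * np.exp(-np.maximum(dist, 0) / 8) + rng.normal(0, 2, 20000), 0, None)

        # Joint histogram -> conditional probability of dose given the predictive feature.
        joint, d_edges, g_edges = np.histogram2d(dist, dose, bins=(40, 50))
        cond = joint / np.maximum(joint.sum(axis=1, keepdims=True), 1e-12)

        # New patient: marginalise the conditional distribution over that patient's feature distribution.
        new_dist = rng.uniform(-2, 25, size=5000)            # stand-in distances for one organ at risk
        w, _ = np.histogram(new_dist, bins=d_edges)
        dose_pdf = (w[:, None] * cond).sum(axis=0) / w.sum()

        # Integrate the predicted dose distribution to obtain the cumulative dose-volume histogram.
        dvh = 1 - np.cumsum(dose_pdf)
        centres = 0.5 * (g_edges[:-1] + g_edges[1:])
        print("predicted volume fraction receiving >= 20 Gy:", round(float(np.interp(20, centres, dvh)), 3))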

  1. 48 CFR 52.219-7 - Notice of Partial Small Business Set-Aside.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... Clauses 52.219-7 Notice of Partial Small Business Set-Aside. As prescribed in 19.508(d), insert the following clause: Notice of Partial Small Business Set-Aside (JUN 2003) (a) Definitions. Small business..., and qualified as a small business under the size standards in this solicitation. (b) General. (1) A...

  2. 48 CFR 52.219-7 - Notice of Partial Small Business Set-Aside.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... Clauses 52.219-7 Notice of Partial Small Business Set-Aside. As prescribed in 19.508(d), insert the following clause: Notice of Partial Small Business Set-Aside (JUN 2003) (a) Definitions. Small business..., and qualified as a small business under the size standards in this solicitation. (b) General. (1) A...

  3. 48 CFR 52.219-7 - Notice of Partial Small Business Set-Aside.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... Clauses 52.219-7 Notice of Partial Small Business Set-Aside. As prescribed in 19.508(d), insert the following clause: Notice of Partial Small Business Set-Aside (JUN 2003) (a) Definitions. Small business..., and qualified as a small business under the size standards in this solicitation. (b) General. (1) A...

  4. 48 CFR 52.219-6 - Notice of Total Small Business Set-Aside.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... Clauses 52.219-6 Notice of Total Small Business Set-Aside. As prescribed in 19.508(c), insert the following clause: Notice of Total Small Business Set-Aside (NOV 2011) (a) Definition. Small business concern... qualified as a small business under the size standards in this solicitation. (b) Applicability. This clause...

  5. 48 CFR 52.219-7 - Notice of Partial Small Business Set-Aside.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... Clauses 52.219-7 Notice of Partial Small Business Set-Aside. As prescribed in 19.508(d), insert the following clause: Notice of Partial Small Business Set-Aside (JUN 2003) (a) Definitions. Small business..., and qualified as a small business under the size standards in this solicitation. (b) General. (1) A...

  6. 48 CFR 52.219-7 - Notice of Partial Small Business Set-Aside.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... Clauses 52.219-7 Notice of Partial Small Business Set-Aside. As prescribed in 19.508(d), insert the following clause: Notice of Partial Small Business Set-Aside (JUN 2003) (a) Definitions. Small business..., and qualified as a small business under the size standards in this solicitation. (b) General. (1) A...

  7. 48 CFR 52.219-6 - Notice of Total Small Business Set-Aside.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... Clauses 52.219-6 Notice of Total Small Business Set-Aside. As prescribed in 19.508(c), insert the following clause: Notice of Total Small Business Set-Aside (NOV 2011) (a) Definition. Small business concern... qualified as a small business under the size standards in this solicitation. (b) Applicability. This clause...

  8. 48 CFR 52.219-6 - Notice of Total Small Business Set-Aside.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... Clauses 52.219-6 Notice of Total Small Business Set-Aside. As prescribed in 19.508(c), insert the following clause: Notice of Total Small Business Set-Aside (NOV 2011) (a) Definition. Small business concern... qualified as a small business under the size standards in this solicitation. (b) Applicability. This clause...

  9. Perceptual quality estimation of H.264/AVC videos using reduced-reference and no-reference models

    NASA Astrophysics Data System (ADS)

    Shahid, Muhammad; Pandremmenou, Katerina; Kondi, Lisimachos P.; Rossholm, Andreas; Lövström, Benny

    2016-09-01

    Reduced-reference (RR) and no-reference (NR) models for video quality estimation, using features that account for the impact of coding artifacts, spatio-temporal complexity, and packet losses, are proposed. The purpose of this study is to analyze a number of potentially quality-relevant features in order to select the most suitable set of features for building the desired models. The proposed sets of features have not been used in the literature and some of the features are used for the first time in this study. The features are employed by the least absolute shrinkage and selection operator (LASSO), which selects only the most influential of them toward perceptual quality. For comparison, we apply feature selection in the complete feature sets and ridge regression on the reduced sets. The models are validated using a database of H.264/AVC encoded videos that were subjectively assessed for quality in an ITU-T compliant laboratory. We infer that just two features selected by RR LASSO and two bitstream-based features selected by NR LASSO are able to estimate perceptual quality with high accuracy, higher than that of ridge, which uses more features. The comparisons with competing works and two full-reference metrics also verify the superiority of our models.
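
    The contrast between LASSO-based selection and ridge regression can be seen in a few lines of scikit-learn; the feature matrix and scores below are synthetic stand-ins for the quality-relevant features and subjective ratings.

        import numpy as np
        from sklearn.linear_model import LassoCV, Ridge

        rng = np.random.default_rng(10)
        X = rng.normal(size=(200, 30))                            # stand-in quality-relevant features per video
        mos = 2 * X[:, 0] - X[:, 3] + rng.normal(0, 0.3, 200)     # stand-in subjective quality scores

        # LASSO drives most coefficients to exactly zero, i.e. it selects a small feature subset.
        lasso = LassoCV(cv=5).fit(X, mos)
        print("features kept by LASSO:", np.flatnonzero(lasso.coef_))

        # Ridge shrinks coefficients but keeps every feature.
        ridge = Ridge(alpha=1.0).fit(X, mos)
        print("non-zero ridge coefficients:", int(np.sum(ridge.coef_ != 0)))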

  10. Evidence of tampering in watermark identification

    NASA Astrophysics Data System (ADS)

    McLauchlan, Lifford; Mehrübeoglu, Mehrübe

    2009-08-01

    In this work, watermarks are embedded in digital images in the discrete wavelet transform (DWT) domain. Principal component analysis (PCA) is performed on the DWT coefficients. Next, higher-order statistics based on the principal components and the eigenvalues are determined for different sets of images. Feature sets are analyzed for different types of attacks in m-dimensional space. The results demonstrate the separability of the features for the tampered digital copies. Different feature sets are studied to determine more effective tamper-evident feature sets. In digital forensics, the probable manipulation(s) or modification(s) performed on the digital information can be identified using the described technique.
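
    A sketch of the feature pipeline, assuming PyWavelets for the DWT: principal components of the detail coefficients followed by higher-order statistics (skewness, kurtosis) per image. The images are random stand-ins, and the wavelet, decomposition level and chosen statistics are illustrative assumptions.

        import numpy as np
        import pywt
        from scipy.stats import kurtosis, skew
        from sklearn.decomposition import PCA

        rng = np.random.default_rng(11)
        images = rng.normal(size=(50, 64, 64))               # stand-in (possibly tampered) watermarked images

        features = []
        for img in images:
            _, (cH, cV, cD) = pywt.dwt2(img, "haar")         # one-level 2D DWT detail sub-bands
            coeffs = np.stack([cH.ravel(), cV.ravel(), cD.ravel()], axis=1)
            pcs = PCA(n_components=2).fit_transform(coeffs)  # principal components of the DWT coefficients
            # Higher-order statistics of the principal components form the tamper-evidence feature vector.
            features.append([skew(pcs[:, 0]), kurtosis(pcs[:, 0]), skew(pcs[:, 1]), kurtosis(pcs[:, 1])])

        features = np.array(features)
        print("feature matrix shape:", features.shape)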

  11. An ensemble heterogeneous classification methodology for discovering health-related knowledge in social media messages.

    PubMed

    Tuarob, Suppawong; Tucker, Conrad S; Salathe, Marcel; Ram, Nilam

    2014-06-01

    The role of social media as a source of timely and massive information has become more apparent since the era of Web 2.0. Multiple studies have illustrated the use of information in social media to discover biomedical and health-related knowledge. Most methods proposed in the literature employ traditional document classification techniques that represent a document as a bag of words. These techniques work well when documents are rich in text and conform to standard English; however, they are not optimal for social media data, where sparsity and noise are the norm. This paper aims to address the limitations posed by the traditional bag-of-words based methods and proposes to use heterogeneous features in combination with ensemble machine learning techniques to discover health-related information, which could prove useful to multiple biomedical applications, especially those needing to discover health-related knowledge in large scale social media data. Furthermore, the proposed methodology could be generalized to discover different types of information in various kinds of textual data. Social media data are characterized by an abundance of short, social-oriented messages that do not conform to standard languages, both grammatically and syntactically. The problem of discovering health-related knowledge in social media data streams is then transformed into a text classification problem, where a text is identified as positive if it is health-related and negative otherwise. We first identify the limitations of the traditional methods, which train machines with N-gram word features, then propose to overcome such limitations by utilizing the collaboration of machine learning based classifiers, each of which is trained to learn a semantically different aspect of the data. The parameter analysis for tuning each classifier is also reported. Three data sets are used in this research. The first data set comprises approximately 5000 hand-labeled tweets, and is used for cross validation of the classification models in the small scale experiment, and for training the classifiers in the real-world large scale experiment. The second data set is a random sample of real-world Twitter data in the US. The third data set is a random sample of real-world Facebook Timeline posts. Two sets of evaluations are conducted to investigate the proposed model's ability to discover health-related information in the social media domain: small scale and large scale evaluations. The small scale evaluation employs 10-fold cross validation on the labeled data, and aims to tune parameters of the proposed models and to compare with the state-of-the-art method. The large scale evaluation tests the trained classification models on the native, real-world data sets, and is needed to verify the ability of the proposed model to handle the massive heterogeneity in real-world social media. The small scale experiment reveals that the proposed method is able to mitigate the limitations of the well-established techniques existing in the literature, resulting in a performance improvement of 18.61% (F-measure). The large scale experiment further reveals that the baseline fails to perform well on larger data with higher degrees of heterogeneity, while the proposed method is able to yield reasonably good performance and outperform the baseline by 46.62% (F-measure) on average. Copyright © 2014 Elsevier Inc. All rights reserved.
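
    One simple way to realize the heterogeneous-classifier idea is to give each learner its own view of the text and average their probabilities; the toy messages, the two views (word and character n-grams) and the soft-voting rule are illustrative assumptions, not the authors' classifier ensemble.

        import numpy as np
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.naive_bayes import MultinomialNB
        from sklearn.pipeline import make_pipeline

        texts = ["flu shots available at the clinic", "great game last night",
                 "feeling feverish and achy today", "new phone arrived"]
        labels = [1, 0, 1, 0]                                # 1 = health-related, 0 = other (toy data)

        # Two heterogeneous views of the same message, each with its own learner.
        word_clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                                 LogisticRegression(max_iter=1000)).fit(texts, labels)
        char_clf = make_pipeline(TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
                                 MultinomialNB()).fit(texts, labels)

        # Soft-voting ensemble: average the class probabilities of the heterogeneous classifiers.
        new = ["can't stop coughing, might see a doctor"]
        proba = (word_clf.predict_proba(new) + char_clf.predict_proba(new)) / 2
        print("P(health-related):", round(float(proba[0, 1]), 3))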

  12. Asteroid shape and spin statistics from convex models

    NASA Astrophysics Data System (ADS)

    Torppa, J.; Hentunen, V.-P.; Pääkkönen, P.; Kehusmaa, P.; Muinonen, K.

    2008-11-01

    We introduce techniques for characterizing convex shape models of asteroids with a small number of parameters, and apply these techniques to a set of 87 models from convex inversion. We present three different approaches for determining the overall dimensions of an asteroid. With the first technique, we measured the dimensions of the shapes in the direction of the rotation axis and in the equatorial plane and with the two other techniques, we derived the best-fit ellipsoid. We also computed the inertia matrix of the model shape to test how well it represents the target asteroid, i.e., to find indications of possible non-convex features or albedo variegation, which the convex shape model cannot reproduce. We used shape models for 87 asteroids to perform statistical analyses and to study dependencies between shape and rotation period, size, and taxonomic type. We detected correlations, but more data are required, especially on small and large objects, as well as slow and fast rotators, to reach a more thorough understanding about the dependencies. Results show, e.g., that convex models of asteroids are not that far from ellipsoids in root-mean-square sense, even though clearly irregular features are present. We also present new spin and shape solutions for Asteroids (31) Euphrosyne, (54) Alexandra, (79) Eurynome, (93) Minerva, (130) Elektra, (376) Geometria, (471) Papagena, and (776) Berbericia. We used a so-called semi-statistical approach to obtain a set of possible spin state solutions. The number of solutions depends on the abundancy of the data, which for Eurynome, Elektra, and Geometria was extensive enough for determining an unambiguous spin and shape solution. Data of Euphrosyne, on the other hand, provided a wide distribution of possible spin solutions, whereas the rest of the targets have two or three possible solutions.

  13. Mass wasting features in Juventae Chasma, Mars

    NASA Astrophysics Data System (ADS)

    Sarkar, Ranjan; Singh, Pragya; Porwal, Alok; Ganesh, Indujaa

    2016-07-01

    Introduction: We report mass-wasting features preserved as debris aprons from Juventae Chasma. Diverse lines of evidence and associated geomorphological features indicate that fluidized ice or water within the wall rocks of the chasma could be responsible for mobilizing the debris. Description: The distinctive features of the landslides in Juventae Chasma are: (1) lack of a well-defined crown or a clear-cut section at their point of origin and, instead, the presence of amphitheatre-headed tributary canyons; (2) absence of slump blocks; (3) overlapping of debris aprons; (4) a variety of surface textures, from fresh and grooved to degraded and chaotic; (5) rounded lobes of debris aprons; (6) a large variation in size, from small lumps (~0.52 m2) to large tongue-shaped ones (~80 m2); (7) a smaller average landslide size compared to other chasmas; and (8) occasional preservation of fresh surficial features indicating recent emplacement. Discussion: Amphitheatre-headed tributary canyons, which are formed by groundwater sapping, indicate that the same process was responsible for wall-section collapse, although a structural control cannot be completely ruled out. The emplacement of the mass wasting features preferentially at the mouths of amphitheatre-headed tributary canyons, along with the rounded flow fronts of the debris, suggests fluids may have played a vital role in their emplacement. The mass-wasting features in Juventae Chasma are unique compared to other landslides in Valles Marineris despite commonalities such as the radial furrows, fan-shaped outlines, overlapping aprons and overtopped obstacles. The unique set of features and close association with amphitheatre-headed tributary canyons imply that the trigger of the landslides was not structural or tectonic but possibly weakness imparted by the presence of water or ice in the pore spaces of the wall. Craters with fluidized ejecta blankets and scalloped depressions in the surrounding plateau also support this possibility. Depending on the amounts of fluids involved at the time of emplacement, these mass movements may also qualify as debris flows. The role of fluids in the Valles Marineris landslides is still debated; however, in the Juventae Chasma landslides we see unique features which set them apart from other landslides in Valles Marineris. Further study is required to fully investigate the mechanism of emplacement of these debris aprons.

  14. Classification of small lesions in dynamic breast MRI: Eliminating the need for precise lesion segmentation through spatio-temporal analysis of contrast enhancement over time.

    PubMed

    Nagarajan, Mahesh B; Huber, Markus B; Schlossbauer, Thomas; Leinsinger, Gerda; Krol, Andrzej; Wismüller, Axel

    2013-10-01

    Characterizing breast lesions as benign or malignant is especially difficult for small lesions; they do not exhibit typical characteristics of malignancy and are harder to segment since their margins are harder to visualize. Previous attempts at using dynamic or morphologic criteria to classify small lesions (mean lesion diameter of about 1 cm) have not yielded satisfactory results. The goal of this work was to improve classification performance in such small, diagnostically challenging lesions while concurrently eliminating the need for precise lesion segmentation. To this end, we introduce a method for topological characterization of lesion enhancement patterns over time. Three Minkowski Functionals were extracted from all five post-contrast images of sixty annotated lesions on dynamic breast MRI exams. For each Minkowski Functional, topological features extracted from each post-contrast image of the lesions were combined into a high-dimensional texture feature vector. These feature vectors were classified in a machine learning task with support vector regression. For comparison, conventional Haralick texture features derived from gray-level co-occurrence matrices (GLCM) were also used. A new method for extracting thresholded GLCM features was also introduced and investigated here. The best classification performance was observed with the Minkowski Functionals area and perimeter, thresholded GLCM features f8 and f9, and conventional GLCM features f4 and f6. However, both Minkowski Functionals and thresholded GLCM achieved such results without lesion segmentation, while the performance of GLCM features significantly deteriorated when lesions were not segmented (p < 0.05). This suggests that such advanced spatio-temporal characterization can improve the classification performance achieved in such small lesions, while simultaneously eliminating the need for precise segmentation.
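
    For a binary enhancement pattern, the three 2D Minkowski Functionals (area, perimeter, Euler characteristic) can be computed with scikit-image; the thresholded map below is a random stand-in, and measure.euler_number assumes a reasonably recent scikit-image release.

        import numpy as np
        from skimage import measure

        rng = np.random.default_rng(12)
        # Stand-in lesion enhancement map thresholded into a binary pattern (one post-contrast time point).
        enhancement = rng.normal(size=(32, 32))
        binary = enhancement > 0.5

        # The three 2D Minkowski Functionals: area, perimeter and Euler characteristic.
        area = int(binary.sum())
        perimeter = float(measure.perimeter(binary))
        euler = int(measure.euler_number(binary))
        print("area:", area, "perimeter:", round(perimeter, 1), "Euler characteristic:", euler)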

  15. A dimension reduction strategy for improving the efficiency of computer-aided detection for CT colonography

    NASA Astrophysics Data System (ADS)

    Song, Bowen; Zhang, Guopeng; Wang, Huafeng; Zhu, Wei; Liang, Zhengrong

    2013-02-01

    Various types of features, e.g., geometric features, texture features, projection features, etc., have been introduced for polyp detection and differentiation tasks via computer aided detection and diagnosis (CAD) for computed tomography colonography (CTC). Although these features together cover more information of the data, some of them are statistically highly related to others, which makes the feature set redundant and burdens the computation task of CAD. In this paper, we propose a new dimension reduction method which combines hierarchical clustering and principal component analysis (PCA) for the false positive (FP) reduction task. First, we group all the features based on their similarity using hierarchical clustering, and then PCA is employed within each group. Different numbers of principal components are selected from each group to form the final feature set. A support vector machine is used to perform the classification. The results show that when three principal components are chosen from each group we can achieve an area under the receiver operating characteristic curve of 0.905, which is as high as that of the original feature set. Meanwhile, the computation time is reduced by 70% and the feature set size is reduced by 77%. It can be concluded that the proposed method captures the most important information of the feature set and the classification accuracy is not affected after the dimension reduction. The result is promising, and further investigations, such as automatic threshold setting, are worthwhile and in progress.
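
    A compact sketch of the grouping-then-PCA idea with scipy and scikit-learn: features are clustered by correlation, and a few principal components are kept per cluster. The feature matrix, distance threshold and number of components per group are illustrative assumptions.

        import numpy as np
        from scipy.cluster.hierarchy import fcluster, linkage
        from sklearn.decomposition import PCA

        rng = np.random.default_rng(13)
        X = rng.normal(size=(300, 60))                        # stand-in polyp-candidate feature matrix

        # Group statistically related features by hierarchical clustering on feature correlation.
        corr = np.corrcoef(X, rowvar=False)
        dist = 1 - np.abs(corr)
        Z = linkage(dist[np.triu_indices_from(dist, k=1)], method="average")
        groups = fcluster(Z, t=0.7, criterion="distance")

        # Keep a few principal components from each feature group and concatenate them.
        reduced = []
        for g in np.unique(groups):
            cols = X[:, groups == g]
            n_comp = min(3, cols.shape[1])
            reduced.append(PCA(n_components=n_comp).fit_transform(cols))
        X_reduced = np.hstack(reduced)
        print("original dimensionality:", X.shape[1], "->", X_reduced.shape[1])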

  16. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bocci, Valerio; Chiodi, Giacomo; Iacoangeli, Francesco

    The necessity of using photomultiplier tubes (PMs) as light detectors has in the past limited the use of crystals in handheld radiation devices, favoring the Geiger approach. Silicon photomultipliers (SiPMs) are very small, inexpensive, solid-state photon detectors with good dynamic range and single-photon detection capability, and they can supersede cumbersome and difficult-to-use PMs. A SiPM can be coupled with a scintillator crystal to build an efficient, small, and solid radiation detector. A cost-effective and easily replicable hardware/software module for SiPM detector readout is made using the ArduSiPM solution. The ArduSiPM is an easily battery-operable handheld device using an Arduino DUE (an open software/hardware board) as the processor board and a piggy-back custom-designed board (the ArduSiPM Shield); the Shield contains all the building blocks needed to monitor, configure, and read out the SiPM over the network. (authors)

  17. Analysis of condensation on a horizontal cylinder with unknown wall temperature and comparison with the Nusselt model of film condensation

    NASA Technical Reports Server (NTRS)

    Bahrami, Parviz A.

    1996-01-01

    Theoretical analysis and numerical computations are performed to set forth a new model of film condensation on a horizontal cylinder. The model is more general than the well-known Nusselt model of film condensation and is designed to encompass all essential features of the Nusselt model. It is shown that a single parameter, constructed explicitly and without specification of the cylinder wall temperature, determines the degree of departure from the Nusselt model, which assumes a known and uniform wall temperature. It is also shown that the Nusselt model is reached for very small, as well as very large, values of this parameter. In both limiting cases the cylinder wall temperature assumes a uniform distribution and the Nusselt model is approached. The maximum deviations between the two models are rather small for cases which are representative of cylinder dimensions, materials and conditions encountered in practice.

  18. Low conductive support for thermal insulation of a sample holder of a variable temperature scanning tunneling microscope

    NASA Astrophysics Data System (ADS)

    Hanzelka, Pavel; Vonka, Jakub; Musilova, Vera

    2013-08-01

    We have designed a supporting system to fix a sample holder of a scanning tunneling microscope in an UHV chamber at room temperature. The microscope will operate down to a temperature of 20 K. Low thermal conductance, high mechanical stiffness, and small dimensions are the main features of the supporting system. Three sets of four glass balls placed in vertices of a tetrahedron are used for thermal insulation based on small contact areas between the glass balls. We have analyzed the thermal conductivity of the contacts between the balls mutually and between a ball and a metallic plate while the results have been applied to the entire support. The calculation based on a simple model of the setup has been verified with some experimental measurements. In comparison with other feasible supporting structures, the designed support has the lowest thermal conductance.

  19. Low conductive support for thermal insulation of a sample holder of a variable temperature scanning tunneling microscope.

    PubMed

    Hanzelka, Pavel; Vonka, Jakub; Musilova, Vera

    2013-08-01

    We have designed a supporting system to fix a sample holder of a scanning tunneling microscope in an UHV chamber at room temperature. The microscope will operate down to a temperature of 20 K. Low thermal conductance, high mechanical stiffness, and small dimensions are the main features of the supporting system. Three sets of four glass balls placed in vertices of a tetrahedron are used for thermal insulation based on small contact areas between the glass balls. We have analyzed the thermal conductivity of the contacts between the balls mutually and between a ball and a metallic plate while the results have been applied to the entire support. The calculation based on a simple model of the setup has been verified with some experimental measurements. In comparison with other feasible supporting structures, the designed support has the lowest thermal conductance.

  20. Beyond Fine Tuning: Adding capacity to leverage few labels

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hodas, Nathan O.; Shaffer, Kyle J.; Yankov, Artem

    2017-12-09

    In this paper we present a technique to train neural network models on small amounts of data. Current methods for training neural networks on small amounts of rich data typically rely on strategies such as fine-tuning a pre-trained neural network or the use of domain-specific hand-engineered features. Here we take the approach of treating network layers, or entire networks, as modules and combine pre-trained modules with untrained modules to learn the shift in distributions between data sets. The central impact of using a modular approach comes from adding new representations to a network, as opposed to replacing representations via fine-tuning. Using this technique, we are able to surpass results obtained with standard fine-tuning transfer learning approaches, and we are also able to significantly increase performance over such approaches when using smaller amounts of data.
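
    The add-capacity idea can be sketched in PyTorch as a frozen pre-trained module with a new trainable module in parallel, their outputs concatenated before a small head; the module sizes and toy batch are invented for illustration and do not reproduce the authors' architectures.

        import torch
        import torch.nn as nn

        # Frozen pre-trained module (stand-in) providing the existing representations.
        pretrained = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
        for p in pretrained.parameters():
            p.requires_grad = False

        # New, untrained module added in parallel to learn the shift between data sets.
        new_module = nn.Sequential(nn.Linear(32, 16), nn.ReLU())
        head = nn.Linear(32, 2)                       # classifier over the concatenated representations

        x = torch.randn(8, 32)                        # stand-in batch from the small target data set
        features = torch.cat([pretrained(x), new_module(x)], dim=1)
        loss = nn.functional.cross_entropy(head(features), torch.randint(0, 2, (8,)))

        # Only the added capacity and the head are optimised; the pre-trained representations are kept.
        optimiser = torch.optim.Adam(list(new_module.parameters()) + list(head.parameters()), lr=1e-3)
        loss.backward()
        optimiser.step()
        print("loss:", float(loss))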

  1. On the spatial distribution of small heavy particles in homogeneous shear turbulence

    NASA Astrophysics Data System (ADS)

    Nicolai, C.; Jacob, B.; Piva, R.

    2013-08-01

    We report on a novel experiment aimed at investigating the effects induced by a large-scale velocity gradient on the turbulent transport of small heavy particles. To this purpose, a homogeneous shear flow at Reλ = 540 and shear parameter S* = 4.5 is set up and laden with glass spheres whose size d is comparable with the Kolmogorov lengthscale η of the flow (d/η ≈ 1). The particle Stokes number is approximately 0.3. The analysis of the instantaneous particle fields by means of Voronoï diagrams confirms the occurrence of intense turbulent clustering at small scales, as observed in homogeneous isotropic flows. It also indicates that the anisotropy of the velocity fluctuations induces a preferential orientation of the particle clusters. In order to characterize the fine-scale features of the dispersed phase, spatial correlations of the particle field are employed in conjunction with statistical tools recently developed for anisotropic turbulence. The scale-by-scale analysis of the particle field clarifies that isotropy of the particle distribution tends to be recovered at small separations, even though the signatures of the mean shear persist down to smaller scales than in the fluid velocity field.

  2. The core contribution of transmission electron microscopy to functional nanomaterials engineering

    NASA Astrophysics Data System (ADS)

    Carenco, Sophie; Moldovan, Simona; Roiban, Lucian; Florea, Ileana; Portehault, David; Vallé, Karine; Belleville, Philippe; Boissière, Cédric; Rozes, Laurence; Mézailles, Nicolas; Drillon, Marc; Sanchez, Clément; Ersen, Ovidiu

    2016-01-01

    Research on nanomaterials and nanostructured materials is burgeoning because their numerous and versatile applications contribute to solve societal needs in the domain of medicine, energy, environment and STICs. Optimizing their properties requires in-depth analysis of their structural, morphological and chemical features at the nanoscale. In a transmission electron microscope (TEM), combining tomography with electron energy loss spectroscopy and high-magnification imaging in high-angle annular dark-field mode provides access to all features of the same object. Today, TEM experiments in three dimensions are paramount to solve tough structural problems associated with nanoscale matter. This approach allowed a thorough morphological description of silica fibers. Moreover, quantitative analysis of the mesoporous network of binary metal oxide prepared by template-assisted spray-drying was performed, and the homogeneity of amino functionalized metal-organic frameworks was assessed. Besides, the morphology and internal structure of metal phosphide nanoparticles was deciphered, providing a milestone for understanding phase segregation at the nanoscale. By extrapolating to larger classes of materials, from soft matter to hard metals and/or ceramics, this approach allows probing small volumes and uncovering materials characteristics and properties at two or three dimensions. Altogether, this feature article aims at providing (nano)materials scientists with a representative set of examples that illustrates the capabilities of modern TEM and tomography, which can be transposed to their own research. Electronic supplementary information (ESI) available. See DOI: 10.1039/c5nr05460e

  3. 48 CFR 52.219-6 - Notice of Total Small Business Set-Aside.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... Clauses 52.219-6 Notice of Total Small Business Set-Aside. As prescribed in 19.508(c), insert the following clause: Notice of Total Small Business Set-Aside (JUN 2003) (a) Definition. Small business concern... qualified as a small business under the size standards in this solicitation. (b) General. (1) Offers are...

  4. 48 CFR 52.219-6 - Notice of Total Small Business Set-Aside.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... Clauses 52.219-6 Notice of Total Small Business Set-Aside. As prescribed in 19.508(c), insert the following clause: Notice of Total Small Business Set-Aside (JUN 2003) (a) Definition. Small business concern... qualified as a small business under the size standards in this solicitation. (b) General. (1) Offers are...

  5. McTwo: a two-step feature selection algorithm based on maximal information coefficient.

    PubMed

    Ge, Ruiquan; Zhou, Manli; Luo, Youxi; Meng, Qinghan; Mai, Guoqin; Ma, Dongli; Wang, Guoqing; Zhou, Fengfeng

    2016-03-23

    High-throughput bio-OMIC technologies are producing high-dimensional data from bio-samples at an ever increasing rate, whereas the training sample number in a traditional experiment remains small due to various difficulties. This "large p, small n" paradigm in the area of biomedical "big data" may be at least partly addressed by feature selection algorithms, which select only features significantly associated with phenotypes. Feature selection is an NP-hard problem. Due to the exponentially increasing time required to find the globally optimal solution, all existing feature selection algorithms employ heuristic rules to find locally optimal solutions, and their solutions achieve different performances on different datasets. This work describes a feature selection algorithm based on a recently published correlation measure, the Maximal Information Coefficient (MIC). The proposed algorithm, McTwo, aims to select features that are associated with phenotypes and independent of each other, while achieving high classification performance with the nearest neighbor algorithm. Based on a comparative study of 17 datasets, McTwo performs about as well as or better than existing algorithms, with significantly reduced numbers of selected features. The features selected by McTwo also appear to have particular biomedical relevance to the phenotypes according to the literature. McTwo selects a feature subset with very good classification performance as well as a small feature number, and may therefore represent a complementary feature selection algorithm for high-dimensional biomedical datasets.
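
    A minimal sketch in the spirit of the two-step procedure described above: a relevance filter followed by a 1-NN wrapper. It uses scikit-learn's mutual information as a stand-in for the Maximal Information Coefficient, and the candidate counts and stopping rule are illustrative; this is not the published McTwo implementation.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def mctwo_like_selection(X, y, n_candidates=50, max_features=10):
    """Two-step selection: (1) rank features by relevance to the phenotype,
    (2) greedily keep features that improve a 1-NN classifier."""
    # Step 1: filter by association with the class labels (stand-in for MIC).
    relevance = mutual_info_classif(X, y, random_state=0)
    candidates = np.argsort(relevance)[::-1][:n_candidates]

    # Step 2: wrapper around 1-NN, adding a candidate only if accuracy improves.
    selected, best_acc = [], 0.0
    for f in candidates:
        trial = selected + [f]
        acc = cross_val_score(KNeighborsClassifier(n_neighbors=1),
                              X[:, trial], y, cv=5).mean()
        if acc > best_acc:
            selected, best_acc = trial, acc
        if len(selected) >= max_features:
            break
    return selected, best_acc
```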

  6. Feature selection gait-based gender classification under different circumstances

    NASA Astrophysics Data System (ADS)

    Sabir, Azhin; Al-Jawad, Naseer; Jassim, Sabah

    2014-05-01

    This paper proposes a gender classification method based on human gait features and investigates the problem under two variations in addition to the normal gait sequence: clothing (wearing coats) and carrying a bag. The feature vectors in the proposed system are constructed after applying the wavelet transform. Three different sets of features are proposed in this method. The first is the spatio-temporal distance, dealing with the distances between different parts of the human body (such as the feet, knees, hands, shoulders, and overall height) during one gait cycle. The second and third feature sets are constructed from the approximation and non-approximation coefficients of the human body, respectively. To extract these two feature sets we divided the human body into two parts, the upper and lower body, based on the golden ratio proportion. In this paper, we have adopted a statistical method for constructing the feature vector from the above sets. The dimension of the constructed feature vector is reduced using the Fisher score as a feature selection method to optimize its discriminating significance. Finally, k-nearest neighbor is applied as the classification method. Experimental results demonstrate that our approach provides a more realistic scenario and relatively better performance compared with existing approaches.
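
    The Fisher-score ranking followed by k-NN classification described above is straightforward to sketch. The feature matrix is assumed to hold the wavelet-derived gait features, and the numbers of kept features and neighbors are illustrative choices.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def fisher_score(X, y):
    """Fisher score per feature: between-class scatter of the class means
    divided by the summed within-class variances."""
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        num += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        den += len(Xc) * Xc.var(axis=0)
    return num / (den + 1e-12)

def select_and_classify(X_train, y_train, X_test, k_features=20, k_neighbors=3):
    """Keep the top-k features by Fisher score and classify with k-NN."""
    top = np.argsort(fisher_score(X_train, y_train))[::-1][:k_features]
    clf = KNeighborsClassifier(n_neighbors=k_neighbors)
    clf.fit(X_train[:, top], y_train)
    return clf.predict(X_test[:, top]), top
```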

  7. Modeling Stokes flow in real pore geometries derived by high resolution micro CT imaging

    NASA Astrophysics Data System (ADS)

    Halisch, M.; Müller, C.

    2012-04-01

    Numerical modeling of rock properties now forms an important part of modern petrophysics. Essentially, equivalent rock models are used to describe and assess specific properties and phenomena, such as fluid transport or complex electrical properties. In recent years, non-destructive X-ray computed tomography has become increasingly important, not only for taking a quick, three-dimensional look into rock samples but also for accessing in-situ sample information for highly accurate modeling. Owing to the very high resolution of modern 3D CT data sets (micron to submicron range), even very small structures and sample features, e.g. microporosity, can be visualized and used for numerical models of very high accuracy. Particular care is required before numerical modeling can take place. Inappropriate filtering (e.g. an improper filter type or a wrong kernel) may significantly corrupt the spatial sample structure and/or the sample or void-space volume. Because of these difficulties, small-scale mineral and pore-space textures are often lost and valuable in-situ information is erased. Segmentation of important sample features, porosity as well as rock matrix, based upon grayscale values depends strongly on the scan quality and on the experience of the application engineer. If the threshold for matrix-porosity separation is set too low, porosity is quickly underestimated (even more so because of the restrictions of scanning resolution). Conversely, a threshold set too high overestimates porosity, and small void-space features as well as interfaces are changed and falsified. Image-based phase separation in close combination with "conventional" analytics, such as scanning electron microscopy or thin sectioning, greatly increases the reliability of this preliminary work. For segmentation and quantification purposes, a dedicated CT imaging and processing software (Avizo Fire) has been used. With this tool, 3D rock data can be assessed and interpreted in petrophysical terms. Furthermore, pore structures can be segmented directly and hence used for a so-called image-based modeling approach. The dedicated XLabHydro module provides a finite-volume solver for the direct assessment of Stokes flow (incompressible fluid, constant dynamic viscosity, stationary conditions and laminar flow) in real pore geometries. Pore-network extraction and numerical modeling with standard FE or lattice Boltzmann solvers are also possible. By using the achieved voxel resolution as the smallest node distance, fluid flow properties can be analyzed even in very small sample structures, and hence with very high accuracy, especially in their interaction with larger parts of the pore network. The derived results, combined with direct 3D visualization within the structures, offer new insights into and understanding of meso- and microscopic pore-space phenomena.

  8. Methylation guide RNA evolution in archaea: structure, function and genomic organization of 110 C/D box sRNA families across six Pyrobaculum species.

    PubMed

    Lui, Lauren M; Uzilov, Andrew V; Bernick, David L; Corredor, Andrea; Lowe, Todd M; Dennis, Patrick P

    2018-05-16

    Archaeal homologs of eukaryotic C/D box small nucleolar RNAs (C/D box sRNAs) guide precise 2'-O-methyl modification of ribosomal and transfer RNAs. Although C/D box sRNA genes constitute one of the largest RNA gene families in archaeal thermophiles, most genomes have incomplete sRNA gene annotation because reliable, fully automated detection methods are not available. We expanded and curated a comprehensive gene set across six species of the crenarchaeal genus Pyrobaculum, particularly rich in C/D box sRNA genes. Using high-throughput small RNA sequencing, specialized computational searches and comparative genomics, we analyzed 526 Pyrobaculum C/D box sRNAs, organizing them into 110 families based on synteny and conservation of guide sequences which determine methylation targets. We examined gene duplications and rearrangements, including one family that has expanded in a pattern similar to retrotransposed repetitive elements in eukaryotes. New training data and inclusion of kink-turn secondary structural features enabled creation of an improved search model. Our analyses provide the most comprehensive, dynamic view of C/D box sRNA evolutionary history within a genus, in terms of modification function, feature plasticity, and gene mobility.

  9. Using ground penetrating radar in levee assessment to detect small scale animal burrows

    NASA Astrophysics Data System (ADS)

    Chlaib, Hussein K.; Mahdi, Hanan; Al-Shukri, Haydar; Su, Mehmet M.; Catakli, Aycan; Abd, Najah

    2014-04-01

    Levees are civil engineering structures built to protect human lives, property, and agricultural lands during flood events. To keep these important structures in a safe condition, continuous monitoring must be performed regularly and thoroughly. Small rodent burrows are one of the major defects within levees, and their early detection and repair help protect levees during flood events. A set of laboratory experiments was conducted to analyze the polarity change in GPR signals in the presence of subsurface voids and water-filled cavities. Ground Penetrating Radar (GPR) surveys using multi-frequency antennas (400 MHz and 900 MHz) were conducted along an 875 meter section of the Lollie Levee near Conway, Arkansas, USA, to assess the levee's structural integrity. Many subsurface animal burrows, water-filled cavities, clay clasts, and metallic objects were investigated and identified. These anomalies were located at different depths and have different sizes. To ground-truth the observations, hand-dug trenches were excavated to confirm several anomalies. Results show an excellent match between GPR-interpreted anomalies and the observed features. In-situ dielectric constant measurements were used to calculate the feature depths. The results of this research show that the 900 MHz antenna has advantages over the 400 MHz antenna.

  10. Relational particle models: I. Reconciliation with standard classical and quantum theory

    NASA Astrophysics Data System (ADS)

    Anderson, Edward

    2006-04-01

    This paper concerns the absolute versus relative motion debate. The Barbour and Bertotti (1982) work may be viewed as an indirectly set up relational formulation of a portion of Newtonian mechanics. I consider further direct formulations of this and argue that the portion in question—universes with zero total angular momentum that are conservative and with kinetic terms that are (homogeneous) quadratic in their velocities—is capable of accommodating a wide range of classical physics phenomena. Furthermore, as I develop in paper II, this relational particle model is a useful toy model for canonical general relativity. I consider what happens if one quantizes relational rather than absolute mechanics, indeed whether the latter is misleading. By exploiting Jacobi coordinates, I show how to access many examples of quantized relational particle models and then interpret these from a relational perspective. By these means, previous suggestions of bad semiclassicality for such models can be eluded. I show how small (particle number) universe relational particle model examples display eigenspectrum truncation, gaps, energy interlocking and counterbalanced total angular momentum. These features mean that these small universe models make interesting toy models for some aspects of closed-universe quantum cosmology. Meanwhile, these features do not compromise the recovery of reality as regards the practicalities of experimentation in a large universe such as our own.

  11. Nodal bilayer-splitting controlled by spin-orbit interactions in underdoped high-T c cuprates

    DOE PAGES

    Harrison, N.; Ramshaw, B. J.; Shekhter, A.

    2015-06-03

    The highest superconducting transition temperatures in the cuprates are achieved in bilayer and trilayer systems, highlighting the importance of interlayer interactions for high T c. It has been argued that interlayer hybridization vanishes along the nodal directions by way of a specific pattern of orbital overlap. Recent quantum oscillation measurements in bilayer cuprates have provided evidence for a residual bilayer-splitting at the nodes that is sufficiently small to enable magnetic breakdown tunneling at the nodes. Here we show that several key features of the experimental data can be understood in terms of weak spin-orbit interactions naturally present in bilayer systems, whose primary effect is to cause the magnetic breakdown to be accompanied by a spin flip. These features can now be understood to include the equidistant set of three quantum oscillation frequencies, the asymmetry of the quantum oscillation amplitudes in c-axis transport compared to ab-plane transport, and the anomalous magnetic field angle dependence of the amplitude of the side frequencies suggestive of small effective g-factors. We suggest that spin-orbit interactions in bilayer systems can further affect the structure of the nodal quasiparticle spectrum in the superconducting phase.

  12. Ecomorphology of the eyes and skull in zooplanktivorous labrid fishes

    NASA Astrophysics Data System (ADS)

    Schmitz, L.; Wainwright, P. C.

    2011-06-01

    Zooplanktivory is one of the most distinct trophic niches in coral reef fishes, and a number of skull traits are widely recognized as being adaptations for feeding in midwater on small planktonic prey. Previous studies have concluded that zooplanktivores have larger eyes for sharper visual acuity, reduced mouth structures to match small prey sizes, and longer gill rakers to help retain captured prey. We tested these three traditional hypotheses plus two novel adaptive hypotheses in labrids, a clade of very diverse coral reef fishes that show multiple independent evolutionary origins of zooplanktivory. Using phylogenetic comparative methods with a data set from 21 species, we failed to find larger eyes in three independent transitions to zooplanktivory. Instead, an impression of large eyes may be caused by a size reduction of the anterior facial region. However, two zooplanktivores ( Clepticus parrae and Halichoeres pictus) possess several features interpreted as adaptations to zooplankton feeding, namely large lens diameters relative to eye axial length, round pupil shape, and long gill rakers. The third zooplanktivore in our analysis, Cirrhilabrus solorensis, lacks all above features. It remains unclear whether Cirrhilabrus shows optical specializations for capturing planktonic prey. Our results support the prediction that increased visual acuity is adaptive for zooplanktivory, but in labrids increases in eye size are apparently not part of the evolutionary response.

  13. Automatic extraction of planetary image features

    NASA Technical Reports Server (NTRS)

    LeMoigne-Stewart, Jacqueline J. (Inventor); Troglio, Giulia (Inventor); Benediktsson, Jon A. (Inventor); Serpico, Sebastiano B. (Inventor); Moser, Gabriele (Inventor)

    2013-01-01

    A method for the extraction of Lunar data and/or planetary features is provided. The feature extraction method can include one or more image processing techniques, including, but not limited to, a watershed segmentation and/or the generalized Hough Transform. According to some embodiments, the feature extraction method can include extracting features such as small rocks. According to some embodiments, small rocks can be extracted by applying a watershed segmentation algorithm to the Canny gradient. According to some embodiments, applying a watershed segmentation algorithm to the Canny gradient can allow regions that appear as closed contours in the gradient to be segmented.
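
    A minimal scikit-image sketch of the idea of applying a watershed transform to an edge-magnitude image to pull out small closed-contour regions. The original applies the watershed to a Canny gradient; here a Sobel gradient magnitude and percentile-based seed markers stand in as illustrative choices, so this is not the patented pipeline.

```python
import numpy as np
from skimage import filters, segmentation, measure
from scipy import ndimage as ndi

def extract_small_features(image, max_area=200):
    """Segment small closed-contour regions (e.g. small rocks) by applying
    a watershed transform to the image gradient."""
    gradient = filters.sobel(image)                  # edge-magnitude "relief"
    # Markers: connected low-gradient regions, labelled as seeds.
    markers_mask = gradient < np.percentile(gradient, 20)
    markers, _ = ndi.label(markers_mask)
    labels = segmentation.watershed(gradient, markers)
    # Keep only small regions, which correspond to small surface features.
    regions = [r for r in measure.regionprops(labels) if r.area < max_area]
    return labels, regions
```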

  14. Rough sets and Laplacian score based cost-sensitive feature selection

    PubMed Central

    Yu, Shenglong

    2018-01-01

    Cost-sensitive feature selection learning is an important preprocessing step in machine learning and data mining. Most existing cost-sensitive feature selection algorithms are heuristic algorithms, which evaluate the importance of each feature individually and select features one by one. Obviously, these algorithms do not consider the relationships among features. In this paper, we propose a new algorithm for minimal-cost feature selection called rough sets and Laplacian score based cost-sensitive feature selection. The importance of each feature is evaluated by both rough sets and the Laplacian score. Compared with heuristic algorithms, the proposed algorithm takes the relationships among features into consideration through the locality preservation of the Laplacian score. We select a feature subset with maximal feature importance and minimal cost when cost is incurred in parallel, where the cost is given by three different distributions to simulate different applications. Unlike existing cost-sensitive feature selection algorithms, our algorithm simultaneously selects a predetermined number of "good" features. Extensive experimental results show that the approach is efficient and able to effectively obtain the minimum-cost subset. In addition, the results of our method are more promising than those of other cost-sensitive feature selection algorithms. PMID:29912884

  15. Rough sets and Laplacian score based cost-sensitive feature selection.

    PubMed

    Yu, Shenglong; Zhao, Hong

    2018-01-01

    Cost-sensitive feature selection learning is an important preprocessing step in machine learning and data mining. Most existing cost-sensitive feature selection algorithms are heuristic algorithms, which evaluate the importance of each feature individually and select features one by one. Obviously, these algorithms do not consider the relationships among features. In this paper, we propose a new algorithm for minimal-cost feature selection called rough sets and Laplacian score based cost-sensitive feature selection. The importance of each feature is evaluated by both rough sets and the Laplacian score. Compared with heuristic algorithms, the proposed algorithm takes the relationships among features into consideration through the locality preservation of the Laplacian score. We select a feature subset with maximal feature importance and minimal cost when cost is incurred in parallel, where the cost is given by three different distributions to simulate different applications. Unlike existing cost-sensitive feature selection algorithms, our algorithm simultaneously selects a predetermined number of "good" features. Extensive experimental results show that the approach is efficient and able to effectively obtain the minimum-cost subset. In addition, the results of our method are more promising than those of other cost-sensitive feature selection algorithms.
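
    The Laplacian score component of the method above can be sketched with numpy and scikit-learn, following He, Cai and Niyogi's formulation in which smaller scores indicate better locality preservation. The rough-set importance term and the cost model are not reproduced, and the neighborhood size and kernel bandwidth are illustrative.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def laplacian_score(X, n_neighbors=5, t=1.0):
    """Laplacian score per feature (smaller = better locality preservation)."""
    n = X.shape[0]
    # Heat-kernel affinity on a symmetrized kNN graph.
    dist = kneighbors_graph(X, n_neighbors, mode="distance").toarray()
    W = np.exp(-(dist ** 2) / t) * (dist > 0)
    W = np.maximum(W, W.T)
    D = np.diag(W.sum(axis=1))
    L = D - W
    ones = np.ones(n)
    scores = np.empty(X.shape[1])
    for r in range(X.shape[1]):
        f = X[:, r]
        # Center the feature with respect to the degree-weighted mean.
        f_tilde = f - (f @ D @ ones) / (ones @ D @ ones) * ones
        scores[r] = (f_tilde @ L @ f_tilde) / (f_tilde @ D @ f_tilde + 1e-12)
    return scores
```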

  16. MUSIC-Expected maximization gaussian mixture methodology for clustering and detection of task-related neuronal firing rates.

    PubMed

    Ortiz-Rosario, Alexis; Adeli, Hojjat; Buford, John A

    2017-01-15

    Researchers often rely on simple methods to identify involvement of neurons in a particular motor task. The historical approach has been to inspect large groups of neurons and subjectively separate neurons into groups based on the expertise of the investigator. In cases where neuron populations are small it is reasonable to inspect these neuronal recordings and their firing rates carefully to avoid data omissions. In this paper, a new methodology is presented for automatic objective classification of neurons recorded in association with behavioral tasks into groups. By identifying characteristics of neurons in a particular group, the investigator can then identify functional classes of neurons based on their relationship to the task. The methodology is based on integration of a multiple signal classification (MUSIC) algorithm to extract relevant features from the firing rate and an expectation-maximization Gaussian mixture algorithm (EM-GMM) to cluster the extracted features. The methodology is capable of identifying and clustering similar firing rate profiles automatically based on specific signal features. An empirical wavelet transform (EWT) was used to validate the features found in the MUSIC pseudospectrum and the resulting signal features captured by the methodology. Additionally, this methodology was used to inspect behavioral elements of neurons to physiologically validate the model. This methodology was tested using a set of data collected from awake behaving non-human primates. Copyright © 2016 Elsevier B.V. All rights reserved.
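
    The clustering stage of the methodology above, an EM-fitted Gaussian mixture over extracted firing-rate features, can be sketched with scikit-learn. The MUSIC pseudospectrum features are assumed to have been extracted already, and the BIC-based choice of the number of clusters is an illustrative convenience rather than the paper's procedure.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

def cluster_firing_rate_features(features, max_components=6):
    """Cluster per-neuron feature vectors with an EM-fitted Gaussian mixture,
    choosing the number of clusters by BIC."""
    X = StandardScaler().fit_transform(features)
    best_model, best_bic = None, np.inf
    for k in range(1, max_components + 1):
        gmm = GaussianMixture(n_components=k, covariance_type="full",
                              n_init=5, random_state=0).fit(X)
        bic = gmm.bic(X)
        if bic < best_bic:
            best_model, best_bic = gmm, bic
    return best_model.predict(X), best_model
```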

  17. MiRNA-miRNA synergistic network: construction via co-regulating functional modules and disease miRNA topological features.

    PubMed

    Xu, Juan; Li, Chuan-Xing; Li, Yong-Sheng; Lv, Jun-Ying; Ma, Ye; Shao, Ting-Ting; Xu, Liang-De; Wang, Ying-Ying; Du, Lei; Zhang, Yun-Peng; Jiang, Wei; Li, Chun-Quan; Xiao, Yun; Li, Xia

    2011-02-01

    Synergistic regulations among multiple microRNAs (miRNAs) are important to understand the mechanisms of complex post-transcriptional regulations in humans. Complex diseases are affected by several miRNAs rather than a single miRNA. It is therefore a challenge to identify miRNA synergism, thereby further determining miRNA functions at a system-wide level and investigating disease miRNA features in the miRNA-miRNA synergistic network from a new view. Here, we constructed a miRNA-miRNA functional synergistic network (MFSN) via co-regulating functional modules that have three features: common targets of corresponding miRNA pairs, enrichment in the same gene ontology category and close proximity in the protein interaction network. Predicted miRNA synergism is validated by significantly high co-expression of functional modules and significantly negative regulation of functional modules. We found that the MFSN exhibits a scale-free, small-world and modular architecture. Furthermore, the topological features of disease miRNAs in the MFSN are distinct from those of non-disease miRNAs. They have more synergism, indicating their higher complexity of functions, and are the global central cores of the MFSN. In addition, miRNAs associated with the same disease are close to each other. The structure of the MFSN and the features of disease miRNAs are validated to be robust using different miRNA target data sets.

  18. On the zero-bias anomaly and Kondo physics in quantum point contacts near pinch-off.

    PubMed

    Xiang, S; Xiao, S; Fuji, K; Shibuya, K; Endo, T; Yumoto, N; Morimoto, T; Aoki, N; Bird, J P; Ochiai, Y

    2014-03-26

    We investigate the linear and non-linear conductance of quantum point contacts (QPCs), in the region near pinch-off where Kondo physics has previously been connected to the appearance of the 0.7 feature. In studies of seven different QPCs, fabricated in the same high-mobility GaAs/AlGaAs heterojunction, the linear conductance is widely found to show the presence of the 0.7 feature. The differential conductance, on the other hand, does not generally exhibit the zero-bias anomaly (ZBA) that has been proposed to indicate the Kondo effect. Indeed, even in the small subset of QPCs found to exhibit such an anomaly, the linear conductance does not always follow the universal temperature-dependent scaling behavior expected for the Kondo effect. Taken collectively, our observations demonstrate that, unlike the 0.7 feature, the ZBA is not a generic feature of low-temperature QPC conduction. We furthermore conclude that the mere observation of the ZBA alone is insufficient evidence for concluding that Kondo physics is active. While we do not rule out the possibility that the Kondo effect may occur in QPCs, our results appear to indicate that its observation requires a very strict set of conditions to be satisfied. This should be contrasted with the case of the 0.7 feature, which has been apparent since the earliest experimental investigations of QPC transport.

  19. Evaluation of Shape and Textural Features from CT as Prognostic Biomarkers in Non-small Cell Lung Cancer.

    PubMed

    Bianconi, Francesco; Fravolini, Mario Luca; Bello-Cerezo, Raquel; Minestrini, Matteo; Scialpi, Michele; Palumbo, Barbara

    2018-04-01

    We retrospectively investigated the prognostic potential (correlation with overall survival) of 9 shape and 21 textural features from non-contrast-enhanced computed tomography (CT) in patients with non-small-cell lung cancer. We considered a public dataset of 203 individuals with inoperable, histologically or cytologically confirmed NSCLC. Three-dimensional shape and textural features from CT were computed using proprietary code and their prognostic potential evaluated through four different statistical protocols. Volume and grey-level run length matrix (GLRLM) run-length non-uniformity were the only two features to pass all four protocols. Both features correlated negatively with overall survival. The results also showed a strong dependence on the evaluation protocol used. Tumour volume and GLRLM run-length non-uniformity from CT were the best predictors of survival in patients with non-small-cell lung cancer. We did not find enough evidence to claim a relationship with survival for the other features. Copyright© 2018, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.

  20. 'Dem DEMs: Comparing Methods of Digital Elevation Model Creation

    NASA Astrophysics Data System (ADS)

    Rezza, C.; Phillips, C. B.; Cable, M. L.

    2017-12-01

    Topographic details of Europa's surface yield implications for large-scale processes that occur on the moon, including surface strength, modification, composition, and formation mechanisms for geologic features. In addition, small scale details presented from this data are imperative for future exploration of Europa's surface, such as by a potential Europa Lander mission. A comparison of different methods of Digital Elevation Model (DEM) creation and variations between them can help us quantify the relative accuracy of each model and improve our understanding of Europa's surface. In this work, we used data provided by Phillips et al. (2013, AGU Fall meeting, abs. P34A-1846) and Schenk and Nimmo (2017, in prep.) to compare DEMs that were created using Ames Stereo Pipeline (ASP), SOCET SET, and Paul Schenk's own method. We began by locating areas of the surface with multiple overlapping DEMs, and our initial comparisons were performed near the craters Manannan, Pwyll, and Cilix. For each region, we used ArcGIS to draw profile lines across matching features to determine elevation. Some of the DEMs had vertical or skewed offsets, and thus had to be corrected. The vertical corrections were applied by adding or subtracting the global minimum of the data set to create a common zero point. The skewed data sets were corrected by rotating the plot so that it had a global slope of zero and then subtracting a zero-point vertical offset. Once corrections were made, we plotted the three methods on one graph for each profile of each region. Upon analysis, we found relatively good feature correlation between the three methods. The smoothness of a DEM depends on both the input set of images and the stereo processing methods used. In our comparison, the DEMs produced by SOCET SET were less smoothed than those from ASP or Schenk. Height comparisons show that ASP and Schenk's model appear similar, alternating in maximum height. SOCET SET has more topographic variability due to its decreased smoothing, which is borne out by preliminary offset calculations. In the future, we plan to expand upon this preliminary work with more regions of Europa, continue quantifying the height differences and relative accuracy of each method, and generate more DEMs to expand our available comparison regions.
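
    The correction procedure described above (a common zero point plus removal of a global slope) amounts to simple array operations. The sketch below applies it to a 2-D elevation array with numpy; the abstract works on extracted profile lines, so treating the whole raster this way is an assumption.

```python
import numpy as np

def align_dem(dem):
    """Remove a vertical offset and a global planar tilt so that independently
    produced DEMs share a common zero point and zero mean slope."""
    z = np.asarray(dem, dtype=float)
    z = z - np.nanmin(z)                       # common zero point
    # Fit and subtract a best-fit plane z = a*x + b*y + c (the global slope).
    yy, xx = np.mgrid[0:z.shape[0], 0:z.shape[1]]
    mask = ~np.isnan(z)
    A = np.column_stack([xx[mask], yy[mask], np.ones(mask.sum())])
    coeffs, *_ = np.linalg.lstsq(A, z[mask], rcond=None)
    plane = coeffs[0] * xx + coeffs[1] * yy + coeffs[2]
    return z - plane + np.nanmean(plane)       # detrend, keep mean level
```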

  1. 48 CFR 6.203 - Set-asides for small business concerns.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 6.203 Set-asides for small business concerns. (a) To fulfill the statutory requirements relating to small business concerns, contracting officers may set aside solicitations to allow only such business concerns to compete. This includes contract actions conducted under the Small Business Innovation Research...

  2. 48 CFR 6.203 - Set-asides for small business concerns.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 6.203 Set-asides for small business concerns. (a) To fulfill the statutory requirements relating to small business concerns, contracting officers may set aside solicitations to allow only such business concerns to compete. This includes contract actions conducted under the Small Business Innovation Research...

  3. 48 CFR 6.203 - Set-asides for small business concerns.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 6.203 Set-asides for small business concerns. (a) To fulfill the statutory requirements relating to small business concerns, contracting officers may set aside solicitations to allow only such business concerns to compete. This includes contract actions conducted under the Small Business Innovation Research...

  4. 48 CFR 6.203 - Set-asides for small business concerns.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 6.203 Set-asides for small business concerns. (a) To fulfill the statutory requirements relating to small business concerns, contracting officers may set aside solicitations to allow only such business concerns to compete. This includes contract actions conducted under the Small Business Innovation Research...

  5. Research of maneuvering target prediction and tracking technology based on IMM algorithm

    NASA Astrophysics Data System (ADS)

    Cao, Zheng; Mao, Yao; Deng, Chao; Liu, Qiong; Chen, Jing

    2016-09-01

    Maneuvering target prediction and tracking technology is widely used in both military and civilian applications, and its study has long been both a focus of research and a challenge. In the electro-optical acquisition-tracking-pointing (ATP) system, the traditional maneuvering targets are primarily ballistic targets, large aircraft, and other large objects. These targets move quickly along strongly regular trajectories, and Kalman filtering and polynomial fitting track them well. In recent years, small unmanned aerial vehicles have developed rapidly because they are small, nimble, and simple to operate. Although they are close-in, slow, and small targets in the ATP observation system, they are highly maneuverable. Moreover, because these vehicles are manually operated, their acceleration changes greatly and they move erratically. Consequently, prediction and tracking precision is low when traditional algorithms are used to track their maneuvering flight, such as acceleration, turning, and climbing. The interacting multiple model (IMM) algorithm uses multiple interacting models to match the target's actual trajectory and switches between models according to a Markov chain to adapt to changes in the trajectory, so it is well suited to the prediction and tracking of small unmanned aerial vehicles because of its better adaptability to irregular movement. This paper sets up a model set comprising a constant velocity (CV) model, a constant acceleration (CA) model, a constant turn (CT) model, and a current statistical model. Simulation and analysis of real trajectory data from small unmanned aerial vehicles show that prediction and tracking based on the interacting multiple model algorithm achieves lower tracking error and higher tracking precision than traditional algorithms.
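
    The abstract describes the IMM cycle at a high level; below is a minimal numpy sketch of two of its steps, the model-probability update and the moment-matched output combination. The per-model filters (e.g., Kalman filters for the CV/CA/CT models) are assumed to have already produced their likelihoods and estimates, the mixing step that precedes each filter update is omitted, and all array shapes are illustrative.

```python
import numpy as np

def imm_update(mu, P_trans, likelihoods, states, covariances):
    """One IMM step of model-probability update and output combination.

    mu           : prior model probabilities, shape (M,)
    P_trans      : Markov model-transition matrix, shape (M, M)
    likelihoods  : measurement likelihood from each model's filter, shape (M,)
    states       : per-model state estimates, shape (M, n)
    covariances  : per-model covariances, shape (M, n, n)
    """
    # Predicted model probabilities after the Markov switch.
    c = P_trans.T @ mu
    # Posterior model probabilities, weighted by each filter's likelihood.
    mu_post = likelihoods * c
    mu_post /= mu_post.sum()
    # Combined (moment-matched) state estimate and covariance.
    x_comb = np.einsum("m,mn->n", mu_post, states)
    P_comb = np.zeros_like(covariances[0])
    for m in range(len(mu_post)):
        d = (states[m] - x_comb)[:, None]
        P_comb += mu_post[m] * (covariances[m] + d @ d.T)
    return mu_post, x_comb, P_comb
```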

  6. Detecting spam comments on Indonesia’s Instagram posts

    NASA Astrophysics Data System (ADS)

    Septiandri, Ali Akbar; Wibisono, Okiriza

    2017-01-01

    In this paper we experimented with several feature sets for detecting spam comments in social media content authored by Indonesian public figures. We define spam comments as comments which have promotional purposes (e.g. referring other users to products and services) and are thus not related to the content on which the comments are posted. Three sets of features are evaluated for detecting spam: (1) hand-engineered features such as comment length, number of capital letters, and number of emojis, (2) keyword features such as whether the comment contains advertising words or product-related words, and (3) text features, namely bag-of-words, TF-IDF, and fastText embeddings, each combined with latent semantic analysis. With 24,000 manually annotated comments scraped from Instagram posts authored by more than 100 Indonesian public figures, we compared the performance of these feature sets and their combinations using three popular classification algorithms: Naïve Bayes, SVM, and XGBoost. We find that using all three feature sets (with fastText embeddings for the text features) gave the best F1-score of 0.9601 on a holdout dataset. More interestingly, fastText embeddings combined with hand-engineered features (i.e. without keyword features) yield a similar F1-score of 0.9523, and McNemar's test found no significant difference between the two results. This result is important because keyword features are largely dependent on the dataset and may not be as generalisable as the other feature sets when applied to new data. For future work, we hope to collect a bigger and more diverse dataset of Indonesian spam comments, improve our model's performance and generalisability, and publish a programming package for others to reliably detect spam comments.
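
    A minimal scikit-learn sketch of combining hand-engineered features with TF-IDF text features, as in two of the feature sets described above. The fastText embeddings, latent semantic analysis, and the annotated Instagram data are not reproduced; the emoji regex, feature list, and classifier choice are illustrative.

```python
import re
import numpy as np
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

def hand_engineered(comments):
    """Simple per-comment features: length, capital letters, emoji-like symbols."""
    feats = []
    for c in comments:
        feats.append([
            len(c),
            sum(ch.isupper() for ch in c),
            len(re.findall(r"[\U0001F300-\U0001FAFF]", c)),
        ])
    return np.array(feats, dtype=float)

def evaluate(comments, labels):
    """Cross-validated F1 of a linear SVM on TF-IDF plus hand-engineered features."""
    tfidf = TfidfVectorizer(min_df=2, ngram_range=(1, 2))
    X_text = tfidf.fit_transform(comments)
    X = hstack([X_text, hand_engineered(comments)])
    return cross_val_score(LinearSVC(), X, labels, cv=5, scoring="f1").mean()
```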

  7. Contingent attentional capture across multiple feature dimensions in a temporal search task.

    PubMed

    Ito, Motohiro; Kawahara, Jun I

    2016-01-01

    The present study examined whether attention can be flexibly controlled to monitor two different feature dimensions (shape and color) in a temporal search task. Specifically, we investigated the occurrence of contingent attentional capture (i.e., interference from task-relevant distractors) and resulting set reconfiguration (i.e., enhancement of single task-relevant set). If observers can restrict searches to a specific value for each relevant feature dimension independently, the capture and reconfiguration effect should only occur when the single relevant distractor in each dimension appears. Participants identified a target letter surrounded by a non-green square or a non-square green frame. The results revealed contingent attentional capture, as target identification accuracy was lower when the distractor contained a target-defining feature than when it contained a nontarget feature. Resulting set reconfiguration was also obtained in that accuracy was superior when the current target's feature (e.g., shape) corresponded to the defining feature of the present distractor (shape) than when the current target's feature did not match the distractor's feature (color). This enhancement was not due to perceptual priming. The present study demonstrated that the principles of contingent attentional capture and resulting set reconfiguration held even when multiple target feature dimensions were monitored. Copyright © 2015 Elsevier B.V. All rights reserved.

  8. New Features for Neuron Classification.

    PubMed

    Hernández-Pérez, Leonardo A; Delgado-Castillo, Duniel; Martín-Pérez, Rainer; Orozco-Morales, Rubén; Lorenzo-Ginori, Juan V

    2018-04-28

    This paper addresses the problem of obtaining new neuron features capable of improving the results of neuron classification. Most studies on neuron classification using morphological features have been based on Euclidean geometry. Here, three one-dimensional (1D) time series are instead derived from the three-dimensional (3D) structure of each neuron, and a spatial time series is then constructed, from which the features are calculated. Digitally reconstructed neurons were separated into control and pathological sets, related to three categories of alterations caused by epilepsy, Alzheimer's disease (long and local projections), and ischemia. These neuron sets were then subjected to supervised classification and the results were compared for three sets of features: morphological features, features obtained from the time series, and a combination of both. The best results were obtained using features from the time series, which outperformed classification using only morphological features, showing higher correct classification rates with differences of 5.15%, 3.75%, and 5.33% for epilepsy and Alzheimer's disease (long and local projections), respectively. The morphological features were better for the ischemia set, with a difference of 3.05%. Features such as variance, Spearman auto-correlation, partial auto-correlation, mutual information, and local minima and maxima, all related to the time series, exhibited the best performance. We also compared different feature evaluators, among which ReliefF was the best ranked.
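
    A few of the time-series features named above are straightforward to compute once a 1-D series has been derived from the reconstructed neuron; the derivation of that series from the 3-D morphology is not reproduced here, and the particular feature set below is only an illustrative subset.

```python
import numpy as np
from scipy.stats import spearmanr
from scipy.signal import argrelextrema

def series_features(x):
    """Simple features of a 1-D series derived from a neuron's 3-D structure."""
    x = np.asarray(x, dtype=float)
    rho, _ = spearmanr(x[:-1], x[1:])            # lag-1 Spearman autocorrelation
    n_max = len(argrelextrema(x, np.greater)[0]) # local maxima
    n_min = len(argrelextrema(x, np.less)[0])    # local minima
    return {
        "variance": x.var(),
        "spearman_autocorr": rho,
        "local_extrema": n_max + n_min,
    }
```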

  9. A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies.

    PubMed

    Khondoker, Mizanur; Dobson, Richard; Skirrow, Caroline; Simmons, Andrew; Stahl, Daniel

    2016-10-01

    Recent literature on the comparison of machine learning methods has raised questions about the neutrality, unbiasedness and utility of many comparative studies. Reporting of results on favourable datasets and sampling error in the estimated performance measures based on single samples are thought to be the major sources of bias in such comparisons. Better performance in one or a few instances does not necessarily imply better performance on average or at a population level, and simulation studies may be a better alternative for objectively comparing the performances of machine learning algorithms. We compare the classification performance of a number of important and widely used machine learning algorithms, namely Random Forests (RF), Support Vector Machines (SVM), Linear Discriminant Analysis (LDA) and k-Nearest Neighbour (kNN). Using massively parallel processing on high-performance supercomputers, we compare the generalisation errors at various combinations of levels of several factors: number of features, training sample size, biological variation, experimental variation, effect size, replication and correlation between features. For a smaller number of correlated features (the number of features not exceeding approximately half the sample size), LDA was found to be the method of choice in terms of average generalisation errors as well as stability (precision) of error estimates. SVM (with RBF kernel) outperforms LDA as well as RF and kNN by a clear margin as the feature set gets larger, provided the sample size is not too small (at least 20). The performance of kNN also improves as the number of features grows and outperforms that of LDA and RF unless the data variability is too high and/or effect sizes are too small. RF was found to outperform only kNN in some instances where the data are more variable and have smaller effect sizes, in which cases it also provides more stable error estimates than kNN and LDA. Applications to a number of real datasets supported the findings from the simulation study. © The Author(s) 2013.
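
    A minimal sketch of the kind of simulation-based comparison described above: generate replicated two-class data sets with correlated features and compare cross-validated generalisation error of LDA, SVM (RBF), RF, and kNN. All simulation parameters below are illustrative, not the authors' protocol.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def simulate(n=40, p=20, effect=0.8, rho=0.3, rng=None):
    """Two-class Gaussian data with equicorrelated features and a mean shift."""
    rng = rng if rng is not None else np.random.default_rng(0)
    cov = np.full((p, p), rho) + (1 - rho) * np.eye(p)
    X0 = rng.multivariate_normal(np.zeros(p), cov, n // 2)
    X1 = rng.multivariate_normal(np.full(p, effect), cov, n // 2)
    return np.vstack([X0, X1]), np.repeat([0, 1], n // 2)

classifiers = {
    "LDA": LinearDiscriminantAnalysis(),
    "SVM": SVC(kernel="rbf", gamma="scale"),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "kNN": KNeighborsClassifier(n_neighbors=5),
}

rng = np.random.default_rng(0)
for rep in range(5):                       # average over replicated data sets
    X, y = simulate(rng=rng)
    for name, clf in classifiers.items():
        err = 1 - cross_val_score(clf, X, y, cv=5).mean()
        print(f"rep {rep} {name}: generalisation error ~ {err:.3f}")
```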

  10. 48 CFR 6.205 - Set-asides for HUBZone small business concerns.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... small business concerns. 6.205 Section 6.205 Federal Acquisition Regulations System FEDERAL ACQUISITION... 6.205 Set-asides for HUBZone small business concerns. (a) To fulfill the statutory requirements... (see 19.1302) may set aside solicitations to allow only qualified HUBZone small business concerns to...

  11. 48 CFR 6.205 - Set-asides for HUBZone small business concerns.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... small business concerns. 6.205 Section 6.205 Federal Acquisition Regulations System FEDERAL ACQUISITION... 6.205 Set-asides for HUBZone small business concerns. (a) To fulfill the statutory requirements... (see 19.1302) may set aside solicitations to allow only qualified HUBZone small business concerns to...

  12. 48 CFR 6.205 - Set-asides for HUBZone small business concerns.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... small business concerns. 6.205 Section 6.205 Federal Acquisition Regulations System FEDERAL ACQUISITION... 6.205 Set-asides for HUBZone small business concerns. (a) To fulfill the statutory requirements... (see 19.1302) may set aside solicitations to allow only qualified HUBZone small business concerns to...

  13. 48 CFR 6.205 - Set-asides for HUBZone small business concerns.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... small business concerns. 6.205 Section 6.205 Federal Acquisition Regulations System FEDERAL ACQUISITION... 6.205 Set-asides for HUBZone small business concerns. (a) To fulfill the statutory requirements... (see 19.1302) may set aside solicitations to allow only qualified HUBZone small business concerns to...

  14. A broadcast-based key agreement scheme using set reconciliation for wireless body area networks.

    PubMed

    Ali, Aftab; Khan, Farrukh Aslam

    2014-05-01

    Information and communication technologies have thrived over the last few years. Healthcare systems have also benefited from this progress. A wireless body area network (WBAN) consists of small, low-power sensors used to monitor human physiological values remotely, enabling physicians to remotely monitor the health of patients. Communication security in WBANs is essential because it involves human physiological data. Key agreement and authentication are the primary issues in the security of WBANs. To agree upon a common key, the nodes exchange information with each other using wireless communication. This information exchange must either be secure enough or be minimized to the point that, if an information leak occurs, it does not affect the overall system. Most of the existing solutions to this problem exchange too much information for the sake of key agreement; obtaining this information is sufficient for an attacker to reproduce the key. Set reconciliation is a technique used to reconcile two similar sets held by two different hosts with minimal communication complexity. This paper presents a broadcast-based key agreement scheme using set reconciliation for secure communication in WBANs. The proposed scheme allows the neighboring nodes to agree upon a common key with the personal server (PS), generated from the electrocardiogram (EKG) feature set of the host body. Minimal information is exchanged in a broadcast manner, and even if every node is missing a different subset, by reconciling these feature sets the whole network will still agree upon a single common key. Because of the limited information exchange, an attacker who obtains the exchanged information will not be able to reproduce the key. The proposed scheme mitigates replay, selective forwarding, and denial-of-service attacks using a challenge-response authentication mechanism. The simulation results show that the proposed scheme compares favorably with the existing EKG-based key agreement scheme in terms of security, communication overhead, and running time complexity.
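
    Set reconciliation itself requires more machinery than fits here (e.g., exchanging compact sketches of the two sets), so the sketch below shows only a plausible final step: deriving a shared key once both nodes hold the same reconciled EKG feature set. The quantization step and the SHA-256 choice are assumptions for illustration, not the paper's construction.

```python
import hashlib

def derive_key(ekg_features, precision=2):
    """Derive a shared key from a reconciled EKG feature set.

    Both parties are assumed to hold the same set after reconciliation;
    quantization makes small measurement differences map to the same value.
    """
    quantized = sorted({round(f, precision) for f in ekg_features})
    material = ",".join(f"{q:.{precision}f}" for q in quantized).encode()
    return hashlib.sha256(material).hexdigest()

# Example: two nodes holding the same reconciled set derive the same 256-bit key.
features = [0.812, 0.433, 1.207, 0.991]
print(derive_key(features))
```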

  15. Movement trajectories and habitat partitioning of small mammals in logged and unlogged rain forests on Borneo.

    PubMed

    Wells, Konstans; Pfeiffer, Martin; Lakim, Maklarin B; Kalko, Elisabeth K V

    2006-09-01

    1. Non-volant animals in tropical rain forests differ in their ability to exploit the habitat above the forest floor and also in their response to habitat variability. It is predicted that specific movement trajectories are determined both by intrinsic factors such as ecological specialization, morphology and body size and by structural features of the surrounding habitat such as undergrowth and availability of supportive structures. 2. We applied spool-and-line tracking to describe movement trajectories and habitat segregation of eight species of small mammals from an assemblage of Muridae, Tupaiidae and Sciuridae in the rain forest of Borneo, following a total of 13,525 m of paths. We also analysed specific changes in the movement patterns of the small mammals in relation to habitat stratification between logged and unlogged forests. Variables related to the climbing activity of the tracked species as well as the supportive structures of the vegetation and undergrowth density were measured along their tracks. 3. Movement patterns of the small mammals differed significantly between species. The greatest similarities were found in congeneric species that converged strongly in body size and morphology. All species were affected in their movement patterns by the altered forest structure in logged forests, with most differences found in Leopoldamys sabanus. However, the large proportions of short step lengths found in all species for both forest types, and similar path tortuosity, suggest that the main movement strategies of the small mammals were not influenced by logging but comprised a general response to the heterogeneous habitat, as opposed to the random movement strategies predicted for homogeneous environments. 4. Overall shifts in microhabitat use showed no coherent trend among species. Multivariate (principal component) analysis revealed contrasting trends for convergent species, in particular for Maxomys rajah and M. surifer as well as for Tupaia longipes and T. tana, suggesting that each species was uniquely affected in its movement trajectories by a multiple set of environmental and intrinsic features.

  16. Efficient feature selection using a hybrid algorithm for the task of epileptic seizure detection

    NASA Astrophysics Data System (ADS)

    Lai, Kee Huong; Zainuddin, Zarita; Ong, Pauline

    2014-07-01

    Feature selection is a very important aspect of machine learning. It entails the search for an optimal subset from a very large data set with a high-dimensional feature space. Apart from eliminating redundant features and reducing computational cost, a good selection of features also leads to higher prediction and classification accuracy. In this paper, an efficient feature selection technique is introduced for the task of epileptic seizure detection. The raw data are electroencephalography (EEG) signals. Using the discrete wavelet transform, the biomedical signals were decomposed into several sets of wavelet coefficients. To reduce the dimension of these wavelet coefficients, a feature selection method that combines the strengths of both filter and wrapper methods is proposed. Principal component analysis (PCA) is used as part of the filter method. As the wrapper method, the evolutionary harmony search (HS) algorithm is employed. This metaheuristic aims at finding the best discriminating set of features from the original data. The obtained features were then used as input for an automated classifier, namely wavelet neural networks (WNNs). The WNN model was trained to perform a binary classification task, that is, to determine whether a given EEG signal was normal or epileptic. For comparison purposes, different sets of features were also used as input. Simulation results showed that the WNNs that used the features chosen by the hybrid algorithm achieved the highest overall classification accuracy.
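
    A minimal sketch of the wavelet feature extraction and PCA filter stage described above, using PyWavelets and scikit-learn. The wavelet ('db4'), decomposition level, and sub-band statistics are illustrative choices, and the harmony-search wrapper and the wavelet neural network classifier are not reproduced.

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA

def dwt_features(signal, wavelet="db4", level=4):
    """Summary statistics of each DWT sub-band of a 1-D EEG segment."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    feats = []
    for band in coeffs:
        feats.extend([band.mean(), band.std(), np.abs(band).max(),
                      np.sum(band ** 2)])      # energy of the sub-band
    return np.array(feats)

def build_feature_matrix(segments, n_components=10):
    """Filter stage: reduce the wavelet feature vectors with PCA."""
    X = np.vstack([dwt_features(s) for s in segments])
    return PCA(n_components=n_components).fit_transform(X)
```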

  17. The Nature of the Optical "Jets" in the Spiral Galaxy NGC 1097

    NASA Technical Reports Server (NTRS)

    Wehrle, Ann E.; Keel, William C.; Jones, Dayton L.

    1997-01-01

    We present new observations of the jet features in the barred spiral galaxy NGC 1097, including optical spectroscopy of the brightest jet features, two-color optical imagery, new VLA mapping at 327 MHz, and archival 1.4 GHz VLA data reprocessed for improved sensitivity. No optical emission lines appear to an equivalent width limit of 15-30 A (depending on the line wavelength). The jets are uniformly blue, with B - V = 0.45 for the two well-observed jets R1 and R2. No radio emission from the jets is detected at either frequency; the 327-MHz data set particularly stringent limits on "fossil" emission from aging synchrotron electrons. The morphology of the jets is shown to be inconsistent with any conical distribution of emission enhanced by edge-brightening; their combination of transverse profile and relative narrowness cannot be reproduced with cone models. The optical colors, lack of radio emission, and morphology of the features lead us to conclude that they are tidal manifestations, perhaps produced by multiple encounters of the small elliptical companion NGC 1097A with the disk of NGC 1097. We present photometric and morphological comparisons to the tail of NGC 4651, which is similar in scale and morphology to the northeast "dogleg" feature R1 in NGC 1097.

  18. Semantic attributes for people's appearance description: an appearance modality for video surveillance applications

    NASA Astrophysics Data System (ADS)

    Frikha, Mayssa; Fendri, Emna; Hammami, Mohamed

    2017-09-01

    Using semantic attributes such as gender, clothes, and accessories to describe people's appearance is an appealing modeling method for video surveillance applications. We proposed a midlevel appearance signature based on extracting a list of nameable semantic attributes describing the body in uncontrolled acquisition conditions. Conventional approaches extract the same set of low-level features to learn all semantic classifiers uniformly. Their critical limitation is the inability to capture the dominant visual characteristics of each trait separately. The proposed approach extracts low-level features in an attribute-adaptive way by automatically selecting the most relevant features for each attribute separately. Furthermore, relying on a small training dataset would easily lead to poor performance owing to the large intraclass and interclass variations. We annotated large-scale collections of people images gathered from different person re-identification benchmarks, covering a large attribute sample and reflecting the challenges of uncontrolled acquisition conditions. These annotations were gathered into an appearance semantic attribute dataset that contains 3590 images annotated with 14 attributes. Various experiments show that features carefully designed to learn the visual characteristics of each attribute improve the correct classification accuracy and reduce both spatial and temporal complexity compared with state-of-the-art approaches.

  19. On the use of infrasound for constraining global climate models

    NASA Astrophysics Data System (ADS)

    Millet, Christophe; Ribstein, Bruno; Lott, Francois; Cugnet, David

    2017-11-01

    Numerical prediction of infrasound is a complex issue due to constantly changing atmospheric conditions and to the random nature of small-scale flows. Although part of the upward propagating wave is refracted at stratospheric levels, where gravity waves significantly affect the temperature and the wind, the process by which the gravity wave field changes the infrasound arrivals remains poorly understood. In the present work, we use a stochastic parameterization to represent the subgrid scale gravity wave field from the atmospheric specifications provided by the European Centre for Medium-Range Weather Forecasts. It is shown that regardless of whether the gravity wave field possesses relatively small or large features, the sensitivity of acoustic waveforms to atmospheric disturbances can be extremely different. Using infrasound signals recorded during campaigns of ammunition destruction explosions, a new set of tunable parameters is proposed which more accurately predicts the small-scale content of gravity wave fields in the middle atmosphere. Climate simulations are performed using the updated parameterization. Numerical results demonstrate that a network of ground-based infrasound stations is a promising technology for dynamically tuning the gravity wave parameterization.

  20. Effect of intrinsic magnetic field decrease on the low- to middle-latitude upper atmosphere dynamics simulated by GAIA

    NASA Astrophysics Data System (ADS)

    Tao, C.; Jin, H.; Shinagawa, H.; Fujiwara, H.; Miyoshi, Y.

    2017-12-01

    The effects of decreasing the intrinsic magnetic field on the upper atmospheric dynamics at low to middle latitudes are investigated using the Ground-to-topside model of Atmosphere and Ionosphere for Aeronomy (GAIA). GAIA incorporates a meteorological reanalysis data set at low altitudes (<30 km), which enables us to investigate the atmospheric response to various waves under dynamic and chemical interactions with the ionosphere. In this simulation experiment, we reduced the magnetic field strength to as low as 10% of the current value. The averaged neutral velocity, density, and temperature at low to middle latitudes at 300 km altitude show little change with the magnetic field variation, while the dynamo field, current density, and the ionospheric conductivities are modified significantly. The wind velocity and tidal wave amplitude in the thermosphere remain large owing to the small constraint on plasma motion for a small field. On the other hand, the superrotation feature at the dip equator is weakened by 20% for a 10% magnetic field because the increase in ion drag for the small magnetic field prevents the superrotation.

  1. Effect of intrinsic magnetic field decrease on the low- to middle-latitude upper atmosphere dynamics simulated by GAIA

    NASA Astrophysics Data System (ADS)

    Tao, Chihiro; Jin, Hidekatsu; Shinagawa, Hiroyuki; Fujiwara, Hitoshi; Miyoshi, Yasunobu

    2017-09-01

    The effects of decreasing the intrinsic magnetic field on the upper atmospheric dynamics at low to middle latitudes are investigated using the Ground-to-topside model of Atmosphere and Ionosphere for Aeronomy (GAIA). GAIA incorporates a meteorological reanalysis data set at low altitudes (<30 km), which enables us to investigate the atmospheric response to various waves under dynamic and chemical interactions with the ionosphere. In this simulation experiment, we reduced the magnetic field strength to as low as 10% of the current value. The averaged neutral velocity, density, and temperature at low to middle latitudes at 300 km altitude show little change with the magnetic field variation, while the dynamo field, current density, and the ionospheric conductivities are modified significantly. The wind velocity and tidal wave amplitude in the thermosphere remain large owing to the small constraint on plasma motion for a small field. On the other hand, the superrotation feature at the dip equator is weakened by 20% for a 10% magnetic field because the increase in ion drag for the small magnetic field prevents the superrotation.

  2. Feature weight estimation for gene selection: a local hyperlinear learning approach

    PubMed Central

    2014-01-01

    Background Modeling high-dimensional data involving thousands of variables is particularly important for gene expression profiling experiments; nevertheless, it remains a challenging task. One of the challenges is to implement an effective method for selecting a small set of relevant genes, buried in high-dimensional irrelevant noise. RELIEF is a popular and widely used approach for feature selection owing to its low computational cost and high accuracy. However, RELIEF-based methods suffer from instability, especially in the presence of noisy and/or high-dimensional outliers. Results We propose an innovative feature weighting algorithm, called LHR, to select informative genes from highly noisy data. LHR is based on RELIEF for feature weighting using classical margin maximization. The key idea of LHR is to estimate the feature weights through local approximation rather than global measurement, which is typically used in existing methods. The weights obtained by our method are very robust against degradation from noisy features, even those with vast dimensions. To demonstrate the performance of our method, extensive experiments involving classification tests have been carried out on both synthetic and real microarray benchmark datasets by combining the proposed technique with standard classifiers, including the support vector machine (SVM), k-nearest neighbor (KNN), hyperplane k-nearest neighbor (HKNN), linear discriminant analysis (LDA) and naive Bayes (NB). Conclusion Experiments on both synthetic and real-world datasets demonstrate the superior performance of the proposed feature selection method combined with supervised learning in three aspects: 1) high classification accuracy, 2) excellent robustness to noise and 3) good stability across various classification algorithms. PMID:24625071
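
    For readers unfamiliar with RELIEF-style weighting, the following is a minimal, illustrative sketch of the classical global scheme that LHR builds on; it is not the LHR algorithm itself, which replaces the global nearest hit/miss with local hyperplane approximations. Only NumPy is assumed.

```python
# Illustrative RELIEF-style feature weighting for two classes (the classical global
# scheme, not the paper's LHR variant). Features that differ more between a sample
# and its nearest miss than its nearest hit receive larger weights.
import numpy as np

def relief_weights(X, y, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    X = (X - X.min(0)) / (X.max(0) - X.min(0) + 1e-12)   # scale each feature to [0, 1]
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        i = rng.integers(len(X))
        d = np.abs(X - X[i]).sum(axis=1)                  # Manhattan distances to sample i
        d[i] = np.inf
        hit = np.argmin(np.where(y == y[i], d, np.inf))   # nearest same-class sample
        miss = np.argmin(np.where(y != y[i], d, np.inf))  # nearest other-class sample
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / n_iter

rng = np.random.default_rng(1)
y = rng.integers(0, 2, 300)
X = rng.standard_normal((300, 20))
X[:, 0] += 2.0 * y                                        # only feature 0 is informative
print(np.argsort(relief_weights(X, y))[::-1][:3])         # feature 0 should rank first
```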

  3. Learning better deep features for the prediction of occult invasive disease in ductal carcinoma in situ through transfer learning

    NASA Astrophysics Data System (ADS)

    Shi, Bibo; Hou, Rui; Mazurowski, Maciej A.; Grimm, Lars J.; Ren, Yinhao; Marks, Jeffrey R.; King, Lorraine M.; Maley, Carlo C.; Hwang, E. Shelley; Lo, Joseph Y.

    2018-02-01

    Purpose: To determine whether domain transfer learning can improve the performance of deep features extracted from digital mammograms using a pre-trained deep convolutional neural network (CNN) in the prediction of occult invasive disease for patients with ductal carcinoma in situ (DCIS) on core needle biopsy. Method: In this study, we collected digital mammography magnification views for 140 patients with DCIS at biopsy, 35 of which were subsequently upstaged to invasive cancer. We utilized a deep CNN model that was pre-trained on two natural image data sets (ImageNet and DTD) and one mammographic data set (INbreast) as the feature extractor, hypothesizing that these data sets are increasingly more similar to our target task and will lead to better representations of deep features to describe DCIS lesions. Through a statistical pooling strategy, three sets of deep features were extracted using the CNNs at different levels of convolutional layers from the lesion areas. A logistic regression classifier was then trained to predict which tumors contain occult invasive disease. The generalization performance was assessed and compared using repeated random sub-sampling validation and receiver operating characteristic (ROC) curve analysis. Result: The best performance of deep features was from the CNN model pre-trained on INbreast, and the proposed classifier using this set of deep features was able to achieve a median classification performance of ROC-AUC equal to 0.75, which is significantly better (p<=0.05) than the performance of deep features extracted using the ImageNet data set (ROC-AUC = 0.68). Conclusion: Transfer learning is helpful for learning a better representation of deep features, and improves the prediction of occult invasive disease in DCIS.
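
    A minimal sketch of the extract-deep-features-then-classify pattern is given below. It assumes a recent PyTorch/torchvision and scikit-learn; a ResNet-18 with uninitialized weights stands in for the paper's pre-trained CNNs, and random tensors stand in for mammogram patches, so the numbers it prints are meaningless beyond demonstrating the workflow.

```python
# Sketch of deep-feature extraction followed by logistic regression. A ResNet-18
# with uninitialized weights (weights=None, so nothing is downloaded) stands in for
# the paper's pre-trained CNNs, and random tensors stand in for mammogram patches.
import numpy as np
import torch
import torchvision
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

backbone = torchvision.models.resnet18(weights=None)
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-1])  # drop fc layer
feature_extractor.eval()

n = 60
images = torch.randn(n, 3, 224, 224)                 # placeholder lesion patches
labels = np.random.default_rng(0).integers(0, 2, n)  # placeholder upstaging labels

with torch.no_grad():
    feats = feature_extractor(images).flatten(1).numpy()   # (n, 512) pooled features

clf = LogisticRegression(max_iter=1000)
auc = cross_val_score(clf, feats, labels, cv=3, scoring="roc_auc").mean()
print("cross-validated AUC (random data, so ~0.5):", round(auc, 3))
```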

  4. Real-Time Feature Tracking Using Homography

    NASA Technical Reports Server (NTRS)

    Clouse, Daniel S.; Cheng, Yang; Ansar, Adnan I.; Trotz, David C.; Padgett, Curtis W.

    2010-01-01

    This software finds feature point correspondences in sequences of images. It is designed for feature matching in aerial imagery. Feature matching is a fundamental step in a number of important image processing operations: calibrating the cameras in a camera array, stabilizing images in aerial movies, geo-registration of images, and generating high-fidelity surface maps from aerial movies. The method uses a Shi-Tomasi corner detector and normalized cross-correlation. This process is likely to result in the production of some mismatches. The feature set is cleaned up using the assumption that there is a large planar patch visible in both images. At high altitude, this assumption is often reasonable. A mathematical transformation, called a homography, is developed that allows us to predict the position in image 2 of any point on the plane in image 1. Any feature pair that is inconsistent with the homography is thrown out. The output of the process is a set of feature pairs, and the homography. The algorithms in this innovation are well known, but the new implementation improves the process in several ways. It runs in real-time at 2 Hz on 64-megapixel imagery. The new Shi-Tomasi corner detector tries to produce the requested number of features by automatically adjusting the minimum distance between found features. The homography-finding code now uses an implementation of the RANSAC algorithm that adjusts the number of iterations automatically to achieve a pre-set probability of missing a set of inliers. The new interface allows the caller to pass in a set of predetermined points in one of the images. This makes it possible to track the same set of points through multiple frames.
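
    The correspondence-plus-homography step generalizes well beyond this particular implementation; the sketch below reproduces it with OpenCV, using Shi-Tomasi corners, pyramidal Lucas-Kanade tracking as a stand-in for the normalized cross-correlation matcher, and RANSAC homography fitting on a synthetically warped image pair. All parameter values are illustrative assumptions.

```python
# Sketch of the correspondence + RANSAC homography step: Shi-Tomasi corners in frame 1,
# Lucas-Kanade tracking into frame 2 (a stand-in for the NCC matcher), then RANSAC
# homography fitting to reject off-plane or mismatched pairs. Frame 2 is synthesized
# by warping frame 1 with a known homography so the example is self-contained.
import cv2
import numpy as np

rng = np.random.default_rng(0)
img1 = cv2.GaussianBlur((rng.random((480, 640)) * 255).astype(np.uint8), (7, 7), 0)

H_true = np.array([[1.0, 0.02, 5.0],
                   [-0.01, 1.0, 3.0],
                   [1e-5, 0.0, 1.0]])
img2 = cv2.warpPerspective(img1, H_true, (640, 480))

pts1 = cv2.goodFeaturesToTrack(img1, maxCorners=400, qualityLevel=0.01, minDistance=8)
pts2, status, _ = cv2.calcOpticalFlowPyrLK(img1, img2, pts1, None)

good1 = pts1[status.ravel() == 1]
good2 = pts2[status.ravel() == 1]

# RANSAC keeps only pairs consistent with a single plane-induced homography.
H_est, inlier_mask = cv2.findHomography(good1, good2, cv2.RANSAC, 3.0)
print("inlier fraction:", round(float(inlier_mask.mean()), 2))
print("estimated homography:\n", np.round(H_est, 3))
```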

  5. What Top-Down Task Sets Do for Us: An ERP Study on the Benefits of Advance Preparation in Visual Search

    ERIC Educational Resources Information Center

    Eimer, Martin; Kiss, Monika; Nicholas, Susan

    2011-01-01

    When target-defining features are specified in advance, attentional target selection in visual search is controlled by preparatory top-down task sets. We used ERP measures to study voluntary target selection in the absence of such feature-specific task sets, and to compare it to selection that is guided by advance knowledge about target features.…

  6. Automatic analysis of stereoscopic satellite image pairs for determination of cloud-top height and structure

    NASA Technical Reports Server (NTRS)

    Hasler, A. F.; Strong, J.; Woodward, R. H.; Pierce, H.

    1991-01-01

    Results are presented on an automatic stereo analysis of cloud-top heights from nearly simultaneous satellite image pairs from the GOES and NOAA satellites, using a massively parallel processor computer. Comparisons of computer-derived height fields and manually analyzed fields show that the automatic analysis technique shows promise for performing routine stereo analysis in a real-time environment, providing a useful forecasting tool by augmenting observational data sets of severe thunderstorms and hurricanes. Simulations using synthetic stereo data show that it is possible to automatically resolve small-scale features such as 4000-m-diam clouds to about 1500 m in the vertical.

  7. A simple branching model that reproduces language family and language population distributions

    NASA Astrophysics Data System (ADS)

    Schwämmle, Veit; de Oliveira, Paulo Murilo Castro

    2009-07-01

    Human history leaves fingerprints in human languages. Little is known about language evolution and its study is of great importance. Here we construct a simple stochastic model and compare its results to statistical data of real languages. The model is based on the recent finding that language changes occur independently of the population size. We find agreement with the data additionally assuming that languages may be distinguished by having at least one among a finite, small number of different features. This finite set is also used in order to define the distance between two languages, similarly to linguistics tradition since Swadesh.

  8. Using R to unravel animal-sediment interactions.

    NASA Astrophysics Data System (ADS)

    Soetaert, Karline

    2017-04-01

    Marine sediments are often characterized by seabed features ranging from small sand ripples to large sandbanks. These sediments also form the living space of many marine organisms, impacting the sediment dynamics and the geochemical cycles. In a number of projects in the North Sea, we have started to investigate these interactions, combining field sampling with laboratory experiments and modelling. R is used to interpret the various data sets and to model the effects of biology and geomorphology on the geochemistry. I will discuss these new developments in R, based on my previous R-work (packages FME, ReacTran, deSolve, rootSolve, plot3D, marelac).

  9. Full Field X-Ray Fluorescence Imaging Using Micro Pore Optics for Planetary Surface Exploration

    NASA Technical Reports Server (NTRS)

    Sarrazin, P.; Blake, D. F.; Gailhanou, M.; Walter, P.; Schyns, E.; Marchis, F.; Thompson, K.; Bristow, T.

    2016-01-01

    Many planetary surface processes leave evidence as small features at the sub-millimetre scale. Current planetary X-ray fluorescence spectrometers lack the spatial resolution to analyse such small features as they only provide global analyses of areas greater than 100 mm(exp 2). A micro-XRF spectrometer will be deployed on the NASA Mars 2020 rover to analyse spots as small as 120 µm. When using its line-scanning capacity combined with perpendicular scanning by the rover arm, elemental maps can be generated. We present a new instrument that provides full-field XRF imaging, alleviating the need for precise positioning and scanning mechanisms. The Mapping X-ray Fluorescence Spectrometer - "Map-X" - will allow elemental imaging with approximately 100 µm spatial resolution and simultaneously provide elemental chemistry at the scale where many relict physical, chemical and biological features can be imaged in ancient rocks. The arm-mounted Map-X instrument is placed directly on the surface of an object and held in a fixed position during measurements. A 25x25 mm(exp 2) surface area is uniformly illuminated with X-rays or alpha-particles and gamma-rays. A novel Micro Pore Optic focusses a fraction of the emitted X-ray fluorescence onto a CCD operated at a few frames per second. On-board processing allows measuring the energy and coordinates of each X-ray photon collected. Large sets of frames are reduced into 2D histograms used to compute higher level data products such as elemental maps and XRF spectra from selected regions of interest. XRF spectra are processed on the ground to further determine quantitative elemental compositions. The instrument development will be presented with an emphasis on the characterization and modelling of the X-ray focussing Micro Pore Optic. An outlook on possible alternative XRF imaging applications will be discussed.

  10. How High is that Dune? A Comparison of Methods Used to Constrain the Morphometry of Aeolian Bedforms on Mars

    NASA Technical Reports Server (NTRS)

    Bourke, M.; Balme, M.; Beyer, R. A.; Williams, K. K.

    2004-01-01

    Methods traditionally used to estimate the relative height of surface features on Mars include: photoclinometry, shadow length and stereography. The MOLA data set enables a more accurate assessment of the surface topography of Mars. However, many small-scale aeolian bedforms remain below the sample resolution of the MOLA data set. In response, a number of research teams have adopted and refined existing methods and applied them to high resolution (2-6 m/pixel) narrow angle MOC satellite images. Collectively, the methods provide data on a range of morphometric parameters (many not previously available for dunes on Mars). These include dune height, width, length, surface area, volume, and longitudinal and cross profiles. These data will facilitate a more accurate analysis of aeolian bedforms on Mars. In this paper we undertake a comparative analysis of methods used to determine the height of aeolian dunes and ripples.
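
    As a reminder of how the shadow-length method referred to above works, a feature of height h illuminated at solar incidence angle i (measured from vertical) casts a shadow of length L = h tan(i), so h = L / tan(i). The snippet below is a minimal, illustrative calculation with assumed values.

```python
# Illustrative shadow-length height estimate (values assumed, not from the paper).
import math

def height_from_shadow(shadow_length_m, solar_incidence_deg):
    """Relief height from shadow length and solar incidence angle (from vertical)."""
    return shadow_length_m / math.tan(math.radians(solar_incidence_deg))

# A 40 m shadow with the Sun 70 degrees from vertical (20 degrees above the horizon)
print(round(height_from_shadow(40.0, 70.0), 1), "m")   # ~14.6 m of relief
```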

  11. Rational selection of structurally diverse natural product scaffolds with favorable ADME properties for drug discovery.

    PubMed

    Samiulla, D S; Vaidyanathan, V V; Arun, P C; Balan, G; Blaze, M; Bondre, S; Chandrasekhar, G; Gadakh, A; Kumar, R; Kharvi, G; Kim, H O; Kumar, S; Malikayil, J A; Moger, M; Mone, M K; Nagarjuna, P; Ogbu, C; Pendhalkar, D; Rao, A V S Raja; Rao, G Venkateshwar; Sarma, V K; Shaik, S; Sharma, G V R; Singh, S; Sreedhar, C; Sonawane, R; Timmanna, U; Hardy, L W

    2005-01-01

    Natural product analogs are significant sources for therapeutic agents. To capitalize efficiently on the effective features of naturally occurring substances, a natural product-based library production platform has been devised at Aurigene for drug lead discovery. This approach combines the attractive biological and physicochemical properties of natural product scaffolds, provided by eons of natural selection, with the chemical diversity available from parallel synthetic methods. Virtual property analysis, using computational methods described here, guides the selection of a set of natural product scaffolds that are both structurally diverse and likely to have favorable pharmacokinetic properties. The experimental characterization of several in vitro ADME properties of twenty of these scaffolds, and of a small set of designed congeners based upon one scaffold, is also described. These data confirm that most of the scaffolds and the designed library members have properties favorable to their utilization for creating libraries of lead-like molecules.

  12. Industrial Fuel Gas Demonstration-Plant Program. Volume II. The environment (Deliverable No. 27). [Baseline environmental data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    1979-08-01

    The proposed site of the Industrial Fuel Gas Demonstration Plant (IFGDP) is located on a small peninsula extending eastward into Lake McKeller from the south shore. The peninsula is located west-southwest of the City of Memphis near the confluence of Lake McKeller and the Mississippi River. The environmental setting of this site and the region around this site is reported in terms of physical, biological, and human descriptions. Within the physical description, this report divides the environmental setting into sections on physiography, geology, hydrology, water quality, climatology, air quality, and ambient noise. The biological description is divided into sections on aquatic and terrestrial ecology. Finally, the human environment description is reported in sections on land use, demography, socioeconomics, culture, and visual features. This section concludes with a discussion of physical environmental constraints.

  13. Nested Tracking Graphs

    DOE PAGES

    Lukasczyk, Jonas; Weber, Gunther; Maciejewski, Ross; ...

    2017-06-01

    Tracking graphs are a well established tool in topological analysis to visualize the evolution of components and their properties over time, i.e., when components appear, disappear, merge, and split. However, tracking graphs are limited to a single level threshold and the graphs may vary substantially even under small changes to the threshold. To examine the evolution of features for varying levels, users have to compare multiple tracking graphs without a direct visual link between them. We propose a novel, interactive, nested graph visualization based on the fact that the tracked superlevel set components for different levels are related to each other through their nesting hierarchy. This approach allows us to set multiple tracking graphs in context to each other and enables users to effectively follow the evolution of components for different levels simultaneously. We show the effectiveness of our approach on datasets from finite pointset methods, computational fluid dynamics, and cosmology simulations.

  14. Molecular dynamics simulations and docking enable to explore the biophysical factors controlling the yields of engineered nanobodies.

    PubMed

    Soler, Miguel A; de Marco, Ario; Fortuna, Sara

    2016-10-10

    Nanobodies (VHHs) have proved to be valuable substitutes for conventional antibodies for molecular recognition. Their small size represents a precious advantage for rational mutagenesis based on modelling. Here we address the problem of predicting how Camelidae nanobody sequences can tolerate mutations by developing a simulation protocol based on all-atom molecular dynamics and whole-molecule docking. The method was tested on two sets of nanobodies characterized experimentally for their biophysical features. One set contained point mutations introduced to humanize a wild type sequence; in the second, the CDRs were swapped between single-domain frameworks with Camelidae and human hallmarks. The method resulted in accurate scoring approaches to predict experimental yields and enabled identification of the structural modifications induced by mutations. This work is a promising tool for the in silico development of single-domain antibodies and opens the opportunity to customize single functional domains of larger macromolecules.

  15. Molecular dynamics simulations and docking enable to explore the biophysical factors controlling the yields of engineered nanobodies

    NASA Astrophysics Data System (ADS)

    Soler, Miguel A.; De Marco, Ario; Fortuna, Sara

    2016-10-01

    Nanobodies (VHHs) have proved to be valuable substitutes for conventional antibodies for molecular recognition. Their small size represents a precious advantage for rational mutagenesis based on modelling. Here we address the problem of predicting how Camelidae nanobody sequences can tolerate mutations by developing a simulation protocol based on all-atom molecular dynamics and whole-molecule docking. The method was tested on two sets of nanobodies characterized experimentally for their biophysical features. One set contained point mutations introduced to humanize a wild type sequence; in the second, the CDRs were swapped between single-domain frameworks with Camelidae and human hallmarks. The method resulted in accurate scoring approaches to predict experimental yields and enabled identification of the structural modifications induced by mutations. This work is a promising tool for the in silico development of single-domain antibodies and opens the opportunity to customize single functional domains of larger macromolecules.

  16. Classification of large-scale fundus image data sets: a cloud-computing framework.

    PubMed

    Roychowdhury, Sohini

    2016-08-01

    Large medical image data sets with high dimensionality require substantial amount of computation time for data creation and data processing. This paper presents a novel generalized method that finds optimal image-based feature sets that reduce computational time complexity while maximizing overall classification accuracy for detection of diabetic retinopathy (DR). First, region-based and pixel-based features are extracted from fundus images for classification of DR lesions and vessel-like structures. Next, feature ranking strategies are used to distinguish the optimal classification feature sets. DR lesion and vessel classification accuracies are computed using the boosted decision tree and decision forest classifiers in the Microsoft Azure Machine Learning Studio platform, respectively. For images from the DIARETDB1 data set, 40 of its highest-ranked features are used to classify four DR lesion types with an average classification accuracy of 90.1% in 792 seconds. Also, for classification of red lesion regions and hemorrhages from microaneurysms, accuracies of 85% and 72% are observed, respectively. For images from STARE data set, 40 high-ranked features can classify minor blood vessels with an accuracy of 83.5% in 326 seconds. Such cloud-based fundus image analysis systems can significantly enhance the borderline classification performances in automated screening systems.
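
    A rough sketch of the rank-then-classify strategy is shown below, using scikit-learn components as stand-ins for the Azure Machine Learning Studio modules named above; features are ranked by mutual information, the top 40 are retained, and a boosted decision-tree classifier is cross-validated on synthetic data.

```python
# Sketch of rank-then-classify with scikit-learn stand-ins: rank features by mutual
# information, keep the top 40, and cross-validate a boosted decision-tree classifier.
# Synthetic data replace the fundus image features.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=120, n_informative=25,
                           random_state=0)

pipe = make_pipeline(SelectKBest(mutual_info_classif, k=40),
                     GradientBoostingClassifier(random_state=0))
print("cross-validated accuracy:", cross_val_score(pipe, X, y, cv=5).mean().round(3))
```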

  17. Effects of lateral boundary condition resolution and update frequency on regional climate model predictions

    NASA Astrophysics Data System (ADS)

    Pankatz, Klaus; Kerkweg, Astrid

    2015-04-01

    The work presented is part of the joint project "DecReg" ("Regional decadal predictability") which is in turn part of the project "MiKlip" ("Decadal predictions"), an effort funded by the German Federal Ministry of Education and Research to improve decadal predictions on a global and regional scale. In MiKlip, one big question is if regional climate modeling shows "added value", i.e. to evaluate, if regional climate models (RCM) produce better results than the driving models. However, the scope of this study is to look more closely at the setup specific details of regional climate modeling. As regional models only simulate a small domain, they have to inherit information about the state of the atmosphere at their lateral boundaries from external data sets. There are many unresolved questions concerning the setup of lateral boundary conditions (LBC). External data sets come from global models or from global reanalysis data-sets. A temporal resolution of six hours is common for this kind of data. This is mainly due to the fact, that storage space is a limiting factor, especially for climate simulations. However, theoretically, the coupling frequency could be as high as the time step of the driving model. Meanwhile, it is unclear if a more frequent update of the LBCs has a significant effect on the climate in the domain of the RCM. The first study examines how the RCM reacts to a higher update frequency. The study is based on a 30 year time slice experiment for three update frequencies of the LBC, namely six hours, one hour and six minutes. The evaluation of means, standard deviations and statistics of the climate in the regional domain shows only small deviations, some statistically significant though, of 2m temperature, sea level pressure and precipitation. The second part of the first study assesses parameters linked to cyclone activity, which is affected by the LBC update frequency. Differences in track density and strength are found when comparing the simulations. Theoretically, regional down-scaling should act like a magnifying glass. It should reveal details on small scales which a global model cannot resolve, but it should not affect the large scale flow. As the development of the small scale features takes some time, it is important that the air stays long enough within the regional domain. The spin-up time of the small scale features is, of course, dependent on the resolution of the LBC and the resolution of the RCM. The second study examines the quality of decadal hind-casts over Europe of the decade 2001-2010 when the horizontal resolution of the driving model, namely 2.8°, 1.8°, 1.4°, 1.1°, from which the LBC are calculated, is altered. The study shows, that a smaller resolution gap between LBC resolution and RCM resolution might be beneficial.

  18. The effect of feature selection methods on computer-aided detection of masses in mammograms

    NASA Astrophysics Data System (ADS)

    Hupse, Rianne; Karssemeijer, Nico

    2010-05-01

    In computer-aided diagnosis (CAD) research, feature selection methods are often used to improve generalization performance of classifiers and shorten computation times. In an application that detects malignant masses in mammograms, we investigated the effect of using a selection criterion that is similar to the final performance measure we are optimizing, namely the mean sensitivity of the system in a predefined range of the free-response receiver operating characteristics (FROC). To obtain the generalization performance of the selected feature subsets, a cross validation procedure was performed on a dataset containing 351 abnormal and 7879 normal regions, each region providing a set of 71 mass features. The same number of noise features, not containing any information, were added to investigate the ability of the feature selection algorithms to distinguish between useful and non-useful features. It was found that significantly higher performances were obtained using feature sets selected by the general test statistic Wilks' lambda than using feature sets selected by the more specific FROC measure. Feature selection leads to better performance when compared to a system in which all features were used.

  19. Automated Inference of Chemical Discriminants of Biological Activity.

    PubMed

    Raschka, Sebastian; Scott, Anne M; Huertas, Mar; Li, Weiming; Kuhn, Leslie A

    2018-01-01

    Ligand-based virtual screening has become a standard technique for the efficient discovery of bioactive small molecules. Following assays to determine the activity of compounds selected by virtual screening, or other approaches in which dozens to thousands of molecules have been tested, machine learning techniques make it straightforward to discover the patterns of chemical groups that correlate with the desired biological activity. Defining the chemical features that generate activity can be used to guide the selection of molecules for subsequent rounds of screening and assaying, as well as help design new, more active molecules for organic synthesis. The quantitative structure-activity relationship machine learning protocols we describe here, using decision trees, random forests, and sequential feature selection, take as input the chemical structure of a single, known active small molecule (e.g., an inhibitor, agonist, or substrate) for comparison with the structure of each tested molecule. Knowledge of the atomic structure of the protein target and its interactions with the active compound is not required. These protocols can be modified and applied to any data set that consists of a series of measured structural, chemical, or other features for each tested molecule, along with the experimentally measured value of the response variable you would like to predict or optimize for your project, for instance, inhibitory activity in a biological assay or ΔG of binding. To illustrate the use of different machine learning algorithms, we step through the analysis of a dataset of inhibitor candidates from virtual screening that were tested recently for their ability to inhibit GPCR-mediated signaling in a vertebrate.
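
    The protocol family described above maps naturally onto standard scikit-learn components; the sketch below, with synthetic descriptors standing in for the real chemical features, combines a random forest with forward sequential feature selection to isolate a small predictive descriptor set. It is illustrative only and assumes scikit-learn 0.24 or later.

```python
# Sketch of a random forest combined with forward sequential feature selection on a
# synthetic descriptor table (stand-in for the real chemical features).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
sfs = SequentialFeatureSelector(forest, n_features_to_select=5,
                                direction="forward", cv=3)
sfs.fit(X, y)

selected = sfs.get_support(indices=True)
print("selected descriptor columns:", selected)
print("cross-validated accuracy on the selected set:",
      cross_val_score(forest, X[:, selected], y, cv=5).mean().round(3))
```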

  20. EEG-based recognition of video-induced emotions: selecting subject-independent feature set.

    PubMed

    Kortelainen, Jukka; Seppänen, Tapio

    2013-01-01

    Emotions are fundamental for everyday life, affecting our communication, learning, perception, and decision making. Including emotions in human-computer interaction (HCI) could be seen as a significant step forward, offering great potential for developing advanced future technologies. Because the electrical activity of the brain is affected by emotions, the electroencephalogram (EEG) offers an interesting channel for improving HCI. In this paper, the selection of a subject-independent feature set for EEG-based emotion recognition is studied. We investigate the effect of different feature sets in classifying a person's arousal and valence while watching videos with emotional content. The classification performance is optimized by applying a sequential forward floating search algorithm for feature selection. The best classification rate (65.1% for arousal and 63.0% for valence) is obtained with a feature set containing power spectral features from the frequency band of 1-32 Hz. The proposed approach substantially improves the classification rate reported in the literature. In the future, further analysis of the video-induced EEG changes, including the topographical differences in the spectral features, is needed.
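
    The feature pipeline can be sketched as follows: band powers over sub-bands of 1-32 Hz are computed with Welch's method and a forward feature search selects a subset. Note that scikit-learn's plain forward selection is used here as a stand-in for the floating (SFFS) variant used in the paper, and synthetic signals replace real EEG.

```python
# Sketch of band-power feature extraction (Welch PSD over sub-bands of 1-32 Hz)
# followed by forward feature selection with a linear SVM. Plain forward selection
# stands in for the floating SFFS variant; synthetic signals stand in for EEG.
import numpy as np
from scipy.signal import welch
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score

fs = 128                                   # assumed sampling rate in Hz
bands = [(1, 4), (4, 8), (8, 13), (13, 20), (20, 32)]
rng = np.random.default_rng(0)

def band_powers(sig):
    f, p = welch(sig, fs=fs, nperseg=256)
    return np.array([p[(f >= lo) & (f < hi)].mean() for lo, hi in bands])

n_trials, n_channels, length = 120, 4, 1024
y = rng.integers(0, 2, n_trials)                              # e.g., low vs high arousal
signals = rng.standard_normal((n_trials, n_channels, length))
signals[y == 1, 0] += np.sin(2 * np.pi * 10 * np.arange(length) / fs)  # extra alpha power

X = np.array([np.concatenate([band_powers(ch) for ch in trial]) for trial in signals])

svm = make_pipeline(StandardScaler(), SVC(kernel="linear"))
sfs = SequentialFeatureSelector(svm, n_features_to_select=5, direction="forward", cv=3)
sfs.fit(X, y)
cols = sfs.get_support(indices=True)
print("selected band-power features:", cols)
print("cross-validated accuracy:",
      cross_val_score(svm, X[:, cols], y, cv=5).mean().round(3))
```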

  1. [Impact of small-area context on health: proposing a conceptual model].

    PubMed

    Voigtländer, S; Mielck, A; Razum, O

    2012-11-01

    Recent empirical studies stress the impact of features related to the small-area context on individual health. However, so far there exists no standard explanatory model that integrates the different kinds of such features and that conceptualises their relation to individual characteristics of social inequality. A review of theoretical publications on the relationship between social position and health as well as existing conceptual models for the impact of features related to the small-area context on health was undertaken. In the present article we propose a conceptual model for the health impact of the small-area context. This model conceptualises the location of residence as one dimension of social inequality that affects health through the resources as well as stressors which are inherent in the small-area context. The proposed conceptual model offers an orientation for future empirical studies and can serve as a basis for further discussions concerning the health relevance of the small-area context. © Georg Thieme Verlag KG Stuttgart · New York.

  2. Unique effects of setting goals on behavior change: Systematic review and meta-analysis.

    PubMed

    Epton, Tracy; Currie, Sinead; Armitage, Christopher J

    2017-12-01

    Goal setting is a common feature of behavior change interventions, but it is unclear when goal setting is optimally effective. The aims of this systematic review and meta-analysis were to evaluate: (a) the unique effects of goal setting on behavior change, and (b) under what circumstances and for whom goal setting works best. Four databases were searched for articles that assessed the unique effects of goal setting on behavior change using randomized controlled trials. One hundred and forty-one papers were identified, from which 384 effect sizes (N = 16,523) were extracted and analyzed. A moderator analysis of sample characteristics, intervention characteristics, inclusion of other behavior change techniques, study design and delivery, quality of study, outcome measures, and behavior targeted was conducted. A random effects model indicated a small positive unique effect of goal setting across a range of behaviors, d = .34 (CI [.28, .41]). Moderator analyses indicated that goal setting was particularly effective if the goal was: (a) difficult, (b) set publicly, and (c) a group goal. There was weaker evidence that goal setting was more effective when paired with external monitoring of the behavior/outcome by others without feedback and delivered face-to-face. Goal setting is an effective behavior change technique that has the potential to be considered a fundamental component of successful interventions. The present review adds novel insights into the means by which goal setting might be augmented to maximize behavior change and sets the agenda for future programs of research. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  3. A linearized Euler analysis of unsteady flows in turbomachinery

    NASA Technical Reports Server (NTRS)

    Hall, Kenneth C.; Crawley, Edward F.

    1987-01-01

    A method for calculating unsteady flows in cascades is presented. The model, which is based on the linearized unsteady Euler equations, accounts for blade loading, shock motion, wake motion, and blade geometry. The mean flow through the cascade is determined by solving the full nonlinear Euler equations. Assuming the unsteadiness in the flow is small, the Euler equations are linearized about the mean flow to obtain a set of linear variable-coefficient equations which describe the small amplitude, harmonic motion of the flow. These equations are discretized on a computational grid via a finite volume operator and solved directly subject to an appropriate set of linearized boundary conditions. The steady flow, which is calculated prior to the unsteady flow, is found via a Newton iteration procedure. An important feature of the analysis is the use of shock fitting to model steady and unsteady shocks. Use of the Euler equations with the unsteady Rankine-Hugoniot shock jump conditions correctly models the generation of steady and unsteady entropy and vorticity at shocks. In particular, the low frequency shock displacement is correctly predicted. Results of this method are presented for a variety of test cases. Predicted unsteady transonic flows in channels are compared to full nonlinear Euler solutions obtained using time-accurate, time-marching methods. The agreement between the two methods is excellent for small to moderate levels of flow unsteadiness. The method is also used to predict unsteady flows in cascades due to blade motion (flutter problem) and incoming disturbances (gust response problem).
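
    For readers unfamiliar with the linearization step, the following is a generic one-dimensional sketch of splitting the state into a mean part and a small harmonic perturbation; it is illustrative only and omits the two-dimensional cascade geometry, boundary conditions, and shock-fitting treatment of the actual method.

```latex
% Generic 1-D sketch of the linearization step (illustrative, not the paper's full
% two-dimensional cascade formulation). Conservative form and mean/perturbation split:
\[
\frac{\partial U}{\partial t} + \frac{\partial F(U)}{\partial x} = 0,
\qquad
U(x,t) = \bar{U}(x) + u'(x)\, e^{i\omega t}, \qquad \lVert u' \rVert \ll \lVert \bar{U} \rVert .
\]
% Substituting and discarding terms quadratic in the perturbation leaves a linear,
% variable-coefficient system for the harmonic amplitude u'(x):
\[
i\omega\, u' + \frac{\partial}{\partial x}\!\left( A(\bar{U})\, u' \right) = 0,
\qquad
A(\bar{U}) \equiv \left. \frac{\partial F}{\partial U} \right|_{\bar{U}},
\]
% while the mean flow itself satisfies the steady nonlinear equation
% dF(\bar{U})/dx = 0, which is solved beforehand (e.g., by a Newton iteration).
```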

  4. Dealing with uncertainty in landscape genetic resistance models: a case of three co-occurring marsupials.

    PubMed

    Dudaniec, Rachael Y; Worthington Wilmer, Jessica; Hanson, Jeffrey O; Warren, Matthew; Bell, Sarah; Rhodes, Jonathan R

    2016-01-01

    Landscape genetics lacks explicit methods for dealing with the uncertainty in landscape resistance estimation, which is particularly problematic when sample sizes of individuals are small. Unless uncertainty can be quantified, valuable but small data sets may be rendered unusable for conservation purposes. We offer a method to quantify uncertainty in landscape resistance estimates using multimodel inference as an improvement over single model-based inference. We illustrate the approach empirically using co-occurring, woodland-preferring Australian marsupials within a common study area: two arboreal gliders (Petaurus breviceps, and Petaurus norfolcensis) and one ground-dwelling antechinus (Antechinus flavipes). First, we use maximum-likelihood and a bootstrap procedure to identify the best-supported isolation-by-resistance model out of 56 models defined by linear and non-linear resistance functions. We then quantify uncertainty in resistance estimates by examining parameter selection probabilities from the bootstrapped data. The selection probabilities provide estimates of uncertainty in the parameters that drive the relationships between landscape features and resistance. We then validate our method for quantifying uncertainty using simulated genetic and landscape data showing that for most parameter combinations it provides sensible estimates of uncertainty. We conclude that small data sets can be informative in landscape genetic analyses provided uncertainty can be explicitly quantified. Being explicit about uncertainty in landscape genetic models will make results more interpretable and useful for conservation decision-making, where dealing with uncertainty is critical. © 2015 John Wiley & Sons Ltd.
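
    The bootstrap-based model-selection probabilities can be sketched in a few lines: refit each candidate resistance function to resampled data and count how often each wins by an information criterion. The snippet below is a toy illustration with two made-up candidate functions and synthetic data, not the authors' 56-model analysis; NumPy and SciPy are assumed.

```python
# Toy bootstrap multimodel inference: two made-up candidate resistance functions are
# refit to resampled data, and the fraction of resamples each wins by AIC is reported
# as a model-selection probability. Synthetic data stand in for pairwise genetic and
# landscape distances.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)

def linear(x, a, b):
    return a + b * x

def exponential(x, a, b, c):
    return a + b * np.exp(c * x)

def aic(y, yhat, k):
    n = len(y)
    return n * np.log(np.sum((y - yhat) ** 2) / n) + 2 * k

x = rng.uniform(0, 1, 40)
y = 0.2 + 0.5 * np.exp(1.5 * x) + rng.normal(0, 0.15, 40)   # truly nonlinear response

wins = {"linear": 0, "exponential": 0}
for _ in range(500):
    idx = rng.integers(0, len(x), len(x))                   # bootstrap resample
    xb, yb = x[idx], y[idx]
    try:
        p1, _ = curve_fit(linear, xb, yb)
        p2, _ = curve_fit(exponential, xb, yb, p0=[0.1, 0.5, 1.0], maxfev=5000)
    except RuntimeError:
        continue                                            # skip non-converging resamples
    scores = {"linear": aic(yb, linear(xb, *p1), 2),
              "exponential": aic(yb, exponential(xb, *p2), 3)}
    wins[min(scores, key=scores.get)] += 1

total = sum(wins.values())
print({m: round(w / total, 2) for m, w in wins.items()})    # selection probabilities
```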

  5. Systems and Methods for Correcting Optical Reflectance Measurements

    NASA Technical Reports Server (NTRS)

    Yang, Ye (Inventor); Shear, Michael A. (Inventor); Soller, Babs R. (Inventor); Soyemi, Olusola O. (Inventor)

    2014-01-01

    We disclose measurement systems and methods for measuring analytes in target regions of samples that also include features overlying the target regions. The systems include: (a) a light source; (b) a detection system; (c) a set of at least first, second, and third light ports which transmit light from the light source to a sample and receive and direct light reflected from the sample to the detection system, generating a first set of data including information corresponding to both an internal target within the sample and features overlying the internal target, and a second set of data including information corresponding to features overlying the internal target; and (d) a processor configured to remove information characteristic of the overlying features from the first set of data using the first and second sets of data to produce corrected information representing the internal target.

  6. Systems and methods for correcting optical reflectance measurements

    NASA Technical Reports Server (NTRS)

    Yang, Ye (Inventor); Soller, Babs R. (Inventor); Soyemi, Olusola O. (Inventor); Shear, Michael A. (Inventor)

    2009-01-01

    We disclose measurement systems and methods for measuring analytes in target regions of samples that also include features overlying the target regions. The systems include: (a) a light source; (b) a detection system; (c) a set of at least first, second, and third light ports which transmit light from the light source to a sample and receive and direct light reflected from the sample to the detection system, generating a first set of data including information corresponding to both an internal target within the sample and features overlying the internal target, and a second set of data including information corresponding to features overlying the internal target; and (d) a processor configured to remove information characteristic of the overlying features from the first set of data using the first and second sets of data to produce corrected information representing the internal target.

  7. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ogden, K; O’Dwyer, R; Bradford, T

    Purpose: To reduce differences in features calculated from MRI brain scans acquired at different field strengths with or without Gadolinium contrast. Methods: Brain scans were processed for 111 epilepsy patients to extract hippocampus and thalamus features. Scans were acquired on 1.5 T scanners with Gadolinium contrast (group A), 1.5 T scanners without Gd (group B), and 3.0 T scanners without Gd (group C). A total of 72 features were extracted. Features were extracted from original scans and from scans where the image pixel values were rescaled to the mean of the hippocampi and thalami values. For each data set, cluster analysis was performed on the raw feature set and for feature sets with normalization (conversion to Z scores). Two methods of normalization were used: the first normalized over all values of a given feature, and the second normalized within each patient group. The clustering software was configured to produce 3 clusters. Group fractions in each cluster were calculated. Results: For features calculated from both the non-rescaled and rescaled data, cluster membership was identical for both the non-normalized and normalized data sets. Cluster 1 was composed entirely of Group A data, Cluster 2 contained data from all three groups, and Cluster 3 contained data from only groups 1 and 2. For the categorically normalized data sets there was a more uniform distribution of group data in the three clusters. A less pronounced effect was seen in the rescaled image data features. Conclusion: Image rescaling and feature renormalization can have a significant effect on the results of clustering analysis. These effects are also likely to influence the results of supervised machine learning algorithms. It may be possible to partly remove the influence of scanner field strength and the presence of Gadolinium-based contrast in feature extraction for radiomics applications.
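
    A minimal sketch of the normalization-and-clustering comparison is given below: synthetic features with a group-dependent offset stand in for the hippocampus/thalamus measurements, Z scoring is applied either globally or within each group, and k-means with k = 3 is used in place of the unspecified clustering software; the group fraction in each cluster is then tabulated.

```python
# Sketch of the normalization-and-clustering comparison: Z scores computed globally
# vs within each acquisition group, followed by k-means with k = 3. Synthetic features
# with a group-dependent offset stand in for the hippocampus/thalamus measurements.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
groups = np.repeat(["A", "B", "C"], [40, 35, 36])            # scanner/contrast groups
offsets = {"A": 1.0, "B": 0.0, "C": -0.8}                    # assumed acquisition bias
X = rng.standard_normal((len(groups), 10)) + np.array([offsets[g] for g in groups])[:, None]

def zscore(M):
    return (M - M.mean(0)) / M.std(0)

X_global = zscore(X)                                         # normalize over all patients
X_within = X.copy()                                          # normalize within each group
for g in "ABC":
    X_within[groups == g] = zscore(X[groups == g])

for name, data in [("global Z scores", X_global), ("within-group Z scores", X_within)]:
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(data)
    print(name)
    for c in range(3):
        members = groups[labels == c]
        print("  cluster", c, {g: round(float((members == g).mean()), 2) for g in "ABC"})
```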

  8. A framework for feature extraction from hospital medical data with applications in risk prediction.

    PubMed

    Tran, Truyen; Luo, Wei; Phung, Dinh; Gupta, Sunil; Rana, Santu; Kennedy, Richard Lee; Larkins, Ann; Venkatesh, Svetha

    2014-12-30

    Feature engineering is a time-consuming component of predictive modeling. We propose a versatile platform to automatically extract features for risk prediction, based on a pre-defined and extensible entity schema. The extraction is independent of disease type or risk prediction task. We contrast auto-extracted features to baselines generated from the Elixhauser comorbidities. Hospital medical records were transformed into event sequences, to which filters were applied to extract feature sets capturing diversity in temporal scales and data types. The features were evaluated on a readmission prediction task, comparing with baseline feature sets generated from the Elixhauser comorbidities. The prediction model was logistic regression with elastic net regularization. Prediction horizons of 1, 2, 3, 6, and 12 months were considered for four diverse diseases: diabetes, COPD, mental disorders and pneumonia, with derivation and validation cohorts defined on non-overlapping data-collection periods. For unplanned readmissions, the auto-extracted feature set using socio-demographic information and medical records outperformed baselines derived from the socio-demographic information and Elixhauser comorbidities over 20 settings (5 prediction horizons over 4 diseases). In particular, for 30-day prediction, the AUCs are: COPD-baseline: 0.60 (95% CI: 0.57, 0.63), auto-extracted: 0.67 (0.64, 0.70); diabetes-baseline: 0.60 (0.58, 0.63), auto-extracted: 0.67 (0.64, 0.69); mental disorders-baseline: 0.57 (0.54, 0.60), auto-extracted: 0.69 (0.64, 0.70); pneumonia-baseline: 0.61 (0.59, 0.63), auto-extracted: 0.70 (0.67, 0.72). The advantages of auto-extracting standard features from complex medical records in a disease- and task-agnostic manner were demonstrated. Auto-extracted features have good predictive power over multiple time horizons. Such feature sets have potential to form the foundation of complex automated analytic tasks.
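
    The evaluation step lends itself to a short sketch: candidate feature sets are scored by cross-validated AUC under logistic regression with elastic-net regularization. The snippet below uses synthetic, imbalanced data and an arbitrary split into a small "baseline" block and a larger "auto-extracted" block purely to illustrate the comparison; scikit-learn is assumed.

```python
# Sketch of the comparison step: cross-validated AUC of elastic-net logistic regression
# on a small "baseline" feature block versus a larger "auto-extracted" block. The data
# are synthetic and imbalanced purely to illustrate the workflow.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=200, n_informative=30,
                           weights=[0.85], random_state=0)   # ~15% positive, like readmission
X_baseline = X[:, :30]                                        # stand-in for comorbidity features

model = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="elasticnet", solver="saga", l1_ratio=0.5,
                       C=0.5, max_iter=5000))

for name, data in [("baseline", X_baseline), ("auto-extracted", X)]:
    auc = cross_val_score(model, data, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: AUC = {auc:.3f}")
```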

  9. A Reduced Set of Features for Chronic Kidney Disease Prediction

    PubMed Central

    Misir, Rajesh; Mitra, Malay; Samanta, Ranjit Kumar

    2017-01-01

    Chronic kidney disease (CKD) is one of the life-threatening diseases. Early detection and proper management are solicited for augmenting survivability. As per the UCI data set, there are 24 attributes for predicting CKD or non-CKD. At least 16 of these attributes require pathological investigations involving additional resources, money, time, and uncertainty. The objective of this work is to explore whether we can predict CKD or non-CKD with reasonable accuracy using a smaller number of features. An intelligent system development approach has been used in this study. We applied an important feature selection technique to discover a reduced feature set that explains the data set much better. Two intelligent binary classification techniques have been adopted to validate the reduced feature set. Performance was evaluated in terms of four important classification evaluation parameters. As suggested by our results, we may concentrate on this reduced feature set for identifying CKD, thereby reducing uncertainty, saving time, and lowering costs. PMID:28706750
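
    A rough sketch of the reduce-then-classify idea follows, with a synthetic 24-attribute table standing in for the UCI CKD data and univariate F-scores plus two generic classifiers standing in for the unspecified selection technique and "intelligent" classifiers of the paper; four standard classification metrics are reported per classifier.

```python
# Sketch of reduce-then-classify: univariate F-scores keep 8 of 24 synthetic attributes,
# and two generic classifiers are evaluated on the reduced set with four standard metrics.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_validate

X, y = make_classification(n_samples=400, n_features=24, n_informative=8,
                           random_state=0)

for name, clf in [("MLP", MLPClassifier(max_iter=2000, random_state=0)),
                  ("SVM", SVC())]:
    pipe = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=8), clf)
    scores = cross_validate(pipe, X, y, cv=5,
                            scoring=["accuracy", "precision", "recall", "f1"])
    print(name, {m: round(scores["test_" + m].mean(), 3)
                 for m in ["accuracy", "precision", "recall", "f1"]})
```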

  10. Ensemble methods with simple features for document zone classification

    NASA Astrophysics Data System (ADS)

    Obafemi-Ajayi, Tayo; Agam, Gady; Xie, Bingqing

    2012-01-01

    Document layout analysis is of fundamental importance for document image understanding and information retrieval. It requires the identification of blocks extracted from a document image via features extraction and block classification. In this paper, we focus on the classification of the extracted blocks into five classes: text (machine printed), handwriting, graphics, images, and noise. We propose a new set of features for efficient classifications of these blocks. We present a comparative evaluation of three ensemble based classification algorithms (boosting, bagging, and combined model trees) in addition to other known learning algorithms. Experimental results are demonstrated for a set of 36503 zones extracted from 416 document images which were randomly selected from the tobacco legacy document collection. The results obtained verify the robustness and effectiveness of the proposed set of features in comparison to the commonly used Ocropus recognition features. When used in conjunction with the Ocropus feature set, we further improve the performance of the block classification system to obtain a classification accuracy of 99.21%.
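
    The ensemble comparison can be sketched with scikit-learn as below, where synthetic block descriptors stand in for the extracted zone features and a five-class target stands in for the text/handwriting/graphics/image/noise labels; bagging and boosting of decision trees are compared against a single tree. Parameters are illustrative assumptions.

```python
# Sketch comparing a single decision tree with bagged and boosted trees on a synthetic
# five-class task standing in for text/handwriting/graphics/image/noise zones.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=40, n_informative=15,
                           n_classes=5, n_clusters_per_class=1, random_state=0)

models = {
    "single tree": DecisionTreeClassifier(random_state=0),
    "bagging": BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0),
    "boosting": AdaBoostClassifier(n_estimators=100, random_state=0),
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean().round(3))
```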

  11. Object-oriented feature extraction approach for mapping supraglacial debris in Schirmacher Oasis using very high-resolution satellite data

    NASA Astrophysics Data System (ADS)

    Jawak, Shridhar D.; Jadhav, Ajay; Luis, Alvarinho J.

    2016-05-01

    Supraglacial debris was mapped in the Schirmacher Oasis, east Antarctica, by using WorldView-2 (WV-2) high resolution optical remote sensing data consisting of 8-band calibrated Gram Schmidt (GS)-sharpened and atmospherically corrected WV-2 imagery. This study is a preliminary attempt to develop an object-oriented rule set to extract supraglacial debris for Antarctic region using 8-spectral band imagery. Supraglacial debris was manually digitized from the satellite imagery to generate the ground reference data. Several trials were performed using few existing traditional pixel-based classification techniques and color-texture based object-oriented classification methods to extract supraglacial debris over a small domain of the study area. Multi-level segmentation and attributes such as scale, shape, size, compactness along with spectral information from the data were used for developing the rule set. The quantitative analysis of error was carried out against the manually digitized reference data to test the practicability of our approach over the traditional pixel-based methods. Our results indicate that OBIA-based approach (overall accuracy: 93%) for extracting supraglacial debris performed better than all the traditional pixel-based methods (overall accuracy: 80-85%). The present attempt provides a comprehensive improved method for semiautomatic feature extraction in supraglacial environment and a new direction in the cryospheric research.

  12. Odor Recognition vs. Classification in Artificial Olfaction

    NASA Astrophysics Data System (ADS)

    Raman, Baranidharan; Hertz, Joshua; Benkstein, Kurt; Semancik, Steve

    2011-09-01

    Most studies in chemical sensing have focused on the problem of precise identification of chemical species that were exposed during the training phase (the recognition problem). However, generalization of training to predict the chemical composition of untrained gases based on their similarity with analytes in the training set (the classification problem) has received very limited attention. These two analytical tasks pose conflicting constraints on the system. While correct recognition requires detection of molecular features that are unique to an analyte, generalization to untrained chemicals requires detection of features that are common across a desired class of analytes. A simple solution that addresses both issues simultaneously can be obtained from biological olfaction, where the odor class and identity information are decoupled and extracted individually over time. Mimicking this approach, we proposed a hierarchical scheme that allowed initial discrimination between broad chemical classes (e.g. contains oxygen) followed by finer refinements using additional data into sub-classes (e.g. ketones vs. alcohols) and, eventually, specific compositions (e.g. ethanol vs. methanol) [1]. We validated this approach using an array of temperature-controlled chemiresistors. We demonstrated that a small set of training analytes is sufficient to allow generalization to novel chemicals and that the scheme provides robust categorization despite aging. Here, we provide further characterization of this approach.
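
    The hierarchical scheme can be illustrated with a small two-stage classifier: a coarse model first predicts the broad chemical class, and a per-class model then refines the prediction to a specific analyte. In the sketch below, synthetic sensor responses stand in for the temperature-controlled chemiresistor data and logistic regression stands in for the actual decision stages.

```python
# Sketch of a two-stage hierarchical classifier: a coarse model predicts the broad
# chemical class, then a per-class model refines the prediction to a specific analyte.
# Synthetic sensor responses stand in for the chemiresistor array data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_per, n_feat = 80, 16
analytes = [("oxygenated", "ethanol"), ("oxygenated", "acetone"),
            ("hydrocarbon", "hexane"), ("hydrocarbon", "toluene")]

X, coarse, fine = [], [], []
for k, (cls, analyte) in enumerate(analytes):
    X.append(rng.standard_normal((n_per, n_feat)) + k)      # analyte-specific response offset
    coarse += [cls] * n_per
    fine += [analyte] * n_per
X, coarse, fine = np.vstack(X), np.array(coarse), np.array(fine)

Xtr, Xte, ctr, cte, ftr, fte = train_test_split(X, coarse, fine, random_state=0)

coarse_clf = LogisticRegression(max_iter=2000).fit(Xtr, ctr)
fine_clfs = {c: LogisticRegression(max_iter=2000).fit(Xtr[ctr == c], ftr[ctr == c])
             for c in np.unique(ctr)}

pred_coarse = coarse_clf.predict(Xte)
pred_fine = np.array([fine_clfs[c].predict(x.reshape(1, -1))[0]
                      for c, x in zip(pred_coarse, Xte)])
print("coarse accuracy:", round(float((pred_coarse == cte).mean()), 3))
print("fine accuracy:  ", round(float((pred_fine == fte).mean()), 3))
```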

  13. Analysis of spatial heterogeneity in normal epithelium and preneoplastic alterations in mouse prostate tumor models

    PubMed Central

    Valkonen, Mira; Ruusuvuori, Pekka; Kartasalo, Kimmo; Nykter, Matti; Visakorpi, Tapio; Latonen, Leena

    2017-01-01

    Cancer involves histological changes in tissue, which is of primary importance in pathological diagnosis and research. Automated histological analysis requires ability to computationally separate pathological alterations from normal tissue with all its variables. On the other hand, understanding connections between genetic alterations and histological attributes requires development of enhanced analysis methods suitable also for small sample sizes. Here, we set out to develop computational methods for early detection and distinction of prostate cancer-related pathological alterations. We use analysis of features from HE stained histological images of normal mouse prostate epithelium, distinguishing the descriptors for variability between ventral, lateral, and dorsal lobes. In addition, we use two common prostate cancer models, Hi-Myc and Pten+/− mice, to build a feature-based machine learning model separating the early pathological lesions provoked by these genetic alterations. This work offers a set of computational methods for separation of early neoplastic lesions in the prostates of model mice, and provides proof-of-principle for linking specific tumor genotypes to quantitative histological characteristics. The results obtained show that separation between different spatial locations within the organ, as well as classification between histologies linked to different genetic backgrounds, can be performed with very high specificity and sensitivity. PMID:28317907

  14. A neural network for noise correlation classification

    NASA Astrophysics Data System (ADS)

    Paitz, Patrick; Gokhberg, Alexey; Fichtner, Andreas

    2018-02-01

    We present an artificial neural network (ANN) for the classification of ambient seismic noise correlations into two categories, suitable and unsuitable for noise tomography. By using only a small manually classified data subset for network training, the ANN allows us to classify large data volumes with low human effort and to encode the valuable subjective experience of data analysts that cannot be captured by a deterministic algorithm. Based on a new feature extraction procedure that exploits the wavelet-like nature of seismic time-series, we efficiently reduce the dimensionality of noise correlation data, still keeping relevant features needed for automated classification. Using global- and regional-scale data sets, we show that classification errors of 20 per cent or less can be achieved when the network training is performed with as little as 3.5 per cent and 16 per cent of the data sets, respectively. Furthermore, the ANN trained on the regional data can be applied to the global data, and vice versa, without a significant increase of the classification error. An experiment where four students manually classified the data, revealed that the classification error they would assign to each other is substantially larger than the classification error of the ANN (>35 per cent). This indicates that reproducibility would be hampered more by human subjectivity than by imperfections of the ANN.

  15. Comparison of fruit syndromes between the Egyptian fruit-bat (Rousettus aegyptiacus) and birds in East Mediterranean habitats

    NASA Astrophysics Data System (ADS)

    Korine, Carmi; Izhaki, Ido; Arad, Zeev

    1998-04-01

    This study analyses the fruit syndrome of the Egyptian fruit-bat, Rousettus aegyptiacus, the only fruit-bat found in East Mediterranean habitats. Two different sets of bat-fruit syndromes were revealed. One follows the general bat-fruit syndrome and one represents a special case of bat-dispersed fruit syndrome found only in East Mediterranean habitats. The latter syndrome is characterized by dry fruits with a relatively high protein content. Fruit species that belong to this syndrome are available mostly in winter (when the fruit-bat faces a severe shortage in fruit availability and inadequate fruit quality). The fruit syndromes and dietary overlap between frugivorous birds (based on the literature) and the fruit-bat were also studied. Features associated with each set of fruit species generally follow the known bat and bird syndromes. Bird-dispersed fruits tend to be small, with a high ratio of seed mass to pulp mass, variable fat content, and a high ash content. However, when the shared fruit species were included in the analysis, no significant differences were found in fruit features between the bird-dispersed and bat-dispersed fruit syndromes. A limited and asymmetrical dietary overlap was observed between these two taxa, mainly between introduced and cultivated fruits.

  16. Cassini UVIS Observations of Saturn during the Grand Finale Orbits

    NASA Astrophysics Data System (ADS)

    Pryor, W. R.; Esposito, L. W.; West, R. A.; Jouchoux, A.; Radioti, A.; Grodent, D. C.; Gerard, J. C. M. C.; Gustin, J.; Lamy, L.; Badman, S. V.

    2017-12-01

    In 2016 and 2017, the Cassini Saturn orbiter executed a final series of high inclination, low-periapsis orbits ideal for studies of Saturn's polar regions. The Cassini Ultraviolet Imaging Spectrograph (UVIS) obtained an extensive set of auroral images, some at the highest spatial resolution obtained during Cassini's long orbital mission (2004-2017). In some cases, two or three spacecraft slews at right angles to the long slit of the spectrograph were required to cover the entire auroral region to form auroral images. We will present selected images from this set showing narrow arcs of emission, more diffuse auroral emissions, multiple auroral arcs in a single image, discrete spots of emission, small scale vortices, large-scale spiral forms, and parallel linear features that appear to cross in places like twisted wires. Some shorter features are transverse to the main auroral arcs, like barbs on a wire. UVIS observations were in some cases simultaneous with auroral observations from the Hubble Space Telescope Space Telescope Imaging Spectrograph (STIS) that will also be presented. UVIS polar images also contain spectral information suitable for studies of the auroral electron energy distribution. The long wavelength part of the UVIS polar images contains a signal from reflected sunlight containing absorption signatures of acetylene and other Saturn hydrocarbons. The hydrocarbon spatial distribution will also be examined.

  17. A Non-Parametric Approach for the Activation Detection of Block Design fMRI Simulated Data Using Self-Organizing Maps and Support Vector Machine.

    PubMed

    Bahrami, Sheyda; Shamsi, Mousa

    2017-01-01

    Functional magnetic resonance imaging (fMRI) is a popular method to probe the functional organization of the brain using hemodynamic responses. In this method, volume images of the entire brain are obtained with very good spatial resolution but low temporal resolution. However, the data always suffer from high dimensionality when faced with classification algorithms. In this work, we combine a support vector machine (SVM) with a self-organizing map (SOM) to obtain a feature-based classification: the SOM is used for feature extraction and for labeling the data sets, and a linear-kernel SVM is then used to detect the active areas. The SOM has two major advantages: (i) it reduces the dimension of the data sets, lowering computational complexity, and (ii) it is useful for identifying brain regions with small onset differences in hemodynamic responses. Our non-parametric model is compared with parametric and non-parametric methods. We use simulated fMRI data sets with block-design inputs and a contrast-to-noise ratio (CNR) of 0.6; the simulated data have a contrast of 1-4% in active areas. The accuracy of our proposed method is 93.63% and the error rate is 6.37%.
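
    A minimal sketch of the SOM-then-SVM pipeline on simulated block-design voxel time courses; the MiniSom package, the prototype-vector features, and the simulated data are assumptions and do not reproduce the authors' implementation.

```python
# Hedged sketch: map voxel time courses onto a small SOM, then separate
# active from inactive voxels with a linear SVM on the SOM prototypes.
import numpy as np
from minisom import MiniSom
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n_voxels, n_scans = 1000, 120
block = np.tile(np.repeat([0.0, 1.0], 10), n_scans // 20)   # simple on/off block design
active = rng.random(n_voxels) < 0.2
X = rng.normal(size=(n_voxels, n_scans))
X[active] += 0.6 * block                                    # low-CNR activation

som = MiniSom(6, 6, n_scans, sigma=1.0, learning_rate=0.5, random_seed=2)
som.train_random(X, 5000)
feats = np.array([som.get_weights()[som.winner(x)] for x in X])  # denoised BMU prototype

X_tr, X_te, y_tr, y_te = train_test_split(feats, active, test_size=0.3, random_state=2)
clf = SVC(kernel="linear").fit(X_tr, y_tr)
print(f"voxel classification accuracy: {clf.score(X_te, y_te):.2f}")
```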

  18. 48 CFR 52.219-20 - Notice of Emerging Small Business Set-Aside.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... Clauses 52.219-20 Notice of Emerging Small Business Set-Aside. As prescribed in 19.1008(b), insert the following provision: Notice of Emerging Small Business Set-Aside (JAN 1991) Offers or quotations under this acquisition are solicited from emerging small business concerns only. Offers that are not from an emerging...

  19. Feature Selection for Speech Emotion Recognition in Spanish and Basque: On the Use of Machine Learning to Improve Human-Computer Interaction

    PubMed Central

    Arruti, Andoni; Cearreta, Idoia; Álvarez, Aitor; Lazkano, Elena; Sierra, Basilio

    2014-01-01

    The study of emotions in human–computer interaction is a growing research area. This paper presents an attempt to select the most significant features for emotion recognition in spoken Basque and Spanish using different feature selection methods. The RekEmozio database was used as the experimental data set. Several machine learning paradigms were used for the emotion classification task. Experiments were executed in three phases, using different sets of features as classification variables in each phase. Moreover, feature subset selection was applied at each phase in order to search for the most relevant feature subset. The three-phase approach was selected to check the validity of the proposed approach. The results show that an instance-based learning algorithm using feature subset selection techniques based on evolutionary algorithms is the best machine learning paradigm for automatic emotion recognition across all feature sets, obtaining a mean emotion recognition rate of 80.05% in Basque and 74.82% in Spanish. In order to check the goodness of the proposed process, a greedy search approach (FSS-Forward) was also applied and a comparison between the two is provided. Based on the achieved results, a set of the most relevant non-speaker-dependent features is proposed for both languages and new perspectives are suggested. PMID:25279686
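
    A minimal sketch of wrapper-based feature subset selection with an instance-based learner, in the spirit of the FSS-Forward baseline mentioned above; the synthetic data and scikit-learn's greedy selector are assumptions, and the paper's evolutionary search is not reproduced.

```python
# Hedged sketch: greedy forward feature subset selection wrapped around a kNN learner.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=40, n_informative=8,
                           n_redundant=4, random_state=3)
knn = KNeighborsClassifier(n_neighbors=5)        # instance-based learner

selector = SequentialFeatureSelector(knn, n_features_to_select=8,
                                     direction="forward", cv=5).fit(X, y)
X_sel = selector.transform(X)

full = cross_val_score(knn, X, y, cv=5).mean()
sub = cross_val_score(knn, X_sel, y, cv=5).mean()
print(f"all 40 features: {full:.2f}   selected 8 features: {sub:.2f}")
```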

  20. Geophysics of Martian Periglacial Processes

    NASA Technical Reports Server (NTRS)

    Mellon, Michael T.

    2004-01-01

    The objectives of this task were: through the examination of small-scale geologic features potentially related to water and ice in the martian subsurface (specifically small-scale polygonal ground and young gully-like features), to determine the state, distribution, and recent history of subsurface water and ice on Mars; to refine existing models and develop new models of near-surface water and ice, and to develop new insights about the nature of water on Mars as manifested by these geologic features; and, through an improved understanding of potentially water-related geologic features, to use these features in addressing questions about where best to search for present-day water and what spacecraft may encounter that might facilitate or inhibit the search for water.

  1. smallWig: parallel compression of RNA-seq WIG files.

    PubMed

    Wang, Zhiying; Weissman, Tsachy; Milenkovic, Olgica

    2016-01-15

    We developed a new lossless compression method for WIG data, named smallWig, offering the best known compression rates for RNA-seq data and featuring random access functionalities that enable visualization, summary statistics analysis and fast queries from the compressed files. Our approach results in order-of-magnitude improvements compared with bigWig and achieves compression rates that are only a fraction of those produced by cWig. The key features of the smallWig algorithm are statistical data analysis and a combination of source coding methods that ensure high flexibility and make the algorithm suitable for different applications. Furthermore, for general-purpose file compression, the compression rate of smallWig approaches the empirical entropy of the tested WIG data. For compression with random query features, smallWig uses a simple block-based compression scheme that introduces only a minor overhead in the compression rate. For archival or storage space-sensitive applications, the method relies on context mixing techniques that lead to further improvements of the compression rate. Implementations of smallWig can be executed in parallel on different sets of chromosomes using multiple processors, thereby enabling desirable scaling for future transcriptome Big Data platforms. The development of next-generation sequencing technologies has led to a dramatic decrease in the cost of DNA/RNA sequencing and expression profiling. RNA-seq has emerged as an important and inexpensive technology that provides information about whole transcriptomes of various species and organisms, as well as different organs and cellular communities. The vast volume of data generated by RNA-seq experiments has significantly increased data storage costs and communication bandwidth requirements. Current compression tools for RNA-seq data such as bigWig and cWig either use general-purpose compressors (gzip) or suboptimal compression schemes that leave significant room for improvement. To substantiate this claim, we performed a statistical analysis of expression data in different transform domains and developed accompanying entropy coding methods that bridge the gap between theoretical and practical WIG file compression rates. We tested different variants of the smallWig compression algorithm on a number of integer- and real- (floating point) valued RNA-seq WIG files generated by the ENCODE project. The results reveal that, on average, smallWig offers 18-fold compression rate improvements, up to 2.5-fold compression time improvements, and 1.5-fold decompression time improvements when compared with bigWig. On the tested files, the memory usage of the algorithm never exceeded 90 KB. When more elaborate context mixing compressors were used within smallWig, the obtained compression rates were as much as 23 times better than those of bigWig. For smallWig used in the random query mode, which also supports retrieval of the summary statistics, an overhead in the compression rate of roughly 3-17% was introduced, depending on the chosen system parameters. An increase in encoding and decoding time of 30% and 55%, respectively, represents an additional performance loss caused by enabling random data access. We also implemented smallWig using multi-processor programming. This parallelization feature decreases the encoding delay 2-3.4 times compared with that of a single-processor implementation, with the number of processors used ranging from 2 to 8; in the same parameter regime, the decoding delay decreased 2-5.2 times.
The smallWig software can be downloaded from: http://stanford.edu/~zhiyingw/smallWig/smallwig.html, http://publish.illinois.edu/milenkovic/, http://web.stanford.edu/~tsachy/. zhiyingw@stanford.edu Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
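
    A minimal sketch of the block-based compression trade-off described above: compressing a coverage-like track block by block costs a little in compression rate but allows a query to decompress only one block. Plain zlib on a synthetic track stands in here; this is not smallWig's transform or entropy coder.

```python
# Hedged sketch: per-block compression with an index for random access.
import zlib
import numpy as np

def compress_blocks(values, block_size=10_000):
    """Compress an int32 track block by block and keep (start, size) per block."""
    blocks, index = [], []
    for start in range(0, len(values), block_size):
        comp = zlib.compress(values[start:start + block_size].astype(np.int32).tobytes(), 6)
        index.append((start, len(comp)))
        blocks.append(comp)
    return blocks, index

def query(blocks, index, block_size, position):
    """Decompress only the block containing `position`."""
    b = position // block_size
    vals = np.frombuffer(zlib.decompress(blocks[b]), dtype=np.int32)
    return vals[position - index[b][0]]

rng = np.random.default_rng(4)
track = rng.poisson(3, size=1_000_000).astype(np.int32)    # stand-in for a WIG coverage track
blocks, index = compress_blocks(track)
ratio = track.nbytes / sum(len(b) for b in blocks)
print(f"compression ratio: {ratio:.1f}x, value at 123456: {query(blocks, index, 10_000, 123456)}")
```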

  2. Development of a New Optical Measuring Set-Up

    NASA Astrophysics Data System (ADS)

    Miroshnichenko, I. P.; Parinov, I. A.

    2018-06-01

    The paper describes a newly developed optical measuring set-up for the contactless recording and processing of measurement results for small spatial (linear and angular) displacements of control surfaces, based on laser technologies and optical interference methods. The proposed set-up is designed to solve the measurement tasks that arise in the study of the physical and mechanical properties of new materials and in diagnosing the state of structural materials by active acoustic methods of nondestructive testing. The structure of the set-up and its constituent parts are described, and the features of its construction and functioning during measurements are discussed. New technical solutions for the implementation of the components of the set-up under consideration are obtained. The purpose and description of the original specialized software are presented; the software performs a priori analysis of measurement results, analysis while measurements are being carried out, and a posteriori analysis of measurement results. It also determines the influence of internal and external disturbances on the measurement results and corrects the measurement results directly during their acquisition. The technical solutions used in the set-up are protected by patents of the Russian Federation for inventions, and the software is protected by certificates of state registration of computer programs. The proposed set-up is intended for use in instrumentation, mechanical engineering, shipbuilding, aviation, the energy sector, etc.

  3. Support Vector Data Description Model to Map Specific Land Cover with Optimal Parameters Determined from a Window-Based Validation Set.

    PubMed

    Zhang, Jinshui; Yuan, Zhoumiqi; Shuai, Guanyuan; Pan, Yaozhong; Zhu, Xiufang

    2017-04-26

    This paper developed an approach, the window-based validation set for support vector data description (WVS-SVDD), to determine optimal parameters for the support vector data description (SVDD) model when mapping a specific land cover, by integrating training and window-based validation sets. Compared to the conventional approach, where the validation set includes target and outlier pixels selected visually and randomly, the validation set derived from WVS-SVDD constructs a tighter hypersphere because of the compact constraint imposed by outlier pixels located adjacent to the target class in the spectral feature space. The overall accuracies achieved for wheat and bare land were as high as 89.25% and 83.65%, respectively. However, the target class was underestimated because the validation set covered only a small fraction of the heterogeneous spectra of the target class. Different window sizes were then tested to acquire more wheat pixels for the validation set. The results showed that classification accuracy increased with increasing window size and that the overall accuracies were higher than 88% at all window sizes. Moreover, WVS-SVDD showed much less sensitivity to untrained classes than the multi-class support vector machine (SVM) method. Therefore, the developed method showed its merit in selecting the optimal parameters, the tradeoff coefficient (C) and kernel width (s), for mapping homogeneous specific land cover.
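
    A minimal sketch of the parameter-tuning idea: candidate trade-off and kernel-width settings for a one-class description of the target class are scored on a validation set that mixes target pixels with spectrally adjacent outlier pixels. scikit-learn's OneClassSVM is used as a stand-in for SVDD, and the spectra are synthetic, so none of this is the authors' implementation.

```python
# Hedged sketch: select (nu, gamma) for a one-class model using a mixed validation set.
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.metrics import f1_score

rng = np.random.default_rng(5)
target_train = rng.normal(0.0, 1.0, size=(200, 6))          # target-class spectra (training)
val_target = rng.normal(0.0, 1.0, size=(100, 6))
val_outlier = rng.normal(1.5, 1.0, size=(100, 6))            # outliers neighbouring the target
X_val = np.vstack([val_target, val_outlier])
y_val = np.r_[np.ones(100, dtype=int), -np.ones(100, dtype=int)]

best = None
for nu in (0.05, 0.1, 0.2):                  # plays the role of the trade-off coefficient C
    for gamma in (0.05, 0.1, 0.5, 1.0):      # kernel width
        model = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(target_train)
        score = f1_score(y_val, model.predict(X_val), pos_label=1)
        if best is None or score > best[0]:
            best = (score, nu, gamma)
print(f"best validation F1={best[0]:.2f} at nu={best[1]}, gamma={best[2]}")
```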

  4. 48 CFR 1452.280-1 - Notice of Indian small business economic enterprise set-aside.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... business economic enterprise set-aside. 1452.280-1 Section 1452.280-1 Federal Acquisition Regulations... of Provisions and Clauses 1452.280-1 Notice of Indian small business economic enterprise set-aside... potential offerors. Notice of Indian Small Business Economic Enterprise Set-aside (JUL 2013) Under the Buy...

  5. 48 CFR 1452.280-1 - Notice of Indian small business economic enterprise set-aside.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... business economic enterprise set-aside. 1452.280-1 Section 1452.280-1 Federal Acquisition Regulations... of Provisions and Clauses 1452.280-1 Notice of Indian small business economic enterprise set-aside... potential offerors. Notice of Indian Small Business Economic Enterprise Set-aside (JUL 2013) Under the Buy...

  6. Model-Based Comparison of Deep Brain Stimulation Array Functionality with Varying Number of Radial Electrodes and Machine Learning Feature Sets.

    PubMed

    Teplitzky, Benjamin A; Zitella, Laura M; Xiao, YiZi; Johnson, Matthew D

    2016-01-01

    Deep brain stimulation (DBS) leads with radially distributed electrodes have potential to improve clinical outcomes through more selective targeting of pathways and networks within the brain. However, increasing the number of electrodes on clinical DBS leads by replacing conventional cylindrical shell electrodes with radially distributed electrodes raises practical design and stimulation programming challenges. We used computational modeling to investigate: (1) how the number of radial electrodes impacts the ability to steer, shift, and sculpt a region of neural activation (RoA), and (2) which RoA features are best used in combination with machine learning classifiers to predict programming settings to target a particular area near the lead. Stimulation configurations were modeled using 27 lead designs with one to nine radially distributed electrodes. The computational modeling framework consisted of a three-dimensional finite element tissue conductance model in combination with a multi-compartment biophysical axon model. For each lead design, two-dimensional threshold-dependent RoAs were calculated from the computational modeling results. The models showed that more radial electrodes enabled finer resolution RoA steering; however, stimulation amplitude, and therefore spatial extent of the RoA, was limited by charge injection and charge storage capacity constraints due to the small electrode surface area for leads with more than four radially distributed electrodes. RoA shifting resolution was improved by the addition of radial electrodes when using uniform multi-cathode stimulation, but non-uniform multi-cathode stimulation produced equivalent or better resolution shifting without increasing the number of radial electrodes. Robust machine learning classification of 15 monopolar stimulation configurations was achieved using as few as three geometric features describing a RoA. The results of this study indicate that, for a clinical-scale DBS lead, more than four radial electrodes only minimally improved the ability to steer, shift, and sculpt axonal activation around a DBS lead, and that a simple feature set consisting of the RoA center of mass and orientation enabled robust machine learning classification. These results provide important design constraints for future development of high-density DBS arrays.
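
    A minimal sketch of the simple geometric feature set highlighted above: each region of activation (RoA) is summarized by its center of mass and principal orientation, and those few numbers feed a standard classifier. The synthetic elliptical RoAs and the kNN classifier are assumptions, not the study's modeling framework.

```python
# Hedged sketch: RoA center of mass + orientation as features for classification.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def roa_features(mask):
    """Center of mass (x, y) and orientation (radians) of a 2-D binary RoA."""
    ys, xs = np.nonzero(mask)
    cx, cy = xs.mean(), ys.mean()
    cov = np.cov(np.vstack([xs - cx, ys - cy]))
    vals, vecs = np.linalg.eigh(cov)
    major = vecs[:, np.argmax(vals)]                 # principal axis of the RoA
    return np.array([cx, cy, np.arctan2(major[1], major[0])])

def synthetic_roa(angle_deg, rng, size=64):
    """Elliptical RoA offset from the lead in the direction angle_deg (synthetic)."""
    yy, xx = np.mgrid[:size, :size]
    a = np.deg2rad(angle_deg + rng.normal(0, 5))
    cx, cy = size / 2 + 12 * np.cos(a), size / 2 + 12 * np.sin(a)
    return ((xx - cx) ** 2 / 80 + (yy - cy) ** 2 / 20) < 1

rng = np.random.default_rng(6)
angles = [0, 90, 180, 270]                           # four notional contact directions
X = np.array([roa_features(synthetic_roa(a, rng)) for a in angles for _ in range(30)])
y = np.repeat(range(len(angles)), 30)
print("cross-validated accuracy:", cross_val_score(KNeighborsClassifier(3), X, y, cv=5).mean())
```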

  7. Model-Based Comparison of Deep Brain Stimulation Array Functionality with Varying Number of Radial Electrodes and Machine Learning Feature Sets

    PubMed Central

    Teplitzky, Benjamin A.; Zitella, Laura M.; Xiao, YiZi; Johnson, Matthew D.

    2016-01-01

    Deep brain stimulation (DBS) leads with radially distributed electrodes have potential to improve clinical outcomes through more selective targeting of pathways and networks within the brain. However, increasing the number of electrodes on clinical DBS leads by replacing conventional cylindrical shell electrodes with radially distributed electrodes raises practical design and stimulation programming challenges. We used computational modeling to investigate: (1) how the number of radial electrodes impacts the ability to steer, shift, and sculpt a region of neural activation (RoA), and (2) which RoA features are best used in combination with machine learning classifiers to predict programming settings to target a particular area near the lead. Stimulation configurations were modeled using 27 lead designs with one to nine radially distributed electrodes. The computational modeling framework consisted of a three-dimensional finite element tissue conductance model in combination with a multi-compartment biophysical axon model. For each lead design, two-dimensional threshold-dependent RoAs were calculated from the computational modeling results. The models showed that more radial electrodes enabled finer resolution RoA steering; however, stimulation amplitude, and therefore spatial extent of the RoA, was limited by charge injection and charge storage capacity constraints due to the small electrode surface area for leads with more than four radially distributed electrodes. RoA shifting resolution was improved by the addition of radial electrodes when using uniform multi-cathode stimulation, but non-uniform multi-cathode stimulation produced equivalent or better resolution shifting without increasing the number of radial electrodes. Robust machine learning classification of 15 monopolar stimulation configurations was achieved using as few as three geometric features describing a RoA. The results of this study indicate that, for a clinical-scale DBS lead, more than four radial electrodes only minimally improved the ability to steer, shift, and sculpt axonal activation around a DBS lead, and that a simple feature set consisting of the RoA center of mass and orientation enabled robust machine learning classification. These results provide important design constraints for future development of high-density DBS arrays. PMID:27375470

  8. Quantitative Wood Anatomy-Practical Guidelines.

    PubMed

    von Arx, Georg; Crivellaro, Alan; Prendin, Angela L; Čufar, Katarina; Carrer, Marco

    2016-01-01

    Quantitative wood anatomy analyzes the variability of xylem anatomical features in trees, shrubs, and herbaceous species to address research questions related to plant functioning, growth, and environment. Among the more frequently considered anatomical features are the lumen dimensions and wall thickness of conducting cells, fibers, and several ray properties. The structural properties of each xylem anatomical feature are mostly fixed once it is formed, and define to a large extent its functionality, including the transport and storage of water, nutrients, sugars, and hormones, and the provision of mechanical support. The anatomical features can often be localized within an annual growth ring, which makes it possible to establish intra-annual, past and present structure-function relationships and their sensitivity to environmental variability. However, there are many methodological challenges to handle when aiming to produce (large) data sets of xylem anatomical data. Here we describe the different steps from wood sample collection to xylem anatomical data, provide guidance and identify pitfalls, and present different image-analysis tools for the quantification of anatomical features, in particular conducting cells. We show that each data production step, from sample collection in the field and microslide preparation in the lab to image capture through an optical microscope and image analysis with specific tools, can readily introduce measurement errors of 5-30% or more, with the magnitude usually increasing as the anatomical features become smaller. Such measurement errors, if not avoided or corrected, may make it impossible to extract meaningful xylem anatomical data in light of the rather small range of variability in many anatomical features as observed, for example, within time series of individual plants. Following a strict protocol and quality control as proposed in this paper is thus mandatory for using quantitative data on xylem anatomical features as a powerful source for many research topics.
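
    A minimal sketch of the image-analysis step only: segmenting conducting-cell lumina in a micrograph and measuring their areas. The synthetic image and the simple Otsu-threshold plus connected-component pipeline (scikit-image) stand in for the dedicated tools discussed above; they are not the authors' protocol.

```python
# Hedged sketch: segment dark lumina on a bright-wall background and measure areas.
import numpy as np
from skimage.filters import threshold_otsu
from skimage.measure import label, regionprops

rng = np.random.default_rng(7)
img = rng.normal(0.7, 0.05, size=(256, 256))         # bright cell walls (synthetic slide)
yy, xx = np.mgrid[:256, :256]
for cx, cy, r in rng.uniform(20, 236, size=(40, 3)) * [1, 1, 0.05]:
    img[(xx - cx) ** 2 + (yy - cy) ** 2 < (3 + r) ** 2] = rng.normal(0.2, 0.05)  # dark lumina

lumina = img < threshold_otsu(img)                   # lumina are darker than the walls
areas = [p.area for p in regionprops(label(lumina))]
print(f"{len(areas)} lumina detected, mean area {np.mean(areas):.1f} px, min {min(areas)} px")
```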

  9. Quantitative Wood Anatomy—Practical Guidelines

    PubMed Central

    von Arx, Georg; Crivellaro, Alan; Prendin, Angela L.; Čufar, Katarina; Carrer, Marco

    2016-01-01

    Quantitative wood anatomy analyzes the variability of xylem anatomical features in trees, shrubs, and herbaceous species to address research questions related to plant functioning, growth, and environment. Among the more frequently considered anatomical features are the lumen dimensions and wall thickness of conducting cells, fibers, and several ray properties. The structural properties of each xylem anatomical feature are mostly fixed once it is formed, and define to a large extent its functionality, including the transport and storage of water, nutrients, sugars, and hormones, and the provision of mechanical support. The anatomical features can often be localized within an annual growth ring, which makes it possible to establish intra-annual, past and present structure-function relationships and their sensitivity to environmental variability. However, there are many methodological challenges to handle when aiming to produce (large) data sets of xylem anatomical data. Here we describe the different steps from wood sample collection to xylem anatomical data, provide guidance and identify pitfalls, and present different image-analysis tools for the quantification of anatomical features, in particular conducting cells. We show that each data production step, from sample collection in the field and microslide preparation in the lab to image capture through an optical microscope and image analysis with specific tools, can readily introduce measurement errors of 5-30% or more, with the magnitude usually increasing as the anatomical features become smaller. Such measurement errors, if not avoided or corrected, may make it impossible to extract meaningful xylem anatomical data in light of the rather small range of variability in many anatomical features as observed, for example, within time series of individual plants. Following a strict protocol and quality control as proposed in this paper is thus mandatory for using quantitative data on xylem anatomical features as a powerful source for many research topics. PMID:27375641

  10. The interaction of feature and space based orienting within the attention set.

    PubMed

    Lim, Ahnate; Sinnett, Scott

    2014-01-01

    The processing of sensory information relies on interacting mechanisms of sustained attention and attentional capture, both of which operate in space and on object features. While evidence indicates that exogenous attentional capture, a mechanism previously understood to be automatic, can be eliminated while concurrently performing a demanding task, we reframe this phenomenon within the theoretical framework of the "attention set" (Most et al., 2005). Consequently, the specific prediction that cuing effects should reappear when feature dimensions of the cue overlap with those in the attention set (i.e., elements of the demanding task) was empirically tested and confirmed using a dual-task paradigm involving both sustained attention and attentional capture, adapted from Santangelo et al. (2007). Participants were required to either detect a centrally presented target in a stream of distractors (the primary task), or respond to a spatially cued target (the secondary task). Importantly, the spatial cue could either share features with the target in the centrally presented primary task, or not share any features. Overall, the findings supported the attention set hypothesis, showing that a spatial cuing effect was only observed when the peripheral cue shared a feature with objects that were already in the attention set (i.e., the primary task). However, this finding was accompanied by differential attentional orienting dependent on the different types of objects within the attention set, with feature-based orienting occurring for target-related objects, and additional spatial-based orienting for distractor-related objects.

  11. A keyword spotting model using perceptually significant energy features

    NASA Astrophysics Data System (ADS)

    Umakanthan, Padmalochini

    The task of a keyword recognition system is to detect the presence of certain words in a conversation based on the linguistic information present in human speech. Such keyword spotting systems have applications in homeland security, telephone surveillance and human-computer interfacing. The general procedure of a keyword spotting system involves feature generation and matching. In this work, a new set of features based on the psycho-acoustic masking nature of human speech is proposed. After developing these features, a time-aligned pattern matching process was implemented to locate the keywords within a set of unknown words. A word boundary detection technique based on frame classification using the nonlinear characteristics of speech is also addressed in this work. Validation of this keyword spotting model was done using widely used cepstral features. The experimental results indicate the viability of using these perceptually significant features as an augmented feature set in keyword spotting.
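
    A minimal sketch of the time-aligned matching step: a keyword template is slid along a longer utterance and aligned with dynamic time warping (DTW) over frame features. Generic synthetic feature frames stand in for the perceptually significant energy features proposed above.

```python
# Hedged sketch: DTW-based keyword matching over frame-level feature sequences.
import numpy as np

def dtw_cost(a, b):
    """Normalized DTW alignment cost between two feature sequences (frames x dims)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)

rng = np.random.default_rng(8)
keyword = rng.normal(size=(20, 12))                                     # template: 20 frames x 12 dims
utterance = np.vstack([rng.normal(size=(30, 12)),                       # filler speech
                       keyword + 0.2 * rng.normal(size=keyword.shape),  # keyword occurrence
                       rng.normal(size=(25, 12))])                      # filler speech

# Slide a window across the utterance and flag the window with the lowest alignment cost.
costs = [dtw_cost(keyword, utterance[s:s + 20]) for s in range(len(utterance) - 20)]
best = int(np.argmin(costs))
print(f"best match starts at frame {best} (true start is 30), cost {costs[best]:.2f}")
```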

  12. A random forest model based classification scheme for neonatal amplitude-integrated EEG.

    PubMed

    Chen, Weiting; Wang, Yu; Cao, Guitao; Chen, Guoqiang; Gu, Qiufang

    2014-01-01

    Modern medical advances have greatly increased the survival rate of infants, yet these infants remain at higher risk for neurological problems later in life. For infants with encephalopathy or seizures, identifying the extent of brain injury is clinically challenging. Continuous amplitude-integrated electroencephalography (aEEG) monitoring offers a possibility to directly monitor the brain functional state of newborns over hours, and has seen increasing application in neonatal intensive care units (NICUs). This paper presents a novel combined feature set for aEEG and applies the random forest (RF) method to classify aEEG tracings. To that end, a series of experiments were conducted on 282 aEEG tracing cases (209 normal and 73 abnormal). Basic features, statistical features and segmentation features were extracted from both the tracing as a whole and the segmented recordings, and were then combined into a single feature set. All the features were subsequently sent to a classifier. The significance of the features, the data segmentation, the optimization of RF parameters, and the problem of imbalanced datasets were examined through experiments. Experiments were also done to evaluate the performance of RF on aEEG signal classification, compared with several other widely used classifiers including SVM-Linear, SVM-RBF, ANN, Decision Tree (DT), Logistic Regression (LR), ML, and LDA. The combined feature set characterizes aEEG signals better than the basic, statistical and segmentation features taken separately. With the combined feature set, the proposed RF-based aEEG classification system achieved a correct rate of 92.52% and a high F1-score of 95.26%. Among all of the seven classifiers examined in our work, the RF method obtained the highest correct rate, sensitivity, specificity, and F1-score, which means that RF outperforms all of the other classifiers considered here. The results show that the proposed RF-based aEEG classification system with the combined feature set is efficient and helpful for better detecting brain disorders in newborns.
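
    A minimal sketch of the combined-feature idea: whole-tracing statistics and per-segment statistics are concatenated and fed to a random forest. The synthetic aEEG-like traces and the particular features are illustrative assumptions, not the feature set used in the study.

```python
# Hedged sketch: combined (whole-trace + per-segment) features with a random forest.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def combined_features(trace, n_segments=6):
    whole = [trace.mean(), trace.std(), trace.min(), trace.max()]     # basic/statistical
    per_seg = [f(s) for s in np.array_split(trace, n_segments)        # segmentation features
               for f in (np.mean, np.ptp)]
    return np.array(whole + per_seg)

rng = np.random.default_rng(9)
normal = 10 + 2 * rng.random((200, 600)) + np.linspace(0, 1, 600)     # narrow-band tracings
abnormal = 5 + 8 * rng.random((70, 600))                              # wide band, low margin
X = np.array([combined_features(t) for t in np.vstack([normal, abnormal])])
y = np.r_[np.zeros(200), np.ones(70)]                                 # imbalanced, as above

rf = RandomForestClassifier(n_estimators=200, class_weight="balanced", random_state=9)
print("cross-validated F1:", cross_val_score(rf, X, y, cv=5, scoring="f1").mean().round(3))
```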

  13. Downscaling Soil Moisture in the Southern Great Plains Through a Calibrated Multifractal Model for Land Surface Modeling Applications

    NASA Technical Reports Server (NTRS)

    Mascaro, Giuseppe; Vivoni, Enrique R.; Deidda, Roberto

    2010-01-01

    Accounting for the small-scale spatial heterogeneity of soil moisture (theta) is required to enhance the predictive skill of land surface models. In this paper, we present the results of the development, calibration, and performance evaluation of a downscaling model based on multifractal theory, using aircraft-based (800 m) theta estimates collected during the Southern Great Plains experiment in 1997 (SGP97). We first demonstrate the presence of scale invariance and multifractality in theta fields of nine square domains of size 25.6 km x 25.6 km, approximately a satellite footprint. Then, we estimate the downscaling model parameters and evaluate the model performance using a set of different calibration approaches. Results reveal that small-scale theta distributions are adequately reproduced across the entire region when the coarse predictors include a dynamic component (i.e., the spatial mean soil moisture) and a stationary contribution accounting for static features (i.e., topography, soil texture, vegetation). For wet conditions, we found similar multifractal properties of soil moisture across all domains, which we ascribe to the signature of rainfall spatial variability. For drier states, the theta fields in the northern domains are more intermittent than in the southern domains, likely because of differences in the distribution of vegetation coverage. Through our analyses, we propose a regional downscaling relation for coarse, satellite-based soil moisture estimates, based on ancillary information (static and dynamic landscape features), which can be used in the study area to characterize the statistical properties of the small-scale theta distribution required by land surface models and data assimilation systems.
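
    A minimal sketch of a multiplicative-cascade downscaler: one coarse-pixel soil moisture value is disaggregated onto a finer grid by repeatedly splitting each pixel and multiplying by random mean-one weights, whose spread controls the small-scale intermittency. The log-normal generator and its parameter are generic assumptions, not the calibrated SGP97 model described above.

```python
# Hedged sketch: log-normal multiplicative cascade that preserves the coarse-pixel mean.
import numpy as np

def cascade_downscale(coarse_value, levels=5, sigma=0.3, rng=None):
    """Disaggregate one coarse value onto a (2**levels x 2**levels) grid."""
    if rng is None:
        rng = np.random.default_rng()
    field = np.array([[coarse_value]], dtype=float)
    for _ in range(levels):
        field = np.kron(field, np.ones((2, 2)))                           # split each pixel in four
        field *= np.exp(rng.normal(-0.5 * sigma**2, sigma, field.shape))  # mean-one log-normal weights
    return field * coarse_value / field.mean()                            # re-impose the coarse mean

field = cascade_downscale(0.25, levels=5, sigma=0.4, rng=np.random.default_rng(10))
print(field.shape, f"mean={field.mean():.3f}", f"coefficient of variation={field.std() / field.mean():.2f}")
```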

  14. Feature Selection for Classification of Polar Regions Using a Fuzzy Expert System

    NASA Technical Reports Server (NTRS)

    Penaloza, Mauel A.; Welch, Ronald M.

    1996-01-01

    Labeling, feature selection, and the choice of classifier are critical elements for the classification of scenes and for image understanding. This study examines several methods for feature selection in polar regions, including the use of a fuzzy logic-based expert system for further refinement of a set of selected features. Six Advanced Very High Resolution Radiometer (AVHRR) Local Area Coverage (LAC) arctic scenes are classified into nine classes: water, snow/ice, ice cloud, land, thin stratus, stratus over water, cumulus over water, textured snow over water, and snow-covered mountains. Sixty-seven spectral and textural features are computed and analyzed by the feature selection algorithms. The divergence, histogram analysis, and discriminant analysis approaches are intercompared for their effectiveness in feature selection. The fuzzy expert system method is used not only to determine the effectiveness of each approach in classifying polar scenes, but also to further reduce the features to a more optimal set. For each selection method, features are ranked from best to worst, and the best half of the features are selected. Rules using these selected features are then defined. Running the fuzzy expert system with these rules shows that the divergence method produces the best set of features: it not only yields the highest classification accuracy but also has the lowest computational requirements. A reduction of the feature set produced by the divergence method using the fuzzy expert system results in an overall classification accuracy of over 95%. However, this increase in accuracy comes at a high computational cost.
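
    A minimal sketch of divergence-based feature ranking: each feature is scored by a symmetric KL divergence between per-class Gaussian fits, the features are ranked, and the best half is kept. The two-class synthetic data and the Gaussian assumption are illustrative (the study uses nine classes), and the fuzzy-rule refinement step is not shown.

```python
# Hedged sketch: rank features by a Gaussian symmetric KL divergence and keep the best half.
import numpy as np
from sklearn.datasets import make_classification

def gaussian_divergence(x0, x1, eps=1e-9):
    """Symmetric KL divergence between 1-D Gaussians fitted to samples x0 and x1."""
    m0, v0 = x0.mean(), x0.var() + eps
    m1, v1 = x1.mean(), x1.var() + eps
    return 0.5 * ((v0 / v1 + v1 / v0 - 2) + (m0 - m1) ** 2 * (1 / v0 + 1 / v1))

X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           n_redundant=2, random_state=11)
scores = np.array([gaussian_divergence(X[y == 0, j], X[y == 1, j])
                   for j in range(X.shape[1])])
ranking = np.argsort(scores)[::-1]
best_half = ranking[:X.shape[1] // 2]                # keep the best half, as above
print("top features:", ranking[:5], "kept:", sorted(best_half.tolist()))
```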

  15. Three studies of retail gasoline pricing dynamics

    NASA Astrophysics Data System (ADS)

    Atkinson, Benjamin James

    In many Canadian cities, retail gasoline prices appear to cycle, rising by large amounts in one or two days followed by several days of small consecutive price decreases. While many empirical studies examine such markets, certain questions cannot be properly answered without high frequency, station-specific price data for an entire market. Thus, the first paper in this thesis uses bi-hourly price data collected for 27 stations in Guelph, Ontario, eight times per day for 103 days to examine several basic predictions of the Edgeworth cycle theory. The results are largely consistent with this theory. However, most independent firms do not tend to undercut their rivals' prices, contrary to previous findings. Furthermore, the timing, sizes and leaders of price increases appear to be very predictable, and a specific pattern of price movements has been detected on days when prices increase. These findings suggest that leading a price increase might not be as risky as one may expect. The second paper uses these same data to examine the implications of an informal theory of competitive gasoline pricing, as advanced by industry and government. Consistent with this theory, stations do tend to set prices to match (or set a small positive or negative differential with) a small number of other stations, which are not necessarily the closest stations. Also, while retailers frequently respond to price changes within two hours, many take considerably longer to respond than is predicted by the theory. Finally, while price decreases do ripple across the market like falling dominos, increases appear to propagate based more on geographic location and source of price control than proximity to the leaders. The third paper uses both these data and Guelph price data collected every 12 hours during the same 103 days from OntarioGasPrices.com to examine the sample selection biases that might exist in such Internet price data, as well as their implications for empirical research. It is found that the Internet data tend to accurately identify features of cycles that can be distinguished using company-operated, major brand station prices, while features that require individual independent station data or very high frequency data might not be well-identified.

  16. 48 CFR 5119.1070-2 - Emerging small business set-aside.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 48 Federal Acquisition Regulations System 7 2010-10-01 2010-10-01 false Emerging small business... ARMY ACQUISITION REGULATIONS SMALL BUSINESS AND SMALL DISADVANTAGED BUSINESS CONCERNS Small Business Competitiveness Demonstration Program 5119.1070-2 Emerging small business set-aside. (a)(S-90) Solicitations for...

  17. 48 CFR 5119.1070-2 - Emerging small business set-aside.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 48 Federal Acquisition Regulations System 7 2014-10-01 2014-10-01 false Emerging small business... ARMY ACQUISITION REGULATIONS SMALL BUSINESS AND SMALL DISADVANTAGED BUSINESS CONCERNS Small Business Competitiveness Demonstration Program 5119.1070-2 Emerging small business set-aside. (a)(S-90) Solicitations for...

  18. 48 CFR 5119.1070-2 - Emerging small business set-aside.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 48 Federal Acquisition Regulations System 7 2011-10-01 2011-10-01 false Emerging small business... ARMY ACQUISITION REGULATIONS SMALL BUSINESS AND SMALL DISADVANTAGED BUSINESS CONCERNS Small Business Competitiveness Demonstration Program 5119.1070-2 Emerging small business set-aside. (a)(S-90) Solicitations for...

  19. 48 CFR 5119.1070-2 - Emerging small business set-aside.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 48 Federal Acquisition Regulations System 7 2012-10-01 2012-10-01 false Emerging small business... ARMY ACQUISITION REGULATIONS SMALL BUSINESS AND SMALL DISADVANTAGED BUSINESS CONCERNS Small Business Competitiveness Demonstration Program 5119.1070-2 Emerging small business set-aside. (a)(S-90) Solicitations for...

  20. 48 CFR 852.219-10 - VA Notice of total service-disabled veteran-owned small business set-aside.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ...-disabled veteran-owned small business set-aside. 852.219-10 Section 852.219-10 Federal Acquisition... CLAUSES Texts of Provisions and Clauses 852.219-10 VA Notice of total service-disabled veteran-owned small...-Disabled Veteran-Owned Small Business Set-Aside (DEC 2009) (a) Definition. For the Department of Veterans...
