Sample records for five-fold cross-validation technique

  1. K-Fold Crossvalidation in Canonical Analysis.

    ERIC Educational Resources Information Center

    Liang, Kun-Hsia; And Others

    1995-01-01

    A computer-assisted, K-fold cross-validation technique is discussed in the framework of canonical correlation analysis of randomly generated data sets. Analysis results suggest that this technique can effectively reduce the contamination of canonical variates and canonical correlations by sample-specific variance components. (Author/SLD)
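
    A minimal Python sketch of this kind of K-fold check (scikit-learn's CCA and KFold; the synthetic two-set data and the single-component model are illustrative assumptions, not the study's randomly generated sets). Comparing held-out canonical correlations with training-set values exposes the sample-specific inflation the record describes:

      import numpy as np
      from sklearn.cross_decomposition import CCA
      from sklearn.model_selection import KFold

      rng = np.random.RandomState(0)
      X = rng.normal(size=(200, 6))                                       # first variable set
      Y = X[:, :3] @ rng.normal(size=(3, 4)) + rng.normal(size=(200, 4))  # second variable set

      kf = KFold(n_splits=5, shuffle=True, random_state=0)
      oof_corrs = []
      for train_idx, test_idx in kf.split(X):
          cca = CCA(n_components=1).fit(X[train_idx], Y[train_idx])
          U, V = cca.transform(X[test_idx], Y[test_idx])
          # correlation of the first canonical variate pair on held-out data
          oof_corrs.append(np.corrcoef(U[:, 0], V[:, 0])[0, 1])

      print(f"mean out-of-fold canonical correlation: {np.mean(oof_corrs):.3f}")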

  2. Prostate tissue characterization/classification in 144 patient population using wavelet and higher order spectra features from transrectal ultrasound images.

    PubMed

    Pareek, Gyan; Acharya, U Rajendra; Sree, S Vinitha; Swapna, G; Yantri, Ratna; Martis, Roshan Joy; Saba, Luca; Krishnamurthi, Ganapathy; Mallarini, Giorgio; El-Baz, Ayman; Al Ekish, Shadi; Beland, Michael; Suri, Jasjit S

    2013-12-01

    In this work, we have proposed an on-line computer-aided diagnostic system called "UroImage" that classifies a Transrectal Ultrasound (TRUS) image as cancerous or non-cancerous with the help of non-linear Higher Order Spectra (HOS) features and Discrete Wavelet Transform (DWT) coefficients. In the on-line system, five significant features (one DWT-based and four HOS-based) are extracted from the test image. These on-line features are transformed by the classifier parameters obtained using the training dataset to determine the class. We trained and tested six classifiers. The dataset used for evaluation had 144 TRUS images, which were split into training and testing sets. Three-fold and ten-fold cross-validation protocols were adopted for training and estimating the accuracy of the classifiers. The ground truth used for training was obtained from the biopsy results. Among the six classifiers, using the 10-fold cross-validation technique, the Support Vector Machine and Fuzzy Sugeno classifiers presented the best classification accuracy of 97.9%, with equally high values for sensitivity, specificity and positive predictive value. Our proposed automated system, which achieved more than 95% on all the performance measures, can be an adjunct tool to provide an initial diagnosis for the identification of patients with prostate cancer. The technique, however, is limited by the limitations of 2D ultrasound-guided biopsy, and we intend to improve our technique by using 3D TRUS images in the future.

  3. Developing Enhanced Blood–Brain Barrier Permeability Models: Integrating External Bio-Assay Data in QSAR Modeling

    PubMed Central

    Wang, Wenyi; Kim, Marlene T.; Sedykh, Alexander

    2015-01-01

    Purpose: Experimental Blood–Brain Barrier (BBB) permeability models for drug molecules are expensive and time-consuming. As alternative methods, several traditional Quantitative Structure-Activity Relationship (QSAR) models have been developed previously. In this study, we aimed to improve the predictivity of traditional QSAR BBB permeability models by employing relevant public bio-assay data in the modeling process. Methods: We compiled a BBB permeability database consisting of 439 unique compounds from various resources. The database was split into a modeling set of 341 compounds and a validation set of 98 compounds. A consensus QSAR modeling workflow was employed on the modeling set to develop various QSAR models. A five-fold cross-validation approach was used to validate the developed models, and the resulting models were used to predict the external validation set compounds. Furthermore, we used previously published membrane transporter models to generate relevant transporter profiles for target compounds. The transporter profiles were used as additional biological descriptors to develop hybrid QSAR BBB models. Results: The consensus QSAR models have R2 = 0.638 for five-fold cross-validation and R2 = 0.504 for external validation. The consensus model developed by pooling chemical and transporter descriptors showed better predictivity (R2 = 0.646 for five-fold cross-validation and R2 = 0.526 for external validation). Moreover, several external bio-assays that correlate with BBB permeability were identified using our automatic profiling tool. Conclusions: The BBB permeability models developed in this study can be useful for early evaluation of new compounds (e.g., new drug candidates). The combination of chemical and biological descriptors shows a promising direction to improve the current traditional QSAR models. PMID:25862462
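
    A minimal Python sketch of the five-fold cross-validated R2 protocol used above (scikit-learn; the random-forest learner and synthetic descriptors are placeholder assumptions — the study used its own consensus QSAR workflow on real logBB data):

      import numpy as np
      from sklearn.ensemble import RandomForestRegressor
      from sklearn.model_selection import KFold, cross_val_score

      rng = np.random.RandomState(1)
      X = rng.normal(size=(341, 50))                                 # stand-in chemical descriptors
      y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=341)  # stand-in permeability values

      cv = KFold(n_splits=5, shuffle=True, random_state=1)
      r2 = cross_val_score(RandomForestRegressor(n_estimators=200, random_state=1),
                           X, y, cv=cv, scoring="r2")
      print(f"five-fold CV R2: {r2.mean():.3f} ± {r2.std():.3f}")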

  4. Soft computing techniques toward modeling the water supplies of Cyprus.

    PubMed

    Iliadis, L; Maris, F; Tachos, S

    2011-10-01

    This research effort aims at applying soft computing techniques to water resources management. More specifically, the target is the development of reliable soft computing models capable of estimating the water supply for the case of the "Germasogeia" mountainous watersheds in Cyprus. Initially, ε-Regression Support Vector Machines (ε-RSVM) and fuzzy weighted ε-RSVM models have been developed that accept five input parameters. At the same time, reliable artificial neural networks have been developed to perform the same job. The 5-fold cross-validation approach has been employed in order to eliminate bad local behaviors and to produce a more representative training data set. Thus, the fuzzy weighted Support Vector Regression (SVR) combined with the fuzzy partition has been employed in an effort to enhance the quality of the results. Several rational and reliable models have been produced that can enhance the efficiency of water policy designers. Copyright © 2011 Elsevier Ltd. All rights reserved.

  5. Novel Breast Imaging and Machine Learning: Predicting Breast Lesion Malignancy at Cone-Beam CT Using Machine Learning Techniques.

    PubMed

    Uhlig, Johannes; Uhlig, Annemarie; Kunze, Meike; Beissbarth, Tim; Fischer, Uwe; Lotz, Joachim; Wienbeck, Susanne

    2018-05-24

    The purpose of this study is to evaluate the diagnostic performance of machine learning techniques for malignancy prediction at breast cone-beam CT (CBCT) and to compare them to human readers. Five machine learning techniques, including random forests, back propagation neural networks (BPN), extreme learning machines, support vector machines, and K-nearest neighbors, were used to train diagnostic models on a clinical breast CBCT dataset with internal validation by repeated 10-fold cross-validation. Two independent blinded human readers with profound experience in breast imaging and breast CBCT analyzed the same CBCT dataset. Diagnostic performance was compared using AUC, sensitivity, and specificity. The clinical dataset comprised 35 patients (American College of Radiology density type C and D breasts) with 81 suspicious breast lesions examined with contrast-enhanced breast CBCT. Forty-five lesions were histopathologically proven to be malignant. Among the machine learning techniques, BPNs provided the best diagnostic performance, with AUC of 0.91, sensitivity of 0.85, and specificity of 0.82. The diagnostic performance of the human readers was AUC of 0.84, sensitivity of 0.89, and specificity of 0.72 for reader 1 and AUC of 0.72, sensitivity of 0.71, and specificity of 0.67 for reader 2. AUC was significantly higher for BPN when compared with both reader 1 (p = 0.01) and reader 2 (p < 0.001). Machine learning techniques provide a high and robust diagnostic performance in the prediction of malignancy in breast lesions identified at CBCT. BPNs showed the best diagnostic performance, surpassing human readers in terms of AUC and specificity.
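
    A minimal Python sketch of repeated 10-fold cross-validated AUC comparison across several learners (scikit-learn; the synthetic data and the particular classifiers below are illustrative assumptions standing in for the study's CBCT features and five techniques):

      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
      from sklearn.neighbors import KNeighborsClassifier
      from sklearn.neural_network import MLPClassifier
      from sklearn.svm import SVC

      # 81 lesions, mirroring the record's sample size
      X, y = make_classification(n_samples=81, n_features=20, random_state=0)
      cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)

      models = {
          "random forest": RandomForestClassifier(random_state=0),
          "neural network": MLPClassifier(max_iter=2000, random_state=0),
          "SVM": SVC(probability=True, random_state=0),
          "kNN": KNeighborsClassifier(),
      }
      for name, model in models.items():
          auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
          print(f"{name}: AUC = {auc.mean():.2f} ± {auc.std():.2f}")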

  6. Non-destructive Techniques for Classifying Aircraft Coating Degradation

    DTIC Science & Technology

    2015-03-26

    model is the bidirectional reflectance distribution function (BRDF), which describes how much radiation is reflected for each solid angle and each...incident angle. An intermediate model between ideal reflectors and BRDF is to assume all reflectance is a combination of diffuse and specular reflectance... K-Fold Cross Validation

  7. Empirical Performance of Cross-Validation With Oracle Methods in a Genomics Context.

    PubMed

    Martinez, Josue G; Carroll, Raymond J; Müller, Samuel; Sampson, Joshua N; Chatterjee, Nilanjan

    2011-11-01

    When employing model selection methods with oracle properties such as the smoothly clipped absolute deviation (SCAD) and the Adaptive Lasso, it is typical to estimate the smoothing parameter by m-fold cross-validation, for example, m = 10. In problems where the true regression function is sparse and the signals large, such cross-validation typically works well. However, in regression modeling of genomic studies involving Single Nucleotide Polymorphisms (SNP), the true regression functions, while thought to be sparse, do not have large signals. We demonstrate empirically that in such problems, the number of selected variables using SCAD and the Adaptive Lasso, with 10-fold cross-validation, is a random variable that has considerable and surprising variation. Similar remarks apply to non-oracle methods such as the Lasso. Our study strongly questions the suitability of performing only a single run of m-fold cross-validation with any oracle method, and not just the SCAD and Adaptive Lasso.
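
    A minimal Python sketch of the paper's central observation — that the number of variables selected under m-fold cross-validation varies with the random fold assignment (scikit-learn's LassoCV stands in for SCAD and the Adaptive Lasso, which scikit-learn does not provide; the sparse weak-signal data are synthetic assumptions):

      import numpy as np
      from sklearn.linear_model import LassoCV
      from sklearn.model_selection import KFold

      rng = np.random.RandomState(42)
      n, p = 200, 100
      X = rng.normal(size=(n, p))
      beta = np.zeros(p)
      beta[:5] = 0.3                      # sparse truth with weak signals
      y = X @ beta + rng.normal(size=n)

      counts = []
      for seed in range(20):              # 20 different random fold assignments
          cv = KFold(n_splits=10, shuffle=True, random_state=seed)
          lasso = LassoCV(cv=cv).fit(X, y)
          counts.append(int(np.sum(lasso.coef_ != 0)))
      # the spread of these counts illustrates the variability the paper reports
      print("selected-variable counts across CV runs:", counts)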

  8. Parkinson's disease detection based on dysphonia measurements

    NASA Astrophysics Data System (ADS)

    Lahmiri, Salim

    2017-04-01

    Assessing dysphonic symptoms is a noninvasive and effective approach to detect Parkinson's disease (PD) in patients. The main purpose of this study is to investigate the effect of different dysphonia measurements on PD detection by support vector machine (SVM). Seven categories of dysphonia measurements are considered. Experimental results from the ten-fold cross-validation technique demonstrate that vocal fundamental frequency statistics yield the highest accuracy of 88% ± 0.04. When all dysphonia measurements are employed, the SVM classifier achieves 94% ± 0.03 accuracy. A refinement of the original pattern space by removing dysphonia measurements with similar variation across healthy and PD subjects allows achieving 97.03% ± 0.03 accuracy. The latter performance is higher than what is reported in the literature on the same dataset with the ten-fold cross-validation technique. Finally, it was found that measures of the ratio of noise to tonal components in the voice are the most suitable dysphonic symptoms for detecting PD subjects, as they achieve 99.64% ± 0.01 specificity. This finding is highly promising for understanding PD symptoms.

  9. Empirical Performance of Cross-Validation With Oracle Methods in a Genomics Context

    PubMed Central

    Martinez, Josue G.; Carroll, Raymond J.; Müller, Samuel; Sampson, Joshua N.; Chatterjee, Nilanjan

    2012-01-01

    When employing model selection methods with oracle properties such as the smoothly clipped absolute deviation (SCAD) and the Adaptive Lasso, it is typical to estimate the smoothing parameter by m-fold cross-validation, for example, m = 10. In problems where the true regression function is sparse and the signals large, such cross-validation typically works well. However, in regression modeling of genomic studies involving Single Nucleotide Polymorphisms (SNP), the true regression functions, while thought to be sparse, do not have large signals. We demonstrate empirically that in such problems, the number of selected variables using SCAD and the Adaptive Lasso, with 10-fold cross-validation, is a random variable that has considerable and surprising variation. Similar remarks apply to non-oracle methods such as the Lasso. Our study strongly questions the suitability of performing only a single run of m-fold cross-validation with any oracle method, and not just the SCAD and Adaptive Lasso. PMID:22347720

  10. Predictive modeling of outcomes following definitive chemoradiotherapy for oropharyngeal cancer based on FDG-PET image characteristics

    NASA Astrophysics Data System (ADS)

    Folkert, Michael R.; Setton, Jeremy; Apte, Aditya P.; Grkovski, Milan; Young, Robert J.; Schöder, Heiko; Thorstad, Wade L.; Lee, Nancy Y.; Deasy, Joseph O.; Oh, Jung Hun

    2017-07-01

    In this study, we investigate the use of imaging feature-based outcomes research (‘radiomics’) combined with machine learning techniques to develop robust predictive models for the risk of all-cause mortality (ACM), local failure (LF), and distant metastasis (DM) following definitive chemoradiation therapy (CRT). One hundred seventy-four patients with stage III-IV oropharyngeal cancer (OC) treated at our institution with CRT with retrievable pre- and post-treatment 18F-fluorodeoxyglucose positron emission tomography (FDG-PET) scans were identified. From pre-treatment PET scans, 24 representative imaging features of FDG-avid disease regions were extracted. Using machine learning-based feature selection methods, multiparameter logistic regression models were built incorporating clinical factors and imaging features. All model building methods were tested by cross-validation to avoid overfitting, and final outcome models were validated on an independent dataset from a collaborating institution. Multiparameter models were statistically significant on 5-fold cross-validation, with area under the receiver operating characteristic curve (AUC) = 0.65 (p = 0.004), 0.73 (p = 0.026), and 0.66 (p = 0.015) for ACM, LF, and DM, respectively. The model for LF retained significance on the independent validation cohort with AUC = 0.68 (p = 0.029), whereas the models for ACM and DM did not reach statistical significance but resulted in predictive power comparable to the 5-fold cross-validation, with AUC = 0.60 (p = 0.092) and 0.65 (p = 0.062), respectively. In the largest study of its kind to date, predictive features including increasing metabolic tumor volume, increasing image heterogeneity, and increasing tumor surface irregularity significantly correlated with mortality, LF, and DM on 5-fold cross-validation in a relatively uniform single-institution cohort. The LF model also retained significance in an independent population.

  11. Genomic selection across multiple breeding cycles in applied bread wheat breeding.

    PubMed

    Michel, Sebastian; Ametz, Christian; Gungor, Huseyin; Epure, Doru; Grausgruber, Heinrich; Löschenberger, Franziska; Buerstmayr, Hermann

    2016-06-01

    We evaluated genomic selection across five breeding cycles of bread wheat breeding. Bias of within-cycle cross-validation and methods for improving the prediction accuracy were assessed. The prospect of genomic selection has frequently been shown by cross-validation studies using the same genetic material across multiple environments, but studies investigating genomic selection across multiple breeding cycles in applied bread wheat breeding are lacking. We estimated the prediction accuracy of grain yield, protein content and protein yield of 659 inbred lines across five independent breeding cycles and assessed the bias of within-cycle cross-validation. We investigated the influence of outliers on the prediction accuracy and predicted protein yield by its component traits. A high average heritability was estimated for protein content, followed by grain yield and protein yield. The bias of the prediction accuracy using populations from individual cycles under five-fold cross-validation was accordingly substantial for protein yield (17-712%) and less pronounced for protein content (8-86%). Cross-validation using the cycles as folds aimed to avoid this bias and reached a maximum prediction accuracy of r = 0.51 for protein content, r = 0.38 for grain yield and r = 0.16 for protein yield. Dropping outlier cycles increased the prediction accuracy of grain yield to r = 0.41 as estimated by cross-validation, while dropping outlier environments did not have a significant effect on the prediction accuracy. Independent validation suggests, on the other hand, that careful consideration is necessary before undertaking an outlier correction that removes lines from the training population. Predicting protein yield by multiplying genomic estimated breeding values of grain yield and protein content raised the prediction accuracy to r = 0.19 for this derived trait.
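
    A minimal Python sketch of "cross-validation using the cycles as folds", i.e. leave-one-group-out so that lines from the predicted cycle never enter training (scikit-learn; ridge regression and random marker data are placeholder assumptions for the genomic prediction model):

      import numpy as np
      from sklearn.linear_model import Ridge
      from sklearn.model_selection import LeaveOneGroupOut

      rng = np.random.RandomState(0)
      X = rng.normal(size=(659, 300))                   # stand-in marker matrix, 659 lines
      y = X[:, :10].sum(axis=1) + rng.normal(size=659)  # stand-in trait values
      cycles = rng.randint(0, 5, size=659)              # breeding-cycle label per line

      accs = []
      for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=cycles):
          model = Ridge().fit(X[train_idx], y[train_idx])
          pred = model.predict(X[test_idx])
          # prediction accuracy r on the held-out cycle
          accs.append(np.corrcoef(pred, y[test_idx])[0, 1])
      print(f"across-cycle prediction accuracy r: {np.mean(accs):.2f}")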

  12. Pse-Analysis: a python package for DNA/RNA and protein/peptide sequence analysis based on pseudo components and kernel methods.

    PubMed

    Liu, Bin; Wu, Hao; Zhang, Deyuan; Wang, Xiaolong; Chou, Kuo-Chen

    2017-02-21

    To expedite the pace in conducting genome/proteome analysis, we have developed a Python package called Pse-Analysis. The powerful package can automatically complete the following five procedures: (1) sample feature extraction, (2) optimal parameter selection, (3) model training, (4) cross-validation, and (5) evaluation of prediction quality. All a user needs to do is input a benchmark dataset along with the query biological sequences concerned. Based on the benchmark dataset, Pse-Analysis will automatically construct an ideal predictor, followed by yielding the predicted results for the submitted query samples. All the aforementioned tedious jobs can be done automatically by the computer. Moreover, the multiprocessing technique was adopted to enhance computational speed by about six-fold. The Pse-Analysis Python package is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/Pse-Analysis/, and can be run directly on Windows, Linux, and Unix.

  13. Recurrence predictive models for patients with hepatocellular carcinoma after radiofrequency ablation using support vector machines with feature selection methods.

    PubMed

    Liang, Ja-Der; Ping, Xiao-Ou; Tseng, Yi-Ju; Huang, Guan-Tarn; Lai, Feipei; Yang, Pei-Ming

    2014-12-01

    Recurrence of hepatocellular carcinoma (HCC) is an important issue despite effective treatments with tumor eradication. Identification of patients who are at high risk for recurrence may provide more efficacious screening and detection of tumor recurrence. The aim of this study was to develop recurrence predictive models for HCC patients who received radiofrequency ablation (RFA) treatment. From January 2007 to December 2009, 83 newly diagnosed HCC patients receiving RFA as their first treatment were enrolled. Five feature selection methods, including genetic algorithm (GA), simulated annealing (SA) algorithm, random forests (RF) and hybrid methods (GA+RF and SA+RF), were utilized for selecting an important subset of features from a total of 16 clinical features. These feature selection methods were combined with support vector machine (SVM) for developing predictive models with better performance. Five-fold cross-validation was used to train and test SVM models. The developed SVM-based predictive models with hybrid feature selection methods and 5-fold cross-validation achieved average sensitivity, specificity, accuracy, positive predictive value, negative predictive value, and area under the ROC curve of 67%, 86%, 82%, 69%, 90%, and 0.69, respectively. The SVM-derived predictive model can identify patients at high risk of recurrence, who should be closely followed up after complete RFA treatment. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
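
    A minimal Python sketch of combining feature selection with an SVM under 5-fold cross-validation (scikit-learn; SelectKBest is a simple stand-in assumption for the GA/SA/RF hybrid selectors, and the data are synthetic). Wrapping both steps in one pipeline re-runs the selection inside every training fold, which avoids leaking test-fold information:

      from sklearn.datasets import make_classification
      from sklearn.feature_selection import SelectKBest, f_classif
      from sklearn.model_selection import StratifiedKFold, cross_val_score
      from sklearn.pipeline import Pipeline
      from sklearn.svm import SVC

      # 83 patients x 16 clinical features, mirroring the record's dimensions
      X, y = make_classification(n_samples=83, n_features=16, n_informative=6,
                                 random_state=0)
      pipe = Pipeline([
          ("select", SelectKBest(f_classif, k=8)),  # re-fitted on each training fold only
          ("svm", SVC(kernel="rbf")),
      ])
      cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
      acc = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")
      print(f"5-fold CV accuracy: {acc.mean():.2f} ± {acc.std():.2f}")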

  14. Using patient data similarities to predict radiation pneumonitis via a self-organizing map

    NASA Astrophysics Data System (ADS)

    Chen, Shifeng; Zhou, Sumin; Yin, Fang-Fang; Marks, Lawrence B.; Das, Shiva K.

    2008-01-01

    This work investigates the use of the self-organizing map (SOM) technique for predicting lung radiation pneumonitis (RP) risk. SOM is an effective method for projecting and visualizing high-dimensional data in a low-dimensional space (map). By projecting patients with similar data (dose and non-dose factors) onto the same region of the map, commonalities in their outcomes can be visualized and categorized. Once built, the SOM may be used to predict pneumonitis risk by identifying the region of the map that is most similar to a patient's characteristics. Two SOM models were developed from a database of 219 lung cancer patients treated with radiation therapy (34 clinically diagnosed with Grade 2+ pneumonitis). The models were: SOMall built from all dose and non-dose factors and, for comparison, SOMdose built from dose factors alone. Both models were tested using ten-fold cross validation and Receiver Operating Characteristics (ROC) analysis. Models SOMall and SOMdose yielded ten-fold cross-validated ROC areas of 0.73 (sensitivity/specificity = 71%/68%) and 0.67 (sensitivity/specificity = 63%/66%), respectively. The significant difference between the cross-validated ROC areas of these two models (p < 0.05) implies that non-dose features add important information toward predicting RP risk. Among the input features selected by model SOMall, the two with highest impact for increasing RP risk were: (a) higher mean lung dose and (b) chemotherapy prior to radiation therapy. The SOM model developed here may not be extrapolated to treatment techniques outside that used in our database, such as several-field lung intensity modulated radiation therapy or gated radiation therapy.

  15. A nearest neighbor approach for automated transporter prediction and categorization from protein sequences.

    PubMed

    Li, Haiquan; Dai, Xinbin; Zhao, Xuechun

    2008-05-01

    Membrane transport proteins play a crucial role in the import and export of ions, small molecules or macromolecules across biological membranes. Currently, there are a limited number of published computational tools which enable the systematic discovery and categorization of transporters prior to costly experimental validation. To approach this problem, we utilized a nearest neighbor method which seamlessly integrates homologous search and topological analysis into a machine-learning framework. Our approach satisfactorily distinguished 484 transporter families in the Transporter Classification Database, a curated and representative database for transporters. A five-fold cross-validation on the database achieved a positive classification rate of 72.3% on average. Furthermore, this method successfully detected transporters in seven model and four non-model organisms, ranging from archaean to mammalian species. A preliminary literature-based validation has cross-validated 65.8% of our predictions on the 11 organisms, including 55.9% of our predictions overlapping with 83.6% of the predicted transporters in TransportDB.

  16. Cross-validation pitfalls when selecting and assessing regression and classification models.

    PubMed

    Krstajic, Damjan; Buturovic, Ljubomir J; Leahy, David E; Thomas, Simon

    2014-03-29

    We address the problem of selecting and assessing classification and regression models using cross-validation. Current state-of-the-art methods can yield models with high variance, rendering them unsuitable for a number of practical applications including QSAR. In this paper we describe and evaluate best practices which improve reliability and increase confidence in selected models. A key operational component of the proposed methods is cloud computing which enables routine use of previously infeasible approaches. We describe in detail an algorithm for repeated grid-search V-fold cross-validation for parameter tuning in classification and regression, and we define a repeated nested cross-validation algorithm for model assessment. As regards variable selection and parameter tuning we define two algorithms (repeated grid-search cross-validation and double cross-validation), and provide arguments for using the repeated grid-search in the general case. We show results of our algorithms on seven QSAR datasets. The variation of the prediction performance, which is the result of choosing different splits of the dataset in V-fold cross-validation, needs to be taken into account when selecting and assessing classification and regression models. We demonstrate the importance of repeating cross-validation when selecting an optimal model, as well as the importance of repeating nested cross-validation when assessing a prediction error.
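
    A minimal Python sketch of repeated nested cross-validation in the spirit of the algorithms described above (scikit-learn; the SVM, grid values and synthetic data are illustrative assumptions, not the paper's QSAR setup). The inner grid search tunes parameters, the outer loop estimates prediction error, and the whole procedure is repeated over different splits:

      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
      from sklearn.svm import SVC

      X, y = make_classification(n_samples=200, n_features=30, random_state=0)
      param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.1]}

      outer_scores = []
      for repeat in range(10):                               # repeat the nested CV
          inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=repeat)
          outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=100 + repeat)
          tuned = GridSearchCV(SVC(), param_grid, cv=inner)  # tuning inside each outer fold
          outer_scores.append(cross_val_score(tuned, X, y, cv=outer).mean())

      # the spread across repeats is the split-to-split variation the paper warns about
      print(f"nested CV accuracy: mean {np.mean(outer_scores):.3f}, "
            f"range {min(outer_scores):.3f}-{max(outer_scores):.3f}")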

  17. SU-G-BRC-13: Model Based Classification for Optimal Position Selection for Left-Sided Breast Radiotherapy: Free Breathing, DIBH, Or Prone

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lin, H; Liu, T; Xu, X

    Purpose: There are clinical decision challenges in selecting optimal treatment positions for left-sided breast cancer patients—supine free breathing (FB), supine Deep Inspiration Breath Hold (DIBH) and prone free breathing (prone). Physicians often make the decision based on experience and trials, which might not always result in optimal OAR doses. We herein propose a mathematical model to predict the lowest OAR doses among these three positions, providing a quantitative tool for the corresponding clinical decision. Methods: Patients were scanned in FB, DIBH, and prone positions under an IRB-approved protocol. Tangential beam plans were generated for each position, and OAR doses were calculated. The position with the least OAR doses is defined as the optimal position. The following features were extracted from each scan to build the model: heart, ipsilateral lung, and breast volumes; in-field heart and ipsilateral lung volumes; distance between heart and target; laterality of heart; and doses to heart and ipsilateral lung. Principal Components Analysis (PCA) was applied to remove the co-linearity of the input data and also to lower the data dimensionality. Feature selection, another method to reduce dimensionality, was applied as a comparison. Support Vector Machine (SVM) was then used for classification. Thirty-seven patients' data were acquired; up to now, five patient plans were available. K-fold cross-validation was used to validate the accuracy of the classifier model with a small training size. Results: The classification results and K-fold cross-validation demonstrated that the model is capable of predicting the optimal position for patients. The accuracy of K-fold cross-validation reached 80%. Compared to PCA, feature selection allows causal features of dose to be determined. This provides more clinical insights. Conclusion: The proposed classification system appeared to be feasible. We are generating plans for the rest of the 37 patient images, and more statistically significant results are to be presented.

  18. In silico prediction of Tetrahymena pyriformis toxicity for diverse industrial chemicals with substructure pattern recognition and machine learning methods.

    PubMed

    Cheng, Feixiong; Shen, Jie; Yu, Yue; Li, Weihua; Liu, Guixia; Lee, Philip W; Tang, Yun

    2011-03-01

    There is an increasing need for the rapid safety assessment of chemicals by both industries and regulatory agencies throughout the world. In silico techniques are practical alternatives in environmental hazard assessment, especially for addressing the persistence, bioaccumulation and toxicity potentials of organic chemicals. Tetrahymena pyriformis toxicity is often used as a toxic endpoint. In this study, 1571 diverse unique chemicals were collected from the literature and composed the largest diverse data set for T. pyriformis toxicity. Classification predictive models of T. pyriformis toxicity were developed by substructure pattern recognition and different machine learning methods, including support vector machine (SVM), C4.5 decision tree, k-nearest neighbors and random forest. The results of a 5-fold cross-validation showed that the SVM method performed better than the other algorithms. The overall predictive accuracies of the SVM classification model with a radial basis function kernel were 92.2% for the 5-fold cross-validation and 92.6% for the external validation set, respectively. Furthermore, several representative substructure patterns for characterizing T. pyriformis toxicity were also identified via information gain analysis methods. Copyright © 2010 Elsevier Ltd. All rights reserved.

  19. Benchmark of Machine Learning Methods for Classification of a SENTINEL-2 Image

    NASA Astrophysics Data System (ADS)

    Pirotti, F.; Sunar, F.; Piragnolo, M.

    2016-06-01

    Thanks mainly to ESA and USGS, a large bulk of free images of the Earth is readily available nowadays. One of the main goals of remote sensing is to label images according to a set of semantic categories, i.e. image classification. This is a very challenging issue, since land cover of a specific class may present a large spatial and spectral variability and objects may appear at different scales and orientations. In this study, we report the results of benchmarking 9 machine learning algorithms tested for accuracy and speed in training and classification of land-cover classes in a Sentinel-2 dataset. The following machine learning methods (MLM) have been tested: linear discriminant analysis, k-nearest neighbour, random forests, support vector machines, multi-layered perceptron, multi-layered perceptron ensemble, ctree, boosting, logarithmic regression. The validation is carried out using a control dataset which consists of an independent classification in 11 land-cover classes of an area of about 60 km², obtained by manual visual interpretation of high resolution images (20 cm ground sampling distance) by experts. In this study five out of the eleven classes are used, since the others have too few samples (pixels) for the testing and validating subsets. The classes used are the following: (i) urban (ii) sowable areas (iii) water (iv) tree plantations (v) grasslands. Validation is carried out using three different approaches: (i) using pixels from the training dataset (train), (ii) using pixels from the training dataset and applying cross-validation with the k-fold method (kfold) and (iii) using all pixels from the control dataset. Five accuracy indices are calculated for the comparison between the values predicted with each model and control values over three sets of data: the training dataset (train), the whole control dataset (full) and with k-fold cross-validation (kfold) with ten folds. Results from validation of predictions of the whole dataset (full) show the random forests method with the highest values; the kappa index ranges from 0.55 to 0.42 with the largest and smallest numbers of training pixels, respectively. The two neural networks (multi-layered perceptron and its ensemble) and the support vector machines - with default radial basis function kernel - methods follow closely with comparable performance.

  20. Non-destructive detection of cross-sectional strain and defect structure in an individual Ag five-fold twinned nanowire by 3D electron diffraction mapping.

    PubMed

    Fu, Xin; Yuan, Jun

    2017-07-24

    Coherent x-ray diffraction investigations of Ag five-fold twinned nanowires (FTNWs) have drawn controversial conclusions concerning whether the intrinsic 7.35° angular gap could be compensated homogeneously through phase transformation or inhomogeneously by forming a disclination strain field. In those studies, the x-ray techniques only provided an ensemble average of the structural information from all the Ag nanowires. Here, using a three-dimensional (3D) electron diffraction mapping approach, we non-destructively explore the cross-sectional strain and the related strain-relief defect structures of an individual Ag FTNW with a diameter of about 30 nm. The quantitative analysis of the fine structure of the intensity distribution, combined with kinematic electron diffraction simulation, confirms that for such a Ag FTNW, the intrinsic 7.35° angular deficiency results in an inhomogeneous strain field within each single-crystalline segment, consistent with the disclination model of stress relief. Moreover, the five crystalline segments are found to be strained differently. Modeling analysis in combination with system energy calculation further indicates that the elastic strain energy within some crystalline segments could be partially relieved by the creation of stacking fault layers near the twin boundaries. Our study demonstrates that 3D electron diffraction mapping is a powerful tool for the cross-sectional strain analysis of complex 1D nanostructures.

  1. Improved method for predicting protein fold patterns with ensemble classifiers.

    PubMed

    Chen, W; Liu, X; Huang, Y; Jiang, Y; Zou, Q; Lin, C

    2012-01-27

    Protein folding is recognized as a critical problem in the field of biophysics in the 21st century. Predicting protein-folding patterns is challenging due to the complex structure of proteins. In an attempt to solve this problem, we employed ensemble classifiers to improve prediction accuracy. In our experiments, 188-dimensional features were extracted based on the composition and physical-chemical properties of proteins, and 20-dimensional features were selected using a coupled position-specific scoring matrix. Compared with traditional prediction methods, these methods were superior in terms of prediction accuracy. The 188-dimensional feature-based method achieved 71.2% accuracy in five-fold cross-validation. The accuracy rose to 77% when we used a 20-dimensional feature vector. These methods were used on recent data, with 54.2% accuracy. Source codes and dataset, together with web server and software tools for prediction, are available at: http://datamining.xmu.edu.cn/main/~cwc/ProteinPredict.html.

  2. GIS-aided Statistical Landslide Susceptibility Modeling And Mapping Of Antipolo Rizal (Philippines)

    NASA Astrophysics Data System (ADS)

    Dumlao, A. J.; Victor, J. A.

    2015-09-01

    Slope instability associated with heavy rainfall or earthquakes is a familiar geotechnical problem in the Philippines. The main objective of this study is to perform a detailed landslide susceptibility assessment of Antipolo City. The statistical method of assessment used was logistic regression. Landslide inventory was done through interpretation of aerial photographs and satellite images with corresponding field verification. In this study, morphologic and non-morphologic factors contributing to landslide occurrence and their corresponding spatial relationships were considered. The analysis of landslide susceptibility was implemented in a Geographic Information System (GIS). The 17,320 randomly selected samples were divided into training and test data sets. K-fold cross-validation is done with k = 5: the model is fitted five times, each time with k-1 folds as the training data set and the remaining fold as the validation data set. The AUROC of each model is computed using its corresponding validation data set. The AUROCs of the five models are 0.978, 0.977, 0.977, 0.974, and 0.979, respectively, implying that the models are effective in correctly predicting the occurrence and non-occurrence of landslide activity. Field verification was also done. The landslide susceptibility map was then generated from the model. It is classified into four categories: low, moderate, high and very high susceptibility. The study also shows that almost 40% of Antipolo City has been assessed to be potentially dangerous in terms of landslide occurrence.
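
    A minimal Python sketch of this protocol — fitting a logistic-regression susceptibility model five times with k-1 folds for training and reporting per-fold AUROC (scikit-learn; the synthetic predictors below are an illustrative assumption replacing the real terrain factors):

      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import roc_auc_score
      from sklearn.model_selection import StratifiedKFold

      rng = np.random.RandomState(0)
      X = rng.normal(size=(17320, 8))                                 # stand-in landslide factors
      y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=17320)) > 1.0    # stand-in landslide labels

      aurocs = []
      cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
      for train_idx, val_idx in cv.split(X, y):
          model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
          prob = model.predict_proba(X[val_idx])[:, 1]
          aurocs.append(roc_auc_score(y[val_idx], prob))              # one AUROC per fold
      print("per-fold AUROC:", [f"{a:.3f}" for a in aurocs])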

  3. Comparisons of the Outcome Prediction Performance of Injury Severity Scoring Tools Using the Abbreviated Injury Scale 90 Update 98 (AIS 98) and 2005 Update 2008 (AIS 2008).

    PubMed

    Tohira, Hideo; Jacobs, Ian; Mountain, David; Gibson, Nick; Yeo, Allen

    2011-01-01

    The Abbreviated Injury Scale (AIS) was revised in 2005 and updated in 2008 (AIS 2008). We aimed to compare the outcome prediction performance of AIS-based injury severity scoring tools using AIS 2008 and AIS 98. We used all major trauma patients hospitalized at the Royal Perth Hospital between 1994 and 2008. We selected five AIS-based injury severity scoring tools, including the Injury Severity Score (ISS), New Injury Severity Score (NISS), modified Anatomic Profile (mAP), Trauma and Injury Severity Score (TRISS) and A Severity Characterization of Trauma (ASCOT). We selected survival after injury as the target outcome. We used the area under the Receiver Operating Characteristic curve (AUROC) as the performance measure. First, we compared the five tools using all cases whose records included all variables for the TRISS (complete dataset), using 10-fold cross-validation. Second, we compared the ISS and NISS for AIS 98 and AIS 2008 using all subjects (whole dataset). We identified 1,269 and 4,174 cases for the complete dataset and the whole dataset, respectively. With the 10-fold cross-validation, there were no clear differences in the AUROCs between the AIS 98- and AIS 2008-based scores. With the second comparison, the AIS 98-based ISS performed significantly worse than the AIS 2008-based ISS (p<0.0001), while there was no significant difference between the AIS 98- and AIS 2008-based NISSs. Researchers should be aware of these findings when they select an injury severity scoring tool for their studies.

  4. Novel naïve Bayes classification models for predicting the chemical Ames mutagenicity.

    PubMed

    Zhang, Hui; Kang, Yan-Li; Zhu, Yuan-Yuan; Zhao, Kai-Xia; Liang, Jun-Yu; Ding, Lan; Zhang, Teng-Guo; Zhang, Ji

    2017-06-01

    Prediction of drug candidates for mutagenicity is a regulatory requirement since mutagenic compounds could pose a toxic risk to humans. The aim of this investigation was to develop a novel prediction model of mutagenicity by using a naïve Bayes classifier. The established model was validated by internal 5-fold cross-validation and external test sets. For comparison, a recursive partitioning classifier prediction model was also established, and various other reported prediction models of mutagenicity were collected. Among these methods, the naïve Bayes classifier established here performed very well and stably, yielding average overall prediction accuracies of 89.1±0.4% for the internal 5-fold cross-validation of the training set and 77.3±1.5% for external test set I. The concordance for external test set II, with 446 marketed drugs, was 90.9±0.3%. In addition, four simple molecular descriptors (e.g., Apol, No. of H donors, Num-Rings and Wiener) related to mutagenicity and five representative substructures of mutagens (e.g., aromatic nitro, hydroxyl amine, nitroso, aromatic amine and N-methyl-N-methylenemethanaminum) produced by ECFP_14 fingerprints were identified. We hope the established naïve Bayes prediction model can be applied in risk assessment processes, and that the information obtained on mutagenic chemicals can guide the design of chemical libraries for hit and lead optimization. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. sNebula, a network-based algorithm to predict binding between human leukocyte antigens and peptides

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Luo, Heng; Ye, Hao; Ng, Hui Wen

    Understanding the binding between human leukocyte antigens (HLAs) and peptides is important to understand the functioning of the immune system. Since it is time-consuming and costly to measure the binding between large numbers of HLAs and peptides, computational methods including machine learning models and network approaches have been developed to predict HLA-peptide binding. However, there are several limitations for the existing methods. We developed a network-based algorithm called sNebula to address these limitations. We curated qualitative Class I HLA-peptide binding data and demonstrated the prediction performance of sNebula on this dataset using leave-one-out cross-validation and five-fold cross-validations. Furthermore, this algorithm can predict not only peptides of different lengths and different types of HLAs, but also the peptides or HLAs that have no existing binding data. We believe sNebula is an effective method to predict HLA-peptide binding and thus improve our understanding of the immune system.

  6. sNebula, a network-based algorithm to predict binding between human leukocyte antigens and peptides

    PubMed Central

    Luo, Heng; Ye, Hao; Ng, Hui Wen; Sakkiah, Sugunadevi; Mendrick, Donna L.; Hong, Huixiao

    2016-01-01

    Understanding the binding between human leukocyte antigens (HLAs) and peptides is important to understand the functioning of the immune system. Since it is time-consuming and costly to measure the binding between large numbers of HLAs and peptides, computational methods including machine learning models and network approaches have been developed to predict HLA-peptide binding. However, there are several limitations for the existing methods. We developed a network-based algorithm called sNebula to address these limitations. We curated qualitative Class I HLA-peptide binding data and demonstrated the prediction performance of sNebula on this dataset using leave-one-out cross-validation and five-fold cross-validations. This algorithm can predict not only peptides of different lengths and different types of HLAs, but also the peptides or HLAs that have no existing binding data. We believe sNebula is an effective method to predict HLA-peptide binding and thus improve our understanding of the immune system. PMID:27558848
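
    A minimal Python sketch of evaluating one model under both leave-one-out and five-fold cross-validation, as done for sNebula (scikit-learn; the kNN classifier and synthetic binding data are placeholder assumptions for the network-based algorithm and the HLA-peptide dataset):

      from sklearn.datasets import make_classification
      from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score
      from sklearn.neighbors import KNeighborsClassifier

      X, y = make_classification(n_samples=150, n_features=10, random_state=0)
      model = KNeighborsClassifier(n_neighbors=5)

      loo_acc = cross_val_score(model, X, y, cv=LeaveOneOut()).mean()
      kf_acc = cross_val_score(model, X, y,
                               cv=KFold(n_splits=5, shuffle=True, random_state=0)).mean()
      print(f"leave-one-out accuracy: {loo_acc:.2f}; five-fold accuracy: {kf_acc:.2f}")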

  7. sNebula, a network-based algorithm to predict binding between human leukocyte antigens and peptides

    DOE PAGES

    Luo, Heng; Ye, Hao; Ng, Hui Wen; ...

    2016-08-25

    Understanding the binding between human leukocyte antigens (HLAs) and peptides is important to understand the functioning of the immune system. Since it is time-consuming and costly to measure the binding between large numbers of HLAs and peptides, computational methods including machine learning models and network approaches have been developed to predict HLA-peptide binding. However, there are several limitations for the existing methods. We developed a network-based algorithm called sNebula to address these limitations. We curated qualitative Class I HLA-peptide binding data and demonstrated the prediction performance of sNebula on this dataset using leave-one-out cross-validation and five-fold cross-validations. Furthermore, this algorithm can predict not only peptides of different lengths and different types of HLAs, but also the peptides or HLAs that have no existing binding data. We believe sNebula is an effective method to predict HLA-peptide binding and thus improve our understanding of the immune system.

  8. Cross-cultural equivalence of the patient- and parent-reported quality of life in short stature youth (QoLISSY) questionnaire.

    PubMed

    Bullinger, Monika; Quitmann, Julia; Silva, Neuza; Rohenkohl, Anja; Chaplin, John E; DeBusk, Kendra; Mimoun, Emmanuelle; Feigerlova, Eva; Herdman, Michael; Sanz, Dolores; Wollmann, Hartmut; Pleil, Andreas; Power, Michael

    2014-01-01

    Testing cross-cultural equivalence of patient-reported outcomes requires sufficiently large samples per country, which is difficult to achieve in rare endocrine paediatric conditions. We describe a novel approach to cross-cultural testing of the Quality of Life in Short Stature Youth (QoLISSY) questionnaire in five countries by sequentially taking one country out (TOCO) from the total sample and iteratively comparing the resulting psychometric performance. Development of the QoLISSY proceeded from focus group discussions through pilot testing to field testing in 268 short-statured patients and their parents. To explore cross-cultural equivalence, the iterative TOCO technique was used to examine and compare the validity, reliability, and convergence of patient and parent responses on QoLISSY in the field test dataset, and to predict QoLISSY scores from clinical, socio-demographic and psychosocial variables. Validity and reliability indicators were satisfactory for each sample after iteratively omitting one country. Comparisons with the total sample revealed cross-cultural equivalence in internal consistency and construct validity for patients and parents, high inter-rater agreement and a substantial proportion of QoLISSY variance explained by the predictors. The TOCO technique is a powerful method to overcome problems of country-specific testing of patient-reported outcome instruments. It provides empirical support for QoLISSY's cross-cultural equivalence and is recommended for future research.

  9. Comparisons of the Outcome Prediction Performance of Injury Severity Scoring Tools Using the Abbreviated Injury Scale 90 Update 98 (AIS 98) and 2005 Update 2008 (AIS 2008)

    PubMed Central

    Tohira, Hideo; Jacobs, Ian; Mountain, David; Gibson, Nick; Yeo, Allen

    2011-01-01

    The Abbreviated Injury Scale (AIS) was revised in 2005 and updated in 2008 (AIS 2008). We aimed to compare the outcome prediction performance of AIS-based injury severity scoring tools by using AIS 2008 and AIS 98. We used all major trauma patients hospitalized to the Royal Perth Hospital between 1994 and 2008. We selected five AIS-based injury severity scoring tools, including Injury Severity Score (ISS), New Injury Severity Score (NISS), modified Anatomic Profile (mAP), Trauma and Injury Severity Score (TRISS) and A Severity Characterization of Trauma (ASCOT). We selected survival after injury as a target outcome. We used the area under the Receiver Operating Characteristic curve (AUROC) as a performance measure. First, we compared the five tools using all cases whose records included all variables for the TRISS (complete dataset) using a 10-fold cross-validation. Second, we compared the ISS and NISS for AIS 98 and AIS 2008 using all subjects (whole dataset). We identified 1,269 and 4,174 cases for a complete dataset and a whole dataset, respectively. With the 10-fold cross-validation, there were no clear differences in the AUROCs between the AIS 98- and AIS 2008-based scores. With the second comparison, the AIS 98-based ISS performed significantly worse than the AIS 2008-based ISS (p<0.0001), while there was no significant difference between the AIS 98- and AIS 2008-based NISSs. Researchers should be aware of these findings when they select an injury severity scoring tool for their studies. PMID:22105401

  10. A new fold-cross metal mesh filter for suppressing side lobe leakage in terahertz region

    NASA Astrophysics Data System (ADS)

    Lu, Changgui; Qi, Zhengqing; Guo, Wengao; Cui, Yiping

    2018-04-01

    In this paper we propose a new type of fold-cross metal mesh band-pass filter, which keeps the diffraction side lobe far away from the main transmission peak and shows much better side-lobe suppression. Both experimental and theoretical studies are carried out to analyze the mechanism of the side lobe. Compared to the traditional cross filter, the fold-cross filter has a much lower side lobe with almost the same central frequency, bandwidth and peak transmission of about 98%. Using photolithography and electroplating techniques, we experimentally extend the distance between the main peak and the diffraction side lobe to more than 1 THz for the fold-cross filter, which is two times larger than for the cross filter, while maintaining a main-peak transmission of 89% at 1.25 THz for both structures. This type of single-layer, substrate-free fold-cross metal structure shows better design flexibility and structural reliability with the introduction of fold arms for metal mesh band-pass filters.

  11. Development of estrogen receptor beta binding prediction model using large sets of chemicals.

    PubMed

    Sakkiah, Sugunadevi; Selvaraj, Chandrabose; Gong, Ping; Zhang, Chaoyang; Tong, Weida; Hong, Huixiao

    2017-11-03

    We developed an ERβ binding prediction model to facilitate identification of chemicals that specifically bind ERβ or ERα, together with our previously developed ERα binding model. Decision Forest was used to train the ERβ binding prediction model based on a large set of compounds obtained from EADB. Model performance was estimated through 1000 iterations of 5-fold cross-validation. Prediction confidence was analyzed using predictions from the cross-validations. Informative chemical features for ERβ binding were identified through analysis of the frequency data of chemical descriptors used in the models in the 5-fold cross-validations. 1000 permutations were conducted to assess chance correlation. The average accuracy of the 5-fold cross-validations was 93.14% with a standard deviation of 0.64%. Prediction confidence analysis indicated that the higher the prediction confidence, the more accurate the predictions. Permutation testing results revealed that the prediction model is unlikely to have been generated by chance. Eighteen informative descriptors were identified as important to ERβ binding prediction. Application of the prediction model to the data from the ToxCast project yielded a very high sensitivity of 90-92%. Our results demonstrated that ERβ binding of chemicals can be accurately predicted using the developed model. Coupled with our previously developed ERα prediction model, this model could be expected to facilitate drug development through identification of chemicals that specifically bind ERβ or ERα.
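
    A minimal Python sketch of permutation testing a 5-fold cross-validated classifier to assess chance correlation (scikit-learn's permutation_test_score; a random-forest learner and synthetic descriptors are placeholder assumptions for Decision Forest and the EADB compounds, and only 100 permutations are run here, versus the study's 1000):

      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import StratifiedKFold, permutation_test_score

      X, y = make_classification(n_samples=300, n_features=25, random_state=0)
      cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

      score, perm_scores, p_value = permutation_test_score(
          RandomForestClassifier(random_state=0), X, y,
          cv=cv, n_permutations=100, random_state=0)  # label-shuffled baselines
      print(f"CV accuracy {score:.3f}, permutation p-value {p_value:.4f}")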

  12. An adaptive deep learning approach for PPG-based identification.

    PubMed

    Jindal, V; Birjandtalab, J; Pouyan, M Baran; Nourani, M

    2016-08-01

    Wearable biosensors have become increasingly popular in healthcare due to their capabilities for low-cost and long-term biosignal monitoring. This paper presents a novel two-stage technique to offer biometric identification using these biosensors through Deep Belief Networks and Restricted Boltzmann Machines. Our identification approach improves robustness in current monitoring procedures within clinical, e-health and fitness environments using Photoplethysmography (PPG) signals through deep learning classification models. The approach is tested on the TROIKA dataset using 10-fold cross-validation and achieved an accuracy of 96.1%.

  13. EL_PSSM-RT: DNA-binding residue prediction by integrating ensemble learning with PSSM Relation Transformation.

    PubMed

    Zhou, Jiyun; Lu, Qin; Xu, Ruifeng; He, Yulan; Wang, Hongpeng

    2017-08-29

    Prediction of DNA-binding residues is important for understanding the protein-DNA recognition mechanism. Many computational methods have been proposed for the prediction, but most of them do not consider the relationships of evolutionary information between residues. In this paper, we first propose a novel residue encoding method, referred to as the Position Specific Score Matrix (PSSM) Relation Transformation (PSSM-RT), to encode residues by utilizing the relationships of evolutionary information between residues. PDNA-62 and PDNA-224 are used to evaluate PSSM-RT and two existing PSSM encoding methods by five-fold cross-validation. Performance evaluations indicate that PSSM-RT is more effective than previous methods. This validates the point that the relationship of evolutionary information between residues is indeed useful in DNA-binding residue prediction. An ensemble learning classifier (EL_PSSM-RT) is also proposed by combining an ensemble learning model and PSSM-RT to better handle the imbalance between binding and non-binding residues in datasets. EL_PSSM-RT is evaluated by five-fold cross-validation using PDNA-62 and PDNA-224 as well as two independent datasets, TS-72 and TS-61. Performance comparisons with existing predictors on the four datasets demonstrate that EL_PSSM-RT is the best-performing method, with improvements of 0.02-0.07 in MCC, 4.18-21.47% in ST and 0.013-0.131 in AUC. Furthermore, we analyze the importance of the pair-relationships extracted by PSSM-RT, and the results validate the usefulness of PSSM-RT for encoding DNA-binding residues. We propose a novel prediction method for DNA-binding residues that incorporates the relationship of evolutionary information and ensemble learning. Performance evaluation shows that the relationship of evolutionary information between residues is indeed useful in DNA-binding residue prediction and that ensemble learning can be used to address the data imbalance issue between binding and non-binding residues. A web service of EL_PSSM-RT (http://hlt.hitsz.edu.cn:8080/PSSM-RT_SVM/) is provided for free access to the biological research community.

  14. In vivo multi-modality photoacoustic and pulse echo tracking of prostate tumor growth using a window chamber

    NASA Astrophysics Data System (ADS)

    Bauer, Daniel R.; Olafsson, Ragnar; Montilla, Leonardo G.; Witte, Russell S.

    2010-02-01

    Understanding the tumor microenvironment is critical to characterizing how cancers operate and predicting how they will eventually respond to treatment. The mouse window chamber model is an excellent tool for cancer research, because it enables high-resolution tumor imaging and cross-validation using multiple modalities. We describe a novel multimodality imaging system that incorporates three-dimensional (3D) photoacoustics with pulse echo ultrasound for imaging the tumor microenvironment and tracking tissue growth in mice. Three mice were implanted with a dorsal skin flap window chamber. PC-3 prostate tumor cells, expressing green fluorescent protein (GFP), were injected into the skin. The ensuing tumor invasion was mapped using photoacoustic and pulse echo imaging, as well as optical and fluorescent imaging for comparison and cross-validation. The photoacoustic imaging and spectroscopy system, consisting of a tunable (680-1000 nm) pulsed laser and a 25 MHz ultrasound transducer, revealed near-infrared-absorbing regions, primarily blood vessels. Pulse echo images, obtained simultaneously, provided details of the tumor microstructure and growth with 100-μm3 resolution. The tumor size in all three mice increased between three- and five-fold during 3+ weeks of imaging. Results were consistent with the optical and fluorescent images. Photoacoustic imaging revealed detailed maps of the tumor vasculature, whereas photoacoustic spectroscopy identified regions of oxygenated and deoxygenated blood vessels. The 3D photoacoustic and pulse echo imaging system provided complementary information to track the tumor microenvironment, evaluate new cancer therapies, and develop molecular imaging agents in vivo. Finally, these safe and noninvasive techniques are potentially applicable to human cancer imaging.

  15. Classification of Focal and Non Focal Epileptic Seizures Using Multi-Features and SVM Classifier.

    PubMed

    Sriraam, N; Raghu, S

    2017-09-02

    Identifying epileptogenic zones prior to surgery is an essential and crucial step in treating patients with pharmacoresistant focal epilepsy. The electroencephalogram (EEG) is a significant measurement benchmark for assessing patients suffering from epilepsy. This paper investigates the application of multi-features derived from different domains to recognize focal and non-focal epileptic seizures in recordings from pharmacoresistant focal epilepsy patients in the Bern Barcelona database. From the dataset, five different classification tasks were formed. In total, 26 features were extracted from focal and non-focal EEG. Significant features were selected using the Wilcoxon rank sum test by setting the p-value (p < 0.05) and z-score (z < -1.96 or z > 1.96) at the 95% significance level. The hypothesis was made that removing outliers improves classification accuracy. Tukey's range test was adopted for pruning outliers from the feature set. Finally, 21 features were classified using an optimized support vector machine (SVM) classifier with 10-fold cross-validation. A Bayesian optimization technique was adopted to minimize the cross-validation loss. From the simulation results, it was inferred that the highest sensitivity, specificity, and classification accuracy of 94.56%, 89.74%, and 92.15% were achieved, respectively, and found to be better than state-of-the-art approaches. Further, it was observed that the classification accuracy improved from 80.2% with outliers to 92.15% without outliers. The classifier performance metrics ensure the suitability of the proposed multi-features with the optimized SVM classifier. It can be concluded that the proposed approach can be applied for recognition of focal EEG signals to localize epileptogenic zones.

  16. Identification of DNA-Binding Proteins Using Mixed Feature Representation Methods.

    PubMed

    Qu, Kaiyang; Han, Ke; Wu, Song; Wang, Guohua; Wei, Leyi

    2017-09-22

    DNA-binding proteins play vital roles in cellular processes, such as DNA packaging, replication, transcription, regulation, and other DNA-associated activities. The main current prediction methods are based on machine learning, and their accuracy depends chiefly on the feature extraction method. Therefore, using an efficient feature representation method is important for enhancing classification accuracy. However, existing feature representation methods cannot efficiently distinguish DNA-binding proteins from non-DNA-binding proteins. In this paper, a multi-feature representation method, which combines three feature representation methods, namely K-Skip-N-Grams, information theory, and sequential and structural features (SSF), is used to represent the protein sequences and improve feature representation ability. A support vector machine is used as the classifier. The mixed-feature representation method is evaluated using 10-fold cross-validation and a test set. Feature vectors obtained from the combination of the three feature extractions show the best performance in 10-fold cross-validation, both without dimensionality reduction and with dimensionality reduction by max-relevance-max-distance. Moreover, the reduced mixed-feature method performs better than the non-reduced mixed-feature technique. The feature vectors combining SSF and K-Skip-N-Grams show the best performance on the test set. Among these methods, mixed features exhibit superiority over single features.
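
    As an illustration of one of the three representations, a sketch of K-skip-N-grams for N = 2 (residue pairs separated by up to K skipped positions); this shows the general idea, not the paper's exact implementation:

        from collections import Counter

        def k_skip_bigrams(seq, k=2):
            """Count residue pairs (a, b) with 0..k residues skipped between them."""
            grams = Counter()
            for i in range(len(seq)):
                for skip in range(k + 1):
                    j = i + skip + 1
                    if j < len(seq):
                        grams[seq[i] + seq[j]] += 1
            return grams

        print(k_skip_bigrams("MKVLAA", k=2).most_common(3))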

  17. Multivariate Adaptive Regression Splines (Preprint)

    DTIC Science & Technology

    1990-08-01

    fold cross-validation would take about ten times as long, and MARS is not all that fast to begin with. Friedman has a number of examples showing...standardized mean squared error of prediction (MSEP), the generalized cross-validation (GCV), and the number of selected terms (TERMS). In accordance with...and mi = 10 case were almost exclusively spurious cross-product terms and terms involving the nuisance variables x6 through x10. This large number of
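
    For context, the GCV criterion referenced in this record is, in Friedman's standard formulation (our transcription, not taken from this preprint):

        GCV(M) = \frac{\frac{1}{N}\sum_{i=1}^{N}\bigl(y_i - \hat{f}_M(x_i)\bigr)^2}{\bigl(1 - C(M)/N\bigr)^2},
        \qquad C(M) = r + c\,K,

    where N is the sample size, \hat{f}_M is the fitted model with M terms, r is the number of linearly independent basis functions, K is the number of knots, and the penalty constant c is commonly set between 2 and 3.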

  18. GWAS-based machine learning approach to predict duloxetine response in major depressive disorder.

    PubMed

    Maciukiewicz, Malgorzata; Marshe, Victoria S; Hauschild, Anne-Christin; Foster, Jane A; Rotzinger, Susan; Kennedy, James L; Kennedy, Sidney H; Müller, Daniel J; Geraci, Joseph

    2018-04-01

    Major depressive disorder (MDD) is one of the most prevalent psychiatric disorders and is commonly treated with antidepressant drugs. However, large variability is observed in response to antidepressants. Machine learning (ML) models may be useful for predicting treatment outcomes. A sample of 186 MDD patients who received treatment with duloxetine for up to 8 weeks was categorized as "responders" based on a MADRS change >50% from baseline, or "remitters" based on a MADRS score ≤10 at end point. The initial dataset (N = 186) was randomly divided into training and test sets in a nested 5-fold cross-validation, where 80% was used as a training set and 20% made up five independent test sets. We performed genome-wide logistic regression to identify potentially significant variants related to duloxetine response/remission and extracted the most promising predictors using LASSO regression. Subsequently, classification-regression trees (CRT) and support vector machines (SVM) were applied to construct models, using ten-fold cross-validation. With regard to response, none of the pairs performed significantly better than chance (accuracy p > .1). For remission, SVM achieved moderate performance with an accuracy of 0.52, a sensitivity of 0.58, and a specificity of 0.46; CRT achieved 0.51 on all coefficients. The best performing SVM fold was characterized by an accuracy of 0.66 (p = .071), a sensitivity of 0.70, and a specificity of 0.61. In this study, the potential of using GWAS data to predict duloxetine outcomes was examined using ML models. The models were characterized by promising sensitivity, but specificity remained moderate at best. The inclusion of additional non-genetic variables to create integrated models may improve prediction. Copyright © 2017. Published by Elsevier Ltd.
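
    A minimal sketch of the nested design described above (not the study's code): LASSO-style screening feeds an SVM tuned by an inner ten-fold grid search, inside an outer five-fold loop; the predictor matrix is a synthetic stand-in for the GWAS data:

        import numpy as np
        from sklearn.feature_selection import SelectFromModel
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import GridSearchCV, cross_val_score
        from sklearn.pipeline import Pipeline
        from sklearn.svm import SVC

        rng = np.random.default_rng(1)
        X = rng.normal(size=(186, 500))   # hypothetical SNP-derived predictors
        y = rng.integers(0, 2, size=186)  # remission labels (synthetic)

        pipe = Pipeline([
            ("lasso", SelectFromModel(
                LogisticRegression(penalty="l1", solver="liblinear", C=0.1))),
            ("svm", SVC()),
        ])
        inner = GridSearchCV(pipe, {"svm__C": [0.1, 1, 10]}, cv=10)  # inner ten-fold
        outer = cross_val_score(inner, X, y, cv=5)                   # outer five-fold
        print(f"nested-CV accuracy: {outer.mean():.2f}")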

  19. Automatic Detection of Whole Night Snoring Events Using Non-Contact Microphone

    PubMed Central

    Dafna, Eliran; Tarasiuk, Ariel; Zigel, Yaniv

    2013-01-01

    Objective Although awareness of sleep disorders is increasing, limited information is available on whole night detection of snoring. Our study aimed to develop and validate a robust, high performance, and sensitive whole-night snore detector based on non-contact technology. Design Sounds during polysomnography (PSG) were recorded using a directional condenser microphone placed 1 m above the bed. An AdaBoost classifier was trained and validated on manually labeled snoring and non-snoring acoustic events. Patients Sixty-seven subjects (age 52.5±13.5 years, BMI 30.8±4.7 kg/m², m/f 40/27) referred for PSG for obstructive sleep apnea diagnoses were prospectively and consecutively recruited. Twenty-five subjects were used for the design study; the validation study was blindly performed on the remaining forty-two subjects. Measurements and Results To train the proposed sound detector, >76,600 acoustic episodes collected in the design study were manually classified by three scorers into snore and non-snore episodes (e.g., bedding noise, coughing, environmental). A feature selection process was applied to select the most discriminative features extracted from time and spectral domains. The average snore/non-snore detection rate (accuracy) for the design group was 98.4% based on a ten-fold cross-validation technique. When tested on the validation group, the average detection rate was 98.2% with sensitivity of 98.0% (snore as a snore) and specificity of 98.3% (noise as noise). Conclusions Audio-based features extracted from time and spectral domains can accurately discriminate between snore and non-snore acoustic events. This audio analysis approach enables detection and analysis of snoring sounds from a full night in order to produce quantified measures for objective follow-up of patients. PMID:24391903
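
    A sketch of the detector's classification stage only (generic synthetic features stand in for the authors' time- and spectral-domain feature set):

        import numpy as np
        from sklearn.ensemble import AdaBoostClassifier
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(2)
        X = rng.normal(size=(1000, 40))    # acoustic-event features (synthetic)
        y = rng.integers(0, 2, size=1000)  # 1 = snore, 0 = non-snore

        clf = AdaBoostClassifier(n_estimators=200)
        acc = cross_val_score(clf, X, y, cv=10)  # ten-fold CV, as in the paper
        print(f"mean detection rate: {acc.mean():.3f}")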

  20. Automatic detection of whole night snoring events using non-contact microphone.

    PubMed

    Dafna, Eliran; Tarasiuk, Ariel; Zigel, Yaniv

    2013-01-01

    Although awareness of sleep disorders is increasing, limited information is available on whole night detection of snoring. Our study aimed to develop and validate a robust, high performance, and sensitive whole-night snore detector based on non-contact technology. Sounds during polysomnography (PSG) were recorded using a directional condenser microphone placed 1 m above the bed. An AdaBoost classifier was trained and validated on manually labeled snoring and non-snoring acoustic events. Sixty-seven subjects (age 52.5 ± 13.5 years, BMI 30.8 ± 4.7 kg/m², m/f 40/27) referred for PSG for obstructive sleep apnea diagnoses were prospectively and consecutively recruited. Twenty-five subjects were used for the design study; the validation study was blindly performed on the remaining forty-two subjects. To train the proposed sound detector, >76,600 acoustic episodes collected in the design study were manually classified by three scorers into snore and non-snore episodes (e.g., bedding noise, coughing, environmental). A feature selection process was applied to select the most discriminative features extracted from time and spectral domains. The average snore/non-snore detection rate (accuracy) for the design group was 98.4% based on a ten-fold cross-validation technique. When tested on the validation group, the average detection rate was 98.2% with sensitivity of 98.0% (snore as a snore) and specificity of 98.3% (noise as noise). Audio-based features extracted from time and spectral domains can accurately discriminate between snore and non-snore acoustic events. This audio analysis approach enables detection and analysis of snoring sounds from a full night in order to produce quantified measures for objective follow-up of patients.

  1. Cross-Cultural Validation of the Five-Factor Structure of Social Goals: A Filipino Investigation

    ERIC Educational Resources Information Center

    King, Ronnel B.; Watkins, David A.

    2012-01-01

    The aim of the present study was to test the cross-cultural validity of the five-factor structure of social goals that Dowson and McInerney proposed. Both between-network and within-network approaches to construct validation were used, with 1,147 Filipino high school students participating in the study. Confirmatory factor analysis indicated that the…

  2. Factors associated with vocal fold pathologies in teachers.

    PubMed

    Souza, Carla Lima de; Carvalho, Fernando Martins; Araújo, Tânia Maria de; Reis, Eduardo José Farias Borges Dos; Lima, Verônica Maria Cadena; Porto, Lauro Antonio

    2011-10-01

    To analyze factors associated with the prevalence of the medical diagnosis of vocal fold pathologies in teachers. A census-based epidemiological, cross-sectional study was conducted with 4,495 public primary and secondary school teachers in the city of Salvador, Northeastern Brazil, between March and April 2006. The dependent variable was the self-reported medical diagnosis of vocal fold pathologies and the independent variables were sociodemographic characteristics; professional activity; work organization/interpersonal relationships; physical work environment characteristics; frequency of common mental disorders, measured by the Self-Reporting Questionnaire-20 (SRQ-20 >7); and general health conditions. Descriptive statistical, bivariate and multiple logistic regression analysis techniques were used. The prevalence of self-reported medical diagnosis of vocal fold pathologies was 18.9%. In the logistic regression analysis, the variables that remained associated with this medical diagnosis were as follows: being female, having worked as a teacher for more than seven years, excessive voice use, reporting more than five unfavorable physical work environment characteristics and presence of common mental disorders. The presence of self-reported vocal fold pathologies was associated with factors that point to the need for actions promoting teachers' vocal health and changes in the structure and organization of their work.

  3. iSS-PC: Identifying Splicing Sites via Physical-Chemical Properties Using Deep Sparse Auto-Encoder.

    PubMed

    Xu, Zhao-Chun; Wang, Peng; Qiu, Wang-Ren; Xiao, Xuan

    2017-08-15

    Gene splicing is one of the most significant biological processes in eukaryotic gene expression; for example, RNA splicing can cause a pre-mRNA to produce one or more mature messenger RNAs containing the coded information with multiple biological functions. Thus, identifying splicing sites in DNA/RNA sequences is significant for both biomedical research and the discovery of new drugs. However, experimental techniques alone are expensive and time-consuming, so new computational methods are needed. To identify splice donor sites and splice acceptor sites accurately and quickly, a deep sparse auto-encoder model with two hidden layers, called iSS-PC, was constructed based on the minimum error law, in which we incorporated twelve physical-chemical properties of the dinucleotides within DNA into PseDNC to formulate given sequence samples via a battery of cross-covariance and auto-covariance transformations. Five-fold cross-validation test results based on the same benchmark datasets indicated that the new predictor remarkably outperformed the existing prediction methods in this field. Furthermore, it is expected that many other related problems can also be studied by this approach. To implement classification accurately and quickly, an easy-to-use web-server for identifying splicing sites has been established for free access at: http://www.jci-bioinfo.cn/iSS-PC.
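
    As a sketch of the auto-covariance transform used to turn per-position physicochemical profiles into fixed-length features (illustrative property values, not the twelve used by iSS-PC):

        import numpy as np

        def auto_covariance(prop_values, max_lag=3):
            """Auto-covariance of a per-position property profile for lags 1..max_lag."""
            p = np.asarray(prop_values, dtype=float)
            mean, n = p.mean(), len(p)
            return [float(np.mean((p[:n - lag] - mean) * (p[lag:] - mean)))
                    for lag in range(1, max_lag + 1)]

        # property profile over the dinucleotides of a toy sequence (made-up values)
        profile = [0.06, -1.66, 0.04, 0.56, 0.04, -0.32]
        print(auto_covariance(profile))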

  4. Design and Implementation of a Smart Home System Using Multisensor Data Fusion Technology.

    PubMed

    Hsu, Yu-Liang; Chou, Po-Huan; Chang, Hsing-Cheng; Lin, Shyan-Lung; Yang, Shih-Chin; Su, Heng-Yi; Chang, Chih-Chien; Cheng, Yuan-Sheng; Kuo, Yu-Chen

    2017-07-15

    This paper aims to develop a multisensor data fusion technology-based smart home system by integrating wearable intelligent technology, artificial intelligence, and sensor fusion technology. We have developed the following three systems to create an intelligent smart home environment: (1) a wearable motion sensing device to be placed on residents' wrists and its corresponding 3D gesture recognition algorithm to implement a convenient automated household appliance control system; (2) a wearable motion sensing device mounted on a resident's feet and its indoor positioning algorithm to realize an effective indoor pedestrian navigation system for smart energy management; (3) a multisensor circuit module and an intelligent fire detection and alarm algorithm to realize a home safety and fire detection system. In addition, an intelligent monitoring interface is developed to provide real-time information about the smart home system, such as environmental temperatures, CO concentrations, communicative environmental alarms, household appliance status, human motion signals, and the results of gesture recognition and indoor positioning. Furthermore, an experimental testbed for validating the effectiveness and feasibility of the smart home system was built and the system was verified experimentally. The results showed that the 3D gesture recognition algorithm could achieve recognition rates for automated household appliance control of 92.0%, 94.8%, 95.3%, and 87.7% under the 2-fold cross-validation, 5-fold cross-validation, 10-fold cross-validation, and leave-one-subject-out cross-validation strategies. For indoor positioning and smart energy management, the distance accuracy and positioning accuracy were around 0.22% and 3.36% of the total traveled distance in the indoor environment. For home safety and fire detection, the classification rate achieved 98.81% accuracy for determining the conditions of the indoor living environment.
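
    A sketch comparing the four validation strategies named above (synthetic data; 'subjects' is a hypothetical per-sample resident id used for the leave-one-subject-out split):

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import KFold, LeaveOneGroupOut, cross_val_score

        rng = np.random.default_rng(3)
        X = rng.normal(size=(300, 12))            # gesture features (synthetic)
        y = rng.integers(0, 4, size=300)          # four appliance-control gestures
        subjects = rng.integers(0, 10, size=300)  # which resident made each sample

        clf = RandomForestClassifier(n_estimators=100)
        for name, folds in [("2-fold", 2), ("5-fold", 5), ("10-fold", 10)]:
            cv = KFold(folds, shuffle=True, random_state=0)
            print(name, cross_val_score(clf, X, y, cv=cv).mean())
        loso = cross_val_score(clf, X, y, groups=subjects, cv=LeaveOneGroupOut())
        print("leave-one-subject-out", loso.mean())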

  5. Design and Implementation of a Smart Home System Using Multisensor Data Fusion Technology

    PubMed Central

    Chou, Po-Huan; Chang, Hsing-Cheng; Lin, Shyan-Lung; Yang, Shih-Chin; Su, Heng-Yi; Chang, Chih-Chien; Cheng, Yuan-Sheng; Kuo, Yu-Chen

    2017-01-01

    This paper aims to develop a multisensor data fusion technology-based smart home system by integrating wearable intelligent technology, artificial intelligence, and sensor fusion technology. We have developed the following three systems to create an intelligent smart home environment: (1) a wearable motion sensing device to be placed on residents’ wrists and its corresponding 3D gesture recognition algorithm to implement a convenient automated household appliance control system; (2) a wearable motion sensing device mounted on a resident’s feet and its indoor positioning algorithm to realize an effective indoor pedestrian navigation system for smart energy management; (3) a multisensor circuit module and an intelligent fire detection and alarm algorithm to realize a home safety and fire detection system. In addition, an intelligent monitoring interface is developed to provide real-time information about the smart home system, such as environmental temperatures, CO concentrations, communicative environmental alarms, household appliance status, human motion signals, and the results of gesture recognition and indoor positioning. Furthermore, an experimental testbed for validating the effectiveness and feasibility of the smart home system was built and the system was verified experimentally. The results showed that the 3D gesture recognition algorithm could achieve recognition rates for automated household appliance control of 92.0%, 94.8%, 95.3%, and 87.7% under the 2-fold cross-validation, 5-fold cross-validation, 10-fold cross-validation, and leave-one-subject-out cross-validation strategies. For indoor positioning and smart energy management, the distance accuracy and positioning accuracy were around 0.22% and 3.36% of the total traveled distance in the indoor environment. For home safety and fire detection, the classification rate achieved 98.81% accuracy for determining the conditions of the indoor living environment. PMID:28714884

  6. Automatic Brain Tumor Detection in T2-weighted Magnetic Resonance Images

    NASA Astrophysics Data System (ADS)

    Dvořák, P.; Kropatsch, W. G.; Bartušek, K.

    2013-10-01

    This work focuses on fully automatic detection of brain tumors. The first aim is to determine whether the image contains a brain with a tumor and, if it does, to localize it. The goal of this work is not the exact segmentation of tumors, but the localization of their approximate position. The test database contains 203 T2-weighted images, of which 131 are images of healthy brains and the remaining 72 contain brains with a pathological area. Both the detection of an afflicted brain and the localization of the pathological area are performed by multi-resolution symmetry analysis. The first goal was tested by a five-fold cross-validation technique with 100 repetitions to avoid dependency of the results on sample order. This part of the proposed method reaches a true positive rate of 87.52% and a true negative rate of 93.14% for afflicted-brain detection. The second part of the algorithm was evaluated by comparing the estimated location to the true tumor location, reaching 95.83% correct anomaly detection and 87.5% correct tumor location.
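
    A sketch of the repeated five-fold protocol (100 repetitions, reporting TPR and TNR); the arrays are synthetic stand-ins for the 203-image database:

        import numpy as np
        from sklearn.metrics import make_scorer, recall_score
        from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
        from sklearn.svm import SVC

        rng = np.random.default_rng(4)
        X = rng.normal(size=(203, 10))      # symmetry-analysis features (synthetic)
        y = np.array([1] * 72 + [0] * 131)  # 72 afflicted, 131 healthy

        cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=100, random_state=0)
        scoring = {"tpr": make_scorer(recall_score, pos_label=1),
                   "tnr": make_scorer(recall_score, pos_label=0)}
        res = cross_validate(SVC(), X, y, cv=cv, scoring=scoring)
        print(res["test_tpr"].mean(), res["test_tnr"].mean())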

  7. Identification of protein-interacting nucleotides in a RNA sequence using composition profile of tri-nucleotides.

    PubMed

    Panwar, Bharat; Raghava, Gajendra P S

    2015-04-01

    RNA-protein interactions play diverse roles in the cell; thus, identification of the RNA-protein interface is essential for biologists to understand their function. In the past, several methods have been developed for predicting RNA-interacting residues in proteins, but limited efforts have been made for the identification of protein-interacting nucleotides in RNAs. In order to discriminate protein-interacting and non-interacting nucleotides, we used various classifiers (NaiveBayes, NaiveBayesMultinomial, BayesNet, ComplementNaiveBayes, MultilayerPerceptron, J48, SMO, RandomForest, and SVM(light)) to develop prediction models using various features, and achieved the highest 83.92% sensitivity, 84.82% specificity, 84.62% accuracy and 0.62 Matthews correlation coefficient with SVM(light)-based models. We observed that certain tri-nucleotides, such as ACA, ACC, AGA, CAC, CCA, GAG, UGA, and UUU, are preferred in protein interactions. All the models were developed using a non-redundant dataset and evaluated using the five-fold cross-validation technique. A web-server called RNApin has been developed for the scientific community (http://crdd.osdd.net/raghava/rnapin/). Copyright © 2015 Elsevier Inc. All rights reserved.
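
    A sketch of the tri-nucleotide composition profile (a 64-dimensional frequency vector) on which such predictors are typically built; this shows the general idea, not the RNApin implementation:

        from itertools import product

        def trinucleotide_composition(seq):
            """Frequencies of all 64 tri-nucleotides in an RNA sequence."""
            seq = seq.upper().replace("T", "U")
            kmers = ["".join(p) for p in product("ACGU", repeat=3)]
            counts = {k: 0 for k in kmers}
            for i in range(len(seq) - 2):
                k = seq[i:i + 3]
                if k in counts:
                    counts[k] += 1
            total = max(1, len(seq) - 2)
            return [counts[k] / total for k in kmers]

        vec = trinucleotide_composition("ACACCAGAGUGAUUU")
        print(len(vec), vec[:4])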

  8. Hybrid fusion of linear, non-linear and spectral models for the dynamic modeling of sEMG and skeletal muscle force: an application to upper extremity amputation.

    PubMed

    Potluri, Chandrasekhar; Anugolu, Madhavi; Schoen, Marco P; Subbaram Naidu, D; Urfer, Alex; Chiu, Steve

    2013-11-01

    Estimating skeletal muscle (finger) forces using surface electromyography (sEMG) signals poses many challenges. In general, sEMG measurements are based on single-sensor data. In this paper, two novel hybrid fusion techniques for estimating skeletal muscle force from sEMG array sensors are proposed. The sEMG signals are pre-processed using five different filters: Butterworth, Chebychev Type II, Exponential, Half-Gaussian, and Wavelet transforms. Dynamic models are extracted from the acquired data using system identification techniques based on Nonlinear Wiener Hammerstein (NLWH) models and Spectral Analysis Frequency Dependent Resolution (SPAFDR) models. A detailed comparison is provided for the proposed filters and models using 18 healthy subjects. Wavelet transforms give higher mean correlations of 72.6 ± 1.7 (mean ± SD) and 70.4 ± 1.5 (mean ± SD) for NLWH and SPAFDR models, respectively, compared to the other filters used in this work. Experimental verification of the fusion-based hybrid models with the wavelet transform shows a 96% mean correlation and a 3.9% mean relative error (standard deviations ± 1.3 and ± 0.9, respectively) between the force estimated by the overall hybrid fusion algorithm and the actual force, for the k-fold cross-validation data of the 18 test subjects. © 2013 Elsevier Ltd. All rights reserved.

  9. Estimates of Commercial Motor Vehicles Using the Southwest Border Crossings

    DOT National Transportation Integrated Search

    2000-09-20

    The United States has experienced almost a five-fold increase in commercial motor vehicle traffic to and from Mexico during the past sixteen years. There were more than 4 million commercial motor vehicle (CMV) crossings from Mexico into the United States…

  10. Predicting the Operational Acceptability of Route Advisories

    NASA Technical Reports Server (NTRS)

    Evans, Antony; Lee, Paul

    2017-01-01

    NASA envisions a future Air Traffic Management system that allows safe, efficient growth in global operations, enabled by increasing levels of automation and autonomy. In a safety-critical system, the introduction of increasing automation and autonomy has to be done in stages, making human-system integrated concepts critical for the foreseeable future. One example where this is relevant is for tools that generate more efficient flight routings or reroute advisories. If these routes are not operationally acceptable, they will be rejected by human operators, and the associated benefits will not be realized. Operational acceptance is therefore required to enable the increased efficiency and reduced workload benefits associated with these tools. In this paper, the authors develop a predictor of operational acceptability for reroute advisories. Such a capability has applications in tools that identify more efficient routings around weather and congestion and that better meet airline preferences. The capability is based on applying data mining techniques to flight plan amendment data reported by the Federal Aviation Administration and data on requested reroutes collected from a field trial of the NASA-developed Dynamic Weather Routes tool, which advised efficient route changes to American Airlines dispatchers in 2014. Ten-fold cross-validation was used for feature, model, and parameter selection, while nested cross-validation was used to validate the model. The model performed well in predicting controller acceptance or rejection of a route change, as indicated by the chosen performance metrics. Features identified as relevant to controller acceptance included the historical usage of the advised route, the location of the maneuver start point relative to the boundaries of the airspace sector containing the maneuver start (the maneuver start sector), the reroute deviation from the original flight plan, and the demand level in the maneuver start sector. A random forest with forty trees was the best performing of the five models evaluated in this paper.
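
    A sketch (not NASA's implementation) of ranking such features with a forty-tree random forest; the feature names below are paraphrases of the relevant features listed in the abstract, and the data are synthetic:

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        rng = np.random.default_rng(5)
        feature_names = ["historical_route_usage", "start_point_vs_sector_boundary",
                         "deviation_from_flight_plan", "start_sector_demand"]
        X = rng.normal(size=(500, len(feature_names)))
        y = rng.integers(0, 2, size=500)  # 1 = accepted, 0 = rejected (synthetic)

        rf = RandomForestClassifier(n_estimators=40).fit(X, y)
        for name, imp in sorted(zip(feature_names, rf.feature_importances_),
                                key=lambda t: -t[1]):
            print(f"{name}: {imp:.3f}")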

  11. A photo-cross-linking approach to monitor folding and assembly of newly synthesized proteins in a living cell.

    PubMed

    Miyazaki, Ryoji; Myougo, Naomi; Mori, Hiroyuki; Akiyama, Yoshinori

    2018-01-12

    Many proteins form multimeric complexes that play crucial roles in various cellular processes. Studying how proteins are correctly folded and assembled into such complexes in a living cell is important for understanding the physiological roles and the qualitative and quantitative regulation of the complex. However, few methods are suitable for analyzing these rapidly occurring processes. Site-directed in vivo photo-cross-linking is an elegant technique that enables analysis of protein-protein interactions in living cells with high spatial resolution. However, the conventional site-directed in vivo photo-cross-linking method is unsuitable for analyzing dynamic processes. Here, by combining an improved site-directed in vivo photo-cross-linking technique with a pulse-chase approach, we developed a new method that can analyze the folding and assembly of a newly synthesized protein with high spatiotemporal resolution. We demonstrate that this method, named the pulse-chase and in vivo photo-cross-linking experiment (PiXie), enables the kinetic analysis of the formation of an Escherichia coli periplasmic (soluble) protein complex (PhoA). We also used our new technique to investigate assembly/folding processes of two membrane complexes (SecD-SecF in the inner membrane and LptD-LptE in the outer membrane), which provided new insights into the biogenesis of these complexes. Our PiXie method permits analysis of the dynamic behavior of various proteins and enables examination of protein-protein interactions at the level of individual amino acid residues. We anticipate that our new technique will have valuable utility for studies of protein dynamics in many organisms. © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.

  12. Computer-aided detection of prostate cancer in T2-weighted MRI within the peripheral zone

    NASA Astrophysics Data System (ADS)

    Rampun, Andrik; Zheng, Ling; Malcolm, Paul; Tiddeman, Bernie; Zwiggelaar, Reyer

    2016-07-01

    In this paper we propose a prostate cancer computer-aided diagnosis (CAD) system and suggest a set of discriminant texture descriptors extracted from T2-weighted MRI data which can be used as a good basis for a multimodality system. For this purpose, 215 texture descriptors were extracted and eleven different classifiers were employed to achieve the best possible results. The proposed method was tested on 418 T2-weighted MR images taken from 45 patients and evaluated using 9-fold cross-validation with five patients in each fold. The results were comparable to those of existing CAD systems using multimodality MRI. We achieved area under the receiver operating characteristic curve (Az) values of 90.0% ± 7.6%, 89.5% ± 8.9%, 87.9% ± 9.3%, and 87.4% ± 9.2% for Bayesian networks, ADTree, random forest, and multilayer perceptron classifiers, respectively, while a meta-voting classifier using average probability as a combination rule achieved 92.7% ± 7.4%.
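
    A sketch of the patient-wise 9-fold split (all images of a patient stay in one fold, five patients per fold), assuming hypothetical arrays of image descriptors and patient ids:

        import numpy as np
        from sklearn.model_selection import GroupKFold

        rng = np.random.default_rng(6)
        n_images, n_patients = 418, 45
        patient_of_image = rng.integers(0, n_patients, size=n_images)
        X = rng.normal(size=(n_images, 215))  # 215 texture descriptors per image
        y = rng.integers(0, 2, size=n_images)

        gkf = GroupKFold(n_splits=9)          # 45 patients / 9 folds = 5 per fold
        for train_idx, test_idx in gkf.split(X, y, groups=patient_of_image):
            # no patient contributes images to both sides of a split
            assert set(patient_of_image[train_idx]).isdisjoint(
                patient_of_image[test_idx])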

  13. Engagement techniques and playing level impact the biomechanical demands on rugby forwards during machine-based scrummaging.

    PubMed

    Preatoni, Ezio; Stokes, Keith A; England, Michael E; Trewartha, Grant

    2015-04-01

    This cross-sectional study investigated the factors that may influence the physical loading on rugby forwards performing a scrum by studying the biomechanics of machine-based scrummaging under different engagement techniques and playing levels. Thirty-four forward packs from six playing levels performed repetitions of five different types of engagement techniques against an instrumented scrum machine under realistic training conditions. Applied forces and body movements were recorded in three orthogonal directions. The modification of the engagement technique altered the load acting on players. These changes were in a similar direction and of similar magnitude irrespective of the playing level. Reducing the dynamics of the initial engagement through a fold-in procedure decreased the peak compression force, the peak downward force and the engagement speed in excess of 30%. For example, peak compression (horizontal) forces in the professional teams changed from 16.5 (baseline technique) to 8.6 kN (fold-in procedure). The fold-in technique also reduced the occurrence of combined high forces and head-trunk misalignment during the absorption of the impact, which was used as a measure of potential hazard, by more than 30%. Reducing the initial impact did not decrease the ability of the teams to produce sustained compression forces. De-emphasising the initial impact against the scrum machine decreased the mechanical stresses acting on forward players and may benefit players' welfare by reducing the hazard factors that may induce chronic degeneration of the spine. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  14. O-GlcNAcPRED-II: an integrated classification algorithm for identifying O-GlcNAcylation sites based on fuzzy undersampling and a K-means PCA oversampling technique.

    PubMed

    Jia, Cangzhi; Zuo, Yun; Zou, Quan; Hancock, John

    2018-02-06

    Protein O-GlcNAcylation (O-GlcNAc) is an important post-translational modification of serine (S)/threonine (T) residues that involves multiple molecular and cellular processes. Recent studies have suggested that abnormal O-GlcNAcylation causes many diseases, such as cancer and various neurodegenerative diseases. With protein O-GlcNAcylation sites being experimentally verified, it is highly desirable to develop automated methods to rapidly and effectively identify O-GlcNAcylation sites. Although some computational methods have been proposed, their performance has been unsatisfactory, particularly in terms of prediction sensitivity. In this study, we developed an ensemble model, O-GlcNAcPRED-II, to identify potential O-GlcNAcylation sites. A K-means principal component analysis oversampling technique (KPCA) and a fuzzy undersampling method (FUS) were first proposed and incorporated to reduce the proportion of the original positive and negative training samples. Then, rotation forest, a type of ensemble classifier system, was adopted to divide the eight types of feature space into several subsets using four sub-classifiers: random forest, k-nearest neighbour, naive Bayesian and support vector machine. We observed that O-GlcNAcPRED-II achieved a sensitivity of 81.05%, specificity of 95.91%, accuracy of 91.43% and a Matthews correlation coefficient of 0.7928 for five-fold cross-validation run 10 times. Additionally, the results obtained by O-GlcNAcPRED-II on two independent datasets also indicated that the proposed predictor outperformed five published prediction tools. http://121.42.167.206/OGlcPred/. cangzhijia@dlmu.edu.cn or zouquan@nclab.net. © The Author (2018). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
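
    A simplified sketch of K-means-based undersampling of the majority class (the paper's FUS/KPCA procedure is more involved; this shows only the core idea of replacing many negatives with cluster representatives):

        import numpy as np
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(7)
        X_neg = rng.normal(size=(5000, 20))  # abundant negative (non-site) samples
        n_keep = 500                         # target size after undersampling

        km = KMeans(n_clusters=n_keep, n_init=3, random_state=0).fit(X_neg)
        reps = []
        for c in range(n_keep):
            members = np.where(km.labels_ == c)[0]
            d = np.linalg.norm(X_neg[members] - km.cluster_centers_[c], axis=1)
            reps.append(members[np.argmin(d)])   # sample nearest each centre
        X_neg_reduced = X_neg[np.array(reps)]
        print(X_neg_reduced.shape)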

  15. Using deep learning for detecting gender in adult chest radiographs

    NASA Astrophysics Data System (ADS)

    Xue, Zhiyun; Antani, Sameer; Long, L. Rodney; Thoma, George R.

    2018-03-01

    In this paper, we present a method for automatically identifying the gender of an imaged person from their frontal chest x-ray images. Our work is motivated by the need to determine missing gender information in some datasets. The proposed method employs convolutional neural network (CNN)-based deep learning and transfer learning to overcome the challenge of developing handcrafted features with limited data. Specifically, the method consists of four main steps: pre-processing, CNN feature extraction, feature selection, and classification. The method is tested on a combined dataset obtained from several sources with varying acquisition quality, so different pre-processing steps are applied to each. For feature extraction, we tested and compared four CNN architectures, viz., AlexNet, VggNet, GoogLeNet, and ResNet. We applied a feature selection technique, since the feature length is larger than the number of images. Two popular classifiers, SVM and Random Forest, are used and compared. We evaluated classification performance by cross-validation and used seven performance measures. The best performer is the VggNet-16 feature extractor with the SVM classifier, with an accuracy of 86.6% and an ROC area of 0.932 for 5-fold cross-validation. We also discuss several misclassified cases and describe future work for performance improvement.
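
    A sketch of the CNN-feature transfer step (assumptions: torchvision's pre-trained VGG-16 as a stand-in for the paper's VggNet-16 extractor, a dummy image batch, and illustrative labels; the weights download on first use):

        import torch
        import torchvision
        from sklearn.svm import SVC

        weights = torchvision.models.VGG16_Weights.DEFAULT
        vgg = torchvision.models.vgg16(weights=weights)
        vgg.classifier = vgg.classifier[:-1]  # drop final layer -> 4096-d features
        vgg.eval()

        images = torch.randn(8, 3, 224, 224)  # stand-in for pre-processed x-rays
        with torch.no_grad():
            feats = vgg(images).numpy()       # (8, 4096) feature matrix

        labels = [0, 1, 0, 1, 0, 1, 0, 1]     # gender labels (illustrative)
        clf = SVC().fit(feats, labels)        # SVM on the deep features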

  16. Analysis of a crossed Bragg cell acousto-optical spectrometer for SETI

    NASA Technical Reports Server (NTRS)

    Gulkis, S.

    1989-01-01

    The search for radio signals from extraterrestrial intelligent beings (SETI) requires the use of large instantaneous bandwidth (500 MHz) and high resolution (20 Hz) spectrometers. Digital systems with a high degree of modularity can be used to provide this capability, and this method has been widely discussed. Another technique for meeting the SETI requirement is to use a crossed Bragg cell spectrometer as described by Psaltis and Casasent. This technique makes use of the Folded Spectrum concept, introduced by Thomas. The Folded Spectrum is a 2-D Fourier Transform of a raster-scanned 1-D signal. It is directly related to the long 1-D spectrum of the original signal and is ideally suited for optical signal processing. The folded spectrum technique has received little attention to date, primarily because early systems made use of photographic film, which is unsuitable for the real-time data analysis and voluminous data requirements of SETI. An analysis of the crossed Bragg cell spectrometer is presented as a method to achieve the spectral processing requirements for SETI. Systematic noise contributions unique to the Bragg cell system are discussed.
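
    A sketch of the folded-spectrum idea: raster-scan a long 1-D signal into rows and take a 2-D FFT, so coarse frequency appears along one axis and fine frequency along the other (illustrative parameters, not a Bragg cell simulation):

        import numpy as np

        fs = 1.0e6                    # sample rate in Hz (illustrative)
        n_rows, row_len = 256, 1024   # raster: 256 rows of 1024 samples
        t = np.arange(n_rows * row_len) / fs
        signal = np.sin(2 * np.pi * 123456.7 * t)    # a narrowband tone

        raster = signal.reshape(n_rows, row_len)     # raster-scanned 1-D signal
        folded = np.fft.fftshift(np.abs(np.fft.fft2(raster)))
        peak = np.unravel_index(np.argmax(folded), folded.shape)
        print("peak cell (coarse, fine):", peak)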

  17. Analysis of a crossed Bragg cell acousto-optical spectrometer for SETI.

    PubMed

    Gulkis, S

    1989-01-01

    The search for radio signals from extraterrestrial intelligent beings (SETI) requires the use of large instantaneous bandwidth (500 MHz) and high resolution (20 Hz) spectrometers. Digital systems with a high degree of modularity can be used to provide this capability, and this method has been widely discussed. Another technique for meeting the SETI requirement is to use a crossed Bragg cell spectrometer as described by Psaltis and Casasent. This technique makes use of the Folded Spectrum concept, introduced by Thomas. The Folded Spectrum is a 2-D Fourier Transform of a raster-scanned 1-D signal. It is directly related to the long 1-D spectrum of the original signal and is ideally suited for optical signal processing. The folded spectrum technique has received little attention to date, primarily because early systems made use of photographic film, which is unsuitable for the real-time data analysis and voluminous data requirements of SETI. An analysis of the crossed Bragg cell spectrometer is presented as a method to achieve the spectral processing requirements for SETI. Systematic noise contributions unique to the Bragg cell system are discussed.

  18. Analysis of a crossed Bragg cell acousto-optical spectrometer for SETI

    NASA Astrophysics Data System (ADS)

    Gulkis, Samuel

    The search for radio signals from extraterrestrial intelligent beings (SETI) requires the use of large instantaneous bandwidth (500 MHz) and high resolution (20 Hz) spectrometers. Digital systems with a high degree of modularity can be used to provide this capability, and this method has been widely discussed. Another technique for meeting the SETI requirement is to use a crossed Bragg cell spectrometer as described by Psaltis and Casasent. This technique makes use of the Folded Spectrum concept, introduced by Thomas. The Folded Spectrum is a 2-D Fourier Transform of a raster-scanned 1-D signal. It is directly related to the long 1-D spectrum of the original signal and is ideally suited for optical signal processing. The folded spectrum technique has received little attention to date, primarily because early systems made use of photographic film, which is unsuitable for the real-time data analysis and voluminous data requirements of SETI. An analysis of the crossed Bragg cell spectrometer is presented as a method to achieve the spectral processing requirements for SETI. Systematic noise contributions unique to the Bragg cell system are discussed.

  19. A Real-Time Earthquake Precursor Detection Technique Using TEC from a GPS Network

    NASA Astrophysics Data System (ADS)

    Alp Akyol, Ali; Arikan, Feza; Arikan, Orhan

    2016-07-01

    Anomalies have been observed in the ionospheric electron density distribution prior to strong earthquakes. However, most of the reported results were obtained by retrospective analysis after the earthquakes occurred, so their implementation in practice is highly problematic. Recently, a novel earthquake precursor detection technique based on spatio-temporal analysis of Total Electron Content (TEC) data obtained from the Turkish National Permanent GPS Network (TNPGN) was developed by the IONOLAB group (www.ionolab.org). In the present study, the detection technique is implemented in a causal setup over the available data set in a test phase, which enables real-time implementation. The performance of the developed earthquake prediction technique is evaluated using 10-fold cross-validation over the data obtained in 2011. Among the 23 earthquakes with magnitudes higher than 5, the developed technique detected precursors of 14 earthquakes while producing 8 false alarms. This study is supported by TUBITAK 115E915 and joint TUBITAK 114E092 and AS CR 14/001 projects.

  20. Medical application of artificial immune recognition system (AIRS): diagnosis of atherosclerosis from carotid artery Doppler signals.

    PubMed

    Latifoğlu, Fatma; Kodaz, Halife; Kara, Sadik; Güneş, Salih

    2007-08-01

    This study was conducted to distinguish between subjects with atherosclerosis and healthy subjects. To this end, we employed the maximum envelope of the carotid artery Doppler sonograms, derived using the Fast Fourier Transform-Welch method, and the Artificial Immune Recognition System (AIRS). The fuzzy appearance of carotid artery Doppler signals makes physicians suspicious about the existence of disease and sometimes causes false diagnoses. Our technique addresses this problem by using AIRS to assist the physician in making the final judgment with confidence. AIRS reached 99.29% classification accuracy using 10-fold cross-validation. The results show that the proposed method classified Doppler signals successfully.

  1. A new class of compact high sensitive tiltmeter based on the UNISA folded pendulum mechanical architecture

    NASA Astrophysics Data System (ADS)

    Barone, Fabrizio; Giordano, Gerardo

    2018-02-01

    We present the Extended Folded Pendulum Model (EFPM), a model developed for a quantitative description of the dynamical behavior of a folded pendulum generically oriented in space. This model, based on the Tait-Bryan angular reference system, highlights the relationship between the folded pendulum's orientation in the gravitational field and its natural resonance frequency. The model, validated by tests performed with a monolithic UNISA folded pendulum, underpins a new implementation technique for folded-pendulum-based tiltmeters.

  2. Can Statistical Machine Learning Algorithms Help for Classification of Obstructive Sleep Apnea Severity to Optimal Utilization of Polysomnography Resources?

    PubMed

    Bozkurt, Selen; Bostanci, Asli; Turhan, Murat

    2017-08-11

    The goal of this study is to evaluate the results of machine learning methods for the classification of OSA severity in patients with suspected sleep-disordered breathing as normal, mild, moderate, or severe, based on non-polysomnographic variables: 1) clinical data, 2) symptoms and 3) physical examination. In order to produce classification models for OSA severity, five different machine learning methods (Bayesian network, Decision Tree, Random Forest, Neural Networks and Logistic Regression) were trained, while relevant variables and their relationships were derived empirically from observed data. Each model was trained and evaluated using 10-fold cross-validation, and to evaluate the classification performance of all methods, the true positive rate (TPR), false positive rate (FPR), Positive Predictive Value (PPV), F measure and Area Under the Receiver Operating Characteristics curve (ROC-AUC) were used. Results of 10-fold cross-validated tests with different variable settings promisingly indicated that the OSA severity of suspected OSA patients can be classified using non-polysomnographic features, with a highest true positive rate of 0.71 and a lowest false positive rate of 0.15. Moreover, the test results for different variable settings revealed that the accuracy of the classification models was significantly improved when physical examination variables were added to the model. Study results showed that machine learning methods can be used to estimate the probabilities of no, mild, moderate, and severe obstructive sleep apnea, and such approaches may improve accurate initial OSA screening and help refer only patients with suspected moderate or severe OSA to sleep laboratories for the expensive tests.
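
    A sketch of the multi-metric ten-fold evaluation (TPR, FPR, PPV, F-measure, ROC-AUC) on synthetic stand-in data; the custom FPR scorer is our addition (lower is better):

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import confusion_matrix, make_scorer
        from sklearn.model_selection import cross_validate

        def fpr_score(y_true, y_pred):
            tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
            return fp / (fp + tn)

        rng = np.random.default_rng(8)
        X = rng.normal(size=(400, 15))
        y = rng.integers(0, 2, size=400)

        scoring = {"tpr": "recall", "ppv": "precision", "f": "f1",
                   "auc": "roc_auc", "fpr": make_scorer(fpr_score)}
        res = cross_validate(LogisticRegression(), X, y, cv=10, scoring=scoring)
        print({k: round(v.mean(), 3) for k, v in res.items()
               if k.startswith("test_")})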

  3. An intercomparison of a large ensemble of statistical downscaling methods for Europe: Overall results from the VALUE perfect predictor cross-validation experiment

    NASA Astrophysics Data System (ADS)

    Gutiérrez, Jose Manuel; Maraun, Douglas; Widmann, Martin; Huth, Radan; Hertig, Elke; Benestad, Rasmus; Roessler, Ole; Wibig, Joanna; Wilcke, Renate; Kotlarski, Sven

    2016-04-01

    VALUE is an open European network to validate and compare downscaling methods for climate change research (http://www.value-cost.eu). A key deliverable of VALUE is the development of a systematic validation framework to enable the assessment and comparison of both dynamical and statistical downscaling methods. This framework is based on a user-focused validation tree, guiding the selection of relevant validation indices and performance measures for different aspects of the validation (marginal, temporal, spatial, multi-variable). Moreover, several experiments have been designed to isolate specific points in the downscaling procedure where problems may occur (assessment of intrinsic performance, effect of errors inherited from the global models, effect of non-stationarity, etc.). The list of downscaling experiments includes 1) cross-validation with perfect predictors, 2) GCM predictors (aligned with the EURO-CORDEX experiment) and 3) pseudo-reality predictors (see Maraun et al. 2015, Earth's Future, 3, doi:10.1002/2014EF000259, for more details). The results of these experiments are gathered, validated and publicly distributed through the VALUE validation portal, allowing for a comprehensive community-open downscaling intercomparison study. In this contribution we describe the overall results from Experiment 1), consisting of a Europe-wide 5-fold cross-validation (with consecutive 6-year periods from 1979 to 2008) using predictors from ERA-Interim to downscale precipitation and temperatures (minimum and maximum) over a set of 86 ECA&D stations representative of the main geographical and climatic regions in Europe. As a result of the open call for contributions to this experiment (closed in Dec. 2015), over 40 methods representative of the main approaches (MOS and Perfect Prognosis, PP) and techniques (linear scaling, quantile mapping, analogs, weather typing, linear and generalized regression, weather generators, etc.) were submitted, including both data (downscaled values) and metadata (characterizing different aspects of the downscaling methods). This constitutes the largest and most comprehensive intercomparison of statistical downscaling methods to date. Here, we present an overall validation, analyzing marginal and temporal aspects to assess the intrinsic performance and added value of statistical downscaling methods at both annual and seasonal levels. This validation takes into account the different properties/limitations of different approaches and techniques (as reported in the provided metadata) in order to perform a fair comparison. It is pointed out that this experiment alone is not sufficient to evaluate the limitations of (MOS) bias correction techniques. Moreover, it also does not fully validate PP, since we do not learn whether we have the right predictors and whether the PP assumption is valid. These problems will be analyzed in the subsequent community-open VALUE experiments 2) and 3), which will be open for participation during the present year.
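
    A sketch of the blocked five-fold design used in Experiment 1: consecutive 6-year test periods from 1979-2008, with no shuffling, so the temporal structure is preserved:

        import numpy as np

        years = np.arange(1979, 2009)   # 30 years
        folds = years.reshape(5, 6)     # five consecutive 6-year blocks
        for k, test_years in enumerate(folds, 1):
            train_years = np.setdiff1d(years, test_years)
            print(f"fold {k}: test {test_years[0]}-{test_years[-1]}, "
                  f"train on the remaining {len(train_years)} years")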

  4. Validation of bioelectrical impedance analysis for total body water assessment against the deuterium dilution technique in Asian children.

    PubMed

    Liu, A; Byrne, N M; Ma, G; Nasreddine, L; Trinidad, T P; Kijboonchoo, K; Ismail, M N; Kagawa, M; Poh, B K; Hills, A P

    2011-12-01

    To develop and cross-validate bioelectrical impedance analysis (BIA) prediction equations of total body water (TBW) and fat-free mass (FFM) for Asian pre-pubertal children from China, Lebanon, Malaysia, the Philippines and Thailand. Height, weight, age, gender, resistance and reactance measured by BIA were collected from 948 Asian children (492 boys and 456 girls) aged 8-10 years from the five countries. The deuterium dilution technique was used as the criterion method for the estimation of TBW and FFM. The BIA equations were developed using stepwise multiple regression analysis and cross-validated using the Bland-Altman approach. The BIA prediction equation for the estimation of TBW was: TBW = 0.231 × height²/resistance + 0.066 × height + 0.188 × weight + 0.128 × age + 0.500 × sex − 0.316 × ethnicity (Thai = 1, others = 0) − 4.574 (R² = 88.0%, root mean square error (RMSE) = 1.3 kg), and for the estimation of FFM: FFM = 0.299 × height²/resistance + 0.086 × height + 0.245 × weight + 0.260 × age + 0.901 × sex − 0.415 × ethnicity (Thai = 1, others = 0) − 6.952 (R² = 88.3%, RMSE = 1.7 kg). No significant difference between measured and predicted values was found for the whole cross-validation sample. However, the prediction equations tended to overestimate TBW/FFM at lower levels and underestimate them at higher levels. The general equations for TBW and FFM were also accurate within each body mass index category. Ethnicity influences the relationship between BIA and body composition in Asian pre-pubertal children. The newly developed BIA prediction equations are valid for use in Asian pre-pubertal children.
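
    The published TBW equation, transcribed as a function (coefficients are from the abstract; the unit conventions, height in cm, resistance in ohms, weight in kg, age in years, sex coded 1 = boy / 0 = girl, are our assumption):

        def predict_tbw(height, resistance, weight, age, sex, thai):
            """TBW (kg) from the abstract's BIA regression equation."""
            return (0.231 * height**2 / resistance + 0.066 * height
                    + 0.188 * weight + 0.128 * age + 0.500 * sex
                    - 0.316 * thai - 4.574)

        # an illustrative 9-year-old non-Thai boy: 135 cm, 30 kg, 700 ohms
        print(round(predict_tbw(135, 700, 30, 9, 1, 0), 1), "kg")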

  5. Micro-Doppler Based Classification of Human Aquatic Activities via Transfer Learning of Convolutional Neural Networks.

    PubMed

    Park, Jinhee; Javier, Rios Jesus; Moon, Taesup; Kim, Youngwook

    2016-11-24

    Accurate classification of human aquatic activities using radar has a variety of potential applications such as rescue operations and border patrols. Nevertheless, the classification of activities on water using radar has not been extensively studied, unlike the case on dry ground, due to its unique challenge. Namely, not only is the radar cross section of a human on water small, but the micro-Doppler signatures are much noisier due to water drops and waves. In this paper, we first investigate whether discriminative signatures could be obtained for activities on water through a simulation study. Then, we show how we can effectively achieve high classification accuracy by applying deep convolutional neural networks (DCNN) directly to the spectrogram of real measurement data. From the five-fold cross-validation on our dataset, which consists of five aquatic activities, we report that the conventional feature-based scheme only achieves an accuracy of 45.1%. In contrast, the DCNN trained using only the collected data attains 66.7%, and the transfer learned DCNN, which takes a DCNN pre-trained on a RGB image dataset and fine-tunes the parameters using the collected data, achieves a much higher 80.3%, which is a significant performance boost.

  6. Neural network and SVM classifiers accurately predict lipid binding proteins, irrespective of sequence homology.

    PubMed

    Bakhtiarizadeh, Mohammad Reza; Moradi-Shahrbabak, Mohammad; Ebrahimi, Mansour; Ebrahimie, Esmaeil

    2014-09-07

    Due to the central roles of lipid binding proteins (LBPs) in many biological processes, sequence-based identification of LBPs is of great interest. The major challenge is that LBPs are diverse in sequence, structure, and function, which results in the low accuracy of sequence homology-based methods. Therefore, there is a need for developing alternative functional prediction methods that do not depend on sequence similarity. To identify LBPs from non-LBPs, the performances of support vector machine (SVM) and neural network classifiers were compared in this study. Comprehensive protein features and various techniques were employed to create the datasets. Five-fold cross-validation (CV) and independent evaluation (IE) tests were used to assess the validity of the two methods. The results indicated that SVM outperforms the neural network. SVM achieved 89.28% (CV) and 89.55% (IE) overall accuracy in the identification of LBPs from non-LBPs, and 92.06% (CV) and 92.90% (IE) accuracy on average for the classification of different LBP classes. Increasing the number and range of extracted protein features, as well as optimization of the SVM parameters, significantly increased the efficiency of LBP class prediction in comparison to the only previous report in this field. Altogether, the results showed that the SVM algorithm can be run on broad, computationally calculated protein features and offers a promising tool for the detection of LBP classes. The proposed approach has the potential to integrate and improve the common sequence alignment-based methods. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. In silico toxicity prediction by support vector machine and SMILES representation-based string kernel.

    PubMed

    Cao, D-S; Zhao, J-C; Yang, Y-N; Zhao, C-X; Yan, J; Liu, S; Hu, Q-N; Xu, Q-S; Liang, Y-Z

    2012-01-01

    There is a great need to assess the harmful effects or toxicities of chemicals to which man is exposed. In the present paper, the simplified molecular input line entry specification (SMILES) representation-based string kernel, together with the state-of-the-art support vector machine (SVM) algorithm, was used to classify the toxicity of chemicals from the US Environmental Protection Agency Distributed Structure-Searchable Toxicity (DSSTox) database network. In this method, the molecular structure can be directly encoded by a series of SMILES substrings that represent the presence of certain chemical elements and different kinds of chemical bonds (double, triple and stereochemistry) in the molecules. Thus, the SMILES string kernel can accurately and directly measure the similarity of molecules through the local information hidden in the molecules. Two model validation approaches, five-fold cross-validation and an independent validation set, were used to assess the predictive capability of our developed models. The results obtained indicate that SVM based on the SMILES string kernel can be regarded as a very promising alternative modelling approach for the potential toxicity prediction of chemicals.
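
    A sketch of a k-spectrum (substring-count) string kernel over SMILES with a precomputed-kernel SVM; this is a simplification of the idea, not the paper's exact kernel:

        from collections import Counter
        import numpy as np
        from sklearn.svm import SVC

        def spectrum(smiles, k=3):
            return Counter(smiles[i:i + k] for i in range(len(smiles) - k + 1))

        def kernel(a, b):
            ca, cb = spectrum(a), spectrum(b)
            return sum(ca[s] * cb[s] for s in ca.keys() & cb.keys())

        smiles = ["CCO", "CCCl", "c1ccccc1O", "CC(=O)O", "ClCCl", "CCN"]
        labels = [0, 1, 0, 0, 1, 0]   # 1 = toxic (illustrative labels)
        G = np.array([[kernel(a, b) for b in smiles] for a in smiles], float)
        clf = SVC(kernel="precomputed").fit(G, labels)
        print(clf.predict(G))          # training-set predictions (sketch only)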

  8. A Diagnostic Model for Impending Death in Cancer Patients: Preliminary Report

    PubMed Central

    Hui, David; Hess, Kenneth; dos Santos, Renata; Chisholm, Gary; Bruera, Eduardo

    2015-01-01

    Background We recently identified several highly specific bedside physical signs associated with impending death within 3 days among patients with advanced cancer. In this study, we developed and assessed a diagnostic model for impending death based on these physical signs. Methods We systematically documented 62 physical signs every 12 hours from admission to death or discharge in 357 patients with advanced cancer admitted to acute palliative care units (APCUs) at two tertiary care cancer centers. We used recursive partitioning analysis (RPA) to develop a prediction model for impending death within 3 days using admission data. We validated the model with 5 iterations of 10-fold cross-validation, and also applied the model to APCU days 2/3/4/5/6. Results Among the 322/357 (90%) patients with complete data for all signs, the 3-day mortality was 24% on admission. The final model was based on 2 variables (palliative performance scale [PPS] and drooping of the nasolabial fold) and had 4 terminal leaves: PPS ≤20% with drooping of the nasolabial fold present, PPS ≤20% with drooping of the nasolabial fold absent, PPS 30-60%, and PPS ≥70%, with 3-day mortality of 94%, 42%, 16% and 3%, respectively. The diagnostic accuracy was 81% for the original tree, 80% for cross-validation, and 79%-84% for subsequent APCU days. Conclusion(s) We developed a diagnostic model for impending death within 3 days based on 2 objective bedside physical signs. This model was applicable to both APCU admission and subsequent days. Upon further external validation, this model may help clinicians formulate the diagnosis of impending death. PMID:26218612
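
    The stated four-leaf model transcribed as a rule function; thresholds and leaf probabilities are taken directly from the abstract (intermediate PPS values are mapped to the nearest stated range):

        def impending_death_risk(pps, nasolabial_drooping):
            """Reported 3-day mortality for the matching terminal leaf."""
            if pps <= 20:
                return 0.94 if nasolabial_drooping else 0.42
            return 0.16 if pps <= 60 else 0.03

        print(impending_death_risk(20, True))    # 0.94
        print(impending_death_risk(50, False))   # 0.16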

  9. The cross-cultural equivalence of participation instruments: a systematic review.

    PubMed

    Stevelink, S A M; van Brakel, W H

    2013-07-01

    Concepts such as health-related quality of life, disability and participation may differ across cultures. Consequently, when assessing such a concept using a measure developed elsewhere, it is important to test its cultural equivalence. Previous research suggested a lack of cultural equivalence testing in several areas of measurement. This paper reviews the process of cross-cultural equivalence testing of instruments to measure participation in society. An existing cultural equivalence framework was adapted and used to assess participation instruments on five categories of equivalence: conceptual, item, semantic, measurement and operational equivalence. For each category, several aspects were rated, resulting in an overall category rating of 'minimal/none', 'partial' or 'extensive'. The best possible overall study rating was five 'extensive' ratings. Articles were included if the instruments focussed explicitly on measuring 'participation' and were theoretically grounded in the ICIDH(-2) or ICF. Cross-validation articles were only included if they concerned an adaptation of an instrument developed in a high- or middle-income country to a low-income country or vice versa. Eight cross-cultural validation studies were included, in which five participation instruments were tested (Impact on Participation and Autonomy, London Handicap Scale, Perceived Impact and Problem Profile, Craig Handicap Assessment Reporting Technique, Participation Scale). Of these eight studies, only three received at least two 'extensive' ratings for the different categories of equivalence. The majority of the cultural equivalence ratings given were 'partial' and 'minimal/none', and most of the 'none/minimal' ratings were given for item and measurement equivalence. The cross-cultural equivalence testing of the participation instruments included leaves much to be desired. A detailed checklist is proposed for designing a cross-validation study. Once a study has been conducted, the checklist can be used to ensure comprehensive reporting of the validation (equivalence) testing process and its results.
    • Participation instruments are often used in a different cultural setting than the one they were initially developed for.
    • The conceptualization of participation may vary across cultures. Therefore, cultural equivalence – the extent to which an instrument is equally suitable for use in two or more cultures – is an important concept to address.
    • This review showed that the process of cultural equivalence testing of the included participation instruments was often addressed insufficiently.
    • Clinicians should be aware that applying participation instruments in a different culture than the one they were initially developed for requires prior testing of cultural validity in the new context.

  10. Development and validation of a 6-point grading scale in patients undergoing correction of nasolabial folds with a collagen implant.

    PubMed

    Monheit, Gary D; Gendler, Ellen C; Poff, Bradley; Fleming, Laura; Bachtell, Nathan; Garcia, Emily; Burkholder, David

    2010-11-01

    Various scoring techniques prone to subjective interpretation have been used to evaluate soft tissue augmentation of nasolabial folds (NLFs). To design and validate a reliable wrinkle assessment scoring scale. Six photographed wrinkles of varying severity were electronically copied onto the same facial image to become a 6-point grading scale (GGS). A pilot training program (13 investigators) determined reliability, and a 12-week multicenter survey study validated the GGS scoring method. Pilot study inter- and intrarater scoring reliability were high (weighted kappa scores of 0.85 and 0.86, respectively). Seventy-five percent of survey investigators and independent review panel (IRP) members considered a GGS score difference of 0.5 to be a minimally perceivable difference. Interrater weighted kappa scores were 0.91 for the IRP and 0.80 for investigators. Intrarater agreements after repeat testing were 0.91 and 0.89, respectively. The baseline "live" assessment GGS mean score was 3.34, and the baseline blinded photographic assessment GGS mean score was 2.00 for the IRP and 2.16 for the investigators. The GGS is a reproducible method of grading the severity of NLF wrinkles. Treatment effectiveness of a dermal filler can be reliably evaluated using the GGS by comparing "live" assessments with the standard GGS photographic panel. © 2010 by the American Society for Dermatologic Surgery, Inc.

  11. QSPR for predicting chloroform formation in drinking water disinfection.

    PubMed

    Luilo, G B; Cabaniss, S E

    2011-01-01

    Chlorination is the most widely used technique for water disinfection, but may lead to the formation of chloroform (trichloromethane; TCM) and other by-products. This article reports the first quantitative structure-property relationship (QSPR) for predicting the formation of TCM in chlorinated drinking water. Model compounds (n = 117) drawn from 10 literature sources were divided into training data (n = 90, analysed by five-way leave-many-out internal cross-validation) and external validation data (n = 27). QSPR internal cross-validation had Q² = 0.94 and a root mean square error (RMSE) of 0.09 moles TCM per mole compound, consistent with an external validation Q² of 0.94 and RMSE of 0.08 moles TCM per mole compound, and met criteria for high predictive power and robustness. In contrast, a QSPR for log TCM performed poorly and did not meet the criteria for predictive power. The QSPR predictions were consistent with experimental values for TCM formation from tannic acid and for model fulvic acid structures. The descriptors used are consistent with a relatively small number of important TCM precursor structures based upon 1,3-dicarbonyls or 1,3-diphenols.

  12. Texture analysis for survival prediction of pancreatic ductal adenocarcinoma patients with neoadjuvant chemotherapy

    NASA Astrophysics Data System (ADS)

    Chakraborty, Jayasree; Langdon-Embry, Liana; Escalon, Joanna G.; Allen, Peter J.; Lowery, Maeve A.; O'Reilly, Eileen M.; Do, Richard K. G.; Simpson, Amber L.

    2016-03-01

    Pancreatic ductal adenocarcinoma (PDAC) is the fourth leading cause of cancer-related death in the United States. The five-year survival rate for all stages is approximately 6%, and approximately 2% when presenting with distant disease [1]. Only 10-20% of all patients present with resectable disease, but recurrence rates are high, with only 5 to 15% remaining free of disease at 5 years. At this time, we are unable to distinguish resectable PDAC patients with occult metastatic disease from those with potentially curable disease. Early classification of these tumor types may eventually lead to changes in initial management, including the use of neoadjuvant chemotherapy or radiation, or in the choice of postoperative adjuvant treatments. Texture analysis is an emerging methodology in oncologic imaging for quantitatively assessing tumor heterogeneity that could potentially aid in the stratification of these patients. The present study derives several texture-based features from CT images of PDAC patients, acquired prior to neoadjuvant chemotherapy, and analyzes their performance, individually as well as in combination, as prognostic markers. A fuzzy minimum redundancy maximum relevance method with a leave-one-image-out technique is included to select discriminating features from the set of extracted features. With a naive Bayes classifier, the proposed method predicts the 5-year overall survival of PDAC patients prior to neoadjuvant therapy and achieves the best results in terms of the area under the receiver operating characteristic curve, 0.858, and accuracy, 83.0%, with a four-fold cross-validation technique.
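
    The feature-selection-plus-classifier pipeline described above can be approximated in a few lines. A minimal sketch, assuming scikit-learn and substituting plain mutual-information ranking for the paper's fuzzy minimum-redundancy-maximum-relevance step; the feature matrix X and survival labels y are hypothetical placeholders:

      import numpy as np
      from sklearn.pipeline import Pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.feature_selection import SelectKBest, mutual_info_classif
      from sklearn.naive_bayes import GaussianNB
      from sklearn.model_selection import StratifiedKFold, cross_val_score

      rng = np.random.default_rng(0)
      X = rng.normal(size=(60, 35))      # texture features per patient (placeholder)
      y = rng.integers(0, 2, size=60)    # 5-year overall survival labels (placeholder)

      # Rank features by relevance, then classify with naive Bayes, as in the record
      model = Pipeline([
          ("scale", StandardScaler()),
          ("select", SelectKBest(mutual_info_classif, k=10)),
          ("clf", GaussianNB()),
      ])
      scores = cross_val_score(model, X, y, scoring="roc_auc",
                               cv=StratifiedKFold(4, shuffle=True, random_state=0))
      print("4-fold AUC: %.3f +/- %.3f" % (scores.mean(), scores.std()))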

  13. Identifying a predictive model for response to atypical antipsychotic monotherapy treatment in south Indian schizophrenia patients.

    PubMed

    Gupta, Meenal; Moily, Nagaraj S; Kaur, Harpreet; Jajodia, Ajay; Jain, Sanjeev; Kukreti, Ritushree

    2013-08-01

    Atypical antipsychotic (AAP) drugs are the preferred choice of treatment for schizophrenia patients. Patients who do not show favorable response to AAP monotherapy are subjected to random prolonged therapeutic treatment with AAP multitherapy, typical antipsychotics or a combination of both. Therefore, prior identification of patients' response to drugs can be an important step in providing efficacious and safe therapeutic treatment. We thus attempted to elucidate a genetic signature which could predict patients' response to AAP monotherapy. Our logistic regression analyses indicated a 76% probability that patients carrying a combination of four SNPs will not show favorable response to AAP therapy. The robustness of this prediction model was assessed using a repeated 10-fold cross-validation method, and the results across the n-fold cross-validations (mean accuracy = 71.91%; 95% CI = 71.47-72.35) suggest high accuracy and reliability of the prediction model. Further validations of these results in large sample sets are likely to establish their clinical applicability. Copyright © 2013 Elsevier Inc. All rights reserved.
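
    A repeated 10-fold cross-validation estimate with a confidence interval on mean accuracy, as reported here, might be computed as in the following sketch, assuming scikit-learn; the genotype matrix X and response labels y are placeholders:

      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

      rng = np.random.default_rng(1)
      X = rng.integers(0, 3, size=(200, 4)).astype(float)  # 4 SNPs coded 0/1/2 (placeholder)
      y = rng.integers(0, 2, size=200)                     # response / non-response (placeholder)

      cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=1)
      acc = cross_val_score(LogisticRegression(), X, y, cv=cv, scoring="accuracy")
      half_width = 1.96 * acc.std(ddof=1) / np.sqrt(len(acc))  # normal-approximation 95% CI
      print("mean accuracy %.2f%% (95%% CI %.2f-%.2f)" %
            (100 * acc.mean(), 100 * (acc.mean() - half_width), 100 * (acc.mean() + half_width)))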

  14. The development and cross-validation of an MMPI typology of murderers.

    PubMed

    Holcomb, W R; Adams, N A; Ponder, H M

    1985-06-01

    A sample of 80 male offenders charged with premeditated murder was divided into five personality types using MMPI scores. A hierarchical clustering procedure was used, with a subsequent internal cross-validation analysis using a second sample of 80 premeditated murderers. A discriminant analysis resulted in a 96.25% correct classification of subjects from the second sample into the five types. Clinical data from a mental status interview schedule supported the external validity of these types. There were significant differences among the five types in hallucinations, disorientation, hostility, depression, and paranoid thinking. Both similarities and differences of the present typology with prior research were discussed, and additional research questions were suggested.

  15. A multiscale decomposition approach to detect abnormal vasculature in the optic disc.

    PubMed

    Agurto, Carla; Yu, Honggang; Murray, Victor; Pattichis, Marios S; Nemeth, Sheila; Barriga, Simon; Soliz, Peter

    2015-07-01

    This paper presents a multiscale method to detect neovascularization in the optic disc (NVD) using fundus images. Our method is applied to a manually selected region of interest (ROI) containing the optic disc. All the vessels in the ROI are segmented by adaptively combining contrast enhancement methods with a vessel segmentation technique. Textural features are extracted using multiscale amplitude-modulation frequency-modulation, morphological granulometry, and fractal dimension. A linear SVM is used to perform the classification, which is tested by means of 10-fold cross-validation. The performance is evaluated using 300 images, achieving an AUC of 0.93 with a maximum accuracy of 88%. Copyright © 2015 Elsevier Ltd. All rights reserved.
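
    Of the texture descriptors listed, the fractal dimension is the simplest to illustrate. A minimal box-counting sketch in NumPy, assuming a binary vessel segmentation mask; the function name and box sizes are illustrative choices, not the authors' implementation:

      import numpy as np

      def box_counting_dimension(mask, sizes=(2, 4, 8, 16, 32)):
          """Estimate the fractal dimension of a binary image via box counting."""
          counts = []
          for s in sizes:
              h = (mask.shape[0] // s) * s          # trim so the image tiles evenly
              w = (mask.shape[1] // s) * s
              blocks = mask[:h, :w].reshape(h // s, s, w // s, s)
              counts.append((blocks.sum(axis=(1, 3)) > 0).sum())  # boxes containing vessels
          # N(s) ~ s^(-D), so the slope of log N versus log(1/s) estimates D
          slope, _ = np.polyfit(np.log(1.0 / np.asarray(sizes)), np.log(counts), 1)
          return slope

      mask = np.random.default_rng(0).random((256, 256)) < 0.1   # placeholder vessel mask
      print("estimated dimension: %.2f" % box_counting_dimension(mask))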

  16. Towards computer-assisted TTTS: Laser ablation detection for workflow segmentation from fetoscopic video.

    PubMed

    Vasconcelos, Francisco; Brandão, Patrick; Vercauteren, Tom; Ourselin, Sebastien; Deprest, Jan; Peebles, Donald; Stoyanov, Danail

    2018-06-27

    Intrauterine foetal surgery is the treatment option for several congenital malformations. For twin-to-twin transfusion syndrome (TTTS), interventions involve the use of a laser fibre to ablate vessels in a shared placenta. The procedure presents a number of challenges for the surgeon, and computer-assisted technologies can potentially be a significant support. Vision-based sensing is the primary source of information from the intrauterine environment, and vision-based methods are therefore an appealing route for extracting higher-level information from the surgical site. In this paper, we propose a framework to detect one of the key steps during TTTS interventions: ablation. We adopt a deep learning approach, specifically the ResNet101 architecture, for classification of different surgical actions performed during laser ablation therapy. We perform a two-fold cross-validation using almost 50k frames from five different TTTS ablation procedures. Our results show that deep learning methods are a promising approach for ablation detection. To our knowledge, this is the first attempt at automating photocoagulation detection using video, and our technique can be an important component of a larger assistive framework for enhanced foetal therapies. The current implementation does not include semantic segmentation or localisation of the ablation site; this would be a natural extension in future work.
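
    When cross-validating on video frames, frames from the same procedure must not be split between training and test folds, or performance is overestimated. A sketch of a procedure-level two-fold split, assuming scikit-learn; the frame features and procedure IDs are placeholders:

      import numpy as np
      from sklearn.model_selection import GroupKFold

      rng = np.random.default_rng(2)
      n_frames = 1000
      X = rng.normal(size=(n_frames, 128))           # per-frame features (placeholder)
      y = rng.integers(0, 2, size=n_frames)          # ablation / no ablation (placeholder)
      procedure = rng.integers(0, 5, size=n_frames)  # which of 5 procedures each frame is from

      for train_idx, test_idx in GroupKFold(n_splits=2).split(X, y, groups=procedure):
          # no procedure contributes frames to both sides of the split
          assert not set(procedure[train_idx]) & set(procedure[test_idx])
          print("train frames:", len(train_idx), "test frames:", len(test_idx))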

  17. Taking the Next Step: Combining Incrementally Valid Indicators to Improve Recidivism Prediction

    ERIC Educational Resources Information Center

    Walters, Glenn D.

    2011-01-01

    The possibility of combining indicators to improve recidivism prediction was evaluated in a sample of released federal prisoners randomly divided into a derivation subsample (n = 550) and a cross-validation subsample (n = 551). Five incrementally valid indicators were selected from five domains: demographic (age), historical (prior convictions),…

  18. Burn-injured tissue detection for debridement surgery through the combination of non-invasive optical imaging techniques.

    PubMed

    Heredia-Juesas, Juan; Thatcher, Jeffrey E; Lu, Yang; Squiers, John J; King, Darlene; Fan, Wensheng; DiMaio, J Michael; Martinez-Lorenzo, Jose A

    2018-04-01

    The process of burn debridement is a challenging technique requiring significant skill to identify the regions that need excision and their appropriate excision depths. In order to assist surgeons, a machine learning tool is being developed to provide a quantitative assessment of burn-injured tissue. This paper presents three non-invasive optical imaging techniques capable of distinguishing four kinds of tissue (healthy skin, viable wound bed, shallow burn, and deep burn) during serial burn debridement in a porcine model. All combinations of these three techniques have been studied through a k-fold cross-validation method. In terms of global performance, the combination of all three techniques significantly improves the classification accuracy with respect to any single technique, from 0.42 up to more than 0.76. Furthermore, a non-linear spatial filtering based on the mode of a small neighborhood has been applied as a post-processing step in order to improve the performance of the classification. Using this technique, the global accuracy reaches a value close to 0.78 and, for some particular tissues and combinations of techniques, the accuracy improves by 13%.
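
    The mode-based spatial filtering used as post-processing amounts to a sliding-window majority vote over the per-pixel class labels. A sketch assuming SciPy; the 3x3 window size is an assumption of this sketch, not taken from the paper:

      import numpy as np
      from scipy.ndimage import generic_filter

      def mode_filter(label_img, size=3):
          """Replace each pixel's class label with the mode of its neighborhood."""
          def local_mode(window):
              return np.bincount(window.astype(np.int64)).argmax()
          return generic_filter(label_img, local_mode, size=size)

      labels = np.random.default_rng(3).integers(0, 4, size=(64, 64))  # 4 tissue classes (placeholder)
      smoothed = mode_filter(labels)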

  19. Discrimination among populations of sockeye salmon fry with Fourier analysis of otolith banding patterns formed during incubation

    USGS Publications Warehouse

    Finn, James E.; Burger, Carl V.; Holland-Bartels, Leslie E.

    1997-01-01

    We used otolith banding patterns formed during incubation to discriminate among hatchery- and wild-incubated fry of sockeye salmon Oncorhynchus nerka from Tustumena Lake, Alaska. Fourier analysis of otolith luminance profiles was used to describe banding patterns; the amplitudes of individual Fourier harmonics were the discriminant variables. Correct classification of otoliths to either hatchery or wild origin was 83.1% (cross-validation) and 72.7% (test data) with the use of quadratic discriminant function analysis on 10 Fourier amplitudes. Overall classification rates among the six test groups (one hatchery and five wild groups) were 46.5% (cross-validation) and 39.3% (test data) with the use of linear discriminant function analysis on 16 Fourier amplitudes. Although classification rates for wild-incubated fry from any one site never exceeded 67% (cross-validation) or 60% (test data), location-specific information was evident for all groups because the probability of classifying an individual to its true incubation location was significantly greater than chance. Results indicate phenotypic differences in otolith microstructure among incubation sites separated by less than 10 km. Analysis of otolith luminance profiles is a potentially useful technique for discriminating among populations of hatchery and wild fish.

  20. A diagnostic model for impending death in cancer patients: Preliminary report.

    PubMed

    Hui, David; Hess, Kenneth; dos Santos, Renata; Chisholm, Gary; Bruera, Eduardo

    2015-11-01

    Several highly specific bedside physical signs associated with impending death within 3 days for patients with advanced cancer were recently identified. A diagnostic model for impending death based on these physical signs was developed and assessed. Sixty-two physical signs were systematically documented every 12 hours from admission to death or discharge for 357 patients with advanced cancer who were admitted to acute palliative care units (APCUs) at 2 tertiary care cancer centers. Recursive partitioning analysis was used to develop a prediction model for impending death within 3 days from admission data. The model was validated with 5 iterations of 10-fold cross-validation and was also applied to APCU days 2 to 6. For the 322 of 357 patients (90%) with complete data for all signs, the 3-day mortality rate was 24% on admission. The final model was based on 2 variables (Palliative Performance Scale [PPS] and drooping of nasolabial folds) and had 4 terminal leaves: PPS score ≤ 20% with drooping of nasolabial folds present, PPS score ≤ 20% with drooping of nasolabial folds absent, PPS score of 30% to 60%, and PPS score ≥ 70%. The corresponding 3-day mortality rates were 94%, 42%, 16%, and 3%. The diagnostic accuracy was 81% for the original tree, 80% for cross-validation, and 79% to 84% for subsequent APCU days. Based on 2 objective bedside physical signs, a diagnostic model was developed for impending death within 3 days. This model is applicable to both APCU admission and subsequent days. Upon further external validation, it may help clinicians formulate the diagnosis of impending death. © 2015 American Cancer Society.

  1. BagMOOV: A novel ensemble for heart disease prediction bootstrap aggregation with multi-objective optimized voting.

    PubMed

    Bashir, Saba; Qamar, Usman; Khan, Farhan Hassan

    2015-06-01

    Conventional clinical decision support systems are based on individual classifiers or simple combinations of these classifiers, which tend to show moderate performance. This research paper presents a novel classifier ensemble framework based on an enhanced bagging approach with a multi-objective weighted voting scheme for prediction and analysis of heart disease. The proposed model overcomes the limitations of conventional approaches by utilizing an ensemble of five heterogeneous classifiers: naïve Bayes, linear regression, quadratic discriminant analysis, instance-based learner and support vector machines. Five different datasets, obtained from publicly available data repositories, are used for experimentation, evaluation and validation. Effectiveness of the proposed ensemble is investigated by comparison of results with several classifiers. Prediction results of the proposed ensemble model are assessed by ten-fold cross-validation and ANOVA statistics. The experimental evaluation shows that the proposed framework deals with all types of attributes and achieved a high diagnosis accuracy of 84.16%, sensitivity of 93.29%, specificity of 96.70%, and F-measure of 82.15%. An F-ratio higher than the F-critical value and a p-value less than 0.05 at the 95% confidence level indicate that the results are statistically significant for most of the datasets.
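
    A heterogeneous soft-voting ensemble in the spirit of the one described can be sketched with scikit-learn as below. The equal voting weights are a placeholder for the paper's multi-objective optimized weights, logistic regression stands in for the linear member so that all members produce class probabilities, and the built-in dataset is a stand-in, not one of the paper's datasets:

      from sklearn.datasets import load_breast_cancer
      from sklearn.naive_bayes import GaussianNB
      from sklearn.linear_model import LogisticRegression
      from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
      from sklearn.neighbors import KNeighborsClassifier
      from sklearn.svm import SVC
      from sklearn.ensemble import VotingClassifier
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.model_selection import cross_val_score

      X, y = load_breast_cancer(return_X_y=True)   # stand-in dataset
      ensemble = make_pipeline(StandardScaler(), VotingClassifier(
          estimators=[("nb", GaussianNB()),
                      ("lr", LogisticRegression(max_iter=1000)),
                      ("qda", QuadraticDiscriminantAnalysis()),
                      ("knn", KNeighborsClassifier()),
                      ("svm", SVC(probability=True))],
          voting="soft", weights=[1, 1, 1, 1, 1]))  # placeholder weights
      print("10-fold accuracy: %.3f" % cross_val_score(ensemble, X, y, cv=10).mean())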

  2. Novel naïve Bayes classification models for predicting the carcinogenicity of chemicals.

    PubMed

    Zhang, Hui; Cao, Zhi-Xing; Li, Meng; Li, Yu-Zhi; Peng, Cheng

    2016-11-01

    Carcinogenicity prediction has become a significant issue for the pharmaceutical industry. The purpose of this investigation was to develop a novel prediction model of the carcinogenicity of chemicals by using a naïve Bayes classifier. The established model was validated by internal 5-fold cross-validation and an external test set. The naïve Bayes classifier gave an average overall prediction accuracy of 90 ± 0.8% for the training set and 68 ± 1.9% for the external test set. Moreover, five simple molecular descriptors (AlogP, molecular weight (MW), number of H donors, Apol and Wiener index) considered important for the carcinogenicity of chemicals were identified, and some substructures related to carcinogenicity were found. We hope the established naïve Bayes prediction model can be applied to filter early-stage molecules for this potential carcinogenicity adverse effect, and that the five identified molecular descriptors and the substructures of carcinogens will give a better understanding of the carcinogenicity of chemicals and further provide guidance for medicinal chemists in the design of new candidate drugs and lead optimization, ultimately reducing the attrition rate in later stages of drug development. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. Detrended fluctuation analysis for major depressive disorder.

    PubMed

    Mumtaz, Wajid; Malik, Aamir Saeed; Ali, Syed Saad Azhar; Yasin, Mohd Azhar Mohd; Amin, Hafeezullah

    2015-01-01

    Clinical utility of electroencephalography (EEG)-based diagnostic studies is less clear for major depressive disorder (MDD). In this paper, a novel machine learning (ML) scheme is presented to discriminate MDD patients from healthy controls. The proposed method involves feature extraction, feature selection, classification and validation. EEG data were acquired under eyes-closed (EC) and eyes-open (EO) conditions. At the feature extraction stage, detrended fluctuation analysis (DFA) was performed on the EEG data to obtain scaling exponents; DFA tests for the presence or absence of long-range temporal correlations (LRTC) in the recorded signal. The scaling exponents were used as input features to the proposed system. At the feature selection stage, 3 different techniques were used for comparison purposes, and a logistic regression (LR) classifier was employed. The method was validated by 10-fold cross-validation. The results show the effect of 3 different reference montages on the computed features: DFA performed better on the LE data than on the IR and AR data, while in the Wilcoxon ranking the AR montage performed better than LE and IR. Based on the results, it was concluded that DFA provides useful information to discriminate MDD patients and, with further validation, could be employed in clinics for the diagnosis of MDD.
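
    For reference, a compact first-order DFA of the kind used to derive the scaling-exponent features might look as follows; a NumPy sketch in which the window sizes are illustrative, not taken from the paper:

      import numpy as np

      def dfa_alpha(signal, scales=(16, 32, 64, 128, 256)):
          """First-order detrended fluctuation analysis: returns the scaling exponent."""
          y = np.cumsum(signal - np.mean(signal))       # integrated profile
          fluct = []
          for n in scales:
              n_win = len(y) // n
              t = np.arange(n)
              rms = []
              for i in range(n_win):
                  seg = y[i * n:(i + 1) * n]
                  trend = np.polyval(np.polyfit(t, seg, 1), t)  # local linear detrend
                  rms.append(np.sqrt(np.mean((seg - trend) ** 2)))
              fluct.append(np.mean(rms))
          # F(n) ~ n^alpha; alpha > 0.5 indicates long-range temporal correlations
          alpha, _ = np.polyfit(np.log(scales), np.log(fluct), 1)
          return alpha

      eeg = np.random.default_rng(4).normal(size=4096)  # placeholder for one EEG channel
      print("alpha = %.2f" % dfa_alpha(eeg))            # ~0.5 for white noise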

  4. An experimental study of interstitial lung tissue classification in HRCT images using ANN and role of cost functions

    NASA Astrophysics Data System (ADS)

    Dash, Jatindra K.; Kale, Mandar; Mukhopadhyay, Sudipta; Khandelwal, Niranjan; Prabhakar, Nidhi; Garg, Mandeep; Kalra, Naveen

    2017-03-01

    In this paper, we investigate the effect of the error criterion used during the training phase of an artificial neural network (ANN) on the accuracy of the classifier for classification of lung tissues affected by interstitial lung diseases (ILD). The mean square error (MSE) and cross-entropy (CE) criteria are chosen, being the most popular choices in state-of-the-art implementations. The classification experiment is performed on six ILD patterns, viz. consolidation, emphysema, ground glass opacity, micronodules, fibrosis and healthy tissue, from the MedGIFT database. Texture features from an arbitrary region of interest (AROI) are extracted using Gabor filters. Two neural networks are trained with the scaled conjugate gradient backpropagation algorithm, using the MSE and CE error criteria, respectively, for weight updates. Performance is evaluated in terms of average classification accuracy using 4-fold cross-validation. Each network is trained five times for each fold with randomly initialized weight vectors, and accuracies are computed. A significant improvement in classification accuracy is observed when the ANN is trained using CE (67.27%) as the error function compared to MSE (63.60%). Moreover, the standard deviation of the classification accuracy for the network trained with the CE criterion (6.69) is lower than that of the network trained with the MSE criterion (10.32).
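
    The MSE-versus-cross-entropy comparison can be reproduced in miniature with two otherwise identical networks. A sketch assuming PyTorch and synthetic six-class data; the architecture, optimizer and feature dimensions are illustrative assumptions, not those of the paper:

      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      def train(loss_name, X, y, epochs=300):
          torch.manual_seed(0)                      # identical initialization for both runs
          net = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 6))
          opt = torch.optim.SGD(net.parameters(), lr=0.1)
          onehot = torch.eye(6)[y]
          for _ in range(epochs):
              opt.zero_grad()
              out = net(X)
              if loss_name == "ce":
                  loss = F.cross_entropy(out, y)                 # cross-entropy criterion
              else:
                  loss = F.mse_loss(out.softmax(dim=1), onehot)  # MSE on softmax outputs
              loss.backward()
              opt.step()
          return (net(X).argmax(dim=1) == y).float().mean().item()

      X = torch.randn(600, 20)          # placeholder Gabor features
      y = torch.randint(0, 6, (600,))   # six ILD classes (placeholder)
      print("CE:", train("ce", X, y), "MSE:", train("mse", X, y))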

  5. Ω-Net (Omega-Net): Fully automatic, multi-view cardiac MR detection, orientation, and segmentation with deep neural networks.

    PubMed

    Vigneault, Davis M; Xie, Weidi; Ho, Carolyn Y; Bluemke, David A; Noble, J Alison

    2018-05-22

    Pixelwise segmentation of the left ventricular (LV) myocardium and the four cardiac chambers in 2-D steady state free precession (SSFP) cine sequences is an essential preprocessing step for a wide range of analyses. Variability in contrast, appearance, orientation, and placement of the heart between patients, clinical views, scanners, and protocols makes fully automatic semantic segmentation a notoriously difficult problem. Here, we present Ω-Net (Omega-Net): A novel convolutional neural network (CNN) architecture for simultaneous localization, transformation into a canonical orientation, and semantic segmentation. First, an initial segmentation is performed on the input image; second, the features learned during this initial segmentation are used to predict the parameters needed to transform the input image into a canonical orientation; and third, a final segmentation is performed on the transformed image. In this work, Ω-Nets of varying depths were trained to detect five foreground classes in any of three clinical views (short axis, SA; four-chamber, 4C; two-chamber, 2C), without prior knowledge of the view being segmented. This constitutes a substantially more challenging problem compared with prior work. The architecture was trained using three-fold cross-validation on a cohort of patients with hypertrophic cardiomyopathy (HCM, N=42) and healthy control subjects (N=21). Network performance, as measured by weighted foreground intersection-over-union (IoU), was substantially improved for the best-performing Ω-Net compared with U-Net segmentation without localization or orientation (0.858 vs 0.834). In addition, to be comparable with other works, Ω-Net was retrained from scratch using five-fold cross-validation on the publicly available 2017 MICCAI Automated Cardiac Diagnosis Challenge (ACDC) dataset. The Ω-Net outperformed the state-of-the-art method in segmentation of the LV and RV bloodpools, and performed slightly worse in segmentation of the LV myocardium. We conclude that this architecture represents a substantive advancement over prior approaches, with implications for biomedical image segmentation more generally. Published by Elsevier B.V.

  6. Exploring QSARs of the interaction of flavonoids with GABA (A) receptor using MLR, ANN and SVM techniques.

    PubMed

    Deeb, Omar; Shaik, Basheerulla; Agrawal, Vijay K

    2014-10-01

    Quantitative Structure-Activity Relationship (QSAR) models for the binding affinity constants (log Ki) of 78 flavonoid ligands towards the benzodiazepine site of the GABA(A) receptor complex were developed using two machine learning methods: artificial neural networks (ANN) and support vector machines (SVM). The models obtained were compared with those obtained using multiple linear regression (MLR) analysis. Descriptor selection and model building were performed with 10-fold cross-validation using the training data set. The SVM and MLR coefficient of determination values are 0.944 and 0.879, respectively, for the training set and are higher than those of the ANN models. Though the SVM model shows improved training set fitting, the ANN model was superior to SVM and MLR in predicting the test set. A randomization test was employed to check the suitability of the models.

  7. Dideoxynucleoside resistance emerges with prolonged zidovudine monotherapy. The RV43 Study Group.

    PubMed Central

    Mayers, D L; Japour, A J; Arduino, J M; Hammer, S M; Reichman, R; Wagner, K F; Chung, R; Lane, J; Crumpacker, C S; McLeod, G X

    1994-01-01

    Human immunodeficiency virus type 1 (HIV-1) isolates resistant to zidovudine (ZDV) have previously been demonstrated to exhibit in vitro cross-resistance to other similar dideoxynucleoside agents which contain a 3'-azido group. However, cross-resistance to didanosine (ddI) or dideoxycytidine (ddC) has been less well documented. ZDV, ddI, and ddC susceptibility data have been collected from clinical HIV-1 isolates obtained by five clinical centers and their respective retrovirology laboratories. All subjects were treated only with ZDV. Clinical HIV-1 isolates were isolated, amplified, and assayed for drug susceptibility in standardized cultures of phytohemagglutinin-stimulated donor peripheral blood mononuclear cells obtained from healthy seronegative donors. All five cohorts showed a correlation between decreased in vitro susceptibility to ZDV and decreased susceptibility to ddI and ddC. For each 10-fold decrease in ZDV susceptibility, an average corresponding decrease of 2.2-fold in ddI susceptibility was observed (129 isolates studied; P < 0.001, Fisher's test of combined significance). Similarly, susceptibility to ddC decreased 2.0-fold for each 10-fold decrease in ZDV susceptibility (82 isolates studied; P < 0.001, Fisher's test of combined significance). These data indicate that a correlation exists between HIV-1 susceptibilities to ZDV and ddI or ddC for clinical HIV-1 isolates. PMID:8192457

  8. Improving lung cancer prognosis assessment by incorporating synthetic minority oversampling technique and score fusion method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yan, Shiju; Qian, Wei; Guan, Yubao

    2016-06-15

    Purpose: This study aims to investigate the potential to improve lung cancer recurrence risk prediction performance for stage I NSCLC patients by integrating oversampling, feature selection, and score fusion techniques, and to develop an optimal prediction model. Methods: A dataset involving 94 early-stage lung cancer patients was retrospectively assembled, which includes CT images, nine clinical and biological (CB) markers, and outcome of 3-yr disease-free survival (DFS) after surgery. Among the 94 patients, 74 remained DFS and 20 had cancer recurrence. Applying a computer-aided detection scheme, tumors were segmented from the CT images and 35 quantitative image (QI) features were initially computed. Two normalized Gaussian radial basis function network (RBFN) based classifiers were built based on QI features and CB markers separately. To improve prediction performance, the authors applied a synthetic minority oversampling technique (SMOTE) and a BestFirst based feature selection method to optimize the classifiers, and also tested fusion methods to combine QI and CB based prediction results. Results: Using a leave-one-case-out cross-validation method (K-fold cross-validation with K equal to the number of cases), the computed areas under the receiver operating characteristic curve (AUCs) were 0.716 ± 0.071 and 0.642 ± 0.061 when using the QI and CB based classifiers, respectively. By fusing the scores generated by the two classifiers, the AUC significantly increased to 0.859 ± 0.052 (p < 0.05) with an overall prediction accuracy of 89.4%. Conclusions: This study demonstrated the feasibility of improving prediction performance by integrating SMOTE, feature selection, and score fusion techniques. Combining QI features and CB markers and performing SMOTE prior to feature selection in classifier training enabled the RBFN based classifier to yield improved prediction accuracy.
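
    The score-fusion step, averaging the probability outputs of an image-based and a marker-based classifier, can be sketched as below. This assumes scikit-learn and substitutes RBF-kernel SVMs for the paper's Gaussian RBF networks; the feature matrices and outcome labels are random placeholders:

      import numpy as np
      from sklearn.svm import SVC
      from sklearn.metrics import roc_auc_score
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(5)
      n = 94
      X_qi = rng.normal(size=(n, 35))         # quantitative image features (placeholder)
      X_cb = rng.normal(size=(n, 9))          # clinical/biological markers (placeholder)
      y = (rng.random(n) < 0.21).astype(int)  # ~20/94 recurrence rate, as in the record

      idx_tr, idx_te = train_test_split(np.arange(n), test_size=0.3,
                                        stratify=y, random_state=5)
      # RBF-kernel SVMs stand in for the paper's RBFN classifiers
      p_qi = SVC(probability=True).fit(X_qi[idx_tr], y[idx_tr]).predict_proba(X_qi[idx_te])[:, 1]
      p_cb = SVC(probability=True).fit(X_cb[idx_tr], y[idx_tr]).predict_proba(X_cb[idx_te])[:, 1]
      p_fused = (p_qi + p_cb) / 2             # simple mean score fusion
      print("fused AUC: %.3f" % roc_auc_score(y[idx_te], p_fused))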

  9. Candidate soil indicators for monitoring the progress of constructed wetlands toward a natural state: a statistical approach

    USGS Publications Warehouse

    Stapanian, Martin A.; Adams, Jean V.; Fennessy, M. Siobhan; Mack, John; Micacchion, Mick

    2013-01-01

    A persistent question among ecologists and environmental managers is whether constructed wetlands are structurally or functionally equivalent to naturally occurring wetlands. We examined 19 variables collected from 10 constructed and nine natural emergent wetlands in Ohio, USA. Our primary objective was to identify candidate indicators of wetland class (natural or constructed), based on measurements of soil properties and an index of vegetation integrity, that can be used to track the progress of constructed wetlands toward a natural state. The method of nearest shrunken centroids was used to find a subset of variables that would serve as the best classifiers of wetland class, and error rate was calculated using a five-fold cross-validation procedure. The shrunken differences of percent total organic carbon (% TOC) and percent dry weight of the soil exhibited the greatest distances from the overall centroid. Classification based on these two variables yielded a misclassification rate of 11% based on cross-validation. Our results indicate that % TOC and percent dry weight can be used as candidate indicators of the status of emergent, constructed wetlands in Ohio and for assessing the performance of mitigation. The method of nearest shrunken centroids has excellent potential for further applications in ecology.
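
    Nearest shrunken centroids is available in scikit-learn as NearestCentroid with a shrink_threshold. A minimal sketch with a placeholder soil-variable matrix matching the study's sample sizes; the threshold value is illustrative and would normally be tuned:

      import numpy as np
      from sklearn.neighbors import NearestCentroid
      from sklearn.preprocessing import StandardScaler
      from sklearn.pipeline import make_pipeline
      from sklearn.model_selection import cross_val_score, StratifiedKFold

      rng = np.random.default_rng(6)
      X = rng.normal(size=(19, 19))      # 19 wetlands x 19 variables (placeholder)
      y = np.array([0] * 10 + [1] * 9)   # 10 constructed, 9 natural

      model = make_pipeline(StandardScaler(),
                            NearestCentroid(shrink_threshold=0.5))  # illustrative threshold
      scores = cross_val_score(model, X, y,
                               cv=StratifiedKFold(5, shuffle=True, random_state=6))
      print("5-fold misclassification rate: %.2f" % (1 - scores.mean()))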

  10. Systematic bias of correlation coefficient may explain negative accuracy of genomic prediction.

    PubMed

    Zhou, Yao; Vales, M Isabel; Wang, Aoxue; Zhang, Zhiwu

    2017-09-01

    Accuracy of genomic prediction is commonly calculated as the Pearson correlation coefficient between the predicted and observed phenotypes in the inference population by using cross-validation analysis. More frequently than expected, significant negative accuracies of genomic prediction have been reported in genomic selection studies. These negative values are surprising, given that the minimum value for prediction accuracy should hover around zero when randomly permuted data sets are analyzed. We reviewed the two common approaches for calculating the Pearson correlation and hypothesized that these negative accuracy values reflect potential bias owing to artifacts caused by the mathematical formulas used to calculate prediction accuracy. The first approach, Instant accuracy, calculates correlations for each fold and reports prediction accuracy as the mean of correlations across folds. The other approach, Hold accuracy, predicts all phenotypes in all folds and calculates the correlation between the observed and predicted phenotypes at the end of the cross-validation process. Using simulated and real data, we demonstrated that our hypothesis is true. Both approaches are biased downward under certain conditions. The biases become larger when more folds are employed and when the expected accuracy is low. The bias of Instant accuracy can be corrected using a modified formula. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
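
    The two computations the authors contrast are easy to reproduce on null data, where the expected accuracy is zero. A sketch assuming scikit-learn and SciPy; the marker matrix and phenotype vector are random placeholders:

      import numpy as np
      from scipy.stats import pearsonr
      from sklearn.linear_model import LinearRegression
      from sklearn.model_selection import KFold

      rng = np.random.default_rng(7)
      X = rng.normal(size=(100, 10))
      y = rng.normal(size=100)                      # phenotype unrelated to markers
      fold_r, y_hat = [], np.empty_like(y)
      for tr, te in KFold(n_splits=10, shuffle=True, random_state=7).split(X):
          pred = LinearRegression().fit(X[tr], y[tr]).predict(X[te])
          fold_r.append(pearsonr(y[te], pred)[0])   # Instant: correlation within each fold
          y_hat[te] = pred                          # Hold: pool predictions first
      print("Instant accuracy: %.3f" % np.mean(fold_r))
      print("Hold accuracy:    %.3f" % pearsonr(y, y_hat)[0])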

  11. Predicting protein-binding regions in RNA using nucleotide profiles and compositions.

    PubMed

    Choi, Daesik; Park, Byungkyu; Chae, Hanju; Lee, Wook; Han, Kyungsook

    2017-03-14

    Motivated by the increased amount of data on protein-RNA interactions and the availability of complete genome sequences of several organisms, many computational methods have been proposed to predict binding sites in protein-RNA interactions. However, most computational methods are limited to finding RNA-binding sites in proteins instead of protein-binding sites in RNAs. Predicting protein-binding sites in RNA is more challenging than predicting RNA-binding sites in proteins, and recent computational methods for finding protein-binding sites in RNAs have several drawbacks for practical use. We developed a new support vector machine (SVM) model for predicting protein-binding regions in mRNA sequences. The model uses sequence profiles constructed from log-odds scores of mono- and di-nucleotides and nucleotide compositions. The model was evaluated by standard 10-fold cross-validation, leave-one-protein-out (LOPO) cross-validation and independent testing. Since actual mRNA sequences have more non-binding regions than protein-binding regions, we tested the model on several datasets with different ratios of protein-binding regions to non-binding regions. The best performance of the model was obtained in a balanced dataset of positive and negative instances. 10-fold cross-validation with a balanced dataset achieved a sensitivity of 91.6%, a specificity of 92.4%, an accuracy of 92.0%, a positive predictive value (PPV) of 91.7%, a negative predictive value (NPV) of 92.3% and a Matthews correlation coefficient (MCC) of 0.840. LOPO cross-validation showed a lower performance than the 10-fold cross-validation, but the performance remains high (87.6% accuracy and 0.752 MCC). In testing the model on independent datasets, it achieved an accuracy of 82.2% and an MCC of 0.656. Testing of our model and other state-of-the-art methods on the same dataset showed that our model is better than the others. Sequence profiles of log-odds scores of mono- and di-nucleotides were much more powerful features than nucleotide compositions in finding protein-binding regions in RNA sequences. However, a slight performance gain was obtained when using the sequence profiles along with nucleotide compositions. These are preliminary results of ongoing research, but they demonstrate the potential of our approach as a powerful predictor of protein-binding regions in RNA. The program and supporting data are available at http://bclab.inha.ac.kr/RBPbinding .
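
    The log-odds nucleotide profiles used as features can be sketched as follows in plain Python/NumPy; the pseudocounts, the RNA alphabet and the toy sequences are assumptions of this sketch:

      import numpy as np
      from itertools import product

      def kmer_freqs(seqs, k):
          """Relative k-mer frequencies over a set of RNA sequences, with pseudocounts."""
          counts = {"".join(p): 1.0 for p in product("ACGU", repeat=k)}
          for s in seqs:
              for i in range(len(s) - k + 1):
                  kmer = s[i:i + k]
                  if kmer in counts:
                      counts[kmer] += 1
          total = sum(counts.values())
          return {m: c / total for m, c in counts.items()}

      def log_odds_profile(binding, nonbinding, k):
          """Per-k-mer log-odds score: positive values favor protein-binding regions."""
          fp, fn = kmer_freqs(binding, k), kmer_freqs(nonbinding, k)
          return {m: np.log2(fp[m] / fn[m]) for m in fp}

      mono = log_odds_profile(["ACGUACGU"], ["UUUUACGG"], 1)   # mono-nucleotide profile
      di = log_odds_profile(["ACGUACGU"], ["UUUUACGG"], 2)     # di-nucleotide profile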

  12. Infrastructure and distributed learning methodology for privacy-preserving multi-centric rapid learning health care: euroCAT.

    PubMed

    Deist, Timo M; Jochems, A; van Soest, Johan; Nalbantov, Georgi; Oberije, Cary; Walsh, Seán; Eble, Michael; Bulens, Paul; Coucke, Philippe; Dries, Wim; Dekker, Andre; Lambin, Philippe

    2017-06-01

    Machine learning applications for personalized medicine are highly dependent on access to sufficient data. For personalized radiation oncology, datasets representing the variation in the entire cancer patient population need to be acquired and used to learn prediction models. Ethical and legal boundaries to ensure data privacy hamper collaboration between research institutes. We hypothesize that data sharing is possible without identifiable patient data leaving the radiation clinics and that building machine learning applications on distributed datasets is feasible. We developed and implemented an IT infrastructure in five radiation clinics across three countries (Belgium, Germany, and The Netherlands). We present here a proof-of-principle for future 'big data' infrastructures and distributed learning studies. Lung cancer patient data was collected in all five locations and stored in local databases. Exemplary support vector machine (SVM) models were learned using the Alternating Direction Method of Multipliers (ADMM) from the distributed databases to predict post-radiotherapy dyspnea grade [Formula: see text]. The discriminative performance was assessed by the area under the curve (AUC) in a five-fold cross-validation (learning on four sites and validating on the fifth). The performance of the distributed learning algorithm was compared to centralized learning where datasets of all institutes are jointly analyzed. The euroCAT infrastructure has been successfully implemented in five radiation clinics across three countries. SVM models can be learned on data distributed over all five clinics. Furthermore, the infrastructure provides a general framework to execute learning algorithms on distributed data. The ongoing expansion of the euroCAT network will facilitate machine learning in radiation oncology. The resulting access to larger datasets with sufficient variation will pave the way for generalizable prediction models and personalized medicine.

  13. A machine learning approach to multi-level ECG signal quality classification.

    PubMed

    Li, Qiao; Rajagopalan, Cadathur; Clifford, Gari D

    2014-12-01

    Current electrocardiogram (ECG) signal quality assessment studies have aimed to provide a two-level classification: clean or noisy. However, clinical usage demands more specific noise-level classification for varying applications. This work outlines a five-level ECG signal quality classification algorithm. A total of 13 signal quality metrics were derived from segments of ECG waveforms, which were labeled by experts. A support vector machine (SVM) was trained to perform the classification, tested on a simulated dataset, and validated using data from the MIT-BIH arrhythmia database (MITDB). The simulated training and test datasets were created by selecting clean segments of the ECG in the 2011 PhysioNet/Computing in Cardiology Challenge database and adding three types of real ECG noise at different signal-to-noise ratio (SNR) levels from the MIT-BIH Noise Stress Test Database (NSTDB). The MITDB was re-annotated for five levels of signal quality. Different combinations of the 13 metrics were trained and tested on the simulated datasets, and the best combination, producing the highest classification accuracy, was selected and validated on the MITDB. Performance was assessed using classification accuracy (Ac) and a single-class overlap accuracy (OAc), which assumes that a segment classified into an adjacent class is acceptable. An Ac of 80.26% and an OAc of 98.60% on the test set were obtained by selecting 10 metrics, while 57.26% (Ac) and 94.23% (OAc) were obtained for the unseen MITDB validation data without retraining. By performing five-fold cross-validation, an Ac of 88.07 ± 0.32% and an OAc of 99.34 ± 0.07% were obtained on the validation folds of the MITDB. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  14. Validation of RNAi Silencing Efficiency Using Gene Array Data shows 18.5% Failure Rate across 429 Independent Experiments.

    PubMed

    Munkácsy, Gyöngyi; Sztupinszki, Zsófia; Herman, Péter; Bán, Bence; Pénzváltó, Zsófia; Szarvas, Nóra; Győrffy, Balázs

    2016-09-27

    No independent cross-validation of the success rate of studies utilizing small interfering RNA (siRNA) for gene silencing has been completed before. To assess the influence of experimental parameters like cell line, transfection technique, validation method, and type of control, we evaluated these parameters across a large set of studies. We utilized gene chip data published for siRNA experiments to assess success rates and to compare the methods used in these experiments. We searched NCBI GEO for samples with whole-transcriptome analysis before and after gene silencing and evaluated the efficiency for the target and off-target genes using the array-based expression data. The Wilcoxon signed-rank test was used to assess silencing efficacy, and Kruskal-Wallis tests and Spearman rank correlation were used to evaluate study parameters. Altogether, 1,643 samples representing 429 experiments published in 207 studies were evaluated. The fold change (FC) of down-regulation of the target gene was above 0.7 in 18.5% and above 0.5 in 38.7% of experiments. Silencing efficiency was lowest in MCF7 and highest in SW480 cells (FC = 0.59 and FC = 0.30, respectively, P = 9.3E-06). Studies utilizing Western blot for validation performed better than those with quantitative polymerase chain reaction (qPCR) or microarray (FC = 0.43, FC = 0.47, and FC = 0.55, respectively, P = 2.8E-04). There was no correlation between type of control, transfection method, publication year, and silencing efficiency. Although gene silencing is a robust feature successfully cross-validated in the majority of experiments, efficiency remained insufficient in a significant proportion of studies. Selection of the cell line model and validation method had the highest influence on silencing proficiency.

  15. Development of a QSAR Model for Thyroperoxidase Inhibition ...

    EPA Pesticide Factsheets

    Thyroid hormones (THs) are involved in multiple biological processes and are critical modulators of fetal development. Even moderate changes in maternal or fetal TH levels can produce irreversible neurological deficits in children, such as lower IQ. The enzyme thyroperoxidase (TPO) plays a key role in the synthesis of THs, and inhibition of TPO by xenobiotics results in decreased TH synthesis. Recently, a high-throughput screening assay for TPO inhibition (AUR-TPO) was developed and used to test the ToxCast Phase I and II chemicals. In the present study, we used the results from AUR-TPO to develop a Quantitative Structure-Activity Relationship (QSAR) model for TPO inhibition. The training set consisted of 898 discrete organic chemicals: 134 inhibitors and 764 non-inhibitors. A five-times-repeated two-fold cross-validation of the model was performed, yielding a balanced accuracy of 78.7%. More recently, an additional ~800 chemicals were tested in the AUR-TPO assay. These data were used for a blinded external validation of the QSAR model, demonstrating a balanced accuracy of 85.7%. Overall, the cross- and external validation indicate a robust model with high predictive performance. Next, we used the QSAR model to predict 72,526 REACH pre-registered substances. The model could predict 49.5% (35,925) of the substances in its applicability domain and of these, 8,863 (24.7%) were predicted to be TPO inhibitors. Predictions from this screening can be used in a tiered approach to
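
    A five-times-repeated two-fold cross-validation scored by balanced accuracy, as reported for this model, can be set up as in the following scikit-learn sketch. The descriptor matrix is a random placeholder with the record's class counts, and the random-forest classifier is an assumption of the sketch; the record does not state the model type:

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

      rng = np.random.default_rng(8)
      X = rng.normal(size=(898, 50))         # placeholder chemical descriptors
      y = np.array([1] * 134 + [0] * 764)    # 134 inhibitors, 764 non-inhibitors
      rng.shuffle(y)

      cv = RepeatedStratifiedKFold(n_splits=2, n_repeats=5, random_state=8)
      scores = cross_val_score(RandomForestClassifier(random_state=8), X, y,
                               cv=cv, scoring="balanced_accuracy")
      print("5x2-fold balanced accuracy: %.3f" % scores.mean())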

  16. An Automated Approach for Ranking Journals to Help in Clinician Decision Support

    PubMed Central

    Jonnalagadda, Siddhartha R.; Moosavinasab, Soheil; Nath, Chinmoy; Li, Dingcheng; Chute, Christopher G.; Liu, Hongfang

    2014-01-01

    Point-of-care access to knowledge from full-text journal articles supports decision-making and decreases medical errors. However, it is an overwhelming task to search through full-text journal articles and find the quality information needed by clinicians. We developed a method to rate journals for a given clinical topic, congestive heart failure (CHF). Our method enables filtering of journals and ranking of journal articles based on the source journal in relation to CHF. We also obtained a journal priority score, which automatically rates any journal based on its importance to CHF. Comparing our ranking with data gathered by surveying 169 cardiologists who publish on CHF, our best multiple linear regression model showed a correlation of 0.880, based on five-fold cross-validation. Our ranking system can be extended to other clinical topics. PMID:25954382

  17. Validity Evidence in Scale Development: The Application of Cross Validation and Classification-Sequencing Validation

    ERIC Educational Resources Information Center

    Acar, Tülin

    2014-01-01

    In the literature, it has been observed that many enhanced criteria are limited by factor analysis techniques. Besides examinations of statistical structure and/or psychological structure, validity studies such as cross-validation and classification-sequencing studies should be performed frequently. The purpose of this study is to examine cross…

  18. Multi-parameter machine learning approach to the neuroanatomical basis of developmental dyslexia.

    PubMed

    Płoński, Piotr; Gradkowski, Wojciech; Altarelli, Irene; Monzalvo, Karla; van Ermingen-Marbach, Muna; Grande, Marion; Heim, Stefan; Marchewka, Artur; Bogorodzki, Piotr; Ramus, Franck; Jednoróg, Katarzyna

    2017-02-01

    Despite decades of research, the anatomical abnormalities associated with developmental dyslexia are still not fully described. Studies have focused on between-group comparisons in which different neuroanatomical measures were generally explored in isolation, disregarding potential interactions between regions and measures. Here, for the first time, a multivariate classification approach was used to investigate grey matter disruptions in children with dyslexia in a large (N = 236) multisite sample. A variety of cortical morphological features, including volumetric (volume, thickness and area) and geometric (folding index and mean curvature) measures, were taken into account, and generalizability of classification was assessed with both 10-fold and leave-one-out cross-validation (LOOCV) techniques. Classification into control vs. dyslexic subjects achieved above-chance accuracy (AUC = 0.66 and ACC = 0.65 in the case of 10-fold CV, and AUC = 0.65 and ACC = 0.64 using LOOCV) after principled feature selection. Features that discriminated between dyslexic and control children were exclusively situated in the left hemisphere, including the superior and middle temporal gyri, subparietal sulcus and prefrontal areas. They were related to geometric properties of the cortex, with generally higher mean curvature and a greater folding index characterizing the dyslexic group. Our results support the hypothesis that an atypical curvature pattern with extra folds in left-hemispheric perisylvian regions characterizes dyslexia. Hum Brain Mapp 38:900-908, 2017. © 2016 Wiley Periodicals, Inc.

  19. Analysis of a crossed Bragg-cell acousto optical spectrometer for SETI

    NASA Technical Reports Server (NTRS)

    Gulkis, S.

    1986-01-01

    The search for extraterrestrial intelligence (SETI) via radio signals requires spectrometers with large instantaneous bandwidth (500 MHz) and high resolution (20 Hz). Digital systems with a high degree of modularity can be used to provide this capability, and this method has been widely discussed. Another technique for meeting the SETI requirement is to use a crossed Bragg-cell spectrometer as described by Psaltis and Casasent (1979). This technique makes use of the Folded Spectrum concept, introduced by Thomas (1966). The Folded Spectrum is a two-dimensional Fourier transform of a raster-scanned one-dimensional signal. It is directly related to the long one-dimensional spectrum of the original signal and is ideally suited for optical signal processing.

  20. Analysis of a crossed Bragg-cell acousto optical spectrometer for SETI

    NASA Astrophysics Data System (ADS)

    Gulkis, S.

    1986-10-01

    The search for extraterrestrial intelligence (SETI) via radio signals requires spectrometers with large instantaneous bandwidth (500 MHz) and high resolution (20 Hz). Digital systems with a high degree of modularity can be used to provide this capability, and this method has been widely discussed. Another technique for meeting the SETI requirement is to use a crossed Bragg-cell spectrometer as described by Psaltis and Casasent (1979). This technique makes use of the Folded Spectrum concept, introduced by Thomas (1966). The Folded Spectrum is a two-dimensional Fourier transform of a raster-scanned one-dimensional signal. It is directly related to the long one-dimensional spectrum of the original signal and is ideally suited for optical signal processing.

  1. Cross-Validation of Survival Bump Hunting by Recursive Peeling Methods.

    PubMed

    Dazard, Jean-Eudes; Choe, Michael; LeBlanc, Michael; Rao, J Sunil

    2014-08-01

    We introduce a survival/risk bump hunting framework to build a bump hunting model with a possibly censored time-to-event type of response and to validate model estimates. First, we describe the use of adequate survival peeling criteria to build a survival/risk bump hunting model based on recursive peeling methods. Our method called "Patient Recursive Survival Peeling" is a rule-induction method that makes use of specific peeling criteria such as hazard ratio or log-rank statistics. Second, to validate our model estimates and improve survival prediction accuracy, we describe a resampling-based validation technique specifically designed for the joint task of decision rule making by recursive peeling (i.e. decision-box) and survival estimation. This alternative technique, called "combined" cross-validation is done by combining test samples over the cross-validation loops, a design allowing for bump hunting by recursive peeling in a survival setting. We provide empirical results showing the importance of cross-validation and replication.

  2. Cross-Validation of Survival Bump Hunting by Recursive Peeling Methods

    PubMed Central

    Dazard, Jean-Eudes; Choe, Michael; LeBlanc, Michael; Rao, J. Sunil

    2015-01-01

    We introduce a survival/risk bump hunting framework to build a bump hunting model with a possibly censored time-to-event type of response and to validate model estimates. First, we describe the use of adequate survival peeling criteria to build a survival/risk bump hunting model based on recursive peeling methods. Our method called “Patient Recursive Survival Peeling” is a rule-induction method that makes use of specific peeling criteria such as hazard ratio or log-rank statistics. Second, to validate our model estimates and improve survival prediction accuracy, we describe a resampling-based validation technique specifically designed for the joint task of decision rule making by recursive peeling (i.e. decision-box) and survival estimation. This alternative technique, called “combined” cross-validation is done by combining test samples over the cross-validation loops, a design allowing for bump hunting by recursive peeling in a survival setting. We provide empirical results showing the importance of cross-validation and replication. PMID:26997922

  3. Joint use of over- and under-sampling techniques and cross-validation for the development and assessment of prediction models.

    PubMed

    Blagus, Rok; Lusa, Lara

    2015-11-04

    Prediction models are used in clinical research to develop rules that can be used to accurately predict the outcome of the patients based on some of their characteristics. They represent a valuable tool in the decision making process of clinicians and health policy makers, as they enable them to estimate the probability that patients have or will develop a disease, will respond to a treatment, or that their disease will recur. The interest devoted to prediction models in the biomedical community has been growing in the last few years. Often the data used to develop the prediction models are class-imbalanced as only few patients experience the event (and therefore belong to minority class). Prediction models developed using class-imbalanced data tend to achieve sub-optimal predictive accuracy in the minority class. This problem can be diminished by using sampling techniques aimed at balancing the class distribution. These techniques include under- and oversampling, where a fraction of the majority class samples are retained in the analysis or new samples from the minority class are generated. The correct assessment of how the prediction model is likely to perform on independent data is of crucial importance; in the absence of an independent data set, cross-validation is normally used. While the importance of correct cross-validation is well documented in the biomedical literature, the challenges posed by the joint use of sampling techniques and cross-validation have not been addressed. We show that care must be taken to ensure that cross-validation is performed correctly on sampled data, and that the risk of overestimating the predictive accuracy is greater when oversampling techniques are used. Examples based on the re-analysis of real datasets and simulation studies are provided. We identify some results from the biomedical literature where the incorrect cross-validation was performed, where we expect that the performance of oversampling techniques was heavily overestimated.
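
    The pitfall described, oversampling before splitting versus the correct per-fold application, can be made concrete with imbalanced-learn; on class-imbalanced null data the incorrect protocol reports inflated accuracy. A sketch with a synthetic dataset:

      import numpy as np
      from imblearn.over_sampling import SMOTE
      from imblearn.pipeline import Pipeline
      from sklearn.neighbors import KNeighborsClassifier
      from sklearn.model_selection import cross_val_score, StratifiedKFold

      rng = np.random.default_rng(9)
      X = rng.normal(size=(300, 20))
      y = np.r_[np.ones(30), np.zeros(270)].astype(int)   # 10% minority, no real signal

      cv = StratifiedKFold(5, shuffle=True, random_state=9)

      # WRONG: oversample first, then cross-validate; synthetic copies of a sample
      # can land in both training and test folds, leaking information
      X_os, y_os = SMOTE(random_state=9).fit_resample(X, y)
      wrong = cross_val_score(KNeighborsClassifier(), X_os, y_os, cv=cv).mean()

      # RIGHT: oversample inside the pipeline so SMOTE sees only each training fold
      pipe = Pipeline([("smote", SMOTE(random_state=9)), ("knn", KNeighborsClassifier())])
      right = cross_val_score(pipe, X, y, cv=cv).mean()
      print("oversample-then-CV: %.2f   CV-with-in-fold-SMOTE: %.2f" % (wrong, right))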

  4. A computer-aided diagnosis system to detect pathologies in temporal subtraction images of chest radiographs

    NASA Astrophysics Data System (ADS)

    Looper, Jared; Harrison, Melanie; Armato, Samuel G.

    2016-03-01

    Radiologists often compare sequential radiographs to identify areas of pathologic change; however, this process is prone to error, as human anatomy can obscure the regions of change, causing radiologists to overlook pathology. Temporal subtraction (TS) images can provide enhanced visualization of regions of change in sequential radiographs and allow radiologists to better detect areas of change. Not all areas of change shown in TS images, however, are actual pathology. The purpose of this study was to create a computer-aided diagnostic (CAD) system that identifies which regions of change are caused by pathology and which are caused by misregistration of the radiographs used to create the TS image. The dataset used in this study contained 120 images with 74 pathologic regions on 54 images outlined by an experienced radiologist. High and low ("light" and "dark") gray-level candidate regions were extracted from the images using gray-level thresholding. Then, sampling techniques were used to address the class imbalance problem between "true" and "false" candidate regions. Next, the datasets of light candidate regions, dark candidate regions, and the combined set of light and dark candidate regions were used as training and testing data for classifiers using five-fold cross-validation. Of the classifiers tested (support vector machines, discriminant analyses, logistic regression, and k-nearest neighbors), the support vector machine on the combined candidates using the synthetic minority oversampling technique (SMOTE) performed best, with an area under the receiver operating characteristic curve of 0.85, a sensitivity of 85%, and a specificity of 84%.

  5. [Lateral fixation of the vocal fold (Lichtenberger's technique): interest in the bilateral laryngeal immobilities].

    PubMed

    Pérouse, R; Coulombeau, B; Arias, C; Casanova, C

    2006-01-01

    In patients presenting with bilateral laryngeal immobility, the potential reversibility of certain cases and the refusal or poor tolerance of long-term tracheotomy raise the question of the choice of surgical technique, if one is indicated. To report our experience with the technique of lateralization of the paralyzed vocal fold (arytenoidopexy) proposed by Lichtenberger. After describing the technique, we report 5 cases (3 post-thyroidectomy, 1 of central origin, 1 post-burn). From 1 to 12 months after surgery, 2 patients were fully satisfied, one patient (central origin) recovered spontaneously after a month, and the last 2 had a partial result. Only one patient required several surgical procedures. Lichtenberger's technique combines theoretical reversibility with conservation of a functional glottic plane, and it avoids tracheotomy. In our view, this approach can validly replace the traditional techniques: medium- or long-term tracheotomy, or endoscopic arytenoid or posterior vocal fold resection.

  6. Mechanical versus kinematical shortening reconstructions of the Zagros High Folded Zone (Kurdistan region of Iraq)

    NASA Astrophysics Data System (ADS)

    Frehner, Marcel; Reif, Daniel; Grasemann, Bernhard

    2012-06-01

    This paper compares kinematical and mechanical techniques for the palinspastic reconstruction of folded cross sections in collision orogens. The studied area and the reconstructed NE-SW trending, 55.5 km long cross section is located in the High Folded Zone of the Zagros fold-and-thrust belt in the Kurdistan region of Iraq. The present-day geometry of the cross section has been constructed from field as well as remote sensing data. In a first step, the structures and the stratigraphy are simplified and summarized in eight units trying to identify the main geometric and mechanical parameters. In a second step, the shortening is kinematically estimated using the dip domain method to 11%-15%. Then the same cross section is used in a numerical finite element model to perform dynamical unfolding simulations taking various rheological parameters into account. The main factor allowing for an efficient dynamic unfolding is the presence of interfacial slip conditions between the mechanically strong units. Other factors, such as Newtonian versus power law viscous rheology or the presence of a basement, affect the numerical simulations much less strongly. If interfacial slip is accounted for, fold amplitudes are reduced efficiently during the dynamical unfolding simulations, while welded layer interfaces lead to unrealistic shortening estimates. It is suggested that interfacial slip and decoupling of the deformation along detachment horizons is an important mechanical parameter that controlled the folding processes in the Zagros High Folded Zone.

  7. Mechanical versus kinematical shortening reconstructions of the Zagros High Folded Zone (Kurdistan Region of Iraq)

    NASA Astrophysics Data System (ADS)

    Frehner, M.; Reif, D.; Grasemann, B.

    2012-04-01

    Our study compares kinematical and mechanical techniques for the palinspastic reconstruction of folded cross-sections in collision orogens. The studied area and the reconstructed NE-SW-trending, 55.5 km long cross-section is located in the High Folded Zone of the Zagros fold-and-thrust belt in the Kurdistan Region of Iraq. The present-day geometry of the cross-section has been constructed from field, as well as remote sensing data. In a first step, the structures and the stratigraphy are simplified and summarized in eight units, aiming to identify the main geometric and mechanical parameters. In a second step, the shortening is kinematically estimated at 11%-15% using the dip-domain method. Then the same cross-section is used in a numerical finite-element model to perform dynamical unfolding simulations taking various rheological parameters into account. The main factor allowing for an efficient dynamic unfolding is the presence of interfacial slip conditions between the mechanically strong units. Other factors, such as Newtonian vs. power-law viscous rheology or the presence of a basement, affect the numerical simulations much less strongly. If interfacial slip is accounted for, fold amplitudes are reduced efficiently during the dynamical unfolding simulations, while welded layer interfaces lead to unrealistic shortening estimates. It is suggested that interfacial slip and decoupling of the deformation along detachment horizons is an important mechanical parameter that controlled the folding processes in the Zagros High Folded Zone.

  8. Signal processing and neural network toolbox and its application to failure diagnosis and prognosis

    NASA Astrophysics Data System (ADS)

    Tu, Fang; Wen, Fang; Willett, Peter K.; Pattipati, Krishna R.; Jordan, Eric H.

    2001-07-01

    Many systems are composed of components equipped with self-testing capability; however, if the system is complex, involving feedback, and the self-testing itself may occasionally be faulty, tracing faults to a single cause or multiple causes is difficult. Moreover, many sensors are incapable of reliable decision-making on their own. In such cases, a signal processing front-end that can match inference needs will be very helpful. This work is concerned with providing an object-oriented simulation environment for signal processing and neural network-based fault diagnosis and prognosis. In the toolbox, we implemented a wide range of spectral and statistical manipulation methods, such as filters, harmonic analyzers, transient detectors, and multi-resolution decomposition, to extract features for failure events from data collected by sensors. We then evaluated multiple learning paradigms for general classification, diagnosis and prognosis. The network models evaluated include Restricted Coulomb Energy (RCE) Neural Network, Learning Vector Quantization (LVQ), Decision Trees (C4.5), Fuzzy Adaptive Resonance Theory (FuzzyArtmap), Linear Discriminant Rule (LDR), Quadratic Discriminant Rule (QDR), Radial Basis Functions (RBF), Multiple Layer Perceptrons (MLP) and Single Layer Perceptrons (SLP). Validation techniques, such as N-fold cross-validation and bootstrap techniques, are employed to evaluate the robustness of the network models. The trained networks are evaluated using test data on the basis of percent error rates obtained via cross-validation, time efficiency, and generalization ability to unseen faults. Finally, the usage of neural networks for the prediction of the residual life of turbine blades with thermal barrier coatings is described, and the results are shown. The neural network toolbox has also been applied to fault diagnosis in mixed-signal circuits.
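
    As an illustration of the validation stage described here, the sketch below compares several of the named classifier families by percent error under N-fold cross-validation. It is written against scikit-learn with synthetic data, so the models (decision tree, linear and quadratic discriminant rules, multilayer perceptron) are generic stand-ins rather than the toolbox's own implementations.

        # Sketch: percent error rates via 10-fold cross-validation for several
        # classifier families named in the abstract (synthetic data).
        from sklearn.datasets import make_classification
        from sklearn.discriminant_analysis import (
            LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis)
        from sklearn.model_selection import cross_val_score
        from sklearn.neural_network import MLPClassifier
        from sklearn.tree import DecisionTreeClassifier

        X, y = make_classification(n_samples=400, n_features=12, random_state=1)

        models = {"decision tree": DecisionTreeClassifier(random_state=1),
                  "LDR": LinearDiscriminantAnalysis(),
                  "QDR": QuadraticDiscriminantAnalysis(),
                  "MLP": MLPClassifier(max_iter=2000, random_state=1)}
        for name, model in models.items():
            acc = cross_val_score(model, X, y, cv=10)  # N = 10 folds
            print(f"{name}: percent error = {100 * (1 - acc.mean()):.1f}%")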

  9. Proteome-wide characterization of signalling interactions in the hippocampal CA4/DG subfield of patients with Alzheimer’s disease

    PubMed Central

    Ho Kim, Jae; Franck, Julien; Kang, Taewook; Heinsen, Helmut; Ravid, Rivka; Ferrer, Isidro; Hee Cheon, Mi; Lee, Joo-Yong; Shin Yoo, Jong; Steinbusch, Harry W; Salzet, Michel; Fournier, Isabelle; Mok Park, Young

    2015-01-01

    Alzheimer’s disease (AD) is the most common form of dementia; however, its mechanisms and biomarkers remain unclear. Here, we examined hippocampal CA4 and dentate gyrus subfields, which are less studied in the context of AD pathology, in post-mortem AD and control tissue to identify possible biomarkers. We performed mass spectrometry-based proteomic analysis combined with label-free quantification for identification of differentially expressed proteins. We identified 4,328 proteins, of which 113 showed more than 2-fold higher or lower expression in AD hippocampi than in control tissues. Five proteins were identified as putative AD biomarkers (MDH2, PCLO, TRRAP, YWHAZ, and MUC19 isoform 5) and were cross-validated by immunoblotting, selected reaction monitoring, and MALDI imaging. We also used a bioinformatics approach to examine upstream signalling interactions of the 113 regulated proteins. Five upstream signalling factors (IGF1, BDNF, ZAP70, MYC, and cyclosporin A) showed novel interactions in AD hippocampi. Taken together, these results demonstrate a novel platform that may provide new strategies for the early detection of AD and thus its diagnosis. PMID:26059363

  10. Automatic diagnosis of tuberculosis disease based on Plasmonic ELISA and color-based image classification.

    PubMed

    AbuHassan, Kamal J; Bakhori, Noremylia M; Kusnin, Norzila; Azmi, Umi Z M; Tania, Marzia H; Evans, Benjamin A; Yusof, Nor A; Hossain, M A

    2017-07-01

    Tuberculosis (TB) remains one of the most devastating infectious diseases, and its treatment efficiency is strongly influenced by the stage at which infection with the TB bacterium is diagnosed. The available methods for TB diagnosis are time-consuming, costly or inefficient. This study employs a signal generation mechanism for biosensing, known as Plasmonic ELISA, and computational intelligence to facilitate automatic diagnosis of TB. Plasmonic ELISA enables the detection of a few molecules of analyte by the incorporation of smart nanomaterials for better sensitivity of the developed detection system. The computational system uses k-means clustering and thresholding for image segmentation. This paper presents the results of the classification performance of the Plasmonic ELISA imaging data by using various types of classifiers. The five-fold cross-validation results show a high accuracy rate (>97%) in classifying TB images using the entire data set. Future work will focus on developing an intelligent mobile-enabled expert system to diagnose TB in real-time. The intelligent system will be clinically validated and tested in collaboration with healthcare providers in Malaysia.
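
    The segmentation step lends itself to a short sketch: k-means clustering of pixel colours followed by a threshold on the cluster of interest. The fragment below uses scikit-learn on a synthetic RGB array; treating the bluest cluster as the diagnostic colour signal is an illustrative assumption, not the paper's calibrated rule.

        # Sketch: k-means colour clustering plus thresholding (synthetic image).
        import numpy as np
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        image = rng.random((64, 64, 3))          # stand-in for an assay photo

        pixels = image.reshape(-1, 3)            # one row per pixel (R, G, B)
        km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(pixels)
        labels = km.labels_.reshape(64, 64)

        # Illustrative rule: keep the cluster whose centre is most blue.
        target = int(np.argmax(km.cluster_centers_[:, 2]))
        mask = labels == target
        print(f"segmented fraction: {mask.mean():.2%}")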

  11. Predicting drug-induced liver injury using ensemble learning methods and molecular fingerprints.

    PubMed

    Ai, Haixin; Chen, Wen; Zhang, Li; Huang, Liangchao; Yin, Zimo; Hu, Huan; Zhao, Qi; Zhao, Jian; Liu, Hongsheng

    2018-05-21

    Drug-induced liver injury (DILI) is a major safety concern in the drug-development process, and various methods have been proposed to predict the hepatotoxicity of compounds during the early stages of drug trials. In this study, we developed an ensemble model using three machine learning algorithms and 12 molecular fingerprints from a dataset containing 1,241 diverse compounds. The ensemble model achieved an average accuracy of 71.1±2.6%, sensitivity of 79.9±3.6%, specificity of 60.3±4.8%, and area under the receiver operating characteristic curve (AUC) of 0.764±0.026 in five-fold cross-validation and an accuracy of 84.3%, sensitivity of 86.9%, specificity of 75.4%, and AUC of 0.904 in an external validation dataset of 286 compounds collected from the Liver Toxicity Knowledge Base (LTKB). Compared with previous methods, the ensemble model achieved relatively high accuracy and sensitivity. We also identified several substructures related to DILI. In addition, we provide a web server offering access to our models (http://ccsipb.lnu.edu.cn/toxicity/HepatoPred-EL/).

  12. Development of a Bayesian model to estimate health care outcomes in the severely wounded

    PubMed Central

    Stojadinovic, Alexander; Eberhardt, John; Brown, Trevor S; Hawksworth, Jason S; Gage, Frederick; Tadaki, Douglas K; Forsberg, Jonathan A; Davis, Thomas A; Potter, Benjamin K; Dunne, James R; Elster, E A

    2010-01-01

    Background: Graphical probabilistic models have the ability to provide insights as to how clinical factors are conditionally related. These models can be used to help us understand factors influencing health care outcomes and resource utilization, and to estimate morbidity and clinical outcomes in trauma patient populations. Study design: Thirty-two combat casualties with severe extremity injuries enrolled in a prospective observational study were analyzed using a step-wise machine-learned Bayesian belief network (BBN) and step-wise logistic regression (LR). Models were evaluated using 10-fold cross-validation to calculate the area-under-the-curve (AUC) from receiver operating characteristic (ROC) curves. Results: Our BBN showed important associations between various factors in our data set that could not be developed using standard regression methods. Cross-validated ROC curve analysis showed that our BBN model was a robust representation of our data domain and that LR models trained on these findings were also robust: hospital-acquired infection (AUC: LR, 0.81; BBN, 0.79), intensive care unit length of stay (AUC: LR, 0.97; BBN, 0.81), and wound healing (AUC: LR, 0.91; BBN, 0.72) all showed strong AUC values. Conclusions: A BBN model can effectively represent clinical outcomes and biomarkers in patients hospitalized after severe wounding, as confirmed by 10-fold cross-validation and further supported by logistic regression modeling. The method warrants further development and independent validation in other, more diverse patient populations. PMID:21197361

  13. Machine learning for the assessment of Alzheimer's disease through DTI

    NASA Astrophysics Data System (ADS)

    Lella, Eufemia; Amoroso, Nicola; Bellotti, Roberto; Diacono, Domenico; La Rocca, Marianna; Maggipinto, Tommaso; Monaco, Alfonso; Tangaro, Sabina

    2017-09-01

    Digital imaging techniques have found several medical applications in the development of computer aided detection systems, especially in neuroimaging. Recent advances in Diffusion Tensor Imaging (DTI) aim to discover biological markers for the early diagnosis of Alzheimer's disease (AD), one of the most widespread neurodegenerative disorders. We explore here how different supervised classification models provide a robust support to the diagnosis of AD patients. We use DTI measures, assessing the structural integrity of white matter (WM) fiber tracts, to reveal patterns of disrupted brain connectivity. In particular, we provide voxel-wise measures of fractional anisotropy (FA) and mean diffusivity (MD), thus identifying the regions of the brain most affected by neurodegeneration, and then compute intensity features to feed supervised classification algorithms. We evaluate the accuracy of discriminating AD patients from healthy controls (HC) with a dataset of 80 subjects (40 HC, 40 AD) from the Alzheimer's Disease Neuroimaging Initiative (ADNI). In this study, we compare three state-of-the-art classification models: Random Forests, Naive Bayes and Support Vector Machines (SVMs). We use a repeated five-fold cross-validation framework with nested feature selection to perform a fair comparison between these algorithms and to evaluate the information content they provide. Results show that AD patterns are well localized within the brain, thus DTI features can support the AD diagnosis.
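
    A minimal sketch of this evaluation protocol, repeated five-fold cross-validation with feature selection nested inside each training fold so that the held-out fold never informs the selection, might look as follows in scikit-learn. The data are synthetic stand-ins for the voxel-wise FA/MD features.

        # Sketch: repeated 5-fold CV with nested univariate feature selection,
        # comparing the three classifier families named in the abstract.
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.feature_selection import SelectKBest, f_classif
        from sklearn.model_selection import (RepeatedStratifiedKFold,
                                             cross_val_score)
        from sklearn.naive_bayes import GaussianNB
        from sklearn.pipeline import Pipeline
        from sklearn.svm import SVC

        X, y = make_classification(n_samples=80, n_features=500,
                                   n_informative=20, random_state=0)

        cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=5, random_state=0)
        for name, clf in [("Random Forest", RandomForestClassifier(random_state=0)),
                          ("Naive Bayes", GaussianNB()),
                          ("SVM", SVC())]:
            # Selection runs inside each fold, so it never sees test subjects.
            pipe = Pipeline([("select", SelectKBest(f_classif, k=50)),
                             ("clf", clf)])
            acc = cross_val_score(pipe, X, y, cv=cv)
            print(f"{name}: {acc.mean():.2f} +/- {acc.std():.2f}")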

  14. Automatic classification of tissue malignancy for breast carcinoma diagnosis.

    PubMed

    Fondón, Irene; Sarmiento, Auxiliadora; García, Ana Isabel; Silvestre, María; Eloy, Catarina; Polónia, António; Aguiar, Paulo

    2018-05-01

    Breast cancer is the second leading cause of cancer death among women. Its early diagnosis is extremely important to prevent avoidable deaths. However, malignancy assessment of tissue biopsies is complex and dependent on observer subjectivity. Moreover, hematoxylin and eosin (H&E)-stained histological images exhibit a highly variable appearance, even within the same malignancy level. In this paper, we propose a computer-aided diagnosis (CAD) tool for automated malignancy assessment of breast tissue samples based on the processing of histological images. We provide four malignancy levels as the output of the system: normal, benign, in situ and invasive. The method is based on the calculation of three sets of features related to nuclei, colour regions and textures, considering local characteristics and global image properties. By taking advantage of well-established image processing techniques, we build a feature vector for each image that serves as an input to an SVM (Support Vector Machine) classifier with a quadratic kernel. The method has been rigorously evaluated, first with a 5-fold cross-validation within an initial set of 120 images, second with an external set of 30 different images and third with images with artefacts included. Accuracy was 75.8% under the 5-fold cross-validation, 75% on the external set of new images, and 61.11% when the extremely difficult images were added to the classification experiment. The experimental results indicate that the proposed method is capable of distinguishing between four malignancy levels with high accuracy. Our results are close to those obtained with recent deep learning-based methods. Moreover, it performs better than other state-of-the-art methods based on feature extraction, and it can help improve the CAD of breast cancer. Copyright © 2018 Elsevier Ltd. All rights reserved.

  15. [Study of adaptation and validation of the Practice environment scale of the nursing work index for the Portuguese reality].

    PubMed

    Ferreira, Maria Regina Sardinheiro do Céu Furtado; Martins, José Joaquim Penedos Amendoeira

    2014-08-01

    To test the psychometric properties of the Portuguese version of the Practice Environment Scale of the Nursing Work Index. A descriptive, analytical and cross-sectional study for the cross-cultural adaptation and validation of the psychometric properties of the scale. The study participants were 236 nurses from two hospitals in the regions of Lisbon and Vale do Tejo. A Cronbach's alpha of 0.92 was obtained for overall reliability, supporting a five-dimension structure. The excellent goodness of fit of the analysis confirms the validity of the adapted version for hospital care settings, although the items did not coincide completely across the five dimensions.
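
    The reported reliability figure can be reproduced in form with the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - (sum of item variances)/(variance of the total score)). The sketch below implements it on simulated questionnaire responses; the 236-respondent, 10-item matrix is hypothetical, not the study data.

        # Sketch: Cronbach's alpha for a respondents-by-items score matrix.
        import numpy as np

        def cronbach_alpha(scores: np.ndarray) -> float:
            """alpha = k/(k-1) * (1 - sum(item variances) / var(total score))."""
            k = scores.shape[1]                          # number of items
            item_var = scores.var(axis=0, ddof=1).sum()  # per-item variances
            total_var = scores.sum(axis=1).var(ddof=1)   # variance of sum score
            return k / (k - 1) * (1 - item_var / total_var)

        rng = np.random.default_rng(0)
        trait = rng.normal(size=(236, 1))                 # shared latent trait
        items = trait + 0.5 * rng.normal(size=(236, 10))  # 10 correlated items
        print(f"alpha = {cronbach_alpha(items):.2f}")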

  16. Partial wave analysis for folded differential cross sections

    NASA Astrophysics Data System (ADS)

    Machacek, J. R.; McEachran, R. P.

    2018-03-01

    The value of modified effective range theory (MERT) and the connection between differential cross sections and phase shifts in low-energy electron scattering have long been recognized. Recent experimental techniques involving magnetically confined beams have introduced the concept of folded differential cross sections (FDCS), where the forward (θ ≤ π/2) and backward scattered (θ ≥ π/2) projectiles are unresolved; that is, the value measured at the angle θ is the sum of the signals for particles scattered into the angles θ and π - θ. We have developed an alternative approach to MERT in order to analyse low-energy folded differential cross sections for positrons and electrons. This results in a simplified expression for the FDCS when it is expressed in terms of partial waves and thereby enables one to extract the first few phase shifts from a fit to an experimental FDCS at low energies. Thus, this method predicts forward and backward angle scattering (0 to π) using only experimental FDCS data and can be used to determine the total elastic cross section solely from experimental results at low energy, which are limited in angular range.
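
    In standard notation (a hedged reconstruction, not necessarily the authors' exact expression), the folded differential cross section and the partial-wave expansion it is fitted with read:

        \[
          \mathrm{FDCS}(\theta)
            = \frac{d\sigma}{d\Omega}(\theta)
            + \frac{d\sigma}{d\Omega}(\pi - \theta),
          \qquad 0 \le \theta \le \pi/2,
        \]
        \[
          \frac{d\sigma}{d\Omega}(\theta)
            = \frac{1}{k^{2}}
              \left|\sum_{\ell}(2\ell+1)\, e^{i\delta_{\ell}}
                    \sin\delta_{\ell}\, P_{\ell}(\cos\theta)\right|^{2}.
        \]

    Because P_l(-x) = (-1)^l P_l(x), the interference terms between partial waves of opposite parity cancel when the two angles are summed, which is what reduces the FDCS to a simple function of the first few phase shifts at low energy.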

  17. Development of novel in silico model for developmental toxicity assessment by using naïve Bayes classifier method.

    PubMed

    Zhang, Hui; Ren, Ji-Xia; Kang, Yan-Li; Bo, Peng; Liang, Jun-Yu; Ding, Lan; Kong, Wei-Bao; Zhang, Ji

    2017-08-01

    Toxicological testing associated with developmental toxicity endpoints is very expensive, time consuming and labor intensive. Thus, developing alternative approaches for developmental toxicity testing is an important and urgent task in the drug development field. In this investigation, the naïve Bayes classifier was applied to develop a novel prediction model for developmental toxicity. The established prediction model was evaluated by internal 5-fold cross-validation and an external test set. The overall prediction accuracies for the internal 5-fold cross-validation of the training set and for the external test set were 96.6% and 82.8%, respectively. In addition, four simple descriptors and some representative substructures of developmental toxicants were identified. Thus, we hope the established in silico prediction model can be used as an alternative method for toxicological assessment. This molecular information could afford a deeper understanding of developmental toxicants and provide guidance for medicinal chemists working in drug discovery and lead optimization. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. Development of a five-year mortality model in systemic sclerosis patients by different analytical approaches.

    PubMed

    Beretta, Lorenzo; Santaniello, Alessandro; Cappiello, Francesca; Chawla, Nitesh V; Vonk, Madelon C; Carreira, Patricia E; Allanore, Yannick; Popa-Diaconu, D A; Cossu, Marta; Bertolotti, Francesca; Ferraccioli, Gianfranco; Mazzone, Antonino; Scorza, Raffaella

    2010-01-01

    Systemic sclerosis (SSc) is a multiorgan disease with high mortality rates. Several clinical features have been associated with poor survival in different populations of SSc patients, but no clear and reproducible prognostic model to assess individual survival prediction in scleroderma patients has ever been developed. We used Cox regression and three data mining-based classifiers (Naïve Bayes Classifier [NBC], Random Forests [RND-F] and logistic regression [Log-Reg]) to develop a robust and reproducible 5-year prognostic model. All the models were built and internally validated by means of 5-fold cross-validation on a population of 558 Italian SSc patients. Their predictive ability and capability of generalisation were then tested on an independent population of 356 patients recruited from 5 external centres and finally compared to the predictions made by two SSc domain experts on the same population. The NBC outperformed the Cox-based classifier and the other data mining algorithms after internal cross-validation (area under the receiver operating characteristic curve, AUROC: NBC=0.759; RND-F=0.736; Log-Reg=0.754 and Cox=0.724). The NBC also had a remarkably better trade-off between sensitivity and specificity (balanced accuracy, BA) than the Cox-based classifier when tested on an independent population of SSc patients (BA: NBC=0.769, Cox=0.622). The NBC was also superior to domain experts in predicting 5-year survival in this population (AUROC=0.829 vs. AUROC=0.788 and BA=0.769 vs. BA=0.67). We provide a model to make consistent 5-year prognostic predictions in SSc patients. Its internal validity, as well as its capability of generalisation and reduced uncertainty compared to human experts, supports its use at the bedside. Available at: http://www.nd.edu/~nchawla/survival.xls.

  19. Polarization and Color Filtering Applied to Enhance Photogrammetric Measurements of Reflective Surfaces

    NASA Technical Reports Server (NTRS)

    Wells, Jeffrey M.; Jones, Thomas W.; Danehy, Paul M.

    2005-01-01

    Techniques for enhancing the photogrammetric measurement of reflective surfaces by reducing noise were developed utilizing principles of light polarization. Signal selectivity with polarized light was also compared to signal selectivity using chromatic filters. Combining principles of linear cross polarization and color selectivity enhanced signal-to-noise ratios by as much as 800-fold. More typical improvements from combining polarization and color selectivity were about 100-fold. We review polarization-based techniques and present experimental results comparing the performance of traditional retroreflective targeting materials, cornercube targets returning depolarized light, and color selectivity.

  20. Thiamethoxam Resistance in the House Fly, Musca domestica L.: Current Status, Resistance Selection, Cross-Resistance Potential and Possible Biochemical Mechanisms.

    PubMed

    Khan, Hafiz Azhar Ali; Akram, Waseem; Iqbal, Javaid; Naeem-Ullah, Unsar

    2015-01-01

    The house fly, Musca domestica L., is an important insect pest with the ability to develop resistance to the insecticides used for its control. Thiamethoxam, a neonicotinoid, is a relatively new insecticide that is used effectively against house flies, with a few reports of resistance around the globe. To understand the status of resistance to thiamethoxam, eight adult house fly strains were evaluated under laboratory conditions. In addition, to assess the risks of resistance development, cross-resistance potential and possible biochemical mechanisms, a field strain of house flies was selected with thiamethoxam in the laboratory. The results revealed that the field strains showed varying levels of resistance to thiamethoxam, with resistance ratios (RR) at LC50 ranging from 7.66- to 20.13-fold. Continuous selection of the field strain (Thia-SEL) for five generations increased the RR from the initial 7.66-fold to 33.59-fold. However, resistance declined significantly when the Thia-SEL strain was reared for the next five generations without exposure to thiamethoxam. Compared to the laboratory susceptible reference strain (Lab-susceptible), the Thia-SEL strain showed cross-resistance to imidacloprid. Synergism tests revealed that S,S,S-tributylphosphorotrithioate (DEF) and piperonyl butoxide (PBO) synergized the effects of thiamethoxam in the Thia-SEL strain (2.94- and 5.00-fold, respectively). In addition, biochemical analyses revealed that the activities of carboxylesterase (CarE) and mixed function oxidase (MFO) in the Thia-SEL strain were significantly higher than in the Lab-susceptible strain. It appears that metabolic detoxification by CarE and MFO was a major mechanism of thiamethoxam resistance in the Thia-SEL strain of house flies. The results could be helpful in the future for developing an improved control strategy against house flies.

  1. Thiamethoxam Resistance in the House Fly, Musca domestica L.: Current Status, Resistance Selection, Cross-Resistance Potential and Possible Biochemical Mechanisms

    PubMed Central

    Khan, Hafiz Azhar Ali; Akram, Waseem; Iqbal, Javaid; Naeem-Ullah, Unsar

    2015-01-01

    The house fly, Musca domestica L., is an important insect pest with the ability to develop resistance to the insecticides used for its control. Thiamethoxam, a neonicotinoid, is a relatively new insecticide that is used effectively against house flies, with a few reports of resistance around the globe. To understand the status of resistance to thiamethoxam, eight adult house fly strains were evaluated under laboratory conditions. In addition, to assess the risks of resistance development, cross-resistance potential and possible biochemical mechanisms, a field strain of house flies was selected with thiamethoxam in the laboratory. The results revealed that the field strains showed varying levels of resistance to thiamethoxam, with resistance ratios (RR) at LC50 ranging from 7.66- to 20.13-fold. Continuous selection of the field strain (Thia-SEL) for five generations increased the RR from the initial 7.66-fold to 33.59-fold. However, resistance declined significantly when the Thia-SEL strain was reared for the next five generations without exposure to thiamethoxam. Compared to the laboratory susceptible reference strain (Lab-susceptible), the Thia-SEL strain showed cross-resistance to imidacloprid. Synergism tests revealed that S,S,S-tributylphosphorotrithioate (DEF) and piperonyl butoxide (PBO) synergized the effects of thiamethoxam in the Thia-SEL strain (2.94- and 5.00-fold, respectively). In addition, biochemical analyses revealed that the activities of carboxylesterase (CarE) and mixed function oxidase (MFO) in the Thia-SEL strain were significantly higher than in the Lab-susceptible strain. It appears that metabolic detoxification by CarE and MFO was a major mechanism of thiamethoxam resistance in the Thia-SEL strain of house flies. The results could be helpful in the future for developing an improved control strategy against house flies. PMID:25938578

  2. Predicting introductory programming performance: A multi-institutional multivariate study

    NASA Astrophysics Data System (ADS)

    Bergin, Susan; Reilly, Ronan

    2006-12-01

    A model for predicting student performance on introductory programming modules is presented. The model uses attributes identified in a study carried out at four third-level institutions in the Republic of Ireland. Four instruments were used to collect the data and over 25 attributes were examined. A data reduction technique was applied and a logistic regression model using 10-fold stratified cross validation was developed. The model used three attributes: Leaving Certificate Mathematics result (final mathematics examination at second level), number of hours playing computer games while taking the module and programming self-esteem. Prediction success was significant with 80% of students correctly classified. The model also works well on a per-institution level. A discussion on the implications of the model is provided and future work is outlined.
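
    A minimal sketch of this modelling setup, logistic regression evaluated with 10-fold stratified cross-validation, is given below using scikit-learn. The three predictors are synthetic stand-ins for the mathematics result, gaming hours and programming self-esteem attributes, not the study's data.

        # Sketch: logistic regression with 10-fold stratified cross-validation.
        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import StratifiedKFold, cross_val_score

        rng = np.random.default_rng(0)
        X = rng.normal(size=(120, 3))               # three attributes/student
        y = (X @ np.array([1.2, -0.6, 0.9])
             + rng.normal(scale=0.8, size=120)) > 0  # pass/fail outcome

        cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
        acc = cross_val_score(LogisticRegression(), X, y, cv=cv)
        print(f"classification accuracy: {acc.mean():.0%}")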

  3. Modeling of autocatalytic hydrolysis of adefovir dipivoxil in solid formulations.

    PubMed

    Dong, Ying; Zhang, Yan; Xiang, Bingren; Deng, Haishan; Wu, Jingfang

    2011-04-01

    The stability and hydrolysis kinetics of a phosphate prodrug, adefovir dipivoxil, in solid formulations were studied. The stability relationship between five solid formulations was explored. An autocatalytic mechanism for hydrolysis could be proposed from the kinetic behavior, which fits the Prout-Tompkins model well. Because the classical kinetic models could hardly describe and predict the hydrolysis kinetics of adefovir dipivoxil in solid formulations accurately at high temperatures, a feedforward multilayer perceptron (MLP) neural network was constructed to model the hydrolysis kinetics. The built-in approaches in Weka, such as lazy classifiers and rule-based learners (IBk, KStar, DecisionTable and M5Rules), were used to verify the performance of the MLP. The predictability of the models was evaluated by 10-fold cross-validation and an external test set. The results reveal that the MLP should be generally applicable, offering an efficient alternative way to model and predict autocatalytic hydrolysis kinetics for phosphate prodrugs.
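
    For reference, the Prout-Tompkins model referred to here is the standard autocatalytic rate law, in which the hydrolysed fraction alpha both drives and is limited by the reaction:

        \[
          \frac{d\alpha}{dt} = k\,\alpha\,(1-\alpha)
          \quad\Longrightarrow\quad
          \ln\frac{\alpha}{1-\alpha} = k\,t + c,
        \]

    so a plot of ln(alpha/(1-alpha)) against time is linear with slope k whenever the mechanism holds.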

  4. Oral cancer prognosis based on clinicopathologic and genomic markers using a hybrid of feature selection and machine learning methods

    PubMed Central

    2013-01-01

    Background Machine learning techniques are becoming useful as an alternative approach to conventional medical diagnosis or prognosis as they are good for handling noisy and incomplete data, and significant results can be attained despite a small sample size. Traditionally, clinicians make prognostic decisions based on clinicopathologic markers. However, it is not easy for the most skilful clinician to come out with an accurate prognosis by using these markers alone. Thus, there is a need to use genomic markers to improve the accuracy of prognosis. The main aim of this research is to apply a hybrid of feature selection and machine learning methods in oral cancer prognosis based on the parameters of the correlation of clinicopathologic and genomic markers. Results In the first stage of this research, five feature selection methods have been proposed and experimented on the oral cancer prognosis dataset. In the second stage, the model with the features selected from each feature selection methods are tested on the proposed classifiers. Four types of classifiers are chosen; these are namely, ANFIS, artificial neural network, support vector machine and logistic regression. A k-fold cross-validation is implemented on all types of classifiers due to the small sample size. The hybrid model of ReliefF-GA-ANFIS with 3-input features of drink, invasion and p63 achieved the best accuracy (accuracy = 93.81%; AUC = 0.90) for the oral cancer prognosis. Conclusions The results revealed that the prognosis is superior with the presence of both clinicopathologic and genomic markers. The selected features can be investigated further to validate the potential of becoming as significant prognostic signature in the oral cancer studies. PMID:23725313

  5. RBF kernel based support vector regression to estimate the blood volume and heart rate responses during hemodialysis.

    PubMed

    Javed, Faizan; Chan, Gregory S H; Savkin, Andrey V; Middleton, Paul M; Malouf, Philip; Steel, Elizabeth; Mackie, James; Lovell, Nigel H

    2009-01-01

    This paper uses non-linear support vector regression (SVR) to model the blood volume and heart rate (HR) responses in 9 hemodynamically stable kidney failure patients during hemodialysis. Using radial basis function (RBF) kernels, non-parametric models of the relative blood volume (RBV) change with time, as well as the percentage change in HR with respect to RBV, were obtained. The ε-insensitive loss function was used for SVR modeling. Selection of the design parameters, which include the capacity (C), the insensitivity region (ε) and the RBF kernel parameter (σ), was made using a grid search approach, and the selected models were cross-validated using the average mean square error (AMSE) calculated from testing data based on a k-fold cross-validation technique. Linear regression was also applied to fit the curves, and the AMSE was calculated for comparison with SVR. For the model of RBV with time, SVR gave a lower AMSE for both training (AMSE=1.5) and testing data (AMSE=1.4) compared to linear regression (AMSE=1.8 and 1.5). SVR also provided a better fit for HR with RBV for both training and testing data (AMSE=15.8 and 16.4) compared to linear regression (AMSE=25.2 and 20.1).
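
    The grid-search and k-fold validation procedure maps naturally onto scikit-learn, as in the sketch below. The time/RBV series is simulated, and scikit-learn's gamma stands in for the RBF width parameter sigma (gamma = 1/(2 sigma^2)); none of the values are from the study.

        # Sketch: RBF-kernel SVR with a grid search over (C, epsilon, gamma)
        # scored by k-fold cross-validated mean squared error (toy data).
        import numpy as np
        from sklearn.model_selection import GridSearchCV
        from sklearn.svm import SVR

        rng = np.random.default_rng(0)
        t = np.sort(rng.uniform(0, 4, size=120))[:, None]  # time (arbitrary)
        rbv = -3.0 * (1 - np.exp(-t[:, 0])) + rng.normal(scale=0.3, size=120)

        grid = {"C": [1, 10, 100], "epsilon": [0.01, 0.1, 0.5],
                "gamma": [0.1, 1.0, 10.0]}
        search = GridSearchCV(SVR(kernel="rbf"), grid, cv=5,
                              scoring="neg_mean_squared_error")
        search.fit(t, rbv)
        print(f"best: {search.best_params_}, CV MSE = {-search.best_score_:.3f}")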

  6. Computational Depth of Anesthesia via Multiple Vital Signs Based on Artificial Neural Networks.

    PubMed

    Sadrawi, Muammar; Fan, Shou-Zen; Abbod, Maysam F; Jen, Kuo-Kuang; Shieh, Jiann-Shing

    2015-01-01

    This study evaluated a depth of anesthesia (DoA) index using artificial neural networks (ANN) as the modeling technique. In total, data from 63 patients are used: 17 for modeling and 46 for testing. Empirical mode decomposition (EMD) is utilized to separate the electroencephalography (EEG) signal from noise. A sample entropy index is subsequently extracted from the filtered EEG signal in 5-second windows. It is then combined with the mean values of other vital signs, that is, electromyography (EMG), heart rate (HR), pulse, systolic blood pressure (SBP), diastolic blood pressure (DBP), and signal quality index (SQI), as the input for evaluating the DoA index. Scores from 5 doctors are averaged to obtain the output index. The mean absolute error (MAE) is utilized as the performance evaluation, and 10-fold cross-validation is performed in order to generalize the model. The ANN model is compared with the bispectral index (BIS). The results show that the ANN is able to produce a lower MAE than BIS. The ANN also achieves a higher correlation coefficient than BIS on the 46-patient testing data. Sensitivity analysis and cross-validation are applied as well; the results indicate that EMG is the most influential input parameter.
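
    The sample entropy step can be sketched with a common simplified formulation of the Richman-Moorman estimator. The code below is illustrative, with a hypothetical 128 Hz, 5-second window; it is not the authors' implementation.

        # Sketch: sample entropy of a 1-D signal (simplified formulation).
        import numpy as np

        def sample_entropy(x, m=2, r=0.2):
            x = np.asarray(x, float)
            tol = r * x.std()                    # tolerance scaled to the signal
            n = len(x)
            def matches(length):
                # All overlapping templates of the given length.
                tpl = np.array([x[i:i + length] for i in range(n - length + 1)])
                count = 0
                for i in range(len(tpl) - 1):
                    # Chebyshev distance to all later templates (no self-match).
                    dist = np.abs(tpl[i + 1:] - tpl[i]).max(axis=1)
                    count += int((dist < tol).sum())
                return count
            return -np.log(matches(m + 1) / matches(m))

        rng = np.random.default_rng(0)
        eeg = rng.normal(size=5 * 128)   # one 5-second window at 128 Hz (toy)
        print(f"SampEn = {sample_entropy(eeg):.2f}")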

  7. Computational Depth of Anesthesia via Multiple Vital Signs Based on Artificial Neural Networks

    PubMed Central

    Sadrawi, Muammar; Fan, Shou-Zen; Abbod, Maysam F.; Jen, Kuo-Kuang; Shieh, Jiann-Shing

    2015-01-01

    This study evaluated a depth of anesthesia (DoA) index using artificial neural networks (ANN) as the modeling technique. In total, data from 63 patients are used: 17 for modeling and 46 for testing. Empirical mode decomposition (EMD) is utilized to separate the electroencephalography (EEG) signal from noise. A sample entropy index is subsequently extracted from the filtered EEG signal in 5-second windows. It is then combined with the mean values of other vital signs, that is, electromyography (EMG), heart rate (HR), pulse, systolic blood pressure (SBP), diastolic blood pressure (DBP), and signal quality index (SQI), as the input for evaluating the DoA index. Scores from 5 doctors are averaged to obtain the output index. The mean absolute error (MAE) is utilized as the performance evaluation, and 10-fold cross-validation is performed in order to generalize the model. The ANN model is compared with the bispectral index (BIS). The results show that the ANN is able to produce a lower MAE than BIS. The ANN also achieves a higher correlation coefficient than BIS on the 46-patient testing data. Sensitivity analysis and cross-validation are applied as well; the results indicate that EMG is the most influential input parameter. PMID:26568957

  8. γ production and neutron inelastic scattering cross sections for 76Ge

    NASA Astrophysics Data System (ADS)

    Rouki, C.; Domula, A. R.; Drohé, J. C.; Koning, A. J.; Plompen, A. J. M.; Zuber, K.

    2013-11-01

    The 2040.7-keV γ ray from the 69th excited state of 76Ge was investigated in the interest of Ge-based double-β-decay experiments like the Germanium Detector Array (GERDA) experiment. The predicted transition could interfere with valid 0νββ events at 2039.0 keV, creating false signals in large-volume 76Ge enriched detectors. The measurement was performed with the Gamma Array for Inelastic Neutron Scattering (GAINS) at the Geel Electron Linear Accelerator (GELINA) white neutron source, using the (n,n'γ) technique and focusing on the strongest γ rays originating from the level. Upper limits obtained for the production cross section of the 2040.7-keV γ ray showed no possible influence on GERDA data. Additional analysis of the data yielded high-resolution cross sections for the low-lying states of 76Ge and related γ rays, improving the accuracy and extending existing data for five transitions and five levels. The inelastic scattering cross section for 76Ge was determined for incident neutron energies up to 2.23 MeV, significantly increasing the energy range for which experimental data are available. Comparisons with model calculations using the talys code are presented indicating that accounting for the recently established asymmetric rotor structure should lead to an improved description of the data.

  9. Model-based and Model-free Machine Learning Techniques for Diagnostic Prediction and Classification of Clinical Outcomes in Parkinson's Disease.

    PubMed

    Gao, Chao; Sun, Hanbo; Wang, Tuo; Tang, Ming; Bohnen, Nicolaas I; Müller, Martijn L T M; Herman, Talia; Giladi, Nir; Kalinin, Alexandr; Spino, Cathie; Dauer, William; Hausdorff, Jeffrey M; Dinov, Ivo D

    2018-05-08

    In this study, we apply a multidisciplinary approach to investigate falls in Parkinson's disease (PD) patients using clinical, demographic and neuroimaging data from two independent initiatives (University of Michigan and Tel Aviv Sourasky Medical Center). Using machine learning techniques, we construct predictive models to discriminate fallers and non-fallers. Through controlled feature selection, we identified the most salient predictors of patient falls, including gait speed, Hoehn and Yahr stage, and postural instability and gait difficulty-related measurements. The model-based and model-free analytical methods we employed included logistic regression, random forests, support vector machines, and XGBoost. The reliability of the forecasts was assessed by internal statistical (5-fold) cross-validation as well as by external out-of-bag validation. Four specific challenges were addressed in the study: Challenge 1, develop a protocol for harmonizing and aggregating complex, multisource, and multi-site Parkinson's disease data; Challenge 2, identify salient predictive features associated with specific clinical traits, e.g., patient falls; Challenge 3, forecast patient falls and evaluate the classification performance; and Challenge 4, predict tremor dominance (TD) vs. postural instability and gait difficulty (PIGD). Our findings suggest that, compared to other approaches, model-free machine learning based techniques provide more reliable clinical outcome forecasting of falls in Parkinson's patients, with, for example, a classification accuracy of about 70-80%.
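
    A compact sketch of the two validation routes mentioned, internal 5-fold cross-validation and out-of-bag estimation for a random forest, is shown below on synthetic data; it illustrates the mechanics only, not the study's models or features.

        # Sketch: 5-fold CV alongside out-of-bag validation for a random forest.
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import cross_val_score

        X, y = make_classification(n_samples=300, n_features=25, random_state=0)

        rf = RandomForestClassifier(n_estimators=500, oob_score=True,
                                    random_state=0)
        rf.fit(X, y)                      # OOB accuracy comes from unused trees
        cv_acc = cross_val_score(RandomForestClassifier(n_estimators=500,
                                                        random_state=0),
                                 X, y, cv=5)
        print(f"5-fold CV accuracy: {cv_acc.mean():.2f}, "
              f"OOB accuracy: {rf.oob_score_:.2f}")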

  10. Automated measurement of vocal fold vibratory asymmetry from high-speed videoendoscopy recordings.

    PubMed

    Mehta, Daryush D; Deliyski, Dimitar D; Quatieri, Thomas F; Hillman, Robert E

    2011-02-01

    In prior work, a manually derived measure of vocal fold vibratory phase asymmetry correlated to varying degrees with visual judgments made from laryngeal high-speed videoendoscopy (HSV) recordings. This investigation extended this work by establishing an automated HSV-based framework to quantify 3 categories of vocal fold vibratory asymmetry. HSV-based analysis provided for cycle-to-cycle estimates of left-right phase asymmetry, left-right amplitude asymmetry, and axis shift during glottal closure for 52 speakers with no vocal pathology producing comfortable and pressed phonation. An initial cross-validation of the automated left-right phase asymmetry measure was performed by correlating the measure with other objective and subjective assessments of phase asymmetry. Vocal fold vibratory asymmetry was exhibited to a similar extent in both comfortable and pressed phonations. The automated measure of left-right phase asymmetry strongly correlated with manually derived measures and moderately correlated with visual-perceptual ratings. Correlations with the visual-perceptual ratings remained relatively consistent as the automated measure was derived from kymograms taken at different glottal locations. An automated HSV-based framework for the quantification of vocal fold vibratory asymmetry was developed and initially validated. This framework serves as a platform for investigating relationships between vocal fold tissue motion and acoustic measures of voice function.

  11. Environmental drivers of spatial patterns of topsoil nitrogen and phosphorus under monsoon conditions in a complex terrain of South Korea

    PubMed Central

    Choi, Kwanghun; Spohn, Marie; Park, Soo Jin; Huwe, Bernd; Ließ, Mareike

    2017-01-01

    Nitrogen (N) and phosphorus (P) in topsoils are critical for plant nutrition. Relatively little is known about the spatial patterns of N and P in the organic layer of mountainous landscapes. Therefore, the spatial distributions of N and P in both the organic layer and the A horizon were analyzed using a light detection and ranging (LiDAR) digital elevation model and vegetation metrics. The objective of the study was to analyze the effect of vegetation and topography on the spatial patterns of N and P in a small watershed covered by forest in South Korea. Soil samples were collected using the conditioned Latin hypercube method. LiDAR vegetation metrics, the normalized difference vegetation index (NDVI), and terrain parameters were derived as predictors. Spatially explicit predictions of N/P ratios were obtained using a random forest with uncertainty analysis. We tested different strategies of model validation (repeated 2-fold to 20-fold and leave-one-out cross validation). Repeated 10-fold cross validation was selected for model validation due to the comparatively high accuracy and low variance of prediction. Surface curvature was the best predictor of P contents in the organic layer and in the A horizon, while LiDAR vegetation metrics and NDVI were important predictors of N in the organic layer. N/P ratios increased with surface curvature and were higher on the convex upper slope than on the concave lower slope. This was due to P enrichment of the soil on the lower slope and a more even spatial distribution of N. Our digital soil maps showed that the topsoils on the upper slopes contained relatively little P. These findings are critical for understanding N and P dynamics in mountainous ecosystems. PMID:28837590
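
    The comparison of validation strategies can be sketched as below: repeated k-fold cross-validation of a random forest regressor for several values of k, reporting the mean and variance of the score. The data are synthetic, and the loop simply illustrates the kind of accuracy/variance trade-off on which the study's choice of repeated 10-fold validation was based.

        # Sketch: comparing repeated k-fold strategies for a random forest
        # regressor on synthetic data (k = 2, 5, 10, 20).
        from sklearn.datasets import make_regression
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.model_selection import RepeatedKFold, cross_val_score

        X, y = make_regression(n_samples=100, n_features=15, noise=10,
                               random_state=0)

        for k in (2, 5, 10, 20):
            cv = RepeatedKFold(n_splits=k, n_repeats=5, random_state=0)
            r2 = cross_val_score(RandomForestRegressor(n_estimators=50,
                                                       random_state=0),
                                 X, y, cv=cv)
            print(f"repeated {k:2d}-fold: R2 = {r2.mean():.2f} "
                  f"(variance {r2.var():.3f})")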

  12. Direct Validation of Differential Prediction.

    ERIC Educational Resources Information Center

    Lunneborg, Clifford E.

    Using academic achievement data for 655 university students, direct validation of differential predictions based on a battery of aptitude/achievement measures selected for their differential prediction efficiency was attempted. In the cross-validation of the prediction of actual differences among five academic area GPA's, this set of differential…

  13. PCA-based polling strategy in machine learning framework for coronary artery disease risk assessment in intravascular ultrasound: A link between carotid and coronary grayscale plaque morphology.

    PubMed

    Araki, Tadashi; Ikeda, Nobutaka; Shukla, Devarshi; Jain, Pankaj K; Londhe, Narendra D; Shrivastava, Vimal K; Banchhor, Sumit K; Saba, Luca; Nicolaides, Andrew; Shafique, Shoaib; Laird, John R; Suri, Jasjit S

    2016-05-01

    Percutaneous coronary interventional procedures need advance planning prior to stenting or an endarterectomy. Cardiologists use intravascular ultrasound (IVUS) for screening, risk assessment and stratification of coronary artery disease (CAD). We hypothesize that plaque components are vulnerable to rupture due to plaque progression. Currently, there are no standard grayscale IVUS tools for risk assessment of plaque rupture. This paper presents a novel strategy for risk stratification based on plaque morphology embedded with principal component analysis (PCA) for plaque feature dimensionality reduction and a dominant feature selection technique. The risk assessment utilizes 56 grayscale coronary features in a machine learning framework while linking information from carotid and coronary plaque burdens due to their common genetic makeup. This system consists of a machine learning paradigm which uses a support vector machine (SVM) combined with PCA for optimal and dominant coronary artery morphological feature extraction. The carotid artery proven intima-media thickness (cIMT) biomarker is adopted as a gold standard during the training phase of the machine learning system. For the performance evaluation, a K-fold cross-validation protocol is adopted with 20 trials per fold. For choosing the dominant features out of the 56 grayscale features, a polling strategy of PCA is adopted where the original values of the features are unaltered. Different protocols are designed for establishing the stability and reliability criteria of the coronary risk assessment system (cRAS). Using the PCA-based machine learning paradigm and cross-validation protocol, a classification accuracy of 98.43% (AUC 0.98) with K=10 folds using an SVM radial basis function (RBF) kernel was achieved. A reliability index of 97.32% and a machine learning stability criterion of 5% were met for the cRAS. This is the first computer-aided diagnosis (CADx) system of its kind able to demonstrate coronary risk assessment and stratification while demonstrating a successful design of the machine learning system based on our assumptions. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
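
    A simplified sketch of the feature-reduction-plus-SVM design follows, using an ordinary PCA projection feeding an RBF-kernel SVM under K-fold cross-validation. Note the paper's polling strategy retains the original feature values after PCA-based ranking, whereas this stand-in projects onto principal components; the 56 synthetic features merely mirror the stated dimensionality.

        # Sketch: PCA reduction feeding an RBF-kernel SVM under 10-fold CV.
        from sklearn.datasets import make_classification
        from sklearn.decomposition import PCA
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import Pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        X, y = make_classification(n_samples=400, n_features=56,
                                   n_informative=12, random_state=0)

        pipe = Pipeline([("scale", StandardScaler()),
                         ("pca", PCA(n_components=12)),
                         ("svm", SVC(kernel="rbf"))])
        acc = cross_val_score(pipe, X, y, cv=10)   # K = 10 folds
        print(f"accuracy: {acc.mean():.2%}")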

  14. Environmental fate model for ultra-low-volume insecticide applications used for adult mosquito management

    USGS Publications Warehouse

    Schleier, Jerome J.; Peterson, Robert K.D.; Irvine, Kathryn M.; Marshall, Lucy M.; Weaver, David K.; Preftakes, Collin J.

    2012-01-01

    One of the more effective ways of managing high densities of adult mosquitoes that vector human and animal pathogens is ultra-low-volume (ULV) aerosol applications of insecticides. The U.S. Environmental Protection Agency uses models that are not validated for ULV insecticide applications, together with exposure assumptions, to perform its human and ecological risk assessments. Currently, there is no validated model that can accurately predict the deposition of insecticides applied using ULV technology for adult mosquito management. In addition, little is known about the deposition and drift of small droplets like those used under conditions encountered during ULV applications. The objective of this study was to perform field studies to measure environmental concentrations of insecticides and to develop a validated model to predict the deposition of ULV insecticides. The final regression model was selected by minimizing the Bayesian Information Criterion, and its prediction performance was evaluated using k-fold cross-validation. The coefficients for the density of the formulation and for the density-CMD interaction were the largest in the model. The results showed that as the density of the formulation decreases, deposition increases. The interaction of density and CMD showed that higher-density formulations and larger droplets resulted in greater deposition. These results are supported by the aerosol physics literature. The k-fold cross-validation demonstrated that the mean square error of the selected regression model is not biased, and the mean square error and mean square prediction error indicated good predictive ability.
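
    The selection-then-validation workflow can be sketched with statsmodels and scikit-learn: compare candidate regressions by BIC, then check the chosen form by k-fold cross-validated mean squared error. All variable values below are toy stand-ins, not the field measurements.

        # Sketch: BIC-based model selection checked with 5-fold CV (toy data).
        import numpy as np
        import statsmodels.api as sm
        from sklearn.linear_model import LinearRegression
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        density = rng.uniform(0.8, 1.2, 200)   # formulation density (toy)
        cmd = rng.uniform(5, 30, 200)          # droplet size metric (toy)
        deposition = (2 - 1.5 * density + 0.03 * density * cmd
                      + rng.normal(0, 0.2, 200))

        candidates = {
            "density only":  np.column_stack([density]),
            "density + CMD": np.column_stack([density, cmd]),
            "density * CMD": np.column_stack([density, cmd, density * cmd]),
        }
        for name, X in candidates.items():
            bic = sm.OLS(deposition, sm.add_constant(X)).fit().bic
            mse = -cross_val_score(LinearRegression(), X, deposition, cv=5,
                                   scoring="neg_mean_squared_error").mean()
            print(f"{name}: BIC = {bic:.1f}, CV MSE = {mse:.3f}")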

  15. Novel Screening Tool for Stroke Using Artificial Neural Network.

    PubMed

    Abedi, Vida; Goyal, Nitin; Tsivgoulis, Georgios; Hosseinichimeh, Niyousha; Hontecillas, Raquel; Bassaganya-Riera, Josep; Elijovich, Lucas; Metter, Jeffrey E; Alexandrov, Anne W; Liebeskind, David S; Alexandrov, Andrei V; Zand, Ramin

    2017-06-01

    The timely diagnosis of stroke at the initial examination is extremely important given the disease morbidity and narrow time window for intervention. The goal of this study was to develop a supervised learning method to recognize acute cerebral ischemia (ACI) and differentiate it from stroke mimics in an emergency setting. Consecutive patients presenting to the emergency department with stroke-like symptoms, within 4.5 hours of symptom onset, in 2 tertiary care stroke centers were randomized for inclusion in the model. We developed an artificial neural network (ANN) model. The learning algorithm was based on backpropagation. To validate the model, we used a 10-fold cross-validation method. A total of 260 patients (equal numbers of stroke mimics and ACIs) were enrolled for the development and validation of our ANN model. Our analysis indicated that the average sensitivity and specificity of the ANN for the diagnosis of ACI, based on the 10-fold cross-validation analysis, were 80.0% (95% confidence interval, 71.8-86.3) and 86.2% (95% confidence interval, 78.7-91.4), respectively. The median precision of the ANN for the diagnosis of ACI was 92% (95% confidence interval, 88.7-95.3). Our results show that an ANN can be an effective tool for the recognition of ACI and differentiation of ACI from stroke mimics at the initial examination. © 2017 American Heart Association, Inc.

  16. Watch-Dog: Detecting Self-Harming Activities From Wrist Worn Accelerometers.

    PubMed

    Bharti, Pratool; Panwar, Anurag; Gopalakrishna, Ganesh; Chellappan, Sriram

    2018-05-01

    In a 2012 survey, in the United States alone, there were more than 35 000 reported suicides, with approximately 1800 of them being psychiatric inpatients. Recent Centers for Disease Control and Prevention (CDC) reports indicate an upward trend in these numbers. In psychiatric facilities, staff perform intermittent or continuous observation of patients manually in order to prevent such tragedies, but studies show that these measures are insufficient and also consume staff time and resources. In this paper, we present the Watch-Dog system to address the problem of detecting self-harming activities attempted by inpatients in clinical settings. Watch-Dog comprises three key components: data sensed by tiny accelerometer sensors worn on the wrists of subjects; an efficient algorithm to classify whether a user is active versus dormant (i.e., performing a physical activity versus not performing any activity); and a novel decision selection algorithm based on random forests and continuity indices for fine-grained activity classification. With data acquired from 11 subjects performing a series of activities (both self-harming and otherwise), Watch-Dog achieved high classification accuracy under same-user 10-fold cross-validation, cross-user 10-fold cross-validation, and cross-user leave-one-out evaluation. We believe that the problem addressed in this paper is practical, important, and timely. We also believe that our proposed system is practically deployable, and related discussions are provided in this paper.

  17. Common measure of quality of life for people with systemic sclerosis across seven European countries: a cross-sectional study.

    PubMed

    Ndosi, Mwidimi; Alcacer-Pitarch, Begonya; Allanore, Yannick; Del Galdo, Francesco; Frerix, Marc; García-Díaz, Sílvia; Hesselstrand, Roger; Kendall, Christine; Matucci-Cerinic, Marco; Mueller-Ladner, Ulf; Sandqvist, Gunnel; Torrente-Segarra, Vicenç; Schmeiser, Tim; Sierakowska, Matylda; Sierakowska, Justyna; Sierakowski, Stanslaw; Redmond, Anthony

    2018-02-20

    The aim of this study was to adapt the Systemic Sclerosis Quality of Life Questionnaire (SScQoL) into six European cultures and validate it as a common measure of quality of life in systemic sclerosis (SSc). This was a seven-country (Germany, France, Italy, Poland, Spain, Sweden and UK) cross-sectional study. A forward-backward translation process was used to adapt the English SScQoL into the target languages. The SScQoL was completed by patients with SSc, and the data were then validated against the Rasch model. To correct local response dependency, items were grouped into the following subscales: function, emotion, sleep, social and pain, and reanalysed for fit to the model, unidimensionality and cross-cultural equivalence. The adaptation of the SScQoL was seamless in all countries except Germany. Cross-cultural validation included 1080 patients with a mean age of 58.0 years (SD 13.9), of whom 87% were women. Local dependency was evident in individual country data. Grouping items into testlets corrected the local dependency in most country-specific data. Fit to the model, reliability and unidimensionality were achieved in the six-country data after cross-cultural adjustment for Italy in the social subscale. The SScQoL was then calibrated into an interval-level scale. The individual SScQoL items have translated well into five languages and overall, the scale maintained its construct validity, working well as a five-subscale questionnaire. Measures of quality of life in SSc can be directly compared across five countries (France, Poland, Spain, Sweden and UK). Data from Italy are also comparable with the other five countries, although they require an adjustment. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  18. Overexpression of Plasminogen Activator Inhibitor-1 in Advanced Gastric Cancer with Aggressive Lymph Node Metastasis

    PubMed Central

    Suh, Yun-Suhk; Yu, Jieun; Kim, Byung Chul; Choi, Boram; Han, Tae-Su; Ahn, Hye Seong; Kong, Seong-Ho; Lee, Hyuk-Joon; Kim, Woo Ho; Yang, Han-Kwang

    2015-01-01

    Purpose The purpose of this study is to investigate differentially expressed genes, using DNA microarray, between advanced gastric cancer (AGC) with aggressive lymph node (LN) metastasis and that with a more advanced tumor stage but without LN metastasis. Materials and Methods Five sample pairs of gastric cancer tissue and normal gastric mucosa were taken from three patients with T3N3 stage (highN) and two with T4N0 stage (lowN). Data from triplicate DNA microarray experiments were analyzed, and candidate genes were identified using a volcano plot as those showing ≥ 2-fold differential expression that was significant by Welch's t test (p < 0.05) between highN and lowN. The selected genes were validated independently by reverse-transcriptase polymerase chain reaction (RT-PCR) using five AGC patients, and by a tissue microarray (TMA) comprising 47 AGC patients. Results CFTR, LAMC2, SERPINE2, F2R, MMP7, FN1, TIMP1, plasminogen activator inhibitor-1 (PAI-1), ITGB8, SDS, and TMPRSS4 were commonly up-regulated over 2-fold in highN. REG3A, CD24, ITLN1, and WBP5 were commonly down-regulated over 2-fold in lowN. Among these genes, overexpression of PAI-1 was validated by RT-PCR, and TMA showed 16.7% (7/42) PAI-1 expression in T3N3, but none (0/5) in T4N0 (p=0.393). Conclusion DNA microarray analysis and validation by RT-PCR and TMA showed that overexpression of PAI-1 is related to aggressive LN metastasis in AGC. PMID:25687870

  19. Future Performance Trend Indicators: A Current Value Approach to Human Resources Accounting. Report III. Multivariate Predictions of Organizational Performance Across Time.

    ERIC Educational Resources Information Center

    Pecorella, Patricia A.; Bowers, David G.

    Multiple regression in a double cross-validated design was used to predict two performance measures (total variable expense and absence rate) by multi-month period in five industrial firms. The regressions do cross-validate, and produce multiple coefficients which display both concurrent and predictive effects, peaking 18 months to two years…

  20. Validation of the Technology Acceptance Measure for Pre-Service Teachers (TAMPST) on a Malaysian Sample: A Cross-Cultural Study

    ERIC Educational Resources Information Center

    Teo, Timothy

    2010-01-01

    Purpose: The purpose of this paper is to assess the cross-cultural validity of the technology acceptance measure for pre-service teachers (TAMPST) on a Malaysian sample. Design/methodology/approach: A total of 193 pre-service teachers from a Malaysian university completed a survey questionnaire measuring their responses to five constructs in the…

  1. The Naïve Overfitting Index Selection (NOIS): A new method to optimize model complexity for hyperspectral data

    NASA Astrophysics Data System (ADS)

    Rocha, Alby D.; Groen, Thomas A.; Skidmore, Andrew K.; Darvishzadeh, Roshanak; Willemen, Louise

    2017-11-01

    The growing number of narrow spectral bands in hyperspectral remote sensing improves the capacity to describe and predict biological processes in ecosystems. But it also poses a challenge to fit empirical models based on such high-dimensional data, which often contain correlated and noisy predictors. As the sample sizes available to train and validate empirical models do not seem to be increasing at the same rate, overfitting has become a serious concern. Overly complex models lead to overfitting by capturing not only the underlying relationship but also random noise in the data. Many regression techniques claim to overcome these problems by using different strategies to constrain complexity, such as limiting the number of terms in the model, creating latent variables or shrinking parameter coefficients. This paper proposes a new method, named Naïve Overfitting Index Selection (NOIS), which makes use of artificially generated spectra to quantify relative model overfitting and to select an optimal model complexity supported by the data. The robustness of this new method is assessed by comparing it to a traditional model selection based on cross-validation. The optimal model complexity is determined for seven different regression techniques, such as partial least squares regression, support vector machine, artificial neural network and tree-based regressions, using five hyperspectral datasets. The NOIS method selects less complex models, which present accuracies similar to the cross-validation method. The NOIS method reduces the chance of overfitting, thereby avoiding models whose accurate predictions are valid only for the data used, and which are too complex to make inferences about the underlying process.

  2. Modern modelling techniques are data hungry: a simulation study for predicting dichotomous endpoints.

    PubMed

    van der Ploeg, Tjeerd; Austin, Peter C; Steyerberg, Ewout W

    2014-12-22

    Modern modelling techniques may potentially provide more accurate predictions of binary outcomes than classical techniques. We aimed to study the predictive performance of different modelling techniques in relation to the effective sample size ("data hungriness"). We performed simulation studies based on three clinical cohorts: 1282 patients with head and neck cancer (46.9% 5-year survival), 1731 patients with traumatic brain injury (22.3% 6-month mortality) and 3181 patients with minor head injury (7.6% with CT scan abnormalities). We compared three relatively modern modelling techniques: support vector machines (SVM), neural nets (NN) and random forests (RF), and two classical techniques: logistic regression (LR) and classification and regression trees (CART). We created three large artificial databases with 20-fold, 10-fold and 6-fold replication of subjects, in which we generated dichotomous outcomes according to different underlying models. We applied each modelling technique to increasingly larger development parts (100 repetitions). The area under the ROC curve (AUC) indicated the performance of each model in the development part and in an independent validation part. Data hungriness was defined by plateauing of the AUC and small optimism (difference between the mean apparent AUC and the mean validated AUC <0.01). We found that a stable AUC was reached by LR at approximately 20 to 50 events per variable, followed by CART, SVM, NN and RF models. Optimism decreased with increasing sample sizes and the same ranking of techniques. The RF, SVM and NN models showed instability and high optimism even with >200 events per variable. Modern modelling techniques such as SVM, NN and RF may need over 10 times as many events per variable as classical modelling techniques such as LR to achieve a stable AUC and a small optimism. This implies that such modern techniques should only be used in medical prediction problems if very large data sets are available.
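
    The optimism measure defined above is straightforward to reproduce in outline. The sketch below, entirely on synthetic data, fits logistic regression and a random forest to increasingly larger development parts and reports apparent minus validated AUC; the sizes and models are placeholders, not the study's cohorts.

        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score

        # Synthetic stand-in for a clinical cohort with a dichotomous endpoint.
        X, y = make_classification(n_samples=20000, n_features=10, random_state=0)
        X_val, y_val = X[10000:], y[10000:]   # large independent validation part

        for n in (200, 1000, 5000):           # increasingly large development parts
            for name, model in (("LR", LogisticRegression(max_iter=1000)),
                                ("RF", RandomForestClassifier(random_state=0))):
                model.fit(X[:n], y[:n])
                apparent = roc_auc_score(y[:n], model.predict_proba(X[:n])[:, 1])
                validated = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
                # Optimism as defined above: apparent AUC minus validated AUC.
                print(f"n={n:5d} {name}: optimism = {apparent - validated:.3f}")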

  3. Uniting statistical and individual-based approaches for animal movement modelling.

    PubMed

    Latombe, Guillaume; Parrott, Lael; Basille, Mathieu; Fortin, Daniel

    2014-01-01

    The dynamic nature of their internal states and the environment directly shape animals' spatial behaviours and give rise to emergent properties at broader scales in natural systems. However, integrating these dynamic features into habitat selection studies remains challenging, owing to the practical impossibility of accessing internal states through field work and the inability of current statistical models to produce dynamic outputs. To address these issues, we developed a robust method which combines statistical and individual-based modelling (IBM). Using a statistical technique for forward modelling of the IBM has the advantage of being faster to parameterize than a pure inverse modelling technique and allows for robust selection of parameters. Using GPS locations from caribou monitored in Québec, we modelled caribou movements based on generative mechanisms accounting for dynamic variables at a low level of emergence. These variables were accessed by replicating real individuals' movements in parallel sub-models, and movement parameters were then empirically parameterized using Step Selection Functions. The final IBM was validated using both k-fold cross-validation and emergent-patterns validation, and was tested for two different scenarios with varying hardwood encroachment. Our results highlighted a functional response in habitat selection, which suggests that our method was able to capture the complexity of the natural system and adequately provided projections on future possible states of the system in response to different management plans. This is especially relevant for testing the long-term impact of scenarios corresponding to environmental configurations that have yet to be observed in real systems.

  4. A Computational Model for Predicting RNase H Domain of Retrovirus.

    PubMed

    Wu, Sijia; Zhang, Xinman; Han, Jiuqiang

    2016-01-01

    RNase H (RNH) is a pivotal domain in retroviruses, cleaving the DNA-RNA hybrid to allow retroviral replication to continue. This crucial role indicates that RNH is a promising drug target for therapeutic intervention. However, the RNHs annotated in the UniProtKB database are still insufficient for a good understanding of their statistical characteristics. In this work, a computational RNH model was proposed to annotate new putative RNHs (np-RNHs) in retroviruses. It predicts RNH domains by recognizing their start and end sites separately with the SVM method. The classification accuracy rates are 100%, 99.01% and 97.52%, corresponding to the jack-knife, 10-fold cross-validation and 5-fold cross-validation tests, respectively. Subsequently, this model discovered 14,033 np-RNHs after scanning sequences without RNH annotations. All these predicted np-RNHs and annotated RNHs were employed to analyze the length, hydrophobicity and evolutionary relationships of RNH domains. These are all related to retroviral genera, which validates the classification of retroviruses to a certain degree. Finally, a software tool was designed for the application of our prediction model. The software, together with the datasets involved in this paper, is available for free download at https://sourceforge.net/projects/rhtool/files/?source=navbar.
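
    The three validation protocols quoted above differ only in how the data are partitioned. A minimal sketch, assuming synthetic feature vectors in place of the paper's sequence-derived features:

        from sklearn.datasets import make_classification
        from sklearn.model_selection import LeaveOneOut, cross_val_score
        from sklearn.svm import SVC

        # Synthetic stand-in for feature vectors of candidate start/end sites.
        X, y = make_classification(n_samples=300, n_features=40, random_state=1)
        clf = SVC(kernel="rbf", C=1.0)

        # Jack-knife (leave-one-out), 10-fold and 5-fold cross-validation.
        for name, cv in (("jack-knife", LeaveOneOut()), ("10-fold", 10), ("5-fold", 5)):
            acc = cross_val_score(clf, X, y, cv=cv, scoring="accuracy").mean()
            print(f"{name:>10s} accuracy: {acc:.4f}")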

  5. Multiview hyperspectral topography of tissue structural and functional characteristics

    NASA Astrophysics Data System (ADS)

    Liu, Peng; Huang, Jiwei; Zhang, Shiwu; Xu, Ronald X.

    2016-01-01

    Accurate and in vivo characterization of the structural, functional, and molecular characteristics of biological tissue will facilitate quantitative diagnosis, therapeutic guidance, and outcome assessment in many clinical applications, such as wound healing, cancer surgery, and organ transplantation. We introduced and tested a multiview hyperspectral imaging technique for noninvasive topographic imaging of cutaneous wound oxygenation. The technique integrated a multiview module and a hyperspectral module in a single portable unit. Four plane mirrors were joined to form a multiview reflective mirror set with a rectangular cross section. The mirror set was placed between a hyperspectral camera and the target biological tissue. For a single image acquisition task, a hyperspectral data cube with five views was obtained. The five-view hyperspectral image consisted of a main objective image and four reflective images. Three-dimensional (3-D) topography of the scene was achieved by correlating the matching pixels between the objective image and the reflective images. 3-D mapping of tissue oxygenation was achieved using a hyperspectral oxygenation algorithm. The multiview hyperspectral imaging technique was validated on a wound model, a tissue-simulating blood phantom, and in vivo biological tissue. The experimental results demonstrated the technical feasibility of using multiview hyperspectral imaging for 3-D topography of tissue functional properties.

  6. DNA motifs associated with aberrant CpG island methylation.

    PubMed

    Feltus, F Alex; Lee, Eva K; Costello, Joseph F; Plass, Christoph; Vertino, Paula M

    2006-05-01

    Epigenetic silencing involving the aberrant methylation of promoter region CpG islands is widely recognized as a tumor suppressor silencing mechanism in cancer. However, the molecular pathways underlying aberrant DNA methylation remain elusive. Recently we showed that, on a genome-wide level, CpG island loci differ in their intrinsic susceptibility to aberrant methylation and that this susceptibility can be predicted based on underlying sequence context. These data suggest that there are sequence/structural features that contribute to the protection from or susceptibility to aberrant methylation. Here we use motif elicitation coupled with classification techniques to identify DNA sequence motifs that selectively define methylation-prone or methylation-resistant CpG islands. Motifs common to 28 methylation-prone or 47 methylation-resistant CpG island-containing genomic fragments were determined using the MEME and MAST algorithms. The five most discriminatory motifs derived from methylation-prone sequences were found to be associated with CpG islands in general and were nonrandomly distributed throughout the genome. In contrast, the eight most discriminatory motifs derived from the methylation-resistant CpG islands were randomly distributed throughout the genome. Interestingly, this latter group tended to associate with Alu and other repetitive sequences. Used together, the frequency of occurrence of these motifs successfully discriminated methylation-prone and methylation-resistant CpG island groups with an accuracy of 87% after 10-fold cross-validation. The motifs identified here are candidate methylation-targeting or methylation-protection DNA sequences.
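
    The final classification step above (discriminating the two CpG-island groups by motif occurrence frequencies under 10-fold cross-validation) can be sketched as follows; the motif-count matrix is randomly generated rather than MEME/MAST output, and logistic regression is a generic stand-in for the paper's classifier.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        # Hypothetical motif-occurrence matrix: rows are genomic fragments,
        # columns are counts of the 13 discriminatory motifs (5 prone + 8 resistant).
        rng = np.random.default_rng(2)
        prone = rng.poisson(lam=[3] * 5 + [1] * 8, size=(28, 13))
        resistant = rng.poisson(lam=[1] * 5 + [3] * 8, size=(47, 13))
        X = np.vstack([prone, resistant])
        y = np.array([1] * 28 + [0] * 47)   # 1 = methylation-prone

        # 10-fold cross-validated accuracy of a classifier on motif frequencies.
        acc = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=10).mean()
        print(f"10-fold CV accuracy: {acc:.2f}")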

  7. Simultaneous detection and classification of breast masses in digital mammograms via a deep learning YOLO-based CAD system.

    PubMed

    Al-Masni, Mohammed A; Al-Antari, Mugahed A; Park, Jeong-Min; Gi, Geon; Kim, Tae-Yeon; Rivera, Patricio; Valarezo, Edwin; Choi, Mun-Taek; Han, Seung-Moo; Kim, Tae-Seong

    2018-04-01

    Automatic detection and classification of masses in mammograms remain a major challenge and play a crucial role in assisting radiologists toward accurate diagnosis. In this paper, we propose a novel Computer-Aided Diagnosis (CAD) system based on one of the regional deep learning techniques: a ROI-based Convolutional Neural Network (CNN) called You Only Look Once (YOLO). Although most previous studies deal only with classification of masses, our proposed YOLO-based CAD system can handle detection and classification simultaneously in one framework. The proposed CAD system contains four main stages: preprocessing of mammograms, feature extraction utilizing deep convolutional networks, mass detection with confidence, and finally mass classification using Fully Connected Neural Networks (FC-NNs). In this study, we utilized 600 original mammograms from the Digital Database for Screening Mammography (DDSM) and 2,400 augmented mammograms, with the information of the masses and their types, in training and testing our CAD. The trained YOLO-based CAD system detects the masses and then classifies their types into benign or malignant. Our results with five-fold cross-validation tests show that the proposed CAD system detects the mass location with an overall accuracy of 99.7%. The system also distinguishes between benign and malignant lesions with an overall accuracy of 97%. Our proposed system even works on some challenging breast cancer cases where the masses exist over the pectoral muscles or dense regions. Copyright © 2018 Elsevier B.V. All rights reserved.

  8. Preliminary study of tumor heterogeneity in imaging predicts two year survival in pancreatic cancer patients.

    PubMed

    Chakraborty, Jayasree; Langdon-Embry, Liana; Cunanan, Kristen M; Escalon, Joanna G; Allen, Peter J; Lowery, Maeve A; O'Reilly, Eileen M; Gönen, Mithat; Do, Richard G; Simpson, Amber L

    2017-01-01

    Pancreatic ductal adenocarcinoma (PDAC) is one of the most lethal cancers in the United States, with a five-year survival rate of 7.2% for all stages. Although surgical resection is the only curative treatment, currently we are unable to differentiate resectable patients with occult metastatic disease from those with potentially curable disease. Identification of patients with poor prognosis via early classification would help in initial management, including the use of neoadjuvant chemotherapy or radiation, or in the choice of postoperative adjuvant therapy. PDAC ranges in appearance from homogeneously isoattenuating masses to heterogeneously hypovascular tumors on CT images; hence, we hypothesize that heterogeneity reflects underlying differences at the histologic or genetic level and will therefore correlate with patient outcome. We quantify the heterogeneity of PDAC with texture analysis to predict 2-year survival. Using fuzzy minimum-redundancy maximum-relevance feature selection and a naive Bayes classifier, the proposed features achieve an area under the receiver operating characteristic curve (AUC) of 0.90 and accuracy (Ac) of 82.86% with the leave-one-image-out technique, and an AUC of 0.80 and Ac of 75.0% with three-fold cross-validation. We conclude that texture analysis can be used to quantify heterogeneity in CT images to accurately predict 2-year survival in patients with pancreatic cancer. From these data, we infer differences in the biological evolution of pancreatic cancer subtypes measurable in imaging and identify opportunities for optimized patient selection for therapy.
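
    A hedged outline of the evaluation pipeline above (feature selection followed by a naive Bayes classifier under both leave-one-out and three-fold cross-validation): univariate mutual-information ranking stands in for fuzzy mRMR, and the data are synthetic.

        from sklearn.datasets import make_classification
        from sklearn.feature_selection import SelectKBest, mutual_info_classif
        from sklearn.metrics import roc_auc_score
        from sklearn.model_selection import LeaveOneOut, cross_val_predict, cross_val_score
        from sklearn.naive_bayes import GaussianNB
        from sklearn.pipeline import make_pipeline

        # Synthetic stand-in for CT texture features of a small patient cohort.
        X, y = make_classification(n_samples=100, n_features=50, n_informative=8,
                                   random_state=3)
        model = make_pipeline(SelectKBest(mutual_info_classif, k=10), GaussianNB())

        # Leave-one-out: pool the held-out probabilities into a single AUC.
        prob = cross_val_predict(model, X, y, cv=LeaveOneOut(),
                                 method="predict_proba")[:, 1]
        print(f"LOO AUC: {roc_auc_score(y, prob):.2f}")

        # Three-fold cross-validation, the abstract's second protocol.
        auc3 = cross_val_score(model, X, y, cv=3, scoring="roc_auc").mean()
        print(f"3-fold AUC: {auc3:.2f}")

    Keeping the selection step inside the pipeline re-fits it within each training fold, which avoids feature-selection bias in the cross-validated estimates.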

  9. Twofold processing for denoising ultrasound medical images.

    PubMed

    Kishore, P V V; Kumar, K V V; Kumar, D Anil; Prasad, M V D; Goutham, E N D; Rahul, R; Krishna, C B S Vamsi; Sandeep, Y

    2015-01-01

    Ultrasound (US) medical imaging non-invasively pictures the inside of the human body for disease diagnostics. Speckle noise corrupts ultrasound images, degrading their visual quality. A twofold processing algorithm is proposed in this work to reduce this multiplicative speckle noise. The first fold applies block-based thresholding, both hard (BHT) and soft (BST), to pixels in the wavelet domain with non-overlapping block sizes of 8, 16, 32 and 64. This first-fold process reduces speckle effectively but also blurs the object of interest. The second fold therefore restores object boundaries and texture with adaptive wavelet fusion: the degraded object in the block-thresholded US image is restored through wavelet coefficient fusion of the object in the original US image and the block-thresholded US image. Fusion rules and wavelet decomposition levels are made adaptive for each block using gradient histograms with a normalized differential mean (NDF), to introduce the highest level of contrast between the denoised pixels and the object pixels in the resultant image. The proposed twofold methods are thus named adaptive NDF block fusion with hard and soft thresholding (ANBF-HT and ANBF-ST). The results indicate visual quality improvement to an interesting level with the proposed twofold processing, where the first fold removes noise and the second fold restores object properties. Peak signal-to-noise ratio (PSNR), normalized cross-correlation coefficient (NCC), edge strength (ES), image quality index (IQI) and structural similarity index (SSIM) measure the quantitative quality of the twofold processing technique. The proposed method is validated by comparison with anisotropic diffusion (AD), total variational filtering (TVF) and empirical mode decomposition (EMD) for enhancement of US images. The US images were provided by the AMMA hospital radiology labs at Vijayawada, India.
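
    The first fold (block-based soft thresholding in the wavelet domain) can be sketched with PyWavelets as below. The per-block universal threshold and the synthetic speckled image are illustrative assumptions; the paper's exact thresholds, fusion rules and NDF computation are not reproduced.

        import numpy as np
        import pywt

        def block_soft_threshold(band, block=8):
            """Soft-threshold a detail sub-band in non-overlapping blocks,
            with a per-block universal threshold (a plausible stand-in for BST)."""
            out = band.copy()
            for i in range(0, band.shape[0] - block + 1, block):
                for j in range(0, band.shape[1] - block + 1, block):
                    blk = band[i:i + block, j:j + block]
                    sigma = np.median(np.abs(blk)) / 0.6745   # robust noise scale
                    t = sigma * np.sqrt(2 * np.log(blk.size))
                    out[i:i + block, j:j + block] = pywt.threshold(blk, t, mode="soft")
            return out

        # Synthetic speckled "ultrasound" image: multiplicative gamma noise.
        rng = np.random.default_rng(4)
        clean = np.outer(np.linspace(0.2, 1.0, 128), np.linspace(0.2, 1.0, 128))
        noisy = clean * rng.gamma(shape=10.0, scale=0.1, size=clean.shape)

        coeffs = pywt.wavedec2(noisy, "db4", level=2)
        coeffs = [coeffs[0]] + [tuple(block_soft_threshold(b) for b in level)
                                for level in coeffs[1:]]
        denoised = pywt.waverec2(coeffs, "db4")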

  10. LiDAR based prediction of forest biomass using hierarchical models with spatially varying coefficients

    USGS Publications Warehouse

    Babcock, Chad; Finley, Andrew O.; Bradford, John B.; Kolka, Randall K.; Birdsey, Richard A.; Ryan, Michael G.

    2015-01-01

    Many studies and production inventory systems have shown the utility of coupling covariates derived from Light Detection and Ranging (LiDAR) data with forest variables measured on georeferenced inventory plots through regression models. The objective of this study was to propose and assess the use of a Bayesian hierarchical modeling framework that accommodates both residual spatial dependence and non-stationarity of model covariates through the introduction of spatial random effects. We explored this objective using four forest inventory datasets that are part of the North American Carbon Program, each comprising point-referenced measures of above-ground forest biomass and discrete LiDAR. For each dataset, we considered at least five regression model specifications of varying complexity. Models were assessed based on goodness of fit criteria and predictive performance using a 10-fold cross-validation procedure. Results showed that the addition of spatial random effects to the regression model intercept improved fit and predictive performance in the presence of substantial residual spatial dependence. Additionally, in some cases, allowing either some or all regression slope parameters to vary spatially, via the addition of spatial random effects, further improved model fit and predictive performance. In other instances, models showed improved fit but decreased predictive performance—indicating over-fitting and underscoring the need for cross-validation to assess predictive ability. The proposed Bayesian modeling framework provided access to pixel-level posterior predictive distributions that were useful for uncertainty mapping, diagnosing spatial extrapolation issues, revealing missing model covariates, and discovering locally significant parameters.

  11. Improved prediction of drug-target interactions using regularized least squares integrating with kernel fusion technique.

    PubMed

    Hao, Ming; Wang, Yanli; Bryant, Stephen H

    2016-02-25

    Identification of drug-target interactions (DTI) is a central task in drug discovery processes. In this work, a simple but effective regularized least squares algorithm integrating with nonlinear kernel fusion (RLS-KF) is proposed to perform DTI predictions. Using benchmark DTI datasets, our proposed algorithm achieves state-of-the-art results, with areas under the precision-recall curve (AUPR) of 0.915, 0.925, 0.853 and 0.909 for enzymes, ion channels (IC), G protein-coupled receptors (GPCR) and nuclear receptors (NR), based on 10-fold cross-validation. The performance can be further improved by using a recalculated kernel matrix, especially for the small set of nuclear receptors, with an AUPR of 0.945. Importantly, most of the top-ranked interaction predictions can be validated by experimental data reported in the literature, bioassay results in the PubChem BioAssay database, as well as other previous studies. Our analysis suggests that the proposed RLS-KF is helpful for studying DTI, drug repositioning as well as polypharmacology, and may help to accelerate drug discovery by identifying novel drug targets. Published by Elsevier B.V.
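
    The core of a regularized least squares predictor with kernel fusion is compact enough to sketch directly: build one kernel per information source, combine them with weights, and solve a single linear system. Everything below (features, weights, the RBF kernel form) is a toy stand-in for the paper's DTI kernels.

        import numpy as np

        def rbf_kernel(A, B, gamma=0.5):
            d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * d2)

        # Two hypothetical similarity views of the same 60 objects.
        rng = np.random.default_rng(5)
        X1 = rng.normal(size=(60, 8))
        X2 = rng.normal(size=(60, 12))
        y = rng.integers(0, 2, size=60).astype(float)   # interaction labels

        # Kernel fusion: a weighted combination of the per-view kernels.
        w = 0.6
        K = w * rbf_kernel(X1, X1) + (1 - w) * rbf_kernel(X2, X2)

        # Regularized least squares in the fused kernel space:
        #     alpha = (K + lambda * I)^(-1) y,    f = K alpha
        lam = 1.0
        alpha = np.linalg.solve(K + lam * np.eye(len(y)), y)
        fitted = K @ alpha   # fitted interaction scores on the training objects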

  12. Mapping the Transmission Risk of Zika Virus using Machine Learning Models.

    PubMed

    Jiang, Dong; Hao, Mengmeng; Ding, Fangyu; Fu, Jingying; Li, Meng

    2018-06-19

    Zika virus, which has been linked to severe congenital abnormalities, is exacerbating global public health problems with its rapid transnational expansion, fueled by increased global travel and trade. Suitability mapping of the transmission risk of Zika virus is essential for drafting public health plans and disease control strategies, which are especially important in areas where medical resources are relatively scarce. Predicting the risk of Zika virus outbreaks has been studied in recent years, but the published literature rarely includes multiple model comparisons or predictive uncertainty analysis. Here, three relatively popular machine learning models, a backward propagation neural network (BPNN), a gradient boosting machine (GBM) and a random forest (RF), were adopted to map the probability of Zika epidemic outbreak at the global level, pairing high-dimensional multidisciplinary covariate layers with comprehensive location data on recorded Zika virus infection in humans. The results show that the predicted high-risk areas for Zika transmission are concentrated in four regions: southeastern North America, eastern South America, Central Africa and eastern Asia. To evaluate the performance of the machine learning models, 50 modeling processes were conducted based on a training dataset. The BPNN model obtained the highest predictive accuracy, with a 10-fold cross-validation area under the curve (AUC) of 0.966 [95% confidence interval (CI) 0.965-0.967], followed by the GBM model (10-fold cross-validation AUC = 0.964 [0.963-0.965]) and the RF model (10-fold cross-validation AUC = 0.963 [0.962-0.964]). Based on the training samples, the prediction accuracies achieved by the GBM and RF models differed significantly from that of the BPNN-based model (p = 0.0258* and p = 0.0001***, respectively). Importantly, the prediction uncertainty introduced by the selection of absence data was quantified, and could provide more accurate fundamental and scientific information for further studies on disease transmission prediction and risk assessment. Copyright © 2018. Published by Elsevier B.V.

  13. GIMDA: Graphlet interaction-based MiRNA-disease association prediction.

    PubMed

    Chen, Xing; Guan, Na-Na; Li, Jian-Qiang; Yan, Gui-Ying

    2018-03-01

    MicroRNAs (miRNAs) have been confirmed by many experimental studies to be closely related to various human complex diseases. It is necessary and valuable to develop powerful and effective computational models to predict potential associations between miRNAs and diseases. In this work, we presented a prediction model of Graphlet Interaction for MiRNA-Disease Association prediction (GIMDA) by integrating disease semantic similarity, miRNA functional similarity, Gaussian interaction profile kernel similarity and experimentally confirmed miRNA-disease associations. The association score of a miRNA with a disease was calculated by measuring the graphlet interactions between two miRNAs or two diseases. The novelty of GIMDA lies in using graphlet interaction to analyse the complex relationships between two nodes in a graph. The AUCs of GIMDA in global and local leave-one-out cross-validation (LOOCV) were 0.9006 and 0.8455, respectively. The average result of five-fold cross-validation reached 0.8927 ± 0.0012. In case studies of colon neoplasms, kidney neoplasms and prostate neoplasms based on the HMDD V2.0 database, 45, 45 and 41 of the top 50 potential miRNAs predicted by GIMDA were validated by dbDEMC and miR2Disease. Additionally, in the case study of new diseases without any known associated miRNAs and the case study of predicting potential miRNA-disease associations using HMDD V1.0, high percentages of the top 50 miRNAs were also verified by the experimental literature. © 2017 The Authors. Journal of Cellular and Molecular Medicine published by John Wiley & Sons Ltd and Foundation for Cellular and Molecular Medicine.

  14. Intelligent wear mode identification system for marine diesel engines based on multi-level belief rule base methodology

    NASA Astrophysics Data System (ADS)

    Yan, Xinping; Xu, Xiaojian; Sheng, Chenxing; Yuan, Chengqing; Li, Zhixiong

    2018-01-01

    Wear faults are among the chief causes of main-engine damage, significantly influencing the secure and economical operation of ships. It is difficult for engineers to utilize multi-source information to identify wear modes, so an intelligent wear mode identification model needs to be developed to assist engineers in diagnosing wear faults in diesel engines. For this purpose, a multi-level belief rule base (BBRB) system is proposed in this paper. The BBRB system consists of two levels of belief rule bases, and the 2D and 3D characteristics of wear particles are used as antecedent attributes on each level. Quantitative and qualitative wear information with uncertainties can be processed simultaneously by the BBRB system. In order to enhance the efficiency of the BBRB, the silhouette value is adopted to determine referential points, and the fuzzy c-means clustering algorithm is used to transform input wear information into belief degrees. In addition, the initial parameters of the BBRB system are constructed on the basis of expert domain knowledge and then optimized by a genetic algorithm to ensure the robustness of the system. To verify the validity of the BBRB system, experimental data acquired from real-world diesel engines were analyzed. Five-fold cross-validation was conducted on the experimental data, and the BBRB was compared with four other models in the cross-validation. In addition, a verification dataset containing different wear particles was used to highlight the effectiveness of the BBRB system in wear mode identification. The verification results demonstrate that the proposed BBRB is effective and efficient for wear mode identification, with better performance and stability than competing systems.

  15. Comparative forensic soil analysis of New Jersey state parks using a combination of simple techniques with multivariate statistics.

    PubMed

    Bonetti, Jennifer; Quarino, Lawrence

    2014-05-01

    This study has shown that the combination of simple techniques with the use of multivariate statistics offers the potential for the comparative analysis of soil samples. Five samples were obtained from each of twelve state parks across New Jersey in both the summer and fall seasons. Each sample was examined using particle-size distribution, pH analysis in both water and 1 M CaCl2 , and a loss on ignition technique. Data from each of the techniques were combined, and principal component analysis (PCA) and canonical discriminant analysis (CDA) were used for multivariate data transformation. Samples from different locations could be visually differentiated from one another using these multivariate plots. Hold-one-out cross-validation analysis showed error rates as low as 3.33%. Ten blind study samples were analyzed resulting in no misclassifications using Mahalanobis distance calculations and visual examinations of multivariate plots. Seasonal variation was minimal between corresponding samples, suggesting potential success in forensic applications. © 2014 American Academy of Forensic Sciences.
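
    The hold-one-out (leave-one-out) discriminant analysis reported above can be outlined as follows, with a randomly generated feature table standing in for the measured particle-size, pH and loss-on-ignition data:

        import numpy as np
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
        from sklearn.model_selection import LeaveOneOut, cross_val_score

        # Hypothetical table: six soil descriptors for five samples from each
        # of twelve parks (60 rows in total).
        rng = np.random.default_rng(6)
        X = np.vstack([rng.normal(loc=m, scale=0.3, size=(5, 6))
                       for m in rng.normal(size=(12, 6))])
        parks = np.repeat(np.arange(12), 5)

        # Hold-one-out cross-validated error rate of the discriminant model.
        acc = cross_val_score(LinearDiscriminantAnalysis(), X, parks,
                              cv=LeaveOneOut()).mean()
        print(f"hold-one-out error rate: {100 * (1 - acc):.2f}%")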

  16. Minimizing donor-site morbidity following bilateral pedicled TRAM breast reconstruction with the double mesh fold over technique.

    PubMed

    Bharti, Gaurav; Groves, Leslie; Sanger, Claire; Thompson, James; David, Lisa; Marks, Malcolm

    2013-05-01

    Transverse rectus abdominis muscle (TRAM) flaps can result in significant abdominal wall donor-site morbidity. We present our experience with bilateral pedicled TRAM breast reconstruction using a double-layered polypropylene mesh fold-over technique to repair the rectus fascia. A retrospective study was performed that included patients with bilateral pedicled TRAM breast reconstruction and abdominal reconstruction using a double-layered polypropylene mesh fold-over technique. Thirty-five patients met the study criteria, with a mean age of 49 years and a mean follow-up of 7.4 years. There were no instances of abdominal hernia and only 2 cases (5.7%) of abdominal bulge. Other abdominal complications included partial umbilical necrosis (14.3%), seroma (11.4%), partial wound dehiscence (8.6%), abdominal weakness (5.7%), abdominal laxity (2.9%), and hematoma (2.9%). The TRAM flap is a reliable option for bilateral autologous breast reconstruction. Using the double mesh repair of the abdominal wall can reduce instances of abdominal bulge and hernia.

  17. Test-retest reliability and cross validation of the functioning everyday with a wheelchair instrument.

    PubMed

    Mills, Tamara L; Holm, Margo B; Schmeler, Mark

    2007-01-01

    The purpose of this study was to establish the test-retest reliability and content validity of an outcomes tool designed to measure the effectiveness of seating-mobility interventions on the functional performance of individuals who use wheelchairs or scooters as their primary seating-mobility device. The instrument, Functioning Everyday With a Wheelchair (FEW), is a questionnaire designed to measure perceived user function related to wheelchair/scooter use. Using consumer-generated items, FEW Beta Version 1.0 was developed and its test-retest reliability was established. Cross-validation of FEW Beta Version 1.0 was then carried out with five samples of seating-mobility users to establish content validity. Based on the content validity study, FEW Version 2.0 was developed and administered to seating-mobility consumers to examine its test-retest reliability. FEW Beta Version 1.0 yielded an intraclass correlation coefficient ICC(3,k) of .92, p < .001, and the content validity results revealed that FEW Beta Version 1.0 captured 55% of the seating-mobility goals reported by consumers across five samples. FEW Version 2.0 yielded ICC(3,k) = .86, p < .001, and captured 98.5% of consumers' seating-mobility goals. The cross-validation study identified new categories of seating-mobility goals for inclusion in FEW Version 2.0, and the content validity of FEW Version 2.0 was confirmed. FEW Beta Version 1.0 and FEW Version 2.0 were highly stable in their measurement of participants' seating-mobility goals over a 1-week interval.
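
    For reference, ICC(3,k) in the Shrout and Fleiss scheme (two-way mixed effects, consistency, average of k measurements) can be computed from the mean squares of a two-way ANOVA without replication. A minimal sketch with made-up ratings:

        import numpy as np

        def icc_3k(ratings):
            """ICC(3,k) for an n-subjects x k-raters matrix (Shrout & Fleiss)."""
            ratings = np.asarray(ratings, dtype=float)
            n, k = ratings.shape
            grand = ratings.mean()
            bms = k * ((ratings.mean(axis=1) - grand) ** 2).sum() / (n - 1)
            jms = n * ((ratings.mean(axis=0) - grand) ** 2).sum() / (k - 1)
            total_ss = ((ratings - grand) ** 2).sum()
            ems = (total_ss - bms * (n - 1) - jms * (k - 1)) / ((n - 1) * (k - 1))
            return (bms - ems) / bms

        # Hypothetical test-retest data: six subjects scored on two occasions.
        scores = np.array([[6, 6], [5, 4], [3, 3], [7, 6], [2, 2], [5, 5]])
        print(f"ICC(3,k) = {icc_3k(scores):.2f}")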

  18. A lightweight QRS detector for single lead ECG signals using a max-min difference algorithm.

    PubMed

    Pandit, Diptangshu; Zhang, Li; Liu, Chengyu; Chattopadhyay, Samiran; Aslam, Nauman; Lim, Chee Peng

    2017-06-01

    Detection of the R-peak pertaining to the QRS complex of an ECG signal plays an important role in the diagnosis of a patient's heart condition. To accurately identify the QRS locations in acquired raw ECG signals, a number of challenges need to be handled, including noise, baseline wander, varying peak amplitudes, and signal abnormality. This research aims to address these challenges by developing an efficient lightweight algorithm for QRS (i.e., R-peak) detection from raw ECG signals. A lightweight real-time sliding-window-based Max-Min Difference (MMD) algorithm for QRS detection from Lead II ECG signals is proposed. Targeting the best trade-off between computational efficiency and detection accuracy, the proposed algorithm consists of five key steps for QRS detection, namely, baseline correction, MMD curve generation, dynamic threshold computation, R-peak detection, and error correction. Five annotated databases from Physionet are used to evaluate the proposed algorithm for R-peak detection. Integrated with a feature extraction technique and a neural network classifier, the proposed QRS detection algorithm has also been extended to undertake normal and abnormal heartbeat detection from ECG signals. The proposed algorithm exhibits a high degree of robustness in QRS detection and achieves an average sensitivity of 99.62% and an average positive predictivity of 99.67%. Its performance compares favorably with those of the existing state-of-the-art models reported in the literature. With regard to normal and abnormal heartbeat detection, the proposed QRS detection algorithm in combination with the feature extraction technique and neural network classifier achieves an overall accuracy rate of 93.44%, based on an empirical evaluation using the MIT-BIH Arrhythmia data set with 10-fold cross-validation. In comparison with other related studies, the proposed algorithm offers a lightweight adaptive alternative for R-peak detection with good computational efficiency. The empirical results indicate that it not only yields a high accuracy rate in QRS detection, but also exhibits efficient computational complexity of order O(n), where n is the length of an ECG signal. Copyright © 2017 Elsevier B.V. All rights reserved.
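
    The MMD curve at the heart of the algorithm is a sliding-window maximum-minus-minimum. The sketch below implements that step plus a crude threshold-and-pick stage on a synthetic trace; the paper's baseline correction, dynamic thresholding and error-correction steps are deliberately simplified.

        import numpy as np

        def mmd_curve(sig, win=25):
            """Sliding-window Max-Min Difference: local max minus local min,
            which becomes large around the steep QRS complexes."""
            n, half = len(sig), win // 2
            out = np.empty(n)
            for i in range(n):
                seg = sig[max(0, i - half):min(n, i + half + 1)]
                out[i] = seg.max() - seg.min()
            return out

        # Synthetic Lead-II-like trace: baseline wander, noise, sharp "R peaks".
        fs = 250
        t = np.arange(10 * fs) / fs
        rng = np.random.default_rng(7)
        ecg = 0.2 * np.sin(2 * np.pi * 0.3 * t) + 0.02 * rng.normal(size=len(t))
        r_true = np.arange(fs // 2, len(t), fs)     # one beat per second
        ecg[r_true] += 1.0

        mmd = mmd_curve(ecg)
        above = np.where(mmd > 0.5 * mmd.max())[0]  # crude fixed threshold
        # Collapse each run of above-threshold samples to its local MMD maximum.
        runs = np.split(above, np.where(np.diff(above) > 1)[0] + 1)
        peaks = [run[np.argmax(mmd[run])] for run in runs]
        print(f"detected {len(peaks)} beats (true: {len(r_true)})")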

  19. Efficient computational model for classification of protein localization images using Extended Threshold Adjacency Statistics and Support Vector Machines.

    PubMed

    Tahir, Muhammad; Jan, Bismillah; Hayat, Maqsood; Shah, Shakir Ullah; Amin, Muhammad

    2018-04-01

    Discriminative and informative feature extraction is the core requirement for accurate and efficient classification of protein subcellular localization images, so that drug development can be more effective. The objective of this paper is to propose a novel modification of the Threshold Adjacency Statistics (TAS) technique that enhances its discriminative power. In this work, we utilized seven threshold ranges to produce seven distinct feature spaces, which are then used to train seven SVMs. The final prediction is obtained through a majority voting scheme. The proposed ETAS-SubLoc system is tested on two benchmark datasets using the 5-fold cross-validation technique. We observed that our novel utilization of the TAS technique improves the discriminative power of the classifier. The ETAS-SubLoc system achieved 99.2% accuracy, 99.3% sensitivity and 99.1% specificity for the Endogenous dataset, outperforming the classical Threshold Adjacency Statistics technique. Similarly, 91.8% accuracy, 96.3% sensitivity and 91.6% specificity values were achieved for the Transfected dataset. Simulation results validated the effectiveness of ETAS-SubLoc, which provides superior prediction performance compared to the existing technique. The proposed methodology aims at providing support to the pharmaceutical industry as well as the research community towards better drug design and innovation in the fields of bioinformatics and computational biology. The implementation code for replicating the experiments presented in this paper is available at: https://drive.google.com/file/d/0B7IyGPObWbSqRTRMcXI2bG5CZWs/view?usp=sharing. Copyright © 2018 Elsevier B.V. All rights reserved.
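
    The ensemble structure described above (seven feature spaces from seven threshold ranges, one SVM per space, fused by majority voting) can be sketched as follows. Plain binarization at seven quantile cut-offs stands in for the actual Threshold Adjacency Statistics features, and the data are synthetic.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.model_selection import StratifiedKFold
        from sklearn.svm import SVC

        # Stand-in for image-derived features and binary localization labels.
        X_raw, y = make_classification(n_samples=400, n_features=60, random_state=8)
        cuts = np.quantile(X_raw, np.linspace(0.2, 0.8, 7))
        spaces = [(X_raw > c).astype(float) for c in cuts]   # seven feature spaces

        accs = []
        for train, test in StratifiedKFold(n_splits=5, shuffle=True,
                                           random_state=0).split(X_raw, y):
            votes = np.zeros((len(test), 2))
            for Xs in spaces:                       # one SVM per feature space
                pred = SVC().fit(Xs[train], y[train]).predict(Xs[test])
                votes[np.arange(len(test)), pred.astype(int)] += 1
            accs.append((votes.argmax(axis=1) == y[test]).mean())
        print(f"5-fold CV accuracy of the 7-SVM majority vote: {np.mean(accs):.3f}")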

  1. A machine learning model to predict the risk of 30-day readmissions in patients with heart failure: a retrospective analysis of electronic medical records data.

    PubMed

    Golas, Sara Bersche; Shibahara, Takuma; Agboola, Stephen; Otaki, Hiroko; Sato, Jumpei; Nakae, Tatsuya; Hisamitsu, Toru; Kojima, Go; Felsted, Jennifer; Kakarmath, Sujay; Kvedar, Joseph; Jethwani, Kamal

    2018-06-22

    Heart failure is one of the leading causes of hospitalization in the United States. Advances in big data solutions allow for storage, management, and mining of large volumes of structured and semi-structured data, such as complex healthcare data. Applying these advances to complex healthcare data has led to the development of risk prediction models that help identify patients who would benefit most from disease management programs, in an effort to reduce readmissions and healthcare cost, but the results of these efforts have been varied. The primary aim of this study was to develop a 30-day readmission risk prediction model for heart failure patients discharged from a hospital admission. We used longitudinal electronic medical record data of heart failure patients admitted within a large healthcare system. Feature vectors included structured demographic, utilization, and clinical data, as well as selected extracts of unstructured data from clinician-authored notes. The risk prediction model was developed using deep unified networks (DUNs), a new mesh-like network structure of deep learning designed to avoid overfitting. The model was validated with 10-fold cross-validation and the results compared to models based on logistic regression, gradient boosting, and maxout networks. Overall model performance was assessed using the concordance statistic. We also selected a discrimination threshold based on the maximum projected cost saving to the Partners Healthcare system. Data from 11,510 patients with 27,334 admissions and 6369 30-day readmissions were used to train the model. After data processing, the final model included 3512 variables. The DUNs model had the best performance after 10-fold cross-validation. AUCs for the prediction models were 0.664 ± 0.015, 0.650 ± 0.011, 0.695 ± 0.016 and 0.705 ± 0.015 for logistic regression, gradient boosting, maxout networks, and DUNs, respectively. The DUNs model had an accuracy of 76.4% at the classification threshold that corresponded with maximum cost saving to the hospital. Deep learning techniques performed better than other traditional techniques in developing this EMR-based prediction model for 30-day readmissions in heart failure patients. Such models can be used to identify heart failure patients with impending hospitalization, enabling care teams to target interventions at their highest-risk patients and improve overall clinical outcomes.

  2. Determination of the geographic origin of onions between three main production areas in Japan and other countries by mineral composition.

    PubMed

    Ariyama, Kaoru; Aoyama, Yoshinori; Mochizuki, Akashi; Homura, Yuji; Kadokura, Masashi; Yasui, Akemi

    2007-01-24

    Onions (Allium cepa L.) are produced in many countries and are one of the most popular vegetables in the world, leading to an enormous amount of international trade. It is now important to develop a scientific technique for determining geographic origin as a means of detecting fraudulent labeling. We have therefore developed a technique based on mineral analysis and linear discriminant analysis (LDA). The onion samples used in this study were from Hokkaido, Hyogo, and Saga, the primary onion-growing areas in Japan, and from countries that export onions to Japan (China, the United States, New Zealand, Thailand, Australia, and Chile). Of 309 samples, 108 were from Hokkaido, 52 from Saga, 77 from Hyogo, and 72 from abroad. Fourteen elements (Na, Mg, P, Mn, Co, Ni, Cu, Zn, Rb, Sr, Mo, Cd, Cs, and Ba) in the samples were determined by flame atomic absorption spectrometry, inductively coupled plasma optical emission spectrometry, and inductively coupled plasma mass spectrometry. The models established by LDA were used to discriminate the geographic origin between Hokkaido and abroad, Hyogo and abroad, and Saga and abroad. Ten-fold cross-validations were conducted using these models. The discrimination accuracies obtained by cross-validation between Hokkaido and abroad were 100 and 86%, respectively. Those between Hyogo and abroad were 100 and 90%, respectively. Those between Saga and abroad were 98 and 90%, respectively. In addition, it was demonstrated that the fingerprint of an element pattern that a crop receives from a specific production area did not change easily with variations in fertilization, crop year, variety, soil type, and production year, provided appropriate elements were chosen.

  3. Application of Quantitative Structure–Activity Relationship Models of 5-HT1A Receptor Binding to Virtual Screening Identifies Novel and Potent 5-HT1A Ligands

    PubMed Central

    2015-01-01

    The 5-hydroxytryptamine 1A (5-HT1A) serotonin receptor has been an attractive target for treating mood, anxiety, and psychotic disorders such as schizophrenia. We have developed binary classification quantitative structure–activity relationship (QSAR) models of 5-HT1A receptor binding activity using data retrieved from the PDSP Ki database. The prediction accuracy of these models was estimated by external 5-fold cross-validation as well as with an additional validation set comprising 66 structurally distinct compounds from the World of Molecular Bioactivity database. These validated models were then used to mine three major types of chemical screening libraries, i.e., drug-like libraries, GPCR-targeted libraries, and diversity libraries, to identify novel computational hits. The five best hits from each class of libraries were chosen for further experimental testing in radioligand binding assays, and nine of the 15 hits were confirmed to be active experimentally, with binding affinities better than 10 μM. The most active compound, lysergol, from the diversity library showed a very high binding affinity (Ki) of 2.3 nM for the 5-HT1A receptor. The novel 5-HT1A actives identified with the QSAR-based virtual screening approach could potentially be developed as novel anxiolytics or antischizophrenic drugs. PMID:24410373

  4. Development of Sorting System for Fishes by Feed-forward Neural Networks Using Rotation Invariant Features

    NASA Astrophysics Data System (ADS)

    Shiraishi, Yuhki; Takeda, Fumiaki

    In this research, we have developed a sorting system for fishes comprising a conveyance part, an image-capturing part, and a sorting part. In the conveyance part, we developed an independent conveyance system to separate one fish from an intertwined group of fishes. After the image of the separated fish is captured, a rotation-invariant feature is extracted using the two-dimensional fast Fourier transform: the mean value of the power spectrum over points at the same distance from the origin in the spectral domain. The fishes are then classified by three-layered feed-forward neural networks. The experimental results show that the developed system classifies three kinds of fishes captured at various angles with a classification ratio of 98.95% for 1044 captured images of five fishes. Further experimental results show a classification ratio of 90.7% for 300 fishes using the 10-fold cross-validation method.
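
    The rotation-invariant feature described above (the mean power spectrum over rings of equal distance from the origin of the 2-D FFT) takes only a few lines of NumPy; the ring-binning scheme and normalisation below are illustrative choices.

        import numpy as np

        def radial_power_spectrum(image, n_bins=32):
            """Mean FFT power over rings of equal distance from the origin."""
            spec = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
            h, w = spec.shape
            yy, xx = np.indices((h, w))
            r = np.hypot(yy - h / 2, xx - w / 2)
            edges = np.linspace(0, r.max(), n_bins + 1)
            idx = np.digitize(r.ravel(), edges) - 1
            feat = np.array([spec.ravel()[idx == b].mean() for b in range(n_bins)])
            return feat / feat.sum()   # normalise so overall scale drops out

        # Rotating the image permutes pixels within rings, so the feature is
        # (approximately) unchanged:
        rng = np.random.default_rng(9)
        img = rng.random((64, 64))
        print(radial_power_spectrum(img)[:5])
        print(radial_power_spectrum(np.rot90(img))[:5])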

  5. Prediction of fatty acid-binding residues on protein surfaces with three-dimensional probability distributions of interacting atoms.

    PubMed

    Mahalingam, Rajasekaran; Peng, Hung-Pin; Yang, An-Suei

    2014-08-01

    Protein-fatty acid interaction is vital for many cellular processes, and understanding this interaction is important for functional annotation as well as drug discovery. In this work, we present a method for predicting fatty acid (FA)-binding residues by using three-dimensional probability density distributions of interacting atoms of FAs on protein surfaces, derived from known protein-FA complex structures. A machine learning algorithm was established to learn the characteristic patterns of the probability density maps specific to FA-binding sites. The predictor was trained with five-fold cross-validation on a non-redundant training set and then evaluated with an independent test set as well as on a holo-apo pairs dataset. The results showed good accuracy in predicting FA-binding residues. Furthermore, the predictor developed in this study is implemented as an online server, freely accessible at http://ismblab.genomics.sinica.edu.tw/. Copyright © 2014 Elsevier B.V. All rights reserved.

  6. Digital mammographic tumor classification using transfer learning from deep convolutional neural networks.

    PubMed

    Huynh, Benjamin Q; Li, Hui; Giger, Maryellen L

    2016-07-01

    Convolutional neural networks (CNNs) show potential for computer-aided diagnosis (CADx) by learning features directly from the image data instead of using analytically extracted features. However, CNNs are difficult to train from scratch for medical images due to small sample sizes and variations in tumor presentations. Instead, transfer learning can be used to extract tumor information from medical images via CNNs originally pretrained for nonmedical tasks, alleviating the need for large datasets. Our database includes 219 breast lesions (607 full-field digital mammographic images). We compared support vector machine classifiers based on the CNN-extracted image features and our prior computer-extracted tumor features in the task of distinguishing between benign and malignant breast lesions. Five-fold cross-validation (by lesion) was conducted with the area under the receiver operating characteristic (ROC) curve as the performance metric. Results show that classifiers based on CNN-extracted features (with transfer learning) perform comparably to those using analytically extracted features.
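
    The transfer-learning pipeline above (freeze a pretrained CNN, take its penultimate-layer activations as features, train an SVM on them) can be sketched as below. The sketch assumes the torchvision >= 0.13 weights API with downloadable ImageNet weights, and uses random tensors where the mammographic ROIs would go; the paper's particular network and preprocessing are not reproduced.

        import numpy as np
        import torch
        import torchvision.models as models
        from sklearn.model_selection import cross_val_score
        from sklearn.svm import SVC

        # Feature extractor: ImageNet-pretrained CNN with its classifier removed.
        resnet = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        resnet.fc = torch.nn.Identity()       # keep the 512-d penultimate features
        resnet.eval()

        images = torch.rand(40, 3, 224, 224)  # placeholder "lesion" images
        labels = np.tile([0, 1], 20)          # placeholder benign/malignant labels
        with torch.no_grad():
            feats = resnet(images).numpy()    # CNN-extracted features, 40 x 512

        # Five-fold cross-validated AUC of an SVM on the transferred features.
        auc = cross_val_score(SVC(kernel="linear"), feats, labels,
                              cv=5, scoring="roc_auc").mean()
        print(f"5-fold CV AUC: {auc:.2f}")

    The study's five-fold split was by lesion; with several images per lesion, a grouped splitter such as scikit-learn's GroupKFold would be the analogous choice here.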

  7. Countering imbalanced datasets to improve adverse drug event predictive models in labor and delivery.

    PubMed

    Taft, L M; Evans, R S; Shyu, C R; Egger, M J; Chawla, N; Mitchell, J A; Thornton, S N; Bray, B; Varner, M

    2009-04-01

    The IOM report, Preventing Medication Errors, emphasizes the overall lack of knowledge of the incidence of adverse drug events (ADE). Operating rooms, emergency departments and intensive care units are known to have a higher incidence of ADE. Labor and delivery (L&D) is an emergency care unit that could have an increased risk of ADE, yet reported rates remain low and under-reporting is suspected. Risk factor identification with electronic pattern recognition techniques could improve ADE detection rates. The objective of the present study is to apply the Synthetic Minority Over-sampling Technique (SMOTE) as an enhanced sampling method in a sparse dataset to generate prediction models that identify ADE in women admitted for labor and delivery, based on patient risk factors and comorbidities. By creating synthetic cases with the SMOTE algorithm and using a 10-fold cross-validation technique, we demonstrated improved performance of the Naïve Bayes and decision tree algorithms. The true positive rate (TPR) of 0.32 in the raw dataset increased to 0.67 in the 800% over-sampled dataset. Enhanced performance from classification algorithms can be attained with the use of synthetic minority class oversampling techniques in sparse clinical datasets. Predictive models created in this manner can be used to develop evidence-based ADE monitoring systems.
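
    In outline, SMOTE-style oversampling followed by 10-fold cross-validation looks as follows (synthetic imbalanced data, with imbalanced-learn's SMOTE implementation assumed to be installed):

        from collections import Counter
        from imblearn.over_sampling import SMOTE
        from sklearn.datasets import make_classification
        from sklearn.model_selection import cross_val_score
        from sklearn.naive_bayes import GaussianNB

        # Sparse, imbalanced stand-in for the L&D cohort (about 3% ADE cases).
        X, y = make_classification(n_samples=3000, n_features=20, weights=[0.97],
                                   random_state=10)
        print("before:", Counter(y))

        # SMOTE synthesises new minority cases by interpolating between a
        # minority sample and its nearest minority-class neighbours.
        X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
        print("after: ", Counter(y_res))

        tpr = cross_val_score(GaussianNB(), X_res, y_res, cv=10,
                              scoring="recall").mean()
        print(f"10-fold CV true positive rate after oversampling: {tpr:.2f}")

    Note that resampling the whole dataset before cross-validation, as the study describes, lets synthetic cases leak between folds; wrapping SMOTE and the classifier in one pipeline fitted within each training fold is the stricter alternative.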

  8. Diagnosis of diabetes diseases using an Artificial Immune Recognition System2 (AIRS2) with fuzzy K-nearest neighbor.

    PubMed

    Chikh, Mohamed Amine; Saidi, Meryem; Settouti, Nesma

    2012-10-01

    The use of expert systems and artificial intelligence techniques in disease diagnosis has been increasing gradually. The Artificial Immune Recognition System (AIRS) is one of the methods used in medical classification problems, and AIRS2 is a more efficient version of the AIRS algorithm. In this paper, we use a modified AIRS2 called MAIRS2, in which we replace the K-nearest neighbors algorithm with fuzzy K-nearest neighbors to improve the accuracy of diabetes diagnosis. The diabetes disease dataset used in our work is retrieved from the UCI machine learning repository. The performances of AIRS2 and MAIRS2 are evaluated regarding classification accuracy, sensitivity and specificity values. The highest classification accuracies obtained when applying AIRS2 and MAIRS2 using 10-fold cross-validation were, respectively, 82.69% and 89.10%.

  9. Multimodal Teaching Analytics: Automated Extraction of Orchestration Graphs from Wearable Sensor Data.

    PubMed

    Prieto, Luis P; Sharma, Kshitij; Kidzinski, Łukasz; Rodríguez-Triana, María Jesús; Dillenbourg, Pierre

    2018-04-01

    The pedagogical modelling of everyday classroom practice is an interesting kind of evidence, both for educational research and for teachers' own professional development. This paper explores the use of wearable sensors and machine learning techniques to automatically extract orchestration graphs (teaching activities and their social plane over time) on a dataset of 12 classroom sessions enacted by two different teachers in different classroom settings. The dataset included mobile eye-tracking as well as audiovisual and accelerometry data from sensors worn by the teacher. We evaluated both time-independent and time-aware models, achieving median F1 scores of about 0.7-0.8 on leave-one-session-out k-fold cross-validation. Although these results show the feasibility of this approach, they also highlight the need for larger datasets, recorded in a wider variety of classroom settings, to provide automated tagging of classroom practice that can be used in everyday practice across multiple teachers.
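
    Leave-one-session-out evaluation is a grouped cross-validation: every fold holds out all windows from one session, so models are always tested on an unseen session. A minimal sketch with placeholder sensor features:

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import f1_score
        from sklearn.model_selection import LeaveOneGroupOut, cross_val_predict

        # Stand-in for windowed multimodal features from 12 sessions.
        rng = np.random.default_rng(11)
        X = rng.normal(size=(1200, 30))
        y = rng.integers(0, 4, size=1200)          # e.g. four activity types
        sessions = np.repeat(np.arange(12), 100)   # session id of each window

        pred = cross_val_predict(RandomForestClassifier(random_state=0), X, y,
                                 cv=LeaveOneGroupOut(), groups=sessions)
        print(f"overall macro F1: {f1_score(y, pred, average='macro'):.2f}")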

  10. A novel feature ranking method for prediction of cancer stages using proteomics data

    PubMed Central

    Saghapour, Ehsan; Sehhati, Mohammadreza

    2017-01-01

    Proteomic analysis of cancer stages has provided new opportunities for the development of novel, highly sensitive diagnostic tools that help in the early detection of cancer. This paper introduces a new feature ranking approach called FRMT, based on the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), which selects the most discriminative proteins from proteomics data for cancer staging. In this approach, the outcomes of 10 feature selection techniques were combined by the TOPSIS method to select the final discriminative proteins from seven different proteomic databases of protein expression profiles. In the proposed workflow, feature selection methods and protein expressions were considered as criteria and alternatives in TOPSIS, respectively. The proposed method was tested on seven various classifier models in a 10-fold cross-validation procedure repeated 30 times on the seven cancer datasets. The obtained results demonstrate the higher stability and superior classification performance of the method in comparison with other methods, and it is less sensitive to the applied classifier. Moreover, the final selected proteins are informative and have potential for application in real medical practice. PMID:28934234
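
    TOPSIS itself is a short, well-defined computation: normalise the decision matrix, weight it, and rank alternatives by closeness to an ideal point. A generic sketch with toy scores, equal weights and all criteria treated as benefits:

        import numpy as np

        def topsis(decision, weights, benefit):
            """Rank alternatives (rows) against criteria (columns);
            benefit[j] is True if larger values of criterion j are better."""
            M = decision / np.linalg.norm(decision, axis=0)   # vector normalisation
            V = M * weights                                   # weighted matrix
            ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
            anti = np.where(benefit, V.min(axis=0), V.max(axis=0))
            d_pos = np.linalg.norm(V - ideal, axis=1)
            d_neg = np.linalg.norm(V - anti, axis=1)
            return d_neg / (d_pos + d_neg)   # closeness: higher ranks better

        # Toy use: 6 proteins scored by 3 feature-selection methods.
        rng = np.random.default_rng(12)
        scores = rng.random((6, 3))
        closeness = topsis(scores, np.full(3, 1 / 3), np.array([True] * 3))
        print("protein ranking (best first):", np.argsort(-closeness))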

  11. Support vector machines and generalisation in HEP

    NASA Astrophysics Data System (ADS)

    Bevan, Adrian; Gamboa Goñi, Rodrigo; Hays, Jon; Stevenson, Tom

    2017-10-01

    We review the concept of Support Vector Machines (SVMs) and discuss examples of their use in a number of scenarios. Several SVM implementations have been used in HEP, and we exemplify this algorithm using the Toolkit for Multivariate Analysis (TMVA) implementation. We discuss examples relevant to HEP, including background suppression for H → τ+τ- at the LHC, with several different kernel functions. Performance benchmarking leads to the issue of generalisation of hyper-parameter selection. The avoidance of fine-tuning (overtraining or overfitting) in MVA hyper-parameter optimisation, i.e. the ability to ensure generalised performance of an MVA that is independent of the training, validation and test samples, is of utmost importance. We discuss this issue and compare and contrast the performance of hold-out and k-fold cross-validation. We have extended the SVM functionality and introduced tools to facilitate cross-validation in TMVA, and present results based on these improvements.

  12. Computer-aided Assessment of Regional Abdominal Fat with Food Residue Removal in CT

    PubMed Central

    Makrogiannis, Sokratis; Caturegli, Giorgio; Davatzikos, Christos; Ferrucci, Luigi

    2014-01-01

    Rationale and Objectives Separate quantification of abdominal subcutaneous and visceral fat regions is essential to understand the role of regional adiposity as a risk factor in epidemiological studies. Fat quantification is often based on computed tomography (CT) because fat density is distinct from other tissue densities in the abdomen. However, the presence of intestinal food residues with densities similar to fat may reduce fat quantification accuracy. We introduce an abdominal fat quantification method in CT with interest in food residue removal. Materials and Methods Total fat was identified in the feature space of Hounsfield units and divided into subcutaneous and visceral components using model-based segmentation. Regions of food residues were identified and removed from visceral fat using a machine learning method integrating intensity, texture, and spatial information. Cost-weighting and bagging techniques were investigated to address class imbalance. Results We validated our automated food residue removal technique against semimanual quantifications. Our feature selection experiments indicated that joint intensity and texture features produce the highest classification accuracy at 95%. We explored generalization capability using k-fold cross-validation and receiver operating characteristic (ROC) analysis with variable k. Losses in accuracy and area under the ROC curve between maximum and minimum k were limited to 0.1% and 0.3%. We validated tissue segmentation against reference semimanual delineations. The Dice similarity scores were as high as 93.1 for subcutaneous fat and 85.6 for visceral fat. Conclusions Computer-aided regional abdominal fat quantification is a reliable computational tool for large-scale epidemiological studies. Our proposed intestinal food residue reduction scheme is an original contribution of this work. Validation experiments indicate very good accuracy and generalization capability. PMID:24119354

  13. AVP-IC50 Pred: Multiple machine learning techniques-based prediction of peptide antiviral activity in terms of half maximal inhibitory concentration (IC50).

    PubMed

    Qureshi, Abid; Tandon, Himani; Kumar, Manoj

    2015-11-01

    Peptide-based antiviral therapeutics have gradually paved their way into mainstream drug discovery research. Experimental determination of peptides' antiviral activity, as expressed by their IC50 values, requires considerable effort. Therefore, we have developed "AVP-IC50 Pred," a regression-based algorithm to predict antiviral activity in terms of IC50 values (μM). A total of 759 non-redundant peptides from AVPdb and HIPdb were divided into a training/test set of 683 peptides (T(683)) and a validation set of 76 independent peptides (V(76)) for evaluation. We utilized important peptide sequence features such as amino-acid composition, the binary profile of N8-C8 residues, physicochemical properties, and their hybrids. Four different machine learning techniques (MLTs), namely Support Vector Machine, Random Forest, Instance-based classifier, and K-Star, were employed. During 10-fold cross-validation, we achieved maximum Pearson correlation coefficients (PCCs) of 0.66, 0.64, 0.56, and 0.55, respectively, for the above MLTs using the best combination of feature sets. All the predictive models also performed well on the independent validation dataset, achieving maximum PCCs of 0.74, 0.68, 0.59, and 0.57, respectively, using the best combination of feature sets. The AVP-IC50 Pred web server is anticipated to assist researchers working on antiviral therapeutics by enabling them to computationally screen many compounds and focus experimental validation on the most promising set of peptides, thus reducing cost and time. The server is available at http://crdd.osdd.net/servers/ic50avp. © 2015 Wiley Periodicals, Inc.
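
    A rough analogue of the evaluation protocol, assuming random placeholder features and synthetic IC50 values; only two of the four named MLTs are shown.

    ```python
    # Hedged sketch: 10-fold cross-validated Pearson correlation between
    # predicted and observed IC50 values, as in the evaluation above.
    import numpy as np
    from scipy.stats import pearsonr
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import cross_val_predict
    from sklearn.svm import SVR

    rng = np.random.default_rng(0)
    X = rng.random((683, 40))                 # placeholder peptide features
    y = X[:, :5].sum(axis=1) + rng.normal(0, 0.3, 683)  # synthetic IC50 values

    for name, model in [("SVR", SVR()),
                        ("Random Forest", RandomForestRegressor(random_state=0))]:
        pred = cross_val_predict(model, X, y, cv=10)
        pcc, _ = pearsonr(y, pred)
        print(f"{name}: PCC = {pcc:.2f}")
    ```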

  15. A cross-validation package driving Netica with python

    USGS Publications Warehouse

    Fienen, Michael N.; Plant, Nathaniel G.

    2014-01-01

    Bayesian networks (BNs) are powerful tools for probabilistically simulating natural systems and emulating process models. Cross-validation is a technique to avoid the overfitting that results from overly complex BNs and that reduces predictive skill. Cross-validation for BNs is known but rarely implemented, due partly to a lack of software tools designed to work with available BN packages. CVNetica is open-source, written in Python, and extends the Netica software package to perform cross-validation and read, rebuild, and learn BNs from data. Insights gained from cross-validation, and implications for prediction versus description, are illustrated with a data-driven oceanographic application and a model-emulation application. These examples show that overfitting occurs when BNs become more complex than the supporting data allow, and that overfitting incurs computational costs as well as a reduction in prediction skill. CVNetica evaluates overfitting using several complexity metrics (we used level of discretization) and its impact on performance metrics (we used skill).
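
    CVNetica's own code is not reproduced here; the sketch below only illustrates the general principle it applies to BNs, using tree depth as a stand-in for network complexity (discretization level).

    ```python
    # Concept sketch (not the CVNetica/Netica API): cross-validation exposes
    # overfitting by comparing training skill to held-out skill as model
    # complexity grows.
    from sklearn.datasets import make_regression
    from sklearn.model_selection import cross_validate
    from sklearn.tree import DecisionTreeRegressor

    X, y = make_regression(n_samples=200, n_features=8, noise=20.0, random_state=0)

    for depth in (2, 4, 8, 16):
        res = cross_validate(DecisionTreeRegressor(max_depth=depth, random_state=0),
                             X, y, cv=5, return_train_score=True)
        print(f"depth={depth:2d}  train R2={res['train_score'].mean():.2f}  "
              f"test R2={res['test_score'].mean():.2f}")
    # A widening train/test gap with complexity is the overfitting signature
    # that cross-validation is designed to detect.
    ```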

  16. Implementation of a Medication Reconciliation Assistive Technology: A Qualitative Analysis

    PubMed Central

    Wright, Theodore B.; Adams, Kathleen; Church, Victoria L.; Ferraro, Mimi; Ragland, Scott; Sayers, Anthony; Tallett, Stephanie; Lovejoy, Travis; Ash, Joan; Holahan, Patricia J.; Lesselroth, Blake J.

    2017-01-01

    Objective: To aid the implementation of a medication reconciliation process within a hybrid primary-specialty care setting by using qualitative techniques to describe the climate of implementation and provide guidance for future projects. Methods: Guided by McMullen et al.'s Rapid Assessment Process [1], we performed semi-structured interviews prior to and iteratively throughout the implementation. Interviews were coded and analyzed using grounded theory [2] and cross-examined for validity. Results: We identified five barriers and five facilitators that impacted the implementation. Facilitators included alignment of the process with user values, and motivation and clinical champions fostered by the implementation team rather than by the administration. Barriers included a perceived limited capacity for change, diverging priorities, and inconsistencies in process standards and role definitions. Discussion: A more complete, qualitative understanding of existing barriers and facilitators helps to guide critical decisions on the design and implementation of a successful medication reconciliation process. PMID:29854251

  17. Cross-cultural validation of Lupus Impact Tracker in five European clinical practice settings.

    PubMed

    Schneider, Matthias; Mosca, Marta; Pego-Reigosa, José-Maria; Gunnarsson, Iva; Maurel, Frédérique; Garofano, Anna; Perna, Alessandra; Porcasi, Rolando; Devilliers, Hervé

    2017-05-01

    The aim was to evaluate the cross-cultural validity of the Lupus Impact Tracker (LIT) in five European countries and to assess its acceptability and feasibility from the patient and physician perspectives. A prospective, observational, cross-sectional and multicentre validation study was conducted in clinical settings. Before the visit, patients completed the LIT, Short Form 36 (SF-36) and care satisfaction questionnaires. During the visit, physicians assessed disease activity [Safety of Estrogens in Lupus Erythematosus National Assessment (SELENA)-SLEDAI], organ damage [SLICC/ACR damage index (SDI)] and flare occurrence. Cross-cultural validity was assessed using the Differential Item Functioning method. Five hundred and sixty-nine SLE patients were included by 25 specialists; 91.7% were outpatients and 89.9% female, with mean age 43.5 (13.0) years. The disease profile was as follows: 18.3% experienced flares; mean SELENA-SLEDAI score 3.4 (4.5); mean SDI score 0.8 (1.4); and mean SF-36 physical and mental component summary scores 42.8 (10.8) and 43.0 (12.3), respectively. The mean LIT score was 34.2 (22.3) (median: 32.5), indicating that lupus moderately impacted patients' daily life. Cultural Differential Item Functioning of negligible magnitude was detected across countries (pseudo-R² differences of 0.01-0.04). LIT scores differed according to Physician Global Assessment, SELENA-SLEDAI and SDI scores = 0 (P < 0.035) and absence of flares (P = 0.004). The LIT showed a strong association with SF-36 physical and social role functioning, vitality, bodily pain and mental health (P < 0.001). The LIT was well accepted by patients and physicians. It was reliable, with Cronbach α coefficients ranging from 0.89 to 0.92 among countries. The LIT is validated in the five participating European countries. The results show its reliability and cultural invariability across countries. They suggest that the LIT can be used in routine clinical practice to evaluate and follow patient-reported outcomes in order to improve patient-physician interaction. © The Author 2017. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com
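
    Cronbach's α, the reliability statistic quoted above, is straightforward to compute; the respondents-by-items score matrix below is hypothetical.

    ```python
    # Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals).
    import numpy as np

    def cronbach_alpha(items: np.ndarray) -> float:
        """items: shape (n_respondents, n_items)."""
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1).sum()
        total_var = items.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_vars / total_var)

    rng = np.random.default_rng(0)
    base = rng.normal(size=(100, 1))                       # shared trait
    items = base + rng.normal(scale=0.5, size=(100, 10))   # correlated items
    print(f"alpha = {cronbach_alpha(items):.2f}")
    ```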

  18. Predicting turns in proteins with a unified model.

    PubMed

    Song, Qi; Li, Tonghua; Cong, Peisheng; Sun, Jiangming; Li, Dapeng; Tang, Shengnan

    2012-01-01

    Turns are a critical element of the structure of a protein; they play a crucial role in loops, folds, and interactions. Current prediction methods are well developed for individual turn types, including α-turns, β-turns, and γ-turns. However, for further protein structure and function prediction it is necessary to develop a uniform model that can accurately predict all types of turns simultaneously. In this study, we present a novel approach, TurnP, which offers the ability to investigate all the turns in a protein based on a unified model. The main characteristics of TurnP are: (i) using newly exploited features of structural evolution information (secondary structure and shape strings of proteins) based on structure homologies, (ii) considering all types of turns in a unified model, and (iii) the practical capability of accurately predicting all turns simultaneously for a query. TurnP utilizes predicted secondary structures and predicted shape strings, both generated with high accuracy by methods previously developed by our group. Sequence and structural evolution features (sequence profiles, secondary-structure profiles, and shape-string profiles) are then generated by sequence and structure alignment. When TurnP was validated on a non-redundant dataset (4,107 entries) by five-fold cross-validation, we achieved an accuracy of 88.8% and a sensitivity of 71.8%, exceeding the most state-of-the-art predictors of individual turn types. Newly determined sequences and the EVA and CASP9 datasets were used as independent tests, and the results were outstanding for turn prediction and confirmed the good performance of TurnP for practical applications.

  19. Predicting Turns in Proteins with a Unified Model

    PubMed Central

    Song, Qi; Li, Tonghua; Cong, Peisheng; Sun, Jiangming; Li, Dapeng; Tang, Shengnan

    2012-01-01

    Motivation Turns are a critical element of the structure of a protein; they play a crucial role in loops, folds, and interactions. Current prediction methods are well developed for individual turn types, including α-turns, β-turns, and γ-turns. However, for further protein structure and function prediction it is necessary to develop a uniform model that can accurately predict all types of turns simultaneously. Results In this study, we present a novel approach, TurnP, which offers the ability to investigate all the turns in a protein based on a unified model. The main characteristics of TurnP are: (i) using newly exploited features of structural evolution information (secondary structure and shape strings of proteins) based on structure homologies, (ii) considering all types of turns in a unified model, and (iii) the practical capability of accurately predicting all turns simultaneously for a query. TurnP utilizes predicted secondary structures and predicted shape strings, both generated with high accuracy by methods previously developed by our group. Sequence and structural evolution features (sequence profiles, secondary-structure profiles, and shape-string profiles) are then generated by sequence and structure alignment. When TurnP was validated on a non-redundant dataset (4,107 entries) by five-fold cross-validation, we achieved an accuracy of 88.8% and a sensitivity of 71.8%, exceeding the most state-of-the-art predictors of individual turn types. Newly determined sequences and the EVA and CASP9 datasets were used as independent tests, and the results were outstanding for turn prediction and confirmed the good performance of TurnP for practical applications. PMID:23144872
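
    A generic sketch of the five-fold evaluation reported for TurnP, with synthetic features and labels in place of the profile-based inputs.

    ```python
    # Hedged sketch: five-fold CV reporting accuracy and sensitivity (macro
    # recall) for a multi-class turn-labeling stand-in task.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.metrics import accuracy_score, recall_score
    from sklearn.model_selection import StratifiedKFold
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=1000, n_features=30, n_informative=10,
                               n_classes=4, random_state=0)
    accs, sens = [], []
    for tr, te in StratifiedKFold(5, shuffle=True, random_state=0).split(X, y):
        pred = SVC().fit(X[tr], y[tr]).predict(X[te])
        accs.append(accuracy_score(y[te], pred))
        sens.append(recall_score(y[te], pred, average="macro"))
    print(f"accuracy={np.mean(accs):.3f}  sensitivity={np.mean(sens):.3f}")
    ```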

  20. Classification of echolocation clicks from odontocetes in the Southern California Bight.

    PubMed

    Roch, Marie A; Klinck, Holger; Baumann-Pickering, Simone; Mellinger, David K; Qui, Simon; Soldevilla, Melissa S; Hildebrand, John A

    2011-01-01

    This study presents a system for classifying echolocation clicks of six species of odontocetes in the Southern California Bight: Visually confirmed bottlenose dolphins, short- and long-beaked common dolphins, Pacific white-sided dolphins, Risso's dolphins, and presumed Cuvier's beaked whales. Echolocation clicks are represented by cepstral feature vectors that are classified by Gaussian mixture models. A randomized cross-validation experiment is designed to provide conditions similar to those found in a field-deployed system. To prevent matched conditions from inappropriately lowering the error rate, echolocation clicks associated with a single sighting are never split across the training and test data. Sightings are randomly permuted before assignment to folds in the experiment. This allows different combinations of the training and test data to be used while keeping data from each sighting entirely in the training or test set. The system achieves a mean error rate of 22% across 100 randomized three-fold cross-validation experiments. Four of the six species had mean error rates lower than the overall mean, with the presumed Cuvier's beaked whale clicks showing the best performance (<2% error rate). Long-beaked common and bottlenose dolphins proved the most difficult to classify, with mean error rates of 53% and 68%, respectively.
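
    The key design point, never splitting one sighting across training and test data, maps directly onto grouped cross-validation; the cepstral features, labels, and sighting IDs below are simulated.

    ```python
    # Hedged sketch: GMM-per-species classification under 3-fold grouped CV,
    # with folds formed by sighting so no sighting spans train and test.
    import numpy as np
    from sklearn.mixture import GaussianMixture
    from sklearn.model_selection import GroupKFold

    rng = np.random.default_rng(0)
    n = 600
    y = rng.integers(0, 6, n)                        # six species labels
    X = rng.normal(size=(n, 12)) + 0.8 * y[:, None]  # mock cepstral features
    sightings = rng.integers(0, 30, n)               # sighting ID per click

    errs = []
    for tr, te in GroupKFold(n_splits=3).split(X, y, groups=sightings):
        # One Gaussian mixture per species; a test click is assigned to the
        # species whose mixture gives it the highest likelihood.
        classes = np.unique(y[tr])
        gmms = [GaussianMixture(n_components=2, random_state=0).fit(X[tr][y[tr] == c])
                for c in classes]
        scores = np.column_stack([g.score_samples(X[te]) for g in gmms])
        errs.append((classes[scores.argmax(axis=1)] != y[te]).mean())
    print(f"mean error rate: {np.mean(errs):.1%}")
    ```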

  1. ETHNOPRED: a novel machine learning method for accurate continental and sub-continental ancestry identification and population stratification correction.

    PubMed

    Hajiloo, Mohsen; Sapkota, Yadav; Mackey, John R; Robson, Paula; Greiner, Russell; Damaraju, Sambasivarao

    2013-02-22

    Population stratification is a systematic difference in allele frequencies between subpopulations. This can lead to spurious association findings in the case-control genome-wide association studies (GWASs) used to identify single nucleotide polymorphisms (SNPs) associated with disease-linked phenotypes. Methods such as self-declared ancestry, ancestry informative markers, genomic control, structured association, and principal component analysis are used to assess and correct population stratification, but each has limitations. We provide an alternative technique to address population stratification. We propose a novel machine learning method, ETHNOPRED, which uses the genotype and ethnicity data from the HapMap project to learn ensembles of disjoint decision trees capable of accurately predicting an individual's continental and sub-continental ancestry. To predict an individual's continental ancestry, ETHNOPRED produced an ensemble of 3 decision trees involving a total of 10 SNPs, with 10-fold cross-validation accuracy of 100% on the HapMap II dataset. We extended this model to 29 disjoint decision trees over 149 SNPs, and showed that this ensemble has an accuracy of ≥ 99.9%, even if some of those 149 SNP values are missing. On an independent dataset, predominantly of Caucasian origin, our continental classifier showed 96.8% accuracy and improved genomic control's λ from 1.22 to 1.11. We next used the HapMap III dataset to learn classifiers to distinguish European subpopulations (North-Western vs. Southern), East Asian subpopulations (Chinese vs. Japanese), African subpopulations (Eastern vs. Western), North American subpopulations (European vs. Chinese vs. African vs. Mexican vs. Indian), and Kenyan subpopulations (Luhya vs. Maasai). In these cases, ETHNOPRED produced ensembles of 3, 39, 21, 11, and 25 disjoint decision trees, respectively, involving 31, 502, 526, 242 and 271 SNPs, with 10-fold cross-validation accuracies of 86.5% ± 2.4%, 95.6% ± 3.9%, 95.6% ± 2.1%, 98.3% ± 2.0%, and 95.9% ± 1.5%. However, ETHNOPRED was unable to produce a classifier that can accurately distinguish Chinese in Beijing vs. Chinese in Denver. ETHNOPRED is a novel technique for producing classifiers that can identify an individual's continental and sub-continental heritage based on a small number of SNPs. We show that its learned classifiers are simple, cost-efficient, accurate, transparent, flexible, fast, applicable to large-scale GWASs, and robust to missing values.
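
    A minimal sketch of the core idea, an ensemble of decision trees over disjoint SNP subsets combined by majority vote, so a missing SNP disables only the trees that use it; genotypes and labels are synthetic, and the subset sizes are arbitrary.

    ```python
    # Hedged sketch: disjoint-tree ensemble with majority voting.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.integers(0, 3, size=(500, 30))         # 0/1/2 genotype codes
    y = (X[:, :5].sum(axis=1) > 5).astype(int)     # synthetic ancestry label

    subsets = np.array_split(np.arange(X.shape[1]), 3)   # disjoint SNP blocks
    trees = [DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[:, s], y)
             for s in subsets]

    votes = np.stack([t.predict(X[:, s]) for t, s in zip(trees, subsets)])
    majority = (votes.mean(axis=0) > 0.5).astype(int)
    print(f"training accuracy of 3-tree ensemble: {(majority == y).mean():.2%}")
    ```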

  2. Developing a Time Series Predictive Model for Dengue in Zhongshan, China Based on Weather and Guangzhou Dengue Surveillance Data.

    PubMed

    Zhang, Yingtao; Wang, Tao; Liu, Kangkang; Xia, Yao; Lu, Yi; Jing, Qinlong; Yang, Zhicong; Hu, Wenbiao; Lu, Jiahai

    2016-02-01

    Dengue is a re-emerging infectious disease of humans, rapidly spreading from endemic areas to dengue-free regions under favorable conditions. In recent decades, Guangzhou has again suffered several large dengue outbreaks, as have its neighboring cities. This study aims to examine the impact of dengue epidemics in Guangzhou, China, and to develop a predictive model for Zhongshan based on local weather conditions and Guangzhou dengue surveillance information. We obtained weekly dengue case data from 1st January, 2005 to 31st December, 2014 for Guangzhou and Zhongshan city from the Chinese National Disease Surveillance Reporting System. Meteorological data were collected from the Zhongshan Weather Bureau and demographic data from the Zhongshan Statistical Bureau. A negative binomial regression model with a log link function was used to analyze the relationship between weekly dengue cases in Guangzhou and Zhongshan, controlling for meteorological factors. Cross-correlation functions were applied to identify the time lags of the effect of each weather factor on weekly dengue cases. Models were validated using receiver operating characteristic (ROC) curves and k-fold cross-validation. Our results showed that weekly dengue cases in Zhongshan were significantly associated with dengue cases in Guangzhou after applying a 5-week prior moving average (Relative Risk (RR) = 2.016, 95% Confidence Interval (CI): 1.845-2.203), controlling for weather factors including minimum temperature, relative humidity, and rainfall. ROC curve analysis indicated our forecasting model performed well at different prediction thresholds, with 0.969 area under the receiver operating characteristic curve (AUC) for a threshold of 3 cases per week, 0.957 AUC for a threshold of 2 cases per week, and 0.938 AUC for a threshold of 1 case per week. Models established during k-fold cross-validation also had high AUCs (average 0.938-0.967). The sensitivity and specificity obtained from k-fold cross-validation were 78.83% and 92.48%, respectively, with a forecasting threshold of 3 cases per week; 91.17% and 91.39% with a threshold of 2 cases; and 85.16% and 87.25% with a threshold of 1 case. The out-of-sample prediction for the epidemics in 2014 also showed satisfactory performance. Our study findings suggest that the occurrence of dengue outbreaks in Guangzhou could impact dengue outbreaks in Zhongshan under suitable weather conditions. Future studies should focus on developing integrated early warning systems for dengue transmission that include local weather and human movement.
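
    A sketch of the model family used, a negative binomial regression with log link on a 5-week prior moving average of Guangzhou cases plus weather covariates; the data below are simulated, not the surveillance series.

    ```python
    # Hedged sketch using statsmodels' standard GLM/NegativeBinomial call.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    weeks = 520
    df = pd.DataFrame({
        "gz_cases": rng.poisson(5.0, weeks),     # weekly Guangzhou counts (mock)
        "min_temp": rng.normal(20, 5, weeks),
        "humidity": rng.normal(70, 10, weeks),
    })
    # 5-week prior moving average of Guangzhou cases, as in the paper
    df["gz_ma5"] = df["gz_cases"].rolling(5).mean().shift(1)
    df["zs_cases"] = rng.poisson(np.exp(0.1 * df["gz_ma5"].fillna(0)))
    df = df.dropna()

    X = sm.add_constant(df[["gz_ma5", "min_temp", "humidity"]])
    fit = sm.GLM(df["zs_cases"], X, family=sm.families.NegativeBinomial()).fit()
    # exponentiated coefficient = relative risk per unit increase
    print("RR per extra averaged Guangzhou case:", float(np.exp(fit.params["gz_ma5"])))
    ```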

  3. An improved method of early diagnosis of smoking-induced respiratory changes using machine learning algorithms.

    PubMed

    Amaral, Jorge L M; Lopes, Agnaldo J; Jansen, José M; Faria, Alvaro C D; Melo, Pedro L

    2013-12-01

    The purpose of this study was to develop an automatic classifier to increase the accuracy of the forced oscillation technique (FOT) for diagnosing early respiratory abnormalities in smoking patients. The data consisted of FOT parameters obtained from 56 volunteers: 28 healthy subjects and 28 smokers with low tobacco consumption. Several supervised learning techniques were investigated, including logistic linear classifiers, k-nearest neighbors (KNN), neural networks, and support vector machines (SVM). To evaluate performance, the ROC curve of the most accurate single parameter was established as the baseline. To determine the best input features and classifier parameters, we used genetic algorithms and 10-fold cross-validation based on the average area under the ROC curve (AUC). In the first experiment, the original FOT parameters were used as input. We observed a significant improvement in accuracy (KNN = 0.89 and SVM = 0.87) compared with the baseline (0.77). The second experiment performed a feature selection on the original FOT parameters. This selection did not cause any significant improvement in accuracy, but it was useful in identifying more adequate FOT parameters. In the third experiment, we performed a feature selection on the cross products of the FOT parameters. This selection resulted in a further increase in AUC (KNN = SVM = 0.91), which allows for high diagnostic accuracy. In conclusion, machine learning classifiers can help identify early smoking-induced respiratory alterations. The use of FOT cross products and the search for the best features and classifier parameters can markedly improve the performance of machine learning classifiers. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
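
    The study tuned features and classifier parameters with genetic algorithms scored by mean 10-fold cross-validated AUC; the sketch below substitutes a plain grid search for the GA but keeps the same scoring scheme, on synthetic stand-in data.

    ```python
    # Hedged sketch: hyperparameter search scored by 10-fold CV AUC.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.neighbors import KNeighborsClassifier

    X, y = make_classification(n_samples=56, n_features=8, random_state=0)

    search = GridSearchCV(KNeighborsClassifier(),
                          param_grid={"n_neighbors": [1, 3, 5, 7]},
                          scoring="roc_auc", cv=10)
    search.fit(X, y)
    print(search.best_params_, f"AUC={search.best_score_:.2f}")
    ```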

  4. Carboxylator: incorporating solvent-accessible surface area for identifying protein carboxylation sites

    NASA Astrophysics Data System (ADS)

    Lu, Cheng-Tsung; Chen, Shu-An; Bretaña, Neil Arvin; Cheng, Tzu-Hsiu; Lee, Tzong-Yi

    2011-10-01

    In proteins, glutamate (Glu) residues are transformed into γ-carboxyglutamate (Gla) residues in a process called carboxylation. Protein carboxylation, catalyzed by γ-glutamyl carboxylase, is deemed important due to its involvement in biological processes such as the blood-clotting cascade and bone growth. There is increasing interest within the scientific community in identifying protein carboxylation sites. However, experimental identification of carboxylation sites via mass spectrometry-based methods is expensive, time-consuming, and labor-intensive. Thus, we were motivated to design a computational method for identifying protein carboxylation sites. This work investigates protein carboxylation by considering the composition of amino acids that surround modification sites. Because a modified residue tends to be accessible on the surface of a protein, the solvent-accessible surface area (ASA) around carboxylation sites is also investigated. A radial basis function network is then employed to build a predictive model using various features for identifying carboxylation sites. Based on a five-fold cross-validation evaluation, a predictive model trained using the combined features of amino acid sequence (AA20D), amino acid composition, and ASA yields the highest accuracy, at 0.874. Furthermore, an independent test on data not included in the cross-validation process indicates that in silico identification is a feasible means of preliminary analysis. Additionally, the predictive method presented in this work is implemented as Carboxylator (http://csb.cse.yzu.edu.tw/Carboxylator/), a web-based tool for identifying carboxylated proteins and their modification sites, to help users investigate γ-glutamyl carboxylation.
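
    One plausible reading of the feature construction is a 20-dimensional amino acid composition of the window around a candidate Glu site concatenated with an ASA value; the window, window length, and ASA value below are hypothetical.

    ```python
    # Hedged sketch: composition-plus-ASA feature vector for one site.
    import numpy as np

    AA = "ACDEFGHIKLMNPQRSTVWY"

    def aa_composition(window: str) -> np.ndarray:
        """Fraction of each of the 20 amino acids in a sequence window."""
        return np.array([window.count(a) for a in AA]) / max(len(window), 1)

    window = "MKTAYIAKQRE"            # hypothetical residues around a Glu site
    asa = np.array([0.42])            # mocked solvent-accessible surface area
    features = np.concatenate([aa_composition(window), asa])
    print(features.shape)             # 21 features feeding the RBF network
    ```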

  5. Evaluating current automatic de-identification methods with Veteran's health administration clinical documents.

    PubMed

    Ferrández, Oscar; South, Brett R; Shen, Shuying; Friedlin, F Jeffrey; Samore, Matthew H; Meystre, Stéphane M

    2012-07-27

    The increased use and adoption of Electronic Health Records (EHR) causes tremendous growth in digital information useful for clinicians, researchers and many other operational purposes. However, this information is rich in Protected Health Information (PHI), which severely restricts its access and possible uses. A number of investigators have developed methods for automatically de-identifying EHR documents by removing PHI, as specified in the Health Insurance Portability and Accountability Act "Safe Harbor" method. This study focuses on the evaluation of existing automated text de-identification methods and tools, as applied to Veterans Health Administration (VHA) clinical documents, to assess which methods perform better with each category of PHI found in our clinical notes, and to determine when new methods are needed to improve performance. We installed and evaluated five text de-identification systems "out-of-the-box" using a corpus of VHA clinical documents. The systems based on machine learning methods were trained with the 2006 i2b2 de-identification corpora and evaluated with our VHA corpus, both directly and in a ten-fold cross-validation experiment using our VHA corpus. We counted exact, partial, and fully contained matches with reference annotations, considering each PHI type separately or only one unique 'PHI' category. Performance of the systems was assessed using recall (equivalent to sensitivity) and precision (equivalent to positive predictive value) metrics, as well as the F2-measure. Overall, systems based on rules and pattern matching achieved better recall, and precision was always better with systems based on machine learning approaches. The highest "out-of-the-box" F2-measure was 67% for partial matches; the best precision and recall were 95% and 78%, respectively. Finally, the ten-fold cross-validation experiment raised the F2-measure to 79% for partial matches. The "out-of-the-box" evaluation of text de-identification systems provided us with compelling insight into the best methods for de-identification of VHA clinical documents. The error analysis demonstrated an important need for customization to PHI formats specific to VHA documents. This study informed the planning and development of a "best-of-breed" automatic de-identification application for VHA clinical text.
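
    The F2-measure weights recall twice as heavily as precision, matching the priority of not missing PHI; scikit-learn exposes it directly via fbeta_score with beta=2.

    ```python
    # F2 on toy token labels (1 = PHI); values are illustrative only.
    from sklearn.metrics import fbeta_score

    y_true = [1, 1, 1, 0, 0, 1, 0, 1]
    y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
    print(f"F2 = {fbeta_score(y_true, y_pred, beta=2):.2f}")
    ```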

  6. Prediction of drug indications based on chemical interactions and chemical similarities.

    PubMed

    Huang, Guohua; Lu, Yin; Lu, Changhong; Zheng, Mingyue; Cai, Yu-Dong

    2015-01-01

    Discovering potential indications of novel or approved drugs is a key step in drug development. Previous computational approaches can be categorized as disease-centric or drug-centric according to their starting point, and as small-scale or large-scale according to the diversity of their datasets. Here, using a large drug indication dataset, a classifier was constructed to predict the indications of a drug based on the assumption that interacting/associated drugs, or drugs with similar structures, are more likely to target the same diseases. To examine the classifier, it was run five times on a dataset of 1,573 drugs retrieved from the Comprehensive Medicinal Chemistry database and evaluated by 5-fold cross-validation, yielding five first-order prediction accuracies that were all approximately 51.48%. Meanwhile, on an independent test set of 32 other drugs with confirmed repositioning, the model yielded a first-order prediction accuracy of 50.00%. Interestingly, some clinically repurposed drug indications that were not included in the datasets were successfully identified by our method. These results suggest that our method may become a useful tool to associate novel molecules with new indications, or alternative indications with existing drugs.
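
    A toy sketch of the underlying assumption, ranking candidate indications by similarity-weighted votes from a drug's neighbors; the similarities and indication sets are invented stand-ins for the CMC data.

    ```python
    # Hedged sketch: similarity-weighted voting over neighbor indications.
    from collections import Counter

    similar = {"drugB": 0.9, "drugC": 0.6, "drugD": 0.2}   # similarity to query
    indications = {"drugB": {"hypertension", "angina"},
                   "drugC": {"hypertension"},
                   "drugD": {"migraine"}}

    scores = Counter()
    for d, s in similar.items():
        for ind in indications[d]:
            scores[ind] += s
    # 1st-order prediction = the top-ranked indication
    print(scores.most_common())
    ```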

  7. Prediction of Drug Indications Based on Chemical Interactions and Chemical Similarities

    PubMed Central

    Huang, Guohua; Lu, Yin; Lu, Changhong; Cai, Yu-Dong

    2015-01-01

    Discovering potential indications of novel or approved drugs is a key step in drug development. Previous computational approaches can be categorized as disease-centric or drug-centric according to their starting point, and as small-scale or large-scale according to the diversity of their datasets. Here, using a large drug indication dataset, a classifier was constructed to predict the indications of a drug based on the assumption that interacting/associated drugs, or drugs with similar structures, are more likely to target the same diseases. To examine the classifier, it was run five times on a dataset of 1,573 drugs retrieved from the Comprehensive Medicinal Chemistry database and evaluated by 5-fold cross-validation, yielding five first-order prediction accuracies that were all approximately 51.48%. Meanwhile, on an independent test set of 32 other drugs with confirmed repositioning, the model yielded a first-order prediction accuracy of 50.00%. Interestingly, some clinically repurposed drug indications that were not included in the datasets were successfully identified by our method. These results suggest that our method may become a useful tool to associate novel molecules with new indications, or alternative indications with existing drugs. PMID:25821813

  8. Improved detection of congestive heart failure via probabilistic symbolic pattern recognition and heart rate variability metrics.

    PubMed

    Mahajan, Ruhi; Viangteeravat, Teeradache; Akbilgic, Oguz

    2017-12-01

    A timely diagnosis of congestive heart failure (CHF) is crucial to avert a life-threatening event. This paper presents a novel probabilistic symbol pattern recognition (PSPR) approach to detect CHF in subjects from their cardiac interbeat (R-R) intervals. PSPR discretizes each continuous R-R interval time series by mapping it onto an eight-symbol alphabet and then models the pattern transition behavior in the symbolic representation of the series. The PSPR-based analysis of the discretized series from 107 subjects (69 normal and 38 CHF subjects) yielded discernible features for distinguishing normal subjects from subjects with CHF. In addition to PSPR features, we also extracted features using time-domain heart rate variability measures such as the average and standard deviation of R-R intervals. An ensemble of bagged decision trees was used to classify the two groups, resulting in a five-fold cross-validation accuracy, specificity, and sensitivity of 98.1%, 100%, and 94.7%, respectively. A 20% holdout validation yielded an accuracy, specificity, and sensitivity of 99.5%, 100%, and 98.57%, respectively. Results from this study suggest that features obtained by combining PSPR with long-term heart rate variability measures can be used in developing automated CHF diagnosis tools. Copyright © 2017 Elsevier B.V. All rights reserved.
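
    A sketch of the discretization step as described: binning onto an eight-symbol alphabet followed by a transition-probability matrix. The quantile binning rule is an assumption, and the R-R series is simulated.

    ```python
    # Hedged sketch: eight-symbol discretization + transition features.
    import numpy as np

    def pspr_features(rr: np.ndarray, n_symbols: int = 8) -> np.ndarray:
        edges = np.quantile(rr, np.linspace(0, 1, n_symbols + 1)[1:-1])
        symbols = np.digitize(rr, edges)             # values in 0..7
        trans = np.zeros((n_symbols, n_symbols))
        for a, b in zip(symbols[:-1], symbols[1:]):
            trans[a, b] += 1
        trans /= max(trans.sum(), 1)                 # transition probabilities
        return trans.ravel()                         # 64 features per recording

    rr = np.random.default_rng(0).normal(0.8, 0.05, 1000)  # synthetic R-R series
    print(pspr_features(rr).shape)                   # (64,)
    ```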

  9. Tumour gene expression predicts response to cetuximab in patients with KRAS wild-type metastatic colorectal cancer.

    PubMed

    Baker, J B; Dutta, D; Watson, D; Maddala, T; Munneke, B M; Shak, S; Rowinsky, E K; Xu, L-A; Harbison, C T; Clark, E A; Mauro, D J; Khambata-Ford, S

    2011-02-01

    Although it is accepted that metastatic colorectal cancers (mCRCs) that carry activating mutations in KRAS are unresponsive to anti-epidermal growth factor receptor (EGFR) monoclonal antibodies, a significant fraction of KRAS wild-type (wt) mCRCs are also unresponsive to anti-EGFR therapy. Genes encoding the EGFR ligands amphiregulin (AREG) and epiregulin (EREG) are promising gene expression-based markers but have not been incorporated into a test to dichotomise KRAS wt mCRC patients with respect to sensitivity to anti-EGFR treatment. We used RT-PCR to test 110 candidate gene expression markers in primary tumours from 144 KRAS wt mCRC patients who received monotherapy with the anti-EGFR antibody cetuximab. Results were correlated with multiple clinical endpoints: disease control, objective response, and progression-free survival (PFS). Expression of many of the tested candidate genes, including EREG and AREG, strongly associated with all clinical endpoints. Using multivariate analysis with two-layer five-fold cross-validation, we constructed a four-gene predictive classifier. Strikingly, patients below the classifier cutpoint had PFS and disease control rates similar to those of patients with KRAS mutant mCRC. Gene expression appears to identify KRAS wt mCRC patients who receive little benefit from cetuximab. It will be important to test this model in an independent validation study.
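
    "Two-layer five-fold cross-validation" corresponds to nested CV: an inner five-fold loop selects genes and tuning, and an outer five-fold loop estimates performance. The sketch below uses simulated expression data and a logistic model in place of the four-gene classifier.

    ```python
    # Hedged sketch of nested (two-layer) five-fold cross-validation.
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV, cross_val_score
    from sklearn.pipeline import make_pipeline

    X, y = make_classification(n_samples=144, n_features=110, n_informative=8,
                               random_state=0)
    pipe = make_pipeline(SelectKBest(k=4), LogisticRegression(max_iter=1000))
    inner = GridSearchCV(pipe, {"logisticregression__C": [0.1, 1, 10]}, cv=5)
    outer = cross_val_score(inner, X, y, cv=5)   # unbiased outer estimate
    print(f"nested CV accuracy: {outer.mean():.2f}")
    ```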

  10. Machine learning for the automatic detection of anomalous events

    NASA Astrophysics Data System (ADS)

    Fisher, Wendy D.

    In this dissertation, we describe our research contributions for a novel approach to the application of machine learning for the automatic detection of anomalous events. We work in two different domains to ensure a robust data-driven workflow that could be generalized for monitoring other systems. Specifically, in our first domain, we begin with the identification of internal erosion events in earth dams and levees (EDLs) using geophysical data collected from sensors located on the surface of the levee. As EDLs across the globe reach the end of their design lives, effectively monitoring their structural integrity is of critical importance. The second domain of interest is related to mobile telecommunications, where we investigate a system for automatically detecting non-commercial base station routers (BSRs) operating in protected frequency space. The presence of non-commercial BSRs can disrupt the connectivity of end users, cause service issues for the commercial providers, and introduce significant security concerns. We provide our motivation, experimentation, and results from investigating a generalized novel data-driven workflow using several machine learning techniques. In Chapter 2, we present results from our performance study that uses popular unsupervised clustering algorithms to gain insights into our real-world problems, and evaluate our results using internal and external validation techniques. Using EDL passive seismic data from an experimental laboratory earth embankment, results consistently show a clear separation of events from non-events in four of the five clustering algorithms applied. Chapter 3 uses a multivariate Gaussian machine learning model to identify anomalies in our experimental data sets. For the EDL work, we used experimental data from two different laboratory earth embankments. Additionally, we explore five wavelet transform methods for signal denoising. The best performance is achieved with the Haar wavelets. We achieve up to 97.3% overall accuracy and less than 1.4% false negatives in anomaly detection. In Chapter 4, we investigate two-class and one-class support vector machines (SVMs) for an effective anomaly detection system. We again use the two different EDL data sets from experimental laboratory earth embankments (each having approximately 80% normal and 20% anomalies) to ensure our workflow is robust enough to work with multiple data sets and different types of anomalous events (e.g., cracks and piping). We apply Haar wavelet-denoising techniques and extract nine spectral features from decomposed segments of the time series data. The two-class SVM with 10-fold cross-validation achieved over 94% overall accuracy and a 96% F1-score. Our approach provides a means for automatically identifying anomalous events using various machine learning techniques. Detecting internal erosion events in aging EDLs, earlier than is currently possible, can allow more time to prevent or mitigate catastrophic failures. Results show that we can successfully separate normal from anomalous data observations in passive seismic data, and provide a step towards techniques for continuous real-time monitoring of EDL health. Our lightweight non-commercial BSR detection system also has promise in separating commercial from non-commercial BSR scans without the need for prior geographic location information, extensive time-lapse surveys, or a database of known commercial carriers. (Abstract shortened by ProQuest.)
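
    A sketch of the Chapter 3 detector under stated assumptions: fit a multivariate Gaussian (mean and covariance) on normal observations, then flag test points whose density falls below a training-set quantile. Features are simulated.

    ```python
    # Hedged sketch: multivariate Gaussian anomaly detection.
    import numpy as np
    from scipy.stats import multivariate_normal

    rng = np.random.default_rng(0)
    normal = rng.normal(0, 1, size=(800, 9))          # nine spectral features
    anomalies = rng.normal(4, 1, size=(40, 9))

    mu, cov = normal.mean(axis=0), np.cov(normal, rowvar=False)
    dist = multivariate_normal(mean=mu, cov=cov)

    threshold = np.quantile(dist.pdf(normal), 0.01)   # density cutoff
    test = np.vstack([normal[:100], anomalies])
    flags = dist.pdf(test) < threshold
    print(f"{flags[100:].mean():.0%} of injected anomalies flagged")
    ```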

  11. Geometry and Kinematics of Fault-Propagation Folds with Variable Interlimb Angles

    NASA Astrophysics Data System (ADS)

    Dhont, D.; Jabbour, M.; Hervouet, Y.; Deroin, J.

    2009-12-01

    Fault-propagation folds are common features in foreland basins and fold-and-thrust belts. Several conceptual models have been proposed to account for their geometry and kinematics. It is generally accepted that the shape of fault-propagation folds depends directly on both the amount of displacement along the basal decollement level and the dip angle of the ramp. Among these models, the variable interlimb angle model proposed by Mitra (1990) is based on a folding kinematics able to explain both open and close natural folds. However, the application of this model is limited because the geometric evolution and thickness variation of the fold depend directly on imposed parameters such as the maximum value of the ramp height. Here, we use the ramp and interlimb angles as input data to develop a forward fold model accounting for thickness variations in the forelimb. The relationship between fold amplitude and fold wavelength is subsequently applied to build balanced geologic cross-sections from surface parameters only, and to propose a kinematic restoration of the folding through time. We considered three natural examples to validate the variable interlimb angle model. Observed thickness variations in the forelimb of the Turner Valley anticline in the Alberta foothills of Canada correspond precisely to the theoretical values proposed by our model. Deep reconstruction of the Alima anticline in the southern Tunisian Atlas implies that the decollement level is localized in the Triassic-Liassic series, as highlighted by seismic imaging. Our kinematic reconstruction of the Ucero anticline in the Spanish Castilian mountains is also in agreement with the anticline geometry derived from two cross-sections. The variable interlimb angle model implies that a fault-propagation fold can be symmetric, normally asymmetric (with a greater dip in the forelimb than in the backlimb), or reversely asymmetric (with a greater dip in the backlimb), depending on the amount of shortening. This model also allows one: (i) to easily explain folds with a wide variety of geometries; (ii) to understand the deep architecture of anticlines; and (iii) to deduce the kinematic evolution of folding through time. Mitra, S., 1990, Fault-propagation folds: geometry, kinematic evolution, and hydrocarbon traps. AAPG Bulletin, v. 74, no. 6, p. 921-945.

  12. An accurate sleep stages classification system using a new class of optimally time-frequency localized three-band wavelet filter bank.

    PubMed

    Sharma, Manish; Goyal, Deepanshu; Achuth, P V; Acharya, U Rajendra

    2018-07-01

    Sleep-related disorders diminish quality of life in human beings. Sleep scoring, or sleep staging, is the process of classifying sleep stages, which helps to assess the quality of sleep. The identification of sleep stages using electroencephalogram (EEG) signals is an arduous task: just by looking at an EEG signal, one cannot determine the sleep stages precisely, and sleep specialists may make errors in identifying sleep stages by visual inspection. To mitigate erroneous identification and to reduce the burden on doctors, a computer-aided EEG-based system can be deployed in hospitals to help identify sleep stages correctly. Several automated systems based on the analysis of polysomnographic (PSG) signals have been proposed, as have a few sleep stage scoring systems using EEG signals. However, there is still a need for a robust, accurate, and portable system developed on a large dataset. In this study, we have developed a new single-channel EEG-based sleep-stage identification system using a novel set of wavelet-based features extracted from a large EEG dataset. We employed a novel three-band time-frequency localized (TBTFL) wavelet filter bank (FB). The EEG signals are decomposed using three-level wavelet decomposition, yielding seven sub-bands (SBs). This is followed by the computation of discriminating features, namely log-energy (LE), signal fractal dimension (SFD), and signal sample entropy (SSE), from all seven SBs. The extracted features are ranked and fed to the support vector machine (SVM) and other supervised learning classifiers. We considered five different classification problems (CPs): two-class (CP-1), three-class (CP-2), four-class (CP-3), five-class (CP-4), and six-class (CP-5). The proposed system yielded accuracies of 98.3%, 93.9%, 92.1%, 91.7%, and 91.5% for CP-1 to CP-5, respectively, using the 10-fold cross-validation (CV) technique. Copyright © 2018 Elsevier Ltd. All rights reserved.
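
    A sketch of the log-energy feature stage; PyWavelets' dyadic 'db4' decomposition stands in for the custom TBTFL three-band filter bank, so three levels give four sub-bands here rather than the seven described.

    ```python
    # Hedged sketch: three-level wavelet decomposition + log-energy per sub-band.
    import numpy as np
    import pywt

    eeg = np.random.default_rng(0).normal(size=3000)   # one mock EEG epoch
    subbands = pywt.wavedec(eeg, "db4", level=3)       # [cA3, cD3, cD2, cD1]
    log_energy = [np.log(np.sum(c ** 2)) for c in subbands]
    print([f"{v:.1f}" for v in log_energy])
    ```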

  13. Spatio-temporal texture (SpTeT) for distinguishing vulnerable from stable atherosclerotic plaque on dynamic contrast enhancement (DCE) MRI in a rabbit model

    PubMed Central

    Wan, Tao; Madabhushi, Anant; Phinikaridou, Alkystis; Hamilton, James A.; Hua, Ning; Pham, Tuan; Danagoulian, Jovanna; Kleiman, Ross; Buckler, Andrew J.

    2014-01-01

    Purpose: To develop a new spatio-temporal texture (SpTeT) based method for distinguishing vulnerable from stable atherosclerotic plaques on DCE-MRI using a rabbit model of atherothrombosis. Methods: Aortic atherosclerosis was induced in 20 New Zealand White rabbits by cholesterol diet and endothelial denudation. MRI was performed before (pretrigger) and after (posttrigger) inducing plaque disruption with Russell's viper venom and histamine. Of the 30 vascular targets (segments) under histology analysis, 16 contained thrombus (vulnerable) and 14 did not (stable). A total of 352 voxel-wise computerized SpTeT features, including 192 Gabor, 36 Kirsch, 12 Sobel, 52 Haralick, and 60 first-order textural features, were extracted on DCE-MRI to capture subtle texture changes in the plaques over the course of contrast uptake. Different combinations of SpTeT feature sets, with the features ranked by a minimum-redundancy-maximum-relevance feature selection technique, were evaluated via a random forest classifier. A 2-fold cross-validation with 500 iterations was performed to discriminate vulnerable from stable atherosclerotic plaque on a per-voxel basis. Four quantitative metrics were utilized to measure the classification performance in separating vulnerable and stable plaques. Results: The quantitative results show that the combination of five classes of SpTeT features can distinguish between vulnerable (disrupted plaques with an overlying thrombus) and stable plaques, with best AUC values of 0.9631 ± 0.0088, accuracy of 89.98% ± 0.57%, sensitivity of 83.71% ± 1.71%, and specificity of 94.55% ± 0.48%. Conclusions: Vulnerable and stable plaque can be distinguished by SpTeT-based features. The SpTeT features, following validation on larger datasets, could be established as effective and reliable imaging biomarkers for noninvasively assessing atherosclerotic risk. PMID:24694153
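
    The validation protocol, 2-fold cross-validation repeated many times with a random forest, can be expressed compactly; the voxel features below are synthetic and fewer repeats are used than the paper's 500.

    ```python
    # Hedged sketch: repeated 2-fold CV with a random forest, reporting AUC.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

    X, y = make_classification(n_samples=400, n_features=50, random_state=0)
    cv = RepeatedStratifiedKFold(n_splits=2, n_repeats=50, random_state=0)
    aucs = cross_val_score(RandomForestClassifier(random_state=0), X, y,
                           cv=cv, scoring="roc_auc")
    print(f"AUC = {aucs.mean():.4f} +/- {aucs.std():.4f}")
    ```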

  14. Fourier-transform-infrared-spectroscopy based spectral-biomarker selection towards optimum diagnostic differentiation of oral leukoplakia and cancer.

    PubMed

    Banerjee, Satarupa; Pal, Mousumi; Chakrabarty, Jitamanyu; Petibois, Cyril; Paul, Ranjan Rashmi; Giri, Amita; Chatterjee, Jyotirmoy

    2015-10-01

    In search of specific label-free biomarkers for differentiating two oral lesions, namely oral leukoplakia (OLK) and oral squamous-cell carcinoma (OSCC), Fourier-transform infrared (FTIR) spectroscopy was performed on paraffin-embedded tissue sections from 47 human subjects (eight normal (NOM), 16 OLK, and 23 OSCC). Difference between mean spectra (DBMS), Mann-Whitney's U test, and forward feature selection (FFS) techniques were used to optimise spectral-marker selection. Classification of diseases was performed with linear and quadratic support vector machines (SVM) at 10-fold cross-validation, using different combinations of spectral features. The six features obtained through FFS (1782, 1713, 1665, 1545, 1409, and 1161 cm(-1)) were the most significant, enabling differentiation of NOM and OSCC tissue and classifying OLK and OSCC with 81.3% sensitivity, 95.7% specificity, and 89.7% overall accuracy. The 43 spectral markers extracted through Mann-Whitney's U test were the least significant when the quadratic SVM was used. Considering the high sensitivity and specificity of the FFS technique, extracting only six spectral biomarkers was thus most useful for diagnosis of OLK and OSCC, and for overcoming the inter- and intra-observer variability experienced in best-practice histopathological diagnosis. By considering the biochemical assignments of these six spectral signatures, this work also revealed altered glycogen and keratin content in histological sections that could discriminate OLK and OSCC. The method was validated through spectral selection by the DBMS technique. This method thus has potential to minimise diagnostic costs for oral lesions through label-free biomarker identification.
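
    Greedy forward feature selection scored by 10-fold cross-validated SVM performance can be sketched with scikit-learn's SequentialFeatureSelector; the FTIR intensities below are mocked.

    ```python
    # Hedged sketch: forward selection of six spectral features for an SVM.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SequentialFeatureSelector
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=47, n_features=40, n_informative=6,
                               random_state=0)
    sfs = SequentialFeatureSelector(SVC(), n_features_to_select=6,
                                    direction="forward", cv=10)
    sfs.fit(X, y)
    print(np.flatnonzero(sfs.get_support()))   # indices of selected wavenumbers
    ```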

  15. Texture classification of normal tissues in computed tomography using Gabor filters

    NASA Astrophysics Data System (ADS)

    Dettori, Lucia; Bashir, Alia; Hasemann, Julie

    2007-03-01

    The research presented in this article is aimed at developing an automated imaging system for the classification of normal tissues in medical images obtained from Computed Tomography (CT) scans. Texture features based on a bank of Gabor filters are used to classify the following tissues of interest: liver, spleen, kidney, aorta, trabecular bone, lung, muscle, IP fat, and SQ fat. The approach consists of three steps: convolution of the regions of interest with a bank of 32 Gabor filters (4 frequencies and 8 orientations), extraction of two Gabor texture features per filter (mean and standard deviation), and creation of a Classification and Regression Tree-based classifier that automatically identifies the various tissues. The data set used consists of approximately 1000 DICOM images from normal chest and abdominal CT scans of five patients. The regions of interest were labeled by expert radiologists. Optimal trees were generated using two techniques: 10-fold cross-validation and splitting of the data set into a training and a testing set. In both cases, perfect classification rules were obtained provided enough images were available for training (~65%). All performance measures (sensitivity, specificity, precision, and accuracy) for all regions of interest were at 100%. This significantly improves previous results that used Wavelet, Ridgelet, and Curvelet texture features, which yielded accuracy values in the 85%-98% range. The Gabor filters' ability to isolate features at different frequencies and orientations allows for a multi-resolution analysis of texture, essential when dealing with the at-times very subtle differences in the texture of tissues in CT scans.
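
    The feature extraction step, mean and standard deviation of the responses of a 4-frequency by 8-orientation Gabor bank, might look as follows; the frequencies chosen here are assumptions, and the region of interest is random.

    ```python
    # Hedged sketch: 32-filter Gabor bank yielding 64 texture features.
    import numpy as np
    from skimage.filters import gabor

    roi = np.random.default_rng(0).random((64, 64))   # mock CT region of interest

    features = []
    for frequency in (0.05, 0.1, 0.2, 0.4):           # assumed frequencies
        for theta in np.arange(8) * np.pi / 8:        # 8 orientations
            real, imag = gabor(roi, frequency=frequency, theta=theta)
            mag = np.hypot(real, imag)
            features += [mag.mean(), mag.std()]
    print(len(features))   # 64 features feeding the CART classifier
    ```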

  16. Prediction of mitochondrial proteins of malaria parasite using split amino acid composition and PSSM profile.

    PubMed

    Verma, Ruchi; Varshney, Grish C; Raghava, G P S

    2010-06-01

    Human mortality from malaria continues to rise, and thus the malaria-causing parasite Plasmodium falciparum (PF) remains a cause for concern. With the wealth of data now available, it is imperative to understand protein localization in order to gain deeper insight into proteins' functional roles. In this manuscript, an attempt has been made to develop a prediction method for the localization of mitochondrial proteins. We describe a method for predicting mitochondrial proteins of the malaria parasite using machine-learning techniques. All models were trained and tested on 175 proteins (40 mitochondrial and 135 non-mitochondrial proteins) and evaluated using five-fold cross-validation. We developed a Support Vector Machine (SVM) model for predicting mitochondrial proteins of P. falciparum using amino acid and dipeptide compositions, achieving maximum MCCs of 0.38 and 0.51, respectively. We also used split amino acid composition (SAAC), in which the compositions of the N-terminus, the C-terminus, and the rest of the protein are computed separately. The performance of the SVM model improved significantly, from MCC 0.38 to 0.73, when SAAC was used as input instead of simple amino acid composition. In addition, an SVM model was developed using the composition of the PSSM profile, with MCC 0.75 and accuracy 91.38%. We achieved a maximum MCC of 0.81 with accuracy 92% using a hybrid model that combines the PSSM profile and SAAC. When evaluated on an independent dataset, our method performs better than existing methods. A web server, PFMpred, has been developed for predicting mitochondrial proteins of malaria parasites (http://www.imtech.res.in/raghava/pfmpred/).
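
    A sketch of split amino acid composition (SAAC) as described: separate 20-dimensional compositions for the N-terminus, the C-terminus, and the remainder, giving 60 features; the 25-residue terminus length is an assumed parameter.

    ```python
    # Hedged sketch: SAAC feature vector for one protein sequence.
    import numpy as np

    AA = "ACDEFGHIKLMNPQRSTVWY"

    def composition(seq: str) -> np.ndarray:
        return np.array([seq.count(a) for a in AA]) / max(len(seq), 1)

    def saac(seq: str, term: int = 25) -> np.ndarray:
        parts = (seq[:term], seq[term:-term] or seq, seq[-term:])
        return np.concatenate([composition(p) for p in parts])

    print(saac("M" * 30 + "KTAYIAKQR" * 10).shape)   # (60,)
    ```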

  17. Voxel-based plaque classification in coronary intravascular optical coherence tomography images using decision trees

    NASA Astrophysics Data System (ADS)

    Kolluru, Chaitanya; Prabhu, David; Gharaibeh, Yazan; Wu, Hao; Wilson, David L.

    2018-02-01

    Intravascular Optical Coherence Tomography (IVOCT) is a high-contrast, 3D microscopic imaging technique that can be used to assess atherosclerosis and guide stent interventions. Despite its advantages, IVOCT image interpretation is challenging and time consuming, with over 500 image frames generated in a single pullback volume. We have developed a method to classify voxel plaque types in IVOCT images using machine learning. To train and test the classifier, we used our unique database of labeled cadaver vessel IVOCT images accurately registered to gold-standard cryoimages. This database currently contains 300 images and is growing. Each voxel is labeled as fibrotic, lipid-rich, calcified, or other. Optical attenuation, intensity, and texture features were extracted for each voxel and were used to build a decision tree classifier for multi-class classification. Five-fold cross-validation across images gave accuracies of 96% ± 0.01%, 90% ± 0.02%, and 90% ± 0.01% for the fibrotic, lipid-rich, and calcified classes, respectively. To rectify the performance degradation seen on left-out vessel specimens as opposed to left-out images, we are adding data and reducing features to limit overfitting. Following spatial noise cleaning, important vascular regions were unambiguous in the display. We developed displays that enable physicians to make rapid determinations of calcified and lipid regions. This will inform treatment decisions such as the need for devices (e.g., atherectomy or a scoring balloon in the case of calcifications) or extended stent lengths to ensure coverage of lipid regions prone to injury at the edge of a stent.
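
    Cross-validating across images rather than across voxels prevents voxels from one frame from leaking between folds; grouped k-fold expresses this directly. The features, labels, and frame IDs below are synthetic.

    ```python
    # Hedged sketch: decision-tree voxel classification with image-wise folds.
    import numpy as np
    from sklearn.model_selection import GroupKFold, cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.random((3000, 10))               # per-voxel attenuation/intensity/texture
    image_id = rng.integers(0, 300, 3000)    # which frame each voxel came from
    y = (X[:, 0] + 0.1 * rng.standard_normal(3000) > 0.5).astype(int)

    acc = cross_val_score(DecisionTreeClassifier(max_depth=6, random_state=0),
                          X, y, cv=GroupKFold(n_splits=5), groups=image_id)
    print(f"image-wise 5-fold accuracy: {acc.mean():.2%}")
    ```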

  18. Predicting soil properties for sustainable agriculture using vis-NIR spectroscopy: a case study in northern Greece

    NASA Astrophysics Data System (ADS)

    Tsakiridis, Nikolaos L.; Tziolas, Nikolaos; Dimitrakos, Agathoklis; Galanis, Georgios; Ntonou, Eleftheria; Tsirika, Anastasia; Terzopoulou, Evangelia; Kalopesa, Eleni; Zalidis, George C.

    2017-09-01

    Soil spectral libraries facilitate agricultural production in line with the principles of low-input sustainable agriculture, and provide valuable knowledge to environmental policy makers, enabling improved decision making and effective management of natural resources in the region. In this paper, a comparison of the predictive performance of two state-of-the-art algorithms employed in soil spectroscopy, one linear (Partial Least Squares Regression) and one non-linear (Cubist), is conducted. The comparison was carried out on a regional soil spectral library developed in the Eastern Macedonia and Thrace region of Northern Greece, comprising roughly 450 Entisol soil samples from soil horizons A (0-30 cm) and B (30-60 cm). The soil spectra were acquired in the visible-near-infrared region (vis-NIR, 350-2500 nm) using a standard laboratory protocol. Three soil properties essential for agriculture were analyzed and taken into account for the comparison: organic matter, clay content, and the concentration of nitrate-N. Additionally, three different spectral pre-processing techniques were utilized, namely continuum removal, absorbance transformation, and the first derivative. Following the removal of outliers using the Mahalanobis distance in the first 5 principal components of the spectra (accounting for 99.8% of the variance), a five-fold cross-validation experiment was considered for all 12 datasets. Statistical comparisons were conducted on the results, which indicate that the Cubist algorithm outperforms PLSR, while the most informative transformation is the first derivative.
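
    The linear arm of the comparison can be sketched with PLS regression on first-derivative spectra under five-fold cross-validation (Cubist has no standard Python port, so it is omitted); spectra and property values are mocked.

    ```python
    # Hedged sketch: five-fold CV of PLSR on first-derivative spectra.
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    spectra = rng.random((450, 2151))        # mock 350-2500 nm at 1 nm steps
    om = spectra[:, 500:510].mean(axis=1) + rng.normal(0, 0.01, 450)

    deriv = np.gradient(spectra, axis=1)     # first-derivative pre-processing
    r2 = cross_val_score(PLSRegression(n_components=10), deriv, om,
                         cv=5, scoring="r2")
    print(f"five-fold R2 = {r2.mean():.2f}")
    ```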

  19. Validation of cross-sectional time series and multivariate adaptive regression splines models for the prediction of energy expenditure in children and adolescents using doubly labeled water

    USDA-ARS?s Scientific Manuscript database

    Accurate, nonintrusive, and inexpensive techniques are needed to measure energy expenditure (EE) in free-living populations. Our primary aim in this study was to validate cross-sectional time series (CSTS) and multivariate adaptive regression splines (MARS) models based on observable participant cha...

  20. Mind your crossings: Mining GIS imagery for crosswalk localization.

    PubMed

    Ahmetovic, Dragan; Manduchi, Roberto; Coughlan, James M; Mascetti, Sergio

    2017-04-01

    For blind travelers, finding crosswalks and remaining within their borders while traversing them is a crucial part of any trip involving street crossings. While standard Orientation & Mobility (O&M) techniques allow blind travelers to safely negotiate street crossings, additional information about crosswalks and other important features at intersections would be helpful in many situations, resulting in greater safety and/or comfort during independent travel. For instance, in planning a trip a blind pedestrian may wish to be informed of the presence of all marked crossings near a desired route. We have conducted a survey of several O&M experts from the United States and Italy to determine the role that crosswalks play in travel by blind pedestrians. The results show stark differences between survey respondents from the U.S. compared with Italy: the former group emphasized the importance of following standard O&M techniques at all legal crossings (marked or unmarked), while the latter group strongly recommended crossing at marked crossings whenever possible. These contrasting opinions reflect differences in the traffic regulations of the two countries and highlight the diversity of needs that travelers in different regions may have. To address the challenges faced by blind pedestrians in negotiating street crossings, we devised a computer vision-based technique that mines existing spatial image databases for discovery of zebra crosswalks in urban settings. Our algorithm first searches for zebra crosswalks in satellite images; all candidates thus found are validated against spatially registered Google Street View images. This cascaded approach enables fast and reliable discovery and localization of zebra crosswalks in large image datasets. While fully automatic, our algorithm can be improved by a final crowdsourcing validation. To this end, we developed a Pedestrian Crossing Human Validation (PCHV) web service, which supports crowdsourcing to rule out false positives and identify false negatives.

  1. Mind your crossings: Mining GIS imagery for crosswalk localization

    PubMed Central

    Ahmetovic, Dragan; Manduchi, Roberto; Coughlan, James M.; Mascetti, Sergio

    2017-01-01

    For blind travelers, finding crosswalks and remaining within their borders while traversing them is a crucial part of any trip involving street crossings. While standard Orientation & Mobility (O&M) techniques allow blind travelers to safely negotiate street crossings, additional information about crosswalks and other important features at intersections would be helpful in many situations, resulting in greater safety and/or comfort during independent travel. For instance, in planning a trip a blind pedestrian may wish to be informed of the presence of all marked crossings near a desired route. We have conducted a survey of several O&M experts from the United States and Italy to determine the role that crosswalks play in travel by blind pedestrians. The results show stark differences between survey respondents from the U.S. compared with Italy: the former group emphasized the importance of following standard O&M techniques at all legal crossings (marked or unmarked), while the latter group strongly recommended crossing at marked crossings whenever possible. These contrasting opinions reflect differences in the traffic regulations of the two countries and highlight the diversity of needs that travelers in different regions may have. To address the challenges faced by blind pedestrians in negotiating street crossings, we devised a computer vision-based technique that mines existing spatial image databases for discovery of zebra crosswalks in urban settings. Our algorithm first searches for zebra crosswalks in satellite images; all candidates thus found are validated against spatially registered Google Street View images. This cascaded approach enables fast and reliable discovery and localization of zebra crosswalks in large image datasets. While fully automatic, our algorithm can be improved by a final crowdsourcing validation. To this end, we developed a Pedestrian Crossing Human Validation (PCHV) web service, which supports crowdsourcing to rule out false positives and identify false negatives. PMID:28757907

  2. Development and Cross-National Validation of a Laboratory Classroom Environment Instrument for Senior High School Science.

    ERIC Educational Resources Information Center

    Fraser, Barry J.; And Others

    1993-01-01

    Describes the development of the Science Laboratory Environment Inventory (SLEI) instrument for assessing perceptions of the psychosocial environment in science laboratory classrooms, and reports validation information for samples of senior high school students from six different countries. The SLEI assesses five dimensions of the actual and…

  3. RRegrs: an R package for computer-aided model selection with multiple regression models.

    PubMed

    Tsiliki, Georgia; Munteanu, Cristian R; Seoane, Jose A; Fernandez-Lozano, Carlos; Sarimveis, Haralambos; Willighagen, Egon L

    2015-01-01

    Predictive regression models can be created with many different modelling approaches. Choices need to be made for data set splitting, cross-validation methods, specific regression parameters and best model criteria, as they all affect the accuracy and efficiency of the produced predictive models and, therefore, raise model reproducibility and comparison issues. Cheminformatics and bioinformatics make extensive use of predictive modelling and exhibit a need for standardization of these methodologies in order to assist model selection and speed up the process of predictive model development. A tool accessible to all users, irrespective of their statistical knowledge, would be valuable if it tested several simple and complex regression models and validation schemes, produced unified reports, and offered the option to be integrated into more extensive studies. Additionally, such a methodology should be implemented as a free programming package, in order to be continuously adapted and redistributed by others. We propose an integrated framework for creating multiple regression models, called RRegrs. The tool offers the option of ten simple and complex regression methods combined with repeated 10-fold and leave-one-out cross-validation. Methods include Multiple Linear regression, Generalized Linear Model with Stepwise Feature Selection, Partial Least Squares regression, Lasso regression, and Support Vector Machines Recursive Feature Elimination. The new framework is an automated, fully validated procedure which produces standardized reports to quickly oversee the impact of choices in modelling algorithms and assess the model and cross-validation results. The methodology was implemented as an open source R package, available at https://www.github.com/enanomapper/RRegrs, by reusing and extending the caret package. The universality of the new methodology is demonstrated using five standard data sets from different scientific fields. Its efficiency in cheminformatics and QSAR modelling is shown with three use cases: proteomics data for surface-modified gold nanoparticles, nano-metal oxides descriptor data, and molecular descriptors for acute aquatic toxicity data. The results show that for all data sets RRegrs reports models with equal or better performance for both training and test sets than those reported in the original publications. Its good performance, as well as its adaptability in terms of parameter optimization, could make RRegrs a popular framework to assist the initial exploration of predictive models and, with that, the design of more comprehensive in silico screening applications. Graphical abstract: RRegrs is a computer-aided model selection framework for R multiple regression models; it is a fully validated procedure with application to QSAR modelling.
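
    For illustration, a minimal Python analogue of this workflow is sketched below. RRegrs itself is an R package built on caret, so the scikit-learn names here are stand-ins for the package's API, and the bundled dataset and model list are illustrative assumptions only.

      # Minimal Python analogue of an RRegrs-style comparison: several
      # regression methods scored with repeated 10-fold cross-validation.
      from sklearn.cross_decomposition import PLSRegression
      from sklearn.datasets import load_diabetes
      from sklearn.linear_model import Lasso, LinearRegression
      from sklearn.model_selection import RepeatedKFold, cross_val_score
      from sklearn.svm import SVR

      X, y = load_diabetes(return_X_y=True)  # stand-in data set
      models = {
          "OLS": LinearRegression(),
          "Lasso": Lasso(alpha=0.1),
          "PLS": PLSRegression(n_components=5),
          "SVR": SVR(kernel="rbf"),
      }
      cv = RepeatedKFold(n_splits=10, n_repeats=5, random_state=0)
      for name, model in models.items():
          scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
          print(f"{name}: R2 = {scores.mean():.3f} +/- {scores.std():.3f}")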

  4. Computer simulation of Cerebral Arteriovenous Malformation-validation analysis of hemodynamics parameters.

    PubMed

    Kumar, Y Kiran; Mehta, Shashi Bhushan; Ramachandra, Manjunath

    2017-01-01

    The purpose of this work is to provide some validation methods for evaluating the hemodynamic assessment of Cerebral Arteriovenous Malformation (CAVM). This article emphasizes the importance of validating noninvasive measurements for CAVM patients, which are designed using lumped models for complex vessel structure. The validation of the hemodynamics assessment is based on invasive clinical measurements and cross-validation techniques with Philips' proprietary validated software, Qflow and 2D Perfusion. The modeling results are validated for 30 CAVM patients across 150 vessel locations. Mean flow, diameter, and pressure were compared between modeling results and clinical/cross-validation measurements, using an independent two-tailed Student t test. Exponential regression analysis was used to assess the relationships among blood flow, vessel diameter, and pressure. Univariate analyses of the relationships among vessel diameter, vessel cross-sectional area, AVM volume, AVM pressure, and AVM flow were performed with linear or exponential regression. Modeling results were compared with clinical measurements from vessel locations of cerebral regions, and the model was cross-validated with Philips' proprietary validated software, Qflow and 2D Perfusion. Our results show that the modeling and clinical results match closely, with only small deviations. In this article, we have validated our modeling results with clinical measurements, and a new approach for cross-validation is proposed by demonstrating the accuracy of our results against a validated product in a clinical environment.

  5. Rapid, Time-Division Multiplexed, Direct Absorption- and Wavelength Modulation-Spectroscopy

    PubMed Central

    Klein, Alexander; Witzel, Oliver; Ebert, Volker

    2014-01-01

    We present a tunable diode laser spectrometer with a novel, rapid time multiplexed direct absorption- and wavelength modulation-spectroscopy operation mode. The new technique allows enhancing the precision and dynamic range of a tunable diode laser absorption spectrometer without sacrificing accuracy. The spectroscopic technique combines the benefits of absolute concentration measurements using calibration-free direct tunable diode laser absorption spectroscopy (dTDLAS) with the enhanced noise rejection of wavelength modulation spectroscopy (WMS). In this work we demonstrate for the first time a 125 Hz time division multiplexed (TDM-dTDLAS-WMS) spectroscopic scheme by alternating the modulation of a DFB-laser between a triangle-ramp (dTDLAS) and an additional 20 kHz sinusoidal modulation (WMS). The absolute concentration measurement via the dTDLAS-technique allows one to simultaneously calibrate the normalized 2f/1f-signal of the WMS-technique. A dTDLAS/WMS-spectrometer at 1.37 μm for H2O detection was built for experimental validation of the multiplexing scheme over a concentration range from 50 to 3000 ppmV (0.1 MPa, 293 K). A precision of 190 ppbV was achieved with an absorption length of 12.7 cm and an averaging time of two seconds. Our results show a five-fold improvement in precision over the entire concentration range and a significantly decreased averaging time of the spectrometer. PMID:25405508

  6. Solving protein structures using short-distance cross-linking constraints as a guide for discrete molecular dynamics simulations

    PubMed Central

    Brodie, Nicholas I.; Popov, Konstantin I.; Petrotchenko, Evgeniy V.; Dokholyan, Nikolay V.; Borchers, Christoph H.

    2017-01-01

    We present an integrated experimental and computational approach for de novo protein structure determination in which short-distance cross-linking data are incorporated into rapid discrete molecular dynamics (DMD) simulations as constraints, reducing the conformational space and achieving the correct protein folding on practical time scales. We tested our approach on myoglobin and FK506 binding protein—models for α helix–rich and β sheet–rich proteins, respectively—and found that the lowest-energy structures obtained were in agreement with the crystal structure, hydrogen-deuterium exchange, surface modification, and long-distance cross-linking validation data. Our approach is readily applicable to other proteins with unknown structures. PMID:28695211

  7. Solving protein structures using short-distance cross-linking constraints as a guide for discrete molecular dynamics simulations.

    PubMed

    Brodie, Nicholas I; Popov, Konstantin I; Petrotchenko, Evgeniy V; Dokholyan, Nikolay V; Borchers, Christoph H

    2017-07-01

    We present an integrated experimental and computational approach for de novo protein structure determination in which short-distance cross-linking data are incorporated into rapid discrete molecular dynamics (DMD) simulations as constraints, reducing the conformational space and achieving the correct protein folding on practical time scales. We tested our approach on myoglobin and FK506 binding protein-models for α helix-rich and β sheet-rich proteins, respectively-and found that the lowest-energy structures obtained were in agreement with the crystal structure, hydrogen-deuterium exchange, surface modification, and long-distance cross-linking validation data. Our approach is readily applicable to other proteins with unknown structures.

  8. Aromatic Cluster Sensor of Protein Folding: Near-UV Electronic Circular Dichroism Bands Assigned to Fold Compactness.

    PubMed

    Farkas, Viktor; Jákli, Imre; Tóth, Gábor K; Perczel, András

    2016-09-19

    Both far- and near-UV electronic circular dichroism (ECD) spectra have bands sensitive to the thermal unfolding of proteins containing Trp and Tyr residues. Besides spectral changes at 222 nm reporting secondary structural variations (far-UV range), Lb bands (near-UV range) are applicable as 3D-fold sensors of a protein's core structure. In this study we show that both Lb(Tyr) and Lb(Trp) ECD bands can be used as sensors of fold compactness. ECD is a relative method and thus requires NMR referencing and cross-validation, also provided here. The ensemble of 204 ECD spectra of Trp-cage miniproteins is analysed as a training set for "calibrating" Trp↔Tyr folded systems of known NMR structure. While changes in the far-UV ECD spectra are linear as a function of temperature, near-UV ECD data indicate a non-linear and thus cooperative unfolding mechanism for these proteins. Deconvolution of the ensemble of ECD spectra gives both conformational weights and insight into a protein folding↔unfolding mechanism. We found that the Lb band at 293 nm reports on the compactness of the 3D structure. In addition, the pure near-UV ECD spectrum of the unfolded state is described here for the first time. Thus, ECD folding information, now validated, can be applied with confidence over a large thermal window (5≤T≤85 °C) compared to NMR for studying the unfolding of Trp↔Tyr residue pairs. In conclusion, the folding propensities of important proteins (RNA polymerase II, ubiquitin protein ligase, tryptase-inhibitor, etc.) can now be analysed with higher confidence. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Genomic Prediction Accounting for Residual Heteroskedasticity

    PubMed Central

    Ou, Zhining; Tempelman, Robert J.; Steibel, Juan P.; Ernst, Catherine W.; Bates, Ronald O.; Bello, Nora M.

    2015-01-01

    Whole-genome prediction (WGP) models that use single-nucleotide polymorphism marker information to predict genetic merit of animals and plants typically assume homogeneous residual variance. However, variability is often heterogeneous across agricultural production systems and may subsequently bias WGP-based inferences. This study extends classical WGP models based on normality, heavy-tailed specifications and variable selection to explicitly account for environmentally driven residual heteroskedasticity under a hierarchical Bayesian mixed-models framework. WGP models assuming homogeneous or heterogeneous residual variances were fitted to training data generated under simulation scenarios reflecting a gradient of increasing heteroskedasticity. Model fit was based on pseudo-Bayes factors and also on prediction accuracy of genomic breeding values computed on a validation data subset one generation removed from the simulated training dataset. Homogeneous vs. heterogeneous residual variance WGP models were also fitted to two quantitative traits, namely 45-min postmortem carcass temperature and loin muscle pH, recorded in a swine resource population dataset prescreened for high and mild residual heteroskedasticity, respectively. Fit of competing WGP models was compared using pseudo-Bayes factors. Predictive ability, defined as the correlation between predicted and observed phenotypes in validation sets of a five-fold cross-validation, was also computed. Heteroskedastic error WGP models showed improved model fit and enhanced prediction accuracy compared to homoskedastic error WGP models, although the magnitude of the improvement was small (less than two percentage points net gain in prediction accuracy). Nevertheless, accounting for residual heteroskedasticity did improve accuracy of selection, especially on individuals of extreme genetic merit. PMID:26564950
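
    For illustration, the "predictive ability" metric described above (the correlation between cross-validated predictions and observed phenotypes under five-fold cross-validation) can be sketched in a few lines. Ridge regression and the simulated marker matrix below are stand-ins for the paper's Bayesian WGP models, not the authors' code.

      import numpy as np
      from sklearn.linear_model import Ridge
      from sklearn.model_selection import KFold, cross_val_predict

      rng = np.random.default_rng(0)
      X = rng.integers(0, 3, size=(500, 1000)).astype(float)  # SNP genotypes coded 0/1/2
      beta = rng.normal(0.0, 0.05, size=1000)                 # simulated marker effects
      y = X @ beta + rng.normal(0.0, 1.0, size=500)           # phenotype = signal + noise

      # Cross-validated predictions; predictive ability = cor(predicted, observed)
      preds = cross_val_predict(Ridge(alpha=100.0), X, y,
                                cv=KFold(n_splits=5, shuffle=True, random_state=0))
      print("predictive ability r =", np.corrcoef(preds, y)[0, 1])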

  10. Classification of burn wounds using support vector machines

    NASA Astrophysics Data System (ADS)

    Acha, Begona; Serrano, Carmen; Palencia, Sergio; Murillo, Juan Jose

    2004-05-01

    The purpose of this work is to improve a previous method developed by the authors for the classification of burn wounds according to their depths. The inputs of the system are color and texture information, as these are the characteristics observed by physicians in order to give a diagnosis. Our previous work consisted of segmenting the burn wound from the rest of the image and classifying the burn into its depth. In this paper we focus on the classification problem only. We previously proposed to use a Fuzzy-ARTMAP neural network (NN); however, we may take advantage of new powerful classification tools such as Support Vector Machines (SVM). We apply a five-fold cross-validation scheme to divide the database into training and validating sets. Then, we apply a feature selection method for each classifier, which gives us the set of features that yields the smallest classification error for that classifier. Features used to classify are first-order statistical parameters extracted from the L*, u* and v* color components of the image. The feature selection algorithms used are the Sequential Forward Selection (SFS) and Sequential Backward Selection (SBS) methods. As the data in this problem are not linearly separable, the SVM was trained using several different kernels. The validating process shows that the SVM method, when using a Gaussian kernel of variance 1, outperforms the classification results obtained with the rest of the classifiers, yielding an error classification rate of 0.7%, whereas the Fuzzy-ARTMAP NN attained 1.6%.
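
    A hedged sketch of this kind of pipeline, sequential forward selection wrapped around an RBF ("Gaussian") SVM and scored with five-fold cross-validation, is given below. The synthetic feature matrix stands in for the L*, u*, v* color statistics, and the gamma value is only a rough proxy for a variance-1 Gaussian kernel.

      from sklearn.datasets import make_classification
      from sklearn.feature_selection import SequentialFeatureSelector
      from sklearn.model_selection import cross_val_score
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.svm import SVC

      X, y = make_classification(n_samples=300, n_features=15, random_state=0)  # stand-in features
      svm = SVC(kernel="rbf", gamma=0.5)  # gamma = 1/(2*sigma^2) with sigma^2 = 1
      sfs = SequentialFeatureSelector(svm, n_features_to_select=5,
                                      direction="forward", cv=5)  # forward selection (SFS)
      model = make_pipeline(StandardScaler(), sfs, svm)
      print("5-fold accuracy:", cross_val_score(model, X, y, cv=5).mean())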

  11. The structure of post-traumatic stress symptoms in survivors of war: confirmatory factor analyses of the Impact of Event Scale--revised.

    PubMed

    Morina, Nexhmedin; Böhme, Hendryk F; Ajdukovic, Dean; Bogic, Marija; Franciskovic, Tanja; Galeazzi, Gian M; Kucukalic, Abdulah; Lecic-Tosevski, Dusica; Popovski, Mihajlo; Schützwohl, Matthias; Stangier, Ulrich; Priebe, Stefan

    2010-08-01

    The study aimed at establishing the factor structure of the Impact of Event Scale-Revised (IES-R) in survivors of war. A total sample of 4167 participants with potentially traumatic experiences during the war in the former Yugoslavia was split into three samples: two independent samples of people who stayed in the area of conflict and one sample of refugees to Western European countries. Alternative models with three, four, and five factors of post-traumatic symptoms were tested in one sample; the other samples were used for cross-validation. Results indicated that the model of best fit had five factors: intrusion, avoidance, hyperarousal, numbing, and sleep disturbance. Model superiority was cross-validated in the two other samples. These findings suggest a five-factor model of post-traumatic stress symptoms in war survivors, with numbing and sleep disturbance as separate factors in addition to intrusion, avoidance and hyperarousal. (c) 2010 Elsevier Ltd. All rights reserved.

  12. Polyhedron Models for the Classroom. Second Edition.

    ERIC Educational Resources Information Center

    Wenninger, Magnus J.

    This second edition explains the historical background and techniques for constructing various types of polyhedra. Seven center-fold sheets are included, containing full-scale drawings from which nets or templates may be made to construct the models shown and described in the text. Details are provided for construction of the five Platonic solids,…

  13. Benchmarking protein classification algorithms via supervised cross-validation.

    PubMed

    Kertész-Farkas, Attila; Dhir, Somdutta; Sonego, Paolo; Pacurar, Mircea; Netoteia, Sergiu; Nijveen, Harm; Kuzniar, Arnold; Leunissen, Jack A M; Kocsor, András; Pongor, Sándor

    2008-04-24

    Development and testing of protein classification algorithms are hampered by the fact that the protein universe is characterized by groups vastly different in the number of members, in average protein size, similarity within group, etc. Datasets based on traditional cross-validation (k-fold, leave-one-out, etc.) may not give reliable estimates on how an algorithm will generalize to novel, distantly related subtypes of the known protein classes. Supervised cross-validation, i.e., selection of test and train sets according to the known subtypes within a database, has been successfully used earlier in conjunction with the SCOP database. Our goal was to extend this principle to other databases and to design standardized benchmark datasets for protein classification. Hierarchical classification trees of protein categories provide a simple and general framework for designing supervised cross-validation strategies for protein classification. Benchmark datasets can be designed at various levels of the concept hierarchy using a simple graph-theoretic distance. A combination of supervised and random sampling was selected to construct reduced-size model datasets, suitable for algorithm comparison. Over 3000 new classification tasks were added to our recently established protein classification benchmark collection that currently includes protein sequence (including protein domains and entire proteins), protein structure and reading frame DNA sequence data. We carried out an extensive evaluation based on various machine-learning algorithms such as nearest neighbor, support vector machines, artificial neural networks, random forests and logistic regression, used in conjunction with comparison algorithms, BLAST, Smith-Waterman, Needleman-Wunsch, as well as the 3D comparison methods DALI and PRIDE. The resulting datasets provide lower, and in our opinion more realistic, estimates of classifier performance than do random cross-validation schemes.
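
    The idea of supervised cross-validation, holding out whole known subtypes so that the test set always contains novel, distantly related members, can be approximated with a group-aware splitter. The random data and GroupKFold below are a simplified stand-in for the paper's tree-based set selection, not its actual protocol.

      import numpy as np
      from sklearn.model_selection import GroupKFold, cross_val_score
      from sklearn.neighbors import KNeighborsClassifier

      rng = np.random.default_rng(1)
      X = rng.normal(size=(400, 20))            # e.g., sequence-similarity features
      y = rng.integers(0, 2, size=400)          # class label (protein family)
      groups = rng.integers(0, 10, size=400)    # known subtype of each protein

      # Each fold holds out entire subtypes, so test proteins are never from
      # a subtype seen in training -- the essence of supervised CV.
      scores = cross_val_score(KNeighborsClassifier(), X, y,
                               cv=GroupKFold(n_splits=5), groups=groups)
      print("supervised-CV accuracy:", scores.mean())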

  14. Multifactor dimensionality reduction reveals a three-locus epistatic interaction associated with susceptibility to pulmonary tuberculosis.

    PubMed

    Collins, Ryan L; Hu, Ting; Wejse, Christian; Sirugo, Giorgio; Williams, Scott M; Moore, Jason H

    2013-02-18

    Identifying high-order genetic associations with non-additive (i.e., epistatic) effects in population-based studies of common human diseases is a computational challenge. Multifactor dimensionality reduction (MDR) is a machine learning method that was designed specifically for this problem. The goal of the present study was to apply MDR to mining high-order epistatic interactions in a population-based genetic study of tuberculosis (TB). The study used a previously published data set consisting of 19 candidate single-nucleotide polymorphisms (SNPs) in 321 pulmonary TB cases and 347 healthy controls from Guinea-Bissau in Africa. The ReliefF algorithm was applied first to generate a smaller set of the five most informative SNPs. MDR with 10-fold cross-validation was then applied to look at all possible combinations of two, three, four and five SNPs. The MDR model with the best testing accuracy (TA) consisted of SNPs rs2305619, rs187084, and rs11465421 (TA = 0.588) in PTX3, TLR9 and DC-SIGN, respectively. A general 1000-fold permutation test of the null hypothesis of no association confirmed the statistical significance of the model (p = 0.008). An additional 1000-fold permutation test, designed specifically to test the linear null hypothesis that the association effects are only additive, confirmed the presence of non-additive (i.e., nonlinear) or epistatic effects (p = 0.013). An independent information-gain measure corroborated these results with a third-order epistatic interaction that was stronger than any lower-order associations. We have identified statistically significant evidence for a three-way epistatic interaction that is associated with susceptibility to TB. This interaction is stronger than any previously described one-way or two-way associations. This study highlights the importance of using machine learning methods that are designed to embrace, rather than ignore, the complexity of common diseases such as TB. We recommend that future studies of the genetics of TB take into account the possibility that high-order epistatic interactions might play an important role in disease susceptibility.
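
    The label-permutation test used here generalizes readily. The sketch below builds a null distribution of cross-validated testing accuracy by shuffling case/control labels; a shallow decision tree stands in for MDR (which has its own dedicated implementations), and 1000 permutations mirror the paper at some computational cost.

      import numpy as np
      from sklearn.model_selection import cross_val_score
      from sklearn.tree import DecisionTreeClassifier

      def permutation_p(X, y, n_perm=1000, seed=0):
          """p-value for H0 'no association', via label permutation."""
          rng = np.random.default_rng(seed)
          clf = DecisionTreeClassifier(max_depth=3)  # stand-in for the MDR model
          observed = cross_val_score(clf, X, y, cv=10).mean()
          null = np.array([cross_val_score(clf, X, rng.permutation(y), cv=10).mean()
                           for _ in range(n_perm)])
          # Add-one correction keeps the p-value strictly positive
          return (1 + (null >= observed).sum()) / (n_perm + 1)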

  15. Enhancement of ZnO-based flexible nano generators via a sol-gel technique for sensing and energy harvesting applications

    NASA Astrophysics Data System (ADS)

    Rajagopalan, P.; Singh, Vipul; Palani, I. A.

    2018-03-01

    Zinc oxide (ZnO) is a remarkable inorganic semiconductor with exceptional piezoelectric properties compared to other semiconductors. However, in comparison to lead-based hazardous piezoelectric materials, its properties have undesired limitations. Here we report a 5-6 fold enhancement in piezoelectric performance via chemical doping with copper, compared to intrinsic ZnO. A flexible piezoelectric nanogenerator (F-PENG) was fabricated using a simple solution process of spin coating, with further advantages such as robustness, low weight, improved adhesion, and low cost. The device was used to demonstrate energy harvesting from a standard weight as low as 4 g and can work as a self-powered mass sensor over a broad range of 4 to 100 g. Owing to its inherent flexibility, the device also exhibited a novel technique for harvesting energy from a wind source. At three different velocities (10-30 m s⁻¹) and five different angles of attack (0-180 degrees), the device demonstrated the ability to discern different velocities and directions of flow, making it useful for mapping air flow in addition to harvesting energy. Simulations were performed to verify the underlying aerodynamic mechanism.

  16. Enhancement of ZnO-based flexible nano generators via a sol-gel technique for sensing and energy harvesting applications.

    PubMed

    Rajagopalan, P; Singh, Vipul; Palani, I A

    2018-02-01

    Zinc oxide (ZnO) is a remarkable inorganic semiconductor with exceptional piezoelectric properties compared to other semiconductors. However, in comparison to lead-based hazardous piezoelectric materials, its properties have undesired limitations. Here we report a 5-6 fold enhancement in piezoelectric performance via chemical doping with copper, compared to intrinsic ZnO. A flexible piezoelectric nanogenerator (F-PENG) was fabricated using a simple solution process of spin coating, with further advantages such as robustness, low weight, improved adhesion, and low cost. The device was used to demonstrate energy harvesting from a standard weight as low as 4 g and can work as a self-powered mass sensor over a broad range of 4 to 100 g. Owing to its inherent flexibility, the device also exhibited a novel technique for harvesting energy from a wind source. At three different velocities (10-30 m s⁻¹) and five different angles of attack (0-180 degrees), the device demonstrated the ability to discern different velocities and directions of flow, making it useful for mapping air flow in addition to harvesting energy. Simulations were performed to verify the underlying aerodynamic mechanism.

  17. Beyond where to how: a machine learning approach for sensing mobility contexts using smartphone sensors.

    PubMed

    Guinness, Robert E

    2015-04-28

    This paper presents the results of research on the use of smartphone sensors (namely, GPS and accelerometers), geospatial information (points of interest, such as bus stops and train stations) and machine learning (ML) to sense mobility contexts. Our goal is to develop techniques to continuously and automatically detect a smartphone user's mobility activities, including walking, running, driving and using a bus or train, in real-time or near-real-time (<5 s). We investigated a wide range of supervised learning techniques for classification, including decision trees (DT), support vector machines (SVM), naive Bayes classifiers (NB), Bayesian networks (BN), logistic regression (LR), artificial neural networks (ANN) and several instance-based classifiers (KStar, LWL and IBk). Applying ten-fold cross-validation, the best performers in terms of correct classification rate (i.e., recall) were DT (96.5%), BN (90.9%), LWL (95.5%) and KStar (95.6%). In particular, the DT-algorithm RandomForest exhibited the best overall performance. After a feature selection process for a subset of algorithms, the performance was improved slightly. Furthermore, after tuning the parameters of RandomForest, performance improved to above 97.5%. Lastly, we measured the computational complexity of the classifiers, in terms of central processing unit (CPU) time needed for classification, to provide a rough comparison between the algorithms in terms of battery usage requirements. As a result, the classifiers can be ranked from lowest to highest complexity (i.e., computational cost) as follows: SVM, ANN, LR, BN, DT, NB, IBk, LWL and KStar. The instance-based classifiers take considerably more computational time than the non-instance-based classifiers, whereas the slowest non-instance-based classifier (NB) required about five times the amount of CPU time as the fastest classifier (SVM). The above results suggest that DT algorithms are excellent candidates for detecting mobility contexts in smartphones, both in terms of performance and computational complexity.
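
    A condensed version of this comparison, ten-fold cross-validated accuracy plus a crude prediction-time measurement as a proxy for battery cost, might look as follows. Synthetic data stand in for the GPS/accelerometer features, and only a few of the paper's classifiers are shown.

      import time
      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import cross_val_score
      from sklearn.naive_bayes import GaussianNB
      from sklearn.neighbors import KNeighborsClassifier
      from sklearn.svm import SVC

      X, y = make_classification(n_samples=2000, n_features=20, n_classes=4,
                                 n_informative=10, random_state=0)  # stand-in features
      for clf in (RandomForestClassifier(), GaussianNB(),
                  KNeighborsClassifier(), SVC()):
          acc = cross_val_score(clf, X, y, cv=10).mean()  # ten-fold CV, as in the paper
          clf.fit(X, y)
          t0 = time.perf_counter()
          clf.predict(X)                                  # rough classification-cost proxy
          dt = time.perf_counter() - t0
          print(f"{type(clf).__name__}: acc={acc:.3f}, predict time={dt*1000:.1f} ms")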

  18. Beyond Where to How: A Machine Learning Approach for Sensing Mobility Contexts Using Smartphone Sensors †

    PubMed Central

    Guinness, Robert E.

    2015-01-01

    This paper presents the results of research on the use of smartphone sensors (namely, GPS and accelerometers), geospatial information (points of interest, such as bus stops and train stations) and machine learning (ML) to sense mobility contexts. Our goal is to develop techniques to continuously and automatically detect a smartphone user's mobility activities, including walking, running, driving and using a bus or train, in real-time or near-real-time (<5 s). We investigated a wide range of supervised learning techniques for classification, including decision trees (DT), support vector machines (SVM), naive Bayes classifiers (NB), Bayesian networks (BN), logistic regression (LR), artificial neural networks (ANN) and several instance-based classifiers (KStar, LWL and IBk). Applying ten-fold cross-validation, the best performers in terms of correct classification rate (i.e., recall) were DT (96.5%), BN (90.9%), LWL (95.5%) and KStar (95.6%). In particular, the DT-algorithm RandomForest exhibited the best overall performance. After a feature selection process for a subset of algorithms, the performance was improved slightly. Furthermore, after tuning the parameters of RandomForest, performance improved to above 97.5%. Lastly, we measured the computational complexity of the classifiers, in terms of central processing unit (CPU) time needed for classification, to provide a rough comparison between the algorithms in terms of battery usage requirements. As a result, the classifiers can be ranked from lowest to highest complexity (i.e., computational cost) as follows: SVM, ANN, LR, BN, DT, NB, IBk, LWL and KStar. The instance-based classifiers take considerably more computational time than the non-instance-based classifiers, whereas the slowest non-instance-based classifier (NB) required about five times the amount of CPU time as the fastest classifier (SVM). The above results suggest that DT algorithms are excellent candidates for detecting mobility contexts in smartphones, both in terms of performance and computational complexity. PMID:25928060

  19. Mental State Assessment and Validation Using Personalized Physiological Biometrics

    PubMed Central

    Patel, Aashish N.; Howard, Michael D.; Roach, Shane M.; Jones, Aaron P.; Bryant, Natalie B.; Robinson, Charles S. H.; Clark, Vincent P.; Pilly, Praveen K.

    2018-01-01

    Mental state monitoring is a critical component of current and future human-machine interfaces, including semi-autonomous driving and flying, air traffic control, decision aids, training systems, and will soon be integrated into ubiquitous products like cell phones and laptops. Current mental state assessment approaches supply quantitative measures, but their only frame of reference is generic population-level ranges. What is needed are physiological biometrics that are validated in the context of task performance of individuals. Using curated intake experiments, we are able to generate personalized models of three key biometrics as useful indicators of mental state; namely, mental fatigue, stress, and attention. We demonstrate improvements to existing approaches through the introduction of new features. Furthermore, addressing the current limitations in assessing the efficacy of biometrics for individual subjects, we propose and employ a multi-level validation scheme for the biometric models by means of k-fold cross-validation for discrete classification and regression testing for continuous prediction. The paper not only provides a unified pipeline for extracting a comprehensive mental state evaluation from a parsimonious set of sensors (only EEG and ECG), but also demonstrates the use of validation techniques in the absence of empirical data. Furthermore, as an example of the application of these models to novel situations, we evaluate the significance of correlations of personalized biometrics to the dynamic fluctuations of accuracy and reaction time on an unrelated threat detection task using a permutation test. Our results provide a path toward integrating biometrics into augmented human-machine interfaces in a judicious way that can help to maximize task performance.

  20. Mental State Assessment and Validation Using Personalized Physiological Biometrics.

    PubMed

    Patel, Aashish N; Howard, Michael D; Roach, Shane M; Jones, Aaron P; Bryant, Natalie B; Robinson, Charles S H; Clark, Vincent P; Pilly, Praveen K

    2018-01-01

    Mental state monitoring is a critical component of current and future human-machine interfaces, including semi-autonomous driving and flying, air traffic control, decision aids, training systems, and will soon be integrated into ubiquitous products like cell phones and laptops. Current mental state assessment approaches supply quantitative measures, but their only frame of reference is generic population-level ranges. What is needed are physiological biometrics that are validated in the context of task performance of individuals. Using curated intake experiments, we are able to generate personalized models of three key biometrics as useful indicators of mental state; namely, mental fatigue, stress, and attention. We demonstrate improvements to existing approaches through the introduction of new features. Furthermore, addressing the current limitations in assessing the efficacy of biometrics for individual subjects, we propose and employ a multi-level validation scheme for the biometric models by means of k-fold cross-validation for discrete classification and regression testing for continuous prediction. The paper not only provides a unified pipeline for extracting a comprehensive mental state evaluation from a parsimonious set of sensors (only EEG and ECG), but also demonstrates the use of validation techniques in the absence of empirical data. Furthermore, as an example of the application of these models to novel situations, we evaluate the significance of correlations of personalized biometrics to the dynamic fluctuations of accuracy and reaction time on an unrelated threat detection task using a permutation test. Our results provide a path toward integrating biometrics into augmented human-machine interfaces in a judicious way that can help to maximize task performance.

  1. Detection of resistance, cross-resistance, and stability of resistance to new chemistry insecticides in Bemisia tabaci (Homoptera: Aleyrodidae).

    PubMed

    Basit, Muhammad; Saeed, Shafqat; Saleem, Mushtaq Ahmad; Denholm, Ian; Shah, Maqbool

    2013-06-01

    Resistance levels in whitefly, Bemisia tabaci (Gennadius), collections from cotton and sunflower (up to four districts) for five neonicotinoids and two insect growth regulators (IGRs) were investigated for two consecutive years. Based on the LC50s, all collections showed slight to moderate levels of resistance to the tested insecticides compared with the laboratory susceptible population. The data also indicated that cotton and sunflower collections had similar resistance levels. In comparison (four collections), Vehari collections showed higher resistance to acetamiprid, thiacloprid, and nitenpyram than the others. Average resistance ratios for acetamiprid, thiacloprid, and nitenpyram ranged from 5- to 13-, 4- to 8-, and 9- to 13-fold, respectively. Multan and Vehari collections also exhibited moderate levels (9- to 16-fold) of resistance to buprofezin. Furthermore, the toxicity of neonicotinoids against immature stages was equal to that of the insect growth regulators. The data also suggested that resistance in the field populations was stable. After selection with bifenthrin for four generations (G1 to G4), resistance to bifenthrin increased to 14-fold compared with the laboratory susceptible population. Selection also increased resistance to fenpropathrin, lambda-cyhalothrin, imidacloprid, acetamiprid, and diafenthiuron. Cross-resistance and the stability of resistance in the field populations are of some concern. Rotating insecticides that show no cross-resistance and targeting control at immature stages may control resistant insects while reducing the selection pressure imposed.

  2. Expert system verification and validation study

    NASA Technical Reports Server (NTRS)

    French, Scott W.; Hamilton, David

    1992-01-01

    Five workshops on verification and validation (V&V) of expert systems (ES) were taught during this recent period of performance. Two key activities, previously performed under this contract, supported these recent workshops: (1) a survey of the state of the practice in V&V of ES, and (2) development of workshop material and the first class. The first activity involved performing an extensive survey of ES developers in order to answer several questions regarding the state of the practice in V&V of ES. These questions related to the amount and type of V&V done and the successfulness of this V&V. The next key activity involved developing an intensive hands-on workshop in V&V of ES. This activity involved surveying a large number of V&V techniques, conventional as well as ES-specific ones. In addition to explaining the techniques, we showed how each technique could be applied to a sample problem. References were included in the workshop material and cross-referenced to techniques, so that students would know where to find additional information about each technique. In addition to teaching specific techniques, we included an extensive amount of material on V&V concepts and how to develop a V&V plan for an ES project. We felt this material was necessary so that developers would be prepared to take an orderly and structured approach to V&V; that is, they would have a process that supported the use of the specific techniques. Finally, to provide hands-on experience, we developed a set of case study exercises. These exercises provided an opportunity for the students to apply all of the material (concepts, techniques, and planning material) to a realistic problem.

  3. Lead in a Baltimore shipyard.

    PubMed

    Hall, Francis X

    2006-12-01

    The goal was to monitor the effectiveness of the Coast Guard Yard's lead program by comparing a shipyard period in 1991 to one in 2002-2003. Comparisons of airborne lead levels by paint removal technique, airborne lead levels by welding technique, and blood lead levels of workers were evaluated by chi-square analysis. Airborne lead levels decreased over time for all paint removal methods used, as they did for all welding methods used. Blood lead levels in the high-risk group showed a 2-fold reduction (prevalence rate ratio = 8.3; 95% confidence interval, 3.7-18.6) and in the low-risk group a 1.6-fold reduction (prevalence rate ratio = 6.2; 95% confidence interval, 0.86-44.7). The Coast Guard Yard runs an effective lead program that exceeds the national Healthy People 2010 goal for lead. The results validate the Coast Guard Yard's use of air-line respirators and lead-free paint on all vessels.

  4. Boundary integral solutions for faults in flowing rock

    NASA Astrophysics Data System (ADS)

    Wei, Wei

    We develop new boundary-integral solutions for faulting in viscous rock and implement the solutions numerically in a boundary-element computer program called Faux_Pas. In the solutions, large permanent rock deformations near faults are treated as velocity discontinuities within linear, incompressible, creeping, viscous flows. The faults may have zero strength or a finite strength that is constant or varies with deformation. Large deformations are achieved by integrating step by step with the fourth-order Runge-Kutta method; with this method, the boundaries and passive markers are updated dynamically. Faux_Pas has been applied to straight and curved elementary faults, and to compound faults composed of two or more elementary faults, such as listric and dish faults, all subjected to simple shear, shortening and lengthening. It reproduces the essential geometric elements seen in seismic profiles of fault-related folds associated with listric thrust faults in the Bighorn Basin of Wyoming, with dish faults in the Appalachians in Pennsylvania, the Parry Islands of Canada and the San Fernando Valley, California, and with listric normal faults in the Gulf of Mexico. Faux_Pas also predicts that some of these fault-related structures will include fascinating minor folds, especially in the footwall of the fault, that have been recognized earlier but have not been known to be related to the faulting. Some of these minor folds are potential structural traps. Faux_Pas is superior in several respects to current geometric techniques for balancing profiles, such as the "fault-bend fold" construction. With Faux_Pas, both the hanging wall and footwall are deformable, the faults are mechanical features, the cross sections are automatically balanced and, most important, the solutions are based on first principles of mechanics. With the geometric techniques, folds are drawn only in the hanging wall, the faults are simply lines, the cross sections are arbitrarily balanced and, most important, the drawings are based on unsubstantiated rules of thumb. Faux_Pas provides the first rational tool for the study of fault-related folds.
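
    The marker-updating scheme described above is the classical fourth-order Runge-Kutta step. A generic sketch is shown below, where velocity(x, t) stands for the boundary-element flow solution (Faux_Pas itself is not publicly specified here); the simple-shear demo field is an illustrative assumption.

      import numpy as np

      def rk4_step(velocity, x, t, dt):
          """Advance marker positions x by one time step dt (classical RK4)."""
          k1 = velocity(x, t)
          k2 = velocity(x + 0.5 * dt * k1, t + 0.5 * dt)
          k3 = velocity(x + 0.5 * dt * k2, t + 0.5 * dt)
          k4 = velocity(x + dt * k3, t + dt)
          return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

      # Demo: advect passive markers through a simple-shear field v = (y, 0)
      def shear(x, t):
          vx = x[..., 1]
          return np.stack([vx, np.zeros_like(vx)], axis=-1)

      markers = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
      for _ in range(100):
          markers = rk4_step(shear, markers, 0.0, 0.01)
      print(markers)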

  5. Baseline susceptibility of Planococcus ficus (Hemiptera: Pseudococcidae) from California to select insecticides.

    PubMed

    Prabhaker, Nilima; Gispert, Carmen; Castle, Steven J

    2012-08-01

    Between 2006 and 2008, 20 populations of Planococcus ficus (Signoret) from the Coachella and San Joaquin Valleys of California were measured in the laboratory for susceptibility to buprofezin, chlorpyrifos, dimethoate, methomyl, and imidacloprid. Toxicity was assessed using a petri dish bioassay technique for contact insecticides and a systemic uptake technique for imidacloprid. Mixed life stages were tested for susceptibility to all insecticides except buprofezin, which was measured against early and late instars (first, second, and third). Dose-response regression lines from the mortality data established LC50 and LC99 values for both techniques. Responses of populations from the two geographical locations to all five insecticides varied, in some cases significantly. Variation in susceptibility among sample sites was sevenfold for buprofezin, 11-fold for chlorpyrifos, ninefold for dimethoate, 24-fold for methomyl, and 8.5-fold for imidacloprid. In spite of susceptibility differences between populations, the baseline toxicity data revealed that all five insecticides were quite effective, based on low LC50s. Chlorpyrifos was the most toxic compound to Planococcus ficus populations, as shown by the lowest LC50s. Buprofezin was toxic to all immature stages but was most potent against first instars. The highest LC99 estimated by probit analysis of the bioassay data across all 20 populations for each compound was selected as a candidate discriminating dose for use in future resistance monitoring efforts. Establishment of baseline data and development of resistance monitoring tools such as bioassay methods and discriminating doses are essential elements of a sustainable management program for Planococcus ficus.

  6. Applicability of Monte Carlo cross validation technique for model development and validation using generalised least squares regression

    NASA Astrophysics Data System (ADS)

    Haddad, Khaled; Rahman, Ataur; A Zaman, Mohammad; Shrestha, Surendra

    2013-03-01

    In regional hydrologic regression analysis, model selection and validation are regarded as important steps. Here, model selection is usually based on some measure of goodness-of-fit between the model prediction and observed data. In Regional Flood Frequency Analysis (RFFA), leave-one-out (LOO) validation or a fixed-percentage leave-out validation (e.g., 10%) is commonly adopted to assess the predictive ability of regression-based prediction equations. This paper develops a Monte Carlo Cross Validation (MCCV) technique (which has been widely adopted in chemometrics and econometrics) for RFFA using Generalised Least Squares Regression (GLSR) and compares it with the most commonly adopted LOO validation approach. The study uses simulated and regional flood data from the state of New South Wales in Australia. It is found that when developing hydrologic regression models, application of the MCCV is likely to result in a more parsimonious model than the LOO. It has also been found that the MCCV can provide a more realistic estimate of a model's predictive ability when compared with the LOO.
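
    The contrast between LOO and MCCV is easy to reproduce with off-the-shelf tools. In the sketch below, ordinary least squares and a bundled dataset stand in for the paper's GLSR model and flood data, with MCCV implemented as repeated random train/test splits.

      from sklearn.datasets import load_diabetes
      from sklearn.linear_model import LinearRegression
      from sklearn.model_selection import LeaveOneOut, ShuffleSplit, cross_val_score

      X, y = load_diabetes(return_X_y=True)  # stand-in for regional flood data
      loo = cross_val_score(LinearRegression(), X, y, cv=LeaveOneOut(),
                            scoring="neg_mean_squared_error").mean()
      # MCCV: many random splits, each leaving out 10% of sites for testing
      mccv = cross_val_score(LinearRegression(), X, y,
                             cv=ShuffleSplit(n_splits=200, test_size=0.1,
                                             random_state=0),
                             scoring="neg_mean_squared_error").mean()
      print(f"LOO MSE: {-loo:.1f}, MCCV MSE: {-mccv:.1f}")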

  7. Multidomain proteins under force

    NASA Astrophysics Data System (ADS)

    Valle-Orero, Jessica; Andrés Rivas-Pardo, Jaime; Popa, Ionel

    2017-04-01

    Advancements in single-molecule force spectroscopy techniques such as atomic force microscopy and magnetic tweezers allow investigation of how domain folding under force can play a physiological role. Combining these techniques with protein engineering and HaloTag covalent attachment, we investigate similarities and differences between four model proteins: I10 and I91—two immunoglobulin-like domains from the muscle protein titin, and two α + β fold proteins—ubiquitin and protein L. These proteins show a different mechanical response and have unique extensions under force. Remarkably, when normalized to their contour length, the size of the unfolding and refolding steps as a function of force reduces to a single master curve. This curve can be described using standard models of polymer elasticity, explaining the entropic nature of the measured steps. We further validate our measurements with a simple energy landscape model, which combines protein folding with polymer physics and accounts for the complex nature of tandem domains under force. This model can become a useful tool to help in deciphering the complexity of multidomain proteins operating under force.

  8. Breast cancer detection via Hu moment invariant and feedforward neural network

    NASA Astrophysics Data System (ADS)

    Zhang, Xiaowei; Yang, Jiquan; Nguyen, Elijah

    2018-04-01

    One in eight women will develop breast cancer during her lifetime. This study used Hu moment invariants and a feedforward neural network to diagnose breast cancer. With the help of K-fold cross-validation, we can test the out-of-sample accuracy of our method. We found that our method can improve the accuracy of detecting breast cancer and reduce the difficulty of diagnosis.
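
    A sketch of this pipeline, seven Hu moment invariants per image feeding a small feedforward network under k-fold cross-validation, is given below. It assumes OpenCV is available, and the random arrays are placeholders for the study's images and labels.

      import cv2
      import numpy as np
      from sklearn.model_selection import cross_val_score
      from sklearn.neural_network import MLPClassifier

      def hu_features(img):
          """Seven Hu moment invariants, log-scaled for numerical stability."""
          hu = cv2.HuMoments(cv2.moments(img)).flatten()
          return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)

      # Synthetic stand-ins for image patches; real data would go here instead
      rng = np.random.default_rng(0)
      images = rng.random((60, 64, 64)).astype(np.float32)
      labels = rng.integers(0, 2, size=60)

      X = np.array([hu_features(img) for img in images])
      clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
      print(cross_val_score(clf, X, labels, cv=10).mean())  # 10-fold CV accuracy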

  9. Validation of the FFM PD count technique for screening personality pathology in later middle-aged and older adults.

    PubMed

    Van den Broeck, Joke; Rossi, Gina; De Clercq, Barbara; Dierckx, Eva; Bastiaansen, Leen

    2013-01-01

    Research on the applicability of the five-factor model (FFM) to capture personality pathology coincided with the development of an FFM personality disorder (PD) count technique, which has been validated in adolescent, young, and middle-aged samples. This study extends the literature by validating this technique in an older sample. Five alternative FFM PD counts based upon the Revised NEO Personality Inventory (NEO PI-R) are computed and evaluated in terms of both convergent and divergent validity with the Assessment of DSM-IV Personality Disorders Questionnaire (ADP-IV; DSM-IV: Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition). For the best-working count for each PD, normative data are presented, from which cut-off scores are derived. The validity of these cut-offs and their usefulness as a screening tool is tested against both a categorical (i.e., the DSM-IV Text Revision) and a dimensional (i.e., the Dimensional Assessment of Personality Pathology; DAPP) measure of personality pathology. All but the Antisocial and Obsessive-Compulsive counts exhibited adequate convergent and divergent validity, supporting the use of this method in older adults. Using the ADP-IV and the DAPP Short Form as validation criteria, the results corroborate the use of the FFM PD count technique to screen for PDs in older adults, in particular for the Paranoid, Borderline, Histrionic, Avoidant, and Dependent PDs. Given the age-neutrality of the NEO PI-R and the considerable lack of valid personality assessment tools, the current findings appear promising for the assessment of personality pathology in older adults.

  10. A Monte Carlo Evaluation of Estimated Parameters of Five Shrinkage Estimate Formuli.

    ERIC Educational Resources Information Center

    Newman, Isadore; And Others

    A Monte Carlo study was conducted to estimate the efficiency of and the relationship between five equations and the use of cross validation as methods for estimating shrinkage in multiple correlations. Two of the methods were intended to estimate shrinkage to population values and the other methods were intended to estimate shrinkage from sample…

  11. First record of plicidentine in Synapsida and patterns of tooth root shape change in Early Permian sphenacodontians.

    PubMed

    Brink, Kirstin S; LeBlanc, Aaron R H; Reisz, Robert R

    2014-11-01

    Recent histological studies have revealed a diversity of dental features in Permo-Carboniferous tetrapods. Here, we report on the occurrence of plicidentine (infolded dentine around the base of the tooth root) in Sphenacodontia, the first such documentation in Synapsida, the clade that includes mammals. Five taxa were examined histologically: Ianthodon schultzei, Sphenacodon ferocior, Dimetrodon limbatus, Dimetrodon grandis, and Secodontosaurus obtusidens. The tooth roots of Ianthodon possess multiple folds, which is generally viewed as the primitive condition for amniotes. Sphenacodon and D. limbatus have distinctive "four-leaf clover"-shaped roots in cross section, whereas Secodontosaurus has an elongate square shape with only subtle folding. The most derived and largest taxon examined in this study, D. grandis, has rounded roots in cross section and therefore no plicidentine. This pattern of loss of plicidentine in sphenacodontids supports previous functional hypotheses for plicidentine, whereby teeth with shallow roots require folds to increase the area of attachment to the tooth-bearing element, whereas teeth with long roots do not. This pattern may also reflect differences in diet between co-occurring sphenacodontids, as well as changes in feeding niche through time, specifically in the apex predator Dimetrodon.

  12. Support vector machine and principal component analysis for microarray data classification

    NASA Astrophysics Data System (ADS)

    Astuti, Widi; Adiwijaya

    2018-03-01

    Cancer is a leading cause of death worldwide, although a significant proportion of cases can be cured if detected early. In recent decades, a technology called the microarray has come to play an important role in the diagnosis of cancer. By using data mining techniques, microarray data classification can be performed to improve the accuracy of cancer diagnosis compared to traditional techniques. Microarray data are characterized by small sample sizes but very high dimensionality. The challenge for researchers is therefore to provide solutions for microarray data classification with high performance in both accuracy and running time. This research proposes the use of Principal Component Analysis (PCA) as a dimension reduction method along with a Support Vector Machine (SVM), optimized through its kernel functions, as a classifier for microarray data classification. The proposed scheme was applied to seven data sets using 5-fold cross-validation, and evaluation and analysis were conducted in terms of both accuracy and running time. The results showed that the scheme obtained 100% accuracy for the Ovarian and Lung Cancer data when the linear and cubic kernel functions were used. In terms of running time, PCA greatly reduced the running time for every data set.
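
    A minimal version of this scheme, PCA feeding an SVM evaluated with 5-fold cross-validation across kernels, is sketched below. The synthetic "wide" matrix (few samples, thousands of features) stands in for the microarray data, and the component count is an illustrative choice.

      from sklearn.datasets import make_classification
      from sklearn.decomposition import PCA
      from sklearn.model_selection import cross_val_score
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.svm import SVC

      # Stand-in for microarray data: 100 samples, 5000 "genes"
      X, y = make_classification(n_samples=100, n_features=5000,
                                 n_informative=50, random_state=0)
      for kernel in ("linear", "poly", "rbf"):  # "poly" with degree 3 = cubic kernel
          clf = make_pipeline(StandardScaler(), PCA(n_components=20),
                              SVC(kernel=kernel, degree=3))
          print(kernel, cross_val_score(clf, X, y, cv=5).mean())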

  13. Detection of chewing from piezoelectric film sensor signals using ensemble classifiers.

    PubMed

    Farooq, Muhammad; Sazonov, Edward

    2016-08-01

    Selection and use of pattern recognition algorithms is application dependent. In this work, we explored the use of several ensembles of weak classifiers to classify signals captured from a wearable sensor system to detect food intake based on chewing. Three sensor signals (piezoelectric sensor, accelerometer, and hand-to-mouth gesture) were collected from 12 subjects in free-living conditions for 24 hrs. Sensor signals were divided into 10-second epochs, and for each epoch a combination of time- and frequency-domain features was computed. In this work, we present a comparison of three different ensemble techniques: boosting (AdaBoost), bootstrap aggregation (bagging), and stacking, each trained with three different weak classifiers (Decision Trees, Linear Discriminant Analysis (LDA), and Logistic Regression). The type of feature normalization used can also affect classification results, so for each ensemble method, three feature normalization techniques (no normalization, z-score normalization, and min-max normalization) were tested. A 12-fold cross-validation scheme was used to evaluate the performance of each model, where performance was evaluated in terms of precision, recall, and accuracy. The best results achieved here show an improvement of about 4% over our previous algorithms.
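
    The three ensemble strategies and normalization options compared here map directly onto standard library components. The sketch below uses synthetic stand-in data rather than the chewing-sensor epochs, and decision stumps as the weak learner for boosting and bagging, so it is an analogue of the setup rather than the authors' exact configuration.

      from sklearn.datasets import make_classification
      from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
      from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                                    StackingClassifier)
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import cross_val_score
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import MinMaxScaler, StandardScaler
      from sklearn.tree import DecisionTreeClassifier

      X, y = make_classification(n_samples=1000, n_features=30, random_state=0)
      ensembles = {
          "boosting": AdaBoostClassifier(DecisionTreeClassifier(max_depth=1)),
          "bagging": BaggingClassifier(DecisionTreeClassifier(max_depth=1)),
          "stacking": StackingClassifier([
              ("lda", LinearDiscriminantAnalysis()),
              ("lr", LogisticRegression(max_iter=1000)),
          ]),
      }
      for scaler in (None, StandardScaler(), MinMaxScaler()):  # none / z-score / min-max
          for name, ens in ensembles.items():
              model = make_pipeline(scaler, ens) if scaler else ens
              score = cross_val_score(model, X, y, cv=12).mean()  # 12-fold, as above
              print(type(scaler).__name__, name, round(score, 3))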

  14. Generating One Biometric Feature from Another: Faces from Fingerprints

    PubMed Central

    Ozkaya, Necla; Sagiroglu, Seref

    2010-01-01

    This study presents a new approach based on artificial neural networks for generating one biometric feature (faces) from another (only fingerprints). An automatic and intelligent system was designed and developed to analyze the relationships among fingerprints and faces and also to model those relationships. The proposed system is the first that generates all parts of the face, including eyebrows, eyes, nose, mouth, ears and face border, from only fingerprints. It also differs from similar studies recently presented in the literature in several superior features. The parameter settings of the system were determined with the help of the Taguchi experimental design technique. The performance and accuracy of the system were evaluated with a 10-fold cross-validation technique using qualitative evaluation metrics in addition to expanded quantitative evaluation metrics. Consequently, the results are presented on the basis of a combination of these objective and subjective metrics, illustrating the qualitative properties of the proposed methods as well as a quantitative evaluation of their performance. Experimental results have shown that one biometric feature can be determined from another. These results once more indicate that there is a strong relationship between fingerprints and faces. PMID:22399877

  15. Performance analysis of clustering techniques over microarray data: A case study

    NASA Astrophysics Data System (ADS)

    Dash, Rasmita; Misra, Bijan Bihari

    2018-03-01

    Handling big data is one of the major issues in statistical data analysis, and in such investigations cluster analysis plays a vital role in dealing with large-scale data. There are many clustering techniques, each with a different analysis approach, but which approach suits a particular dataset is difficult to predict. To deal with this problem, a grading approach is introduced over many clustering techniques to identify a stable technique. Because the grading approach depends on the characteristics of the dataset as well as on the validity indices, a two-stage grading approach is implemented. In this study the grading approach is applied to five clustering techniques: hybrid swarm based clustering (HSC), k-means, partitioning around medoids (PAM), vector quantization (VQ) and agglomerative nesting (AGNES). The experiments are conducted over five microarray datasets with seven validity indices. The finding of the grading approach that a clustering technique is significant is also confirmed by the Nemenyi post-hoc test.

  16. Psychometric evaluation of the Environmental Reality Shock-Related Issues and Concerns instrument for newly graduated nurses.

    PubMed

    Kim, Eun-Young; Yeo, Jung Hee; Park, Hyunjeong; Sin, Kyung Mi; Jones, Cheryl B

    2018-02-01

    Reality shock is a critical representation of the gap between nursing education and clinical practice, and it is important to explore the level of reality shock among nurses. However, there is no relevant instrument to assess the level of reality shock in South Korea. The purpose of this study was to determine the validity and reliability of the Korean version of the Environmental Reality Shock-Related Issues and Concerns instrument. A cross-sectional study design was used. Data collection was conducted in 15 selected hospitals in South Korea, and a convenience sample of 216 newly graduated nurses participated in the study. The Korean version of the Environmental Reality Shock-Related Issues and Concerns instrument was developed through the forward-backward translation technique and revision based on feedback from expert groups. Internal consistency reliability was assessed using Cronbach's alpha, and construct validity was determined via exploratory and confirmatory factor analysis. The Korean version of the Environmental Reality Shock-Related Issues and Concerns has reliable internal consistency (Cronbach's alpha=0.91). Exploratory factor analysis revealed five factors (job, relationships, expectations, private life, and performance), which explained 61.92% of the variance; the factor loadings ranged from 0.451 to 0.832. The five-factor structure was validated by confirmatory factor analysis (RMR<0.05, CFI>0.9). It was concluded that the Korean version of the Environmental Reality Shock-Related Issues and Concerns instrument has satisfactory construct validity and reliability to measure the reality shock of newly graduated nurses in South Korea. Copyright © 2017 Elsevier Ltd. All rights reserved.
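
    For reference, the internal-consistency statistic used above has a closed form, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). A small self-contained implementation is sketched below; the random Likert responses are illustrative only (uncorrelated random items yield an alpha near zero, whereas real scale items would correlate).

      import numpy as np

      def cronbach_alpha(items):
          """items: (n_respondents, k_items) array of item scores."""
          items = np.asarray(items, dtype=float)
          k = items.shape[1]
          item_variances = items.var(axis=0, ddof=1).sum()
          total_variance = items.sum(axis=1).var(ddof=1)
          return k / (k - 1) * (1 - item_variances / total_variance)

      rng = np.random.default_rng(0)
      demo = rng.integers(1, 6, size=(216, 10))  # illustrative 5-point Likert data
      print(cronbach_alpha(demo))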

  17. Cross-cultural adaptation of the Oral Health Impact Profile (OHIP) for the Malaysian adult population.

    PubMed

    Saub, R; Locker, D; Allison, P; Disman, M

    2007-09-01

    The aim of this project was to develop an oral health-related quality of life measure for the Malaysian adult population aged 18 and above through cross-cultural adaptation of the Oral Health Impact Profile (OHIP). The adaptation of the OHIP was based on the framework proposed by Herdman et al (1998). The OHIP was translated into the Malay language using a forward-backward translation technique. Thirty-six patients were interviewed to assess the conceptual equivalence and relevancy of each item. Based on the translation process and interview results, a Malaysian version of the OHIP questionnaire was produced that contained 45 items. It was designated the OHIP(M). This questionnaire was pre-tested on 20 patients to assess its face validity. A short 14-item version of the questionnaire was completed by 171 patients to assess the suitability of the Likert-type response format. Field-testing was conducted in order to assess the suitability of two modes of administration (mail and interview) and to establish the psychometric properties of the adapted measure. The pre-testing revealed that the OHIP(M) has good face validity. It was found that the five-point frequency Likert scale could be used for the Malaysian population. The OHIP(M) was reliable: the scale Cronbach's alpha was 0.95 and the ICC value for test-retest reliability was 0.79. Three out of four construct validity hypotheses tested were confirmed. The OHIP(M) works as well as the English version and was found to be reliable and valid regardless of the mode of administration. However, this study only provides initial evidence for the reliability and validity of the measure. Further study is recommended to collect more evidence to support these results.

  18. A strategy for co-translational folding studies of ribosome-bound nascent chain complexes using NMR spectroscopy.

    PubMed

    Cassaignau, Anaïs M E; Launay, Hélène M M; Karyadi, Maria-Evangelia; Wang, Xiaolin; Waudby, Christopher A; Deckert, Annika; Robertson, Amy L; Christodoulou, John; Cabrita, Lisa D

    2016-08-01

    During biosynthesis on the ribosome, an elongating nascent polypeptide chain can begin to fold, in a process that is central to all living systems. Detailed structural studies of co-translational protein folding are now beginning to emerge; such studies were previously limited, at least in part, by the inherently dynamic nature of emerging nascent chains, which precluded most structural techniques. NMR spectroscopy is able to provide atomic-resolution information for ribosome-nascent chain complexes (RNCs), but it requires large quantities (≥10 mg) of homogeneous, isotopically labeled RNCs. Further challenges include limited sample working concentration and stability of the RNC sample (which contribute to weak NMR signals) and resonance broadening caused by attachment to the large (2.4-MDa) ribosomal complex. Here, we present a strategy to generate isotopically labeled RNCs in Escherichia coli that are suitable for NMR studies. Uniform translational arrest of the nascent chains is achieved using a stalling motif, and isotopically labeled RNCs are produced at high yield using high-cell-density E. coli growth conditions. Homogeneous RNCs are isolated by combining metal affinity chromatography (to isolate ribosome-bound species) with sucrose density centrifugation (to recover intact 70S monosomes). Sensitivity-optimized NMR spectroscopy is then applied to the RNCs, combined with a suite of parallel NMR and biochemical analyses to cross-validate their integrity, including RNC-optimized NMR diffusion measurements to report on ribosome attachment in situ. Comparative NMR studies of RNCs with the analogous isolated proteins permit a high-resolution description of the structure and dynamics of a nascent chain during its progressive biosynthesis on the ribosome.

  19. Infused autograft lymphocyte-to-monocyte ratio and survival in T-cell lymphoma post-autologous peripheral blood hematopoietic stem cell transplantation.

    PubMed

    Porrata, Luis F; Inwards, David J; Ansell, Stephen M; Micallef, Ivana N; Johnston, Patrick B; Hogan, William J; Markovic, Svetomir N

    2015-07-03

    The infused autograft lymphocyte-to-monocyte ratio (A-LMR) is a prognostic factor for survival in B-cell lymphomas post-autologous peripheral hematopoietic stem cell transplantation (APHSCT). Thus, we set out to investigate whether the A-LMR is also a prognostic factor for survival post-APHSCT in T-cell lymphomas. From 1998 to 2014, 109 T-cell lymphoma patients who underwent APHSCT were studied. Receiver operating characteristic (ROC) curves and the area under the curve (AUC) were used to identify the optimal cut-off value of A-LMR for survival analysis, and a k-fold cross-validation model was used to validate this cut-off value. Univariate and multivariate Cox proportional hazards models were used to assess the prognostic discriminating power of A-LMR. ROC and AUC analysis identified an A-LMR ≥ 1 as the best cut-off value, which was validated by k-fold cross-validation. Multivariate analysis showed A-LMR to be an independent prognostic factor for overall survival (OS) and progression-free survival (PFS). Patients with an A-LMR ≥ 1.0 experienced superior OS and PFS versus patients with an A-LMR < 1.0 [median OS not reached vs 17.9 months, 5-year OS rates of 87% (95% confidence interval (CI), 75-94%) vs 26% (95% CI, 13-42%), p < 0.0001; median PFS not reached vs 11.9 months, 5-year PFS rates of 72% (95% CI, 58-83%) vs 16% (95% CI, 6-32%), p < 0.0001]. A-LMR is thus also a prognostic factor for clinical outcomes in patients with T-cell lymphomas undergoing APHSCT.
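
    The cut-off search can be illustrated on synthetic data: within each training fold, choose the threshold that maximizes Youden's J (sensitivity + specificity - 1), a common ROC criterion. The abstract does not state the exact rule used, so Youden's J and all numbers below are assumptions.

      import numpy as np
      from sklearn.metrics import roc_curve
      from sklearn.model_selection import StratifiedKFold

      rng = np.random.default_rng(0)
      almr = np.concatenate([rng.normal(1.4, 0.5, 60),    # hypothetical survivors
                             rng.normal(0.8, 0.4, 49)])   # hypothetical events
      event = np.array([0] * 60 + [1] * 49)

      cutoffs = []
      skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
      for train, _ in skf.split(almr, event):
          # negate A-LMR so that larger scores indicate the adverse event
          fpr, tpr, thr = roc_curve(event[train], -almr[train])
          cutoffs.append(-thr[np.argmax(tpr - fpr)])      # Youden's J statistic
      print("per-fold optimal A-LMR cut-offs:", np.round(cutoffs, 2))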

  20. Automatic classification of ovarian cancer types from cytological images using deep convolutional neural networks.

    PubMed

    Wu, Miao; Yan, Chuanbo; Liu, Huiqiang; Liu, Qian

    2018-06-29

    Ovarian cancer is one of the most common gynecologic malignancies. Accurate classification of ovarian cancer types (serous carcinoma, mucinous carcinoma, endometrioid carcinoma, clear cell carcinoma) is an essential part of the differential diagnosis. Computer-aided diagnosis (CADx) can provide useful advice for pathologists to determine the diagnosis correctly. In our study, we employed a Deep Convolutional Neural Network (DCNN) based on AlexNet to automatically classify the different types of ovarian cancers from cytological images. The DCNN consists of five convolutional layers, three max pooling layers, and two fully connected layers. We then trained the model on two groups of input data separately: the original images, and an augmented image set produced by image enhancement and image rotation. Testing results were obtained by 10-fold cross-validation, showing that the accuracy of the classification models improved from 72.76 to 78.20% when augmented images were used as training data. The developed scheme was useful for classifying ovarian cancers from cytological images. © 2018 The Author(s).
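
    The architecture described (five convolutional, three max-pooling, and two fully connected layers) closely follows AlexNet. A PyTorch sketch of such a network for four-class cytological images is below; the kernel sizes and the 224x224 input are AlexNet defaults and therefore assumptions, since the paper's exact dimensions are not given.

      import torch
      import torch.nn as nn

      class OvarianCNN(nn.Module):
          """AlexNet-style: 5 conv layers, 3 max-pool layers, 2 fully connected."""
          def __init__(self, n_classes=4):
              super().__init__()
              self.features = nn.Sequential(
                  nn.Conv2d(3, 64, 11, stride=4, padding=2), nn.ReLU(),
                  nn.MaxPool2d(3, stride=2),
                  nn.Conv2d(64, 192, 5, padding=2), nn.ReLU(),
                  nn.MaxPool2d(3, stride=2),
                  nn.Conv2d(192, 384, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(384, 256, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(),
                  nn.MaxPool2d(3, stride=2))
              self.classifier = nn.Sequential(
                  nn.Flatten(),
                  nn.Linear(256 * 6 * 6, 4096), nn.ReLU(), nn.Dropout(0.5),
                  nn.Linear(4096, n_classes))

          def forward(self, x):
              return self.classifier(self.features(x))

      print(OvarianCNN()(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 4])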

  1. Metric Sex Determination of the Human Coxal Bone on a Virtual Sample using Decision Trees.

    PubMed

    Savall, Frédéric; Faruch-Bilfeld, Marie; Dedouit, Fabrice; Sans, Nicolas; Rousseau, Hervé; Rougé, Daniel; Telmon, Norbert

    2015-11-01

    Decision trees provide an alternative to multivariate discriminant analysis, which is still the method most commonly used in anthropometric studies. Our study analyzed the metric characterization of a recent virtual sample of 113 coxal bones using decision trees for sex determination. From 17 osteometric type I landmarks, a dataset was built with five classic distances traditionally reported in the literature and six new distances selected using the two-step ratio method. A ten-fold cross-validation was performed, and a decision tree was established on two subsamples (training and test sets). The decision tree established on the training set included three nodes, and its application to the test set correctly classified 92% of individuals. This percentage is similar to values reported in the literature. The usefulness of decision trees has been demonstrated in numerous fields: they have already been used in sex determination, body mass prediction, and ancestry estimation. This study shows another use of decision trees, enabling simple and accurate sex determination. © 2015 American Academy of Forensic Sciences.

  2. Improving medical diagnosis reliability using Boosted C5.0 decision tree empowered by Particle Swarm Optimization.

    PubMed

    Pashaei, Elnaz; Ozen, Mustafa; Aydin, Nizamettin

    2015-08-01

    Improving the accuracy of supervised classification algorithms in biomedical applications is an active area of research. In this study, we improve the performance of the Particle Swarm Optimization (PSO) combined with C4.5 decision tree (PSO+C4.5) classifier by applying a Boosted C5.0 decision tree as the fitness function. To evaluate the effectiveness of the proposed method, it was implemented on 1 microarray dataset and 5 different medical datasets obtained from the UCI machine learning databases. Moreover, the results of the PSO + Boosted C5.0 implementation are compared to eight well-known benchmark classification methods (PSO+C4.5, support vector machine with a Radial Basis Function kernel, Classification And Regression Tree (CART), C4.5 decision tree, C5.0 decision tree, Boosted C5.0 decision tree, Naive Bayes and Weighted K-Nearest Neighbor). A repeated five-fold cross-validation method was used to assess the performance of the classifiers. Experimental results show that the proposed method not only improves the performance of PSO+C4.5 but also obtains higher classification accuracy than the other classification methods.
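
    Repeated k-fold cross-validation simply re-runs the k-fold split under different shuffles and pools the scores. Boosted C5.0 and PSO have no standard Python implementations, so the sketch below uses a plain decision tree on a stand-in dataset to illustrate only the validation protocol.

      from sklearn.datasets import load_breast_cancer
      from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
      from sklearn.tree import DecisionTreeClassifier

      X, y = load_breast_cancer(return_X_y=True)       # stand-in medical data
      cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
      scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=cv)
      print(f"repeated 5-fold accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")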

  3. Comparison of machine-learning methods for above-ground biomass estimation based on Landsat imagery

    NASA Astrophysics Data System (ADS)

    Wu, Chaofan; Shen, Huanhuan; Shen, Aihua; Deng, Jinsong; Gan, Muye; Zhu, Jinxia; Xu, Hongwei; Wang, Ke

    2016-07-01

    Biomass is one significant biophysical parameter of a forest ecosystem, and accurate biomass estimation on the regional scale provides important information for carbon-cycle investigation and sustainable forest management. In this study, Landsat satellite imagery data combined with field-based measurements were integrated through comparisons of five regression approaches [stepwise linear regression, K-nearest neighbor, support vector regression, random forest (RF), and stochastic gradient boosting] with two different candidate-variable strategies to implement the optimal spatial above-ground biomass (AGB) estimation. The results suggested that the RF algorithm exhibited the best performance by 10-fold cross-validation with respect to R2 (0.63) and root-mean-square error (26.44 ton/ha). Consequently, the map of estimated AGB was generated with a mean value of 89.34 ton/ha in northwestern Zhejiang Province, China, with a pattern similar to the distribution of local forest species. This research indicates that machine-learning approaches associated with Landsat imagery provide an economical way to estimate biomass. Moreover, ensemble methods using all candidate variables, especially for Landsat images, provide an alternative for regional biomass simulation.
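
    A model-comparison loop of this kind, several regressors scored by 10-fold cross-validated R2 and RMSE, can be sketched as follows; the synthetic regression data stand in for the Landsat-derived predictors and field-measured AGB.

      from sklearn.datasets import make_regression
      from sklearn.ensemble import (GradientBoostingRegressor,
                                    RandomForestRegressor)
      from sklearn.linear_model import LinearRegression
      from sklearn.model_selection import cross_validate
      from sklearn.neighbors import KNeighborsRegressor
      from sklearn.svm import SVR

      X, y = make_regression(n_samples=300, n_features=8, noise=20, random_state=0)
      models = {"linear": LinearRegression(), "kNN": KNeighborsRegressor(),
                "SVR": SVR(), "RF": RandomForestRegressor(random_state=0),
                "SGB": GradientBoostingRegressor(random_state=0)}
      for name, model in models.items():
          res = cross_validate(model, X, y, cv=10,
                               scoring=("r2", "neg_root_mean_squared_error"))
          print(f"{name:6s} R2={res['test_r2'].mean():.2f}  "
                f"RMSE={-res['test_neg_root_mean_squared_error'].mean():.1f}")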

  4. Multivariate reference technique for quantitative analysis of fiber-optic tissue Raman spectroscopy.

    PubMed

    Bergholt, Mads Sylvest; Duraipandian, Shiyamala; Zheng, Wei; Huang, Zhiwei

    2013-12-03

    We report a novel method that makes use of the multivariate reference signals of fused silica and sapphire Raman bands generated from a ball-lens fiber-optic Raman probe for quantitative analysis of in vivo tissue Raman measurements in real time. Partial least-squares (PLS) regression modeling is applied to extract the characteristic internal reference Raman signals (e.g., shoulder of the prominent fused silica boson peak (~130 cm(-1)); distinct sapphire ball-lens peaks (380, 417, 646, and 751 cm(-1))) from the ball-lens fiber-optic Raman probe for quantitative analysis of fiber-optic Raman spectroscopy. To evaluate the analytical value of this novel multivariate reference technique, a rapid Raman spectroscopy system coupled with a ball-lens fiber-optic Raman probe is used for in vivo oral tissue Raman measurements (n = 25 subjects) under 785 nm laser excitation powers ranging from 5 to 65 mW. An accurate linear relationship (R(2) = 0.981) with a root-mean-square error of cross validation (RMSECV) of 2.5 mW is obtained for predicting laser excitation power changes based on a leave-one-subject-out cross-validation, which is superior to the normal univariate reference method (RMSE = 6.2 mW). A root-mean-square error of prediction (RMSEP) of 2.4 mW (R(2) = 0.985) is also achieved for laser power prediction in real time when the multivariate method is applied independently to five new subjects (n = 166 spectra). We further apply the multivariate reference technique to quantitative analysis of gelatin tissue phantoms, which gives an RMSEP of ~2.0% (R(2) = 0.998) independent of laser excitation power variations. This work demonstrates that the multivariate reference technique can be advantageously used to monitor and correct variations of laser excitation power and fiber coupling efficiency in situ, standardizing the tissue Raman intensity to realize quantitative analysis of tissue Raman measurements in vivo, which is particularly appealing in challenging Raman endoscopic applications.
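
    The quantitative core, PLS regression validated by leave-one-subject-out cross-validation, can be sketched with synthetic spectra as below; scikit-learn's LeaveOneGroupOut plays the role of the per-subject splits, and the signal model is invented for illustration.

      import numpy as np
      from sklearn.cross_decomposition import PLSRegression
      from sklearn.model_selection import LeaveOneGroupOut, cross_val_predict

      rng = np.random.default_rng(0)
      n, channels = 250, 300
      power = rng.uniform(5, 65, n)                 # laser excitation power (mW)
      ref = rng.normal(size=channels)               # internal reference band shape
      X = rng.normal(size=(n, channels)) + 0.05 * np.outer(power, ref)
      subject = rng.integers(0, 25, n)              # which of 25 subjects

      pred = cross_val_predict(PLSRegression(n_components=5), X, power,
                               cv=LeaveOneGroupOut(), groups=subject).ravel()
      print(f"RMSECV = {np.sqrt(np.mean((pred - power) ** 2)):.2f} mW")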

  5. Predicting pathway cross-talks in ankylosing spondylitis through investigating the interactions among pathways.

    PubMed

    Gu, Xiang; Liu, Cong-Jian; Wei, Jian-Jie

    2017-11-13

    Given that the pathogenesis of ankylosing spondylitis (AS) remains unclear, the aim of this study was to detect potentially functional pathway cross-talk in AS to further reveal the pathogenesis of this disease. Using a microarray profile of AS and biological pathways as study objects, a Monte Carlo cross-validation method was used to identify the significant pathway cross-talks. In the process of Monte Carlo cross-validation, all steps were iterated 50 times. For each run, differentially expressed genes (DEGs) between the two groups were detected, and the potentially disrupted pathways enriched by DEGs were then extracted. Subsequently, we established a discriminating score (DS) for each pathway pair according to the distribution of gene expression levels. After that, we utilized a random forest (RF) classification model to screen out the top 10 paired pathways with the highest area under the curve (AUC) values, computed using a 10-fold cross-validation approach. After 50 bootstraps, the best pairs of pathways were identified. According to the AUC values, the pair of pathways comprising the antigen presentation pathway and fMLP signaling in neutrophils achieved the best AUC value of 1.000, indicating that this pathway cross-talk could distinguish AS patients from normal subjects. Moreover, the paired pathways of SAPK/JNK signaling and mitochondrial dysfunction were involved in 5 bootstraps. Two paired pathways (the antigen presentation pathway with fMLP signaling in neutrophils, and SAPK/JNK signaling with mitochondrial dysfunction) can accurately distinguish AS and control samples. These paired pathways may help identify patients with AS for early intervention.
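
    Monte Carlo cross-validation, unlike k-fold, draws many independent random train/test splits; scikit-learn's ShuffleSplit expresses this directly. The sketch below mirrors the 50-iteration design on stand-in data rather than the study's pathway-pair scores.

      from sklearn.datasets import load_breast_cancer
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import ShuffleSplit, cross_val_score

      X, y = load_breast_cancer(return_X_y=True)     # stand-in two-class data
      mccv = ShuffleSplit(n_splits=50, test_size=0.2, random_state=0)
      aucs = cross_val_score(RandomForestClassifier(random_state=0), X, y,
                             cv=mccv, scoring="roc_auc")
      print(f"mean AUC over 50 random splits: {aucs.mean():.3f}")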

  6. Five-dimensional ultrasound system for soft tissue visualization.

    PubMed

    Deshmukh, Nishikant P; Caban, Jesus J; Taylor, Russell H; Hager, Gregory D; Boctor, Emad M

    2015-12-01

    A five-dimensional ultrasound (US) system is proposed as a real-time pipeline involving fusion of 3D B-mode data with the 3D ultrasound elastography (USE) data as well as visualization of these fused data and a real-time update capability over time for each consecutive scan. 3D B-mode data assist in visualizing the anatomy of the target organ, and 3D elastography data adds strain information. We investigate the feasibility of such a system and show that an end-to-end real-time system, from acquisition to visualization, can be developed. We present a system that consists of (a) a real-time 3D elastography algorithm based on a normalized cross-correlation (NCC) computation on a GPU; (b) real-time 3D B-mode acquisition and network transfer; (c) scan conversion of 3D elastography and B-mode volumes (if acquired by 4D wobbler probe); and (d) visualization software that fuses, visualizes, and updates 3D B-mode and 3D elastography data in real time. We achieved a speed improvement of 4.45-fold for the threaded version of the NCC-based 3D USE versus the non-threaded version. The maximum speed was 79 volumes/s for 3D scan conversion. In a phantom, we validated the dimensions of a 2.2-cm-diameter sphere scan-converted to B-mode volume. Also, we validated the 5D US system visualization transfer function and detected 1- and 2-cm spherical objects (phantom lesion). Finally, we applied the system to a phantom consisting of three lesions to delineate the lesions from the surrounding background regions of the phantom. A 5D US system is achievable with real-time performance. We can distinguish between hard and soft areas in a phantom using the transfer functions.
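
    Elastography of this kind tracks tissue displacement by maximizing normalized cross-correlation (NCC) between pre- and post-compression echo windows; the GPU implementation evaluates that kernel for very many windows in parallel. A minimal NumPy version of the per-window computation (not the authors' GPU code) is:

      import numpy as np

      def ncc(a, b):
          """Normalized cross-correlation of two equally sized windows."""
          a = a - a.mean()
          b = b - b.mean()
          denom = np.sqrt((a * a).sum() * (b * b).sum())
          return float((a * b).sum() / denom) if denom else 0.0

      pre = np.random.default_rng(0).normal(size=(8, 8, 8))  # pre-compression window
      post = pre * 1.02 + 0.05                               # slightly deformed copy
      print(round(ncc(pre, post), 4))                        # ~1.0: windows match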

  7. Mass Spectrometry and Ion Mobility Characterization of Bioactive Peptide-Synthetic Polymer Conjugates.

    PubMed

    Alalwiat, Ahlam; Tang, Wen; Gerişlioğlu, Selim; Becker, Matthew L; Wesdemiotis, Chrys

    2017-01-17

    The bioconjugate BMP2-(PEO-HA)2, composed of a dendron with two monodisperse poly(ethylene oxide) (PEO) branches terminated by a hydroxyapatite binding peptide (HA), and a focal point substituted with a bone growth stimulating peptide (BMP2), has been comprehensively characterized by mass spectrometry (MS) methods, encompassing matrix-assisted laser desorption ionization (MALDI), electrospray ionization (ESI), tandem mass spectrometry (MS2), and ion mobility mass spectrometry (IM-MS). MS2 experiments using different ion activation techniques validated the sequences of the synthetic, bioactive peptides HA and BMP2, which contained highly basic amino acid residues either at the N-terminus (BMP2) or C-terminus (HA). Application of MALDI-MS, ESI-MS, and IM-MS to the polymer-peptide biomaterial confirmed its composition. Collision cross-section measurements and molecular modeling indicated that BMP2-(PEO-HA)2 exists in several folded and extended conformations, depending on the degree of protonation. Protonation of all basic sites of the hybrid material nearly doubles its conformational space and accessible surface area.

  8. Automated Non-Alphanumeric Symbol Resolution in Clinical Texts

    PubMed Central

    Moon, SungRim; Pakhomov, Serguei; Ryan, James; Melton, Genevieve B.

    2011-01-01

    Although clinical texts contain many symbols, relatively little attention has been given to symbol resolution by medical natural language processing (NLP) researchers. Interpreting the meaning of symbols may be viewed as a special case of Word Sense Disambiguation (WSD). One thousand instances of four common non-alphanumeric symbols (‘+’, ‘-’, ‘/’, and ‘#’) were randomly extracted from a clinical document repository and annotated by experts. The symbols and their surrounding context, bag-of-words (BoW) features, and heuristic rules were evaluated as features for the following classifiers: Naïve Bayes, Support Vector Machine, and Decision Tree, using 10-fold cross-validation. Accuracies for ‘+’, ‘-’, ‘/’, and ‘#’ were 80.11%, 80.22%, 90.44%, and 95.00%, respectively, with Naïve Bayes. While symbol context contributed the most, BoW was also helpful for disambiguation of some symbols. Symbol disambiguation with supervised techniques can be implemented with reasonable accuracy as a module for medical NLP systems. PMID:22195157
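
    A bag-of-words Naive Bayes disambiguator of the kind evaluated above takes only a few lines; the contexts and sense labels below are invented, and a real system would train on the thousand annotated instances per symbol and report 10-fold cross-validated accuracy.

      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.naive_bayes import MultinomialNB
      from sklearn.pipeline import make_pipeline

      # Hypothetical annotated contexts for the symbol '+', one sense label each.
      contexts = ["blood type A + positive", "reflexes 2 + brisk",
                  "culture + for growth", "tylenol + rest advised"]
      senses = ["positive", "grade", "positive", "and"]

      model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
      model.fit(contexts, senses)
      print(model.predict(["wound culture + today"]))  # predicted sense label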

  9. Memory-Augmented Active Deep Learning for Identifying Relations Between Distant Medical Concepts in Electroencephalography Reports.

    PubMed

    Maldonado, Ramon; Goodwin, Travis R; Harabagiu, Sanda M

    2018-01-01

    The automatic identification of relations between medical concepts in a large corpus of Electroencephalography (EEG) reports is an important step in the development of an EEG-specific patient cohort retrieval system, as well as in the acquisition of EEG-specific knowledge from this corpus. EEG-specific relations involve medical concepts that are not typically mentioned in the same sentence or even the same section of a report, thus requiring extraction techniques that can handle such long-distance dependencies. To address this challenge, we present a novel framework which combines the advantages of a deep learning framework employing Dynamic Relational Memory (DRM) with active learning. While DRM enables the prediction of long-distance relations, active learning provides a mechanism for accurately identifying relations with minimal training data, obtaining a five-fold cross-validation F1 score of 0.7475 on a set of 140 EEG reports selected with active learning. The results obtained with our novel framework show great promise.

  10. Utilizing random Forest QSAR models with optimized parameters for target identification and its application to target-fishing server.

    PubMed

    Lee, Kyoungyeul; Lee, Minho; Kim, Dongsup

    2017-12-28

    The identification of target molecules is important for understanding the mechanism of "target deconvolution" in phenotypic screening and the "polypharmacology" of drugs. Because conventional methods of identifying targets require time and cost, in-silico target identification has been considered an alternative solution. One of the well-known in-silico methods of identifying targets involves structure-activity relationships (SARs). SARs have advantages such as low computational cost and high feasibility; however, the data dependency of the SAR approach causes an imbalance of active data and ambiguity of inactive data across targets. We developed a ligand-based virtual screening model comprising 1121 target SAR models built using a random forest algorithm. The performance of each target model was tested by employing the ROC curve and the mean score using an internal five-fold cross-validation. Moreover, recall rates for top-k targets were calculated to assess the performance of target ranking. A benchmark model using an optimized sampling method and parameters was examined via an external validation set. The results show recall rates of 67.6% and 73.9% for top-11 (1% of the total targets) and top-33, respectively. We provide a website for users to search the top-k targets for query ligands, available publicly at http://rfqsar.kaist.ac.kr . The target models that we built can be used both for predicting the activity of ligands toward each target and for ranking candidate targets for a query ligand using a unified scoring scheme. The scores are additionally fitted to probabilities so that users can estimate how likely a ligand-target interaction is to be active. The user interface of our website is user-friendly and intuitive, offering useful information and cross references.

  11. Dermofat graft in deep nasolabial fold and facial rhytidectomy.

    PubMed

    Hwang, Kun; Han, Jin Yi; Kim, Dae Joong

    2003-01-01

    Fat and dermis, or the combined tissues, are commonly used in augmentation of the nasolabial fold. Guyuron obtained the dermofat graft from either the suprapubic or the groin region. The thickness of the preauricular skin was measured in seven Korean cadavers, five male and two female. We used a dermofat graft taken from the preauricular skin remnant after facial rhytidectomy to augment the deep nasolabial fold in a patient. The average thickness of the epidermis was 56 +/- 12 microm, the dermis was 1820 +/- 265 microm thick, and the subcutaneous tissue was 4783 +/- 137 microm. Denser connective tissues, such as the SMAS, are seen in the preauricular skin. The dermofat graft was easily obtained and prepared from the leftover preauricular skin after dissection of the lax skin in face lifting. This technique can be employed effectively and successfully to alleviate a deep nasolabial fold during concomitant facial rhytidectomy in Asians with thick preauricular skin.

  12. Cross-Cultural Validation of the Preventive Health Model for Colorectal Cancer Screening: An Australian Study

    ERIC Educational Resources Information Center

    Flight, Ingrid H.; Wilson, Carlene J.; McGillivray, Jane; Myers, Ronald E.

    2010-01-01

    We investigated whether the five-factor structure of the Preventive Health Model for colorectal cancer screening, developed in the United States, has validity in Australia. We also tested extending the model with the addition of the factor Self-Efficacy to Screen using Fecal Occult Blood Test (SESFOBT). Randomly selected men and women aged between…

  13. Use of integrated analogue and numerical modelling to predict tridimensional fracture intensity in fault-related-folds.

    NASA Astrophysics Data System (ADS)

    Pizzati, Mattia; Cavozzi, Cristian; Magistroni, Corrado; Storti, Fabrizio

    2016-04-01

    Predicting fracture density patterns with low uncertainty is a fundamental issue for constraining fluid flow pathways in thrust-related anticlines in the frontal parts of thrust-and-fold belts and accretionary prisms, which can also provide plays for hydrocarbon exploration and development. Among the drivers that determine the distribution of fractures in fold-and-thrust belts, the complex kinematic pathways of folded structures play a key role. In areas with scarce and unreliable underground information, analogue modelling can provide effective support for developing and validating reliable hypotheses on structural architectures and their evolution. In this contribution, we propose a working method that combines analogue and numerical modelling. We deformed a sand-silicone multilayer to eventually produce a non-cylindrical thrust-related anticline at the wedge toe, which was our test geological structure at the reservoir scale. We cut 60 serial cross-sections through the central part of the deformed model to analyze fault and fold geometry using dedicated software (3D Move). The cross-sections were also used to reconstruct the 3D geometry of the reference surfaces that compose the mechanical stratigraphy, using the software GoCad. From the 3D model of the experimental anticline, 3D Move was used to calculate the cumulative stress and strain undergone by the deformed reference layers at the end of deformation, and also in incremental steps of fold growth. Based on these model outputs it was also possible to predict the orientation of three main fracture sets (joints and conjugate shear fractures) and their occurrence and density on model surfaces. The final step was upscaling the fracture network to the entire digital model volume to create DFNs.

  14. Validation of the Adolescent Concerns Measure (ACM): evidence from exploratory and confirmatory factor analysis.

    PubMed

    Ang, Rebecca P; Chong, Wan Har; Huan, Vivien S; Yeo, Lay See

    2007-01-01

    This article reports the development and initial validation of scores obtained from the Adolescent Concerns Measure (ACM), a scale which assesses concerns of Asian adolescent students. In Study 1, findings from exploratory factor analysis using 619 adolescents suggested a 24-item scale with four correlated factors--Family Concerns (9 items), Peer Concerns (5 items), Personal Concerns (6 items), and School Concerns (4 items). Initial estimates of convergent validity for ACM scores were also reported. The four-factor structure of ACM scores derived from Study 1 was confirmed via confirmatory factor analysis in Study 2 using a two-fold cross-validation procedure with a separate sample of 811 adolescents. Support was found for both the multidimensional and hierarchical models of adolescent concerns using the ACM. Internal consistency and test-retest reliability estimates were adequate for research purposes. ACM scores show promise as a reliable and potentially valid measure of Asian adolescents' concerns.

  15. Tertiary model of a plant cellulose synthase

    PubMed Central

    Sethaphong, Latsavongsakda; Haigler, Candace H.; Kubicki, James D.; Zimmer, Jochen; Bonetta, Dario; DeBolt, Seth; Yingling, Yaroslava G.

    2013-01-01

    A 3D atomistic model of a plant cellulose synthase (CESA) has remained elusive despite over forty years of experimental effort. Here, we report a computationally predicted 3D structure of 506 amino acids of cotton CESA within the cytosolic region. Comparison of the predicted plant CESA structure with the solved structure of a bacterial cellulose-synthesizing protein validates the overall fold of the modeled glycosyltransferase (GT) domain. The coaligned plant and bacterial GT domains share a six-stranded β-sheet, five α-helices, and conserved motifs similar to those required for catalysis in other GT-2 glycosyltransferases. Extending beyond the cross-kingdom similarities related to cellulose polymerization, the predicted structure of cotton CESA reveals that plant-specific modules (plant-conserved region and class-specific region) fold into distinct subdomains on the periphery of the catalytic region. Computational results support the importance of the plant-conserved region and/or class-specific region in CESA oligomerization to form the multimeric cellulose–synthesis complexes that are characteristic of plants. Relatively high sequence conservation between plant CESAs allowed mapping of known mutations and two previously undescribed mutations that perturb cellulose synthesis in Arabidopsis thaliana to their analogous positions in the modeled structure. Most of these mutation sites are near the predicted catalytic region, and the confluence of other mutation sites supports the existence of previously undefined functional nodes within the catalytic core of CESA. Overall, the predicted tertiary structure provides a platform for the biochemical engineering of plant CESAs. PMID:23592721

  16. Five-class differential diagnostics of neurodegenerative diseases using random undersampling boosting.

    PubMed

    Tong, Tong; Ledig, Christian; Guerrero, Ricardo; Schuh, Andreas; Koikkalainen, Juha; Tolonen, Antti; Rhodius, Hanneke; Barkhof, Frederik; Tijms, Betty; Lemstra, Afina W; Soininen, Hilkka; Remes, Anne M; Waldemar, Gunhild; Hasselbalch, Steen; Mecocci, Patrizia; Baroni, Marta; Lötjönen, Jyrki; Flier, Wiesje van der; Rueckert, Daniel

    2017-01-01

    Differentiating between different types of neurodegenerative diseases is not only crucial in clinical practice when treatment decisions have to be made, but also has a significant potential for the enrichment of clinical trials. The purpose of this study is to develop a classification framework for distinguishing the four most common neurodegenerative diseases, including Alzheimer's disease, frontotemporal lobe degeneration, Dementia with Lewy bodies and vascular dementia, as well as patients with subjective memory complaints. Different biomarkers including features from images (volume features, region-wise grading features) and non-imaging features (CSF measures) were extracted for each subject. In clinical practice, the prevalence of different dementia types is imbalanced, posing challenges for learning an effective classification model. Therefore, we propose the use of the RUSBoost algorithm in order to train classifiers and to handle the class imbalance training problem. Furthermore, a multi-class feature selection method based on sparsity is integrated into the proposed framework to improve the classification performance. It also provides a way for investigating the importance of different features and regions. Using a dataset of 500 subjects, the proposed framework achieved a high accuracy of 75.2% with a balanced accuracy of 69.3% for the five-class classification using ten-fold cross validation, which is significantly better than the results using support vector machine or random forest, demonstrating the feasibility of the proposed framework to support clinical decision making.
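
    RUSBoost (random undersampling inside AdaBoost) is available off the shelf in the imbalanced-learn package; below is a sketch on a synthetic, imbalanced five-class problem. The class proportions and sizes are invented, not the cohort's, and balanced accuracy is scored as in the study.

      from imblearn.ensemble import RUSBoostClassifier
      from sklearn.datasets import make_classification
      from sklearn.model_selection import cross_val_score

      # Imbalanced five-class stand-in for the dementia/SMC cohort
      X, y = make_classification(n_samples=500, n_classes=5, n_informative=8,
                                 weights=[0.40, 0.25, 0.15, 0.12, 0.08],
                                 random_state=0)
      clf = RUSBoostClassifier(n_estimators=100, random_state=0)
      # balanced accuracy rewards performance on the rare classes too
      print(cross_val_score(clf, X, y, cv=10, scoring="balanced_accuracy").mean())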

  17. Figure of merit for macrouniformity based on image quality ruler evaluation and machine learning framework

    NASA Astrophysics Data System (ADS)

    Wang, Weibao; Overall, Gary; Riggs, Travis; Silveston-Keith, Rebecca; Whitney, Julie; Chiu, George; Allebach, Jan P.

    2013-01-01

    Assessment of macro-uniformity is a capability that is important for the development and manufacture of printer products. Our goal is to develop a metric that will predict macro-uniformity, as judged by human subjects, by scanning and analyzing printed pages. We consider two different machine learning frameworks for the metric: linear regression and the support vector machine. We have implemented the image quality ruler, based on the recommendations of the INCITS W1.1 macro-uniformity team. Using 12 subjects at Purdue University and 20 subjects at Lexmark, evenly balanced with respect to gender, we conducted subjective evaluations with a set of 35 uniform b/w prints from seven different printers with five levels of tint coverage. Our results suggest that the image quality ruler method provides a reliable means to assess macro-uniformity. We then defined and implemented separate features to measure graininess, mottle, large area variation, jitter, and large-scale non-uniformity. The algorithms that we used are largely based on ISO image quality standards. Finally, we used these features computed for a set of test pages and the subjects' image quality ruler assessments of these pages to train the two different predictors - one based on linear regression and the other based on the support vector machine (SVM). Using five-fold cross-validation, we confirmed the efficacy of our predictor.

  18. Critical Evaluation of Human Oral Bioavailability for Pharmaceutical Drugs by Using Various Cheminformatics Approaches

    PubMed Central

    Kim, Marlene; Sedykh, Alexander; Chakravarti, Suman K.; Saiakhov, Roustem D.; Zhu, Hao

    2014-01-01

    Purpose Oral bioavailability (%F) is a key factor that determines the fate of a new drug in clinical trials. Traditionally, %F is measured using costly and time-consuming experimental tests. Developing computational models to evaluate the %F of new drugs before they are synthesized would be beneficial in the drug discovery process. Methods We employed a Combinatorial Quantitative Structure-Activity Relationship approach to develop several computational %F models. We compiled a %F dataset of 995 drugs from public sources. After generating chemical descriptors for each compound, we used random forest, support vector machine, k-nearest neighbor, and CASE Ultra to develop the relevant QSAR models. The resulting models were validated using five-fold cross-validation. Results The external predictivity of %F values was poor (R2=0.28, n=995, MAE=24), but was improved (R2=0.40, n=362, MAE=21) by filtering unreliable predictions that had a high probability of interacting with MDR1 and MRP2 transporters. Furthermore, classifying the compounds according to their %F values (%F<50% as "low", %F≥50% as "high") and developing category QSAR models resulted in an external accuracy of 76%. Conclusions In this study, we developed predictive %F QSAR models that could be used to evaluate new drug compounds, and integrating drug-transporter interaction data greatly benefits the resulting models. PMID:24306326

  19. [Determination of calcium and magnesium in tobacco by near-infrared spectroscopy and least squares-support vector machine].

    PubMed

    Tian, Kuang-da; Qiu, Kai-xian; Li, Zu-hong; Lü, Ya-qiong; Zhang, Qiu-ju; Xiong, Yan-mei; Min, Shun-geng

    2014-12-01

    The purpose of the present paper is to determine calcium and magnesium in tobacco using NIR combined with the least squares-support vector machine (LS-SVM). Five hundred ground and dried tobacco samples from Qujing city, Yunnan province, China, were scanned with a MATRIX-I spectrometer (Bruker Optics, Bremen, Germany). At the beginning of data processing, outlier samples were eliminated for stability of the model. The remaining 487 samples were divided into several calibration and validation sets according to a hybrid modeling strategy. Monte Carlo cross-validation was used to choose the best spectral preprocessing method from multiplicative scatter correction (MSC), standard normal variate transformation (SNV), S-G smoothing, 1st derivative, etc., and their combinations. To optimize the parameters of the LS-SVM model, a multilayer grid search with 10-fold cross-validation was applied. The final LS-SVM models with the optimized parameters were trained on the calibration set and assessed on 287 validation samples picked by the Kennard-Stone method. For the quantitative model of calcium in tobacco, Savitzky-Golay FIR smoothing with frame size 21 showed the best performance. The regularization parameter λ of the LS-SVM was e^16.11, while the bandwidth of the RBF kernel σ^2 was e^8.42. The determination coefficient for calibration (Rc^2) was 0.9755 and the determination coefficient for prediction (Rp^2) was 0.9422, better than the performance of the PLS model (Rc^2=0.9593, Rp^2=0.9344). For the quantitative analysis of magnesium, SNV made the regression model more precise than other preprocessing methods. The optimized λ was e^15.25 and σ^2 was e^6.32. Rc^2 and Rp^2 were 0.9961 and 0.9301, respectively, again better than the PLS model (Rc^2=0.9716, Rp^2=0.8924). After modeling, the whole process of NIR scanning and data analysis for one sample took tens of seconds. The overall results show that NIR spectroscopy combined with LS-SVM can be efficiently utilized for rapid and accurate analysis of calcium and magnesium in tobacco.
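
    Hyper-parameter selection by grid search with 10-fold cross-validation, as used for λ and σ^2 above, looks like this in scikit-learn. LS-SVM itself is not in scikit-learn; kernel ridge regression with an RBF kernel is a close relative and stands in for it here, with synthetic data in place of NIR spectra.

      import numpy as np
      from sklearn.datasets import make_regression
      from sklearn.kernel_ridge import KernelRidge
      from sklearn.model_selection import GridSearchCV

      X, y = make_regression(n_samples=400, n_features=200, noise=5, random_state=0)
      grid = GridSearchCV(KernelRidge(kernel="rbf"),
                          param_grid={"alpha": np.logspace(-6, 2, 9),   # regularization
                                      "gamma": np.logspace(-6, 0, 7)},  # RBF width
                          cv=10, scoring="r2")
      grid.fit(X, y)
      print(grid.best_params_, round(grid.best_score_, 3))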

  20. Intravoxel Incoherent Motion MR Imaging in the Differentiation of Benign and Malignant Sinonasal Lesions: Comparison with Conventional Diffusion-Weighted MR Imaging.

    PubMed

    Xiao, Z; Tang, Z; Qiang, J; Wang, S; Qian, W; Zhong, Y; Wang, R; Wang, J; Wu, L; Tang, W; Zhang, Z

    2018-01-25

    Intravoxel incoherent motion is a promising method for the differentiation of sinonasal lesions. This study aimed to evaluate the value of intravoxel incoherent motion in the differentiation of benign and malignant sinonasal lesions and to compare the diagnostic performance of intravoxel incoherent motion with that of conventional DWI. One hundred thirty-one patients with histologically proved solid sinonasal lesions (56 benign and 75 malignant) who underwent conventional DWI and intravoxel incoherent motion were recruited in this study. The diffusion coefficient (D), pseudodiffusion coefficient (D*), and perfusion fraction (f) values derived from intravoxel incoherent motion and ADC values derived from conventional DWI were measured and compared between the 2 groups using the Student t test. Receiver operating characteristic curve analysis, logistic regression analysis, and 10-fold cross-validation were performed to evaluate the diagnostic performance of single-parametric and multiparametric models. The mean ADC and D values were significantly lower in malignant sinonasal lesions than in benign sinonasal lesions (both P < .001). The mean f value was higher in malignant lesions than in benign lesions (P = .003). Multiparametric models can significantly improve the cross-validated areas under the curve for the differentiation of sinonasal lesions compared with single-parametric models (all corrected P < .05 except the D value). The model of D + f provided a better diagnostic performance than the ADC value (corrected P < .001). Intravoxel incoherent motion appears to be a more effective MR imaging technique than conventional DWI in the differentiation of benign and malignant sinonasal lesions. © 2018 by American Journal of Neuroradiology.

  1. Cation-induced folding of alginate-bearing bilayer gels: an unusual example of spontaneous folding along the long axis.

    PubMed

    Athas, Jasmin C; Nguyen, Catherine P; Kummar, Shailaa; Raghavan, Srinivasa R

    2018-04-04

    The spontaneous folding of flat gel films into tubes is an interesting example of self-assembly. Typically, a rectangular film folds along its short axis when forming a tube; folding along the long axis has been seen only in rare instances when the film is constrained. Here, we report a case where the same free-swelling gel film folds along either its long or short axis depending on the concentration of a solute. Our gels are sandwiches (bilayers) of two layers: a passive layer of cross-linked N,N'-dimethylacrylamide (DMAA) and an active layer of cross-linked DMAA that also contains chains of the biopolymer alginate. Multivalent cations like Ca2+ and Cu2+ induce these bilayer gels to fold into tubes. The folding occurs instantly when a flat film of the gel is introduced into a solution of these cations. The likely cause for folding is that the active layer stiffens and shrinks (because the alginate chains in it get cross-linked by the cations) whereas the passive layer is unaffected. The resulting mismatch in swelling degree between the two layers creates internal stresses that drive folding. Cations that are incapable of cross-linking alginate, such as Na+ and Mg2+, do not induce gel folding. Moreover, the striking aspect is the direction of folding. When the Ca2+ concentration is high (100 mM or higher), the gels fold along their long axis, whereas when the Ca2+ concentration is low (40 to 80 mM), the gels fold along their short axis. We hypothesize that the folding axis is dictated by the inhomogeneous nature of alginate-cation cross-linking, i.e., that the edges get cross-linked before the faces of the gel. At high Ca2+ concentration, the stiffer edges constrain the folding; in turn, the gel folds such that the longer edges are deformed less, which explains the folding along the long axis. At low Ca2+ concentration, the edges and the faces of the gel are more similar in their degree of cross-linking; therefore, the gel folds along its short axis. An analogy can be made to natural structures (such as leaves and seed pods) where stiff elements provide the directionality for folding.

  2. Atomistic structural ensemble refinement reveals non-native structure stabilizes a sub-millisecond folding intermediate of CheY

    DOE PAGES

    Shi, Jade; Nobrega, R. Paul; Schwantes, Christian; ...

    2017-03-08

    The dynamics of globular proteins can be described in terms of transitions between a folded native state and less-populated intermediates, or excited states, which can play critical roles in both protein folding and function. Excited states are by definition transient species, and therefore are difficult to characterize using current experimental techniques. We report an atomistic model of the excited state ensemble of a stabilized mutant of an extensively studied flavodoxin fold protein CheY. We employed a hybrid simulation and experimental approach in which an aggregate 42 milliseconds of all-atom molecular dynamics were used as an informative prior for the structure of the excited state ensemble. The resulting prior was then refined against small-angle X-ray scattering (SAXS) data employing an established method (EROS). The most striking feature of the resulting excited state ensemble was an unstructured N-terminus stabilized by non-native contacts in a conformation that is topologically simpler than the native state. We then predict incisive single molecule FRET experiments, using these results, as a means of model validation. Our study demonstrates the paradigm of uniting simulation and experiment in a statistical model to study the structure of protein excited states and rationally design validating experiments.

  4. ETHNOPRED: a novel machine learning method for accurate continental and sub-continental ancestry identification and population stratification correction

    PubMed Central

    2013-01-01

    Background Population stratification is a systematic difference in allele frequencies between subpopulations. This can lead to spurious association findings in the case–control genome wide association studies (GWASs) used to identify single nucleotide polymorphisms (SNPs) associated with disease-linked phenotypes. Methods such as self-declared ancestry, ancestry informative markers, genomic control, structured association, and principal component analysis are used to assess and correct population stratification but each has limitations. We provide an alternative technique to address population stratification. Results We propose a novel machine learning method, ETHNOPRED, which uses the genotype and ethnicity data from the HapMap project to learn ensembles of disjoint decision trees, capable of accurately predicting an individual’s continental and sub-continental ancestry. To predict an individual’s continental ancestry, ETHNOPRED produced an ensemble of 3 decision trees involving a total of 10 SNPs, with 10-fold cross validation accuracy of 100% using HapMap II dataset. We extended this model to involve 29 disjoint decision trees over 149 SNPs, and showed that this ensemble has an accuracy of ≥ 99.9%, even if some of those 149 SNP values were missing. On an independent dataset, predominantly of Caucasian origin, our continental classifier showed 96.8% accuracy and improved genomic control’s λ from 1.22 to 1.11. We next used the HapMap III dataset to learn classifiers to distinguish European subpopulations (North-Western vs. Southern), East Asian subpopulations (Chinese vs. Japanese), African subpopulations (Eastern vs. Western), North American subpopulations (European vs. Chinese vs. African vs. Mexican vs. Indian), and Kenyan subpopulations (Luhya vs. Maasai). In these cases, ETHNOPRED produced ensembles of 3, 39, 21, 11, and 25 disjoint decision trees, respectively involving 31, 502, 526, 242 and 271 SNPs, with 10-fold cross validation accuracy of 86.5% ± 2.4%, 95.6% ± 3.9%, 95.6% ± 2.1%, 98.3% ± 2.0%, and 95.9% ± 1.5%. However, ETHNOPRED was unable to produce a classifier that can accurately distinguish Chinese in Beijing vs. Chinese in Denver. Conclusions ETHNOPRED is a novel technique for producing classifiers that can identify an individual’s continental and sub-continental heritage, based on a small number of SNPs. We show that its learned classifiers are simple, cost-efficient, accurate, transparent, flexible, fast, applicable to large scale GWASs, and robust to missing values. PMID:23432980

  5. Mirage: a visible signature evaluation tool

    NASA Astrophysics Data System (ADS)

    Culpepper, Joanne B.; Meehan, Alaster J.; Shao, Q. T.; Richards, Noel

    2017-10-01

    This paper presents the Mirage visible signature evaluation tool, designed to provide a visible signature evaluation capability that will appropriately reflect the effect of scene content on the detectability of targets, providing a capability to assess visible signatures in the context of the environment. Mirage is based on a parametric evaluation of input images, assessing the value of a range of image metrics and combining them using the boosted decision tree machine learning method to produce target detectability estimates. It has been developed using experimental data from photosimulation experiments, where human observers search for vehicle targets in a variety of digital images. The images used for tool development are synthetic (computer generated) images, showing vehicles in many different scenes and exhibiting a wide variation in scene content. A preliminary validation has been performed using k-fold cross validation, where 90% of the image data set was used for training and 10% of the image data set was used for testing. The results of the k-fold validation from 200 independent tests show a prediction accuracy between Mirage predictions of detection probability and observed probability of detection of r(262) = 0.63, p < 0.0001 (Pearson correlation) and a MAE = 0.21 (mean absolute error).

  6. Chemometric brand differentiation of commercial spices using direct analysis in real time mass spectrometry.

    PubMed

    Pavlovich, Matthew J; Dunn, Emily E; Hall, Adam B

    2016-05-15

    Commercial spices represent an emerging class of fuels for improvised explosives. Being able to classify such spices not only by type but also by brand would represent an important step in developing methods to analytically investigate these explosive compositions. Therefore, a combined ambient mass spectrometric/chemometric approach was developed to quickly and accurately classify commercial spices by brand. Direct analysis in real time mass spectrometry (DART-MS) was used to generate mass spectra for samples of black pepper, cayenne pepper, and turmeric, along with four different brands of cinnamon, all dissolved in methanol. Unsupervised learning techniques showed that the cinnamon samples clustered according to brand. Then, we used supervised machine learning algorithms to build chemometric models with a known training set and classified the brands of an unknown testing set of cinnamon samples. Ten independent runs of five-fold cross-validation showed that the training set error for the best-performing models (i.e., the linear discriminant and neural network models) was lower than 2%. The false-positive percentages for these models were 3% or lower, and the false-negative percentages were lower than 10%. In particular, the linear discriminant model perfectly classified the testing set with 0% error. Repeated iterations of training and testing gave similar results, demonstrating the reproducibility of these models. Chemometric models were able to classify the DART mass spectra of commercial cinnamon samples according to brand, with high specificity and low classification error. This method could easily be generalized to other classes of spices, and it could be applied to authenticating questioned commercial samples of spices or to examining evidence from improvised explosives. Copyright © 2016 John Wiley & Sons, Ltd.

  7. Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning.

    PubMed

    Shin, Hoo-Chang; Roth, Holger R; Gao, Mingchen; Lu, Le; Xu, Ziyue; Nogues, Isabella; Yao, Jianhua; Mollura, Daniel; Summers, Ronald M

    2016-05-01

    Remarkable progress has been made in image recognition, primarily due to the availability of large-scale annotated datasets and deep convolutional neural networks (CNNs). CNNs enable learning data-driven, highly representative, hierarchical image features from sufficient training data. However, obtaining datasets as comprehensively annotated as ImageNet in the medical imaging domain remains a challenge. There are currently three major techniques that successfully employ CNNs to medical image classification: training the CNN from scratch, using off-the-shelf pre-trained CNN features, and conducting unsupervised CNN pre-training with supervised fine-tuning. Another effective method is transfer learning, i.e., fine-tuning CNN models pre-trained from natural image dataset to medical image tasks. In this paper, we exploit three important, but previously understudied factors of employing deep convolutional neural networks to computer-aided detection problems. We first explore and evaluate different CNN architectures. The studied models contain 5 thousand to 160 million parameters, and vary in numbers of layers. We then evaluate the influence of dataset scale and spatial image context on performance. Finally, we examine when and why transfer learning from pre-trained ImageNet (via fine-tuning) can be useful. We study two specific computer-aided detection (CADe) problems, namely thoraco-abdominal lymph node (LN) detection and interstitial lung disease (ILD) classification. We achieve the state-of-the-art performance on the mediastinal LN detection, and report the first five-fold cross-validation classification results on predicting axial CT slices with ILD categories. Our extensive empirical evaluation, CNN model analysis and valuable insights can be extended to the design of high performance CAD systems for other medical imaging tasks.
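
    The transfer-learning route examined here, fine-tuning an ImageNet-pre-trained CNN on a medical task, takes only a few lines in PyTorch/torchvision; freezing the convolutional features and the two-class head below are illustrative choices, not the paper's exact regime.

      import torch.nn as nn
      from torchvision import models

      # Start from ImageNet weights, freeze the convolutional features, and
      # replace the final classifier layer for the target medical task.
      model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
      for p in model.features.parameters():
          p.requires_grad = False
      model.classifier[6] = nn.Linear(4096, 2)  # e.g., lymph node present/absent
      # ...then train the classifier head with a standard cross-entropy loop.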

  8. Harnessing Computational Biology for Exact Linear B-Cell Epitope Prediction: A Novel Amino Acid Composition-Based Feature Descriptor.

    PubMed

    Saravanan, Vijayakumar; Gautham, Namasivayam

    2015-10-01

    Proteins embody epitopes that serve as their antigenic determinants. Epitopes occupy a central place in integrative biology, not to mention as targets for novel vaccine, pharmaceutical, and systems diagnostics development. The presence of T-cell and B-cell epitopes has been extensively studied due to their potential in synthetic vaccine design. However, reliable prediction of linear B-cell epitopes remains a formidable challenge. Earlier studies have reported a discrepancy in amino acid composition between epitopes and non-epitopes. Hence, this study proposed and developed a novel amino acid composition-based feature descriptor, Dipeptide Deviation from Expected Mean (DDE), to distinguish linear B-cell epitopes from non-epitopes effectively. In this study, for the first time, only exact linear B-cell epitopes and non-epitopes were utilized for developing the prediction method, unlike the use of epitope-containing regions in earlier reports. To evaluate the performance of the DDE feature vector, models were developed with two widely used machine-learning techniques, Support Vector Machine and AdaBoost-Random Forest. Five-fold cross-validation performance of the proposed method with the error-free dataset and datasets from other studies achieved an overall accuracy between nearly 61% and 73%, with a balance between sensitivity and specificity. Performance of the DDE feature vector was better (with an accuracy difference of about 2% to 12%) than other amino acid-derived features on different datasets. This study reflects the efficiency of the DDE feature vector in enhancing linear B-cell epitope prediction performance compared to other feature representations. The proposed method is made available as a free stand-alone tool for researchers, particularly those interested in vaccine design and novel molecular target development for systems therapeutics and diagnostics: https://github.com/brsaran/LBEEP.
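
    A reconstruction of the DDE descriptor as defined in the feature-extraction literature: each observed dipeptide fraction is standardized by its expected mean and variance under codon usage (61 sense codons). The codon counts and standardization below follow the published definition as best understood and should be checked against the authors' tool before serious use.

      import itertools
      import numpy as np

      AA = "ACDEFGHIKLMNPQRSTVWY"
      CODONS = {"A": 4, "C": 2, "D": 2, "E": 2, "F": 2, "G": 4, "H": 2, "I": 3,
                "K": 2, "L": 6, "M": 1, "N": 2, "P": 4, "Q": 2, "R": 6, "S": 6,
                "T": 4, "V": 4, "W": 1, "Y": 2}   # sense codons per amino acid

      def dde(seq):
          """Dipeptide Deviation from Expected mean: 400-dim feature vector."""
          pairs = ["".join(p) for p in itertools.product(AA, repeat=2)]
          n = len(seq) - 1                                   # number of dipeptides
          dc = np.array([sum(seq[i:i + 2] == p for i in range(n)) / n
                         for p in pairs])                    # observed fractions
          tm = np.array([CODONS[p[0]] / 61 * CODONS[p[1]] / 61 for p in pairs])
          tv = tm * (1 - tm) / n                             # theoretical variance
          return (dc - tm) / np.sqrt(tv)

      print(dde("MKLVINGKTLKGEITVEGAK")[:5])  # first five of 400 components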

  9. Differential gene expression between African American and European American colorectal cancer patients.

    PubMed

    Jovov, Biljana; Araujo-Perez, Felix; Sigel, Carlie S; Stratford, Jeran K; McCoy, Amber N; Yeh, Jen Jen; Keku, Temitope

    2012-01-01

    The incidence and mortality of colorectal cancer (CRC) are higher in African Americans (AAs) than in other ethnic groups in the U.S., but the reasons for the disparities are unknown. We performed gene expression profiling of sporadic CRCs from AAs vs. European Americans (EAs) to assess the contribution to CRC disparities. We evaluated the gene expression of 43 AA and 43 EA CRC tumors matched by stage and 40 matching normal colorectal tissues using the Agilent human whole genome 4x44K cDNA arrays. Gene and pathway analyses were performed using Significance Analysis of Microarrays (SAM), ten-fold cross-validation, and Ingenuity Pathway Analysis (IPA). SAM revealed that 95 genes were differentially expressed between AA and EA patients at a false discovery rate of ≤5%. Using IPA we determined that the most prominent disease and pathway associations of the differentially expressed genes were related to inflammation and immune response. Ten-fold cross-validation demonstrated that the following 10 genes can predict ethnicity with an accuracy of 94%: CRYBB2, PSPH, ADAL, VSIG10L, C17orf81, ANKRD36B, ZNF835, ARHGAP6, TRNT1 and WDR8. Expression of these 10 genes was validated by qRT-PCR in an independent test set of 28 patients (10 AA, 18 EA). Our results are the first to implicate differential gene expression in CRC racial disparities and indicate a prominent difference in CRC inflammation between AA and EA patients. Differences in susceptibility to inflammation support the existence of distinct tumor microenvironments in these two patient populations.

  11. Mirnacle: machine learning with SMOTE and random forest for improving selectivity in pre-miRNA ab initio prediction.

    PubMed

    Marques, Yuri Bento; de Paiva Oliveira, Alcione; Ribeiro Vasconcelos, Ana Tereza; Cerqueira, Fabio Ribeiro

    2016-12-15

    MicroRNAs (miRNAs) are key gene expression regulators in plants and animals. miRNAs are involved in several biological processes, making the study of these molecules one of the most relevant topics of molecular biology today. However, characterizing miRNAs in vivo is still a complex task. As a consequence, in silico methods have been developed to predict miRNA loci. A common ab initio strategy to find miRNAs in genomic data is to search for sequences that can fold into the typical hairpin structure of miRNA precursors (pre-miRNAs). The current ab initio approaches, however, have selectivity issues, i.e., a high number of false positives is reported, which can lead to laborious and costly attempts at biological validation. This study presents an extension of the ab initio method miRNAFold, with the aim of improving selectivity through machine learning techniques, namely, random forest combined with the SMOTE procedure for coping with imbalanced datasets. By comparing our method, termed Mirnacle, with other important approaches in the literature, we demonstrate that Mirnacle substantially improves selectivity without compromising sensitivity. For the three datasets used in our experiments, our method achieved at least 97% sensitivity and delivered two-fold, 20-fold, and 6-fold increases in selectivity, respectively, compared with the best results of current computational tools. The extension of miRNAFold with machine learning techniques significantly increases selectivity in pre-miRNA ab initio prediction, which contributes to advanced studies on miRNAs, as the need for biological validation is diminished. New research, such as studies of severe diseases caused by miRNA malfunction, will hopefully benefit from the proposed computational tool.
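    The core recipe — SMOTE over-sampling of the minority class feeding a random forest — is easy to reproduce with imbalanced-learn; the sketch below uses synthetic data in place of the hairpin feature vectors, and wrapping SMOTE in a pipeline ensures it is applied only inside each training fold.

    ```python
    from imblearn.over_sampling import SMOTE
    from imblearn.pipeline import Pipeline
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    # Synthetic imbalanced data standing in for pre-miRNA candidate features:
    # few true pre-miRNAs, many hairpin-like false candidates.
    X, y = make_classification(n_samples=2000, n_features=30, weights=[0.95, 0.05],
                               random_state=42)

    # Over-sample the minority class with SMOTE inside each fold, then classify.
    model = Pipeline([("smote", SMOTE(random_state=42)),
                      ("rf", RandomForestClassifier(n_estimators=200, random_state=42))])

    # Precision on the positive class reflects selectivity (fewer false positives).
    scores = cross_val_score(model, X, y, cv=5, scoring="precision")
    print(f"5-fold CV precision: {scores.mean():.2f}")
    ```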

  12. Using nanoinformatics methods for automatically identifying relevant nanotoxicology entities from the literature.

    PubMed

    García-Remesal, Miguel; García-Ruiz, Alejandro; Pérez-Rey, David; de la Iglesia, Diana; Maojo, Víctor

    2013-01-01

    Nanoinformatics is an emerging research field that uses informatics techniques to collect, process, store, and retrieve data, information, and knowledge on nanoparticles, nanomaterials, and nanodevices and their potential applications in health care. In this paper, we have focused on the solutions that nanoinformatics can provide to facilitate nanotoxicology research. For this, we have taken a computational approach to automatically recognize and extract nanotoxicology-related entities from the scientific literature. The desired entities belong to four different categories: nanoparticles, routes of exposure, toxic effects, and targets. The entity recognizer was trained using a corpus that we specifically created for this purpose and was validated by two nanomedicine/nanotoxicology experts. We evaluated the performance of our entity recognizer using 10-fold cross-validation. The precisions range from 87.6% (targets) to 93.0% (routes of exposure), while recall values range from 82.6% (routes of exposure) to 87.4% (toxic effects). These results prove the feasibility of using computational approaches to reliably perform different named entity recognition (NER)-dependent tasks, such as for instance augmented reading or semantic searches. This research is a "proof of concept" that can be expanded to stimulate further developments that could assist researchers in managing data, information, and knowledge at the nanolevel, thus accelerating research in nanomedicine.

  13. Genomic Prediction Accounting for Residual Heteroskedasticity.

    PubMed

    Ou, Zhining; Tempelman, Robert J; Steibel, Juan P; Ernst, Catherine W; Bates, Ronald O; Bello, Nora M

    2015-11-12

    Whole-genome prediction (WGP) models that use single-nucleotide polymorphism marker information to predict genetic merit of animals and plants typically assume homogeneous residual variance. However, variability is often heterogeneous across agricultural production systems and may subsequently bias WGP-based inferences. This study extends classical WGP models based on normality, heavy-tailed specifications, and variable selection to explicitly account for environmentally driven residual heteroskedasticity under a hierarchical Bayesian mixed-models framework. WGP models assuming homogeneous or heterogeneous residual variances were fitted to training data generated under simulation scenarios reflecting a gradient of increasing heteroskedasticity. Model fit was based on pseudo-Bayes factors and also on prediction accuracy of genomic breeding values computed on a validation data subset one generation removed from the simulated training dataset. Homogeneous vs. heterogeneous residual variance WGP models were also fitted to two quantitative traits, namely 45-min postmortem carcass temperature and loin muscle pH, recorded in a swine resource population dataset prescreened for high and mild residual heteroskedasticity, respectively. Fit of competing WGP models was compared using pseudo-Bayes factors. Predictive ability, defined as the correlation between predicted and observed phenotypes in the validation sets of a five-fold cross-validation, was also computed. Heteroskedastic error WGP models showed improved model fit and enhanced prediction accuracy compared to homoskedastic error WGP models, although the magnitude of the improvement was small (less than a two percentage point net gain in prediction accuracy). Nevertheless, accounting for residual heteroskedasticity did improve accuracy of selection, especially for individuals of extreme genetic merit. Copyright © 2016 Ou et al.
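    The "predictive ability" criterion used here — correlating predicted with observed phenotypes over cross-validation folds — is simple to compute. A sketch with ridge regression standing in for the Bayesian WGP model and simulated marker data:

    ```python
    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import KFold

    rng = np.random.default_rng(1)
    # Simulated SNP genotypes (0/1/2) and phenotypes: 500 animals, 1000 markers.
    X = rng.integers(0, 3, size=(500, 1000)).astype(float)
    beta = rng.normal(0, 0.05, size=1000)
    y = X @ beta + rng.normal(0, 1.0, size=500)

    # Five-fold cross-validation: collect out-of-fold predictions.
    preds = np.empty_like(y)
    for train, test in KFold(n_splits=5, shuffle=True, random_state=1).split(X):
        model = Ridge(alpha=100.0).fit(X[train], y[train])
        preds[test] = model.predict(X[test])

    # Predictive ability = correlation between predicted and observed phenotypes.
    print(f"predictive ability r = {np.corrcoef(preds, y)[0, 1]:.2f}")
    ```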

  14. A diagnostic technique used to obtain cross range radiation centers from antenna patterns

    NASA Technical Reports Server (NTRS)

    Lee, T. H.; Burnside, W. D.

    1988-01-01

    A diagnostic technique to obtain cross range radiation centers from antenna radiation patterns is presented. The method is similar to the synthetic aperture processing of scattered fields in radar applications. Coherent processing of the radiated fields is used to determine the various radiation centers associated with the far-zone pattern of an antenna for a given radiation direction. The technique can be used to identify an unexpected radiation center that creates an undesired effect in a pattern; on the other hand, it can improve a numerical simulation of the pattern by identifying other significant mechanisms. Cross range results for two 8-foot reflector antennas are presented to illustrate as well as validate the technique.

  15. Prostate cancer detection using machine learning techniques by employing combination of features extracting strategies.

    PubMed

    Hussain, Lal; Ahmed, Adeel; Saeed, Sharjil; Rathore, Saima; Awan, Imtiaz Ahmed; Shah, Saeed Arif; Majid, Abdul; Idris, Adnan; Awan, Anees Ahmed

    2018-02-06

    Prostate cancer is the second leading cause of cancer deaths among men. Early detection can effectively reduce the mortality caused by prostate cancer. The high resolution and multiresolution nature of prostate cancer MRI requires proper diagnostic systems and tools. In the past, researchers developed computer-aided diagnosis (CAD) systems that help the radiologist detect abnormalities. In this research paper, we employed machine learning techniques, namely a Bayesian approach, Support Vector Machine (SVM) kernels (polynomial, radial basis function (RBF) and Gaussian), and Decision Trees, for detecting prostate cancer. Moreover, different feature extraction strategies are proposed to improve the detection performance. The feature extraction strategies are based on texture, morphological, scale invariant feature transform (SIFT), and elliptic Fourier descriptor (EFDs) features. Performance was evaluated on single features as well as combinations of features using machine learning classification techniques. Cross-validation (jackknife k-fold) was performed, and performance was evaluated in terms of the receiver operating characteristic (ROC) curve, specificity, sensitivity, positive predictive value (PPV), negative predictive value (NPV), and false positive rate (FPR). Based on single feature extraction strategies, the SVM Gaussian kernel gives the highest accuracy of 98.34% with an AUC of 0.999. Using combinations of feature extraction strategies, the SVM Gaussian kernel with texture + morphological and EFDs + morphological features gives the highest accuracy of 99.71% and an AUC of 1.00.
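    Comparing SVM kernels under cross-validation, as done here, maps onto a short scikit-learn loop; the features below are random placeholders rather than the paper's texture, SIFT, or EFD descriptors (note that scikit-learn's 'rbf' kernel is the Gaussian kernel).

    ```python
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Placeholder feature vectors for a binary cancer/non-cancer task.
    X, y = make_classification(n_samples=400, n_features=50, random_state=7)

    # Cross-validated AUC for several SVM kernels ('rbf' = Gaussian).
    for kernel in ("poly", "rbf", "sigmoid"):
        clf = make_pipeline(StandardScaler(), SVC(kernel=kernel))
        auc = cross_val_score(clf, X, y, cv=10, scoring="roc_auc")
        print(f"{kernel:7s} mean AUC = {auc.mean():.3f}")
    ```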

  16. Hierarchical Recognition Scheme for Human Facial Expression Recognition Systems

    PubMed Central

    Siddiqi, Muhammad Hameed; Lee, Sungyoung; Lee, Young-Koo; Khan, Adil Mehmood; Truc, Phan Tran Ho

    2013-01-01

    Over the last decade, human facial expression recognition (FER) has emerged as an important research area. Several factors make FER a challenging research problem. These include varying light conditions in training and test images; the need for automatic and accurate face detection before feature extraction; and high similarity among different expressions that makes it difficult to distinguish these expressions with high accuracy. This work implements a hierarchical linear discriminant analysis-based facial expression recognition (HL-FER) system to tackle these problems. Unlike previous systems, the HL-FER uses a pre-processing step to eliminate light effects, incorporates a new automatic face detection scheme, employs methods to extract both global and local features, and utilizes hierarchical classification to overcome the problem of high similarity among different expressions. Unlike most previous works that were evaluated using a single dataset, the performance of the HL-FER is assessed using three publicly available datasets under three different experimental settings: n-fold cross-validation based on subjects for each dataset separately; n-fold cross-validation across datasets; and, finally, a last set of experiments to assess the effectiveness of each module of the HL-FER separately. A weighted average recognition accuracy of 98.7% across the three datasets, using three classifiers, indicates the success of employing the HL-FER for human FER. PMID:24316568

  17. Using support vector machine to predict beta- and gamma-turns in proteins.

    PubMed

    Hu, Xiuzhen; Li, Qianzhong

    2008-09-01

    By using a composite vector with increment of diversity, a position conservation scoring function, and predicted secondary structure to express the sequence information, a support vector machine (SVM) algorithm for predicting beta- and gamma-turns in proteins is proposed. The 426 and 320 nonhomologous protein chains described by Guruprasad and Rajkumar (Guruprasad and Rajkumar, J. Biosci. 2000, 25, 143) are used for training and testing the predictive models of the beta- and gamma-turns, respectively. The overall prediction accuracy and the Matthews correlation coefficient in 7-fold cross-validation are 79.8% and 0.47, respectively, for the beta-turns. The overall prediction accuracy in 5-fold cross-validation is 61.0% for the gamma-turns. These results are significantly higher than those of other algorithms for the prediction of beta- and gamma-turns on the same datasets. In addition, the 547 and 823 nonhomologous protein chains described by Fuchs and Alix (Fuchs and Alix, Proteins: Struct Funct Bioinform 2005, 59, 828) are used for training and testing the predictive models of the beta- and gamma-turns, and better results are obtained. This algorithm may help to improve the performance of protein turn prediction. To assess the ability of the SVM method to correctly classify beta-turns vs. non-beta-turns (and gamma-turns vs. non-gamma-turns), threshold-independent receiver operating characteristic curves are provided. (c) 2008 Wiley Periodicals, Inc.
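    For reference, the Matthews correlation coefficient reported alongside accuracy is computed directly from confusion-matrix counts; in this generic sketch the counts are chosen to roughly reproduce the paper's beta-turn figures (accuracy ≈ 80%, MCC ≈ 0.47).

    ```python
    import math

    def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
        """Matthews correlation coefficient from confusion-matrix counts."""
        denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        return (tp * tn - fp * fn) / denom if denom else 0.0

    # An illustrative turn/non-turn split: accuracy = 1600/2000 = 0.80, MCC ~ 0.47.
    print(f"MCC = {mcc(tp=300, tn=1300, fp=200, fn=200):.2f}")
    ```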

  18. Short communication: Variations in major mineral contents of Mediterranean buffalo milk and application of Fourier-transform infrared spectroscopy for their prediction.

    PubMed

    Stocco, G; Cipolat-Gotet, C; Bonfatti, V; Schiavon, S; Bittante, G; Cecchinato, A

    2016-11-01

    The aims of this study were (1) to assess variability in the major mineral components of buffalo milk, (2) to estimate the effect of certain environmental sources of variation on the major minerals during lactation, and (3) to investigate the possibility of using Fourier-transform infrared (FTIR) spectroscopy as an indirect, noninvasive tool for routine prediction of the mineral content of buffalo milk. A total of 173 buffaloes reared in 5 herds were sampled once during the morning milking. Milk samples were analyzed for Ca, P, K, and Mg contents within 3 h of sample collection using inductively coupled plasma optical emission spectrometry. A Milkoscan FT2 (Foss, Hillerød, Denmark) was used to acquire milk spectra over the spectral range from 5,000 to 900 cm⁻¹. Prediction models were built using a partial least squares approach, and cross-validation was used to assess the prediction accuracy of FTIR. Prediction models were validated using 4-fold random cross-validation, dividing the calibration-test set into 4 folds, using one of them to check the prediction models and the remaining 3 to develop the calibration models. Buffalo milk minerals averaged 162, 117, 86, and 14.4 mg/dL of milk for Ca, P, K, and Mg, respectively. Herd and days in milk were the most important sources of variation in the traits investigated. Parity slightly affected only Ca content. Coefficients of determination of cross-validation between the FTIR-predicted and the measured values were 0.71, 0.70, and 0.72 for Ca, Mg, and P, respectively, whereas prediction accuracy was lower for K (0.55). Our findings reveal FTIR to be an unsuitable tool when milk mineral content needs to be predicted with high accuracy. Predictions may nevertheless play a role as indicator traits in selective breeding (if the additive genetic correlation between FTIR predictions and measured milk minerals is high enough) or in monitoring the milk of buffalo populations for dairy industry purposes. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
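    The modeling step described — partial least squares regression validated by 4-fold random cross-validation — can be sketched as follows, with simulated spectra standing in for the Milkoscan data:

    ```python
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(3)
    # Simulated stand-ins: 173 milk samples x 1060 spectral points; Ca content target.
    spectra = rng.normal(size=(173, 1060))
    ca = spectra[:, :50].sum(axis=1) * 0.5 + rng.normal(0, 1, 173) + 162

    # 4-fold cross-validated PLS predictions, then R^2 of prediction vs. measurement.
    pred = cross_val_predict(PLSRegression(n_components=10), spectra, ca, cv=4).ravel()
    r2 = np.corrcoef(pred, ca)[0, 1] ** 2
    print(f"4-fold CV R^2 for Ca: {r2:.2f}")
    ```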

  19. Exploring Mouse Protein Function via Multiple Approaches.

    PubMed

    Huang, Guohua; Chu, Chen; Huang, Tao; Kong, Xiangyin; Zhang, Yunhua; Zhang, Ning; Cai, Yu-Dong

    2016-01-01

    Although the number of available protein sequences is growing exponentially, functional protein annotations lag far behind. Therefore, accurate identification of protein functions remains one of the major challenges in molecular biology. In this study, we presented a novel approach to predict mouse protein functions. The approach was a sequential combination of a similarity-based approach, an interaction-based approach and a pseudo amino acid composition-based approach. The method achieved an accuracy of about 0.8450 for the 1st-order predictions in the leave-one-out and ten-fold cross-validations. For the results yielded by the leave-one-out cross-validation, although the similarity-based approach alone achieved an accuracy of 0.8756, it was unable to predict the functions of proteins with no homologues. Comparatively, the pseudo amino acid composition-based approach alone reached an accuracy of 0.6786. Although this accuracy was lower than that of the similarity-based approach, it could predict the functions of almost all proteins, even proteins with no homologues. Therefore, the combined method balanced the advantages and disadvantages of both approaches to achieve efficient performance. Furthermore, the results yielded by the ten-fold cross-validation indicate that the combined method is still effective and stable when no close homologs are available. However, the accuracy of the predicted functions can only be determined against known protein functions based on current knowledge. Many protein functions remain unknown. By exploring the functions of proteins for which the 1st-order predicted functions are wrong but the 2nd-order predicted functions are correct, the 1st-order wrongly predicted functions were shown to be closely associated with the genes encoding the proteins. The so-called wrongly predicted functions could also potentially prove correct upon future experimental verification. Therefore, the accuracy of the presented method may be much higher in reality.

  1. ¹⁸F-Fluorodeoxyglucose Positron Emission Tomography Can Quantify and Predict Esophageal Injury During Radiation Therapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Niedzielski, Joshua S., E-mail: jsniedzielski@mdanderson.org; University of Texas Houston Graduate School of Biomedical Science, Houston, Texas; Yang, Jinzhong

    Purpose: We sought to investigate the ability of mid-treatment ¹⁸F-fluorodeoxyglucose positron emission tomography (PET) studies to objectively and spatially quantify esophageal injury in vivo from radiation therapy for non-small cell lung cancer. Methods and Materials: This retrospective study was approved by the local institutional review board, with written informed consent obtained before enrollment. We normalized ¹⁸F-fluorodeoxyglucose PET uptake to each patient's low-irradiated region (<5 Gy) of the esophagus, as a radiation response measure. Spatially localized metrics of normalized uptake (normalized standard uptake value [nSUV]) were derived for 79 patients undergoing concurrent chemoradiation therapy for non-small cell lung cancer. We used nSUV metrics to classify esophagitis grade at the time of the PET study, as well as maximum severity by treatment completion, according to National Cancer Institute Common Terminology Criteria for Adverse Events, using multivariate least absolute shrinkage and selection operator (LASSO) logistic regression and repeated 3-fold cross validation (training, validation, and test folds). This 3-fold cross-validation LASSO model procedure was used to predict toxicity progression from 43 asymptomatic patients during the PET study. Dose-volume metrics were also tested in both the multivariate classification and the symptom progression prediction analyses. Classification performance was quantified with the area under the curve (AUC) from receiver operating characteristic analysis on the test set from the 3-fold analyses. Results: Statistical analysis showed increasing nSUV is related to esophagitis severity. Axial-averaged maximum nSUV for 1 esophageal slice and esophageal length with at least 40% of axial-averaged nSUV both had AUCs of 0.85 for classifying grade 2 or higher esophagitis at the time of the PET study and AUCs of 0.91 and 0.92, respectively, for maximum grade 2 or higher by treatment completion. Symptom progression was predicted with an AUC of 0.75. Dose metrics performed poorly at classifying esophagitis (AUC of 0.52, grade 2 or higher mid treatment) or predicting symptom progression (AUC of 0.67). Conclusions: Normalized uptake can objectively, locally, and noninvasively quantify esophagitis during radiation therapy and predict eventual symptoms from asymptomatic patients. Normalized uptake may provide patient-specific dose-response information not discernible from dose.
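    The model-building step described in the methods — L1-penalized (LASSO) logistic regression scored by AUC over repeated 3-fold cross-validation — can be sketched with standard tools; the features below are synthetic placeholders for the spatially localized nSUV metrics, not the study's data.

    ```python
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Placeholder data standing in for spatially localized nSUV metrics.
    X, y = make_classification(n_samples=79, n_features=20, n_informative=4,
                               random_state=0)

    # L1-penalized (LASSO) logistic regression, repeated 3-fold CV, AUC on test folds.
    lasso = make_pipeline(StandardScaler(),
                          LogisticRegression(penalty="l1", solver="liblinear", C=0.5))
    cv = RepeatedStratifiedKFold(n_splits=3, n_repeats=10, random_state=0)
    auc = cross_val_score(lasso, X, y, cv=cv, scoring="roc_auc")
    print(f"repeated 3-fold CV AUC: {auc.mean():.2f}")
    ```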

  3. A Machine Learning and Cross-Validation Approach for the Discrimination of Vegetation Physiognomic Types Using Satellite Based Multispectral and Multitemporal Data.

    PubMed

    Sharma, Ram C; Hara, Keitarou; Hirayama, Hidetake

    2017-01-01

    This paper presents the performance and evaluation of a number of machine learning classifiers for the discrimination between vegetation physiognomic classes using satellite-based time-series of surface reflectance data. The research dealt with the discrimination of six vegetation physiognomic classes: Evergreen Coniferous Forest, Evergreen Broadleaf Forest, Deciduous Coniferous Forest, Deciduous Broadleaf Forest, Shrubs, and Herbs. Rich-feature data were prepared from the time-series of satellite data for the discrimination and cross-validation of the vegetation physiognomic types using a machine learning approach. A set of machine learning experiments, comprising a number of supervised classifiers with different model parameters, was conducted to assess how the discrimination of vegetation physiognomic classes varies with classifiers, input features, and ground truth data size. The performance of each experiment was evaluated using the 10-fold cross-validation method. The experiment using the Random Forests classifier provided the highest overall accuracy (0.81) and kappa coefficient (0.78). However, accuracy metrics did not vary much between experiments. Accuracy metrics were found to be very sensitive to the input features and the size of the ground truth data. The results obtained in the research are expected to be useful for improving vegetation physiognomic mapping in Japan.
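    Overall accuracy and the kappa coefficient under 10-fold cross-validation can be obtained together from out-of-fold predictions; in this sketch, random features stand in for the multitemporal reflectance stack.

    ```python
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score, cohen_kappa_score
    from sklearn.model_selection import cross_val_predict

    # Six classes standing in for the six vegetation physiognomic types.
    X, y = make_classification(n_samples=1200, n_features=46, n_informative=12,
                               n_classes=6, random_state=5)

    # Out-of-fold predictions from 10-fold cross-validation.
    pred = cross_val_predict(RandomForestClassifier(random_state=5), X, y, cv=10)
    print(f"overall accuracy:  {accuracy_score(y, pred):.2f}")
    print(f"kappa coefficient: {cohen_kappa_score(y, pred):.2f}")
    ```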

  4. Assessing the Potential of Folded Globular Polyproteins As Hydrogel Building Blocks

    PubMed Central

    2016-01-01

    The native states of proteins generally have stable well-defined folded structures endowing these biomolecules with specific functionality and molecular recognition abilities. Here we explore the potential of using folded globular polyproteins as building blocks for hydrogels. Photochemically cross-linked hydrogels were produced from polyproteins containing either five domains of I27 ((I27)5), protein L ((pL)5), or a 1:1 blend of these proteins. SAXS analysis showed that (I27)5 exists as a single rod-like structure, while (pL)5 shows signatures of self-aggregation in solution. SANS measurements showed that both polyprotein hydrogels have a similar nanoscopic structure, with protein L hydrogels being formed from smaller and more compact clusters. The polyprotein hydrogels showed small energy dissipation in a load/unload cycle, which significantly increased when the hydrogels were formed in the unfolded state. This study demonstrates the use of folded proteins as building blocks in hydrogels, and highlights the potential versatility that can be offered in tuning the mechanical, structural, and functional properties of polyproteins. PMID:28006103

  5. Modeling river total bed material load discharge using artificial intelligence approaches (based on conceptual inputs)

    NASA Astrophysics Data System (ADS)

    Roushangar, Kiyoumars; Mehrabani, Fatemeh Vojoudi; Shiri, Jalal

    2014-06-01

    This study presents Artificial Intelligence (AI)-based modeling of total bed material load, aimed at improving on the prediction accuracy of traditional models. Gene expression programming (GEP) and adaptive neuro-fuzzy inference system (ANFIS)-based models were developed and validated for the estimations. Sediment data from the Qotur River (Northwestern Iran) were used for developing and validating the applied techniques. In order to assess the applied techniques in relation to traditional models, stream power-based and shear stress-based physical models were also applied to the studied case. The obtained results reveal that the developed AI-based models, using a minimum number of dominant factors, give more accurate results than the other applied models. It was also revealed that the k-fold test is a practical but computationally costly technique for completely scanning the applied data and avoiding over-fitting.

  6. KINKFOLD—an AutoLISP program for construction of geological cross-sections using borehole image data

    NASA Astrophysics Data System (ADS)

    Özkaya, Sait Ismail

    2002-04-01

    KINKFOLD is an AutoLISP program designed to construct geological cross-sections from borehole image or dip meter logs. The program uses the kink-fold method for cross-section construction. Beds are folded around hinge lines as angle bisectors so that bedding thickness remains unchanged. KINKFOLD may be used to model a wide variety of parallel fold structures, including overturned and faulted folds, and folds truncated by unconformities. The program accepts data from vertical or inclined boreholes. The KINKFOLD program cannot be used to model fault drag, growth folds, inversion structures or disharmonic folds where the bed thickness changes either because of deformation or deposition. Faulted structures and similar folds can be modelled by KINKFOLD by omitting dip measurements within fault drag zones and near axial planes of similar folds.

  7. Nomograms for predicting graft function and survival in living donor kidney transplantation based on the UNOS Registry.

    PubMed

    Tiong, H Y; Goldfarb, D A; Kattan, M W; Alster, J M; Thuita, L; Yu, C; Wee, A; Poggio, E D

    2009-03-01

    We developed nomograms that predict transplant renal function at 1 year (Modification of Diet in Renal Disease equation [estimated glomerular filtration rate]) and 5-year graft survival after living donor kidney transplantation. Data for living donor renal transplants were obtained from the United Network for Organ Sharing registry for 2000 to 2003. Nomograms were designed using linear or Cox regression models to predict 1-year estimated glomerular filtration rate and 5-year graft survival based on pretransplant information including demographic factors, immunosuppressive therapy, immunological factors and organ procurement technique. A third nomogram was constructed to predict 5-year graft survival using additional information available by 6 months after transplantation. These data included delayed graft function, any treated rejection episodes and the 6-month estimated glomerular filtration rate. The nomograms were internally validated using 10-fold cross-validation. The renal function nomogram had an r-square value of 0.13. It worked best when predicting estimated glomerular filtration rate values between 50 and 70 ml per minute per 1.73 m². The 5-year graft survival nomograms had a concordance index of 0.71 for the pretransplant nomogram and 0.78 for the 6-month posttransplant nomogram. Calibration was adequate for all nomograms. Nomograms based on data from the United Network for Organ Sharing registry have been validated to predict the 1-year estimated glomerular filtration rate and 5-year graft survival. These nomograms may facilitate individualized patient care in living donor kidney transplantation.

  8. Developing and validating predictive decision tree models from mining chemical structural fingerprints and high-throughput screening data in PubChem.

    PubMed

    Han, Lianyi; Wang, Yanli; Bryant, Stephen H

    2008-09-25

    Recent advances in high-throughput screening (HTS) techniques and readily available compound libraries generated using combinatorial chemistry or derived from natural products enable the testing of millions of compounds in a matter of days. Due to the amount of information produced by HTS assays, it is a very challenging task to mine the HTS data for potential interest in drug development research. Computational approaches for the analysis of HTS results face great challenges due to the large quantity of information and the significant amount of erroneous data produced. In this study, Decision Tree (DT) based models were developed to discriminate compound bioactivities by using their chemical structure fingerprints provided in the PubChem system (http://pubchem.ncbi.nlm.nih.gov). The DT models were examined for filtering biological activity data contained in four assays deposited in the PubChem BioAssay Database, including assays tested for 5HT1a agonists, antagonists, and HIV-1 RT-RNase H inhibitors. The 10-fold Cross Validation (CV) sensitivity, specificity and Matthews Correlation Coefficient (MCC) for the models are 57.2-80.5%, 97.3-99.0%, and 0.4-0.5, respectively. A further evaluation was also performed for DT models built for two independent bioassays, where inhibitors for the same HIV RNase target were screened using different compound libraries; this experiment yields enrichment factors of 4.4 and 9.7. Our results suggest that the designed DT models can be used as a virtual screening technique as well as a complement to traditional approaches for hit selection.
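    The enrichment factor used in the virtual-screening evaluation has a simple definition: the hit rate among the selected compounds relative to the overall hit rate. A generic sketch (our formulation, with illustrative numbers; the paper's exact selection protocol may differ):

    ```python
    def enrichment_factor(hits_selected: int, n_selected: int,
                          hits_total: int, n_total: int) -> float:
        """Hit rate in the selected subset relative to the overall hit rate."""
        return (hits_selected / n_selected) / (hits_total / n_total)

    # e.g. selecting 1% of a 100,000-compound library and capturing 44 of 1,000 actives:
    print(f"EF = {enrichment_factor(44, 1000, 1000, 100000):.1f}")
    ```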

  9. Five-Factor Model personality disorder prototypes: a review of their development, validity, and comparison to alternative approaches.

    PubMed

    Miller, Joshua D

    2012-12-01

    In this article, the development of Five-Factor Model (FFM) personality disorder (PD) prototypes for the assessment of DSM-IV PDs are reviewed, as well as subsequent procedures for scoring individuals' FFM data with regard to these PD prototypes, including similarity scores and simple additive counts that are based on a quantitative prototype matching methodology. Both techniques, which result in very strongly correlated scores, demonstrate convergent and discriminant validity, and provide clinically useful information with regard to various forms of functioning. The techniques described here for use with FFM data are quite different from the prototype matching methods used elsewhere. © 2012 The Author. Journal of Personality © 2012, Wiley Periodicals, Inc.

  10. Measurement of the $$\mathrm{Z}\gamma^{*} \to \tau\tau$$ cross section in pp collisions at $$\sqrt{s} = $$ 13 TeV and validation of $$\tau$$ lepton analysis techniques

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sirunyan, Albert M; et al.

    A measurement is presented of the $$\mathrm{Z}/\gamma^{*} \to \tau\tau$$ cross section in pp collisions at $$\sqrt{s} = $$ 13 TeV, using data recorded by the CMS experiment at the LHC, corresponding to an integrated luminosity of 2.3 fb$$^{-1}$$. The product of the inclusive cross section and branching fraction is measured to be $$\sigma(\mathrm{pp} \to \mathrm{Z}/\gamma^{*}\text{+X}) \, \mathcal{B}(\mathrm{Z}/\gamma^{*} \to \tau\tau) = $$ 1848 $$\pm$$ 12 (stat) $$\pm$$ 67 (syst+lumi) pb, in agreement with the standard model expectation, computed at next-to-next-to-leading order accuracy in perturbative quantum chromodynamics. The measurement is used to validate new analysis techniques relevant for future measurements of $$\tau$$ lepton production. The measurement also provides the reconstruction efficiency and energy scale for $$\tau$$ decays to hadrons+

  11. EVALUATION OF GEOPHYSICAL METHODS FOR THE DETECTION OF SUBSURFACE TETRACHLOROETHYLENE IN CONTROLLED SPILL EXPERIMENTS

    EPA Science Inventory

    This paper presents some of the results for five of the techniques: cross borehole complex resistivity (CR), also referred to as spectral induced polarization (SIP); cross borehole high resolution seismic (HRS); borehole self potential (SP); surface ground penetrating radar (GPR), ...

  12. Addressing Participant Validity in a Small Internet Health Survey (The Restore Study): Protocol and Recommendations for Survey Response Validation

    PubMed Central

    Dewitt, James; Capistrant, Benjamin; Kohli, Nidhi; Mitteldorf, Darryl; Merengwa, Enyinnaya; West, William

    2018-01-01

    Background While deduplication and cross-validation protocols have been recommended for large Web-based studies, protocols for survey response validation of smaller studies have not been published. Objective This paper reports the challenges of survey validation inherent in a small Web-based health survey research. Methods The subject population was North American, gay and bisexual, prostate cancer survivors, who represent an under-researched, hidden, difficult-to-recruit, minority-within-a-minority population. In 2015-2016, advertising on a large Web-based cancer survivor support network, using email and social media, yielded 478 completed surveys. Results Our manual deduplication and cross-validation protocol identified 289 survey submissions (289/478, 60.4%) as likely spam, most stemming from advertising on social media. The basic components of this deduplication and validation protocol are detailed. An unexpected challenge encountered was invalid survey responses evolving across the study period. This necessitated the static detection protocol be augmented with a dynamic one. Conclusions Five recommendations for validation of Web-based samples, especially with smaller difficult-to-recruit populations, are detailed. PMID:29691203

  13. A novel, fast and efficient single-sensor automatic sleep-stage classification based on complementary cross-frequency coupling estimates.

    PubMed

    Dimitriadis, Stavros I; Salis, Christos; Linden, David

    2018-04-01

    Limitations of the manual scoring of polysomnograms, which include data from electroencephalogram (EEG), electro-oculogram (EOG), electrocardiogram (ECG) and electromyogram (EMG) channels, have long been recognized. Manual staging is resource intensive and time consuming, and thus considerable effort must be spent to ensure inter-rater reliability. As a result, there is great interest in techniques based on signal processing and machine learning for completely Automatic Sleep Stage Classification (ASSC). In this paper, we present a single-EEG-sensor ASSC technique based on the dynamic reconfiguration of different aspects of cross-frequency coupling (CFC) estimated between predefined frequency pairs over 5 s epoch lengths. The proposed analytic scheme is demonstrated using the PhysioNet Sleep European Data Format (EDF) Database with repeat recordings from 20 healthy young adults, and validated on a second sleep dataset. We achieved very high classification sensitivity, specificity and accuracy of 96.2 ± 2.2%, 94.2 ± 2.3%, and 94.4 ± 2.2% across 20 folds, respectively, and also a high mean F1 score (92%, range 90-94%) when a multi-class Naive Bayes classifier was applied. High classification performance was also achieved on the second sleep dataset. Our method outperformed the accuracy of previous studies not only on different datasets but also on the same database. Single-sensor ASSC makes the entire methodology appropriate for longitudinal monitoring using wearable EEG in real-world and laboratory-oriented environments. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.

  14. Enhancement of ZnO based flexible nano generators via sol gel technique for sensing and energy harvesting applications.

    PubMed

    Rajagopalan, Pandey; Singh, Vipul; I A, Palani

    2018-01-10

    Zinc oxide (ZnO) is a remarkable inorganic semiconductor with exceptional piezoelectric properties compared to other semiconductors, although its performance is limited in comparison to lead-based, hazardous piezoelectric materials. Here we report a 5- to 6-fold enhancement of the piezoelectric properties over intrinsic ZnO via chemical doping with copper. The flexible piezoelectric nanogenerator (F-PENG) device was fabricated using a simple solution process of spin coating, with further advantages of robustness, low weight, improved adhesion, and low cost. The devices were used to demonstrate energy harvesting from a standard weight as low as 4 g and can work as a self-powered mass sensor over a broad range of 4 to 100 g. The device exhibited a novel energy harvesting technique from a wind source owing to its inherent flexibility. At three different velocities (10-30 m/s) and five different angles of attack (0-180 degrees), the device demonstrated the ability to discern different velocities and directions of flow. The device will be useful for mapping the flow of air apart from harvesting the energy. A simulation was performed to verify the underlying aerodynamic mechanism involved. © 2018 IOP Publishing Ltd.

  15. [Resistance mechanisms and cross-resistance of phoxim-resistant Frankliniella occidentalis Pergande population].

    PubMed

    Wang, Sheng-Yin; Zhou, Xian-Hong; Zhang, An-Sheng; Li, Li-Li; Men, Xing-Yuan; Zhang, Si-Cong; Liu, Yong-Jie; Yu, Yi

    2012-07-01

    To understand the resistance risks of Frankliniella occidentalis Pergande against phoxim, this paper studied the resistance mechanisms of a phoxim-resistant F. occidentalis population and the cross-resistance of the population against other insecticides. The phoxim-resistant population had medium-level cross-resistance to chlorpyrifos, lambda-cyhalothrin, and methomyl, low-level cross-resistance to chlorfenapyr, imidacloprid, emamectin-benzoate, and spinosad, but no cross-resistance to acetamiprid and abamectin. The synergists piperonyl butoxide (PBO), s,s,s-tributyl phosphorotrithioate (DEF), and triphenyl phosphate (TPP) had significant synergism (P < 0.05) on the toxicity of phoxim to the resistant (XK), field (BJ), and susceptible (S) populations, while diethyl maleate (DEM) had no significant synergism for the XK and S populations but significant synergism for the BJ population. As compared with the S population, the XK and BJ populations had significantly increased activities of mixed-function oxidases P450 (2.79-fold and 1.48-fold), cytochrome b5 (2.88-fold and 1.88-fold), O-demethylase (2.60-fold and 1.68-fold), and carboxylesterase (2.02-fold and 1.61-fold, respectively), and the XK population had a significantly increased acetylcholinesterase activity (3.10-fold). Both XK and BJ populations had an increased activity of glutathione S-transferases (1.11-fold and 1.20-fold, respectively), but the increment was not significant. The increased detoxification enzyme activities in F. occidentalis could play an important role in the resistance of this pest against phoxim.

  16. 77 FR 67668 - Folding Gift Boxes From China; Revised Scheduling of the Expedited Five-Year Review Concerning...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-11-13

    ... INTERNATIONAL TRADE COMMISSION [Investigation No. 731-TA-921 (Second Review)] Folding Gift Boxes... on Folding Gift Boxes From China AGENCY: United States International Trade Commission. ACTION: Notice... the second five-year review of the antidumping duty order on Folding Gift Boxes from China...

  17. Pettiness: Conceptualization, measurement and cross-cultural differences.

    PubMed

    Ng, Reuben; Levy, Becca

    2018-01-01

    Although pettiness, defined as the tendency to get agitated over trivial matters, is a facet of neuroticism with negative health implications, no measure of it exists. The goal of the current study was to develop and validate a short pettiness scale. In Study 1 (N = 2136), Exploratory Factor Analysis distilled a one-factor model with five items. Convergent validity was established using the Big Five Inventory, DASS, Satisfaction with Life Scale, and Connor-Davidson Resilience Scale. As predicted, pettiness was positively associated with neuroticism, depression, anxiety and stress but negatively related to extraversion, agreeableness, conscientiousness, openness, life satisfaction and resilience. Also as predicted, pettiness was not significantly related to physical functioning, or to blind and constructive patriotism, indicating discriminant validity. Confirmatory Factor Analysis in Study 2 (N = 734) revealed a stable one-factor model of pettiness. In Study 3 (N = 532), the scale, which showed a similar factor structure in the USA and Singapore, also reflected predicted cross-cultural patterns: pettiness was significantly lower in the United States, a culture categorized as "looser", than in Singapore, a culture classified as "tighter" in terms of Gelfand and colleagues' framework of national tendencies to oppose social deviance. Results suggest that this brief 5-item tool is a reliable and valid measure of pettiness, and its use in health research is encouraged.

  18. Use of Nanofibers to Strengthen Hydrogels of Silica, Other Oxides, and Aerogels

    NASA Technical Reports Server (NTRS)

    Meador, Mary Ann B.; Capadona, Lynn A.; Hurwitz, Frances; Vivod, Stephanie L.; Lake, Max

    2010-01-01

    Research has shown that including up to 5 percent w/w carbon nanofibers in a silica backbone of polymer crosslinked aerogels improves its strength, tripling compressive modulus and increasing tensile stress-at-break five-fold with no increase in density or decrease in porosity. In addition, the initial silica hydrogels, which are produced as a first step in manufacturing the aerogels, can be quite fragile and difficult to handle before cross-linking. The addition of the carbon nanofiber also improves the strength of the initial hydrogels before cross-linking, improving the manufacturing process. This can also be extended to other oxide aerogels, such as alumina or aluminosilicates, and other nanofiber types, such as silicon carbide.

  19. Assessing the Validity of Self-Rated Health with the Short Physical Performance Battery: A Cross-Sectional Analysis of the International Mobility in Aging Study.

    PubMed

    Pérez-Zepeda, Mario U; Belanger, Emmanuelle; Zunzunegui, Maria-Victoria; Phillips, Susan; Ylli, Alban; Guralnik, Jack

    2016-01-01

    The aim of this study was to explore the validity of self-rated health across different populations of older adults when compared to the Short Physical Performance Battery. Cross-sectional analysis of the International Mobility in Aging Study. Five locations: Saint-Hyacinthe and Kingston (Canada), Tirana (Albania), Manizales (Colombia), and Natal (Brazil). Older adults between 65 and 74 years old (n = 1,995). The Short Physical Performance Battery (SPPB) was used to measure physical performance. Self-rated health was assessed with a single five-point question. Linear trends between SPPB scores and self-rated health were tested separately for men and women at each of the five international study sites. Poor physical performance (independent variable; SPPB less than 8) was used in logistic regression models of self-rated health (dependent variable), adjusting for potential covariates. All analyses were stratified by gender and site of origin. A significant linear association was found between mean Short Physical Performance Battery scores and ordinal categories of self-rated health across research sites and gender groups. After extensive control for objective physical and mental health indicators and socio-demographic variables, these graded associations became non-significant at some research sites. These findings further confirm the validity of SRH as a measure of overall health status in older adults.

  20. Subsurface structural interpretation by applying trishear algorithm: An example from the Lenghu5 fold-and-thrust belt, Qaidam Basin, Northern Tibetan Plateau

    NASA Astrophysics Data System (ADS)

    Pei, Yangwen; Paton, Douglas A.; Wu, Kongyou; Xie, Liujuan

    2017-08-01

    The trishear algorithm, in which deformation occurs in a triangular zone in front of a propagating fault tip, is often applied to understand fault-related folding. In comparison to kink-band methods, a key characteristic of the trishear algorithm is that non-uniform deformation within the triangular zone allows layer thickness and horizon length to change during deformation, as commonly observed in natural structures. An example from the Lenghu5 fold-and-thrust belt (Qaidam Basin, Northern Tibetan Plateau) is interpreted to show how trishear forward modelling can improve the accuracy of seismic interpretation. High-resolution fieldwork data, including high-angle dips, 'dragging structures', a thinning hanging wall and a thickening footwall, are used to determine the best-fit trishear model for the deformation of the Lenghu5 fold-and-thrust belt. We also consider factors that increase the complexity of trishear models, including (a) fault-dip changes and (b) pre-existing faults. We integrate fault-dip changes and pre-existing faults to predict subsurface structures that are below seismic resolution. The trishear analysis indicates that the Lenghu5 fold-and-thrust belt is controlled by an upward-steepening reverse fault above a pre-existing, opposite-thrusting fault in the deeper subsurface. The validity of the trishear model is confirmed by the close agreement between the model and the high-resolution fieldwork. The validated trishear forward model provides geometric constraints on the faults and horizons in the seismic section, e.g., fault cutoffs and fault tip positions, fault intersection relationships and horizon/fault cross-cutting relationships. Subsurface prediction using the trishear algorithm can significantly increase the accuracy of seismic interpretation, particularly in seismic sections with a low signal/noise ratio.
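    For readers unfamiliar with the algorithm, the heart of trishear is a velocity field that is fault-parallel in the hanging wall, zero in the footwall, and interpolated across the triangular zone ahead of the fault tip. The sketch below implements the widely used symmetric field that is linear in vx (after Zehnder and Allmendinger's formulation); whether Pei et al. used this exact field is our assumption.

    ```python
    import numpy as np

    def trishear_velocity(x, y, v0=1.0, phi=np.radians(30)):
        """Velocity at (x, y) in fault-tip coordinates (x along the fault, tip at origin).

        The hanging wall (above the trishear zone) slips at v0 parallel to the fault;
        the footwall (below) is fixed; inside the zone, vx varies linearly across the
        zone and vy is chosen so the material remains incompressible.
        """
        m = np.tan(phi)            # slope of the trishear zone boundaries
        if y >= x * m:             # hanging-wall domain
            return v0, 0.0
        if y <= -x * m:            # footwall domain
            return 0.0, 0.0
        vx = 0.5 * v0 * (y / (x * m) + 1.0)
        vy = 0.25 * v0 * m * ((y / (x * m)) ** 2 - 1.0)
        return vx, vy

    # Velocities at three points ahead of the tip: hanging wall, zone interior, footwall.
    for pt in [(2.0, 1.5), (2.0, 0.0), (2.0, -1.5)]:
        print(pt, "->", trishear_velocity(*pt))
    ```

    In a forward model, bed points are advected through this field in small slip increments, which is what progressively folds the horizons around the propagating tip.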

  1. Cuffless and Continuous Blood Pressure Estimation from the Heart Sound Signals

    PubMed Central

    Peng, Rong-Chao; Yan, Wen-Rong; Zhang, Ning-Ling; Lin, Wan-Hua; Zhou, Xiao-Lin; Zhang, Yuan-Ting

    2015-01-01

    Cardiovascular disease, like hypertension, is one of the top killers of human life and early detection of cardiovascular disease is of great importance. However, traditional medical devices are often bulky and expensive, and unsuitable for home healthcare. In this paper, we proposed an easy and inexpensive technique to estimate continuous blood pressure from the heart sound signals acquired by the microphone of a smartphone. A cold-pressor experiment was performed in 32 healthy subjects, with a smartphone to acquire heart sound signals and with a commercial device to measure continuous blood pressure. The Fourier spectrum of the second heart sound and the blood pressure were regressed using a support vector machine, and the accuracy of the regression was evaluated using 10-fold cross-validation. Statistical analysis showed that the mean correlation coefficients between the predicted values from the regression model and the measured values from the commercial device were 0.707, 0.712, and 0.748 for systolic, diastolic, and mean blood pressure, respectively, and that the mean errors were less than 5 mmHg, with standard deviations less than 8 mmHg. These results suggest that this technique is of potential use for cuffless and continuous blood pressure monitoring and it has promising application in home healthcare services. PMID:26393591
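    The regression-and-validation step — an SVM regressor on second-heart-sound spectra, scored by 10-fold cross-validated correlation and error statistics — can be sketched as follows, with synthetic spectra in place of the smartphone recordings:

    ```python
    import numpy as np
    from sklearn.model_selection import cross_val_predict
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVR

    rng = np.random.default_rng(9)
    # Stand-in Fourier spectra of the second heart sound (many epochs x 128 bins).
    X = rng.normal(size=(320, 128))
    sbp = 110 + X[:, :10].sum(axis=1) * 2 + rng.normal(0, 4, 320)  # systolic BP, mmHg

    # 10-fold cross-validated SVM regression of blood pressure on the spectra.
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
    pred = cross_val_predict(model, X, sbp, cv=10)

    err = pred - sbp
    print(f"r = {np.corrcoef(pred, sbp)[0, 1]:.2f}, "
          f"mean error = {err.mean():.1f} mmHg, SD = {err.std():.1f} mmHg")
    ```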

  3. ANALYSIS OF CLINICAL AND DERMOSCOPIC FEATURES FOR BASAL CELL CARCINOMA NEURAL NETWORK CLASSIFICATION

    PubMed Central

    Cheng, Beibei; Stanley, R. Joe; Stoecker, William V; Stricklin, Sherea M.; Hinton, Kristen A.; Nguyen, Thanh K.; Rader, Ryan K.; Rabinovitz, Harold S.; Oliviero, Margaret; Moss, Randy H.

    2012-01-01

    Background Basal cell carcinoma (BCC) is the most commonly diagnosed cancer in the United States. In this research, we examine four different feature categories used for diagnostic decisions, including patient personal profile (patient age, gender, etc.), general exam (lesion size and location), common dermoscopic (blue-gray ovoids, leaf-structure dirt trails, etc.), and specific dermoscopic lesion (white/pink areas, semitranslucency, etc.). Specific dermoscopic features are more restricted versions of the common dermoscopic features. Methods Combinations of the four feature categories are analyzed over a data set of 700 lesions, with 350 BCCs and 350 benign lesions, for lesion discrimination using neural network-based techniques, including Evolving Artificial Neural Networks and Evolving Artificial Neural Network Ensembles. Results Experiment results based on ten-fold cross validation for training and testing the different neural network-based techniques yielded an area under the receiver operating characteristic curve as high as 0.981 when all features were combined. The common dermoscopic lesion features generally yielded higher discrimination results than other individual feature categories. Conclusions Experimental results show that combining clinical and image information provides enhanced lesion discrimination capability over either information source separately. This research highlights the potential of data fusion as a model for the diagnostic process. PMID:22724561

  4. Prediction of adult height in girls: the Beunen-Malina-Freitas method.

    PubMed

    Beunen, Gaston P; Malina, Robert M; Freitas, Duarte L; Thomis, Martine A; Maia, José A; Claessens, Albrecht L; Gouveia, Elvio R; Maes, Hermine H; Lefevre, Johan

    2011-12-01

    The purpose of this study was to validate and cross-validate the Beunen-Malina-Freitas method for non-invasive prediction of adult height in girls. A sample of 420 girls aged 10-15 years from the Madeira Growth Study were measured at yearly intervals and then 8 years later. Anthropometric dimensions (lengths, breadths, circumferences, and skinfolds) were measured; skeletal age was assessed using the Tanner-Whitehouse 3 method and menarcheal status (present or absent) was recorded. Adult height was measured and predicted using stepwise, forward, and maximum R2 regression techniques. Multiple correlations, mean differences, standard errors of prediction, and error boundaries were calculated. A sample of the Leuven Longitudinal Twin Study was used to cross-validate the regressions. Age-specific coefficients of determination (R2) between predicted and measured adult height varied between 0.57 and 0.96, while standard errors of prediction varied between 1.1 and 3.9 cm. The cross-validation confirmed the validity of the Beunen-Malina-Freitas method in girls aged 12-15 years, but at lower ages the cross-validation was less consistent. We conclude that the Beunen-Malina-Freitas method is valid for the prediction of adult height in girls aged 12-15 years. It is applicable to European populations or populations of European ancestry.

  5. Validated spectrophotometric methods for simultaneous determination of Omeprazole, Tinidazole and Doxycycline in their ternary mixture

    NASA Astrophysics Data System (ADS)

    Lotfy, Hayam M.; Hegazy, Maha A.; Mowaka, Shereen; Mohamed, Ekram Hany

    2016-01-01

    A comparative study of smart spectrophotometric techniques for the simultaneous determination of Omeprazole (OMP), Tinidazole (TIN) and Doxycycline (DOX) without prior separation steps is developed. These techniques consist of several consecutive steps utilizing zero-order, ratio, or derivative spectra. The proposed techniques comprise nine simple methods, namely direct spectrophotometry, dual wavelength, first derivative-zero crossing, amplitude factor, spectrum subtraction, ratio subtraction, derivative ratio-zero crossing, constant center, and successive derivative ratio. The calibration graphs are linear over the concentration ranges of 1-20 μg/mL, 5-40 μg/mL and 2-30 μg/mL for OMP, TIN and DOX, respectively. These methods are tested by analyzing synthetic mixtures of the above drugs and successfully applied to a commercial pharmaceutical preparation. The methods were validated according to the ICH guidelines; accuracy, precision, and repeatability were found to be within the acceptable limits.

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shi, Jade; Nobrega, R. Paul; Schwantes, Christian

    The dynamics of globular proteins can be described in terms of transitions between a folded native state and less-populated intermediates, or excited states, which can play critical roles in both protein folding and function. Excited states are by definition transient species, and therefore are difficult to characterize using current experimental techniques. We report an atomistic model of the excited state ensemble of a stabilized mutant of the extensively studied flavodoxin fold protein CheY. We employed a hybrid simulation and experimental approach in which an aggregate 42 milliseconds of all-atom molecular dynamics were used as an informative prior for the structure of the excited state ensemble. The resulting prior was then refined against small-angle X-ray scattering (SAXS) data employing an established method (EROS). The most striking feature of the resulting excited state ensemble was an unstructured N-terminus stabilized by non-native contacts in a conformation that is topologically simpler than the native state. Using these results, we then predict incisive single-molecule FRET experiments as a means of model validation. Our study demonstrates the paradigm of uniting simulation and experiment in a statistical model to study the structure of protein excited states and rationally design validating experiments.

  7. Accelerating cine-MR Imaging in Mouse Hearts Using Compressed Sensing

    PubMed Central

    Wech, Tobias; Lemke, Angela; Medway, Debra; Stork, Lee-Anne; Lygate, Craig A; Neubauer, Stefan; Köstler, Herbert; Schneider, Jürgen E

    2011-01-01

    Purpose To combine global cardiac function imaging with compressed sensing (CS) in order to reduce scan time and to validate this technique in normal mouse hearts and in a murine model of chronic myocardial infarction. Materials and Methods To determine the maximally achievable acceleration factor, fully acquired cine data, obtained in sham and chronically infarcted (MI) mouse hearts, were 2–4-fold undersampled retrospectively, followed by CS reconstruction and blinded image segmentation. Subsequently, dedicated CS sampling schemes were implemented at a preclinical 9.4 T magnetic resonance imaging (MRI) system, and 2- and 3-fold undersampled cine data were acquired in normal mouse hearts with high temporal and spatial resolution. Results The retrospective analysis demonstrated that an undersampling factor of three is feasible without impairing accuracy of cardiac functional parameters. Dedicated CS sampling schemes applied prospectively to normal mouse hearts yielded comparable left-ventricular functional parameters, and intra- and interobserver variability between fully and 3-fold undersampled data. Conclusion This study introduces and validates an alternative means to speed up experimental cine-MRI without the need for expensive hardware. J. Magn. Reson. Imaging 2011. © 2011 Wiley Periodicals, Inc. PMID:21932360

  8. Cross-Cultural Validation of the Patient Perception of Integrated Care Survey.

    PubMed

    Tietschert, Maike V; Angeli, Federica; van Raak, Arno J A; Ruwaard, Dirk; Singer, Sara J

    2017-07-20

    To test the cross-cultural validity of the U.S. Patient Perception of Integrated Care (PPIC) Survey in a Dutch sample using a standardized procedure. Primary data collected from patients of five primary care centers in the south of the Netherlands, through survey research from 2014 to 2015. Cross-sectional data collected from patients who saw multiple health care providers during 6 months preceding data collection. The PPIC survey includes 59 questions that measure patient perceived care integration across providers, settings, and time. Data analysis followed a standardized procedure guiding data preparation, psychometric analysis, and included invariance testing with the U.S. dataset. Latent scale structures of the Dutch and U.S. survey were highly comparable. Factor "Integration with specialist" had lower reliability scores and noninvariance. For the remaining factors, internal consistency and invariance estimates were strong. The standardized cross-cultural validation procedure produced strong support for comparable psychometric characteristics of the Dutch and U.S. surveys. Future research should examine the usability of the proposed procedure for contexts with greater cultural differences. © Health Research and Educational Trust.

  9. Empirical gradient threshold technique for automated segmentation across image modalities and cell lines.

    PubMed

    Chalfoun, J; Majurski, M; Peskin, A; Breen, C; Bajcsy, P; Brady, M

    2015-10-01

    New microscopy technologies are enabling image acquisition of terabyte-sized data sets consisting of hundreds of thousands of images. In order to retrieve and analyze the biological information in these large data sets, segmentation is needed to detect the regions containing cells or cell colonies. Our work with hundreds of large images (each 21,000×21,000 pixels) requires a segmentation method that: (1) yields high segmentation accuracy, (2) is applicable to multiple cell lines with various densities of cells and cell colonies, and several imaging modalities, (3) can process large data sets in a timely manner, (4) has a low memory footprint and (5) has a small number of user-set parameters that do not require adjustment during the segmentation of large image sets. None of the currently available segmentation methods meet all these requirements. Segmentation based on image gradient thresholding is fast and has a low memory footprint. However, existing techniques that automate the selection of the gradient image threshold do not work across image modalities, multiple cell lines, and a wide range of foreground/background densities (requirement 2) and all failed the requirement for robust parameters that do not require re-adjustment with time (requirement 5). We present a novel and empirically derived image gradient threshold selection method for separating foreground and background pixels in an image that meets all the requirements listed above. We quantify the difference between our approach and existing ones in terms of accuracy, execution speed, memory usage and number of adjustable parameters on a reference data set. This reference data set consists of 501 validation images with manually determined segmentations and image sizes ranging from 0.36 Megapixels to 850 Megapixels. It includes four different cell lines and two image modalities: phase contrast and fluorescent. Our new technique, called Empirical Gradient Threshold (EGT), is derived from this reference data set with a 10-fold cross-validation method. EGT segments cells or colonies with resulting Dice accuracy index measurements above 0.92 for all cross-validation data sets. EGT results have also been visually verified on a much larger data set that includes bright field and Differential Interference Contrast (DIC) images, 16 cell lines and 61 time-sequence data sets, for a total of 17,479 images. This method is implemented as an open-source plugin to ImageJ as well as a standalone executable that can be downloaded from the following link: https://isg.nist.gov/. © 2015 The Authors Journal of Microscopy © 2015 Royal Microscopical Society.
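
    An illustrative gradient-threshold segmentation sketch in the spirit of EGT; the percentile rule below is an assumption for demonstration, not the published, empirically derived EGT threshold model.

    ```python
    import numpy as np
    from scipy import ndimage

    def gradient_threshold_segment(image, percentile=90.0):
        """Label as foreground the pixels whose gradient magnitude is high."""
        gx = ndimage.sobel(image.astype(float), axis=0)
        gy = ndimage.sobel(image.astype(float), axis=1)
        grad = np.hypot(gx, gy)                        # gradient magnitude
        mask = grad > np.percentile(grad, percentile)  # assumed percentile cut
        return ndimage.binary_fill_holes(mask)         # close cell interiors

    demo = np.zeros((64, 64))
    demo[20:40, 20:40] = 1.0                           # one bright "colony"
    print(gradient_threshold_segment(demo).sum(), "foreground pixels")
    ```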

  10. 10-fold detection range increase in quadrant-photodiode position sensing for photonic force microscope

    NASA Astrophysics Data System (ADS)

    Perrone, Sandro; Volpe, Giovanni; Petrov, Dmitri

    2008-10-01

    We propose a technique that permits one to increase by one order of magnitude the detection range of position sensing for the photonic force microscope with quadrant photodetectors (QPDs). This technique takes advantage of the unavoidable cross-talk between output signals of the QPD and does not assume that the output signals are linear in the probe displacement. We demonstrate the increase in the detection range from 150 to 1400 nm for a trapped polystyrene sphere with radius of 300 nm as probe.

  11. 10-fold detection range increase in quadrant-photodiode position sensing for photonic force microscope

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Perrone, Sandro; Volpe, Giovanni; Petrov, Dmitri

    2008-10-15

    We propose a technique that permits one to increase by one order of magnitude the detection range of position sensing for the photonic force microscope with quadrant photodetectors (QPDs). This technique takes advantage of the unavoidable cross-talk between output signals of the QPD and does not assume that the output signals are linear in the probe displacement. We demonstrate the increase in the detection range from 150 to 1400 nm for a trapped polystyrene sphere with radius of 300 nm as probe.

  12. 10-fold detection range increase in quadrant-photodiode position sensing for photonic force microscope.

    PubMed

    Perrone, Sandro; Volpe, Giovanni; Petrov, Dmitri

    2008-10-01

    We propose a technique that permits one to increase by one order of magnitude the detection range of position sensing for the photonic force microscope with quadrant photodetectors (QPDs). This technique takes advantage of the unavoidable cross-talk between output signals of the QPD and does not assume that the output signals are linear in the probe displacement. We demonstrate the increase in the detection range from 150 to 1400 nm for a trapped polystyrene sphere with radius of 300 nm as probe.

  13. Neural Network Prediction of ICU Length of Stay Following Cardiac Surgery Based on Pre-Incision Variables

    PubMed Central

    Pothula, Venu M.; Yuan, Stanley C.; Maerz, David A.; Montes, Lucresia; Oleszkiewicz, Stephen M.; Yusupov, Albert; Perline, Richard

    2015-01-01

    Background Advanced predictive analytical techniques are being increasingly applied to clinical risk assessment. This study compared a neural network model to several other models in predicting the length of stay (LOS) in the cardiac surgical intensive care unit (ICU) based on pre-incision patient characteristics. Methods Thirty-six variables collected from 185 cardiac surgical patients were analyzed for contribution to ICU LOS. The Automatic Linear Modeling (ALM) module of IBM-SPSS software identified 8 factors with statistically significant associations with ICU LOS; these factors were also analyzed with the Artificial Neural Network (ANN) module of the same software. The weighted contributions of each factor (“trained” data) were then applied to data for a “new” patient to predict ICU LOS for that individual. Results Factors identified in the ALM model were: use of an intra-aortic balloon pump; O2 delivery index; age; use of positive cardiac inotropic agents; hematocrit; serum creatinine ≥ 1.3 mg/deciliter; gender; arterial pCO2. The r2 value for ALM prediction of ICU LOS in the initial (training) model was 0.356, p < 0.0001. Cross validation in prediction of a “new” patient yielded r2 = 0.200, p < 0.0001. The same 8 factors analyzed with ANN yielded a training prediction r2 of 0.535 (p < 0.0001) and a cross validation prediction r2 of 0.410, p < 0.0001. Two additional predictive algorithms were studied, but they had lower prediction accuracies. Our validated neural network model identified the upper quartile of ICU LOS with an odds ratio of 9.8 (p < 0.0001). Conclusions ANN demonstrated a 2-fold greater accuracy than ALM in prediction of observed ICU LOS. This greater accuracy would be presumed to result from the capacity of ANN to capture nonlinear effects and higher order interactions. Predictive modeling may be of value in early anticipation of risks of post-operative morbidity and utilization of ICU facilities. PMID:26710254
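
    A hypothetical sketch of the linear-versus-neural-network comparison above: both models are cross-validated on simulated pre-incision predictors whose relationship to the outcome is deliberately nonlinear, which is exactly the situation in which a network can outscore a linear model.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(1)
    X = rng.normal(size=(185, 8))        # 185 patients, 8 simulated predictors
    # The interaction term makes the target nonlinear in the predictors
    y = X[:, 0] * X[:, 1] + X[:, 2] + rng.normal(scale=0.5, size=185)

    for name, est in [("linear", LinearRegression()),
                      ("neural net", MLPRegressor(hidden_layer_sizes=(16,),
                                                  max_iter=5000,
                                                  random_state=1))]:
        r2 = cross_val_score(est, X, y, cv=5, scoring="r2").mean()
        print(f"{name}: cross-validated r2 = {r2:.3f}")
    ```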

  14. Three-dimensional textural features of conventional MRI improve diagnostic classification of childhood brain tumours.

    PubMed

    Fetit, Ahmed E; Novak, Jan; Peet, Andrew C; Arvanitits, Theodoros N

    2015-09-01

    The aim of this study was to assess the efficacy of three-dimensional texture analysis (3D TA) of conventional MR images for the classification of childhood brain tumours in a quantitative manner. The dataset comprised pre-contrast T1 - and T2-weighted MRI series obtained from 48 children diagnosed with brain tumours (medulloblastoma, pilocytic astrocytoma and ependymoma). 3D and 2D TA were carried out on the images using first-, second- and higher order statistical methods. Six supervised classification algorithms were trained with the most influential 3D and 2D textural features, and their performances in the classification of tumour types, using the two feature sets, were compared. Model validation was carried out using the leave-one-out cross-validation (LOOCV) approach, as well as stratified 10-fold cross-validation, in order to provide additional reassurance. McNemar's test was used to test the statistical significance of any improvements demonstrated by 3D-trained classifiers. Supervised learning models trained with 3D textural features showed improved classification performances to those trained with conventional 2D features. For instance, a neural network classifier showed 12% improvement in area under the receiver operator characteristics curve (AUC) and 19% in overall classification accuracy. These improvements were statistically significant for four of the tested classifiers, as per McNemar's tests. This study shows that 3D textural features extracted from conventional T1 - and T2-weighted images can improve the diagnostic classification of childhood brain tumours. Long-term benefits of accurate, yet non-invasive, diagnostic aids include a reduction in surgical procedures, improvement in surgical and therapy planning, and support of discussions with patients' families. It remains necessary, however, to extend the analysis to a multicentre cohort in order to assess the scalability of the techniques used. Copyright © 2015 John Wiley & Sons, Ltd.

  15. Cross validation issues in multiobjective clustering

    PubMed Central

    Brusco, Michael J.; Steinley, Douglas

    2018-01-01

    The implementation of multiobjective programming methods in combinatorial data analysis is an emergent area of study with a variety of pragmatic applications in the behavioural sciences. Most notably, multiobjective programming provides a tool for analysts to model trade-offs among competing criteria in clustering, seriation, and unidimensional scaling tasks. Although multiobjective programming has considerable promise, the technique can produce numerically appealing results that lack empirical validity. With this issue in mind, the purpose of this paper is to briefly review viable areas of application for multiobjective programming and, more importantly, to outline the importance of cross-validation when using this method in cluster analysis. PMID:19055857

  16. Recognizing emotional speech in Persian: a validated database of Persian emotional speech (Persian ESD).

    PubMed

    Keshtiari, Niloofar; Kuhlmann, Michael; Eslami, Moharram; Klann-Delius, Gisela

    2015-03-01

    Research on emotional speech often requires valid stimuli for assessing perceived emotion through prosody and lexical content. To date, no comprehensive emotional speech database for Persian is officially available. The present article reports the process of designing, compiling, and evaluating a comprehensive emotional speech database for colloquial Persian. The database contains a set of 90 validated novel Persian sentences classified in five basic emotional categories (anger, disgust, fear, happiness, and sadness), as well as a neutral category. These sentences were validated in two experiments by a group of 1,126 native Persian speakers. The sentences were articulated by two native Persian speakers (one male, one female) in three conditions: (1) congruent (emotional lexical content articulated in a congruent emotional voice), (2) incongruent (neutral sentences articulated in an emotional voice), and (3) baseline (all emotional and neutral sentences articulated in neutral voice). The speech materials comprise about 470 sentences. The validity of the database was evaluated by a group of 34 native speakers in a perception test. Utterances recognized better than five times chance performance (71.4 %) were regarded as valid portrayals of the target emotions. Acoustic analysis of the valid emotional utterances revealed differences in pitch, intensity, and duration, attributes that may help listeners to correctly classify the intended emotion. The database is designed to be used as a reliable material source (for both text and speech) in future cross-cultural or cross-linguistic studies of emotional speech, and it is available for academic research purposes free of charge. To access the database, please contact the first author.

  17. Cross-modal face recognition using multi-matcher face scores

    NASA Astrophysics Data System (ADS)

    Zheng, Yufeng; Blasch, Erik

    2015-05-01

    The performance of face recognition can be improved using information fusion of multimodal images and/or multiple algorithms. When multimodal face images are available, cross-modal recognition is meaningful for security and surveillance applications. For example, a probe face may be a thermal image (especially at nighttime), while only visible face images are available in the gallery database. Matching a thermal probe face onto the visible gallery faces requires cross-modal matching approaches. A few such studies were implemented in facial feature space with medium recognition performance. In this paper, we propose a cross-modal recognition approach, where multimodal faces are cross-matched in feature space and the recognition performance is enhanced with stereo fusion at image, feature and/or score level. In the proposed scenario, there are two cameras for stereo imaging, two face imagers (visible and thermal images) in each camera, and three recognition algorithms (circular Gaussian filter, face pattern byte, linear discriminant analysis). A score vector is formed with three cross-matched face scores from the aforementioned three algorithms. A classifier (e.g., k-nearest neighbor, support vector machine, binomial logistic regression [BLR]) is trained and then tested with the score vectors using 10-fold cross-validation. The proposed approach was validated with a multispectral stereo face dataset from 105 subjects. Our experiments show very promising results: ACR (accuracy rate) = 97.84%, FAR (false accept rate) = 0.84% when cross-matching the fused thermal faces onto the fused visible faces by using three face scores and the BLR classifier.
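
    A sketch of the final fusion step described above, using simulated match scores: each probe yields a three-element score vector that is classified by binomial logistic regression under 10-fold cross-validation. The score distributions are assumptions for illustration.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import StratifiedKFold, cross_val_score

    rng = np.random.default_rng(2)
    genuine = rng.normal(0.8, 0.1, size=(300, 3))   # scores for true matches
    impostor = rng.normal(0.5, 0.1, size=(300, 3))  # scores for non-matches
    X = np.vstack([genuine, impostor])
    y = np.r_[np.ones(300), np.zeros(300)]

    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=2)
    acc = cross_val_score(LogisticRegression(), X, y, cv=cv, scoring="accuracy")
    print(f"10-fold accuracy: {acc.mean():.3%}")
    ```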

  18. Elastica solution for a nanotube formed by self-adhesion of a folded thin film

    NASA Astrophysics Data System (ADS)

    Glassmaker, N. J.; Hui, C. Y.

    2004-09-01

    Schmidt and Eberl demonstrated the construction of tubes with submicron diameters by the method of folding thin solid films [Nature (London) 410, 168 (2001)]. In their method, a thin film is folded 180° and brought into adhesive contact with itself. The resulting sealed loop forms a nanotube with the thickness of the tube walls equal to the thickness of the thin film. The calculation of the diameter of the tube and the shape of its cross section in equilibrium are the subjects of this study. The tube is modeled as a two-dimensional elastica when viewed in cross section, and adhesive behavior is governed by an energy release rate criterion. A numerical technique is used to find elastic equilibria for a large range of material parameters. With these solutions in hand, the problem of designing a nanotube becomes transparent. It is shown that one dimensionless parameter determines the diameter of the nanotube, while another fixes its shape. Each of these parameters is a ratio involving the material's mechanical properties and the film thickness. Before concluding, we verify our model by comparing its results with the experimental observations of Schmidt and Eberl, for their materials.

  19. Predicting disulfide connectivity from protein sequence using multiple sequence feature vectors and secondary structure.

    PubMed

    Song, Jiangning; Yuan, Zheng; Tan, Hao; Huber, Thomas; Burrage, Kevin

    2007-12-01

    Disulfide bonds are primary covalent crosslinks between two cysteine residues in proteins that play critical roles in stabilizing the protein structures and are commonly found in extracytoplasmatic or secreted proteins. In protein folding prediction, the localization of disulfide bonds can greatly reduce the search in conformational space. Therefore, there is a great need to develop computational methods capable of accurately predicting disulfide connectivity patterns in proteins that could have potentially important applications. We have developed a novel method to predict disulfide connectivity patterns from protein primary sequence, using a support vector regression (SVR) approach based on multiple sequence feature vectors and predicted secondary structure by the PSIPRED program. The results indicate that our method could achieve a prediction accuracy of 74.4% and 77.9%, respectively, when averaged on proteins with two to five disulfide bridges using 4-fold cross-validation, measured on the protein and cysteine pair on a well-defined non-homologous dataset. We assessed the effects of different sequence encoding schemes on the prediction performance of disulfide connectivity. It has been shown that the sequence encoding scheme based on multiple sequence feature vectors coupled with predicted secondary structure can significantly improve the prediction accuracy, thus enabling our method to outperform most of other currently available predictors. Our work provides a complementary approach to the current algorithms that should be useful in computationally assigning disulfide connectivity patterns and helps in the annotation of protein sequences generated by large-scale whole-genome projects. The prediction web server and Supplementary Material are accessible at http://foo.maths.uq.edu.au/~huber/disulfide

  20. Literature classification for semi-automated updating of biological knowledgebases

    PubMed Central

    2013-01-01

    Background As the output of biological assays increase in resolution and volume, the body of specialized biological data, such as functional annotations of gene and protein sequences, enables extraction of higher-level knowledge needed for practical application in bioinformatics. Whereas common types of biological data, such as sequence data, are extensively stored in biological databases, functional annotations, such as immunological epitopes, are found primarily in semi-structured formats or free text embedded in primary scientific literature. Results We defined and applied a machine learning approach for literature classification to support updating of TANTIGEN, a knowledgebase of tumor T-cell antigens. Abstracts from PubMed were downloaded and classified as either "relevant" or "irrelevant" for database update. Training and five-fold cross-validation of a k-NN classifier on 310 abstracts yielded classification accuracy of 0.95, thus showing significant value in support of data extraction from the literature. Conclusion We here propose a conceptual framework for semi-automated extraction of epitope data embedded in scientific literature using principles from text mining and machine learning. The addition of such data will aid in the transition of biological databases to knowledgebases. PMID:24564403
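
    A sketch under assumptions of the relevance-classification step above: a k-NN classifier over TF-IDF features of abstracts, evaluated with five-fold cross-validation. The two toy abstracts are invented placeholders, and TF-IDF is an assumed text representation (the paper's exact features are not reproduced here).

    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline

    abstracts = ["tumor antigen epitope recognized by T cells",
                 "crop yield under drought stress conditions"] * 50
    labels = [1, 0] * 50      # 1 = relevant for database update

    model = make_pipeline(TfidfVectorizer(),
                          KNeighborsClassifier(n_neighbors=3))
    print(cross_val_score(model, abstracts, labels, cv=5).mean())
    ```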

  1. e-Bitter: Bitterant Prediction by the Consensus Voting From the Machine-Learning Methods

    PubMed Central

    Zheng, Suqing; Jiang, Mengying; Zhao, Chengwei; Zhu, Rui; Hu, Zhicheng; Xu, Yong; Lin, Fu

    2018-01-01

    In-silico bitterant prediction has received considerable attention because experimental screening of bitterants is expensive and laborious. In this work, we collect a fully experimental dataset containing 707 bitterants and 592 non-bitterants, which is distinct from the fully or partially hypothetical non-bitterant datasets used in previous works. Based on this experimental dataset, we harness consensus votes from multiple machine-learning methods (e.g., deep learning) combined with molecular fingerprints to build bitter/bitterless classification models with five-fold cross-validation, which are further inspected by the Y-randomization test and applicability domain analysis. One of the best consensus models affords an accuracy, precision, specificity, sensitivity, F1-score, and Matthews correlation coefficient (MCC) of 0.929, 0.918, 0.898, 0.954, 0.936, and 0.856, respectively, on our test set. For automatic prediction of bitterants, a graphical program, “e-Bitter”, is developed so that users can obtain predictions with a simple mouse click. To the best of our knowledge, this is the first time a consensus model has been adopted for bitterant prediction, and the first free stand-alone software of this kind for experimental food scientists. PMID:29651416
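
    A minimal consensus-voting sketch in the spirit of e-Bitter: a majority vote over several fingerprint-based classifiers, evaluated with five-fold cross-validation. The random bit vectors stand in for molecular fingerprints, and the three base learners are assumptions.

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC

    rng = np.random.default_rng(3)
    X = rng.integers(0, 2, size=(1299, 256)).astype(float)  # stand-in fingerprints
    y = np.r_[np.ones(707), np.zeros(592)]                  # bitter vs bitterless

    consensus = VotingClassifier(
        estimators=[("svm", SVC()),
                    ("rf", RandomForestClassifier(n_estimators=100)),
                    ("mlp", MLPClassifier(hidden_layer_sizes=(32,),
                                          max_iter=500))],
        voting="hard")  # each base model casts one vote; the majority wins
    print(cross_val_score(consensus, X, y, cv=5, scoring="accuracy").mean())
    ```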

  2. Deep learning based classification of breast tumors with shear-wave elastography.

    PubMed

    Zhang, Qi; Xiao, Yang; Dai, Wei; Suo, Jingfeng; Wang, Congzhi; Shi, Jun; Zheng, Hairong

    2016-12-01

    This study aims to build a deep learning (DL) architecture for automated extraction of learned-from-data image features from shear-wave elastography (SWE), and to evaluate the DL architecture in differentiation between benign and malignant breast tumors. We construct a two-layer DL architecture for SWE feature extraction, comprising the point-wise gated Boltzmann machine (PGBM) and the restricted Boltzmann machine (RBM). The PGBM contains task-relevant and task-irrelevant hidden units, and the task-relevant units are connected to the RBM. Experimental evaluation was performed with five-fold cross-validation on a set of 227 SWE images, 135 of benign tumors and 92 of malignant tumors, from 121 patients. The features learned with our DL architecture were compared with the statistical features quantifying image intensity and texture. Results showed that the DL features achieved better classification performance with an accuracy of 93.4%, a sensitivity of 88.6%, a specificity of 97.1%, and an area under the receiver operating characteristic curve of 0.947. The DL-based method integrates feature learning with feature selection on SWE. It may potentially be used in clinical computer-aided diagnosis of breast cancer. Copyright © 2016 Elsevier B.V. All rights reserved.
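
    A sketch only: scikit-learn provides no point-wise gated Boltzmann machine, so a plain Bernoulli RBM stands in here for the feature-learning layer, followed by a logistic classifier; the binarized inputs simulate SWE image patches.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import BernoulliRBM
    from sklearn.pipeline import Pipeline

    rng = np.random.default_rng(4)
    X = (rng.random((227, 400)) > 0.5).astype(float)  # 227 binarized "images"
    y = np.r_[np.zeros(135), np.ones(92)]             # benign vs malignant

    model = Pipeline([("rbm", BernoulliRBM(n_components=64, learning_rate=0.05,
                                           n_iter=20, random_state=4)),
                      ("clf", LogisticRegression(max_iter=1000))])
    print(cross_val_score(model, X, y, cv=5, scoring="accuracy").mean())
    ```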

  3. Prediction of road traffic death rate using neural networks optimised by genetic algorithm.

    PubMed

    Jafari, Seyed Ali; Jahandideh, Sepideh; Jahandideh, Mina; Asadabadi, Ebrahim Barzegari

    2015-01-01

    Road traffic injuries (RTIs) are recognised as a major public health problem at global, regional and national levels. Prediction of the road traffic death rate will therefore be helpful in its management. On this basis, we used an artificial neural network model optimised through a genetic algorithm to predict mortality. In this study, a five-fold cross-validation procedure on a data set containing a total of 178 countries was used to verify the performance of the models. The best-fit model was selected according to the root mean square error (RMSE). The genetic algorithm, which had not previously been applied to mortality prediction to this extent, showed high performance; the lowest RMSE obtained was 0.0808. Such satisfactory results can be attributed to the use of the genetic algorithm as a powerful optimiser that selects the best input feature set to be fed into the neural networks. Seven factors were identified with high accuracy as the most influential on the road traffic mortality rate. The results indicate that our model is very promising and may play a useful role in developing a better method for assessing the influence of road traffic mortality risk factors.

  4. Deep learning for brain tumor classification

    NASA Astrophysics Data System (ADS)

    Paul, Justin S.; Plassard, Andrew J.; Landman, Bennett A.; Fabbri, Daniel

    2017-03-01

    Recent research has shown that deep learning methods have performed well on supervised machine-learning image classification tasks. The purpose of this study is to apply deep learning methods to classify brain images with different tumor types: meningioma, glioma, and pituitary. A dataset was publicly released containing 3,064 T1-weighted contrast enhanced MRI (CE-MRI) brain images from 233 patients with either meningioma, glioma, or pituitary tumors split across axial, coronal, or sagittal planes. This research focuses on the 989 axial images from 191 patients in order to avoid confusing the neural networks with three different planes containing the same diagnosis. Two types of neural networks were used in classification: fully connected and convolutional neural networks. Within these two categories, further tests were computed via the augmentation of the original 512×512 axial images. Training neural networks on the axial data proved accurate, with an average five-fold cross-validation accuracy of 91.43% for the best-trained network. This result demonstrates that a more general method (i.e. deep learning) can outperform specialized methods that require image dilation and ring-forming subregions on tumors.
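
    A hypothetical sketch (assuming PyTorch is available) of a small convolutional network for the three-way tumor classification described above, with single-channel slices downsampled to 64×64 for brevity; the paper's actual architecture is not reproduced.

    ```python
    import torch
    import torch.nn as nn

    class TumorCNN(nn.Module):
        def __init__(self, n_classes=3):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
            self.classifier = nn.Linear(16 * 16 * 16, n_classes)

        def forward(self, x):
            x = self.features(x)                  # -> (N, 16, 16, 16)
            return self.classifier(x.flatten(1))  # logits over 3 tumor types

    model = TumorCNN()
    batch = torch.randn(4, 1, 64, 64)             # four fake axial slices
    print(model(batch).shape)                     # torch.Size([4, 3])
    ```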

  5. Sorting protein decoys by machine-learning-to-rank

    PubMed Central

    Jing, Xiaoyang; Wang, Kai; Lu, Ruqian; Dong, Qiwen

    2016-01-01

    Much progress has been made in protein structure prediction during the last few decades. As the predicted models can span a broad accuracy spectrum, the accuracy of quality estimation becomes one of the key elements of successful protein structure prediction. Over the past years, a number of methods have been developed to address this issue, and these methods can be roughly divided into three categories: single-model methods, clustering-based methods and quasi-single-model methods. In this study, we first develop a single-model method, MQAPRank, based on a learning-to-rank algorithm, and then implement a quasi-single-model method, Quasi-MQAPRank. The proposed methods are benchmarked on the 3DRobot and CASP11 datasets. Five-fold cross-validation on the 3DRobot dataset shows that the proposed single-model method outperforms the other methods whose outputs are taken as its features, and that the quasi-single-model method can further enhance performance. On the CASP11 dataset, the proposed methods also perform well compared with other leading methods in the corresponding categories. In particular, the Quasi-MQAPRank method achieves considerable performance on the CASP11 Best150 dataset. PMID:27530967

  6. e-Bitter: Bitterant Prediction by the Consensus Voting From the Machine-learning Methods

    NASA Astrophysics Data System (ADS)

    Zheng, Suqing; Jiang, Mengying; Zhao, Chengwei; Zhu, Rui; Hu, Zhicheng; Xu, Yong; Lin, Fu

    2018-03-01

    In-silico bitterant prediction has received considerable attention because experimental screening of bitterants is expensive and laborious. In this work, we collect a fully experimental dataset containing 707 bitterants and 592 non-bitterants, which is distinct from the fully or partially hypothetical non-bitterant datasets used in previous works. Based on this experimental dataset, we harness consensus votes from multiple machine-learning methods (e.g., deep learning) combined with molecular fingerprints to build bitter/bitterless classification models with five-fold cross-validation, which are further inspected by the Y-randomization test and applicability domain analysis. One of the best consensus models affords an accuracy, precision, specificity, sensitivity, F1-score, and Matthews correlation coefficient (MCC) of 0.929, 0.918, 0.898, 0.954, 0.936, and 0.856, respectively, on our test set. For automatic prediction of bitterants, a graphical program, “e-Bitter”, is developed so that users can obtain predictions with a simple mouse click. To the best of our knowledge, this is the first time a consensus model has been adopted for bitterant prediction, and the first free stand-alone software of this kind for experimental food scientists.

  7. Sorting protein decoys by machine-learning-to-rank.

    PubMed

    Jing, Xiaoyang; Wang, Kai; Lu, Ruqian; Dong, Qiwen

    2016-08-17

    Much progress has been made in protein structure prediction during the last few decades. As the predicted models can span a broad accuracy spectrum, the accuracy of quality estimation becomes one of the key elements of successful protein structure prediction. Over the past years, a number of methods have been developed to address this issue, and these methods can be roughly divided into three categories: single-model methods, clustering-based methods and quasi-single-model methods. In this study, we first develop a single-model method, MQAPRank, based on a learning-to-rank algorithm, and then implement a quasi-single-model method, Quasi-MQAPRank. The proposed methods are benchmarked on the 3DRobot and CASP11 datasets. Five-fold cross-validation on the 3DRobot dataset shows that the proposed single-model method outperforms the other methods whose outputs are taken as its features, and that the quasi-single-model method can further enhance performance. On the CASP11 dataset, the proposed methods also perform well compared with other leading methods in the corresponding categories. In particular, the Quasi-MQAPRank method achieves considerable performance on the CASP11 Best150 dataset.

  8. Inter-rater and intra-rater reliability of the Bahasa Melayu version of Rose Angina Questionnaire.

    PubMed

    Hassan, N B; Choudhury, S R; Naing, L; Conroy, R M; Rahman, A R A

    2007-01-01

    The objective of the study is to translate the Rose Questionnaire (RQ) into a Bahasa Melayu version, adapt it cross-culturally, and measure its inter-rater and intra-rater reliability. This cross-sectional study was conducted in the respondents' homes or workplaces in Kelantan, Malaysia. One hundred respondents aged 30 and above with different socio-demographic status were interviewed for face validity. For each of inter-rater and intra-rater reliability, a sample of 150 respondents was interviewed. Inter-rater and intra-rater reliability were assessed by Cohen's kappa. The overall inter-rater agreement by the five pairs of interviewers at points one and two was 0.86, and intra-rater reliability by the five interviewers on the seven-item questionnaire at points one and two was 0.88, as measured by the kappa coefficient. The translated Malay version of the RQ demonstrated almost perfect inter-rater and intra-rater reliability; further validation of this translated questionnaire, such as sensitivity and specificity analysis, is highly recommended.
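
    A minimal illustration of the agreement statistic used above: Cohen's kappa computed on two hypothetical raters' binary questionnaire responses.

    ```python
    from sklearn.metrics import cohen_kappa_score

    rater_a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]  # hypothetical yes/no answers
    rater_b = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]
    print(f"kappa = {cohen_kappa_score(rater_a, rater_b):.2f}")
    ```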

  9. Fingerprinting stress: stylolite and calcite twinning paleopiezometry reveal the complexity of stress distribution during the growth of the Monte Nero anticline (Apennines, Italy).

    NASA Astrophysics Data System (ADS)

    Beaudoin, Nicolas; Koehn, Daniel; Lacombe, Olivier; Lecouty, Alexandre; Billi, Andrea; Aharonov, Einat; Parlangeau, Camille

    2016-04-01

    This contribution presents for the first time how quantitative stress estimates can be derived by combining calcite twinning and stylolite roughness stress fingerprinting techniques in a structure that is part of a complex fold-and-thrust belt. We report a high-resolution deformation and stress history experienced by Meso-Cenozoic limestone strata in the overturned Monte Nero Anticline during its late Miocene-Pliocene growth in the Umbria-Marche Arcuate Ridge (northern Apennines, Italy). New methodological developments enable easier use of the inversion technique for sedimentary and tectonic stylolite roughness. A stylolite-fracture network developed during layer-parallel shortening (LPS), as well as during and after folding. Stress fingerprinting shows how stress built up in the sedimentary strata during LPS, with differential stress varying before folding around a value of 50 MPa. The stress regime oscillated between strike-slip and compressional during LPS and became transiently extensional in the limbs of the developing fold, owing to a coeval increase of vertical stress related to local burial and a decrease of maximum horizontal stress related to hinge development, before ultimately becoming strike-slip again during late-stage fold tightening. Our case study shows that stress fingerprinting is possible and that this novel method can be used to unravel complex temporal relationships within evolving regional orogenic stresses. Beyond the regional implications, this study validates our approach as an exciting new toolbox for high-resolution stress fingerprinting in basins and orogens.

  10. Application of proteomics in the discovery of candidate protein biomarkers in a Diabetes Autoantibody Standardization Program (DASP) sample subset

    PubMed Central

    Metz, Thomas O.; Qian, Wei-Jun; Jacobs, Jon M.; Gritsenko, Marina A.; Moore, Ronald J.; Polpitiya, Ashoka D.; Monroe, Matthew E.; Camp, David G.; Mueller, Patricia W.; Smith, Richard D.

    2009-01-01

    Novel biomarkers of type 1 diabetes must be identified and validated in initial, exploratory studies before they can be assessed in proficiency evaluations. Currently, untargeted “-omics” approaches are under-utilized in profiling studies of clinical samples. This report describes the evaluation of capillary liquid chromatography (LC) coupled with mass spectrometry (MS) in a pilot proteomic analysis of human plasma and serum from a subset of control and type 1 diabetic individuals enrolled in the Diabetes Autoantibody Standardization Program with the goal of identifying candidate biomarkers of type 1 diabetes. Initial high-resolution capillary LC-MS/MS experiments were performed to augment an existing plasma peptide database, while subsequent LC-FTICR studies identified quantitative differences in the abundance of plasma proteins. Analysis of LC-FTICR proteomic data identified five candidate protein biomarkers of type 1 diabetes. Alpha-2-glycoprotein 1 (zinc), corticosteroid-binding globulin, and lumican were 2-fold up-regulated in type 1 diabetic samples relative to control samples, whereas clusterin and serotransferrin were 2-fold up-regulated in control samples relative to type 1 diabetic samples. Observed perturbations in the levels of all five proteins are consistent with the metabolic aberrations found in type 1 diabetes. While the discovery of these candidate protein biomarkers of type 1 diabetes is encouraging, follow up studies are required for validation in a larger population of individuals and for determination of laboratory-defined sensitivity and specificity values using blinded samples. PMID:18092746

  11. Application of proteomics in the discovery of candidate protein biomarkers in a diabetes autoantibody standardization program sample subset.

    PubMed

    Metz, Thomas O; Qian, Wei-Jun; Jacobs, Jon M; Gritsenko, Marina A; Moore, Ronald J; Polpitiya, Ashoka D; Monroe, Matthew E; Camp, David G; Mueller, Patricia W; Smith, Richard D

    2008-02-01

    Novel biomarkers of type 1 diabetes must be identified and validated in initial, exploratory studies before they can be assessed in proficiency evaluations. Currently, untargeted "-omics" approaches are underutilized in profiling studies of clinical samples. This report describes the evaluation of capillary liquid chromatography (LC) coupled with mass spectrometry (MS) in a pilot proteomic analysis of human plasma and serum from a subset of control and type 1 diabetic individuals enrolled in the Diabetes Autoantibody Standardization Program, with the goal of identifying candidate biomarkers of type 1 diabetes. Initial high-resolution capillary LC-MS/MS experiments were performed to augment an existing plasma peptide database, while subsequent LC-FTICR studies identified quantitative differences in the abundance of plasma proteins. Analysis of LC-FTICR proteomic data identified five candidate protein biomarkers of type 1 diabetes. alpha-2-Glycoprotein 1 (zinc), corticosteroid-binding globulin, and lumican were 2-fold up-regulated in type 1 diabetic samples relative to control samples, whereas clusterin and serotransferrin were 2-fold up-regulated in control samples relative to type 1 diabetic samples. Observed perturbations in the levels of all five proteins are consistent with the metabolic aberrations found in type 1 diabetes. While the discovery of these candidate protein biomarkers of type 1 diabetes is encouraging, follow up studies are required for validation in a larger population of individuals and for determination of laboratory-defined sensitivity and specificity values using blinded samples.

  12. Protein-protein interaction inference based on semantic similarity of Gene Ontology terms.

    PubMed

    Zhang, Shu-Bo; Tang, Qiang-Rong

    2016-07-21

    Identifying protein-protein interactions is important in molecular biology. Experimental methods for this task have their limitations, and computational approaches have attracted increasing attention from the biological community. The semantic similarity derived from the Gene Ontology (GO) annotation has been regarded as one of the most powerful indicators for protein interaction. However, conventional methods based on GO similarity fail to take advantage of the specificity of GO terms in the ontology graph. We propose a GO-based method to predict protein-protein interaction by integrating different kinds of similarity measures derived from the intrinsic structure of the GO graph. We extended five existing methods to derive semantic similarity measures from the descending part of two GO terms in the GO graph, then adopted a feature integration strategy to combine both the ascending and the descending similarity scores derived from the three sub-ontologies to construct various kinds of features to characterize each protein pair. Support vector machines (SVM) were employed as discriminative classifiers, and five-fold cross-validation experiments were conducted on both human and yeast protein-protein interaction datasets to evaluate the performance of different kinds of integrated features. The experimental results suggest that the best performance is achieved by the feature that combines information from both the ascending and the descending parts of the three ontologies. Our method is appealing for effective prediction of protein-protein interaction. Copyright © 2016 Elsevier Ltd. All rights reserved.
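
    A toy illustration (not the paper's measures) of the ascending/descending idea: an "ascending" similarity from shared ancestors and a "descending" similarity from shared descendants of two terms in a miniature GO-like DAG, combined into one feature. The mini-ontology is invented.

    ```python
    from collections import defaultdict

    # Invented mini-ontology: term -> set of parent terms
    parents = {"t1": set(), "t2": {"t1"}, "t3": {"t1"},
               "t4": {"t2"}, "t5": {"t2", "t3"}, "t6": {"t4", "t5"}}
    children = defaultdict(set)
    for term, ps in parents.items():
        for p in ps:
            children[p].add(term)

    def closure(term, edges):
        """All terms reachable from `term` via `edges`, term included."""
        seen, stack = {term}, [term]
        while stack:
            for nxt in edges.get(stack.pop(), set()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def jaccard(a, b):
        return len(a & b) / len(a | b)

    up = jaccard(closure("t4", parents), closure("t5", parents))      # ascending
    down = jaccard(closure("t4", children), closure("t5", children))  # descending
    print(f"ascending={up:.2f} descending={down:.2f} combined={(up + down) / 2:.2f}")
    ```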

  13. Using trend templates in a neonatal seizure algorithm improves detection of short seizures in a foetal ovine model.

    PubMed

    Zwanenburg, Alex; Andriessen, Peter; Jellema, Reint K; Niemarkt, Hendrik J; Wolfs, Tim G A M; Kramer, Boris W; Delhaas, Tammo

    2015-03-01

    Seizures below one minute in duration are difficult to assess correctly using seizure detection algorithms. We aimed to improve neonatal detection algorithm performance for short seizures through the use of trend templates for seizure onset and end. Bipolar EEG were recorded within a transiently asphyxiated ovine model at 0.7 gestational age, a common experimental model for studying brain development in humans of 30-34 weeks of gestation. Transient asphyxia led to electrographic seizures within 6-8 h. A total of 3159 seizures, 2386 shorter than one minute, were annotated in 1976 h-long EEG recordings from 17 foetal lambs. To capture EEG characteristics, five features, sensitive to seizures, were calculated and used to derive trend information. Feature values and trend information were used as input for support vector machine classification and subsequently post-processed. Performance metrics, calculated after post-processing, were compared between analyses with and without employing trend information. Detector performance was assessed after five-fold cross-validation conducted ten times with random splits. The use of trend templates for seizure onset and end in a neonatal seizure detection algorithm significantly improves the correct detection of short seizures using two-channel EEG recordings from 54.3% (52.6-56.1) to 59.5% (58.5-59.9) at FDR 2.0 (median (range); p < 0.001, Wilcoxon signed rank test). Using trend templates might therefore aid in detection of short seizures by EEG monitoring at the NICU.

  14. CarcinoPred-EL: Novel models for predicting the carcinogenicity of chemicals using molecular fingerprints and ensemble learning methods.

    PubMed

    Zhang, Li; Ai, Haixin; Chen, Wen; Yin, Zimo; Hu, Huan; Zhu, Junfeng; Zhao, Jian; Zhao, Qi; Liu, Hongsheng

    2017-05-18

    Carcinogenicity refers to a highly toxic end point of certain chemicals, and has become an important issue in the drug development process. In this study, three novel ensemble classification models, namely Ensemble SVM, Ensemble RF, and Ensemble XGBoost, were developed to predict carcinogenicity of chemicals using seven types of molecular fingerprints and three machine learning methods based on a dataset containing 1003 diverse compounds with rat carcinogenicity. Among these three models, Ensemble XGBoost is found to be the best, giving an average accuracy of 70.1 ± 2.9%, sensitivity of 67.0 ± 5.0%, and specificity of 73.1 ± 4.4% in five-fold cross-validation and an accuracy of 70.0%, sensitivity of 65.2%, and specificity of 76.5% in external validation. In comparison with some recent methods, the ensemble models outperform some machine learning-based approaches and yield equal accuracy and higher specificity but lower sensitivity than rule-based expert systems. It is also found that the ensemble models could be further improved if more data were available. As an application, the ensemble models are employed to discover potential carcinogens in the DrugBank database. The results indicate that the proposed models are helpful in predicting the carcinogenicity of chemicals. A web server called CarcinoPred-EL has been built for these models (http://ccsipb.lnu.edu.cn/toxicity/CarcinoPred-EL/).
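
    A hedged sketch of the per-fold accuracy/sensitivity/specificity bookkeeping reported above, run on simulated fingerprint data with a gradient-boosted ensemble standing in for XGBoost (which is not assumed to be installed).

    ```python
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import StratifiedKFold

    rng = np.random.default_rng(5)
    X = rng.integers(0, 2, size=(1003, 128)).astype(float)  # stand-in fingerprints
    y = rng.integers(0, 2, size=1003)                       # simulated labels

    acc, sens, spec = [], [], []
    for tr, te in StratifiedKFold(5, shuffle=True, random_state=5).split(X, y):
        clf = GradientBoostingClassifier().fit(X[tr], y[tr])
        tn, fp, fn, tp = confusion_matrix(y[te], clf.predict(X[te])).ravel()
        acc.append((tp + tn) / (tp + tn + fp + fn))
        sens.append(tp / (tp + fn))
        spec.append(tn / (tn + fp))
    print(np.mean(acc), np.mean(sens), np.mean(spec))
    ```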

  15. Identification of S-glutathionylation sites in species-specific proteins by incorporating five sequence-derived features into the general pseudo-amino acid composition.

    PubMed

    Zhao, Xiaowei; Ning, Qiao; Ai, Meiyue; Chai, Haiting; Yang, Guifu

    2016-06-07

    As a selective and reversible protein post-translational modification, S-glutathionylation generates mixed disulfides between glutathione (GSH) and cysteine residues, and plays an important role in regulating protein activity, stability, and redox regulation. To fully understand S-glutathionylation mechanisms, identification of substrates and specific S-glutathionylated sites is crucial. Experimental identification of S-glutathionylated sites is labor-intensive and time-consuming, so establishing an effective computational method is highly desirable due to its convenience and speed. Therefore, in this study, a new bioinformatics tool named SSGlu (Species-Specific identification of Protein S-glutathionylation Sites) was developed to identify species-specific protein S-glutathionylated sites, utilizing support vector machines that combine multiple sequence-derived features with a two-step feature selection. By 5-fold cross-validation, the performance of SSGlu was measured with an AUC of 0.8105 and 0.8041 for Homo sapiens and Mus musculus, respectively. Additionally, SSGlu was compared with the existing methods, and its higher MCC and AUC demonstrated that SSGlu is very promising for predicting S-glutathionylated sites. Furthermore, a site-specific analysis showed that S-glutathionylation correlates intimately with the features derived from its surrounding sites. The conclusions derived from this study might help in further understanding the S-glutathionylation mechanism and guide the related experimental validation. For public access, SSGlu is freely accessible at http://59.73.198.144:8080/SSGlu/. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. Design of fuzzy classifier for diabetes disease using Modified Artificial Bee Colony algorithm.

    PubMed

    Beloufa, Fayssal; Chikh, M A

    2013-10-01

    In this study, diagnosis of diabetes, one of the most important diseases, is conducted with artificial intelligence techniques. We have proposed a novel Artificial Bee Colony (ABC) algorithm in which a mutation operator is added to the Artificial Bee Colony to improve its performance. When the current best solution cannot be updated, a blended crossover operator (BLX-α) from genetic algorithms is applied in order to enhance the diversity of ABC without compromising the solution quality. This modified version of ABC is used as a new tool to automatically create and optimize the membership functions and rule base directly from data. The diabetes dataset used in our work is taken from the UCI machine learning repository. The performance of the proposed method is evaluated through classification rate, sensitivity and specificity values using the 10-fold cross-validation method. The classification rate obtained by our method is 84.21%, which is very promising compared with previous research in the literature on the same problem. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
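
    A minimal BLX-α (blended crossover) sketch of the operator described above; α = 0.5 is a common default and an assumption here, as is the two-gene example.

    ```python
    import numpy as np

    def blx_alpha(parent1, parent2, alpha=0.5, rng=None):
        """Sample a child uniformly from an interval blended around the parents."""
        rng = rng or np.random.default_rng()
        p1, p2 = np.asarray(parent1, float), np.asarray(parent2, float)
        lo, hi = np.minimum(p1, p2), np.maximum(p1, p2)
        span = hi - lo
        # Each gene is drawn from [min - alpha*range, max + alpha*range]
        return rng.uniform(lo - alpha * span, hi + alpha * span)

    child = blx_alpha([0.2, 1.0], [0.6, 0.4], rng=np.random.default_rng(6))
    print(child)
    ```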

  17. Natural Language Processing Techniques for Extracting and Categorizing Finding Measurements in Narrative Radiology Reports.

    PubMed

    Sevenster, M; Buurman, J; Liu, P; Peters, J F; Chang, P J

    2015-01-01

    Accumulating quantitative outcome parameters may contribute to constructing a healthcare organization in which outcomes of clinical procedures are reproducible and predictable. In imaging studies, measurements are the principal category of quantitative parameters. The purpose of this work is to develop and evaluate two natural language processing engines that extract finding and organ measurements from narrative radiology reports and to categorize extracted measurements by their "temporality". The measurement extraction engine is developed as a set of regular expressions. The engine was evaluated against a manually created ground truth. Automated categorization of measurement temporality is defined as a machine learning problem. A ground truth was manually developed based on a corpus of radiology reports. A maximum entropy model was created using features that characterize the measurement itself and its narrative context. The model was evaluated in a ten-fold cross validation protocol. The measurement extraction engine has precision 0.994 and recall 0.991. Accuracy of the measurement classification engine is 0.960. The work contributes to machine understanding of radiology reports and may find application in software applications that process medical data.
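
    An illustrative regular expression (an assumption, not the engine's actual pattern) for pulling finding measurements such as "3.4 x 2.1 cm" out of narrative report text.

    ```python
    import re

    MEASUREMENT = re.compile(
        r"(?P<dims>\d+(?:\.\d+)?(?:\s*[x×]\s*\d+(?:\.\d+)?)*)\s*"
        r"(?P<unit>mm|cm)\b", re.IGNORECASE)

    report = ("There is a 3.4 x 2.1 cm mass in the right upper lobe, "
              "previously 28 mm.")
    for m in MEASUREMENT.finditer(report):
        print(m.group("dims"), m.group("unit"))  # "3.4 x 2.1 cm", then "28 mm"
    ```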

  18. Benign-malignant mass classification in mammogram using edge weighted local texture features

    NASA Astrophysics Data System (ADS)

    Rabidas, Rinku; Midya, Abhishek; Sadhu, Anup; Chakraborty, Jayasree

    2016-03-01

    This paper introduces novel Discriminative Robust Local Binary Pattern (DRLBP) and Discriminative Robust Local Ternary Pattern (DRLTP) for the classification of mammographic masses as benign or malignant. Mass is one of the common, however, challenging evidence of breast cancer in mammography and diagnosis of masses is a difficult task. Since DRLBP and DRLTP overcome the drawbacks of Local Binary Pattern (LBP) and Local Ternary Pattern (LTP) by discriminating a brighter object against the dark background and vice-versa, in addition to the preservation of the edge information along with the texture information, several edge-preserving texture features are extracted, in this study, from DRLBP and DRLTP. Finally, a Fisher Linear Discriminant Analysis method is incorporated with discriminating features, selected by stepwise logistic regression method, for the classification of benign and malignant masses. The performance characteristics of DRLBP and DRLTP features are evaluated using a ten-fold cross-validation technique with 58 masses from the mini-MIAS database, and the best result is observed with DRLBP having an area under the receiver operating characteristic curve of 0.982.

  19. Longitudinal Study-Based Dementia Prediction for Public Health

    PubMed Central

    Kim, HeeChel; Chun, Hong-Woo; Kim, Seonho; Coh, Byoung-Youl; Kwon, Oh-Jin; Moon, Yeong-Ho

    2017-01-01

    The issue of public health in Korea has attracted significant attention given the aging of the country’s population, which has created many types of social problems. The approach proposed in this article aims to address dementia, one of the most significant symptoms of aging and a public health care issue in Korea. The Korean National Health Insurance Service Senior Cohort Database contains personal medical data of every citizen in Korea. There are many different medical history patterns between individuals with dementia and normal controls. The approach used in this study involved examination of personal medical history features from personal disease history, sociodemographic data, and personal health examinations to develop a prediction model. The prediction model used a support-vector machine learning technique to perform a 10-fold cross-validation analysis. The experimental results demonstrated promising performance (80.9% F-measure). The proposed approach supported the significant influence of personal medical history features during an optimal observation period. It is anticipated that a biomedical “big data”-based disease prediction model may assist in diagnosing any disease more accurately. PMID:28867810

  20. A hybrid feature selection method using multiclass SVM for diagnosis of erythemato-squamous disease

    NASA Astrophysics Data System (ADS)

    Maryam, Setiawan, Noor Akhmad; Wahyunggoro, Oyas

    2017-08-01

    The diagnosis of erythemato-squamous disease is a complex problem, and the disease is difficult to detect in dermatology. Besides that, it is a major cause of skin cancer. Data mining implementation in the medical field helps experts to diagnose precisely, accurately, and inexpensively. In this research, we use data mining techniques to develop a diagnosis model based on multiclass SVM with a novel hybrid feature selection method to diagnose erythemato-squamous disease. Our hybrid feature selection method, named ChiGA (Chi Square and Genetic Algorithm), uses the advantages of filter and wrapper methods to select the optimal feature subset from the original features. Chi square is used as a filter method to remove redundant features, and GA is used as a wrapper method to select the ideal feature subset, with SVM as the classifier. Experiments were performed with 10-fold cross-validation on the erythemato-squamous disease dataset taken from the University of California Irvine (UCI) machine learning database. The experimental results show that the proposed multiclass SVM model with Chi Square and GA can give an optimum feature subset: 18 optimum features with 99.18% accuracy.
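
    The filter-then-wrapper pipeline can be sketched compactly. Below is a bare-bones illustration in Python, with a chi-square filter keeping 20 features followed by a deliberately minimal genetic algorithm (truncation selection, one-point crossover, bit-flip mutation); the dataset generator, cutoff, population size and rates are placeholders, not the authors' ChiGA settings:

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.feature_selection import SelectKBest, chi2
        from sklearn.model_selection import cross_val_score
        from sklearn.svm import SVC

        rng = np.random.default_rng(0)

        # Stand-in for the 34-feature UCI dermatology data (chi2 needs non-negative values).
        X, y = make_classification(n_samples=366, n_features=34, n_informative=12,
                                   n_classes=3, n_clusters_per_class=1, random_state=0)
        X = X - X.min(axis=0)

        # Filter stage: chi-square keeps the 20 highest-scoring features.
        keep = SelectKBest(chi2, k=20).fit(X, y).get_support()
        X = X[:, keep]
        n = X.shape[1]

        def fitness(mask):
            """10-fold CV accuracy of an SVM on the feature subset encoded by a bitmask."""
            if mask.sum() == 0:
                return 0.0
            return cross_val_score(SVC(), X[:, mask.astype(bool)], y, cv=10).mean()

        # Wrapper stage: a bare-bones GA over feature bitmasks.
        pop = rng.integers(0, 2, size=(20, n))
        for _ in range(15):
            scores = np.array([fitness(ind) for ind in pop])
            parents = pop[np.argsort(scores)[-10:]]              # truncation selection
            cut = rng.integers(1, n, size=10)
            children = np.array([np.concatenate([parents[i][:c], parents[(i + 1) % 10][c:]])
                                 for i, c in enumerate(cut)])    # one-point crossover
            flip = rng.random(children.shape) < 0.05             # bit-flip mutation
            children[flip] = 1 - children[flip]
            pop = np.vstack([parents, children])

        best = pop[np.argmax([fitness(ind) for ind in pop])]
        # Indices refer to the chi-square-filtered feature set.
        print("selected features:", np.flatnonzero(best), "CV accuracy:", fitness(best))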

  1. A support vector machine for predicting defibrillation outcomes from waveform metrics.

    PubMed

    Howe, Andrew; Escalona, Omar J; Di Maio, Rebecca; Massot, Bertrand; Cromie, Nick A; Darragh, Karen M; Adgey, Jennifer; McEneaney, David J

    2014-03-01

    Algorithms to predict shock success based on VF waveform metrics could significantly enhance resuscitation by optimising the timing of defibrillation. The aim was to investigate robust methods of predicting defibrillation success in VF cardiac arrest patients using a support vector machine (SVM) optimisation approach. Frequency-domain (AMSA, dominant frequency and median frequency) and time-domain (slope and RMS amplitude) VF waveform metrics were calculated in a 4.1 s window prior to defibrillation. Conventional prediction test validity was assessed for each waveform parameter, with AUC>0.6 used as the criterion for inclusion as a corroborative attribute processed by the SVM classification model. The latter used a Gaussian radial-basis-function (RBF) kernel, and the error penalty factor C was fixed at 1. A two-fold cross-validation resampling technique was employed. A total of 41 patients had 115 defibrillation instances. The AMSA, slope and RMS waveform metrics passed test validation with AUC>0.6 for predicting termination of VF and return-to-organised rhythm. Predictive accuracy of the optimised SVM design for termination of VF was 81.9% (± 1.24 SD); positive and negative predictivity were respectively 84.3% (± 1.98 SD) and 77.4% (± 1.24 SD); sensitivity and specificity were 87.6% (± 2.69 SD) and 71.6% (± 9.38 SD) respectively. AMSA, slope and RMS were the best VF waveform frequency- and time-domain predictors of termination of VF according to the test validity assessment. This a priori knowledge can be used in a simplified, optimised SVM design that combines the predictive attributes of these VF waveform metrics for improved prediction accuracy and generalisation performance, without requiring the definition of any threshold value on waveform metrics. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
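
    The two-stage design (univariate AUC screening, then an RBF SVM with C fixed at 1 under two-fold cross-validation) is easy to mirror in outline. A sketch with scikit-learn on synthetic stand-in data; the metric names and generator are placeholders:

        from sklearn.datasets import make_classification
        from sklearn.metrics import roc_auc_score
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        # Stand-in for per-shock VF waveform metrics and shock outcomes.
        names = ["AMSA", "dominant_freq", "median_freq", "slope", "RMS"]
        X, y = make_classification(n_samples=115, n_features=5, n_informative=3, random_state=0)

        # Screening stage: keep metrics whose univariate AUC exceeds 0.6 (either orientation).
        keep = [j for j in range(X.shape[1])
                if max(roc_auc_score(y, X[:, j]), roc_auc_score(y, -X[:, j])) > 0.6]
        print("retained metrics:", [names[j] for j in keep])

        # SVM stage: Gaussian RBF kernel, error penalty fixed at C=1, two-fold CV.
        svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
        acc = cross_val_score(svm, X[:, keep], y, cv=2)
        print(f"two-fold CV accuracy: {acc.mean():.3f}")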

  2. Modelling by partial least squares the relationship between the HPLC mobile phases and analytes on phenyl column.

    PubMed

    Markopoulou, Catherine K; Kouskoura, Maria G; Koundourellis, John E

    2011-06-01

    Twenty-five descriptors and 61 structurally different analytes have been used in a partial least squares (PLS) projection to latent structures technique in order to study their chromatographic interaction mechanism on a phenyl column. In the model, 240 retention times of the analytes at different % MeOH mobile-phase concentrations, expressed as the Y variable (log k), were correlated with their most important theoretical structural or molecular descriptors. The goodness-of-fit was estimated by the coefficient of multiple determination r(2) (0.919) and the root mean square error of estimation (RMSEE=0.1283), with a predictive ability (Q(2)) of 0.901. The model was further validated using cross-validation (CV), by 20 response permutations (r(2) (0.0, 0.0146), Q(2) (0.0, -0.136)) and by external prediction. The contribution of certain interaction mechanisms between the analytes, the mobile phase and the column, whether proportional or counterbalancing, is also studied. In evaluating the influence on Y of every variable in a PLS model, the VIP (variable importance in the projection) plot provides evidence that lipophilicity (expressed as Log D and Log P), polarizability, refractivity and the eluting power of the mobile phase are dominant in the retention mechanism on a phenyl column. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
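
    Q(2) is the cross-validated analogue of r(2): one minus the prediction error sum of squares (PRESS) on held-out folds divided by the total sum of squares. A minimal sketch with scikit-learn's PLS implementation on synthetic descriptor data (matrix sizes, component count and fold count are placeholders):

        import numpy as np
        from sklearn.cross_decomposition import PLSRegression
        from sklearn.model_selection import cross_val_predict

        rng = np.random.default_rng(1)

        # Stand-in for 240 (analyte, mobile-phase) observations: 25 descriptors -> log k.
        X = rng.normal(size=(240, 25))
        y = X[:, :5] @ rng.normal(size=5) + 0.1 * rng.normal(size=240)

        pls = PLSRegression(n_components=3)
        y_cv = cross_val_predict(pls, X, y, cv=7).ravel()   # 7 CV groups, a common PLS convention

        press = np.sum((y - y_cv) ** 2)
        ss = np.sum((y - y.mean()) ** 2)
        print(f"Q2 = {1 - press / ss:.3f}")                 # cross-validated predictive ability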

  3. Can multi-subpopulation reference sets improve the genomic predictive ability for pigs?

    PubMed

    Fangmann, A; Bergfelder-Drüing, S; Tholen, E; Simianer, H; Erbe, M

    2015-12-01

    In most countries and for most livestock species, genomic evaluations are obtained from within-breed analyses. To achieve reliable breeding values, however, a sufficient reference sample size is essential. To increase this size, the use of multibreed reference populations for small populations is considered a suitable option in other species. Over decades, the separate breeding work of different pig breeding organizations in Germany has led to stratified subpopulations in the breed German Large White. Due to this fact and the limited number of Large White animals available in each organization, there was a pressing need to ascertain whether multi-subpopulation genomic prediction is superior to within-subpopulation prediction in pigs. Direct genomic breeding values were estimated with genomic BLUP for the trait "number of piglets born alive" using genotype data (Illumina Porcine 60K SNP BeadChip) from 2,053 German Large White animals from five different commercial pig breeding companies. To assess the prediction accuracy of within- and multi-subpopulation reference sets, a random 5-fold cross-validation with 20 replications was performed. The five subpopulations considered were only slightly differentiated from each other. However, the prediction accuracy of the multi-subpopulation approach was not better than that of the within-subpopulation evaluation, for which the predictive ability was already high. Reference sets composed of closely related multi-subpopulation sets performed better than sets of distantly related subpopulations, but not better than the within-subpopulation approach. Despite the low differentiation of the five subpopulations, the genetic connectedness between these subpopulations seems to be too small to improve the prediction accuracy by applying multi-subpopulation reference sets. Consequently, resources should be used for enlarging the reference population within each subpopulation, for example, by adding genotyped females.

  4. Discrimination of raw and processed Dipsacus asperoides by near infrared spectroscopy combined with least squares-support vector machine and random forests

    NASA Astrophysics Data System (ADS)

    Xin, Ni; Gu, Xiao-Feng; Wu, Hao; Hu, Yu-Zhu; Yang, Zhong-Lin

    2012-04-01

    Most herbal medicines can be processed to fulfill the different requirements of therapy. The purpose of this study was to discriminate between raw and processed Dipsacus asperoides, a common traditional Chinese medicine, based on their near infrared (NIR) spectra. Least squares-support vector machine (LS-SVM) and random forests (RF) were employed for full-spectrum classification. Three types of kernels, including the linear kernel, polynomial kernel and radial basis function (RBF) kernel, were checked for optimization of the LS-SVM model. For comparison, a linear discriminant analysis (LDA) model was built for classification, and the successive projections algorithm (SPA) was executed prior to building the LDA model to choose an appropriate subset of wavelengths. The three methods were applied to a dataset containing 40 raw herbs and 40 corresponding processed herbs. We ran 50 runs of 10-fold cross-validation to evaluate the models' performance. The performance of the LS-SVM with the RBF kernel (RBF LS-SVM) was better than that of the other two kernels. The RF, RBF LS-SVM and SPA-LDA models successfully classified all test samples. The mean error rates for the 50 runs of 10-fold cross-validation were 1.35% for RBF LS-SVM, 2.87% for RF, and 2.50% for SPA-LDA. The best classification results were obtained by using the LS-SVM with the RBF kernel, while RF was fast in training and making predictions.
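
    The "50 runs of 10-fold cross-validation" protocol maps directly onto repeated stratified k-fold splitting. A sketch comparing an RBF SVM (a plain stand-in for LS-SVM, which scikit-learn does not provide) with a random forest on synthetic spectra:

        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        # Stand-in for 80 NIR spectra (40 raw, 40 processed herbs).
        X, y = make_classification(n_samples=80, n_features=200, n_informative=10, random_state=0)

        cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=50, random_state=0)
        for name, clf in [("RBF SVM", make_pipeline(StandardScaler(), SVC(kernel="rbf"))),
                          ("random forest", RandomForestClassifier(random_state=0))]:
            err = 1 - cross_val_score(clf, X, y, cv=cv)
            print(f"{name}: mean error rate over 50x10-fold CV = {100 * err.mean():.2f}%")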

  5. Generative Topographic Mapping of Conformational Space.

    PubMed

    Horvath, Dragos; Baskin, Igor; Marcou, Gilles; Varnek, Alexandre

    2017-10-01

    Herein, Generative Topographic Mapping (GTM) was challenged to produce planar projections of the high-dimensional conformational space of complex molecules (the 1LE1 peptide). GTM is a probability-based mapping strategy, and its capacity to support property prediction models serves to objectively assess map quality (in terms of regression statistics). The properties to predict were total, non-bonded and contact energies, surface area and fingerprint darkness. Map building and selection were controlled by a previously introduced evolutionary strategy that was allowed to choose the best-suited conformational descriptors, with options including classical terms and novel atom-centric autocorrelograms. The latter condense interatomic distance patterns into descriptors of rather low dimensionality, yet precise enough to differentiate between close favorable contacts and atom clashes. A subset of 20 K conformers of the 1LE1 peptide, randomly selected from a pool of 2 M geometries (generated by the S4MPLE tool), was employed for map building and cross-validation of property regression models. The GTM build-up challenge reached robust three-fold cross-validated determination coefficients of Q² = 0.7…0.8 for all modeled properties. Mapping of the full 2 M conformer set produced intuitive and information-rich property landscapes. Functional and folding subspaces appear as well-separated zones, even though the RMSD with respect to the PDB structure was never used as a selection criterion for the maps. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  6. Predicting drug-induced liver injury in human with Naïve Bayes classifier approach.

    PubMed

    Zhang, Hui; Ding, Lan; Zou, Yi; Hu, Shui-Qing; Huang, Hai-Guo; Kong, Wei-Bao; Zhang, Ji

    2016-10-01

    Drug-induced liver injury (DILI) is one of the major safety concerns in drug development. Although various toxicological studies assessing DILI risk have been developed, these methods were not sufficient for predicting DILI in humans. Thus, developing new tools and approaches to better predict DILI risk in humans has become an important and urgent task. In this study, we aimed to develop a computational model for assessment of DILI risk using a larger-scale human dataset and a Naïve Bayes classifier. The established Naïve Bayes prediction model was evaluated by 5-fold cross-validation and an external test set. For the training set, the overall prediction accuracy of the 5-fold cross-validation was 94.0 %. The sensitivity, specificity, positive predictive value and negative predictive value were 97.1, 89.2, 93.5 and 95.1 %, respectively. For the test set, the concordance was 72.6 %, with a sensitivity of 72.5 %, specificity of 72.7 %, positive predictive value of 80.4 % and negative predictive value of 63.2 %. Furthermore, some important molecular descriptors related to DILI risk and some toxic/non-toxic fragments were identified. Thus, we hope the prediction model established here will be employed for the assessment of human DILI risk, and that the identified molecular descriptors and substructures will be taken into consideration in the design of new candidate compounds, to help medicinal chemists rationally select the chemicals with the best prospects to be effective and safe.
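
    The reported statistics all derive from the pooled confusion matrix of the 5-fold cross-validation. A sketch with scikit-learn, where binarized synthetic features stand in for structural fingerprints (the real dataset and descriptor set are not reproduced here):

        from sklearn.datasets import make_classification
        from sklearn.metrics import confusion_matrix
        from sklearn.model_selection import cross_val_predict
        from sklearn.naive_bayes import BernoulliNB

        # Stand-in for binary structural fingerprints with DILI-positive/negative labels.
        X, y = make_classification(n_samples=500, n_features=100, random_state=0)
        X = (X > 0).astype(int)                 # binarize into fingerprint-like bits

        pred = cross_val_predict(BernoulliNB(), X, y, cv=5)
        tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
        print(f"accuracy    {(tp + tn) / len(y):.3f}")
        print(f"sensitivity {tp / (tp + fn):.3f}   specificity {tn / (tn + fp):.3f}")
        print(f"PPV         {tp / (tp + fp):.3f}   NPV         {tn / (tn + fn):.3f}")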

  7. Diagnostic features of Alzheimer's disease extracted from PET sinograms

    NASA Astrophysics Data System (ADS)

    Sayeed, A.; Petrou, M.; Spyrou, N.; Kadyrov, A.; Spinks, T.

    2002-01-01

    Texture analysis of positron emission tomography (PET) images of the brain is a very difficult task, due to the poor signal-to-noise ratio. As a consequence, very few techniques can be implemented successfully. We use a new global analysis technique known as Trace transform triple features. This technique can be applied directly to the raw sinograms to distinguish patients with Alzheimer's disease (AD) from normal volunteers. FDG-PET images of 18 AD patients and 10 normal controls obtained from the same CTI ECAT-953 scanner were used in this study. The Trace transform triple feature technique was used to extract features that were invariant to scaling, translation and rotation (referred to as invariant features), as well as features that were sensitive to rotation but invariant to scaling and translation (referred to as sensitive features in this study). The features were used to classify the groups using discriminant function analysis. Cross-validation tests using stepwise discriminant function analysis showed that combining both sensitive and invariant features produced the best results when compared with the clinical diagnosis. Selecting the five best features produces an overall accuracy of 93%, with a sensitivity of 94% and a specificity of 90%. This is comparable with the classification accuracy achieved by Kippenhan et al (1992) using regional metabolic activity.

  8. Validating LES for Jet Aeroacoustics

    NASA Technical Reports Server (NTRS)

    Bridges, James

    2011-01-01

    Engineers charged with making jet aircraft quieter have long dreamed of being able to see exactly how turbulent eddies produce sound, and this dream is now coming true with the advent of large eddy simulation (LES). Two obvious challenges remain: validating the LES codes at the resolution required to see the fluid-acoustic coupling, and interpreting the massive datasets that result. This paper primarily addresses the former: the use of advanced experimental techniques, such as particle image velocimetry (PIV) and Raman and Rayleigh scattering, to validate the computer codes and procedures used to create LES solutions. It also addresses the latter problem by discussing which measures, critical for aeroacoustics, should be used in validating LES codes. These new diagnostic techniques deliver measurements and flow statistics of increasing sophistication and capability, but what of their accuracy? And what measures should be used in validation? This paper argues that the issue of accuracy be addressed by cross-facility and cross-disciplinary examination of modern datasets, along with increased reporting of internal quality checks in PIV analysis. Further, it argues that the appropriate validation metrics for aeroacoustic applications are increasingly complicated statistics that aeroacoustic theory has shown to be critical to flow-generated sound.

  9. The Vocal Cord Dysfunction Questionnaire: Validity and Reliability of the Persian Version.

    PubMed

    Ghaemi, Hamide; Khoddami, Seyyedeh Maryam; Soleymani, Zahra; Zandieh, Fariborz; Jalaie, Shohreh; Ahanchian, Hamid; Khadivi, Ehsan

    2017-12-25

    The aim of this study was to develop, validate, and assess the reliability of the Persian version of the Vocal Cord Dysfunction Questionnaire (VCDQ P ). The study design was a cross-sectional, cross-cultural survey. Forty-four patients with vocal fold dysfunction (VFD) and 40 healthy volunteers were recruited for the study. To assess content validity, the prefinal questions were given to 15 experts, who commented on whether each item was essential. Ten patients with VFD rated the importance of the VCDQ P items to establish face validity. Eighteen of the patients with VFD completed the VCDQ again 1 week later for test-retest reliability. To assess absolute reliability, the standard error of measurement and smallest detectable change were calculated. Concurrent validity was assessed by having 34 patients with VFD complete the Persian Chronic Obstructive Pulmonary Disease (COPD) Assessment Test (CAT). Discriminant validity was measured from 34 participants. The VCDQ was further validated by administering the questionnaire to 40 healthy volunteers. Validation of the VCDQ as a treatment outcome tool was conducted in 18 patients with VFD using pre- and posttreatment scores. The internal consistency was confirmed (Cronbach α = 0.78). The test-retest reliability was excellent (intraclass correlation coefficient = 0.97). The standard error of measurement and smallest detectable change values were acceptable (0.39 and 1.08, respectively). There was a significant correlation between the VCDQ P and the CAT total scores (P < 0.05). Discriminative validity showed a significant difference between groups. The VCDQ scores in patients with VFD before and after treatment were significantly different (P < 0.001). The VCDQ was cross-culturally adapted to Persian and demonstrated to be a valid and reliable self-administered questionnaire for the Persian-speaking population. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
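
    The reliability statistics reported here have compact closed forms. A sketch in Python on invented Likert-style scores, taking SEM as the standard deviation of the test-retest differences divided by the square root of 2 and SDC as 1.96 times the square root of 2 times SEM, which is one common convention (the paper does not state its exact formulas):

        import numpy as np

        def cronbach_alpha(items):
            """Cronbach's alpha for a (subjects x items) score matrix."""
            k = items.shape[1]
            return k / (k - 1) * (1 - items.var(axis=0, ddof=1).sum()
                                  / items.sum(axis=1).var(ddof=1))

        def sem_and_sdc(test, retest):
            """Standard error of measurement and smallest detectable change
            from test-retest totals (SEM = SD_diff / sqrt(2), SDC = 1.96 * sqrt(2) * SEM)."""
            sd_diff = np.std(test - retest, ddof=1)
            sem = sd_diff / np.sqrt(2)
            return sem, 1.96 * np.sqrt(2) * sem

        rng = np.random.default_rng(0)
        scores = rng.integers(1, 6, size=(44, 12)).astype(float)   # stand-in item responses
        print(f"alpha = {cronbach_alpha(scores):.2f}")

        test = rng.normal(50, 8, size=18)                          # stand-in test-retest totals
        retest = test + rng.normal(0, 1.5, size=18)
        sem, sdc = sem_and_sdc(test, retest)
        print(f"SEM = {sem:.2f}, SDC = {sdc:.2f}")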

  10. Using a cross correlation technique to refine the accuracy of the Failure Forecast Method: Application to Soufrière Hills volcano, Montserrat

    NASA Astrophysics Data System (ADS)

    Salvage, R. O.; Neuberg, J. W.

    2016-09-01

    Prior to many volcanic eruptions, an acceleration in seismicity has been observed, suggesting its potential as a forecasting tool. The Failure Forecast Method (FFM) relates an accelerating precursor to the timing of failure by an empirical power law, with failure defined in this context as the onset of an eruption. Previous applications of the FFM have used a wide variety of accelerating time series, often generating questionable forecasts with large misfits between the data and the forecast, as well as a number of different forecasts from the same data series. Here, we show an alternative approach applying the FFM in combination with a cross correlation technique which identifies seismicity from a single active source mechanism and location at depth. Isolating a single system at depth avoids additional uncertainties introduced by averaging data over a number of different accelerating phenomena, and consequently reduces the misfit between the data and the forecast. Similar seismic waveforms were identified in the precursory accelerating seismicity to dome collapses at Soufrière Hills volcano, Montserrat in June 1997, July 2003 and February 2010. These events were specifically chosen since they represent a spectrum of collapse scenarios at this volcano. The cross correlation technique generated a five-fold increase in the number of seismic events that could be identified from continuous rather than triggered seismic data, thus providing a more holistic understanding of the ongoing seismicity at the time. The use of similar seismicity as a forecasting tool for the 1997 and 2003 collapses greatly improved the forecasted timing of the dome collapse, as well as improving the confidence in the forecast, thereby outperforming the classical application of the FFM. We suggest that focusing on a single active seismic system at depth allows a more accurate forecast of some of the major dome collapses of the ongoing eruption at Soufrière Hills volcano, and provides a simple addition to the well-used methodology of the FFM.
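
    For the commonly assumed power-law exponent of 2, the inverse of the precursor rate decreases linearly in time, so the forecast failure time is the x-intercept of a straight-line fit to the inverse rate. A minimal sketch of that standard FFM estimator on synthetic data (the exponent choice and data are illustrative, not this paper's catalogue):

        import numpy as np

        def ffm_forecast(t, rate):
            """Failure Forecast Method with exponent alpha = 2: fit a line to
            1/rate versus time and return its x-intercept as the failure time."""
            slope, intercept = np.polyfit(t, 1.0 / rate, 1)
            return -intercept / slope

        # Synthetic accelerating precursor: rate = 1 / (tf - t) with tf = 100 h.
        t = np.linspace(0.0, 90.0, 40)
        rate = 1.0 / (100.0 - t)
        print(f"forecast failure time: {ffm_forecast(t, rate):.1f} h")   # -> 100.0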

  11. Computer-aided technique for automatic determination of the relationship between transglottal pressure change and voice fundamental frequency.

    PubMed

    Deguchi, Shinji; Kawashima, Kazutaka; Washio, Seiichi

    2008-12-01

    The effect of artificially altered transglottal pressures on the voice fundamental frequency (F0) is known to be associated with vocal fold stiffness. Its measurement, though useful as a potential diagnostic tool for noncontact assessment of vocal fold stiffness, often requires manual and painstaking determination of an unstable F0 of voice. Here, we provide a computer-aided technique that enables one to carry out the determination easily and accurately. Human subjects vocalized in accordance with a series of reference sounds from a speaker controlled by a computer. Transglottal pressures were altered by means of a valve embedded in a mouthpiece. Time-varying vocal F0 was extracted, without manual procedures, from a specific range of the voice spectrum determined on the basis of the controlled reference sounds. The validity of the proposed technique was assessed for 11 healthy subjects. Fluctuating voice F0 was tracked automatically during experiments, providing the relationship between transglottal pressure change and F0 on the computer. The proposed technique overcomes the difficulty in automatic determination of the voice F0, which tends to be transient both in normal voice and in some types of pathological voice.

  12. Ammonium Sulfate Improves Detection of Hydrophilic Quaternary Ammonium Compounds through Decreased Ion Suppression in Matrix-Assisted Laser Desorption/Ionization Imaging Mass Spectrometry.

    PubMed

    Sugiyama, Eiji; Masaki, Noritaka; Matsushita, Shoko; Setou, Mitsutoshi

    2015-11-17

    Hydrophilic quaternary ammonium compounds (QACs) include derivatives of carnitine (Car) or choline, which are known to have essential bioactivities. Here we developed a technique for improving the detection of hydrophilic QACs using ammonium sulfate (AS) in matrix-assisted laser desorption/ionization-imaging mass spectrometry (MALDI-IMS). In MALDI mass spectrometry for brain homogenates, the addition of AS greatly increased the signal intensities of Car, acetylcarnitine (AcCar), and glycerophosphocholine (GPC) by approximately 300-, 700-, and 2500-fold. The marked improvement required a higher AS concentration than that needed for suppressing the potassium adduction on phosphatidylcholine and 2,5-dihydroxybenzoic acid. Adding AS also increased the signal intensities of Car, AcCar, and GPC by approximately 10-, 20-, and 40-fold in MALDI-IMS. Consequently, the distributions of five hydrophilic QACs (Car, AcCar, GPC, choline, and phosphocholine) were simultaneously visualized by this technique. The distinct mechanism from other techniques such as improved matrix application, derivatization, or postionization suggests the great potential of AS addition to achieve higher sensitivity of MALDI-IMS for various analytes.

  13. Elevated expression of esterase and cytochrome P450 are related with lambda-cyhalothrin resistance and lead to cross resistance in Aphis glycines Matsumura.

    PubMed

    Xi, Jinghui; Pan, Yiou; Bi, Rui; Gao, Xiwu; Chen, Xuewei; Peng, Tianfei; Zhang, Min; Zhang, Hua; Hu, Xiaoyue; Shang, Qingli

    2015-02-01

    A resistant strain of Aphis glycines Matsumura (CRR) has developed 76.67-fold resistance to lambda-cyhalothrin compared with the susceptible (CSS) strain. The synergists piperonyl butoxide (PBO), S,S,S-tributyltrithiophosphate (DEF) and triphenyl phosphate (TPP) dramatically increased the toxicity of lambda-cyhalothrin to the resistant strain. Bioassay results indicated that the CRR strain had developed high levels of cross-resistance to chlorpyrifos (11.66-fold), acephate (8.20-fold), cypermethrin (53.24-fold), esfenvalerate (13.83-fold), cyfluthrin (9.64-fold), carbofuran (14.60-fold), methomyl (9.32-fold) and bifenthrin (4.81-fold), but did not have cross-resistance to chlorfenapyr, imidacloprid, diafenthiuron or abamectin. The transcriptional levels of CYP6A2-like, CYP6A14-like and cytochrome b-c1 complex subunit 9-like were significantly higher in the resistant strain than in the susceptible strain. Similar trends were observed in the transcripts and DNA copy numbers of the CarE and E4 esterases. Overall, these results demonstrate that increased esterase hydrolysis activity, combined with elevated cytochrome P450 monooxygenase detoxification, plays an important role in the high levels of lambda-cyhalothrin resistance and can cause cross-resistance to other insecticides in the CRR strain. Copyright © 2014 Elsevier Inc. All rights reserved.

  14. Multiview hyperspectral topography of tissue structural and functional characteristics

    NASA Astrophysics Data System (ADS)

    Zhang, Shiwu; Liu, Peng; Huang, Jiwei; Xu, Ronald

    2012-12-01

    Accurate and in vivo characterization of structural, functional, and molecular characteristics of biological tissue will facilitate quantitative diagnosis, therapeutic guidance, and outcome assessment in many clinical applications, such as wound healing, cancer surgery, and organ transplantation. However, many clinical imaging systems have limitations and fail to provide noninvasive, real-time, and quantitative assessment of biological tissue in an operating room. To overcome these limitations, we developed and tested a multiview hyperspectral imaging system that integrates the multiview and hyperspectral imaging techniques in a single portable unit. Four plane mirrors were assembled into a multiview reflective mirror set with a rectangular cross section, which was placed between a hyperspectral camera and the measured biological tissue. For a single image acquisition task, a hyperspectral data cube with five views was obtained. The five-view hyperspectral image consisted of a main objective image and four reflective images. Three-dimensional topography of the scene was achieved by correlating the matching pixels between the objective image and the reflective images. Three-dimensional mapping of tissue oxygenation was achieved using a hyperspectral oxygenation algorithm. The multiview hyperspectral imaging technique is currently under quantitative validation in a wound model, a tissue-simulating blood phantom, and an in vivo biological tissue model. The preliminary results have demonstrated the technical feasibility of using multiview hyperspectral imaging for three-dimensional topography of tissue functional properties.

  15. One Small Step for a Man: Estimation of Gender, Age and Height from Recordings of One Step by a Single Inertial Sensor

    PubMed Central

    Riaz, Qaiser; Vögele, Anna; Krüger, Björn; Weber, Andreas

    2015-01-01

    A number of previous works have shown that information about a subject is encoded in sparse kinematic information, such as that revealed by so-called point-light walkers. With the work at hand, we extend these results to the classification of soft biometrics from inertial sensor recordings at a single body location during a single step. We recorded accelerations and angular velocities of 26 subjects using inertial measurement units (IMUs) attached at four locations (chest, lower back, right wrist and left ankle) while they performed standardized gait tasks. The collected data were segmented into individual walking steps. We trained random forest classifiers in order to estimate soft biometrics (gender, age and height). We applied two different validation methods, 10-fold cross-validation and subject-wise cross-validation. For all three classification tasks, we achieved high accuracy values for all four sensor locations. From these results, we can conclude that the data of a single walking step (6D: accelerations and angular velocities) allow for a robust estimation of the gender, height and age of a person. PMID:26703601
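
    Subject-wise cross-validation differs from plain 10-fold in that all steps from one person stay on the same side of each split, which prevents the classifier from recognizing individuals rather than biometric traits. A sketch of both protocols with scikit-learn on synthetic per-step features (26 subjects with 50 steps each; all names and sizes are illustrative):

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import GroupKFold, cross_val_score

        # Stand-in for per-step 6D inertial features; each step belongs to one of 26 subjects.
        X, y = make_classification(n_samples=1300, n_features=24, random_state=0)
        subjects = np.repeat(np.arange(26), 50)

        rf = RandomForestClassifier(random_state=0)
        pooled = cross_val_score(rf, X, y, cv=10)                       # steps mixed across folds
        subjectwise = cross_val_score(rf, X, y, groups=subjects,
                                      cv=GroupKFold(n_splits=10))       # subjects kept whole
        print(f"10-fold: {pooled.mean():.3f}  subject-wise: {subjectwise.mean():.3f}")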

  16. Approximate l-fold cross-validation with Least Squares SVM and Kernel Ridge Regression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Edwards, Richard E; Zhang, Hao; Parker, Lynne Edwards

    2013-01-01

    Kernel methods have difficulties scaling to large modern data sets. The scalability issues stem from the computational and memory requirements of working with a large matrix, and these requirements have been addressed over the years by using low-rank kernel approximations or by improving the solvers' scalability. However, Least Squares Support Vector Machines (LS-SVM), a popular SVM variant, and Kernel Ridge Regression still have several scalability issues. In particular, the O(n^3) computational complexity for solving a single model, and the overall computational complexity associated with tuning hyperparameters, are still major problems. We address these problems by introducing an O(n log n) approximate l-fold cross-validation method that uses a multi-level circulant matrix to approximate the kernel. In addition, we prove our algorithm's computational complexity and present empirical runtimes on data sets with approximately 1 million data points. We also validate our approximate method's effectiveness at selecting hyperparameters on real-world and standard benchmark data sets. Lastly, we provide experimental results on using a multi-level circulant kernel approximation to solve LS-SVM problems with hyperparameters selected using our method.
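
    The multi-level circulant construction itself is beyond a short sketch, but the baseline it accelerates is worth seeing: for kernel ridge regression, exact leave-one-out residuals can already be obtained from a single fit via the hat matrix, at O(n^3) rather than n refits. A sketch of that classical shortcut (not the paper's O(n log n) method), assuming a Gaussian kernel on toy data:

        import numpy as np

        def krr_loocv_mse(K, y, lam):
            """Exact leave-one-out error for kernel ridge regression without refitting:
            r_i = (y_i - yhat_i) / (1 - H_ii), with hat matrix H = K (K + lam I)^(-1)."""
            n = len(y)
            H = K @ np.linalg.inv(K + lam * np.eye(n))
            residuals = (y - H @ y) / (1.0 - np.diag(H))
            return np.mean(residuals ** 2)

        # Toy 1-D regression with a Gaussian (RBF) kernel; lam chosen by scanning LOO error.
        rng = np.random.default_rng(0)
        X = rng.uniform(-3, 3, size=(200, 1))
        y = np.sin(X).ravel() + 0.1 * rng.normal(size=200)
        K = np.exp(-0.5 * (X - X.T) ** 2)
        for lam in [1e-3, 1e-2, 1e-1, 1.0]:
            print(lam, krr_loocv_mse(K, y, lam))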

  17. Benford's Law based detection of latent fingerprint forgeries on the example of artificial sweat printed fingerprints captured by confocal laser scanning microscopes

    NASA Astrophysics Data System (ADS)

    Hildebrandt, Mario; Dittmann, Jana

    2015-03-01

    The possibility of forging latent fingerprints at crime scenes has been known for a long time. Ever since, it has been stated that an expert is capable of recognizing the presence of multiple identical latent prints as an indicator of forgery. With the possibility of printing fingerprint patterns onto arbitrary surfaces using affordable inkjet printers equipped with artificial sweat, it is rather simple to create a multitude of fingerprints with slight variations to avoid raising any suspicion. Such artificially printed fingerprints are often hard to detect during the analysis procedure. Moreover, the visibility of particular detection properties might be decreased depending on the enhancement and acquisition technique employed. In previous work, such detection properties were used primarily in combination with non-destructive high-resolution sensing and pattern recognition techniques to detect fingerprint forgeries. In this paper, we apply Benford's Law in the spatial domain to differentiate between real latent fingerprints and printed fingerprints. This technique has been successfully applied in media forensics to detect image manipulations. We use the differences between Benford's Law and the distribution of the most significant digit of the intensity and topography data from a confocal laser scanning microscope as features for a pattern-recognition-based detection of printed fingerprints. Our evaluation, based on 3000 printed and 3000 latent print samples, shows a very good detection performance of up to 98.85% using WEKA's Bagging classifier in a 10-fold stratified cross-validation.
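
    Benford's Law states that the first significant digit d of many natural measurements occurs with probability log10(1 + 1/d). A sketch of the feature idea in Python, with per-digit deviations from this distribution fed to a bagging classifier (scikit-learn's BaggingClassifier standing in for WEKA's Bagging); the synthetic "real" and "printed" value distributions are placeholders, not microscope data:

        import numpy as np
        from sklearn.ensemble import BaggingClassifier
        from sklearn.model_selection import cross_val_score

        BENFORD = np.log10(1 + 1 / np.arange(1, 10))   # P(d) = log10(1 + 1/d), d = 1..9

        def benford_deviation(values):
            """Per-digit deviation of the most-significant-digit distribution from Benford's Law."""
            v = np.abs(values[values != 0])
            msd = (v / 10 ** np.floor(np.log10(v))).astype(int)   # first significant digit 1..9
            observed = np.bincount(msd, minlength=10)[1:10] / msd.size
            return observed - BENFORD

        # Stand-in scan data: each sample is a vector of intensity/topography values.
        rng = np.random.default_rng(0)
        real = [benford_deviation(rng.lognormal(3, 1, 5000)) for _ in range(300)]
        fake = [benford_deviation(rng.uniform(1, 9.99, 5000)) for _ in range(300)]
        X = np.vstack(real + fake)
        y = np.array([0] * 300 + [1] * 300)

        acc = cross_val_score(BaggingClassifier(random_state=0), X, y, cv=10)
        print(f"10-fold stratified CV accuracy: {acc.mean():.3f}")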

  18. Comparison between genetic parameters of cheese yield and nutrient recovery or whey loss traits measured from individual model cheese-making methods or predicted from unprocessed bovine milk samples using Fourier-transform infrared spectroscopy.

    PubMed

    Bittante, G; Ferragina, A; Cipolat-Gotet, C; Cecchinato, A

    2014-10-01

    Cheese yield is an important technological trait in the dairy industry. The aim of this study was to infer the genetic parameters of some cheese yield-related traits predicted using Fourier-transform infrared (FTIR) spectral analysis and compare the results with those obtained using an individual model cheese-producing procedure. A total of 1,264 model cheeses were produced using 1,500-mL milk samples collected from individual Brown Swiss cows, and individual measurements were taken for 10 traits: 3 cheese yield traits (fresh curd, curd total solids, and curd water as a percent of the weight of the processed milk), 4 milk nutrient recovery traits (fat, protein, total solids, and energy of the curd as a percent of the same nutrient in the processed milk), and 3 daily cheese production traits per cow (fresh curd, total solids, and water weight of the curd). Each unprocessed milk sample was analyzed using a MilkoScan FT6000 (Foss, Hillerød, Denmark) over the spectral range from 5,000 to 900 wavenumber × cm(-1). The FTIR spectrum-based prediction models for the previously mentioned traits were developed using modified partial least-squares regression. Cross-validation of the whole data set yielded coefficients of determination between the predicted and measured values of 0.65 to 0.95 for all traits, except for the recovery of fat (0.41). A 3-fold external validation was also used, in which the available data were partitioned into 2 subsets: a training set (one-third of the herds) and a testing set (two-thirds). The training set was used to develop calibration equations, whereas the testing subsets were used for external validation of the calibration equations and to estimate the heritabilities and genetic correlations of the measured and FTIR-predicted phenotypes. The coefficients of determination between the predicted and measured values in cross-validation obtained from the training sets were very similar to those obtained from the whole data set, but the coefficients of determination for the external validation sets were much lower for all traits (0.30 to 0.73), and particularly for fat recovery (0.05 to 0.18). For each testing subset, the (co)variance components for the measured and FTIR-predicted phenotypes were estimated using bivariate Bayesian analyses and linear models. The intraherd heritabilities for the predicted traits obtained from our internal cross-validation using the whole data set ranged from 0.085 for daily yield of curd solids to 0.576 for protein recovery, and were similar to those obtained from the measured traits (0.079 to 0.586, respectively). The heritabilities estimated from the testing data set used for external validation were more variable but similar (on average) to the corresponding values obtained from the whole data set. Moreover, the genetic correlations between the predicted and measured traits were generally high (0.791 to 0.996), and they were always higher than the corresponding phenotypic correlations (0.383 to 0.995), especially for the external validation subset. In conclusion, we report that applying the cross-validation technique to the whole data set tended to overestimate the predictive ability of the FTIR spectra, gave more precise phenotypic predictions than calibrations obtained using smaller data sets, and yielded genetic correlations similar to those obtained from the measured traits. Collectively, our findings indicate that FTIR predictions have the potential to be used as indicator traits for the rapid and inexpensive selection of dairy populations for improvement of cheese yield, milk nutrient recovery in curd, and daily cheese production per cow. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  19. Identifying low-coverage surface species on supported noble metal nanoparticle catalysts by DNP-NMR

    DOE PAGES

    Johnson, Robert L.; Perras, Frédéric A.; Kobayashi, Takeshi; ...

    2015-11-20

    DNP-NMR spectroscopy has been applied to enhance the signal for organic molecules adsorbed on γ-Al 2O 3-supported Pd nanoparticles. In addition, by offering >2500-fold time savings, the technique enabled the observation of 13C- 13C cross-peaks for low coverage species, which were assigned to products from oxidative degradation of methionine adsorbed on the nanoparticle surface.

  20. Genomic prediction using different estimation methodology, blending and cross-validation techniques for growth traits and visual scores in Hereford and Braford cattle.

    PubMed

    Campos, G S; Reimann, F A; Cardoso, L L; Ferreira, C E R; Junqueira, V S; Schmidt, P I; Braccini Neto, J; Yokoo, M J I; Sollero, B P; Boligon, A A; Cardoso, F F

    2018-05-07

    The objective of the present study was to evaluate the accuracy and bias of direct and blended genomic predictions using different methods and cross-validation techniques for growth traits (weight and weight gains) and visual scores (conformation, precocity, muscling and size) obtained at weaning and at yearling in the Hereford and Braford breeds. Phenotypic data comprised 126,290 animals belonging to the Delta G Connection genetic improvement program, with a set of 3,545 animals genotyped with the 50K chip and 131 sires genotyped with the 777K chip. After quality control, 41,045 markers remained for all animals. An animal model was used to estimate (co)variance components and to predict breeding values, which were later used to calculate deregressed estimated breeding values (DEBV). Animals with genotypes and phenotypes for the traits studied were divided into four or five groups by random and k-means clustering cross-validation strategies. The accuracies of the direct genomic values (DGV) were of moderate to high magnitude for weaning and yearling traits, ranging from 0.19 to 0.45 for k-means and 0.23 to 0.78 for random clustering among all traits. The greatest gain relative to pedigree BLUP (PBLUP) was 9.5%, with the BayesB method under both k-means and random clustering. Blended genomic value accuracies ranged from 0.19 to 0.56 for k-means and from 0.21 to 0.82 for random clustering. The analyses using the historical pedigree and phenotypes contributed additional information to the calculation of the GEBV, and in general the largest gains were for the single-step (ssGBLUP) method in bivariate analyses, with a mean increase of 43.00% among all traits measured at weaning and of 46.27% for those evaluated at yearling. The accuracy values for the marker-effect estimation methods were lower for k-means clustering, indicating that the relationship of the training set to the selection candidates is a major factor affecting the accuracy of genomic predictions. The gains in accuracy obtained with genomic blending methods, mainly ssGBLUP in bivariate analyses, indicate that genomic predictions should be used as a tool to improve genetic gains relative to traditional PBLUP selection.
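
    k-means clustering cross-validation assigns genetically similar animals to the same fold, so validation animals are less related to the training set than under random folds, which is why it gives the more conservative accuracies above. A sketch of the fold construction in Python, with ridge regression on SNP codes standing in for GBLUP and a synthetic genotype matrix (all sizes and settings illustrative):

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.linear_model import Ridge

        rng = np.random.default_rng(0)

        # Stand-in genotype matrix (animals x SNPs, coded 0/1/2) and deregressed breeding values.
        G = rng.integers(0, 3, size=(500, 1000)).astype(float)
        y = G[:, :50] @ rng.normal(size=50) * 0.05 + rng.normal(size=500)

        # k-means clustering CV: folds follow genetic clusters instead of random splits.
        folds = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(G)
        for k in range(5):
            test = folds == k
            model = Ridge(alpha=10.0).fit(G[~test], y[~test])
            r = np.corrcoef(y[test], model.predict(G[test]))[0, 1]
            print(f"fold {k}: {test.sum():3d} animals, predictive correlation {r:.3f}")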

  1. Application of Time-Resolved Tryptophan Phosphorescence Spectroscopy to Protein Folding Studies.

    NASA Astrophysics Data System (ADS)

    Subramaniam, Vinod

    This thesis presents studies of the protein folding problem, one of the most significant questions in contemporary biophysics. Sensitive biophysical techniques, including room temperature tryptophan phosphorescence, which reports on the local environment of the residue, and the lability of proteins to denaturation, a global parameter, were used to assess the validity of the traditional assumption that the biologically active state of a protein is the 'native' state, and to determine whether the pathways of folding in vitro lead to the folded state achieved in vivo. Phosphorescence techniques have also been extended to study, for the first time, emission from tryptophan residues engineered into specific positions as reporters of protein structure. During in vitro refolding of E. coli alkaline phosphatase and bovine β-lactoglobulin, significant differences were found between the refolded proteins and the native conformations, which have no apparent effect on the biological functions. Slow conformational transitions, termed 'annealing,' that occur long after the return of enzyme activity of alkaline phosphatase are manifested in the retarded recovery of phosphorescence intensity, lifetime, and protein lability. While 'annealing' is not observed for β-lactoglobulin, both phosphorescence and lability experiments reveal changes in the structure of the refolded protein, even though its biological activity, retinol binding, is fully recovered. This result suggests that the pathways of folding in vitro need not lead to the structure formed in vivo. We have used phosphorescence techniques to study the refolding of ribonuclease T1, which exhibits slow kinetics characteristic of proline isomerization. Furthermore, the ability to extract structural information from phosphorescent tryptophan probes engineered into selected regions represents an important advance in studying protein structure; we have reported the first such results from a mutant staphylococcal nuclease. The refolding data have been interpreted in the context of recent theoretical work on rugged energy landscape models of protein folding. Our results suggest that the barriers to folding can be as large as ~20 kcal·mol^{-1}, and imply that the conventional definition of the 'native' state as the biologically active conformation may need revision to acknowledge that the active state may represent a long-lived intermediate on the pathway to the native structure.

  2. The Irrational Beliefs Inventory: psychometric properties and cross-cultural validation of its Arabic version.

    PubMed

    Al-Heeti, Khalaf N M; Hamid, Abdalla A R M; Alghorani, Mohammad A

    2012-08-01

    The purpose of this study was to examine the psychometric properties of the adapted Irrational Beliefs Inventory (IBI-34) and thus begin the process of assessing its adequacy for use in an Arab culture. The scale was translated and then administered to two samples of undergraduate students from the United Arab Emirates University. Data from 384 students were used in the main analysis, and data from 251 students were used for cross-validation. Principal components analysis (PCA) with varimax rotation followed by PCA with oblimin rotation yielded the same five components in both the main sample and the validation sample, thus consistent with the original Dutch study. Only 34 of the original 50 items were adequate to represent the five constructs. Cronbach's alpha coefficient for the overall scale was .76 and for the subscales ranged between .71 and .76, except for the Rigidity subscale, which was .54. The adapted IBI-34 correlated significantly and negatively with the General Health Questionnaire and Beck Depression Inventory, providing support for concurrent validity. Due to the non-significant differences between male and female participants on the total score of the IBI-34, the scale can be used for both sexes by summing across all items to give a total score that can be used as a general indicator of the irrational thinking.

  3. Assessing the Validity of Self-Rated Health with the Short Physical Performance Battery: A Cross-Sectional Analysis of the International Mobility in Aging Study

    PubMed Central

    Belanger, Emmanuelle; Zunzunegui, Maria–Victoria; Phillips, Susan; Ylli, Alban; Guralnik, Jack

    2016-01-01

    Objective The aim of this study was to explore the validity of self-rated health across different populations of older adults, when compared to the Short Physical Performance Battery. Design Cross-sectional analysis of the International Mobility in Aging Study. Setting Five locations: Saint-Hyacinthe and Kingston (Canada), Tirana (Albania), Manizales (Colombia), and Natal (Brazil). Participants Older adults between 65 and 74 years old (n = 1,995). Methods The Short Physical Performance Battery (SPPB) was used to measure physical performance. Self-rated health was assessed with one single five-point question. Linear trends between SPPB scores and self-rated health were tested separately for men and women at each of the five international study sites. Poor physical performance (independent variable) (SPPB less than 8) was used in logistic regression models of self-rated health (dependent variable), adjusting for potential covariates. All analyses were stratified by gender and site of origin. Results A significant linear association was found between the mean scores of the Short Physical Performance Battery and ordinal categories of self-rated health across research sites and gender groups. After extensive control for objective physical and mental health indicators and socio-demographic variables, these graded associations became non-significant in some research sites. Conclusion These findings further confirm the validity of SRH as a measure of overall health status in older adults. PMID:27089219

  4. In silico target prediction for elucidating the mode of action of herbicides including prospective validation.

    PubMed

    Chiddarwar, Rucha K; Rohrer, Sebastian G; Wolf, Antje; Tresch, Stefan; Wollenhaupt, Sabrina; Bender, Andreas

    2017-01-01

    The rapid emergence of pesticide resistance has given rise to a demand for herbicides with new modes of action (MoA). In the agrochemical sector, with the availability of experimental high-throughput screening (HTS) data, it is now possible to utilize in silico target prediction methods in the early discovery phase to suggest the MoA of a compound via data mining of bioactivity data. While this approach has been established in the pharmaceutical context, in the agrochemical area it poses rather different challenges, as we have found in this work, partially due to different chemistry, but even more so due to different (usually smaller) amounts of data and different ways of conducting HTS. With the aim of applying computational methods to facilitate herbicide target identification, 48,000 bioactivity data points against 16 herbicide targets were processed to train Laplacian-modified Naïve Bayesian (NB) classification models. The herbicide target prediction model ("HerbiMod") is an ensemble of 16 binary classification models which are evaluated by internal, external and prospective validation sets. In addition to the experimental inactives, 10,000 random agrochemical inactives were included in the training process, which improved the overall balanced accuracy of our models by up to 40%. For all the models, performance in terms of balanced accuracy of ≥80% was achieved in five-fold cross-validation. Ranking target predictions was addressed by means of z-scores, which improved predictivity over using raw scores alone. An external test set of 247 compounds from ChEMBL and a prospective test set of 394 compounds from BASF SE tested against five well-studied herbicide targets (ACC, ALS, HPPD, PDS and PROTOX) were used for further validation. Only 4% of the compounds in the external test set lay within the applicability domain, and extrapolation (and correct prediction) was hence impossible, which on one hand was surprising and on the other hand illustrated the value of using applicability domains in the first place. However, performance better than 60% balanced accuracy was achieved on the prospective test set, where all the compounds fell within the applicability domain, which underlines the possibility of using target prediction also in the area of agrochemicals. Copyright © 2016 Elsevier Inc. All rights reserved.

  5. Unconstrained snoring detection using a smartphone during ordinary sleep.

    PubMed

    Shin, Hangsik; Cho, Jaegeol

    2014-08-15

    Snoring can be a representative symptom of a sleep disorder, and thus snoring detection is quite important to improving the quality of an individual's daily life. The purpose of this research is to develop an unconstrained snoring detection technique that can be integrated into a smartphone application. In contrast with previous studies, we developed a practical technique for snoring detection during ordinary sleep by using the built-in sound recording system of a smartphone, and the recording was carried out in a standard private bedroom. The experimental protocol was designed to include a variety of actions that frequently produce noise (including coughing, playing music, talking, ringing an alarm, opening/closing doors, running a fan, playing the radio, and walking) in order to accurately recreate the actual circumstances during sleep. The sound data were recorded for 10 individuals during actual sleep. In total, 44 snoring datasets and 75 noise datasets were acquired. The algorithm uses formant analysis to examine sound features according to frequency and magnitude. Then, a quadratic classifier is used to distinguish snoring from non-snoring noises. Ten-fold cross-validation was used to evaluate the developed snoring detection methods, and validation was repeated 100 times randomly to improve statistical reliability. The overall results showed that the proposed method is competitive with those from previous research. The proposed method presented 95.07% accuracy, 98.58% sensitivity, 94.62% specificity, and 70.38% positive predictivity. Though there was a relatively high false positive rate, the results show the possibility of ubiquitous personal snoring detection through a smartphone application that takes into account data from normally occurring noises, without training using preexisting data.
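
    A quadratic classifier of this kind corresponds to quadratic discriminant analysis, and "10-fold cross-validation repeated 100 times" maps onto repeated stratified k-fold. A sketch with scikit-learn on synthetic formant-style features (the class sizes mimic the 44 snoring / 75 noise split; the generator is a placeholder):

        from sklearn.datasets import make_classification
        from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
        from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

        # Stand-in for formant-derived spectral features of 44 snoring and 75 noise episodes.
        X, y = make_classification(n_samples=119, n_features=4, n_informative=3,
                                   n_redundant=0, weights=[0.63, 0.37], random_state=0)

        qda = QuadraticDiscriminantAnalysis()
        cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=100, random_state=0)
        acc = cross_val_score(qda, X, y, cv=cv)
        print(f"accuracy over 100 repeats of 10-fold CV: {acc.mean():.3f} +/- {acc.std():.3f}")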

  6. WISC-R Types of Learning Disabilities: A Profile Analysis with Cross-Validation.

    ERIC Educational Resources Information Center

    Holcomb, William R.; And Others

    1987-01-01

    Profiles (Wechsler Intelligence Scale for Children - Revised) of 119 children in five learning disability programs were placed in six homogeneous groups using cluster analysis. One group showed superior intelligence quotient (IQ) with motor coordination deficits and severe emotional problems, while three groups represented children with low IQs…

  7. Single-molecule analysis of DNA cross-links using nanopore technology

    NASA Astrophysics Data System (ADS)

    Wolna, Anna H.

    The alpha-hemolysin (alpha-HL) protein ion channel is a potential next-generation sequencing platform that has been extensively used to study nucleic acids at a single-molecule level. After applying a potential across a lipid bilayer, the embedded alpha-HL allows monitoring of the duration and current levels of DNA translocation and immobilization. Because this method does not require DNA amplification prior to sequencing, all the DNA damage present in the cell at any given time will be present during the sequencing experiment. The goal of this research is to determine if these damage sites give distinguishable current levels beyond those observed for the canonical nucleobases. Because DNA cross-links are one of the most prevalent types of DNA damage occurring in vivo, the blockage current levels were determined for thymine-dimers, guanine(C8)-thymine(N3) cross-links and platinum adducts. All of these cross-links give a different blockage current level compared to the undamaged strands when immobilized in the ion channel, and they all can easily translocate across the alpha-HL channel. Additionally, the alpha-HL nanopore technique presents a unique opportunity to study the effects of DNA cross-links, such as thymine-dimers, on the secondary structure of DNA G-quadruplexes folded from the human telomere sequence. Using this single-molecule nanopore technique we can detect subtle structural differences that cannot be easily addressed using conventional methods. The human telomere plays crucial roles in maintaining genome stability. In the presence of suitable cations, the repetitive 5'-TTAGGG human telomere sequence can fold into G-quadruplexes that adopt the hybrid fold in vivo. The telomere sequence is hypersensitive to UV-induced thymine-dimer (T=T) formation, and yet the presence of thymine dimers does not cause telomere shortening. The potential structural disruption and thermodynamic stability of the T=T-containing natural telomere sequences were studied to understand how this damage is tolerated in telomeric DNA. The alpha-HL experiments determined that T=Ts disrupt double-chain reversal loop formation but are well tolerated in edgewise and diagonal loops of the hybrid G-quadruplexes. These studies demonstrated the power of the alpha-HL ion channel to analyze DNA modifications and secondary structures at a single-molecule level.

  8. Validity and reliability of rectus femoris ultrasound measurements: Comparison of curved-array and linear-array transducers.

    PubMed

    Hammond, Kendra; Mampilly, Jobby; Laghi, Franco A; Goyal, Amit; Collins, Eileen G; McBurney, Conor; Jubran, Amal; Tobin, Martin J

    2014-01-01

    Muscle-mass loss augurs increased morbidity and mortality in critically ill patients. Muscle-mass loss can be assessed by wide linear-array ultrasound transducers connected to cumbersome, expensive console units. Whether cheaper, hand-carried units equipped with curved-array transducers can be used as alternatives is unknown. Accordingly, our primary aim was to investigate, in 15 nondisabled subjects, the validity of measurements of rectus femoris cross-sectional area made with a curved-array transducer against a linear-array transducer, the reference-standard technique. In these subjects, we also determined the reliability of measurements obtained by a novice operator versus measurements obtained by an experienced operator. Lastly, the relationship between quadriceps strength and rectus area recorded by two experienced operators with a curved-array transducer was assessed in 17 patients with chronic obstructive pulmonary disease (COPD). In nondisabled subjects, the rectus cross-sectional area measured with the curved-array transducer by the novice and experienced operators was valid (intraclass correlation coefficient [ICC]: 0.98, typical percentage error [%TE]: 3.7%) and reliable (ICC: 0.79, %TE: 9.7%). In the subjects with COPD, both reliability (ICC: 0.99) and repeatability (%TE: 7.6% and 9.8%) were high. Rectus area was related to quadriceps strength in COPD for both experienced operators (coefficients of determination: 0.67 and 0.70). In conclusion, measurements of rectus femoris cross-sectional area recorded with a curved-array transducer connected to a hand-carried unit are valid, reliable, and reproducible, leading us to contend that this technique is suitable for cross-sectional and longitudinal studies.

  9. Calculation of Five Thermodynamic Molecular Descriptors by Means of a General Computer Algorithm Based on the Group-Additivity Method: Standard Enthalpies of Vaporization, Sublimation and Solvation, and Entropy of Fusion of Ordinary Organic Molecules and Total Phase-Change Entropy of Liquid Crystals.

    PubMed

    Naef, Rudolf; Acree, William E

    2017-06-25

    The calculation of the standard enthalpies of vaporization, sublimation and solvation of organic molecules is presented using a common computer algorithm on the basis of a group-additivity method. The same algorithm is also shown to enable the calculation of their entropy of fusion as well as the total phase-change entropy of liquid crystals. The present method is based on the complete breakdown of the molecules into their constituting atoms and their immediate neighbourhood; the respective calculations of the contributions of the atomic groups by means of the Gauss-Seidel fitting method are based on experimental data collected from the literature. The feasibility of the calculations for each of the mentioned descriptors was verified by means of a 10-fold cross-validation procedure, demonstrating the good to high quality of the predicted values for the three mentioned enthalpies and for the entropy of fusion, whereas the predictive quality for the total phase-change entropy of liquid crystals was poor. The goodness of fit (Q²) and the standard deviation (σ) of the cross-validation calculations for the five descriptors were as follows: 0.9641 and 4.56 kJ/mol (N = 3386 test molecules) for the enthalpy of vaporization, 0.8657 and 11.39 kJ/mol (N = 1791) for the enthalpy of sublimation, 0.9546 and 4.34 kJ/mol (N = 373) for the enthalpy of solvation, 0.8727 and 17.93 J/mol/K (N = 2637) for the entropy of fusion, and 0.5804 and 32.79 J/mol/K (N = 2643) for the total phase-change entropy of liquid crystals. The large discrepancy between the results of the two closely related entropies is discussed in detail. Molecules for which both the standard enthalpies of vaporization and sublimation were calculable enabled the estimation of their standard enthalpy of fusion by simple subtraction of the former from the latter enthalpy. For 990 of them the experimental enthalpy-of-fusion values are also known, allowing their comparison with the predictions and yielding a correlation coefficient R² of 0.6066.
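
    Group-additivity prediction is, at its core, a linear model on group counts, so the 10-fold cross-validation statistics reported above can be illustrated with ordinary least squares standing in for the authors' Gauss-Seidel fitting. A sketch on a synthetic group-count matrix (the sizes, contributions and noise level are invented):

        import numpy as np
        from sklearn.linear_model import LinearRegression
        from sklearn.model_selection import cross_val_predict

        rng = np.random.default_rng(0)

        # Stand-in group-count matrix: rows = molecules, columns = atom-group counts.
        counts = rng.integers(0, 6, size=(800, 40)).astype(float)
        true_contrib = rng.normal(5.0, 2.0, size=40)          # kJ/mol per group, invented
        dHvap = counts @ true_contrib + rng.normal(0, 4.0, size=800)

        pred = cross_val_predict(LinearRegression(), counts, dHvap, cv=10)
        press = np.sum((dHvap - pred) ** 2)
        q2 = 1 - press / np.sum((dHvap - dHvap.mean()) ** 2)
        sigma = np.sqrt(press / len(dHvap))
        print(f"Q2 = {q2:.4f}, sigma = {sigma:.2f} kJ/mol")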

  10. The association of eight potentially functional polymorphisms in five adrenergic receptor-encoding genes with myocardial infarction risk in Han Chinese.

    PubMed

    Xia, Kun; Ding, Rongjing; Zhang, Zhiyong; Li, Weiming; Shang, Xiaoming; Yang, Xinchun; Wang, Lefeng; Zhang, Qi

    2017-08-15

    Adrenergic receptors play a key role in activating the sympathetic nervous system, which often accompanies the development of myocardial infarction (MI). Here, we aimed to test the association of eight potentially functional polymorphisms in five adrenergic receptor-encoding genes with MI risk. Genotypes were available for 717 MI patients and 612 controls. There were no detectable deviations from the Hardy-Weinberg equilibrium for any of the study polymorphisms. Allele frequencies differed remarkably for the ADRA2B D/I (P<0.001), ADRB1 Ser49Gly (P=0.002), ADRB2 Gln27Glu (P=0.005), and ADRB3 Trp64Arg (P<0.001) polymorphisms, even after the Bonferroni correction. Systolic blood pressure was significantly lower in ADRA2B II genotype carriers than in the DD genotype carriers (P=0.006), while plasma high-density lipoprotein cholesterol was significantly higher in patients carrying the ADRA2B I allele and the ADRB1 49Ser allele than in patients with the DD genotype and the 49Gly/49Gly genotype, respectively (P=0.018 and 0.033). The overall best interaction model consisted of ADRA2B D/I, ADRB1 Ser49Gly, dyslipidemia, and hypertension, with the highest testing accuracy of 0.627 and the maximal 10-fold cross-validation consistency (P=0.017). Finally, a nomogram was constructed based on the four significant polymorphisms and metabolic risk factors; it had good predictive utility and was internally validated with a discrimination C-index of 0.723 (P<0.001). Altogether, we identified two polymorphisms, ADRA2B D/I and ADRB1 Ser49Gly, which not only altered genetic susceptibility to MI but also influenced blood pressure and plasma lipid levels, and whose combination with metabolic risk factors constituted the overall best interaction model. Copyright © 2017. Published by Elsevier B.V.

  11. Comets as natural laboratories: Interpretations of the structure of the inner heliosphere

    NASA Astrophysics Data System (ADS)

    Ramanjooloo, Yudish; Jones, Geraint H.; Coates, Andrew J.; Owens, Mathew J.

    2015-11-01

    Much has been learnt about the heliosphere’s structure from in situ solar wind spacecraft observations. Their coverage is however limited in time and space. Comets can be considered to be natural laboratories of the inner heliosphere, as their ion tails trace the solar wind flow. Solar wind conditions influence comets’ induced magnetotails, formed through the draping of the heliospheric magnetic field by the velocity shear in the mass-loaded solar wind. I present a novel imaging technique and software to exploit the vast catalogues of amateur and professional images of comet ion tails. My projection technique uses the comet’s orbital plane to sample its ion tail as a proxy for determining multi-latitudinal radial solar wind velocities in each comet’s vicinity. Making full use of many observing stations from astrophotography hobbyists to professional observatories and spacecraft, this approach is applied to several comets observed in recent years. This work thus assesses the validity of analysing comets’ ion tails as complementary sources of information on dynamical heliospheric phenomena and the underlying continuous solar wind. Complementary velocities, measured from folding ion rays and a velocity profile map built from consecutive images, are derived as an alternative means of quantifying the solar wind-cometary ionosphere interaction, including turbulent transient phenomena such as coronal mass ejections. I review the validity of these techniques by comparing near-Earth comets to solar wind MHD models (ENLIL) in the inner heliosphere and extrapolated measurements by ACE to the orbit of comet C/2004 Q2 (Machholz), a near-Earth comet. My radial velocities are mapped back to the solar wind source surface to identify sources of the quiescent solar wind and heliospheric current sheet crossings. Comets were found to be good indicators of solar wind structure, but the quality of results is strongly dependent on the observing geometry.

  12. Extracting time-frequency feature of single-channel vastus medialis EMG signals for knee exercise pattern recognition.

    PubMed

    Zhang, Yi; Li, Peiyang; Zhu, Xuyang; Su, Steven W; Guo, Qing; Xu, Peng; Yao, Dezhong

    2017-01-01

    The EMG signal indicates the electrophysiological response to activities of daily living, particularly to lower-limb knee exercises. Literature reports have shown numerous benefits of wavelet analysis in EMG feature extraction for pattern recognition. However, its application to typical knee exercises when using only a single EMG channel is limited. In this study, three types of knee exercises, i.e., flexion of the leg up (standing), hip extension from a sitting position (sitting), and gait (walking), are investigated in 14 healthy untrained subjects, while EMG signals from the vastus medialis muscle group and the goniometer on the knee joint of the monitored leg are synchronously recorded. Four types of lower-limb motions, including standing, sitting, the stance phase of walking, and the swing phase of walking, are segmented. A Wavelet Transform (WT) based Singular Value Decomposition (SVD) approach is proposed for the classification of the four lower-limb motions using a single-channel EMG signal from the vastus medialis. Based on lower-limb motions from all subjects, the combination of five-level wavelet decomposition and SVD is used to compose the feature vector. A Support Vector Machine (SVM) is then configured to build a multiple-subject classifier, for which the subject-independent accuracy is reported across all subjects for the classification of the four types of lower-limb motions. To put the classification performance in context, EMG features from the time domain (e.g., Mean Absolute Value (MAV), Root-Mean-Square (RMS), integrated EMG (iEMG), Zero Crossing (ZC)) and the frequency domain (e.g., Mean Frequency (MNF) and Median Frequency (MDF)) are also used to classify lower-limb motions. Five-fold cross-validation is performed and repeated fifty times to obtain a robust subject-independent accuracy. Results show that the proposed WT-based SVD approach achieves a classification accuracy of 91.85% ± 0.88%, outperforming the other feature models.
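
    A minimal sketch of this kind of pipeline, assuming the PyWavelets and scikit-learn packages, is given below; the singular values of the zero-padded wavelet-coefficient matrix stand in for the paper's exact feature construction, and the signals and labels are synthetic.

        import numpy as np
        import pywt                                # PyWavelets
        from sklearn.svm import SVC
        from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

        def wt_svd_features(signal, wavelet="db4", level=5):
            # Five-level wavelet decomposition; the singular values of the
            # zero-padded coefficient matrix form a compact feature vector.
            coeffs = pywt.wavedec(signal, wavelet, level=level)
            mat = np.zeros((len(coeffs), max(len(c) for c in coeffs)))
            for i, c in enumerate(coeffs):
                mat[i, :len(c)] = c
            return np.linalg.svd(mat, compute_uv=False)

        rng = np.random.default_rng(0)
        X = np.array([wt_svd_features(rng.standard_normal(1024)) for _ in range(80)])
        y = np.repeat(np.arange(4), 20)            # four motion classes, balanced

        cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=50, random_state=0)
        scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=cv)
        print(f"{scores.mean():.3f} +/- {scores.std():.3f}")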

  13. GRMDA: Graph Regression for MiRNA-Disease Association Prediction

    PubMed Central

    Chen, Xing; Yang, Jing-Ru; Guan, Na-Na; Li, Jian-Qiang

    2018-01-01

    Nowadays, as more and more associations between microRNAs (miRNAs) and diseases have been discovered, miRNA has gradually become a hot topic in the biological field. Because biological experiments are time-consuming and costly, computational methods that can help scientists choose the most likely miRNA-disease associations for further experimental study are urgently needed. In this study, we proposed a method of Graph Regression for MiRNA-Disease Association prediction (GRMDA) which combines known miRNA-disease associations, miRNA functional similarity, disease semantic similarity, and Gaussian interaction profile kernel similarity. We used Gaussian interaction profile kernel similarity to compensate for the limitations of miRNA functional similarity and disease semantic similarity. Furthermore, the graph regression was synchronously performed in three latent spaces, including the association space, the miRNA similarity space, and the disease similarity space, by using two matrix factorization approaches, Singular Value Decomposition and Partial Least-Squares, to extract important related attributes and filter the noise. In leave-one-out cross-validation and five-fold cross-validation, GRMDA obtained AUCs of 0.8272 and 0.8080 ± 0.0024, respectively. Thus, its performance is better than that of some previous models. In the case study of Lymphoma using the recorded miRNA-disease associations in the HMDD V2.0 database, 88% of the top 50 predicted miRNAs were verified by experimental literature. In order to test the performance of GRMDA on new diseases with no known related miRNAs, we took Breast Neoplasms as an example by regarding all the known related miRNAs as unknown ones. We found that 100% of the top 50 predicted miRNAs were verified. Moreover, 84% of the top 50 predicted miRNAs in the case study for Esophageal Neoplasms based on HMDD V1.0 were verified to have known associations. In conclusion, GRMDA is an effective and practical method for miRNA-disease association prediction. PMID:29515453
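
    The Gaussian interaction profile kernel similarity mentioned here has a standard closed form; the sketch below computes it for a toy binary association matrix, using the common bandwidth normalisation (gamma_prime = 1), which may differ in detail from GRMDA's implementation.

        import numpy as np

        def gip_kernel(A, gamma_prime=1.0):
            # A: binary association matrix (rows are the entities compared).
            # The bandwidth gamma is normalised by the mean squared profile norm.
            sq_norms = (A ** 2).sum(axis=1)
            gamma = gamma_prime / sq_norms.mean()
            d2 = sq_norms[:, None] + sq_norms[None, :] - 2 * A @ A.T
            return np.exp(-gamma * d2)

        A = np.array([[1., 0., 1., 0.],      # toy miRNA-disease associations
                      [1., 1., 0., 0.],
                      [0., 0., 1., 1.]])
        KM = gip_kernel(A)                   # miRNA-miRNA similarity
        KD = gip_kernel(A.T)                 # disease-disease similarity
        print(KM.round(3))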

  15. A novel approach based on KATZ measure to predict associations of human microbiota with non-infectious diseases.

    PubMed

    Chen, Xing; Huang, Yu-An; You, Zhu-Hong; Yan, Gui-Ying; Wang, Xue-Song

    2017-03-01

    Accumulating clinical observations have indicated that microbes living in the human body are closely associated with a wide range of human noninfectious diseases, which provides promising insights into the understanding of complex disease mechanisms. Predicting microbe-disease associations could not only improve human disease diagnosis and prognosis but also benefit new drug development. However, few efforts have been made to understand and predict human microbe-disease associations on a large scale until now. In this work, we constructed a microbe-human disease association network and further developed a novel computational model of KATZ measure for Human Microbe-Disease Association prediction (KATZHMDA), based on the assumption that functionally similar microbes tend to have similar interaction and non-interaction patterns with noninfectious diseases, and vice versa. To our knowledge, KATZHMDA is the first tool for microbe-disease association prediction. The reliable prediction performance can be attributed to the use of the KATZ measure and the introduction of Gaussian interaction profile kernel similarity for microbes and diseases. LOOCV and k-fold cross-validation were implemented to evaluate the effectiveness of this novel computational model based on known microbe-disease associations obtained from the HMDAD database. As a result, KATZHMDA achieved reliable performance with average AUCs of 0.8130 ± 0.0054, 0.8301 ± 0.0033, and 0.8382 in 2-fold cross-validation, 5-fold cross-validation, and the LOOCV framework, respectively. It is anticipated that KATZHMDA could be used to identify more novel microbes associated with important noninfectious human diseases and therefore benefit drug discovery and human medical improvement. Matlab codes and the dataset explored in this work are available at http://dwz.cn/4oX5mS. xingchen@amss.ac.cn or zhuhongyou@gmail.com or wangxuesongcumt@163.com. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
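
    The KATZ measure itself is a truncated weighted sum of walk counts; below is a sketch of the generic computation on a heterogeneous microbe-disease adjacency matrix, with toy matrices and identity similarities standing in for the paper's GIP kernels, and with beta and k as illustrative values.

        import numpy as np

        def katz_scores(KM, KD, A, beta=0.01, k=3):
            # Truncated Katz sum S = sum_{l=1..k} beta^l H^l over the
            # heterogeneous adjacency H built from similarities and associations.
            n_m = A.shape[0]
            H = np.block([[KM, A], [A.T, KD]])
            S = np.zeros_like(H)
            power = np.eye(H.shape[0])
            for l in range(1, k + 1):
                power = power @ H
                S += beta ** l * power
            return S[:n_m, n_m:]             # microbe-disease score block

        A = np.array([[1., 0., 1.],          # toy microbe-disease associations
                      [0., 1., 0.]])
        KM, KD = np.eye(2), np.eye(3)        # stand-ins for GIP kernel similarities
        print(katz_scores(KM, KD, A).round(4))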

  16. Translation, Validation, and Adaptation of the Time Use Diary from English into the Malay Language for Use in Malaysia.

    PubMed

    Asmuri, Siti Noraini; Brown, Ted; Broom, Lisa J

    2016-07-01

    Valid translations of time use scales are needed by occupational therapists for use in different cross-cultural contexts to gather relevant data to inform practice and research. The purpose of this study was to describe the process of translating, adapting, and validating the Time Use Diary from its current English language edition into a Malay language version. Five steps of the cross-cultural adaptation process were completed: (i) translation from English into the Malay language by a qualified translator, (ii) synthesis of the translated Malay version, (iii) back-translation from Malay to English by three bilingual speakers, (iv) expert committee review and discussion, and (v) pilot testing of the Malay language version with two participant groups. The translated version was found to be a reliable and valid tool for identifying changes and potential challenges in the time use of older adults. This provides Malaysian occupational therapists with a useful tool for gathering time use data in practice settings and for research purposes.

  17. A Decision Tree for Nonmetric Sex Assessment from the Skull.

    PubMed

    Langley, Natalie R; Dudzik, Beatrix; Cloutier, Alesia

    2018-01-01

    This study uses five well-documented cranial nonmetric traits (glabella, mastoid process, mental eminence, supraorbital margin, and nuchal crest) and one additional trait (zygomatic extension) to develop a validated decision tree for sex assessment. The decision tree was built and cross-validated on a sample of 293 U.S. White individuals from the William M. Bass Donated Skeletal Collection. Ordinal scores from the six traits were analyzed using the partition modeling option in JMP Pro 12. A holdout sample of 50 skulls was used to test the model. The most accurate decision tree includes three variables: glabella, zygomatic extension, and mastoid process. This decision tree yielded 93.5% accuracy on the training sample, 94% on the cross-validated sample, and 96% on a holdout validation sample. Linear weighted kappa statistics indicate acceptable agreement among observers for these variables. Mental eminence should be avoided, and definitions and figures should be referenced carefully to score nonmetric traits. © 2017 American Academy of Forensic Sciences.
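
    A decision tree over ordinal trait scores like the one described can be reproduced in outline with scikit-learn's CART implementation, as a stand-in for the JMP partition modeling used in the study; the trait scores and sex labels below are synthetic.

        import numpy as np
        from sklearn.tree import DecisionTreeClassifier
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(1)
        # Hypothetical ordinal scores (1-5) for glabella, zygomatic extension,
        # and mastoid process, with a noisy toy rule standing in for real labels.
        X = rng.integers(1, 6, size=(293, 3))
        y = (X.sum(axis=1) + rng.integers(-2, 3, size=293) > 9).astype(int)

        tree = DecisionTreeClassifier(max_depth=3, random_state=0)
        print(cross_val_score(tree, X, y, cv=5).mean())   # cross-validated accuracy
        tree.fit(X, y)                                    # final tree on all data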

  18. A Comparison of the Validity of the Five-Factor Model (FFM) Personality Disorder Prototypes Using FFM Self-Report and Interview Measures

    ERIC Educational Resources Information Center

    Miller, Joshua D.; Bagby, R. Michael; Pilkonis, Paul A.

    2005-01-01

    Recent studies have demonstrated that personality disorders (PDs) can be assessed via a prototype-matching technique, which enables researchers and clinicians to match an individual's five-factor model (FFM) personality profile to an expert-generated prototype. The current study examined the relations between these prototype scores, using…

  19. The use of wavelet packet transform and artificial neural networks in analysis and classification of dysphonic voices.

    PubMed

    Crovato, César David Paredes; Schuck, Adalberto

    2007-10-01

    This paper presents a dysphonic voice classification system using the wavelet packet transform and the best basis algorithm (BBA) as a dimensionality reducer and six artificial neural networks (ANNs) acting as specialist systems. Each ANN was a three-layer multilayer perceptron with 64 input nodes, one output node, and a number of neurons in the intermediary layer that depended on the associated training pathology group. The dysphonic voice database was separated into five pathology groups and one healthy control group. Each ANN was trained on and associated with one of the six groups, and fed with the entropy values of the best basis tree (BBT) nodes, using the multiple cross-validation (MCV) method and the leave-one-out (LOO) variation technique; the success rates obtained were 87.5%, 95.31%, 87.5%, 100%, 96.87%, and 89.06% for groups 1 to 6, respectively.

  20. The use of a gas chromatography-sensor system combined with advanced statistical methods, towards the diagnosis of urological malignancies

    PubMed Central

    Aggio, Raphael B. M.; de Lacy Costello, Ben; White, Paul; Khalid, Tanzeela; Ratcliffe, Norman M.; Persad, Raj; Probert, Chris S. J.

    2016-01-01

    Prostate cancer is one of the most common cancers. Serum prostate-specific antigen (PSA) is used to aid the selection of men undergoing biopsies; its use remains controversial. We propose a GC-sensor algorithm system for classifying urine samples from patients with urological symptoms. This pilot study includes 155 men presenting to urology clinics: 58 were diagnosed with prostate cancer, 24 with bladder cancer, and 73 with haematuria and/or poor stream, without cancer. Principal component analysis (PCA) was applied to assess the discrimination achieved, while linear discriminant analysis (LDA) and support vector machine (SVM) were used as statistical models for sample classification. Leave-one-out cross-validation (LOOCV), repeated 10-fold cross-validation (10FoldCV), repeated double cross-validation (DoubleCV), and Monte Carlo permutations were applied to assess performance. Significant separation was found between prostate cancer and control samples, between bladder cancer and controls, and between bladder and prostate cancer samples. For prostate cancer diagnosis, the GC/SVM system classified samples with 95% sensitivity and 96% specificity after LOOCV. For bladder cancer diagnosis, the SVM reported 96% sensitivity and 100% specificity after LOOCV, while the DoubleCV reported 87% sensitivity and 99% specificity, with the SVM showing 78% and 98% sensitivity between prostate and bladder cancer samples. Evaluation of the results of the Monte Carlo permutation of class labels yielded chance-like accuracy values around 50%, suggesting that the observed results for bladder cancer and prostate cancer detection are not due to overfitting. The results of the pilot study presented here indicate that the GC system is able to successfully identify patterns that allow classification of urine samples from patients with urological cancers. An accurate diagnosis based on urine samples would reduce the number of negative prostate biopsies performed and the frequency of surveillance cystoscopy for bladder cancer patients. Larger cohort studies are planned to investigate the potential of this system. Future work may lead to non-invasive breath analyses for diagnosing urological conditions. PMID:26865331
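
    The Monte Carlo permutation check described here corresponds to scikit-learn's permutation_test_score; the sketch below applies it to synthetic stand-in features with the study's class sizes.

        import numpy as np
        from sklearn.svm import SVC
        from sklearn.model_selection import permutation_test_score

        rng = np.random.default_rng(0)
        X = rng.standard_normal((155, 20))   # stand-in for GC-sensor features
        y = np.array([1] * 58 + [0] * 97)    # prostate cancer vs. everything else

        # Refit on label-shuffled data many times; permuted scores clustering
        # near the majority-class rate (not the real score) argue against
        # the observed accuracy being an artifact of overfitting.
        score, perm_scores, pvalue = permutation_test_score(
            SVC(kernel="linear"), X, y, cv=10, n_permutations=100, random_state=0)
        print(score, perm_scores.mean(), pvalue)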

  1. Multiplexed quantification of nucleic acids with large dynamic range using multivolume digital RT-PCR on a rotational SlipChip tested with HIV and hepatitis C viral load.

    PubMed

    Shen, Feng; Sun, Bing; Kreutz, Jason E; Davydova, Elena K; Du, Wenbin; Reddy, Poluru L; Joseph, Loren J; Ismagilov, Rustem F

    2011-11-09

    In this paper, we are working toward a problem of great importance to global health: determination of viral HIV and hepatitis C (HCV) loads under point-of-care and resource-limited settings. While antiretroviral treatments are becoming widely available, viral load must be evaluated at regular intervals to prevent the spread of drug resistance and requires a quantitative measurement of RNA concentration over a wide dynamic range (from 50 up to 10⁶ molecules/mL for HIV and up to 10⁸ molecules/mL for HCV). “Digital” single molecule measurements are attractive for quantification, but the dynamic range of such systems is typically limited or requires excessive numbers of compartments. Here we designed and tested two microfluidic rotational SlipChips to perform multivolume digital RT-PCR (MV digital RT-PCR) experiments with large and tunable dynamic range. These designs were characterized using synthetic control RNA and validated with HIV viral RNA and HCV control viral RNA. The first design contained 160 wells of each of four volumes (125 nL, 25 nL, 5 nL, and 1 nL) to achieve a dynamic range of 5.2 × 10² to 4.0 × 10⁶ molecules/mL at 3-fold resolution. The second design tested the flexibility of this approach, and further expanded it to allow for multiplexing while maintaining a large dynamic range by adding additional wells with volumes of 0.2 nL and 625 nL and dividing the SlipChip into five regions to analyze five samples each at a dynamic range of 1.8 × 10³ to 1.2 × 10⁷ molecules/mL at 3-fold resolution. No evidence of cross-contamination was observed. The multiplexed SlipChip can be used to analyze a single sample at a dynamic range of 1.7 × 10² to 2.0 × 10⁷ molecules/mL at 3-fold resolution with a limit of detection of 40 molecules/mL. HIV viral RNA purified from clinical samples was tested on the SlipChip, and viral load results were self-consistent and in good agreement with results determined using the Roche COBAS AmpliPrep/COBAS TaqMan HIV-1 Test. With further validation, this SlipChip should become useful to precisely quantify viral HIV and HCV RNA for high-performance diagnostics in resource-limited settings. These microfluidic designs should also be valuable for other diagnostic and research applications, including detecting rare cells and rare mutations, prenatal diagnostics, monitoring residual disease, and quantifying copy number variation and gene expression patterns. The theory for the design and analysis of multivolume digital PCR experiments is presented in other work by Kreutz et al.
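
    The statistical idea behind multivolume digital PCR is that each well of volume v is positive with probability 1 − exp(−c·v), so a maximum-likelihood estimate of the concentration c can be pooled across all well volumes. The sketch below, with hypothetical well counts, solves the corresponding score equation; the full design theory is in the Kreutz et al. work cited above.

        import numpy as np
        from scipy.optimize import brentq

        # Hypothetical multivolume well counts: (well volume in mL, positive, total).
        wells = [(125e-6, 152, 160), (25e-6, 97, 160),
                 (5e-6, 31, 160), (1e-6, 7, 160)]

        def score(c):
            # Derivative of the Poisson log-likelihood with respect to the
            # concentration c; positives follow p = 1 - exp(-c * v) per well.
            total = 0.0
            for v, k, n in wells:
                p = 1.0 - np.exp(-c * v)
                total += k * v * np.exp(-c * v) / p - (n - k) * v
            return total

        c_hat = brentq(score, 1e2, 1e9)      # root of the score equation
        print(f"estimated concentration: {c_hat:.3g} molecules/mL")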

  2. Technique to achieve the symmetry of the new inframammary fold

    PubMed Central

    Pozzi, Marcello; Zoccali, Giovanni; Buccheri, Ernesto Maria; de Vita, Roy

    2014-01-01

    The literature outlines several surgical techniques to restore inframammary fold definition, but symmetry of the fold is often left to irreproducible procedures. We report our personal technique to restore the symmetry of the inframammary fold during multistep breast reconstruction. PMID:25078934

  3. Van’t Hoff global analyses of variable temperature isothermal titration calorimetry data

    PubMed Central

    Freiburger, Lee A.; Auclair, Karine; Mittermaier, Anthony K.

    2016-01-01

    Isothermal titration calorimetry (ITC) can provide detailed information on the thermodynamics of biomolecular interactions in the form of equilibrium constants, K_A, and enthalpy changes, ΔH_A. A powerful application of this technique involves analyzing the temperature dependences of ITC-derived K_A and ΔH_A values to gain insight into thermodynamic linkage between binding and additional equilibria, such as protein folding. We recently developed a general method for global analysis of variable temperature ITC data that significantly improves the accuracy of extracted thermodynamic parameters and requires no prior knowledge of the coupled equilibria. Here we report detailed validation of this method using Monte Carlo simulations and an application to study coupled folding and binding in an aminoglycoside acetyltransferase enzyme. PMID:28018008

  4. Development of the Brazilian Portuguese version of the Achilles Tendon Total Rupture Score (ATRS BrP): a cross-cultural adaptation with reliability and construct validity evaluation.

    PubMed

    Zambelli, Roberto; Pinto, Rafael Z; Magalhães, João Murilo Brandão; Lopes, Fernando Araujo Silva; Castilho, Rodrigo Simões; Baumfeld, Daniel; Dos Santos, Thiago Ribeiro Teles; Maffulli, Nicola

    2016-01-01

    There is a need for a patient-relevant instrument to evaluate outcome after treatment in patients with a total Achilles tendon rupture. The purpose of this study was to undertake a cross-cultural adaptation of the Achilles Tendon Total Rupture Score (ATRS) into Brazilian Portuguese, determining the test-retest reliability and construct validity of the instrument. A five-step approach was used in the cross-cultural adaptation process: initial translation (two bilingual Brazilian translators), synthesis of translation, back-translation (two native English language translators), consensus version and evaluation (expert committee), and testing phase. A total of 46 patients were recruited to evaluate the test-retest reproducibility and construct validity of the Brazilian Portuguese version of the ATRS. Test-retest reproducibility was performed by assessing each participant on two separate occasions. The construct validity was determined by the correlation index between the ATRS and the Orthopedic American Foot and Ankle Society (AOFAS) questionnaires. The final version of the Brazilian Portuguese ATRS had the same number of questions as the original ATRS. For the reliability analysis, an ICC(2,1) of 0.93 (95% CI: 0.88 to 0.96) with SEM of 1.56 points and MDC of 4.32 was observed, indicating excellent reliability. The construct validity showed excellent correlation with R = 0.76 (95% CI: 0.52 to 0.89, P < 0.001). The ATRS was successfully cross-culturally validated into Brazilian Portuguese. This version was a reliable and valid measure of function in patients who suffered complete rupture of the Achilles Tendon.

  5. Reader Response Techniques for Teaching Secondary and Post-Secondary Reading. College Reading and Learning Assistance Technical Report 85-07.

    ERIC Educational Resources Information Center

    Chase, Nancy D.

    This paper describes a five-step technique for secondary and postsecondary reading instruction, compatible with reader response theory, and addressing the need for academically underprepared students to experience the validation of their personal responses to texts. The first step involves identifying prior knowledge and opinions before reading…

  6. An Exploratory Study of Pre-Admission Predictors of Hardiness and Retention for United States Military Academy Cadets Using Regression Modeling

    DTIC Science & Technology

    2013-06-01

    Character in Sports Index CV Cross Validation FAS Faculty Appraisal Score FFM Five-Factor Model, also known as the “Big Five” GAM... FFM ). USMA does not allow personality testing as a selection tool. However, perhaps we may discover whether pre-admission information can predict...characteristic, and personality factors as described by the Five Factor Model (FFM) to determine their effect on one’s academic performance at USMA (Clark

  7. Clarifying and Measuring Filial Concepts across Five Cultural Groups

    PubMed Central

    Jones, Patricia S.; Lee, Jerry W.; Zhang, Xinwei E.

    2011-01-01

    Literature on responsibility of adult children for aging parents reflects lack of conceptual clarity. We examined filial concepts across five cultural groups: African-, Asian-, Euro-, Latino-, and Native Americans. Data were randomly divided for scale development (n = 285) and cross-validation (n = 284). Exploratory factor analysis on 59 items identified three filial concepts: Responsibility, Respect, and Care. Confirmatory factor analysis on a 12-item final scale showed data fit the three-factor model better than the single factor solution despite substantial correlations between the factors (.82, .82 for Care with Responsibility and Respect, and .74 for Responsibility with Respect). The scale can be used in cross-cultural research to test hypotheses that predict associations among filial values, filial caregiving, and caregiver health outcomes. PMID:21618557

  8. MSX-3D: a tool to validate 3D protein models using mass spectrometry.

    PubMed

    Heymann, Michaël; Paramelle, David; Subra, Gilles; Forest, Eric; Martinez, Jean; Geourjon, Christophe; Deléage, Gilbert

    2008-12-01

    The technique of chemical cross-linking followed by mass spectrometry has proven to bring valuable information about protein structure and the interactions between protein subunits. It is an effective and efficient way to experimentally investigate some aspects of a protein structure when NMR and X-ray crystallography data are lacking. We introduce MSX-3D, a tool specifically geared to validate protein models using mass spectrometry. In addition to classical peptide identifications, it allows an interactive 3D visualization of the distance constraints derived from a cross-linking experiment. Freely available at http://proteomics-pbil.ibcp.fr

  9. Sculpting Mountains: Interactive Terrain Modeling Based on Subsurface Geology.

    PubMed

    Cordonnier, Guillaume; Cani, Marie-Paule; Benes, Bedrich; Braun, Jean; Galin, Eric

    2018-05-01

    Most mountain ranges are formed by the compression and folding of colliding tectonic plates. Subduction of one plate causes large-scale asymmetry, while their layered composition (or stratigraphy) explains the multi-scale folded strata observed on real terrains. We introduce a novel interactive modeling technique to generate visually plausible, large-scale terrains that capture these phenomena. Our method draws on both geological knowledge for consistency and on sculpting systems for user interaction. The user is given hands-on control over the shape and motion of tectonic plates, represented using a new geologically-inspired model for the Earth crust. The model captures their volume-preserving and complex folding behaviors under collision, causing mountains to grow. It generates a volumetric uplift map representing the growth rate of subsurface layers. Erosion and uplift movement are jointly simulated to generate the terrain. The stratigraphy allows us to render folded strata on eroded cliffs. We validated the usability of our sculpting interface through a user study, and compared the visual consistency of the Earth crust model with geological simulation results and real terrains.

  10. A Spanish-language patient safety questionnaire to measure medical and nursing students' attitudes and knowledge.

    PubMed

    Mira, José J; Navarro, Isabel M; Guilabert, Mercedes; Poblete, Rodrigo; Franco, Astolfo L; Jiménez, Pilar; Aquino, Margarita; Fernández-Trujillo, Francisco J; Lorenzo, Susana; Vitaller, Julián; de Valle, Yohana Díaz; Aibar, Carlos; Aranaz, Jesús M; De Pedro, José A

    2015-08-01

    To design and validate a questionnaire for assessing attitudes and knowledge about patient safety using a sample of medical and nursing students undergoing clinical training in Spain and four countries in Latin America. In this cross-sectional study, a literature review was carried out and a total of 786 medical and nursing students were surveyed at eight universities from five countries (Chile, Colombia, El Salvador, Guatemala, and Spain) to develop and refine a Spanish-language questionnaire on knowledge and attitudes about patient safety. The scope of the questionnaire was based on five dimensions (factors) presented in studies related to patient safety culture found in PubMed and Scopus. Based on the five factors, 25 items were developed. Composite reliability indexes and Cronbach's alpha statistics were estimated for each factor, and confirmatory factor analysis was conducted to assess validity. After a pilot test, the questionnaire was refined using confirmatory models, maximum-likelihood estimation, and the variance-covariance matrix (as input). Multiple linear regression models were used to confirm external validity, considering variables related to patient safety culture as dependent variables and the five factors as independent variables. The final instrument was a structured five-point Likert self-administered survey (the "Latino Student Patient Safety Questionnaire") consisting of 21 items grouped into five factors. Composite reliability indexes (Cronbach's alpha) calculated for the five factors were about 0.7 or higher. The results of the multiple linear regression analyses indicated good model fit (goodness-of-fit index: 0.9). Item-total correlations were higher than 0.3 in all cases. The convergent-discriminant validity was adequate. The questionnaire designed and validated in this study assesses nursing and medical students' attitudes and knowledge about patient safety. This instrument could be used to indirectly evaluate whether or not students in health disciplines are acquiring and thus likely to put into practice the professional skills currently considered most appropriate for patient safety.

  11. The reliability, validity, sensitivity, specificity and predictive values of the Chinese version of the Rowland Universal Dementia Assessment Scale.

    PubMed

    Chen, Chia-Wei; Chu, Hsin; Tsai, Chia-Fen; Yang, Hui-Ling; Tsai, Jui-Chen; Chung, Min-Huey; Liao, Yuan-Mei; Chi, Mei-Ju; Chou, Kuei-Ru

    2015-11-01

    The purpose of this study was to translate the Rowland Universal Dementia Assessment Scale into Chinese and to evaluate the psychometric properties (reliability and validity) and the diagnostic properties (sensitivity, specificity and predictive values) of the Chinese version of the Rowland Universal Dementia Assessment Scale. The accurate detection of early dementia requires screening tools with favourable cross-cultural linguistic properties and appropriate sensitivity, specificity, and predictive values, particularly for Chinese-speaking populations. This was a cross-sectional, descriptive study. Overall, 130 participants suspected to have cognitive impairment were enrolled in the study. A test-retest for determining reliability was scheduled four weeks after the initial test. Content validity was determined by five experts, whereas construct validity was established by using the contrasted-groups technique. The participants' clinical diagnoses were used as the standard in calculating the sensitivity, specificity, positive predictive value and negative predictive value. The study revealed that the Chinese version of the Rowland Universal Dementia Assessment Scale exhibited a test-retest reliability of 0.90, an internal consistency reliability of 0.71, an inter-rater reliability (kappa value) of 0.88 and a content validity index of 0.97. The patient and healthy contrast groups exhibited significant differences in cognitive ability. The optimal cut-off points for the Chinese version of the Rowland Universal Dementia Assessment Scale in the test for mild cognitive impairment and dementia were 24 and 22, respectively; moreover, for these two conditions, the sensitivities of the scale were 0.79 and 0.76, the specificities were 0.91 and 0.81, the areas under the curve were 0.85 and 0.78, the positive predictive values were 0.99 and 0.83 and the negative predictive values were 0.96 and 0.91 respectively. The Chinese version of the Rowland Universal Dementia Assessment Scale exhibited sound reliability, validity, sensitivity, specificity and predictive values. This scale can help clinical staff members to quickly and accurately diagnose cognitive impairment and provide appropriate treatment as early as possible. © 2015 John Wiley & Sons Ltd.

  12. Cross-cultural adaptation of the Individual Work Performance Questionnaire.

    PubMed

    Koopmans, Linda; Bernaards, Claire M; Hildebrandt, Vincent H; Lerner, Debra; de Vet, Henrica C W; van der Beek, Allard J

    2015-01-01

    The Individual Work Performance Questionnaire (IWPQ), measuring task performance, contextual performance, and counterproductive work behavior, was developed in The Netherlands. The aim was to cross-culturally adapt the IWPQ from Dutch to American English, and to assess the questionnaire's internal consistency and content validity in the American-English context. A five-stage translation and adaptation process was used: forward translation, synthesis, back-translation, expert committee review, and pilot-testing. During the pilot-testing, cognitive interviews with 40 American workers were performed, to examine the comprehensibility, applicability, and completeness of the American-English IWPQ. Questionnaire instructions were slightly modified to aid interpretation in the American-English language. Inconsistencies with verb tense were identified, and it was decided to consistently use simple past tense. The wording of five items was modified to better suit the American-English language. In general, participants were positive about the comprehensibility, applicability and completeness of the questionnaire during the pilot-testing phase. Furthermore, the study showed positive results concerning the internal consistency (Cronbach's alphas for the scales between 0.79 and 0.89) and content validity of the American-English IWPQ. The results indicate that the cross-cultural adaptation of the American-English IWPQ was successful and that the measurement properties of the translated version are promising.

  13. 77 FR 42762 - Scheduling of an Expedited Five-Year Review Concerning the Antidumping Duty Order on Folding Gift...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-07-20

    ... INTERNATIONAL TRADE COMMISSION [Investigation No. 731-TA-921 (Second Review)] Scheduling of an Expedited Five-Year Review Concerning the Antidumping Duty Order on Folding Gift Boxes From China AGENCY... folding gift boxes from China would be likely to lead to continuation or recurrence of material injury...

  14. A Bio-Inspired Herbal Tea Flavour Assessment Technique

    PubMed Central

    Zakaria, Nur Zawatil Isqi; Masnan, Maz Jamilah; Zakaria, Ammar; Shakaff, Ali Yeon Md

    2014-01-01

    Herbal-based products are becoming a widespread production trend among manufacturers for the domestic and international markets. As production increases to meet market demand, it is crucial for the manufacturer to ensure that their products meet specific criteria and fulfil the intended quality determined by the quality controller. One famous herbal-based product is herbal tea. This paper investigates bio-inspired flavour assessments in a data fusion framework involving an e-nose and an e-tongue. The objectives are to attain good classification of different types and brands of herbal tea, classification of different flavour masking effects, and finally classification of different concentrations of herbal tea. Two data fusion levels were employed in this research: low-level data fusion and intermediate-level data fusion. Four classification approaches (LDA, SVM, KNN, and PNN) were examined in search of the best classifier to achieve the research objectives. In order to evaluate the classifiers' performance, error estimators based on k-fold cross-validation and leave-one-out were applied. Classification based on GC-MS TIC data was also included as a comparison to the classification performance of the fusion approaches. Generally, KNN outperformed the other classification techniques for the three flavour assessments in both low-level and intermediate-level data fusion. However, the classification results based on GC-MS TIC data varied. PMID:25010697

  15. Uncertainty Quantification Techniques of SCALE/TSUNAMI

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rearden, Bradley T; Mueller, Don

    2011-01-01

    The Standardized Computer Analysis for Licensing Evaluation (SCALE) code system developed at Oak Ridge National Laboratory (ORNL) includes Tools for Sensitivity and Uncertainty Analysis Methodology Implementation (TSUNAMI). The TSUNAMI code suite can quantify the predicted change in system responses, such as k_eff, reactivity differences, or ratios of fluxes or reaction rates, due to changes in the energy-dependent, nuclide-reaction-specific cross-section data. Where uncertainties in the neutron cross-section data are available, the sensitivity of the system to the cross-section data can be applied to propagate the uncertainties in the cross-section data to an uncertainty in the system response. Uncertainty quantification is useful for identifying potential sources of computational biases and highlighting parameters important to code validation. Traditional validation techniques often examine one or more average physical parameters to characterize a system and identify applicable benchmark experiments. However, with TSUNAMI, correlation coefficients are developed by propagating the uncertainties in neutron cross-section data to uncertainties in the computed responses for experiments and safety applications through sensitivity coefficients. The bias in the experiments, as a function of their correlation coefficient with the intended application, is extrapolated to predict the bias and bias uncertainty in the application through trending analysis or generalized linear least squares techniques, often referred to as 'data adjustment.' Even with advanced tools to identify benchmark experiments, analysts occasionally find that the application models include some feature or material for which adequately similar benchmark experiments do not exist to support validation. For example, a criticality safety analyst may want to take credit for the presence of fission products in spent nuclear fuel. In such cases, analysts sometimes rely on 'expert judgment' to select an additional administrative margin to account for gaps in the validation data or to conclude that the impact on the calculated bias and bias uncertainty is negligible. As a result of advances in computer programs and the evolution of cross-section covariance data, analysts can use the sensitivity and uncertainty analysis tools in the TSUNAMI codes to estimate the potential impact on the application-specific bias and bias uncertainty resulting from nuclides not represented in available benchmark experiments. This paper presents the application of methods described in a companion paper.
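
    The propagation step described above is the standard "sandwich rule": the response variance is the sensitivity vector bracketing the cross-section covariance matrix. The sketch below applies it with hypothetical numbers, including the c_k-style correlation TSUNAMI uses to rank benchmark experiments against an application.

        import numpy as np

        # Hypothetical relative sensitivities of k_eff to three cross-section
        # parameters, and a hypothetical relative covariance matrix for them.
        S = np.array([0.35, -0.12, 0.08])
        C = np.array([[4.0e-4, 1.0e-4, 0.0],
                      [1.0e-4, 9.0e-4, 2.0e-5],
                      [0.0,    2.0e-5, 2.5e-4]])

        var_k = S @ C @ S                    # "sandwich rule" for response variance
        print(f"relative uncertainty in k_eff: {np.sqrt(var_k):.4%}")

        # Correlation of the application with a benchmark having sensitivities S_b.
        S_b = np.array([0.30, -0.10, 0.02])
        c_k = (S @ C @ S_b) / np.sqrt((S @ C @ S) * (S_b @ C @ S_b))
        print(f"c_k similarity coefficient: {c_k:.3f}")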

  16. A Cross-Cultural Analysis of Personality Structure Through the Lens of the HEXACO Model.

    PubMed

    Ion, Andrei; Iliescu, Dragos; Aldhafri, Said; Rana, Neeti; Ratanadilok, Kattiya; Widyanti, Ari; Nedelcea, Cătălin

    2017-01-01

    Across 5 different samples, totaling more than 1,600 participants from India, Indonesia, Oman, Romania, and Thailand, the authors address the question of cross-cultural replicability of a personality structure, while exploring the utility of exploratory structural equation modeling (ESEM) as a data analysis technique in cross-cultural personality research. Personality was measured with an alternative, non-Five-Factor Model (FFM) personality framework, provided by the HEXACO-PI (Lee & Ashton, 2004 ). The results show that the HEXACO framework was replicated in some of the investigated cultures. The ESEM data analysis technique proved to be especially useful in investigating the between-group measurement equivalence of broad personality measures across different cultures.

  17. Design and 4D Printing of Cross-Folded Origami Structures: A Preliminary Investigation.

    PubMed

    Teoh, Joanne Ee Mei; An, Jia; Feng, Xiaofan; Zhao, Yue; Chua, Chee Kai; Liu, Yong

    2018-03-03

    In 4D printing research, different types of complex structure folding and unfolding have been investigated. However, research on the cross-folding of origami structures (defined as folding structures with at least two overlapping folds) has not been reported. This research focuses on the investigation of cross-folded structures using multi-material components along different axes, and different horizontal hinge thicknesses with a single homogeneous material. Tensile tests were conducted to determine the impact of multi-material components and horizontal hinge thickness. In the case of multi-material structures, the hybrid material composition has a significant impact on the overall maximum strain and Young's modulus properties. In the case of single-material structures, the shape recovery speed is inversely proportional to the horizontal hinge thickness, while the flexural or bending strength is proportional to the horizontal hinge thickness. A hinge with a thickness of 0.5 mm could be folded three times prior to fracture, whilst a hinge with a thickness of 0.3 mm could be folded only once prior to fracture. A hinge with a thickness of 0.1 mm could not even be folded without cracking. The introduction of a physical hole in the center of the folding/unfolding line provided stress relief and prevented fracture. A complex flower petal shape was used to successfully demonstrate the implementation of overlapping and non-overlapping folding lines using both single-material and multi-material segments. Design guidelines were established for cross-folded structures using multi-material components along different axes and different horizontal hinge thicknesses with a single homogeneous material. These guidelines can be used to design and implement complex origami structures with overlapping and non-overlapping folding lines. Combining overlapping folding structures and allocating specific hole locations in the overall designs could be further explored. In addition, future work will aim at more precise predictions by investigating sets of in-between hinge thicknesses and comparing the folding times before fracture.

  18. [Cross-Mapping: diagnostic labels formulated according to the ICNP® versus diagnosis of NANDA International].

    PubMed

    Tannure, Meire Chucre; Salgado, Patrícia de Oliveira; Chianca, Tânia Couto Machado

    2014-01-01

    This descriptive study aimed to elaborate nursing diagnostic labels according to the ICNP®; to conduct a cross-mapping between the diagnostic formulations and the diagnostic labels of NANDA-I; to identify which of the labels thus obtained were also listed in NANDA-I; and to map them according to Basic Human Needs. The workshop technique was applied with 32 intensive care nurses, and the cross-mapping and validation were based on agreement with experts. The workshop produced 1665 diagnostic labels, which were further refined into 120 labels. These were then submitted to a cross-mapping process with both the NANDA-I diagnostic labels and the Basic Human Needs. The mapping results underwent content validation by two expert nurses, leading to concordance rates of 92% and 100%. It was found that 63 labels were listed in NANDA-I and 47 were not.

  19. Comparative pharmacokinetics of rhein in normal and loperamide-induced constipated rats and microarray analysis of drug-metabolizing genes.

    PubMed

    Hou, Mei-Ling; Chang, Li-Wen; Lin, Chi-Hung; Lin, Lie-Chwen; Tsai, Tung-Hu

    2014-09-11

    Rhein is a pharmacologically active component found in Rheum palmatum L., the major herb of San-Huang-Xie-Xin-Tang (SHXXT), a medicinal herbal product used as a remedy for constipation. Here we have investigated the comparative pharmacokinetics of rhein in normal and constipated rats. Microarray analysis was used to explore whether drug-metabolizing genes are altered after SHXXT treatment. The comparative pharmacokinetics of rhein in normal and loperamide-induced constipated rats was studied by liquid chromatography with electrospray ionization tandem mass spectrometry (LC-MS/MS). Gene expression profiling of drug-metabolizing genes after SHXXT treatment was investigated by microarray analysis and real-time polymerase chain reaction (RT-PCR). A validated LC-MS/MS method was applied to investigate the comparative pharmacokinetics of rhein in normal and loperamide-induced constipated rats. The pharmacokinetic results demonstrate that loperamide-induced constipation reduced the absorption of rhein: Cmax was significantly reduced 2.5-fold and the AUC decreased by 27.8%, while the elimination half-life (t1/2) was prolonged 1.6-fold. Tmax and the mean residence time (MRT) were significantly prolonged 2.8-fold and 1.7-fold, respectively. The volume of distribution (Vss) increased 2.2-fold. The microarray gene expression data indicate that five drug-metabolizing genes, including Cyp7a1, Cyp2c6, Ces2e, Atp1b1, and Slc7a2, were significantly altered by SHXXT (0.5 g/kg) treatment. The loperamide-induced constipation reduced the absorption of rhein. Among the 25,338 genes analyzed, only five were significantly altered by SHXXT treatment; this limited effect on drug-metabolizing genes suggests that SHXXT is relatively safe for clinical application. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  20. Cross Validation Through Two-Dimensional Solution Surface for Cost-Sensitive SVM.

    PubMed

    Gu, Bin; Sheng, Victor S; Tay, Keng Yeow; Romano, Walter; Li, Shuo

    2017-06-01

    Model selection plays an important role in cost-sensitive SVM (CS-SVM). It has been proven that the global minimum cross-validation (CV) error can be efficiently computed based on the solution path for one-parameter learning problems. However, it is a challenge to obtain the global minimum CV error for CS-SVM based on a one-dimensional solution path and traditional grid search, because CS-SVM has two regularization parameters. In this paper, we propose a solution- and error-surfaces-based CV approach (CV-SES). More specifically, we first compute a two-dimensional solution surface for CS-SVM based on a bi-parameter space partition algorithm, which can fit solutions of CS-SVM for all values of both regularization parameters. Then, we compute a two-dimensional validation error surface for each CV fold, which can fit validation errors of CS-SVM for all values of both regularization parameters. Finally, we obtain the CV error surface by superposing K validation error surfaces, which can find the global minimum CV error of CS-SVM. Experiments are conducted on seven datasets for cost-sensitive learning and on four datasets for imbalanced learning. Experimental results not only show that the proposed CV-SES has better generalization ability than CS-SVM with various hybrids of grid search and solution-path methods, and than a recently proposed cost-sensitive hinge-loss SVM with three-dimensional grid search, but also show that CV-SES uses less running time.
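
    For contrast with the exact surfaces computed by CV-SES, the sketch below shows the brute-force baseline it improves upon: sampling the two cost parameters on a grid and evaluating the five-fold CV error in each cell (synthetic imbalanced data; scikit-learn's per-class class_weight stands in for the two CS-SVM regularization parameters).

        import numpy as np
        from sklearn.svm import SVC
        from sklearn.model_selection import StratifiedKFold, cross_val_score

        rng = np.random.default_rng(0)
        X = np.vstack([rng.normal(0, 1, (90, 2)), rng.normal(1.5, 1, (30, 2))])
        y = np.array([0] * 90 + [1] * 30)          # imbalanced two-class data

        C_neg = np.logspace(-2, 2, 9)              # cost weight for class 0
        C_pos = np.logspace(-2, 2, 9)              # cost weight for class 1
        cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

        # One CV-error cell per (C-, C+) pair: a sampled stand-in for the
        # continuous error surface that CV-SES computes exactly.
        err = np.empty((len(C_neg), len(C_pos)))
        for i, cn in enumerate(C_neg):
            for j, cp in enumerate(C_pos):
                clf = SVC(C=1.0, class_weight={0: cn, 1: cp})
                err[i, j] = 1.0 - cross_val_score(clf, X, y, cv=cv).mean()

        i, j = np.unravel_index(err.argmin(), err.shape)
        print(f"min CV error {err[i, j]:.3f} at C-={C_neg[i]:.3g}, C+={C_pos[j]:.3g}")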

  1. A Cross-Cultural Examination of Music Instruction Analysis and Evaluation Techniques.

    ERIC Educational Resources Information Center

    Price, Harry E.; Ogawa, Yoko; Arizumi, Koji

    1997-01-01

    Examines whether analysis techniques of student/teacher interactions widely used throughout the United States could be applied to a music instruction setting in Japan by analyzing two videotaped music lessons of different teachers. Finds that teacher A, who used five times more feedback, was rated higher overall by both Japanese and U.S. students.…

  2. Mechanisms of Increased Resistance to Chlorhexidine and Cross-Resistance to Colistin following Exposure of Klebsiella pneumoniae Clinical Isolates to Chlorhexidine

    PubMed Central

    Bock, Lucy J.; Bonney, Laura C.

    2016-01-01

    Klebsiella pneumoniae is an opportunistic pathogen that is often difficult to treat due to its multidrug resistance (MDR). We have previously shown that K. pneumoniae strains are able to “adapt” (become more resistant) to the widely used bisbiguanide antiseptic chlorhexidine. Here, we investigated the mechanisms responsible for and the phenotypic consequences of chlorhexidine adaptation, with particular reference to antibiotic cross-resistance. In five of six strains, adaptation to chlorhexidine also led to resistance to the last-resort antibiotic colistin. Here, we show that chlorhexidine adaptation is associated with mutations in the two-component regulator phoPQ and a putative Tet repressor gene (smvR) adjacent to the major facilitator superfamily (MFS) efflux pump gene, smvA. Upregulation of smvA (10- to 27-fold) was confirmed in smvR mutant strains, and this effect and the associated phenotype were suppressed when a wild-type copy of smvR was introduced on plasmid pACYC. Upregulation of phoPQ (5- to 15-fold) and phoPQ-regulated genes, pmrD (6- to 19-fold) and pmrK (18- to 64-fold), was confirmed in phoPQ mutant strains. In contrast, adaptation of K. pneumoniae to colistin did not result in increased chlorhexidine resistance despite the presence of mutations in phoQ and elevated phoPQ, pmrD, and pmrK transcript levels. Insertion of a plasmid containing phoPQ from chlorhexidine-adapted strains into wild-type K. pneumoniae resulted in elevated expression levels of phoPQ, pmrD, and pmrK and increased resistance to colistin, but not chlorhexidine. The potential risk of colistin resistance emerging in K. pneumoniae as a consequence of exposure to chlorhexidine has important clinical implications for infection prevention procedures. PMID:27799211

  3. Semi-Empirical Validation of the Cross-Band Relative Absorption Technique for the Measurement of Molecular Mixing Ratios

    NASA Technical Reports Server (NTRS)

    Pliutau, Denis; Prasad, Narasimha S

    2013-01-01

    Studies were performed to carry out a semi-empirical validation of a new measurement approach we propose for determining molecular mixing ratios. The approach is based on relative measurements in bands of O2 and other molecules and as such may be best described as cross-band relative absorption (CoBRA). The current validation studies rely upon well-verified and established theoretical and experimental databases, satellite data assimilations, and modeling codes such as HITRAN, the line-by-line radiative transfer model (LBLRTM), and the Modern-Era Retrospective Analysis for Research and Applications (MERRA). The approach holds promise for atmospheric mixing ratio measurements of CO2 and a variety of other molecules currently under investigation for several future satellite lidar missions. One of the advantages of the method is a significant reduction of the temperature sensitivity uncertainties, which is illustrated with application to the ASCENDS mission for the measurement of CO2 mixing ratios (XCO2). Additional advantages of the method include the possibility to closely match cross-band weighting function combinations, which is harder to achieve using conventional differential absorption techniques, and the potential for additional corrections for water vapor and other interferences without using data from numerical weather prediction (NWP) models.

  4. Cross Validity of the Behavior Style Questionnaire and Child Personality Scale in Nursery School Children.

    ERIC Educational Resources Information Center

    Simonds, John F.; Simonds, M. Patricia

    1982-01-01

    Mothers of 182 nursery school children completed the Behavior Style Questionnaire (BSQ) and the Child Personality Scale (CPS). Intercorrelational analyses showed many significantly correlated items. Scores of the five CPS factors clearly distinguished between subjects in easy and difficult BSQ clusters. Found boys significantly more introverted…

  5. The Development and Testing of a Tool for Analysis of Computer-Mediated Conferencing Transcripts.

    ERIC Educational Resources Information Center

    Fahy, Patrick J.; Crawford, Gail; Ally, Mohamed; Cookson, Peter; Keller, Verna; Prosser, Frank

    2000-01-01

    The Zhu model for analyzing computer mediated communications was further developed by an Athabasca University (Alberta) distance education research team based on ease of use, reliability, validity, theoretical support, and cross-discipline utility. Five classification categories of the new model are vertical questioning, horizontal questioning,…

  6. The value of nodal information in predicting lung cancer relapse using 4DPET/4DCT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Heyse, E-mail: heyse.li@mail.utoronto.ca; Becker, Nathan; Raman, Srinivas

    2015-08-15

    Purpose: There is evidence that computed tomography (CT) and positron emission tomography (PET) imaging metrics are prognostic and predictive in non-small cell lung cancer (NSCLC) treatment outcomes. However, few studies have explored the use of standardized uptake value (SUV)-based image features of nodal regions as predictive features. The authors investigated and compared the use of tumor and node image features extracted from the radiotherapy target volumes to predict relapse in a cohort of NSCLC patients undergoing chemoradiation treatment. Methods: A prospective cohort of 25 patients with locally advanced NSCLC underwent 4DPET/4DCT imaging for radiation planning. Thirty-seven image features were derived from the CT-defined volumes and SUVs of the PET image from both the tumor and nodal target regions. The machine learning methods of logistic regression and repeated stratified five-fold cross-validation (CV) were used to predict local and overall relapses at 2 yr. The authors used well-known feature selection methods (Spearman's rank correlation, recursive feature elimination) within each fold of CV. Classifiers were ranked on their Matthews correlation coefficient (MCC) after CV. Area under the curve, sensitivity, and specificity values are also presented. Results: For predicting local relapse, the best classifier found had a mean MCC of 0.07 and was composed of eight tumor features. For predicting overall relapse, the best classifier found had a mean MCC of 0.29 and was composed of a single feature: the volume greater than 0.5 times the maximum SUV (N). Conclusions: The best classifier for predicting local relapse had only tumor features. In contrast, the best classifier for predicting overall relapse included a node feature. Overall, the methods showed that nodes add value in predicting overall relapse but not local relapse.
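
    As a concrete illustration of the protocol described above (feature selection nested inside each fold of repeated stratified five-fold CV, with classifiers ranked by MCC), a minimal Python/scikit-learn sketch follows. The data are simulated stand-ins for the 25-patient, 37-feature PET/CT set, which is not public, and this is not the authors' code.

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import RFE
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import matthews_corrcoef
    from sklearn.model_selection import RepeatedStratifiedKFold
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    # Simulated stand-in for the 25-patient, 37-feature dataset.
    X, y = make_classification(n_samples=25, n_features=37, n_informative=5,
                               random_state=0)

    # Feature selection (RFE) lives inside the pipeline, so it is refit on each
    # training fold only; selecting features before CV would leak information.
    model = Pipeline([
        ("scale", StandardScaler()),
        ("select", RFE(LogisticRegression(max_iter=1000), n_features_to_select=8)),
        ("clf", LogisticRegression(max_iter=1000)),
    ])

    cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
    mccs = []
    for train, test in cv.split(X, y):
        model.fit(X[train], y[train])
        mccs.append(matthews_corrcoef(y[test], model.predict(X[test])))
    print(f"mean MCC over repeated stratified 5-fold CV: {np.mean(mccs):.2f}")
    ```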

  7. Microarray analysis of gene expression patterns of high lycopene tomato generated from seeds after long-term space flight

    NASA Astrophysics Data System (ADS)

    Lu, Jinying; Ren, Chunxiao; Pan, Yi; Nechitailo, Galina S.; Liu, Min

    Lycopene content is one of the most important traits of tomatoes because of the role of lycopene in reducing the risk of several kinds of cancer. In this experiment, we obtained a high-lycopene (hl) tomato (named HY-2), after seven generations of self-cross selection, from seeds of the Russian cultivar MNP-1 carried on the Russian MIR space station for six years. HPLC showed that the lycopene content was 1.6 times that of Russian MNP-1 (the wild type). Microarray analysis presented the general profile of differentially expressed genes at the tomato developmental stage of 7 DPB (days post breaker). One hundred and forty-three differentially expressed genes were identified according to the following criterion: the average change was at least 1.5-fold with a q-value (similar to FDR) less than 0.05, or the change was at least 1.5-fold in all three biological replications. Most of the differentially expressed genes were mainly involved in metabolism, response to stimulus, biosynthesis, development, and regulation. In particular, we discuss the genes involved in protein metabolism, response to unfolded protein, carotenoid biosynthesis, and photosynthesis that might be related to fruit development and the accumulation of lycopene. Furthermore, we conducted qRT-PCR validation of five key genes (Fps, CrtL-b, CrtR-b, Zep and Nxs) in the lycopene biosynthesis pathway across time courses, providing direct molecular evidence for the hl phenotype. Our results demonstrate that long-term space flight, as a rarely used tool, can induce beneficial mutations in seeds and thus help to generate a high-quality variety, combined with ground selection.

  8. Laser bonding with ICG-infused chitosan patches: preliminary experiences in porcine dura mater and vocal folds

    NASA Astrophysics Data System (ADS)

    Rossi, Francesca; Matteini, Paolo; Ratto, Fulvio; Pini, Roberto; Iacoangeli, Maurizio; Giannoni, Luca; Fortuna, Damiano; Di Cicco, Emiliano; Corbara, Sylwia; Dallari, Stefano

    2014-05-01

    Laser bonding is a promising minimally invasive approach, emerging as a valid alternative to conventional suturing techniques. It has widely demonstrated advantages in wound treatment: immediate closure, minimal inflammatory response and scar formation, and reduced healing time. This laser-based technique can overcome the difficulties of working through narrow surgical corridors (e.g., modern "key-hole" surgery and endoscopic settings) or in thin tissues that are impossible to treat with staples and/or stitches. We recently proposed the use of chitosan matrices, stained with conventional chromophores, for laser bonding of vascular tissue. In this work we propose the same procedure for laser bonding of vocal folds and dura mater repair. Laser bonding of vocal folds is proposed to avoid the development of adhesions (synechiae) after conventional or CO2 laser surgery. Laser bonding in neurosurgery is proposed for the treatment of dural defects, since cerebrospinal fluid leaks remain a major issue. Vocal folds and dura mater were harvested from 9-month-old pigs and used in the experimental sessions within 4 hours after sacrifice. In vocal fold treatment, an Indocyanine Green-infused chitosan patch was applied onto the anterior commissure, while the dura mater was first incised and then bonded. A diode laser emitting at 810 nm, equipped with a 600 μm diameter optical fiber, was used to weld the patch onto the tissue by delivering single laser spots to induce local patch/tissue adhesion. The result is immediate adhesion of the patch to the tissue. Standard histology was performed in order to study the induced photothermal effect at the bonding sites. This preliminary experimental activity shows the advantages of the proposed technique with respect to standard surgery: simplification of the procedure, decreased foreign-body reaction, reduced inflammatory response, reduced operating times, and better handling in depth.

  9. Cross-Cultural Adaptation and Validation of the Italian Version of SWAL-QOL.

    PubMed

    Ginocchio, Daniela; Alfonsi, Enrico; Mozzanica, Francesco; Accornero, Anna Rosa; Bergonzoni, Antonella; Chiarello, Giulia; De Luca, Nicoletta; Farneti, Daniele; Marilia, Simonelli; Calcagno, Paola; Turroni, Valentina; Schindler, Antonio

    2016-10-01

    The aim of the study was to evaluate the reliability and validity of the Italian SWAL-QOL (I-SWAL-QOL). The study consisted of five phases: item generation, reliability analysis, normative data generation, validity analysis, and responsiveness analysis. The item generation phase followed the five-step, cross-cultural, adaptation process of translation and back-translation. A group of 92 dysphagic patients was enrolled for the internal consistency analysis. Seventy-eight patients completed the I-SWAL-QOL twice, 2 weeks apart, for test-retest reliability analysis. A group of 200 asymptomatic subjects completed the I-SWAL-QOL for normative data generation. I-SWAL-QOL scores obtained by both the group of dysphagic subjects and asymptomatic ones were compared for validity analysis. I-SWAL-QOL scores were correlated with SF-36 scores in 67 patients with dysphagia for concurrent validity analysis. Finally, I-SWAL-QOL scores obtained in a group of 30 dysphagic patients before and after successful rehabilitation treatment were compared for responsiveness analysis. All the enrolled patients managed to complete the I-SWAL-QOL without needing any assistance, within 20 min. Internal consistency was acceptable for all I-SWAL-QOL subscales (α > 0.70). Test-retest reliability was also satisfactory for all subscales (ICC > 0.7). A significant difference between the dysphagic group and the control group was found in all I-SWAL-QOL subscales (p < 0.05). Mild to moderate correlations between I-SWAL-QOL and SF-36 subscales were observed. I-SWAL-QOL scores obtained in the pre-treatment condition were significantly lower than those obtained after swallowing rehabilitation. I-SWAL-QOL is reliable, valid, responsive to changes in QOL, and recommended for clinical practice and outcome research.
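
    Statistics like the Cronbach's α thresholds quoted above are straightforward to compute directly. Below is a minimal Python sketch of the α formula on simulated item scores; the I-SWAL-QOL responses themselves are of course not reproduced here.

    ```python
    import numpy as np

    def cronbach_alpha(items: np.ndarray) -> float:
        """items: (n_subjects, k_items) matrix of item scores.
        alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1)
        total_var = items.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_vars.sum() / total_var)

    rng = np.random.default_rng(0)
    latent = rng.normal(size=(92, 1))                       # one underlying trait
    items = latent + rng.normal(scale=0.7, size=(92, 14))   # 14 correlated items
    print(f"alpha = {cronbach_alpha(items):.2f} (>= 0.70 is conventionally acceptable)")
    ```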

  10. NoFold: RNA structure clustering without folding or alignment.

    PubMed

    Middleton, Sarah A; Kim, Junhyong

    2014-11-01

    Structures that recur across multiple different transcripts, called structure motifs, often perform a similar function; for example, recruiting a specific RNA-binding protein that then regulates translation, splicing, or subcellular localization. Identifying common motifs between coregulated transcripts may therefore yield significant insight into their binding partners and mechanism of regulation. However, most methods for clustering structures are based on folding individual sequences or performing many pairwise alignments, resulting in a tradeoff between speed and accuracy that can be problematic for large-scale data sets. Here we describe a novel method for comparing and characterizing RNA secondary structures that does not require folding or pairwise alignment of the input sequences. Our method uses the idea of constructing a distance function between two objects from their respective distances to a collection of empirical examples or models, which in our case consists of 1973 Rfam family covariance models. Using this as a basis for measuring structural similarity, we developed a clustering pipeline called NoFold to automatically identify and annotate structure motifs within large sequence data sets. We demonstrate that NoFold can simultaneously identify multiple structure motifs with an average sensitivity of 0.80 and precision of 0.98 and generally exceeds the performance of existing methods. We also perform a cross-validation analysis of the entire set of Rfam families, achieving an average sensitivity of 0.57. We apply NoFold to identify motifs enriched in dendritically localized transcripts and report 213 enriched motifs, including both known and novel structures. © 2014 Middleton and Kim; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
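
    The core trick, representing each sequence by its vector of similarity scores against a fixed library of empirical models and clustering those vectors, can be sketched as follows. The (n_sequences × n_models) score matrix is a random stand-in here; NoFold itself obtains it by scoring sequences against the 1973 Rfam covariance models.

    ```python
    # Minimal sketch of the "distance via empirical models" idea behind NoFold,
    # assuming a precomputed bit-score matrix rather than real Rfam scoring.
    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage
    from scipy.stats import zscore

    rng = np.random.default_rng(1)
    scores = rng.normal(size=(200, 1973))      # hypothetical stand-in score matrix

    profiles = zscore(scores, axis=0)          # normalize each model dimension
    # Distance between two sequences = distance between their score profiles,
    # so no folding or pairwise sequence alignment is ever performed.
    Z = linkage(profiles, method="average", metric="correlation")
    clusters = fcluster(Z, t=1.2, criterion="distance")
    print(f"{clusters.max()} candidate structure-motif clusters")
    ```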

  11. The Cenozoic fold-and-thrust belt of Eastern Sardinia: Evidence from the integration of field data with a numerically balanced geological cross section

    NASA Astrophysics Data System (ADS)

    Arragoni, S.; Maggi, M.; Cianfarra, P.; Salvini, F.

    2016-06-01

    Newly collected structural data in Eastern Sardinia (Italy), integrated with numerical techniques, led to the reconstruction of a 2-D admissible and balanced model revealing the presence of a widespread Cenozoic fold-and-thrust belt. The model was achieved with the FORC software, obtaining a 3-D (2-D + time) numerical reconstruction of the continuous evolution of the structure through time. The Mesozoic carbonate units of Eastern Sardinia and their basement present a fold-and-thrust tectonic setting, with a westward direction of tectonic transport (relative to present-day coordinates). The tectonic style of the upper levels is thin-skinned, with flat sectors prevailing over ramps and younger-on-older thrusts. Three regional tectonic units are present, bounded by two regional thrusts. Strike-slip faults overprint the fold-and-thrust belt; they developed during the Sardinia-Corsica Block rotation along the strike of the preexisting fault ramps and do not affect the numerical section balancing. This fold-and-thrust belt represents the southward continuation of the Alpine Corsica collisional chain and the missing link between the Alpine Chain and the Calabria-Peloritani Block. Relative ages relate its evolution to the meso-Alpine event (Eocene-Oligocene times), prior to the opening of the Tyrrhenian Sea (Tortonian). The results fill a gap in information about the geodynamic evolution of the European margin in the Central Mediterranean, between Corsica and the Calabria-Peloritani Block, and imply the presence of remnants of this double-verging belt, missing in the Southern Tyrrhenian basin, within the Southern Apennine chain. The methodology proved effective for constraining balanced cross sections even in areas lacking exposures of the large-scale structures, as in the case of Eastern Sardinia.

  12. Addressing Participant Validity in a Small Internet Health Survey (The Restore Study): Protocol and Recommendations for Survey Response Validation.

    PubMed

    Dewitt, James; Capistrant, Benjamin; Kohli, Nidhi; Rosser, B R Simon; Mitteldorf, Darryl; Merengwa, Enyinnaya; West, William

    2018-04-24

    While deduplication and cross-validation protocols have been recommended for large Web-based studies, protocols for survey response validation in smaller studies have not been published. This paper reports the challenges of survey validation inherent in small Web-based health survey research. The subject population was North American, gay and bisexual, prostate cancer survivors, who represent an under-researched, hidden, difficult-to-recruit, minority-within-a-minority population. In 2015-2016, advertising on a large Web-based cancer survivor support network, using email and social media, yielded 478 completed surveys. Our manual deduplication and cross-validation protocol identified 289 survey submissions (289/478, 60.4%) as likely spam, most stemming from advertising on social media. The basic components of this deduplication and validation protocol are detailed. An unexpected challenge encountered was that invalid survey responses evolved across the study period, which necessitated augmenting the static detection protocol with a dynamic one. Five recommendations for validation of Web-based samples, especially with smaller difficult-to-recruit populations, are detailed. ©James Dewitt, Benjamin Capistrant, Nidhi Kohli, B R Simon Rosser, Darryl Mitteldorf, Enyinnaya Merengwa, William West. Originally published in JMIR Research Protocols (http://www.researchprotocols.org), 24.04.2018.

  13. Cross-cultural adaptation and validation of the Protective Nursing Advocacy Scale for Brazilian nurses

    PubMed Central

    Tomaschewski-Barlem, Jamila Geri; Lunardi, Valéria Lerch; Barlem, Edison Luiz Devos; da Silveira, Rosemary Silva; Dalmolin, Graziele de Lima; Ramos, Aline Marcelino

    2015-01-01

    Abstract Objective: to culturally adapt and validate the Protective Nursing Advocacy Scale for Brazilian nurses. Method: a methodological study carried out with 153 nurses from two hospitals in the South region of Brazil, one public and the other philanthropic. The cross-cultural adaptation of the Protective Nursing Advocacy Scale was performed according to international standards, and it was validated for use in the Brazilian context by means of factor analysis and Cronbach's alpha as a measure of internal consistency. Results: through evaluation by a committee of experts and application of a pre-test, the face validity and content validity of the instrument were considered satisfactory. From the factor analysis, five constructs were identified: negative implications of advocacy practice, advocacy actions, facilitators of advocacy practice, perceptions that favor advocacy practice, and barriers to advocacy practice. The instrument showed satisfactory internal consistency, with Cronbach's alpha values ranging from 0.70 to 0.87. Conclusion: the Protective Nursing Advocacy Scale - Brazilian version is a valid and reliable instrument for use in the evaluation of beliefs and actions of health advocacy performed by Brazilian nurses in their professional practice environment. PMID:26444169

  14. Inattention in primary school is not good for your future school achievement—A pattern classification study

    PubMed Central

    Bøe, Tormod; Lundervold, Arvid

    2017-01-01

    Inattention in childhood is associated with academic problems later in life. The contribution of specific aspects of inattentive behaviour is, however, less known. We investigated the feature importance of primary school teachers’ reports on nine aspects of inattentive behaviour, gender and age in predicting future academic achievement. Primary school teachers of n = 2491 children (7–9 years) rated nine items reflecting different aspects of inattentive behaviour in 2002. A mean academic achievement score from the previous semester in high school (2012) was available for each youth from an official school register. All scores were at a categorical level. Feature importances were assessed using multinomial logistic regression, classification and regression tree analysis, and a random forest algorithm. Finally, a comprehensive pattern classification procedure using k-fold cross-validation was implemented. Overall, inattention was rated as more severe in boys, who also obtained lower academic achievement scores in high school than girls. Problems related to sustained attention and distractibility were, together with age and gender, identified as the most important features for predicting future achievement scores. Using these four features as input to a collection of classifiers employing k-fold cross-validation for prediction of academic achievement level, we obtained classification accuracy, precision and recall that were clearly better than chance levels. Primary school teachers’ reports of problems related to sustained attention and distractibility were identified as the two most important features of inattentive behaviour predicting academic achievement in high school. Identification and follow-up procedures for primary school children showing these characteristics should be prioritised to prevent future academic failure. PMID:29182663

  15. Inattention in primary school is not good for your future school achievement-A pattern classification study.

    PubMed

    Lundervold, Astri J; Bøe, Tormod; Lundervold, Arvid

    2017-01-01

    Inattention in childhood is associated with academic problems later in life. The contribution of specific aspects of inattentive behaviour is, however, less known. We investigated the feature importance of primary school teachers' reports on nine aspects of inattentive behaviour, gender and age in predicting future academic achievement. Primary school teachers of n = 2491 children (7-9 years) rated nine items reflecting different aspects of inattentive behaviour in 2002. A mean academic achievement score from the previous semester in high school (2012) was available for each youth from an official school register. All scores were at a categorical level. Feature importances were assessed using multinomial logistic regression, classification and regression tree analysis, and a random forest algorithm. Finally, a comprehensive pattern classification procedure using k-fold cross-validation was implemented. Overall, inattention was rated as more severe in boys, who also obtained lower academic achievement scores in high school than girls. Problems related to sustained attention and distractibility were, together with age and gender, identified as the most important features for predicting future achievement scores. Using these four features as input to a collection of classifiers employing k-fold cross-validation for prediction of academic achievement level, we obtained classification accuracy, precision and recall that were clearly better than chance levels. Primary school teachers' reports of problems related to sustained attention and distractibility were identified as the two most important features of inattentive behaviour predicting academic achievement in high school. Identification and follow-up procedures for primary school children showing these characteristics should be prioritised to prevent future academic failure.
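
    A minimal sketch of this kind of analysis (random forest feature importances, then k-fold cross-validated classification restricted to the top features) is shown below in Python with scikit-learn, on simulated stand-in data; the register data are not public, and the feature names are illustrative only.

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    # Stand-in for the register data: 9 inattention items + age + gender
    # predicting a categorical achievement level.
    X, y = make_classification(n_samples=2491, n_features=11, n_informative=4,
                               n_classes=3, n_clusters_per_class=1, random_state=0)
    names = [f"inattention_item_{i}" for i in range(1, 10)] + ["age", "gender"]

    forest = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)
    ranked = sorted(zip(forest.feature_importances_, names), reverse=True)
    print("top features:", [name for _, name in ranked[:4]])

    # k-fold cross-validated accuracy for a classifier restricted to the top four
    # (for a strict estimate, the selection would be nested inside each fold).
    top4 = np.argsort(forest.feature_importances_)[-4:]
    acc = cross_val_score(RandomForestClassifier(random_state=0), X[:, top4], y,
                          cv=10, scoring="accuracy")
    print(f"10-fold CV accuracy: {acc.mean():.2f}")
    ```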

  16. Rigid Origami via Optical Programming and Deferred Self-Folding of a Two-Stage Photopolymer.

    PubMed

    Glugla, David J; Alim, Marvin D; Byars, Keaton D; Nair, Devatha P; Bowman, Christopher N; Maute, Kurt K; McLeod, Robert R

    2016-11-02

    We demonstrate the formation of shape-programmed, glassy origami structures using a single-layer photopolymer with two mechanically distinct phases. The latent origami pattern consisting of rigid, high cross-link density panels and flexible, low cross-link density creases is fabricated using a series of photomask exposures. Strong optical absorption of the polymer formulation creates depth-wise gradients in the cross-link density of the creases, enforcing directed folding which enables programming of both mountain and valley folds within the same sheet. These multiple photomask patterns can be sequentially applied because the sheet remains flat until immersed into a photopolymerizable monomer solution that differentially swells the polymer to fold and form the origami structure. After folding, a uniform photoexposure polymerizes the absorbed solution, permanently fixing the shape of the folded structure while simultaneously increasing the modulus of the folds. This approach creates sharp folds by mimicking the stiff panels and flexible creases of paper origami while overcoming the traditional trade-off of self-actuated materials that require low modulus for folding and high modulus for mechanical robustness. Using this process, we demonstrate a waterbomb base capable of supporting 1500 times its own weight.

  17. Size-Selected Ag Nanoparticles with Five-Fold Symmetry

    PubMed Central

    2009-01-01

    Silver nanoparticles were synthesized using the inert gas aggregation technique. We found the optimal experimental conditions to synthesize nanoparticles at different sizes: 1.3 ± 0.2, 1.7 ± 0.3, 2.5 ± 0.4, 3.7 ± 0.4, 4.5 ± 0.9, and 5.5 ± 0.3 nm. We were able to investigate the dependence of nanoparticle size on the synthesis parameters. Our data suggest that the aggregation of clusters (dimers, trimers, etc.) in the active zone of the nanocluster source is the predominant physical mechanism for the formation of the nanoparticles. Our experiments were carried out under conditions that kept the density of nanoparticles low, and the formation of large nanoparticles by coalescence processes was avoided. In order to preserve the structural and morphological properties, the impact energy of the clusters landing on the substrate was controlled such that the acceleration energy of the nanoparticles was around 0.1 eV/atom, ensuring soft-landing deposition. High-resolution transmission electron microscopy images showed that the nanoparticles were icosahedral in shape, preferentially oriented with a five-fold axis perpendicular to the substrate surface. Our results show that synthesis by the inert gas aggregation technique is a very promising alternative for producing metal nanoparticles when control of both size and shape is critical for the development of practical applications. PMID:20596397

  18. Size-selected ag nanoparticles with five-fold symmetry.

    PubMed

    Gracia-Pinilla, Miguelángel; Ferrer, Domingo; Mejía-Rosales, Sergio; Pérez-Tijerina, Eduardo

    2009-05-15

    Silver nanoparticles were synthesized using the inert gas aggregation technique. We found the optimal experimental conditions to synthesize nanoparticles at different sizes: 1.3 ± 0.2, 1.7 ± 0.3, 2.5 ± 0.4, 3.7 ± 0.4, 4.5 ± 0.9, and 5.5 ± 0.3 nm. We were able to investigate the dependence of nanoparticle size on the synthesis parameters. Our data suggest that the aggregation of clusters (dimers, trimers, etc.) in the active zone of the nanocluster source is the predominant physical mechanism for the formation of the nanoparticles. Our experiments were carried out under conditions that kept the density of nanoparticles low, and the formation of large nanoparticles by coalescence processes was avoided. In order to preserve the structural and morphological properties, the impact energy of the clusters landing on the substrate was controlled such that the acceleration energy of the nanoparticles was around 0.1 eV/atom, ensuring soft-landing deposition. High-resolution transmission electron microscopy images showed that the nanoparticles were icosahedral in shape, preferentially oriented with a five-fold axis perpendicular to the substrate surface. Our results show that synthesis by the inert gas aggregation technique is a very promising alternative for producing metal nanoparticles when control of both size and shape is critical for the development of practical applications.

  19. The Pain Self-Efficacy Questionnaire: Cross-Cultural Adaptation into Italian and Assessment of Its Measurement Properties.

    PubMed

    Chiarotto, Alessandro; Vanti, Carla; Ostelo, Raymond W; Ferrari, Silvano; Tedesco, Giuseppe; Rocca, Barbara; Pillastrini, Paolo; Monticone, Marco

    2015-11-01

    The Pain Self-Efficacy Questionnaire (PSEQ) is a patient self-reported measurement instrument that evaluates pain self-efficacy beliefs in patients with chronic pain. The measurement properties of the PSEQ have been tested in its original and translated versions, showing satisfactory results for validity and reliability. The aims of this study were twofold: (1) to translate the PSEQ into Italian through a process of cross-cultural adaptation, and (2) to test the measurement properties of the Italian PSEQ (PSEQ-I). The cross-cultural adaptation was completed in 5 months without omitting any item of the original PSEQ. Measurement properties were tested in 165 patients with chronic low back pain (CLBP) (65% women, mean age 49.9 years). Factor analysis confirmed the one-factor structure of the questionnaire. Internal consistency (Cronbach's α = 0.94) and test-retest reliability (ICC(agreement) = 0.82) of the PSEQ-I were good. The smallest detectable change was 15.69 scale points. The PSEQ-I displayed high construct validity, meeting more than 75% of a priori hypotheses on correlations with measurement instruments assessing pain intensity, disability, anxiety, depression, pain catastrophizing, fear of movement, and coping strategies. Additionally, the PSEQ-I differentiated between patients taking and not taking pain medication. The results of this study suggest that the PSEQ-I can be used as a valid and reliable tool in Italian patients with CLBP. © 2014 World Institute of Pain.

  20. Integrated Strategy Improves the Prediction Accuracy of miRNA in Large Dataset

    PubMed Central

    Lipps, David; Devineni, Sree

    2016-01-01

    MiRNAs are short non-coding RNAs of about 22 nucleotides, which play critical roles in gene expression regulation. The biogenesis of miRNAs is largely determined by the sequence and structural features of their parental RNA molecules. Based on these features, multiple computational tools have been developed to predict whether RNA transcripts contain miRNAs. Although very successful, these predictors have started to face multiple challenges in recent years. Many predictors were optimized using datasets of hundreds of miRNA samples. The sizes of these datasets are much smaller than the number of known miRNAs. Consequently, the prediction accuracy of these predictors on large datasets becomes unknown and needs to be re-tested. In addition, many predictors were optimized for either high sensitivity or high specificity. These optimization strategies may bring serious limitations in applications. Moreover, to meet continuously rising expectations of these computational tools, improving the prediction accuracy becomes extremely important. In this study, a meta-predictor, mirMeta, was developed by integrating a set of non-linear transformations with a meta-strategy. More specifically, the outputs of five individual predictors were first preprocessed using non-linear transformations, and then fed into an artificial neural network to make the meta-prediction. The prediction accuracy of the meta-predictor was validated using both multi-fold cross-validation and an independent dataset. The final accuracy of the meta-predictor on the newly designed large dataset is improved by 7%, to 93%. The meta-predictor is also shown to be less dependent on the datasets and to have a refined balance between sensitivity and specificity. This study is important in two ways: first, it shows that the combination of non-linear transformations and artificial neural networks improves the prediction accuracy of individual predictors; second, a new miRNA predictor with significantly improved prediction accuracy is provided to the community for identifying novel miRNAs and the complete set of miRNAs. Source code is available at: https://github.com/xueLab/mirMeta PMID:28002428
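
    The meta-prediction step described above can be sketched compactly: base-predictor scores pass through a non-linear (here sigmoid) transformation and are fed to a small neural network, whose accuracy is estimated by five-fold CV. The five base-predictor outputs are simulated; this is a sketch of the strategy, not the mirMeta code.

    ```python
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import FunctionTransformer, StandardScaler

    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=2000)                        # miRNA vs. not
    base_scores = y[:, None] + rng.normal(scale=1.5, size=(2000, 5))

    # Non-linear preprocessing of the five base-predictor outputs, then a
    # small neural network issues the meta-prediction.
    sigmoid = FunctionTransformer(lambda s: 1.0 / (1.0 + np.exp(-s)))
    meta = make_pipeline(sigmoid, StandardScaler(),
                         MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000,
                                       random_state=0))
    acc = cross_val_score(meta, base_scores, y, cv=5)
    print(f"5-fold CV accuracy of the meta-predictor: {acc.mean():.2f}")
    ```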

  1. Re-evaluation of Spent Nuclear Fuel Assay Data for the Three Mile Island Unit 1 Reactor and Application to Code Validation

    DOE PAGES

    Gauld, Ian C.; Giaquinto, J. M.; Delashmitt, J. S.; ...

    2016-01-01

    Destructive radiochemical assay measurements of spent nuclear fuel rod segments from an assembly irradiated in the Three Mile Island unit 1 (TMI-1) pressurized water reactor have been performed at Oak Ridge National Laboratory (ORNL). Assay data are reported for five samples from two fuel rods of the same assembly. The TMI-1 assembly was a 15 × 15 design with an initial enrichment of 4.013 wt% 235U, and the measured samples achieved burnups between 45.5 and 54.5 gigawatt days per metric ton of initial uranium (GWd/t). Measurements were performed mainly using inductively coupled plasma mass spectrometry after elemental separation via high-performance liquid chromatography. High-precision measurements were achieved using isotope dilution techniques for many of the lanthanides, uranium, and plutonium isotopes. Measurements are reported for more than 50 different isotopes and 16 elements. One of the two TMI-1 fuel rods measured in this work had been measured previously by Argonne National Laboratory (ANL), and these data have been widely used to support code and nuclear data validation. The recent ORNL measurements therefore provided an important opportunity to independently cross-check results against the previous measurements performed at ANL. The measured nuclide concentrations are used to validate burnup calculations using the SCALE nuclear systems modeling and simulation code suite. These results show that the new measurements provide reliable benchmark data for computer code validation.

  2. Sub-classification of Advanced-Stage Hepatocellular Carcinoma: A Cohort Study Including 612 Patients Treated with Sorafenib.

    PubMed

    Yoo, Jeong-Ju; Chung, Goh Eun; Lee, Jeong-Hoon; Nam, Joon Yeul; Chang, Young; Lee, Jeong Min; Lee, Dong Ho; Kim, Hwi Young; Cho, Eun Ju; Yu, Su Jong; Kim, Yoon Jun; Yoon, Jung-Hwan

    2018-04-01

    Advanced hepatocellular carcinoma (HCC) is associated with various clinical conditions including major vessel invasion, metastasis, and poor performance status. The aim of this study was to establish a prognostic scoring system and to propose a sub-classification of the Barcelona-Clinic Liver Cancer (BCLC) stage C. This retrospective study included consecutive patients who received sorafenib for BCLC stage C HCC at a single tertiary hospital in Korea. A Cox proportional hazards model was used to develop a scoring system, and internal validation was performed by a five-fold cross-validation. The performance of the model in predicting risk was assessed by the area under the curve and the Hosmer-Lemeshow test. A total of 612 BCLC stage C HCC patients were sub-classified into strata depending on their performance status. Five independent prognostic factors (Child-Pugh score, α-fetoprotein, tumor type, extrahepatic metastasis, and portal vein invasion) were identified and used in the prognostic scoring system. This scoring system showed good discrimination (area under the receiver operating characteristic curve, 0.734 to 0.818) and calibration (both p < 0.05 by the Hosmer-Lemeshow test at 1 month and 12 months, respectively). The differences in survival among the different risk groups classified by the total score were significant (p < 0.001 by the log-rank test in both the Eastern Cooperative Oncology Group 0 and 1 strata). The heterogeneity of patients with BCLC stage C HCC requires sub-classification of advanced HCC. A prognostic scoring system with five independent factors is useful in predicting the survival of patients with BCLC stage C HCC.
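
    A hedged sketch of the scoring-system workflow (a Cox proportional hazards model on the five reported predictors, checked by five-fold cross-validated concordance) follows, using the Python lifelines package on simulated data; the cohort itself is not public, and the column names are illustrative.

    ```python
    import numpy as np
    import pandas as pd
    from lifelines import CoxPHFitter
    from lifelines.utils import k_fold_cross_validation

    rng = np.random.default_rng(0)
    n = 612
    df = pd.DataFrame({
        "child_pugh":  rng.integers(5, 13, n),       # simulated predictors
        "log_afp":     rng.normal(3, 1, n),
        "tumor_type":  rng.integers(0, 2, n),
        "metastasis":  rng.integers(0, 2, n),
        "pv_invasion": rng.integers(0, 2, n),
    })
    # Survival times shortened for higher simulated risk; events are random.
    risk = 0.2 * df["child_pugh"] + 0.3 * df["log_afp"] + 0.5 * df["pv_invasion"]
    df["months"] = rng.exponential(24 * np.exp(-(risk - risk.mean())))
    df["event"] = rng.integers(0, 2, n)

    cph = CoxPHFitter()
    scores = k_fold_cross_validation(cph, df, duration_col="months",
                                     event_col="event", k=5,
                                     scoring_method="concordance_index")
    print(f"5-fold cross-validated c-index: {np.mean(scores):.2f}")
    ```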

  3. Fingerprinting stress: Stylolite and calcite twinning paleopiezometry revealing the complexity of progressive stress patterns during folding—The case of the Monte Nero anticline in the Apennines, Italy

    NASA Astrophysics Data System (ADS)

    Beaudoin, Nicolas; Koehn, Daniel; Lacombe, Olivier; Lecouty, Alexandre; Billi, Andrea; Aharonov, Einat; Parlangeau, Camille

    2016-07-01

    In this study we show for the first time how quantitative stress estimates can be derived by combining calcite twinning and stylolite roughness stress-fingerprinting techniques in a fold-and-thrust belt. First, we present a new method that gives access to stress inversion using tectonic stylolites without access to the stylolite surface, and we compare the results with calcite twin inversion. Second, we use our new approach to present a high-resolution deformation and stress history that affected Meso-Cenozoic limestone strata in the Monte Nero Anticline during its late Miocene-Pliocene growth in the Umbria-Marche Arcuate Ridge (northern Apennines, Italy). In this area an extensive stylolite-joint/vein network developed during layer-parallel shortening (LPS), as well as during and after folding. Stress fingerprinting illustrates how stress in the sedimentary strata built up prior to folding during LPS. The stress regime oscillated between strike-slip and compressional during LPS before ultimately becoming strike-slip again during late-stage fold tightening. Our case study shows that high-resolution stress fingerprinting is possible and that this novel method can be used to unravel temporal relationships that relate to local variations of regional orogenic stresses. Beyond the regional implications, this study validates our approach as a powerful new toolbox for high-resolution stress fingerprinting in basins and orogens, combining joint and vein analysis with sedimentary and tectonic stylolite and calcite twin inversion techniques.

  4. GiNA, an Efficient and High-Throughput Software for Horticultural Phenotyping

    PubMed Central

    Diaz-Garcia, Luis; Covarrubias-Pazaran, Giovanny; Schlautman, Brandon; Zalapa, Juan

    2016-01-01

    Traditional methods for trait phenotyping have been a bottleneck for research in many crop species due to their intensive labor, high cost, complex implementation, lack of reproducibility and propensity to subjective bias. Recently, multiple high-throughput phenotyping platforms have been developed, but most of them are expensive, species-dependent, complex to use, and available only for major crops. To overcome such limitations, we present the open-source software GiNA, which is a simple and free tool for measuring horticultural traits such as shape- and color-related parameters of fruits, vegetables, and seeds. GiNA is multiplatform software available in both the R and MATLAB® programming languages and uses conventional images from digital cameras with minimal requirements. It can process up to 11 different horticultural morphological traits, such as length, width, two-dimensional area, volume, projected skin surface area, and RGB color, among other parameters. Different validation tests produced highly consistent results under different lighting conditions and camera setups, making GiNA a very reliable platform for high-throughput phenotyping. In addition, five-fold cross-validation correlations between manually generated and GiNA measurements for length and width in cranberry fruits were 0.97 and 0.92, respectively. Moreover, the same strategy yielded prediction accuracies above 0.83 for color estimates produced from images of cranberries analyzed with GiNA compared to the total anthocyanin content (TAcy) of the same fruits measured with the standard methodology of the industry. Our platform provides a scalable, easy-to-use and affordable tool for massive acquisition of phenotypic data for fruits, seeds, and vegetables. PMID:27529547

  5. GiNA, an Efficient and High-Throughput Software for Horticultural Phenotyping.

    PubMed

    Diaz-Garcia, Luis; Covarrubias-Pazaran, Giovanny; Schlautman, Brandon; Zalapa, Juan

    2016-01-01

    Traditional methods for trait phenotyping have been a bottleneck for research in many crop species due to their intensive labor, high cost, complex implementation, lack of reproducibility and propensity to subjective bias. Recently, multiple high-throughput phenotyping platforms have been developed, but most of them are expensive, species-dependent, complex to use, and available only for major crops. To overcome such limitations, we present the open-source software GiNA, which is a simple and free tool for measuring horticultural traits such as shape- and color-related parameters of fruits, vegetables, and seeds. GiNA is multiplatform software available in both the R and MATLAB® programming languages and uses conventional images from digital cameras with minimal requirements. It can process up to 11 different horticultural morphological traits, such as length, width, two-dimensional area, volume, projected skin surface area, and RGB color, among other parameters. Different validation tests produced highly consistent results under different lighting conditions and camera setups, making GiNA a very reliable platform for high-throughput phenotyping. In addition, five-fold cross-validation correlations between manually generated and GiNA measurements for length and width in cranberry fruits were 0.97 and 0.92, respectively. Moreover, the same strategy yielded prediction accuracies above 0.83 for color estimates produced from images of cranberries analyzed with GiNA compared to the total anthocyanin content (TAcy) of the same fruits measured with the standard methodology of the industry. Our platform provides a scalable, easy-to-use and affordable tool for massive acquisition of phenotypic data for fruits, seeds, and vegetables.
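
    The five-fold cross-validated agreement between manual and image-derived measurements can be reproduced in miniature as below: predict the hand measurement from the GiNA measurement with a linear model under five-fold CV and report the correlation. The cranberry numbers here are simulated stand-ins.

    ```python
    import numpy as np
    from scipy.stats import pearsonr
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(0)
    gina_length = rng.normal(12.0, 1.5, 300)                 # mm, from images
    manual_length = gina_length + rng.normal(0, 0.35, 300)   # caliper measurement

    pred = cross_val_predict(LinearRegression(), gina_length.reshape(-1, 1),
                             manual_length, cv=5)
    r, _ = pearsonr(pred, manual_length)
    print(f"five-fold CV correlation (length): {r:.2f}")
    ```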

  6. Sequence-structure mapping errors in the PDB: OB-fold domains

    PubMed Central

    Venclovas, Česlovas; Ginalski, Krzysztof; Kang, Chulhee

    2004-01-01

    The Protein Data Bank (PDB) is the single most important repository of structural data for proteins and other biologically relevant molecules. Therefore, it is critically important to keep the PDB data, as much as possible, error-free. In this study, we have analyzed PDB crystal structures possessing oligonucleotide/oligosaccharide binding (OB)-fold, one of the highly populated folds, for the presence of sequence-structure mapping errors. Using energy-based structure quality assessment coupled with sequence analyses, we have found that there are at least five OB-structures in the PDB that have regions where sequences have been incorrectly mapped onto the structure. We have demonstrated that the combination of these computation techniques is effective not only in detecting sequence-structure mapping errors, but also in providing guidance to correct them. Namely, we have used results of computational analysis to direct a revision of X-ray data for one of the PDB entries containing a fairly inconspicuous sequence-structure mapping error. The revised structure has been deposited with the PDB. We suggest use of computational energy assessment and sequence analysis techniques to facilitate structure determination when homologs having known structure are available to use as a reference. Such computational analysis may be useful in either guiding the sequence-structure assignment process or verifying the sequence mapping within poorly defined regions. PMID:15133161

  7. Parallel magnetic resonance imaging using coils with localized sensitivities.

    PubMed

    Goldfarb, James W; Holland, Agnes E

    2004-09-01

    The purpose of this study was to present clinical examples and illustrate the inefficiencies of a conventional reconstruction using a commercially available phased array coil with localized sensitivities. Five patients were imaged at 1.5 T using a cardiac-synchronized gadolinium-enhanced acquisition and a commercially available four-element phased array coil. Four unique sets of images were reconstructed from the acquired k-space data: (a) a sum-of-squares image using all four elements of the coil; localized sum-of-squares images from (b) the anterior coils and (c) the posterior coils; and (d) a local reconstruction. Images were analyzed for artifacts and usable field-of-view. Conventional image reconstruction produced images with fold-over artifacts in all cases, spanning a portion of the image (mean 90 mm; range 36-126 mm). The local reconstruction removed fold-over artifacts and resulted in an effective increase in the field-of-view (mean 50%; range 20-70%). Commercially available phased array coils do not always have overlapping sensitivities. Fold-over artifacts can be removed using an alternate reconstruction method. When assessing the advantages of parallel imaging techniques, gains achieved using techniques such as SENSE and SMASH should be gauged against the acquisition time of the localized method rather than the conventional sum-of-squares method.
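
    For reference, the sum-of-squares combinations compared in the study reduce to one line of NumPy each; restricting the "local" reconstruction to the first two elements below is an assumption standing in for the anterior pair.

    ```python
    import numpy as np

    # Simulated complex per-coil images: (n_coils, ny, nx).
    coil_images = np.random.rand(4, 256, 256) + 1j * np.random.rand(4, 256, 256)

    sos_all = np.sqrt((np.abs(coil_images) ** 2).sum(axis=0))        # all 4 elements
    sos_local = np.sqrt((np.abs(coil_images[:2]) ** 2).sum(axis=0))  # "anterior" pair only
    ```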

  8. Beware of external validation! - A Comparative Study of Several Validation Techniques used in QSAR Modelling.

    PubMed

    Majumdar, Subhabrata; Basak, Subhash C

    2018-04-26

    Proper validation is an important aspect of QSAR modelling. External validation is one of the widely used validation methods in QSAR, where the model is built on a subset of the data and validated on the rest of the samples. However, its effectiveness for datasets with a small number of samples but a large number of predictors remains suspect. Calculating hundreds or thousands of molecular descriptors using currently available software has become the norm in QSAR research, owing to computational advances in the past few decades. Thus, for n chemical compounds and p descriptors calculated for each molecule, the typical chemometric dataset today has a high value of p but small n (i.e. n < p). Motivated by evidence in the recent literature of the inadequacy of external validation in estimating the true predictive capability of a statistical model, this paper performs an extensive comparative study of this method against several other validation techniques. We compared four validation methods: leave-one-out (LOO), K-fold, external and multi-split validation, using statistical models built with LASSO regression, which simultaneously performs variable selection and modelling. We used 300 simulated datasets and one real dataset of 95 congeneric amine mutagens for this evaluation. External validation metrics have high variation among different random splits of the data, and hence are not recommended for predictive QSAR models. LOO had the best overall performance among all validation methods applied in our scenario. Results from external validation were too unstable for the datasets we analyzed. Based on our findings, we recommend using the LOO procedure for validating QSAR predictive models built on high-dimensional small-sample data. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
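
    The comparison is easy to replicate in outline: fit LASSO models on an n << p dataset and score them by leave-one-out, K-fold, and repeated external hold-out validation. The sketch below uses simulated data of roughly the paper's shape (95 samples, hundreds of descriptors) and is not the authors' code; note in particular the split-to-split spread of the external R2.

    ```python
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import Lasso
    from sklearn.metrics import r2_score
    from sklearn.model_selection import (KFold, LeaveOneOut, cross_val_predict,
                                         train_test_split)

    X, y = make_regression(n_samples=95, n_features=500, n_informative=10,
                           noise=5.0, random_state=0)
    lasso = Lasso(alpha=1.0, max_iter=50000)

    for name, cv in [("LOO", LeaveOneOut()),
                     ("5-fold", KFold(5, shuffle=True, random_state=0))]:
        q2 = r2_score(y, cross_val_predict(lasso, X, y, cv=cv))
        print(f"{name:7s} Q2 = {q2:.2f}")

    # External validation: R2 on a held-out 20% split, repeated to expose its
    # split-to-split variability (the paper's central criticism).
    ext = []
    for seed in range(20):
        Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.2,
                                              random_state=seed)
        ext.append(r2_score(yte, lasso.fit(Xtr, ytr).predict(Xte)))
    print(f"external R2: mean {np.mean(ext):.2f}, sd {np.std(ext):.2f}")
    ```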

  9. Radiation therapy induces circulating serum Hsp72 in patients with prostate cancer.

    PubMed

    Hurwitz, Mark D; Kaur, Punit; Nagaraja, Ganachari M; Bausero, Maria A; Manola, Judith; Asea, Alexzander

    2010-06-01

    Hsp72 found in the extracellular milieu has been shown to play an important role in immune regulation. The impact of common cancer therapies on the extracellular release of Hsp72, however, has to date been undefined. Serum from 13 patients undergoing radiation therapy (XRT) for prostate cancer with or without hormonal therapy (ADT) was measured for levels of circulating serum Hsp72 and pro-inflammatory cytokines (IL-6 and TNF-alpha) using the classical sandwich ELISA technique, and the relative expression of CD8(+) T lymphocytes and natural killer (NK) cells was measured using flow cytometry. Mouse orthotopic xenografts of human prostate cancer tumors (DU-145 and PC-3) were used to validate and further characterize the response noted in the clinical setting. The biological significance of tumor-released Hsp72 was studied in human dendritic cells (DC) in vitro. Circulating serum Hsp72 levels increased an average of 3.5-fold (median per patient 4.8-fold) with XRT but not with ADT (p=0.0002). Increases in IL-6 (3.3-fold), TNF-alpha (1.8-fold), CD8(+) CTL (2.1-fold) and NK cells (3.2-fold) also occurred. Using PC-3 and DU-145 human prostate cancer xenograft models in mice, we confirmed that XRT induces Hsp72 release primarily from implanted tumors. In vitro studies using supernatant recovered from irradiated human prostate cancer cells point to exosomes containing Hsp72 as a possible stimulator of pro-inflammatory cytokine production and costimulatory molecule expression in human DC. The current study confirms for the first time in an actual clinical setting the elevation of circulating serum Hsp72 with XRT. The accompanying studies in mice and in vitro identify the released exosomes containing Hsp72 as playing a pivotal role in stimulating pro-inflammatory immune responses. These findings, if validated, may lead to new treatment paradigms for common human malignancies. Copyright 2010 Elsevier Ireland Ltd. All rights reserved.

  10. Radiation therapy induces circulating serum Hsp72 in patients with prostate cancer

    PubMed Central

    Hurwitz, Mark D.; Kaur, Punit; Nagaraja, Ganachari M.; Bausero, Maria A.; Manola, Judith; Asea, Alexzander

    2010-01-01

    Background and purpose Hsp72 found in the extracellular milieu has been shown to play an important role in immune regulation. The impact of common cancer therapies on the extracellular release of Hsp72, however, has to date been undefined. Materials and methods Serum from 13 patients undergoing radiation therapy (XRT) for prostate cancer with or without hormonal therapy (ADT) was measured for levels of circulating serum Hsp72 and pro-inflammatory cytokines (IL-6 and TNF-α) using the classical sandwich ELISA technique, and the relative expression of CD8+ T lymphocytes and natural killer (NK) cells was measured using flow cytometry. Mouse orthotopic xenografts of human prostate cancer tumors (DU145 and PC3) were used to validate and further characterize the response noted in the clinical setting. The biological significance of tumor-released Hsp72 was studied in human dendritic cells (DC) in vitro. Results Circulating serum Hsp72 levels increased an average of 3.5-fold (median per patient 4.8-fold) with XRT but not with ADT (p = 0.0002). Increases in IL-6 (3.3-fold), TNF-α (1.8-fold), CD8+ CTL (2.1-fold) and NK cells (3.2-fold) also occurred. Using PC3 and DU145 human prostate cancer xenograft models in mice, we confirmed that XRT induces Hsp72 release primarily from implanted tumors. In vitro studies using supernatant recovered from irradiated human prostate cancer cells point to exosomes containing Hsp72 as a possible stimulator of pro-inflammatory cytokine production and costimulatory molecule expression in human DC. Conclusions The current study confirms for the first time in an actual clinical setting the elevation of circulating serum Hsp72 with XRT. The accompanying studies in mice and in vitro identify the released exosomes containing Hsp72 as playing a pivotal role in stimulating pro-inflammatory immune responses. These findings, if validated, may lead to new treatment paradigms for common human malignancies. PMID:20430459

  11. A comparison of the validity of the five-factor model (FFM) personality disorder prototypes. Using FFM self-report and interview measures.

    PubMed

    Miller, Joshua D; Bagby, R Michael; Pilkonis, Paul A

    2005-12-01

    Recent studies have demonstrated that personality disorders (PDs) can be assessed via a prototype-matching technique, which enables researchers and clinicians to match an individual's five-factor model (FFM) personality profile to an expert-generated prototype. The current study examined the relations between these prototype scores, using interview and self-report data, and PD symptoms in an outpatient sample (N = 115). Both sets of PD prototype scores demonstrated significant convergent validity with PD symptom counts, suggesting that the FFM PD prototype scores are appropriate for use with both sources of data.

  12. Testing and Validating Machine Learning Classifiers by Metamorphic Testing☆

    PubMed Central

    Xie, Xiaoyuan; Ho, Joshua W. K.; Murphy, Christian; Kaiser, Gail; Xu, Baowen; Chen, Tsong Yueh

    2011-01-01

    Machine learning algorithms have provided core functionality to many application domains, such as bioinformatics and computational linguistics. However, it is difficult to detect faults in such applications because often there is no “test oracle” to verify the correctness of the computed outputs. To help address software quality, in this paper we present a technique for testing the implementations of machine learning classification algorithms which support such applications. Our approach is based on the technique of “metamorphic testing”, which has been shown to be effective in alleviating the oracle problem. We also present a case study on a real-world machine learning application framework, and a discussion of how programmers implementing machine learning algorithms can avoid the common pitfalls discovered in our study. We also conduct mutation analysis and cross-validation, which reveal that our method is highly effective in killing mutants, and that observing the expected cross-validation result alone is not sufficiently effective to detect faults in a supervised classification program. The effectiveness of metamorphic testing is further confirmed by the detection of real faults in a popular open-source classification program. PMID:21532969
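
    A minimal metamorphic test in the spirit of the paper is shown below: permuting the order of the training samples is a metamorphic relation for k-nearest-neighbours, so the predictions must not change, and no per-prediction oracle is needed. The relation and classifier are illustrative choices, not the paper's case study.

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.neighbors import KNeighborsClassifier

    X, y = make_classification(n_samples=200, n_features=8, random_state=0)
    X_test = X[150:]

    clf_a = KNeighborsClassifier(n_neighbors=5).fit(X[:150], y[:150])

    # Metamorphic transform: shuffle the training set. A correct kNN
    # implementation must produce identical predictions.
    perm = np.random.default_rng(0).permutation(150)
    clf_b = KNeighborsClassifier(n_neighbors=5).fit(X[perm], y[perm])

    assert np.array_equal(clf_a.predict(X_test), clf_b.predict(X_test)), \
        "metamorphic relation violated: training-order permutation changed output"
    ```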

  13. Developing a dengue forecast model using machine learning: A case study in China.

    PubMed

    Guo, Pi; Liu, Tao; Zhang, Qin; Wang, Li; Xiao, Jianpeng; Zhang, Qingying; Luo, Ganfeng; Li, Zhihao; He, Jianfeng; Zhang, Yonghui; Ma, Wenjun

    2017-10-01

    In China, dengue remains an important public health issue, with expanded areas and increased incidence in recent years. Accurate and timely forecasts of dengue incidence in China are still lacking. We aimed to use state-of-the-art machine learning algorithms to develop an accurate predictive model of dengue. Weekly dengue cases, Baidu search queries and climate factors (mean temperature, relative humidity and rainfall) during 2011-2014 in Guangdong were gathered. A dengue search index was constructed for developing the predictive models in combination with climate factors. The observed year and week were also included in the models to control for the long-term trend and seasonality. Several machine learning algorithms, including the support vector regression (SVR) algorithm, a step-down linear regression model, the gradient boosted regression tree algorithm (GBM), a negative binomial regression model (NBM), the least absolute shrinkage and selection operator (LASSO) linear regression model and a generalized additive model (GAM), were used as candidate models to predict dengue incidence. Performance and goodness of fit of the models were assessed using the root-mean-square error (RMSE) and R-squared measures. The residuals of the models were examined using autocorrelation and partial autocorrelation function analyses to check the validity of the models. The models were further validated using dengue surveillance data from five other provinces. The epidemics during the last 12 weeks and the peak of the 2014 large outbreak were accurately forecasted by the SVR model selected by a cross-validation technique. Moreover, the SVR model had the consistently smallest prediction error rates for tracking the dynamics of dengue and forecasting the outbreaks in other areas in China. The proposed SVR model achieved superior performance in comparison with the other forecasting techniques assessed in this study. The findings can help the government and community respond early to dengue epidemics.
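
    The winning approach can be sketched as an SVR on weekly predictors (search index and climate factors, plus year and week-of-year for trend and seasonality), tuned by cross-validation on RMSE. The sketch below uses simulated weekly data and time-ordered splits as a reasonable stand-in for the paper's protocol.

    ```python
    import numpy as np
    from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVR

    rng = np.random.default_rng(0)
    n_weeks = 200
    X = np.column_stack([
        np.arange(n_weeks) // 52,          # year (long-term trend)
        np.arange(n_weeks) % 52,           # week of year (seasonality)
        rng.normal(size=(n_weeks, 4)),     # search index + 3 climate factors
    ])
    y = (50 + 30 * np.sin(2 * np.pi * X[:, 1] / 52) + 5 * X[:, 2]
         + rng.normal(0, 3, n_weeks))      # simulated weekly case counts

    svr = make_pipeline(StandardScaler(), SVR())
    grid = GridSearchCV(svr, {"svr__C": [1, 10, 100], "svr__epsilon": [0.1, 1.0]},
                        cv=TimeSeriesSplit(n_splits=5),
                        scoring="neg_root_mean_squared_error")
    grid.fit(X, y)
    print(f"best params {grid.best_params_}, CV RMSE {-grid.best_score_:.1f}")
    ```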

  14. Validating LES for Jet Aeroacoustics

    NASA Technical Reports Server (NTRS)

    Bridges, James; Wernet, Mark P.

    2011-01-01

    Engineers charged with making jet aircraft quieter have long dreamed of being able to see exactly how turbulent eddies produce sound and this dream is now coming true with the advent of large eddy simulation (LES). Two obvious challenges remain: validating the LES codes at the resolution required to see the fluid-acoustic coupling, and the interpretation of the massive datasets that are produced. This paper addresses the former, the use of advanced experimental techniques such as particle image velocimetry (PIV) and Raman and Rayleigh scattering, to validate the computer codes and procedures used to create LES solutions. This paper argues that the issue of accuracy of the experimental measurements be addressed by cross-facility and cross-disciplinary examination of modern datasets along with increased reporting of internal quality checks in PIV analysis. Further, it argues that the appropriate validation metrics for aeroacoustic applications are increasingly complicated statistics that have been shown in aeroacoustic theory to be critical to flow-generated sound, such as two-point space-time velocity correlations. A brief review of data sources available is presented along with examples illustrating cross-facility and internal quality checks required of the data before it should be accepted for validation of LES.

  15. Folding free energy surfaces of three small proteins under crowding: validation of the postprocessing method by direct simulation

    NASA Astrophysics Data System (ADS)

    Qin, Sanbo; Mittal, Jeetain; Zhou, Huan-Xiang

    2013-08-01

    We have developed a ‘postprocessing’ method for modeling biochemical processes such as protein folding under crowded conditions (Qin and Zhou 2009 Biophys. J. 97 12-19). In contrast to the direct simulation approach, in which the protein undergoing folding is simulated along with crowders, the postprocessing method requires only the folding simulation without crowders. The influence of the crowders is then obtained by taking conformations from the crowder-free simulation and calculating the free energies of transferring to the crowders. This postprocessing yields the folding free energy surface of the protein under crowding. Here the postprocessing results for the folding of three small proteins under ‘repulsive’ crowding are validated by those obtained previously by the direct simulation approach (Mittal and Best 2010 Biophys. J. 98 315-20). This validation confirms the accuracy of the postprocessing approach and highlights its distinct advantages in modeling biochemical processes under cell-like crowded conditions, such as enabling an atomistic representation of the test proteins.
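
    The postprocessing idea reduces to Boltzmann reweighting: if a crowder-free simulation gives P0(Q) along a folding coordinate Q, and each snapshot has a transfer free energy dG_tr into the crowded solution, then P_crowd(Q) is proportional to P0(Q) * exp(-dG_tr/kT). A minimal sketch follows, with a simulated trajectory and an assumed dG_tr that penalizes unfolded states; it illustrates the bookkeeping only, not the authors' transfer free energy calculation.

    ```python
    import numpy as np

    kT = 0.593  # kcal/mol at ~298 K
    rng = np.random.default_rng(0)

    Q = rng.beta(2, 2, size=100_000)        # folding coordinate per snapshot
    dG_transfer = 2.0 * (1.0 - Q)           # assumed: unfolded states cost more

    bins = np.linspace(0, 1, 41)
    counts0, _ = np.histogram(Q, bins=bins)                           # no crowders
    countsC, _ = np.histogram(Q, bins=bins, weights=np.exp(-dG_transfer / kT))

    with np.errstate(divide="ignore"):
        pmf0 = -kT * np.log(counts0 / counts0.sum())
        pmfC = -kT * np.log(countsC / countsC.sum())
    # pmfC - pmf0 shows how crowding tilts the surface toward the folded state.
    ```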

  16. R package PRIMsrc: Bump Hunting by Patient Rule Induction Method for Survival, Regression and Classification

    PubMed Central

    Dazard, Jean-Eudes; Choe, Michael; LeBlanc, Michael; Rao, J. Sunil

    2015-01-01

    PRIMsrc is a novel implementation of a non-parametric bump hunting procedure, based on the Patient Rule Induction Method (PRIM), offering a unified treatment of outcome variables, including censored time-to-event (Survival), continuous (Regression) and discrete (Classification) responses. To fit the model, it uses a recursive peeling procedure with specific peeling criteria and stopping rules depending on the response. To validate the model, it provides an objective function based on prediction-error or other specific statistic, as well as two alternative cross-validation techniques, adapted to the task of decision-rule making and estimation in the three types of settings. PRIMsrc comes as an open source R package, including at this point: (i) a main function for fitting a Survival Bump Hunting model with various options allowing cross-validated model selection to control model size (#covariates) and model complexity (#peeling steps) and generation of cross-validated end-point estimates; (ii) parallel computing; (iii) various S3-generic and specific plotting functions for data visualization, diagnostic, prediction, summary and display of results. It is available on CRAN and GitHub. PMID:26798326
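
    For readers unfamiliar with PRIM, the peeling procedure at the heart of the package can be sketched in a few lines: repeatedly remove the thin slice of the current box whose removal most increases the mean outcome, until a minimum support is reached. This is a bare-bones regression-response version in Python, not the PRIMsrc implementation (which adds survival and classification criteria, stopping rules, and cross-validation).

    ```python
    import numpy as np

    def prim_peel(X, y, alpha=0.05, min_support=0.1):
        """Greedy top-box peeling: at each step remove the alpha-fraction slice
        (along one variable) whose removal most increases mean(y)."""
        mask = np.ones(len(y), dtype=bool)
        trajectory = [(mask.mean(), y.mean())]
        while mask.mean() > min_support:
            best_gain, best_cut = -np.inf, None
            for j in range(X.shape[1]):
                xj = X[mask, j]
                for side, q in (("low", np.quantile(xj, alpha)),
                                ("high", np.quantile(xj, 1 - alpha))):
                    keep = mask.copy()
                    keep[mask] = xj > q if side == "low" else xj < q
                    if keep.sum() == 0:
                        continue
                    if y[keep].mean() > best_gain:
                        best_gain, best_cut = y[keep].mean(), keep
            if best_cut is None or best_gain <= y[mask].mean():
                break  # no peel improves the box mean
            mask = best_cut
            trajectory.append((mask.mean(), y[mask].mean()))
        return mask, trajectory

    rng = np.random.default_rng(0)
    X = rng.uniform(size=(500, 4))
    y = ((X[:, 0] > 0.7) & (X[:, 1] < 0.4)).astype(float) + rng.normal(0, 0.1, 500)
    box, path = prim_peel(X, y)
    print(f"box covers {box.mean():.0%} of samples, mean outcome {y[box].mean():.2f}")
    ```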

  17. In vivo validation of a new technique that compensates for soft tissue artefact in the upper-arm: preliminary results.

    PubMed

    Cutti, Andrea Giovanni; Cappello, Angelo; Davalli, Angelo

    2006-01-01

    Soft tissue artefact is the dominant error source for upper extremity motion analyses that use skin-mounted markers, especially in humeral axial rotation. A new in vivo technique is presented that is based on the definition of a humerus bone-embedded frame almost "artefact free" but influenced by the elbow orientation in the measurement of the humeral axial rotation, and on an algorithm designed to solve this kinematic coupling. The technique was validated in vivo in a study of six healthy subjects who performed five arm-movement tasks. For each task the similarity between a gold standard pattern and the axial rotation pattern before and after the application of the compensation algorithm was evaluated in terms of explained variance, gain, phase and offset. In addition the root mean square error between the patterns was used as a global similarity estimator. After the application, for four out of five tasks, patterns were highly correlated, in phase, with almost equal gain and limited offset; the root mean square error decreased from the original 9 degrees to 3 degrees. The proposed technique appears to help compensate for the soft tissue artefact affecting axial rotation. A further development is also proposed to make the technique effective for the pure prono-supination task.

  18. Accurate prediction of RNA-binding protein residues with two discriminative structural descriptors.

    PubMed

    Sun, Meijian; Wang, Xia; Zou, Chuanxin; He, Zenghui; Liu, Wei; Li, Honglin

    2016-06-07

    RNA-binding proteins participate in many important biological processes concerning RNA-mediated gene regulation, and several computational methods have recently been developed to predict the protein-RNA interactions of RNA-binding proteins. Newly developed discriminative descriptors will help to improve the prediction accuracy of these methods and provide further meaningful information for researchers. In this work, we designed two structural features (residue electrostatic surface potential and triplet interface propensity) and, according to statistical and structural analysis of protein-RNA complexes, the two features proved powerful for identifying RNA-binding protein residues. Using these two features and other well-performing structure- and sequence-based features, a random forest classifier was constructed to predict RNA-binding residues. The area under the receiver operating characteristic curve (AUC) of five-fold cross-validation for our method on the training set RBP195 was 0.900, and when applied to the test set RBP68, the prediction accuracy (ACC) was 0.868 and the F-score was 0.631. The good prediction performance of our method revealed that the two newly designed descriptors could be discriminative for inferring protein residues interacting with RNAs. To facilitate the use of our method, a web server called RNAProSite, which implements the proposed method, was constructed and is freely available at http://lilab.ecust.edu.cn/NABind.
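
    The evaluation protocol above maps directly onto standard tooling. As a hedged sketch (the feature matrix here is synthetic stand-in data, not the RBP195 features), a random forest with stratified five-fold cross-validation scored by AUC looks like this in scikit-learn:

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import StratifiedKFold, cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 20))       # stand-in residue feature vectors
    y = rng.integers(0, 2, size=500)     # 1 = RNA-binding residue

    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    clf = RandomForestClassifier(n_estimators=500, random_state=0)
    scores = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")
    print(f"five-fold CV AUC: {scores.mean():.3f} +/- {scores.std():.3f}")
    ```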

  19. Improving Genomic Prediction in Cassava Field Experiments Using Spatial Analysis.

    PubMed

    Elias, Ani A; Rabbi, Ismail; Kulakow, Peter; Jannink, Jean-Luc

    2018-01-04

    Cassava (Manihot esculenta Crantz) is an important staple food in sub-Saharan Africa. Breeding experiments were conducted at the International Institute of Tropical Agriculture in cassava to select elite parents. Taking into account the heterogeneity in the field while evaluating these trials can increase the accuracy in estimation of breeding values. We used an exploratory approach using the parametric spatial kernels Power, Spherical, and Gaussian to determine the best kernel for a given scenario. The spatial kernel was fit simultaneously with a genomic kernel in a genomic selection model. Predictability of these models was tested through a 10-fold cross-validation method repeated five times. The best model was chosen as the one with the lowest prediction root mean squared error compared to that of the base model having no spatial kernel. Results from our real and simulated data studies indicated that predictability can be increased by accounting for spatial variation irrespective of the heritability of the trait. In real data scenarios we observed that the accuracy can be increased by a median value of 3.4%. Through simulations, we showed that a 21% increase in accuracy can be achieved. We also found that Range (row) directional spatial kernels, mostly Gaussian, explained the spatial variance in 71% of the scenarios when spatial correlation was significant. Copyright © 2018 Elias et al.
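
    The parametric spatial kernels named above are standard covariance functions of inter-plot distance. As a rough illustration (not the authors' code), the sketch below builds a Gaussian kernel from field row/column coordinates; such a matrix could then sit beside a genomic kernel as a second random-effect covariance in the selection model. The function name and the range parameter `rho` are assumptions.

    ```python
    import numpy as np

    def gaussian_spatial_kernel(rows, cols, rho):
        """Gaussian spatial covariance between field plots.

        rows, cols -- row/column position of each plot in the field
        rho        -- assumed range parameter governing correlation decay
        """
        coords = np.column_stack([rows, cols]).astype(float)
        d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
        return np.exp(-((d / rho) ** 2))

    # a 10 x 10 field laid out row by row
    K = gaussian_spatial_kernel(np.repeat(np.arange(10), 10),
                                np.tile(np.arange(10), 10), rho=2.0)
    ```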

  20. Comparison of Grouping Schemes for Exposure to Total Dust in Cement Factories in Korea.

    PubMed

    Koh, Dong-Hee; Kim, Tae-Woo; Jang, Seung Hee; Ryu, Hyang-Woo; Park, Donguk

    2015-08-01

    The purpose of this study was to evaluate grouping schemes for exposure to total dust in cement industry workers using non-repeated measurement data. In total, 2370 total dust measurements taken from nine Portland cement factories in 1995-2009 were analyzed. Various grouping schemes were generated based on work process, job, factory, or average exposure. To characterize variance components of each grouping scheme, we developed mixed-effects models with a B-spline time trend incorporated as fixed effects and a grouping variable incorporated as a random effect. Using the estimated variance components, elasticity was calculated. To compare the prediction performances of different grouping schemes, 10-fold cross-validation tests were conducted, and root mean squared errors and pooled correlation coefficients were calculated for each grouping scheme. The five exposure groups created a posteriori by ranking job and factory combinations according to average dust exposure showed the best prediction performance and highest elasticity among various grouping schemes. Our findings suggest that a grouping method based on ranking of job and factory combinations would be the optimal choice in this population. Our grouping method may aid exposure assessment efforts in similar occupational settings, minimizing the misclassification of exposures. © The Author 2015. Published by Oxford University Press on behalf of the British Occupational Hygiene Society.
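
    A hedged sketch of the variance-components step: in statsmodels, a mixed-effects model with a B-spline time trend (fixed) and a grouping variable (random) yields the between- and within-group variances from which elasticity, the between-group share of total variance, follows. The column names (log_dust, year, group) are hypothetical, not the study's variable names.

    ```python
    import statsmodels.formula.api as smf

    def group_elasticity(df):
        """Elasticity (between-group share of total variance) of one scheme.

        df is assumed to carry columns log_dust, year and group, where group
        encodes a hypothetical scheme such as ranked job-factory combinations.
        """
        # B-spline time trend as fixed effect, grouping variable as random effect
        fit = smf.mixedlm("log_dust ~ bs(year, df=4)", df,
                          groups=df["group"]).fit()
        between = float(fit.cov_re.iloc[0, 0])  # between-group variance
        within = float(fit.scale)               # residual within-group variance
        return between / (between + within)
    ```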

  1. Distant failure prediction for early stage NSCLC by analyzing PET with sparse representation

    NASA Astrophysics Data System (ADS)

    Hao, Hongxia; Zhou, Zhiguo; Wang, Jing

    2017-03-01

    Positron emission tomography (PET) imaging has been widely explored for treatment outcome prediction. Radiomics-driven methods provide a new insight to quantitatively explore underlying information from PET images. However, it is still a challenging problem to automatically extract clinically meaningful features for prognosis. In this work, we develop a PET-guided distant failure predictive model for early stage non-small cell lung cancer (NSCLC) patients after stereotactic ablative radiotherapy (SABR) by using sparse representation. The proposed method does not need precalculated features and can learn intrinsically distinctive features contributing to classification of patients with distant failure. The proposed framework includes two main parts: 1) intra-tumor heterogeneity description; and 2) dictionary pair learning based sparse representation. Tumor heterogeneity is initially captured through an anisotropic kernel and represented as a set of concatenated vectors, which forms the sample gallery. Then, given a test tumor image, its identity (i.e., distant failure or not) is classified by applying the dictionary pair learning based sparse representation. We evaluate the proposed approach on 48 NSCLC patients treated by SABR at our institute. Experimental results show that the proposed approach can achieve an area under the receiver operating characteristic curve (AUC) of 0.70 with a sensitivity of 69.87% and a specificity of 69.51% using a five-fold cross validation.
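
    To make the classification step concrete, here is a minimal sketch of classical sparse-representation classification, a simpler relative of the dictionary pair learning the authors use: the test vector is coded sparsely over the training gallery, and the class whose atoms reconstruct it with the smallest residual wins. The function and its arguments are illustrative assumptions.

    ```python
    import numpy as np
    from sklearn.linear_model import OrthogonalMatchingPursuit

    def src_classify(gallery, labels, test_vec, n_nonzero=10):
        """Sparse-representation classification over a training gallery.

        gallery  -- (n_features, n_samples) column-stacked training vectors
        labels   -- class of each column (e.g. 1 = distant failure)
        test_vec -- concatenated feature vector of the test tumor
        """
        omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero,
                                        fit_intercept=False)
        omp.fit(gallery, test_vec)           # sparse code over gallery atoms
        coef = omp.coef_
        best, best_res = None, np.inf
        for c in np.unique(labels):
            mask = labels == c
            res = np.linalg.norm(test_vec - gallery[:, mask] @ coef[mask])
            if res < best_res:               # smallest class-wise residual wins
                best, best_res = c, res
        return best
    ```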

  2. Concurrent Validity of the International Family Quality of Life Survey.

    PubMed

    Samuel, Preethy S; Pociask, Fredrick D; DiZazzo-Miller, Rosanne; Carrellas, Ann; LeRoy, Barbara W

    2016-01-01

    The measurement of the social construct of Family Quality of Life (FQOL) is a parsimonious alternative to the current approach of measuring familial outcomes using a battery of tools related to individual-level outcomes. The purpose of this study was to examine the internal consistency and concurrent validity of the International FQOL Survey (FQOLS-2006), using cross-sectional data collected from 65 family caregivers of children with developmental disabilities. Results showed a moderate correlation between the total FQOL scores of the FQOLS-2006 and the Beach Center's FQOL scale. The validity of five FQOLS-2006 domains was supported by the correlations between conceptually related domains.

  3. Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning

    PubMed Central

    Hoo-Chang, Shin; Roth, Holger R.; Gao, Mingchen; Lu, Le; Xu, Ziyue; Nogues, Isabella; Yao, Jianhua; Mollura, Daniel

    2016-01-01

    Remarkable progress has been made in image recognition, primarily due to the availability of large-scale annotated datasets (i.e., ImageNet) and the revival of deep convolutional neural networks (CNN). CNNs enable learning data-driven, highly representative, layered hierarchical image features from sufficient training data. However, obtaining datasets as comprehensively annotated as ImageNet in the medical imaging domain remains a challenge. There are currently three major techniques that successfully employ CNNs for medical image classification: training the CNN from scratch, using off-the-shelf pre-trained CNN features, and conducting unsupervised CNN pre-training with supervised fine-tuning. Another effective method is transfer learning, i.e., fine-tuning CNN models (supervised) pre-trained on a natural image dataset to medical image tasks (although domain transfer between two medical image datasets is also possible). In this paper, we exploit three important, but previously understudied factors of employing deep convolutional neural networks to computer-aided detection problems. We first explore and evaluate different CNN architectures. The studied models contain 5 thousand to 160 million parameters, and vary in numbers of layers. We then evaluate the influence of dataset scale and spatial image context on performance. Finally, we examine when and why transfer learning from pre-trained ImageNet (via fine-tuning) can be useful. We study two specific computer-aided detection (CADe) problems, namely thoraco-abdominal lymph node (LN) detection and interstitial lung disease (ILD) classification. We achieve the state-of-the-art performance on the mediastinal LN detection, with 85% sensitivity at 3 false positives per patient, and report the first five-fold cross-validation classification results on predicting axial CT slices with ILD categories. Our extensive empirical evaluation, CNN model analysis and valuable insights can be extended to the design of high performance CAD systems for other medical imaging tasks. PMID:26886976
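
    As a hedged illustration of the transfer learning recipe examined above (fine-tuning an ImageNet-pretrained CNN on a medical task), the PyTorch sketch below swaps the classification head of a pretrained ResNet-18 for a two-class CADe head. The paper studied other architectures, so the model choice here is an assumption.

    ```python
    import torch
    import torch.nn as nn
    from torchvision import models

    # ImageNet-pretrained backbone; ResNet-18 is an illustrative stand-in
    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    for p in model.parameters():
        p.requires_grad = False                        # freeze learned features
    model.fc = nn.Linear(model.fc.in_features, 2)      # new two-class CADe head

    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()
    # ...a standard training loop over labeled medical image batches follows...
    ```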

  4. Applying Mondrian Cross-Conformal Prediction To Estimate Prediction Confidence on Large Imbalanced Bioactivity Data Sets.

    PubMed

    Sun, Jiangming; Carlsson, Lars; Ahlberg, Ernst; Norinder, Ulf; Engkvist, Ola; Chen, Hongming

    2017-07-24

    Conformal prediction has been proposed as a more rigorous way to define prediction confidence compared to the applicability domain concepts that have earlier been used for QSAR modeling. One main advantage of such a method is that it provides a prediction region, potentially with multiple predicted labels, which contrasts with the single-valued (regression) or single-label (classification) output predictions of standard QSAR modeling algorithms. Standard conformal prediction might not be suitable for imbalanced data sets. Therefore, Mondrian cross-conformal prediction (MCCP), which combines Mondrian inductive conformal prediction with cross-fold calibration sets, has been introduced. In this study, the MCCP method was applied to 18 publicly available data sets that have various imbalance levels varying from 1:10 to 1:1000 (ratio of active/inactive compounds). Our results show that MCCP in general performed well on bioactivity data sets with various imbalance levels. More importantly, the method not only provides confidence of prediction and prediction regions compared to standard machine learning methods but also produces valid predictions for the minority class. In addition, a compound-similarity-based nonconformity measure was investigated. Our results demonstrate that although it gives valid predictions, its efficiency is much worse than that of model-dependent metrics.
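
    The core of the Mondrian scheme is class-conditional calibration: each candidate test label is ranked only against calibration examples of that same class, which is what keeps predictions valid for the minority class. Below is a minimal single-split sketch (MCCP additionally aggregates over cross-fold calibration sets); the probability-based nonconformity score is an assumption, not the paper's exact measure.

    ```python
    import numpy as np

    def mondrian_icp_pvalues(cal_probs, cal_labels, test_probs):
        """Class-conditional (Mondrian) conformal p-values.

        cal_probs  -- (n_cal, n_classes) predicted probabilities, calibration set
        cal_labels -- true labels of the calibration examples
        test_probs -- (n_test, n_classes) predicted probabilities, test set
        """
        n_test, n_classes = test_probs.shape
        p = np.zeros((n_test, n_classes))
        for c in range(n_classes):
            # nonconformity: 1 - probability assigned to the candidate class;
            # ranking is done only against calibration examples of class c
            cal_scores = 1.0 - cal_probs[cal_labels == c, c]
            test_scores = 1.0 - test_probs[:, c]
            p[:, c] = ((cal_scores[None, :] >= test_scores[:, None]).sum(axis=1)
                       + 1) / (len(cal_scores) + 1)
        return p  # predict the label set {c : p[:, c] > significance level}
    ```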

  5. Cross-cultural adaptation of the Disability of Arm, Shoulder, and Hand questionnaire: Spanish for Puerto Rico Version

    PubMed Central

    Mulero-Portela, Ana L.; Colón-Santaella, Carmen L.; Cruz-Gomez, Cynthia

    2010-01-01

    The purpose of this study was to perform a cross-cultural adaptation of the Disability of Arm, Shoulder, and Hand (DASH) questionnaire to Spanish for Puerto Rico. Five steps were followed for the cross-cultural adaptation: forward translations into Spanish for Puerto Rico, synthesis of the translations, back translations into English, revision by an expert committee, and field test of the prefinal version. Psychometric characteristics of reliability and construct validity were evaluated for the final version. Internal consistency of the final version was high (Cronbach's α = 0.97) and item-to-total correlations were moderate (range from 0.44 to 0.85). Construct validity was evaluated by correlating the DASH with the scales of the Functional Assessment of Cancer Therapy - Breast. Fair to moderate correlations found in this study between the DASH and most scales of the Functional Assessment of Cancer Therapy - Breast support the construct validity of the Puerto Rico-Spanish DASH. The final version of the questionnaire was revised and approved by the Institute for Work and Health of Canada. Revisions to the original DASH English version are recommended. This version of the DASH is valid and reliable, and it can be used to evaluate outcomes in both clinical and research settings. PMID:19901616

  6. Cross-Study Homogeneity of Psoriasis Gene Expression in Skin across a Large Expression Range

    PubMed Central

    Kerkof, Keith; Timour, Martin; Russell, Christopher B.

    2013-01-01

    Background: In psoriasis, only limited overlap between sets of genes identified as differentially expressed (psoriatic lesional (PP) vs. psoriatic non-lesional (PN)) was found using statistical and fold-change cut-offs. To provide a framework for utilizing prior psoriasis data sets, we sought to understand the consistency of those sets.

    Methodology/Principal Findings: Microarray expression profiling and qRT-PCR were used to characterize gene expression in PP and PN skin from psoriasis patients. cDNA (three new data sets) and cRNA hybridization (four existing data sets) data were compared using a common analysis pipeline. Agreement between data sets was assessed using varying qualitative and quantitative cut-offs to generate a DEG list in a source data set and then using other data sets to validate the list. Concordance increased from 67% across all probe sets to over 99% across more than 10,000 probe sets when statistical filters were employed. The fold-change behavior of individual genes tended to be consistent across the multiple data sets. We found that genes with <2-fold change values were quantitatively reproducible between pairs of data sets. In a subset of transcripts with a role in inflammation, changes detected by microarray were confirmed by qRT-PCR with high concordance. For transcripts with both PN and PP levels within the microarray dynamic range, microarray and qRT-PCR were quantitatively reproducible, including minimal fold-changes in IL13, TNFSF11, and TNFRSF11B and genes with >10-fold changes in either direction such as CHRM3, IL12B and IFNG.

    Conclusions/Significance: Gene expression changes in psoriatic lesions were consistent across different studies, despite differences in patient selection, sample handling, and microarray platforms, but between-study comparisons showed stronger agreement within than between platforms. We could use cut-offs as low as log10(ratio) = 0.1 (fold-change = 1.26), generating larger gene lists that validate on independent data sets. The reproducibility of PP signatures across data sets suggests that different sample sets can be productively compared. PMID:23308107

  7. TU-D-207B-01: A Prediction Model for Distinguishing Radiation Necrosis From Tumor Progression After Gamma Knife Radiosurgery Based On Radiomics Features From MR Images

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Z; MD Anderson Cancer Center, Houston, TX; Ho, A

    Purpose: To develop and validate a prediction model using radiomics features extracted from MR images to distinguish radiation necrosis from tumor progression for brain metastases treated with Gamma knife radiosurgery.

    Methods: The images used to develop the model were T1 post-contrast MR scans from 71 patients who had had pathologic confirmation of necrosis or progression; 1 lesion was identified per patient (17 necrosis and 54 progression). Radiomics features were extracted from 2 images at 2 time points per patient, both obtained prior to resection. Each lesion was manually contoured on each image, and 282 radiomics features were calculated for each lesion. The correlation for each radiomics feature between two time points was calculated within each group to identify a subset of features with distinct values between two groups. The delta of this subset of radiomics features, characterizing changes from the earlier time to the later one, was included as a covariate to build a prediction model using support vector machines with a cubic polynomial kernel function. The model was evaluated with a 10-fold cross-validation.

    Results: Forty radiomics features were selected based on consistent correlation values of approximately 0 for the necrosis group and >0.2 for the progression group. In performing the 10-fold cross-validation, we narrowed this number down to 11 delta radiomics features for the model. This 11-delta-feature model showed an overall prediction accuracy of 83.1%, with a true positive rate of 58.8% in predicting necrosis and 90.7% for predicting tumor progression. The area under the curve for the prediction model was 0.79.

    Conclusion: These delta radiomics features extracted from MR scans showed potential for distinguishing radiation necrosis from tumor progression. This tool may be a useful, noninvasive means of determining the status of an enlarging lesion after radiosurgery, aiding decision-making regarding surgical resection versus conservative medical management.
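
    The classifier and validation described above translate directly to scikit-learn. The sketch below is a hedged stand-in, not the authors' code: a cubic polynomial kernel SVM evaluated by 10-fold cross-validation over an assumed matrix of 11 delta features.

    ```python
    from sklearn.model_selection import StratifiedKFold, cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    def evaluate_delta_features(X, y):
        """X: (n_patients, 11) delta radiomics features; y: 0 necrosis, 1 progression."""
        clf = make_pipeline(StandardScaler(),
                            SVC(kernel="poly", degree=3))  # cubic polynomial kernel
        cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
        return cross_val_score(clf, X, y, cv=cv)           # per-fold accuracy
    ```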

  8. Prediction of redox-sensitive cysteines using sequential distance and other sequence-based features.

    PubMed

    Sun, Ming-An; Zhang, Qing; Wang, Yejun; Ge, Wei; Guo, Dianjing

    2016-08-24

    Reactive oxygen species can modify the structure and function of proteins and may also act as important signaling molecules in various cellular processes. Cysteine thiol groups of proteins are particularly susceptible to oxidation; meanwhile, their reversible oxidation plays critical roles in redox regulation and signaling. Recently, several computational tools have been developed for predicting redox-sensitive cysteines; however, those methods either focus only on catalytic redox-sensitive cysteines in thiol oxidoreductases, or depend heavily on protein structural data, and thus cannot be widely used. In this study, we analyzed various sequence-based features potentially related to cysteine redox-sensitivity, and identified three types of features for efficient computational prediction of redox-sensitive cysteines. These features are: sequential distance to the nearby cysteines, PSSM profile and predicted secondary structure of flanking residues. After further feature selection using SVM-RFE, we developed the Redox-Sensitive Cysteine Predictor (RSCP), an SVM based classifier for redox-sensitive cysteine prediction using primary sequence only. Using 10-fold cross-validation on the RSC758 dataset, the accuracy, sensitivity, specificity, MCC and AUC were estimated as 0.679, 0.602, 0.756, 0.362 and 0.727, respectively. When evaluated using 10-fold cross-validation with the BALOSCTdb dataset, which has structure information, the model achieved performance comparable to current structure-based methods. Further validation using an independent dataset indicates it is robust and comparatively more accurate for predicting redox-sensitive cysteines from non-enzyme proteins. In this study, we developed a sequence-based classifier for predicting redox-sensitive cysteines. The major advantage of this method is that it does not rely on protein structure data, which ensures more extensive application compared to other current implementations. Accurate prediction of redox-sensitive cysteines not only enhances our understanding of the redox sensitivity of cysteine, it may also complement the proteomics approach and facilitate further experimental investigation of important redox-sensitive cysteines.
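
    SVM-RFE, the feature selection step named above, recursively drops the features with the smallest linear-SVM weights. A minimal scikit-learn sketch follows; the feature matrix contents and the number of retained features are assumptions.

    ```python
    from sklearn.feature_selection import RFE
    from sklearn.svm import SVC

    def svm_rfe(X, y, n_keep=50):
        """Rank and prune features with a linear SVM, as in SVM-RFE.

        X -- sequence-derived features (cysteine spacing, PSSM window,
             predicted secondary structure); y -- 1 = redox-sensitive
        """
        selector = RFE(SVC(kernel="linear"),     # linear weights drive ranking
                       n_features_to_select=n_keep, step=1)
        selector.fit(X, y)
        return selector.support_                 # boolean mask of kept features
    ```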

  9. Resistance of green lacewing, Chrysoperla carnea Stephens to nitenpyram: Cross-resistance patterns, mechanism, stability, and realized heritability.

    PubMed

    Mansoor, Muhammad Mudassir; Raza, Abu Bakar Muhammad; Abbas, Naeem; Aqueel, Muhammad Anjum; Afzal, Muhammad

    2017-01-01

    The green lacewing, Chrysoperla carnea Stephens (Neuroptera: Chrysopidae), is a major generalist predator employed in integrated pest management (IPM) plans for pest control on many crops. Nitenpyram, a neonicotinoid insecticide, has been widely used against the sucking pests of cotton in Pakistan. Therefore, a field green lacewing strain was exposed to nitenpyram for five generations to investigate resistance evolution, cross-resistance patterns, stability, realized heritability, and mechanisms of resistance. Before starting the selection with nitenpyram, the field-collected strain showed 22.08-, 23.09-, 484.69- and 602.90-fold resistance to nitenpyram, buprofezin, spinosad and acetamiprid, respectively, compared with the Susceptible strain. After continuous selection for five generations (G1-G5) with nitenpyram in the laboratory, the Field strain (Niten-SEL) developed a resistance ratio of 423.95 at G6. The Niten-SEL strain at G6 showed no cross-resistance to buprofezin and acetamiprid and negative cross-resistance to spinosad compared with the Field strain (G1). For resistance stability, the Niten-SEL strain was left unexposed to any insecticide for four generations (G6-G9), and bioassay results at G10 showed that resistance to nitenpyram, buprofezin and spinosad was stable, while resistance to acetamiprid was unstable. The realized heritability values were 0.97, 0.16, 0.03, and -0.16 for nitenpyram, buprofezin, acetamiprid and spinosad, respectively, after five generations of selection. Moreover, the enzyme inhibitors (PBO or DEF) significantly decreased the nitenpyram resistance in the resistant strain, suggesting that resistance was due to microsomal oxidases and esterases. These results are very helpful for integration of green lacewings in IPM programs. Copyright © 2016 Elsevier B.V. All rights reserved.

  10. Automated segmentation of geographic atrophy using deep convolutional neural networks

    NASA Astrophysics Data System (ADS)

    Hu, Zhihong; Wang, Ziyuan; Sadda, SriniVas R.

    2018-02-01

    Geographic atrophy (GA) is an end-stage manifestation of advanced age-related macular degeneration (AMD), the leading cause of blindness and visual impairment in developed nations. Techniques to rapidly and precisely detect and quantify GA would appear to be of critical importance in advancing the understanding of its pathogenesis. In this study, we develop an automated supervised classification system using deep convolutional neural networks (CNNs) for segmenting GA in fundus autofluorescence (FAF) images. More specifically, to enhance the contrast of GA relative to the background, we apply contrast limited adaptive histogram equalization. Blood vessels may cause GA segmentation errors due to their similar intensity level to GA. A tensor-voting technique is performed to identify the blood vessels, and a vessel inpainting technique is applied to suppress the GA segmentation errors due to the blood vessels. To handle the large variation of GA lesion sizes, three deep CNNs with three differently sized input image patches are applied. Fifty randomly chosen FAF images were obtained from fifty subjects with GA. The algorithm-defined GA regions are compared with manual delineation by a certified grader. A two-fold cross-validation is applied to evaluate the algorithm performance. The mean segmentation accuracy, true positive rate (i.e. sensitivity), true negative rate (i.e. specificity), positive predictive value, false discovery rate, and overlap ratio between the algorithm- and manually-defined GA regions are 0.97 ± 0.02, 0.89 ± 0.08, 0.98 ± 0.02, 0.87 ± 0.12, 0.13 ± 0.12, and 0.79 ± 0.12 respectively, demonstrating a high level of agreement.
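
    Two steps of this pipeline translate directly into code: the CLAHE contrast enhancement and the multi-scale patch extraction feeding the three CNNs. The sketch below uses scikit-image for the former; the function names, the clip limit and the patch geometry are assumptions, not the authors' settings.

    ```python
    import numpy as np
    from skimage import exposure

    def preprocess_faf(img):
        """CLAHE contrast enhancement of a fundus autofluorescence image."""
        img = img.astype(float) / img.max()          # rescale to [0, 1]
        return exposure.equalize_adapthist(img, clip_limit=0.03)

    def extract_patches(img, centers, size):
        """Square patches around pixels; one of the three assumed CNN scales."""
        h = size // 2
        return np.stack([img[r - h:r + h + 1, c - h:c + h + 1]
                         for r, c in centers])
    ```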

  11. Real Alerts and Artifact Classification in Archived Multi-signal Vital Sign Monitoring Data—Implications for Mining Big Data

    PubMed Central

    Hravnak, Marilyn; Chen, Lujie; Dubrawski, Artur; Bose, Eliezer; Clermont, Gilles; Pinsky, Michael R.

    2015-01-01

    PURPOSE: Huge hospital information system databases can be mined for knowledge discovery and decision support, but artifact in stored non-invasive vital sign (VS) high-frequency data streams limits their use. We used machine-learning (ML) algorithms trained on expert-labeled VS data streams to automatically classify VS alerts as real or artifact, thereby “cleaning” such data for future modeling.

    METHODS: 634 admissions to a step-down unit had recorded continuous noninvasive VS monitoring data (heart rate [HR], respiratory rate [RR], peripheral arterial oxygen saturation [SpO2] at 1/20 Hz, and noninvasive oscillometric blood pressure [BP]). Periods when data crossed stability thresholds defined VS event epochs. Data were divided into Block 1, the ML training/cross-validation set, and Block 2, the test set. Expert clinicians annotated Block 1 events as perceived real or artifact. After feature extraction, ML algorithms were trained to create and validate models automatically classifying events as real or artifact. The models were then tested on Block 2.

    RESULTS: Block 1 yielded 812 VS events, with 214 (26%) judged by experts as artifact (RR 43%, SpO2 40%, BP 15%, HR 2%). ML algorithms applied to the Block 1 training/cross-validation set (10-fold cross-validation) gave area under the curve (AUC) scores of 0.97 for RR, 0.91 for BP and 0.76 for SpO2. Performance when applied to Block 2 test data was AUC 0.94 for RR, 0.84 for BP and 0.72 for SpO2.

    CONCLUSIONS: ML-defined algorithms applied to archived multi-signal continuous VS monitoring data allowed accurate automated classification of VS alerts as real or artifact, and could support data mining for future model building. PMID:26438655

  12. Two-dimensional strain-mapping by electron backscatter diffraction and confocal Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Gayle, Andrew J.; Friedman, Lawrence H.; Beams, Ryan; Bush, Brian G.; Gerbig, Yvonne B.; Michaels, Chris A.; Vaudin, Mark D.; Cook, Robert F.

    2017-11-01

    The strain field surrounding a spherical indentation in silicon is mapped in two dimensions (2-D) using electron backscatter diffraction (EBSD) cross-correlation and confocal Raman spectroscopy techniques. The 200 mN indentation created a 4 μm diameter residual contact impression in the silicon (001) surface. Maps of an approximately 50 μm × 50 μm area at 128 pixels × 128 pixels were generated in several hours, extending, through mutual comparison, assessment of the accuracy of both techniques to mapping multiaxial strain states in 2-D. EBSD measurements showed a residual strain field dominated by in-surface normal and shear strains, with alternating tensile and compressive lobes extending about three to four indentation diameters from the contact and exhibiting two-fold symmetry. Raman measurements showed a residual Raman shift field, dominated by positive shifts, also extending about three to four indentation diameters from the contact but exhibiting four-fold symmetry. The 2-D EBSD results, in combination with a mechanical-spectroscopic analysis, were used to successfully predict the 2-D Raman shift map in scale, symmetry, and shift magnitude. Both techniques should be useful in enhancing the reliability of microelectromechanical systems (MEMS) through identification of the 2-D strain fields in MEMS devices.

  13. Polymer Uncrossing and Knotting in Protein Folding, and Their Role in Minimal Folding Pathways

    PubMed Central

    Mohazab, Ali R.; Plotkin, Steven S.

    2013-01-01

    We introduce a method for calculating the extent to which chain non-crossing is important in the most efficient, optimal trajectories or pathways for a protein to fold. This involves recording all unphysical crossing events of a ghost chain, and calculating the minimal uncrossing cost that would have been required to avoid such events. A depth-first tree search algorithm is applied to find minimal transformations to fold α, β, α/β, and knotted proteins. In all cases, the extra uncrossing/non-crossing distance is a small fraction of the total distance travelled by a ghost chain. Different structural classes may be distinguished by the amount of extra uncrossing distance, and the effectiveness of such discrimination is compared with other order parameters. It was seen that non-crossing distance over chain length provided the best discrimination between structural and kinetic classes. The scaling of non-crossing distance with chain length implies an inevitable crossover to entanglement-dominated folding mechanisms for sufficiently long chains. We further quantify the minimal folding pathways by collecting the sequence of uncrossing moves, which generally involve leg, loop, and elbow-like uncrossing moves, and rendering the collection of these moves over the unfolded ensemble as a multiple-transformation “alignment”. The consensus minimal pathway is constructed and shown schematically for representative cases of an α, a β, and a knotted protein. An overlap parameter is defined between pathways; we find that α proteins have minimal overlap, indicating diverse folding pathways, knotted proteins are highly constrained to follow a dominant pathway, and β proteins are somewhere in between. Thus we have shown how topological chain constraints can induce dominant pathway mechanisms in protein folding. PMID:23365638

  14. Hazards and Health Risks Encountered by Manual Sand Dredgers from Udupi, India: A Cross-sectional Study

    PubMed Central

    Shaikh, Alfiya; Nayak, Priyanka; Navada, Rajesh

    2017-01-01

    Introduction: Globalization and urbanization have resulted in an increased demand for sand dredging. Legal and environmental restrictions on automated dredging have led to a rise in the manual technique. The working techniques and environment involved in manual sand dredging may expose the workers to multiple work-related disorders.

    Aim: To determine the health risks and occupational hazards involved in manual sand dredging.

    Materials and Methods: An assessment schedule was developed and its content validated by five experts for the study. A cross-sectional study was then conducted using this assessment schedule. Thirty manual sand dredgers were recruited from three randomly selected docks on the Swarna riverbed in Udupi district, Karnataka, India. Detailed work and worksite assessments were conducted using systematic observation and close-ended questions. Work-related health risk evaluation included on-site evaluation and self-reported health complaints.

    Results: The prevalence of musculoskeletal pain and discomfort was 93.34%, with the lower back (70%), shoulder (56.7%) and neck (46.7%) being the most commonly involved regions. The prevalence of sensory deficits at multiple sites and of ear pain was 66.6% and 76.6%, respectively. All the workers recruited complained of dermatological and ophthalmic involvement. A lack of health and safety measures, such as personal protective devices and security schemes, was also identified.

    Conclusion: This study shows a high prevalence of multiple work-related disorders and hazards involved in manual sand dredging, a highly demanding job in coastal Karnataka. A lack of health and safety measures was also identified. PMID:28892936

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gauld, Ian C.; Giaquinto, J. M.; Delashmitt, J. S.

    Destructive radiochemical assay measurements of spent nuclear fuel rod segments from an assembly irradiated in the Three Mile Island unit 1 (TMI-1) pressurized water reactor have been performed at Oak Ridge National Laboratory (ORNL). Assay data are reported for five samples from two fuel rods of the same assembly. The TMI-1 assembly was a 15 × 15 design with an initial enrichment of 4.013 wt% 235U, and the measured samples achieved burnups between 45.5 and 54.5 gigawatt days per metric ton of initial uranium (GWd/t). Measurements were performed mainly using inductively coupled plasma mass spectrometry after elemental separation via high performance liquid chromatography. High precision measurements were achieved using isotope dilution techniques for many of the lanthanides, uranium, and plutonium isotopes. Measurements are reported for more than 50 different isotopes and 16 elements. One of the two TMI-1 fuel rods measured in this work had been measured previously by Argonne National Laboratory (ANL), and those data have been widely used to support code and nuclear data validation. The recent ORNL measurements thus provided an important opportunity to independently cross-check results against the previous ANL measurements. The measured nuclide concentrations are used to validate burnup calculations using the SCALE nuclear systems modeling and simulation code suite. These results show that the new measurements provide reliable benchmark data for computer code validation.

  16. Combined Sensory Impairment (Deaf-Blindness) in Five Percent of Adults with Intellectual Disabilities

    ERIC Educational Resources Information Center

    Meuwese-Jongejeugd, Anneke; van Splunder, Jacques; Vink, Marianne; Stilma, Jan Sietse; van Zanten, Bert; Verschuure, Hans; Bernsen, Roos; Evenhuis, Heleen

    2008-01-01

    Our purpose in this cross-sectional study with 1,598 adult clients who had intellectual disabilities was to obtain valid prevalences of sensory impairments and to identify associations. The diagnoses were made through ophthalmologic and audiometric assessments, applying WHO/IASSID definitions. Re-weighted prevalences were 5.0% (95% CI 3.9-6.2%)…

  17. Estimation of personal PM2.5 and BC exposure by a modeling approach - Results of a panel study in Shanghai, China.

    PubMed

    Chen, Chen; Cai, Jing; Wang, Cuicui; Shi, Jingjin; Chen, Renjie; Yang, Changyuan; Li, Huichu; Lin, Zhijing; Meng, Xia; Zhao, Ang; Liu, Cong; Niu, Yue; Xia, Yongjie; Peng, Li; Zhao, Zhuohui; Chillrud, Steven; Yan, Beizhan; Kan, Haidong

    2018-06-06

    Epidemiologic studies of PM2.5 (particulate matter with aerodynamic diameter ≤2.5 μm) and black carbon (BC) typically use ambient measurements as exposure proxies, given that individual measurement is infeasible among large populations. Failure to account for variation in exposure will bias epidemiologic study results, and the ability of ambient measurements to serve as a proxy of exposure in regions with heavy pollution is untested. We aimed to investigate the effects of potential determinants and to estimate PM2.5 and BC exposure by a modeling approach. We collected 417 24-h personal PM2.5 and 130 72-h personal BC measurements from a panel of 36 nonsmoking college students in Shanghai, China. Each participant underwent 4 rounds of three consecutive 24-h sampling sessions from December 2014 to July 2015. We applied backwards regression to construct mixed effect models incorporating all accessible variables of ambient pollution, climate and time-location information for exposure prediction. All models were evaluated by marginal R2 and root mean square error (RMSE) from a leave-one-out cross-validation (LOOCV) and a 10-fold cross-validation (10-fold CV). Personal PM2.5 was 47.6% lower than the ambient level, with a mean (±SD) level of 39.9 (±32.1) μg/m3, whereas personal BC (6.1 (±2.8) μg/m3) was about one-fold higher than the corresponding ambient concentrations. Ambient levels were the most significant determinants of PM2.5 and BC exposure. Meteorological and season indicators were also important predictors. Our final models predicted 75% of the variance in 24-h personal PM2.5 and 72-h personal BC. LOOCV analysis showed an R2 (RMSE) of 0.73 (0.40) for PM2.5 and 0.66 (0.27) for BC. Ten-fold CV analysis showed an R2 (RMSE) of 0.73 (0.41) for PM2.5 and 0.68 (0.26) for BC. We used readily accessible data and established intuitive models that can predict PM2.5 and BC exposure. This modeling approach can be a feasible solution for PM exposure estimation in epidemiological studies. Copyright © 2018 Elsevier Ltd. All rights reserved.
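
    The two validation schemes reported above are easy to reproduce in outline. The sketch below is a simplified fixed-effects stand-in for the paper's mixed-effects models: it computes cross-validated R2 and RMSE under both leave-one-out and 10-fold splitting. The design matrix contents are assumptions.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import KFold, LeaveOneOut, cross_val_predict

    def cv_r2_rmse(X, y, cv):
        """Cross-validated R2 and RMSE for a simple linear exposure model."""
        pred = cross_val_predict(LinearRegression(), X, y, cv=cv)
        ss_res = np.sum((y - pred) ** 2)
        r2 = 1.0 - ss_res / np.sum((y - y.mean()) ** 2)
        return r2, np.sqrt(ss_res / len(y))

    # X: ambient level, meteorology, season and time-location covariates
    # y: personal PM2.5 (or BC) measurements
    # cv_r2_rmse(X, y, LeaveOneOut())
    # cv_r2_rmse(X, y, KFold(n_splits=10, shuffle=True, random_state=0))
    ```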

  18. The PYRIN domain: A member of the death domain-fold superfamily

    PubMed Central

    Fairbrother, Wayne J.; Gordon, Nathaniel C.; Humke, Eric W.; O'Rourke, Karen M.; Starovasnik, Melissa A.; Yin, Jian-Ping; Dixit, Vishva M.

    2001-01-01

    PYRIN domains were identified recently as putative protein–protein interaction domains at the N-termini of several proteins thought to function in apoptotic and inflammatory signaling pathways. The ∼95 residue PYRIN domains have no statistically significant sequence homology to proteins with known three-dimensional structure. Using secondary structure prediction and potential-based fold recognition methods, however, the PYRIN domain is predicted to be a member of the six-helix bundle death domain-fold superfamily that includes death domains (DDs), death effector domains (DEDs), and caspase recruitment domains (CARDs). Members of the death domain-fold superfamily are well established mediators of protein–protein interactions found in many proteins involved in apoptosis and inflammation, indicating further that the PYRIN domains serve a similar function. A homology model of the PYRIN domain of CARD7/DEFCAP/NAC/NALP1, a member of the Apaf-1/Ced-4 family of proteins, was constructed using the three-dimensional structures of the FADD and p75 neurotrophin receptor DDs, and of the Apaf-1 and caspase-9 CARDs, as templates. Validation of the model using a variety of computational techniques indicates that the fold prediction is consistent with the sequence. Comparison of a circular dichroism spectrum of the PYRIN domain of CARD7/DEFCAP/NAC/NALP1 with spectra of several proteins known to adopt the death domain-fold provides experimental support for the structure prediction. PMID:11514682

  19. Comparison of multianalyte proficiency test results by sum of ranking differences, principal component analysis, and hierarchical cluster analysis.

    PubMed

    Škrbić, Biljana; Héberger, Károly; Durišić-Mladenović, Nataša

    2013-10-01

    Sum of ranking differences (SRD) was applied for comparing multianalyte results obtained by several analytical methods used in one or in different laboratories, i.e., for ranking the overall performances of the methods (or laboratories) in simultaneous determination of the same set of analytes. The data sets for testing the applicability of SRD contained the results reported during one of the proficiency tests (PTs) organized by the EU Reference Laboratory for Polycyclic Aromatic Hydrocarbons (EU-RL-PAH). In this way, SRD was also tested as a discriminant method alternative to existing average performance scores used to compare multianalyte PT results. SRD should be used along with the z scores, the most commonly used PT performance statistics. SRD was further developed to handle the same rankings (ties) among laboratories. Two benchmark concentration series were selected as reference: (a) the assigned PAH concentrations (determined precisely beforehand by the EU-RL-PAH) and (b) the averages of all individual PAH concentrations determined by each laboratory. Ranking relative to the assigned values and also to the average (or median) values pointed to the laboratories with the most extreme results, as well as revealed groups of laboratories with similar overall performances. SRD reveals differences between methods or laboratories even if classical test(s) cannot. The ranking was validated using comparison of ranks by random numbers (a randomization test) and using seven-fold cross-validation, which highlighted the similarities among the (methods used in) laboratories. Principal component analysis and hierarchical cluster analysis justified the findings based on SRD ranking/grouping. If the PAH concentrations are row-scaled (i.e., z scores are analyzed as input for ranking), SRD can still be used for checking the normality of errors. Moreover, cross-validation of SRD on z scores groups the laboratories similarly. The SRD technique is general in nature, i.e., it can be applied to any experimental problem in which multianalyte results obtained either by several analytical procedures, analysts, instruments, or laboratories need to be compared.
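
    The SRD statistic itself is a one-liner once rankings are in hand: rank each method's results and the benchmark's, then sum the absolute rank differences. A minimal sketch follows; the variable names and the dictionary-based usage are assumptions.

    ```python
    import numpy as np
    from scipy.stats import rankdata

    def srd(values, reference):
        """Sum of ranking differences for one laboratory against a benchmark.

        values, reference -- analyte concentrations reported by the laboratory
                             and the benchmark (assigned or average values)
        """
        # rankdata's average ranks handle ties among identical results
        return np.abs(rankdata(values) - rankdata(reference)).sum()

    # smaller SRD = ordering closer to the benchmark:
    # scores = {lab: srd(results[lab], assigned) for lab in results}
    ```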

  20. Predicting drug-target interactions by dual-network integrated logistic matrix factorization

    NASA Astrophysics Data System (ADS)

    Hao, Ming; Bryant, Stephen H.; Wang, Yanli

    2017-01-01

    In this work, we propose a dual-network integrated logistic matrix factorization (DNILMF) algorithm to predict potential drug-target interactions (DTI). The prediction procedure consists of four steps: (1) inferring new drug/target profiles and constructing a profile kernel matrix; (2) diffusing the drug profile kernel matrix with the drug structure kernel matrix; (3) diffusing the target profile kernel matrix with the target sequence kernel matrix; and (4) building the DNILMF model and smoothing new drug/target predictions based on their neighbors. We compare our algorithm with the state-of-the-art method based on the benchmark dataset. Results indicate that the DNILMF algorithm outperforms the previously reported approaches in terms of AUPR (area under the precision-recall curve) and AUC (area under the receiver operating characteristic curve) based on 5 trials of 10-fold cross-validation. We conclude that the performance improvement depends not only on the proposed objective function, but also on the nonlinear diffusion technique used, which is important but understudied in the DTI prediction field. In addition, we also compile a new DTI dataset to increase the diversity of currently available benchmark datasets. The top prediction results for the new dataset are confirmed by experimental studies or supported by other computational research.

  1. Glaucoma risk index: automated glaucoma detection from color fundus images.

    PubMed

    Bock, Rüdiger; Meier, Jörg; Nyúl, László G; Hornegger, Joachim; Michelson, Georg

    2010-06-01

    Glaucoma, a neurodegeneration of the optic nerve, is one of the most common causes of blindness. Because revitalization of the degenerated nerve fibers of the optic nerve is impossible, early detection of the disease is essential. This can be supported by robust and automated mass screening. We propose a novel automated glaucoma detection system that operates on inexpensive-to-acquire and widely used digital color fundus images. After glaucoma-specific preprocessing, different generic feature types are compressed by an appearance-based dimension reduction technique. Subsequently, a probabilistic two-stage classification scheme combines these feature types to extract the novel Glaucoma Risk Index (GRI), which shows reasonable glaucoma detection performance. On a sample set of 575 fundus images, a classification accuracy of 80% was achieved in a 5-fold cross-validation setup. The GRI gains a competitive area under ROC (AUC) of 88%, compared to the established topography-based glaucoma probability score of scanning laser tomography with an AUC of 87%. The proposed color fundus image-based GRI achieves a competitive and reliable detection performance on a low-priced modality by the statistical analysis of entire images of the optic nerve head. Copyright (c) 2010 Elsevier B.V. All rights reserved.

  2. A Pathological Brain Detection System based on Extreme Learning Machine Optimized by Bat Algorithm.

    PubMed

    Lu, Siyuan; Qiu, Xin; Shi, Jianping; Li, Na; Lu, Zhi-Hai; Chen, Peng; Yang, Meng-Meng; Liu, Fang-Yuan; Jia, Wen-Juan; Zhang, Yudong

    2017-01-01

    It is beneficial to classify brain images as healthy or pathological automatically, because 3D brain images generate so much information that manual analysis is time consuming and tedious. Among various 3D brain imaging techniques, magnetic resonance (MR) imaging is the most suitable for the brain, and it is now widely applied in hospitals because it is helpful in four ways: diagnosis, prognosis, and pre-surgical and post-surgical procedures. Automatic detection methods exist; however, they suffer from low accuracy. Therefore, we proposed a novel approach which employed a 2D discrete wavelet transform (DWT) and calculated the entropies of the subbands as features. Then, a bat algorithm optimized extreme learning machine (BA-ELM) was trained to identify pathological brains from healthy controls. A 10×10-fold cross-validation was performed to evaluate the out-of-sample performance. The method achieved a sensitivity of 99.04%, a specificity of 93.89%, and an overall accuracy of 98.33% over 132 MR brain images. The experimental results suggest that the proposed approach is accurate and robust in pathological brain detection. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
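
    The feature extraction step described above (2D DWT followed by subband entropies) can be sketched with PyWavelets. This is a hedged reconstruction under assumed settings (db4 wavelet, two decomposition levels, 64-bin histogram entropy), not the authors' exact parameters.

    ```python
    import numpy as np
    import pywt

    def dwt_entropy_features(img, wavelet="db4", level=2):
        """Shannon entropy of each 2D DWT subband as a compact feature vector."""
        coeffs = pywt.wavedec2(img, wavelet, level=level)
        bands = [coeffs[0]] + [b for detail in coeffs[1:] for b in detail]
        feats = []
        for band in bands:
            hist, _ = np.histogram(band.ravel(), bins=64)
            p = hist[hist > 0] / hist.sum()
            feats.append(-np.sum(p * np.log2(p)))
        return np.array(feats)
    ```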

  3. Extracting Information from Electronic Medical Records to Identify the Obesity Status of a Patient Based on Comorbidities and Bodyweight Measures.

    PubMed

    Figueroa, Rosa L; Flores, Christopher A

    2016-08-01

    Obesity is a chronic disease with an increasing impact on the world's population. In this work, we present a method of identifying obesity automatically using text mining techniques and information related to body weight measures and obesity comorbidities. We used a dataset of 3015 de-identified medical records that contain labels for two classification problems. The first classification problem distinguishes between obesity, overweight, normal weight, and underweight. The second classification problem differentiates between obesity types: super obesity, morbid obesity, severe obesity and moderate obesity. We used a Bag of Words approach to represent the records together with unigram and bigram representations of the features. We implemented two approaches: a hierarchical method and a nonhierarchical one. We used Support Vector Machine and Naïve Bayes together with ten-fold cross validation to evaluate and compare performances. Our results indicate that the hierarchical approach does not work as well as the nonhierarchical one. In general, our results show that Support Vector Machine obtains better performances than Naïve Bayes for both classification problems. We also observed that bigram representation improves performance compared with unigram representation.
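
    The bag-of-words setup with unigram and bigram features and ten-fold cross-validation maps onto a short scikit-learn pipeline. The sketch below is an illustrative stand-in (function name and inputs are assumptions), shown with an SVM since that classifier performed best here.

    ```python
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    def bow_svm_cv(records, labels):
        """records: de-identified note texts; labels: obesity status classes."""
        clf = make_pipeline(
            CountVectorizer(ngram_range=(1, 2)),   # unigram + bigram bag of words
            LinearSVC())
        return cross_val_score(clf, records, labels, cv=10)  # ten-fold accuracy
    ```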

  4. Computer-aided detection of basal cell carcinoma through blood content analysis in dermoscopy images

    NASA Astrophysics Data System (ADS)

    Kharazmi, Pegah; Kalia, Sunil; Lui, Harvey; Wang, Z. Jane; Lee, Tim K.

    2018-02-01

    Basal cell carcinoma (BCC) is the most common type of skin cancer; it is highly damaging to the skin at advanced stages and imposes high costs on the healthcare system. However, most types of BCC are easily curable if detected at an early stage. Given limited access to dermatologists and expert physicians, non-invasive computer-aided diagnosis is a viable option for skin cancer screening. A clinical biomarker of cancerous tumors is increased vascularization and excess blood flow. In this paper, we present a computer-aided technique to differentiate cancerous skin tumors from benign lesions based on vascular characteristics of the lesions. The dermoscopy image of the lesion is first decomposed using independent component analysis of the RGB channels to derive melanin and hemoglobin maps. A novel set of clinically inspired features and ratiometric measurements is then extracted from each map to characterize the vascular properties and blood content of the lesion. The feature set is then fed into a random forest classifier. Over a dataset of 664 skin lesions, the proposed method achieved an area under the ROC curve of 0.832 in a 10-fold cross-validation for differentiating basal cell carcinomas from benign lesions.
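
    The ICA decomposition step can be roughed out with scikit-learn's FastICA over per-pixel optical densities. This is a coarse sketch under stated assumptions (log-transform as an optical density proxy, two components standing in for melanin and hemoglobin), not the authors' calibrated separation.

    ```python
    import numpy as np
    from sklearn.decomposition import FastICA

    def chromophore_maps(rgb):
        """Rough melanin/hemoglobin separation of a dermoscopy image via ICA.

        rgb -- (H, W, 3) uint8 image; optical density is approximated by -log.
        """
        h, w, _ = rgb.shape
        od = -np.log(rgb.reshape(-1, 3).astype(float) / 255.0 + 1e-6)
        ica = FastICA(n_components=2, random_state=0)
        maps = ica.fit_transform(od)        # two independent source signals
        return maps.reshape(h, w, 2)        # per-pixel chromophore maps
    ```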

  5. Shape classification of wear particles by image boundary analysis using machine learning algorithms

    NASA Astrophysics Data System (ADS)

    Yuan, Wei; Chin, K. S.; Hua, Meng; Dong, Guangneng; Wang, Chunhui

    2016-05-01

    The shape features of wear particles generated from a wear track usually contain plenty of information about the wear state of a machine's operational condition. Techniques to quickly identify the types of wear particles, so as to respond to the machine's operation and prolong its life, appear to be lacking and have yet to be established. To bridge rapid off-line feature recognition with on-line wear mode identification, this paper presents a new radial concave deviation (RCD) method that mainly involves the use of the particle boundary signal to analyze wear particle features. Signal output from the RCDs subsequently facilitates the determination of several other feature parameters, typically relevant to the shape and size of the wear particle. Debris feature and type are identified through the use of various classification methods, such as linear discriminant analysis, quadratic discriminant analysis, the naïve Bayesian method, and the classification and regression tree (CART) method. The average training and test errors from ten-fold cross-validation suggest that CART is a highly suitable approach for classifying and analyzing particle features. Furthermore, the results of the wear debris analysis enable the maintenance team to diagnose faults appropriately.

  6. Characterization of electroencephalography signals for estimating saliency features in videos.

    PubMed

    Liang, Zhen; Hamada, Yasuyuki; Oba, Shigeyuki; Ishii, Shin

    2018-05-12

    Understanding the functions of the visual system has been one of the major targets in neuroscience for many years. However, the relation between spontaneous brain activities and visual saliency in natural stimuli has yet to be elucidated. In this study, we developed an optimized machine learning-based decoding model to explore the possible relationships between electroencephalography (EEG) characteristics and visual saliency. The optimal features were extracted from the EEG signals and from a saliency map computed according to an unsupervised saliency model (Tavakoli and Laaksonen, 2017). Subsequently, various unsupervised feature selection/extraction techniques were examined using different supervised regression models. The robustness of the presented model was fully verified by means of a ten-fold or nested cross-validation procedure, and promising results were achieved in the reconstruction of saliency features based on the selected EEG characteristics. Through the successful demonstration of using EEG characteristics to predict the real-time saliency distribution in natural videos, we suggest the feasibility of quantifying visual content through measuring brain activities (EEG signals) in real environments, which would facilitate the understanding of cortical involvement in the processing of natural visual stimuli and application developments motivated by human visual processing. Copyright © 2018 Elsevier Ltd. All rights reserved.

  7. Reply to Comments on "the Cenozoic Fold-and-Thrust Belt of Eastern Sardinia: Evidences from the Integration of Field Data With Numerically Balanced Geological Cross Section" by Arragoni et al. (2016)

    NASA Astrophysics Data System (ADS)

    Salvini, F.; Arragoni, S.; Cianfarra, P.; Maggi, M.

    2017-10-01

    The comment by Berra et al. (2017) on the evidence of Alpine tectonics in Eastern Sardinia proposed by Arragoni et al. (2016) is based on the sedimentological interpretation of a few local outcrops in a marginal portion of the study area. The Cenozoic Alpine fold-and-thrust setting, which characterizes this region, presents flat-over-flat shear planes acting along originally stratigraphic contacts, where stratigraphic continuity is obviously maintained. The ramp sectors present steeply dipping bedding attitudes, and there is no need to invoke and force prograding clinoforms with unrealistic angles to justify them. The balanced geological cross section proposed by Arragoni et al. (2016) is fully supported by robust newly collected structural data and is compatible with the overall tectonic setting, while the interpretation proposed by Berra et al. (2017) lacks a detailed structural investigation. We believe that the partial application of the techniques available to modern geology may lead to incorrect interpretations, thus representing an obstacle for the progress of knowledge in the Earth sciences.

  8. Protein Secondary Structure Prediction Using AutoEncoder Network and Bayes Classifier

    NASA Astrophysics Data System (ADS)

    Wang, Leilei; Cheng, Jinyong

    2018-03-01

    Protein secondary structure prediction belongs to bioinformatics and is an important research area. In this paper, we propose a new way of predicting protein secondary structure using a Bayes classifier and an autoencoder network. Our experiments cover several algorithmic aspects, including the construction of the model and the selection of parameters. The data set is the typical CB513 protein data set. Accuracy is assessed by three-fold cross-validation, from which we obtain the Q3 accuracy. The results illustrate that the autoencoder network improved the prediction accuracy of protein secondary structure.

  9. KiDS-450: tomographic cross-correlation of galaxy shear with Planck lensing

    NASA Astrophysics Data System (ADS)

    Harnois-Déraps, Joachim; Tröster, Tilman; Chisari, Nora Elisa; Heymans, Catherine; van Waerbeke, Ludovic; Asgari, Marika; Bilicki, Maciej; Choi, Ami; Erben, Thomas; Hildebrandt, Hendrik; Hoekstra, Henk; Joudaki, Shahab; Kuijken, Konrad; Merten, Julian; Miller, Lance; Robertson, Naomi; Schneider, Peter; Viola, Massimo

    2017-10-01

    We present the tomographic cross-correlation between galaxy lensing measured in the Kilo Degree Survey (KiDS-450) with overlapping lensing measurements of the cosmic microwave background (CMB), as detected by Planck 2015. We compare our joint probe measurement to the theoretical expectation for a flat Λ cold dark matter cosmology, assuming the best-fitting cosmological parameters from the KiDS-450 cosmic shear and Planck CMB analyses. We find that our results are consistent within 1σ with the KiDS-450 cosmology, with an amplitude re-scaling parameter AKiDS = 0.86 ± 0.19. Adopting a Planck cosmology, we find our results are consistent within 2σ, with APlanck = 0.68 ± 0.15. We show that the agreement is improved in both cases when the contamination to the signal by intrinsic galaxy alignments is accounted for, increasing A by ∼0.1. This is the first tomographic analysis of the galaxy lensing - CMB lensing cross-correlation signal, and is based on five photometric redshift bins. We use this measurement as an independent validation of the multiplicative shear calibration and of the calibrated source redshift distribution at high redshifts. We find that constraints on these two quantities are strongly correlated when obtained from this technique, which should therefore not be considered as a stand-alone competitive calibration tool.

  10. Joint learning of ultrasonic backscattering statistical physics and signal confidence primal for characterizing atherosclerotic plaques using intravascular ultrasound.

    PubMed

    Sheet, Debdoot; Karamalis, Athanasios; Eslami, Abouzar; Noël, Peter; Chatterjee, Jyotirmoy; Ray, Ajoy K; Laine, Andrew F; Carlier, Stephane G; Navab, Nassir; Katouzian, Amin

    2014-01-01

    Intravascular Ultrasound (IVUS) is a predominant imaging modality in interventional cardiology. It provides real-time cross-sectional images of arteries and assists clinicians in inferring the composition of atherosclerotic plaques. These plaques are heterogeneous in nature, comprising fibrous tissue, lipid deposits and calcifications. Each of these tissues backscatters ultrasonic pulses and is associated with a characteristic intensity in the B-mode IVUS image. However, clinicians are challenged when co-located heterogeneous tissues backscatter mixed signals that appear as non-unique intensity patterns in the B-mode IVUS image. Tissue characterization algorithms have been developed to assist clinicians in identifying such heterogeneous tissues and assessing plaque vulnerability. In this paper, we propose a novel technique, coined Stochastic Driven Histology (SDH), that is able to provide information about co-located heterogeneous tissues. It employs learning of tissue-specific ultrasonic backscattering statistical physics and a signal confidence primal from labeled data for predicting heterogeneous tissue composition in plaques. We employ a random forest for learning such a primal using sparsely labeled and noisy samples. In clinical deployment, the posterior prediction of the different lesions constituting the plaque is estimated. Folded cross-validation experiments have been performed with 53 plaques, indicating high concurrence with traditional tissue histology. On the wider horizon, this framework enables learning of tissue-energy interaction statistical physics and can be leveraged for promising clinical applications requiring tissue characterization beyond the application demonstrated in this paper. Copyright © 2013 Elsevier B.V. All rights reserved.
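
    A minimal sketch of the final prediction step described above, with a random forest returning per-class posterior probabilities rather than hard labels; the feature matrix (backscatter statistics plus signal confidence) and histology labels are hypothetical stand-ins, not the SDH feature set itself.

        # Random forest posterior prediction over tissue classes.
        # X and y are hypothetical; 0 = fibrous, 1 = lipid, 2 = calcified.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        rng = np.random.default_rng(1)
        X = rng.normal(size=(500, 6))
        y = rng.integers(0, 3, size=500)

        forest = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, y)
        posterior = forest.predict_proba(X[:3])   # each row sums to 1 over classes
        print(posterior)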

  11. Large scale wind tunnel investigation of a folding tilt rotor

    NASA Technical Reports Server (NTRS)

    1972-01-01

    A twenty-five-foot-diameter folding tilt rotor was tested in a large-scale wind tunnel to determine its aerodynamic characteristics in unfolded, partially folded, and fully folded configurations. During the tests, the rotor completed over forty start/stop sequences. After the sequences had been completed in a stepwise manner, smooth start/stop transitions were made in approximately two seconds. Wind tunnel speeds up to seventy-five knots were used, at which point the rotor mast angle was increased to four degrees, corresponding to a maneuver condition of one and one-half g.

  12. Design and 4D Printing of Cross-Folded Origami Structures: A Preliminary Investigation

    PubMed Central

    Teoh, Joanne Ee Mei; Feng, Xiaofan; Zhao, Yue; Liu, Yong

    2018-01-01

    In 4D printing research, different types of complex structure folding and unfolding have been investigated. However, research on cross-folding of origami structures (defined as a folding structure with at least two overlapping folds) has not been reported. This research focuses on the investigation of cross-folding structures using multi-material components along different axes and different horizontal hinge thicknesses with a single homogeneous material. Tensile tests were conducted to determine the impact of multi-material components and horizontal hinge thickness. In the case of multi-material structures, the hybrid material composition has a significant impact on the overall maximum strain and Young’s modulus properties. In the case of single-material structures, the shape recovery speed is inversely proportional to the horizontal hinge thickness, while the flexural or bending strength is proportional to the horizontal hinge thickness. A hinge with a thickness of 0.5 mm could be folded three times prior to fracture, whilst a hinge with a thickness of 0.3 mm could be folded only once prior to fracture. A hinge with a thickness of 0.1 mm could not be folded without cracking. The introduction of a physical hole in the center of the folding/unfolding line provided stress relief and prevented fracture. A complex flower petal shape was used to successfully demonstrate the implementation of overlapping and non-overlapping folding lines using both single-material segments and multi-material segments. Design guidelines for establishing cross-folding structures using multi-material components along different axes and different horizontal hinge thicknesses with a single homogeneous material were established. These guidelines can be used to design and implement complex origami structures with overlapping and non-overlapping folding lines. Combined overlapping folding structures could be implemented, and the allocation of specific hole locations in the overall designs could be further explored. In addition, creating more precise predictions by investigating sets of intermediate hinge thicknesses and comparing the number of folds before fracture will be the subject of future work. PMID:29510503

  13. Zinc ascorbate: a combined experimental and computational study for structure elucidation

    NASA Astrophysics Data System (ADS)

    Ünaleroǧlu, C.; Zümreoǧlu-Karan, B.; Mert, Y.

    2002-03-01

    The structure of Zn(HA)2·4H2O (HA=ascorbate) has been examined by a number of techniques (13C NMR, 1H NMR, IR, EI/MS and TGA) and also modeled by the semi-empirical PM3 method. The experimental and computational results agreed on a five-fold coordination around Zn(II) where one ascorbate binds monodentately, the other bidentately and two water molecules occupy the remaining sites of a distorted square pyramid.

  14. SERS quantitative urine creatinine measurement of human subject

    NASA Astrophysics Data System (ADS)

    Wang, Tsuei Lian; Chiang, Hui-hua K.; Lu, Hui-hsin; Hung, Yung-da

    2005-03-01

    The SERS method for biomolecular analysis has several potential advantages over traditional biochemical approaches, including less specimen contact, non-destructiveness to the specimen, and multiple-component analysis. Urine is an easily available body fluid for monitoring the metabolites and renal function of the human body. We developed a surface-enhanced Raman scattering (SERS) technique using 50 nm gold colloidal particles for quantitative human urine creatinine measurements. This paper shows that the SERS bands of creatinine (104 mg/dl) in artificial urine lie between 1400 cm-1 and 1500 cm-1; this region was analyzed for quantitative creatinine measurement. Ten human urine samples were obtained from ten healthy persons and analyzed by the SERS technique. The partial least squares cross-validation (PLSCV) method was utilized to obtain the estimated creatinine concentration in the clinically relevant (55.9 mg/dl to 208 mg/dl) concentration range. The root-mean-square error of cross-validation (RMSECV) is 26.1 mg/dl. This research demonstrates the feasibility of using SERS for urine creatinine detection in human subjects, and establishes the SERS platform technique for bodily fluid measurement.
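
    A minimal sketch of the cross-validation step reported above: leave-one-out PLS predictions over the ten samples and the resulting RMSECV. The spectra below are synthetic stand-ins for the 1400-1500 cm-1 SERS region, so the numbers are illustrative only.

        # PLS regression with leave-one-out cross-validation; RMSECV is the
        # root-mean-square error of the held-out predictions. Data hypothetical.
        import numpy as np
        from sklearn.cross_decomposition import PLSRegression
        from sklearn.model_selection import LeaveOneOut, cross_val_predict

        rng = np.random.default_rng(2)
        conc = np.linspace(55.9, 208.0, 10)                  # mg/dl, ten samples
        spectra = np.outer(conc, rng.random(100))            # synthetic intensities
        spectra += rng.normal(scale=5.0, size=spectra.shape)

        y_cv = cross_val_predict(PLSRegression(n_components=3), spectra, conc,
                                 cv=LeaveOneOut())
        rmsecv = np.sqrt(np.mean((y_cv.ravel() - conc) ** 2))
        print(f"RMSECV = {rmsecv:.1f} mg/dl")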

  15. Development, validation and utility of an in vitro technique for assessment of potential clinical drug-drug interactions involving P-glycoprotein.

    PubMed

    Keogh, John P; Kunta, Jeevan R

    2006-04-01

    Regulatory interest is increasing for drug transporters generally and P-glycoprotein (Pgp) in particular, primarily in the area of drug-drug interactions. To aid in both identifying and discharging the potential liabilities associated with drug-transporter interactions, the pharmaceutical industry has a growing requirement for routine and robust non-clinical assays. An assay was designed, optimised and validated to determine the in vitro inhibitory potency of new chemical entities (NCEs) towards human Pgp-mediated transport. [3H]-Digoxin was established as a suitable probe substrate by investigating its characteristics in the in vitro system (MDCKII-MDR1 cells grown in 24-multiwell inserts). The inhibitory potencies (apparent IC50) of known Pgp inhibitors astemizole, GF120918, ketoconazole, itraconazole, quinidine, verapamil and quinine were determined over at least a 1000-fold concentration range. Validation was carried out using manual and automatic techniques. [3H]-Digoxin was found to be stable and have good mass balance in the system. In contrast to [A-->B] transport, [3H]-digoxin [B-->A] transport rates were readily measured with good reproducibility. There was no evidence of saturation of transport up to 10 microM digoxin and 30 nM digoxin was selected for routine assay use, reflecting clinical therapeutic concentrations. IC50 values ranged over approximately 100-fold with excellent reproducibility. Results from manual and automated versions were in close agreement. This method is suitable for routine use to assess the in vitro inhibitory potency of NCEs on Pgp-mediated digoxin transport. Comparison of IC50 values against clinical interaction profiles for the probe inhibitors indicated the in vitro assay is predictive of clinical digoxin-drug interactions mediated via Pgp.
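
    Apparent IC50 values of the kind reported above are typically obtained by fitting a four-parameter logistic (Hill) curve to transport rates measured across the inhibitor concentration range; a minimal sketch with hypothetical concentrations and rates follows.

        # Four-parameter logistic (Hill) fit for an apparent IC50.
        # Concentrations and transport rates are hypothetical.
        import numpy as np
        from scipy.optimize import curve_fit

        def hill(c, bottom, top, ic50, slope):
            return bottom + (top - bottom) / (1.0 + (c / ic50) ** slope)

        conc = np.array([0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0, 30.0])  # uM
        rate = np.array([98, 97, 90, 75, 52, 28, 12, 5])               # % of control

        popt, _ = curve_fit(hill, conc, rate, p0=[0.0, 100.0, 1.0, 1.0])
        print(f"apparent IC50 = {popt[2]:.2f} uM")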

  16. Development and validation of a gene expression oligo microarray for the gilthead sea bream (Sparus aurata).

    PubMed

    Ferraresso, Serena; Vitulo, Nicola; Mininni, Alba N; Romualdi, Chiara; Cardazzo, Barbara; Negrisolo, Enrico; Reinhardt, Richard; Canario, Adelino V M; Patarnello, Tomaso; Bargelloni, Luca

    2008-12-03

    Aquaculture represents the most sustainable alternative source of seafood to substitute for declining marine fisheries, but severe production bottlenecks remain to be solved. The application of genomic technologies offers much promise to rapidly increase our knowledge of biological processes in farmed species and overcome such bottlenecks. Here we present an integrated platform for mRNA expression profiling in the gilthead sea bream (Sparus aurata), a marine teleost of great importance for aquaculture. A public database was constructed, consisting of 19,734 unique clusters (3,563 contigs and 16,171 singletons). Functional annotation was obtained for 8,021 clusters. Over 4,000 sequences were also associated with a GO entry. Two 60mer probes were designed for each gene and synthesized in situ on glass slides using Agilent SurePrint technology. Platform reproducibility and accuracy were assessed on two early stages of sea bream development (one-day- and four-day-old larvae). Correlation between technical replicates was always > 0.99, with strong positive correlation between paired probes. A two-class SAM test identified 1,050 differentially expressed genes between the two developmental stages. Functional analysis suggested that the down-regulated transcripts (407) in older larvae are mostly essential/housekeeping genes, whereas tissue-specific genes are up-regulated in parallel with the formation of key organs (eye, digestive system). Cross-validation of the microarray data was carried out using qRT-PCR on 11 target genes, selected to cover the whole range of fold-change and both up-regulated and down-regulated genes. A statistically significant positive correlation was obtained comparing expression levels for each target gene across all biological replicates. Good concordance between qRT-PCR and microarray data was observed between 2- and 7-fold change, while fold-change compression in the microarray was present for differences greater than 10-fold in the qRT-PCR. A highly reliable oligo-microarray platform was developed and validated for the gilthead sea bream despite the presently limited knowledge of the species' transcriptome. Because of its flexible design, this array will be able to accommodate additional probes as soon as novel unique transcripts are available.

  17. Automatic welding detection by an intelligent tool pipe inspection

    NASA Astrophysics Data System (ADS)

    Arizmendi, C. J.; Garcia, W. L.; Quintero, M. A.

    2015-07-01

    This work provides a model based on machine learning techniques for weld recognition, using signals obtained through an in-line inspection tool called a “smart pig” in oil and gas pipelines. The model uses a signal noise reduction phase by means of pre-processing algorithms and attribute-selection techniques. The noise reduction techniques were selected after a literature review and testing with survey data. Subsequently, the model was trained using recognition and classification algorithms, specifically artificial neural networks and support vector machines. Finally, the trained model was validated with different data sets and its performance was measured with cross-validation and ROC analysis. The results show that it is possible to identify welds automatically with an efficiency between 90 and 98 percent.
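
    A minimal sketch of the validation stage described above: cross-validated ROC-AUC for a support vector machine classifier, with hypothetical de-noised signal attributes standing in for the inspection-tool data.

        # Cross-validation plus ROC analysis of an SVM weld classifier.
        # Features and weld labels are hypothetical.
        import numpy as np
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        rng = np.random.default_rng(3)
        X = rng.normal(size=(400, 12))       # de-noised, selected attributes
        y = rng.integers(0, 2, size=400)     # 1 = weld present

        clf = make_pipeline(StandardScaler(), SVC(probability=True))
        auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
        print(f"ROC-AUC: {auc.mean():.2f} +/- {auc.std():.2f}")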

  18. Validation of FFM PD counts for screening personality pathology and psychopathy in adolescence.

    PubMed

    Decuyper, Mieke; De Clercq, Barbara; De Bolle, Marleen; De Fruyt, Filip

    2009-12-01

    Miller and colleagues (Miller, Bagby, Pilkonis, Reynolds, & Lynam, 2005) recently developed a Five-Factor Model (FFM) personality disorder (PD) count technique for describing and diagnosing PDs and psychopathy in adulthood. This technique conceptualizes PDs relying on general trait models and uses facets from the expert-generated PD prototypes to score the FFM PDs. The present study builds on the work of Miller and colleagues (2005): Study 1 investigates whether the PD count technique shows discriminant validity for describing PDs in adolescence, and Study 2 extends this objective to psychopathy. Results suggest that the FFM PD count technique is as successful in adolescence as in adulthood at describing PD symptoms, supporting the use of this descriptive method in adolescence. The normative data and accompanying PD count benchmarks enable the use of FFM scores for PD screening purposes in adolescence.

  19. Assessment of MRI-Based Automated Fetal Cerebral Cortical Folding Measures in Prediction of Gestational Age in the Third Trimester.

    PubMed

    Wu, J; Awate, S P; Licht, D J; Clouchoux, C; du Plessis, A J; Avants, B B; Vossough, A; Gee, J C; Limperopoulos, C

    2015-07-01

    Traditional methods of dating a pregnancy based on history or sonographic assessment have a large variation in the third trimester. We aimed to assess the ability of various quantitative measures of brain cortical folding on MR imaging to determine fetal gestational age in the third trimester. We evaluated 8 different quantitative cortical folding measures to predict gestational age in 33 healthy fetuses by using T2-weighted fetal MR imaging. We compared the accuracy of gestational age prediction by these cortical folding measures with the accuracy of prediction by brain volume measurement and by a previously reported semiquantitative visual scale of brain maturity. Regression models were constructed, and measurement biases and variances were determined via a cross-validation procedure. The cortical folding measures are accurate in the estimation and prediction of gestational age (mean absolute error, 0.43 ± 0.45 weeks) and perform better (P = .024) than brain volume (mean absolute error, 0.72 ± 0.61 weeks) or sonographic measures (SDs approximately 1.5 weeks, as reported in the literature). Prediction accuracy is comparable with that of the semiquantitative visual assessment score (mean, 0.57 ± 0.41 weeks). Quantitative cortical folding measures such as global average curvedness can be accurate and reliable estimators of gestational age and brain maturity for healthy fetuses in the third trimester and have the potential to be indicators of brain-growth delays for at-risk fetuses and preterm neonates. © 2015 by American Journal of Neuroradiology.
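
    A minimal sketch of the prediction protocol described above: regress gestational age on a single cortical folding measure and estimate the out-of-sample mean absolute error by cross-validation. The folding measure and ages below are synthetic, so the error is illustrative only.

        # Cross-validated prediction of gestational age (GA) from a cortical
        # folding measure such as global average curvedness. Data hypothetical.
        import numpy as np
        from sklearn.linear_model import LinearRegression
        from sklearn.model_selection import LeaveOneOut, cross_val_predict

        rng = np.random.default_rng(4)
        ga = rng.uniform(28.0, 38.0, size=33)                    # weeks, 33 fetuses
        curvedness = 0.1 * ga + rng.normal(scale=0.05, size=33)  # toy relationship

        pred = cross_val_predict(LinearRegression(), curvedness.reshape(-1, 1),
                                 ga, cv=LeaveOneOut())
        mae = np.mean(np.abs(pred - ga))
        print(f"mean absolute error: {mae:.2f} weeks")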

  20. Contrast Enhancement of the LOASIS CPA Laser and Effects on Electron Beam Performance of LWFA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Toth, Csaba; Gonsalves, Anthony J.; Panasenko, Dmitriy

    2009-01-22

    A nonlinear optical pulse cleaning technique based on cross-polarized wave (XPW) generation filtering [1] has been implemented to improve laser pulse contrast, and consequently to control pre-ionization in laser-plasma accelerator experiments. Three orders of magnitude improvement in pre-pulse contrast has been achieved, resulting in a 4-fold increase in electron charge and improved stability of both the electron beam energy and the THz radiation generated as a secondary process in the gas-jet-based LWFA experiments.

  1. Statistical Evaluation of Combined Daily Gauge Observations and Rainfall Satellite Estimations over Continental South America

    NASA Technical Reports Server (NTRS)

    Vila, Daniel; deGoncalves, Luis Gustavo; Toll, David L.; Rozante, Jose Roberto

    2008-01-01

    This paper describes a comprehensive assessment of a new high-resolution, high-quality gauge-satellite-based analysis of daily precipitation over continental South America during 2004. The methodology is based on a combination of additive and multiplicative bias correction schemes designed to achieve the lowest bias when compared with the observed values. Inter-comparison and cross-validation tests have been carried out for the control algorithm (the TMPA real-time algorithm) and different merging schemes: additive bias correction (ADD), ratio bias correction (RAT) and the TMPA research version, for months belonging to different seasons and for different network densities. All of the compared merging schemes produce better results than the control algorithm, but when finer temporal (daily) and spatial scale (regional network) gauge datasets are included in the analysis, the improvement is remarkable. The Combined Scheme (CoSch) consistently presents the best performance among the five techniques, and this remains true when a degraded daily gauge network is used instead of the full dataset. This technique appears to be a suitable tool for producing real-time, high-resolution, high-quality gauge-satellite-based analyses of daily precipitation over land in regional domains.
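
    The ADD and RAT schemes compared above differ only in how the gauge-satellite bias is removed; a minimal sketch of the two corrections follows, with hypothetical daily rainfall amounts (mm) in place of the gauge and satellite analyses.

        # Additive (ADD) vs. ratio (RAT) bias correction of satellite rainfall
        # against gauge observations. All values are hypothetical daily totals.
        import numpy as np

        gauge = np.array([2.0, 5.0, 0.0, 12.0, 7.0])
        satellite = np.array([3.0, 6.5, 0.5, 15.0, 9.0])

        add_corrected = satellite + (gauge.mean() - satellite.mean())  # ADD
        rat_corrected = satellite * (gauge.sum() / satellite.sum())    # RAT
        print(add_corrected, rat_corrected)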

  2. Measurement of stream channel habitat using sonar

    USGS Publications Warehouse

    Flug, Marshall; Seitz, Heather; Scott, John

    1998-01-01

    An efficient and low-cost technique using a sonar system was evaluated for describing channel geometry and quantifying inundated area in a large river. The boat-mounted portable sonar equipment was used to record water depths and river width measurements for direct storage on a laptop computer. The field data collected from repeated traverses at a cross-section were evaluated to determine the precision of the system and the field technique. Results from validation at two different sites showed average sample standard deviations (S.D.s) of 0.12 m for complete cross-sections, with coefficients of variation of 10%. Validation using only the mid-channel river cross-section data yields an average sample S.D. of 0.05 m, with a coefficient of variation below 5%, at a stable and gauged river site using only measurements of water depths greater than 0.6 m. The accuracy of the sonar system was evaluated by comparison with traditionally surveyed transect data from a regularly gauged site. We observed an average mean squared deviation of 46.0 cm2, considering only that portion of the cross-section inundated by more than 0.6 m of water. Our procedure proved to be a reliable, accurate, safe, quick, and economical method of recording the river depths, discharges, bed conditions, and substratum composition necessary for stream habitat studies.

  3. Balanced sections and the propagation of décollement: A Jura perspective

    NASA Astrophysics Data System (ADS)

    Laubscher, Hans

    2003-12-01

    The propagation of thrusting is an important problem in tectonics that is usually approached by forward (kinematical) modeling of balanced sections. Although modeling techniques are similar in most foreland fold-thrust belts, it turns out that in the Jura, there are modeling problems that require modifications of widely used techniques. In particular, attention is called to the role of model constraints that complement the set of observational constraints in order to fully define the model. In the eastern Jura, such model constraints may be inferred from the regional geology, which shows a peculiar noncoaxial relation between thrusts and subsequent folds. This relation implies changes in the direction of translation and the mode of deformation in the course of the propagation of décollement. These changes are conjectured to be the result of a change in partial decoupling between the thin-skinned fold-thrust system (nappe) and the obliquely subducted foreland. As a particularly instructive case in point, a cross section through the Weissenstein range is discussed. A two-step forward (kinematical) model is proposed that uses both local observational constraints as well as model constraints inferred from regional data. As a first step, a fault bend fold is generated in the hanging wall of a thrust of 1500 m shortening. As a second step, this structure is transferred by flexural slip into the actual fold observed at the surface. This requires an additional 1600 m of shortening and leads to folding of the original thrust. Thereafter, the footwall is deformed so as to respect the constraint that this deformation must fit into the space defined by the folded thrust as the upper boundary and the décollement surface as the lower boundary, and that, in addition, should be confined to the area immediately below the fold. In modeling the footwall deformation a mix of balancing methods is used: fault propagation folds for the competent intervals of the stratigraphic column and area balancing for the incompetent ones. Further propagation of décollement into the foreland is made possible by the folding process, which is dominated by a sort of kinking and which is the main contribution to structural elevation and hence to producing a sort of critical taper of the moving thin-skinned wedge.

  4. Cascade Back-Propagation Learning in Neural Networks

    NASA Technical Reports Server (NTRS)

    Duong, Tuan A.

    2003-01-01

    The cascade back-propagation (CBP) algorithm is the basis of a conceptual design for accelerating learning in artificial neural networks. The neural networks would be implemented as analog very-large-scale integrated (VLSI) circuits, and circuits to implement the CBP algorithm would be fabricated on the same VLSI circuit chips with the neural networks. Heretofore, artificial neural networks have learned slowly because it has been necessary to train them via software, for lack of a good on-chip learning technique. The CBP algorithm is an on-chip technique that provides for continuous learning in real time. Artificial neural networks are trained by example: A network is presented with training inputs for which the correct outputs are known, and the algorithm strives to adjust the weights of synaptic connections in the network to make the actual outputs approach the correct outputs. The input data are generally divided into three parts. Two of the parts, called the "training" and "cross-validation" sets, respectively, must be such that the corresponding input/output pairs are known. During training, the cross-validation set enables verification of the status of the input-to-output transformation learned by the network to avoid over-learning. The third part of the data, termed the "test" set, consists of the inputs that are required to be transformed into outputs; this set may or may not include the training set and/or the cross-validation set. Proposed neural-network circuitry for on-chip learning would be divided into two distinct networks; one for training and one for validation. Both networks would share the same synaptic weights.
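
    A minimal sketch of the three-way partition described above, splitting labeled examples into training, cross-validation, and test sets; the 60/20/20 proportions are illustrative, not taken from the original work.

        # Train / cross-validation / test split as described above.
        import numpy as np

        rng = np.random.default_rng(5)
        X = rng.normal(size=(100, 4))
        y = rng.integers(0, 2, size=100)

        idx = rng.permutation(len(X))
        train, cval, test = np.split(idx, [60, 80])
        X_train, y_train = X[train], y[train]   # drives weight adjustment
        X_cval, y_cval = X[cval], y[cval]       # guards against over-learning
        X_test = X[test]                        # inputs to be transformed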

  5. Cross-validation and Peeling Strategies for Survival Bump Hunting using Recursive Peeling Methods

    PubMed Central

    Dazard, Jean-Eudes; Choe, Michael; LeBlanc, Michael; Rao, J. Sunil

    2015-01-01

    We introduce a framework to build a survival/risk bump hunting model with a censored time-to-event response. Our Survival Bump Hunting (SBH) method is based on a recursive peeling procedure that uses a specific survival peeling criterion derived from non/semi-parametric statistics such as the hazard ratio, the log-rank test or the Nelson-Aalen estimator. To optimize the tuning parameter of the model and validate it, we introduce an objective function based on survival or prediction-error statistics, such as the log-rank test and the concordance error rate. We also describe two alternative cross-validation techniques adapted to the joint task of decision-rule making by recursive peeling and survival estimation. Numerical analyses show the importance of replicated cross-validation and the differences between criteria and techniques in both low and high-dimensional settings. Although several non-parametric survival models exist, none addresses the problem of directly identifying local extrema. We show how SBH efficiently estimates extreme survival/risk subgroups unlike other models. This provides an insight into the behavior of commonly used models and suggests alternatives to be adopted in practice. Finally, our SBH framework was applied to a clinical dataset. In it, we identified subsets of patients characterized by clinical and demographic covariates with a distinct extreme survival outcome, for which tailored medical interventions could be made. An R package PRIMsrc (Patient Rule Induction Method in Survival, Regression and Classification settings) is available on CRAN (Comprehensive R Archive Network) and GitHub. PMID:27034730

  6. Probabilistic cross-link analysis and experiment planning for high-throughput elucidation of protein structure.

    PubMed

    Ye, Xiaoduan; O'Neil, Patrick K; Foster, Adrienne N; Gajda, Michal J; Kosinski, Jan; Kurowski, Michal A; Bujnicki, Janusz M; Friedman, Alan M; Bailey-Kellogg, Chris

    2004-12-01

    Emerging high-throughput techniques for the characterization of protein and protein-complex structures yield noisy data with sparse information content, placing a significant burden on computation to properly interpret the experimental data. One such technique uses cross-linking (chemical or by cysteine oxidation) to confirm or select among proposed structural models (e.g., from fold recognition, ab initio prediction, or docking) by testing the consistency between cross-linking data and model geometry. This paper develops a probabilistic framework for analyzing the information content in cross-linking experiments, accounting for anticipated experimental error. This framework supports a mechanism for planning experiments to optimize the information gained. We evaluate potential experiment plans using explicit trade-offs among key properties of practical importance: discriminability, coverage, balance, ambiguity, and cost. We devise a greedy algorithm that considers those properties and, from a large number of combinatorial possibilities, rapidly selects sets of experiments expected to discriminate pairs of models efficiently. In an application to residue-specific chemical cross-linking, we demonstrate the ability of our approach to plan experiments effectively involving combinations of cross-linkers and introduced mutations. We also describe an experiment plan for the bacteriophage lambda Tfa chaperone protein in which we plan dicysteine mutants for discriminating threading models by disulfide formation. Preliminary results from a subset of the planned experiments are consistent and demonstrate the practicality of planning. Our methods provide the experimenter with a valuable tool (available from the authors) for understanding and optimizing cross-linking experiments.
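
    A minimal sketch of the greedy idea described above: repeatedly choose the experiment that discriminates the most still-ambiguous model pairs, subject to a budget. The 0/1 discrimination table is hypothetical and ignores the coverage, balance and cost terms of the full objective.

        # Greedy experiment planning over pairs of candidate structural models.
        # 'discriminates' maps an experiment to the model pairs it can separate.
        discriminates = {                       # hypothetical table
            "xlink_A": {(0, 1), (0, 2)},
            "xlink_B": {(1, 2), (1, 3)},
            "xlink_C": {(0, 3), (2, 3), (0, 1)},
        }
        budget = 2
        uncovered = set().union(*discriminates.values())

        plan = []
        while uncovered and len(plan) < budget:
            best = max(discriminates,
                       key=lambda e: len(discriminates[e] & uncovered))
            plan.append(best)
            uncovered -= discriminates[best]
        print(plan, "still ambiguous:", uncovered)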

  7. Uncovering Specific Electrostatic Interactions in the Denatured States of Proteins

    PubMed Central

    Shen, Jana K.

    2010-01-01

    The stability and folding of proteins are modulated by energetically significant interactions in the denatured state that is in equilibrium with the native state. These interactions remain largely invisible to current experimental techniques, however, due to the sparse population and conformational heterogeneity of the denatured-state ensemble under folding conditions. Molecular dynamics simulations using physics-based force fields can in principle offer atomistic details of the denatured state. However, practical applications are plagued with the lack of rigorous means to validate microscopic information and deficiencies in force fields and solvent models. This study presents a method based on coupled titration and molecular dynamics sampling of the denatured state starting from the extended sequence under native conditions. The resulting denatured-state pKas allow for the prediction of experimental observables such as pH- and mutation-induced stability changes. I show the capability and use of the method by investigating the electrostatic interactions in the denatured states of wild-type and K12M mutant of NTL9 protein. This study shows that the major errors in electrostatics can be identified by validating the titration properties of the fragment peptides derived from the sequence of the intact protein. Consistent with experimental evidence, our simulations show a significantly depressed pKa for Asp8 in the denatured state of wild-type, which is due to a nonnative interaction between Asp8 and Lys12. Interestingly, the simulation also shows a nonnative interaction between Asp8 and Glu48 in the denatured state of the mutant. I believe the presented method is general and can be applied to extract and validate microscopic electrostatics of the entire folding energy landscape. PMID:20682271

  8. More than the sum of its parts: Coarse-grained peptide-lipid interactions from a simple cross-parametrization

    NASA Astrophysics Data System (ADS)

    Bereau, Tristan; Wang, Zun-Jing; Deserno, Markus

    2014-03-01

    Interfacial systems are at the core of fascinating phenomena in many disciplines, such as biochemistry, soft-matter physics, and food science. However, the parametrization of accurate, reliable, and consistent coarse-grained (CG) models for systems at interfaces remains a challenging endeavor. In the present work, we explore to what extent two independently developed solvent-free CG models of peptides and lipids—of different mapping schemes, parametrization methods, target functions, and validation criteria—can be combined by only tuning the cross-interactions. Our results show that the cross-parametrization can reproduce a number of structural properties of membrane peptides (for example, tilt and hydrophobic mismatch), in agreement with existing peptide-lipid CG force fields. We find encouraging results for two challenging biophysical problems: (i) membrane pore formation mediated by the cooperative action of several antimicrobial peptides, and (ii) the insertion and folding of the helix-forming peptide WALP23 in the membrane.

  9. CNN-BLPred: a Convolutional neural network based predictor for β-Lactamases (BL) and their classes.

    PubMed

    White, Clarence; Ismail, Hamid D; Saigo, Hiroto; Kc, Dukka B

    2017-12-28

    The β-Lactamase (BL) enzyme family is an important class of enzymes that plays a key role in bacterial resistance to antibiotics. As the number of newly identified BL enzymes increases daily, it is imperative to develop a computational tool to classify newly identified BL enzymes into one of their classes. There are two types of classification of BL enzymes: molecular classification and functional classification. Existing computational methods only address molecular classification, and the performance of these existing methods is unsatisfactory. We addressed the unsatisfactory performance of the existing methods by implementing a Deep Learning approach called a Convolutional Neural Network (CNN). We developed CNN-BLPred, an approach for the classification of BL proteins. CNN-BLPred uses Gradient Boosted Feature Selection (GBFS) in order to select the ideal feature set for each BL classification. Based on rigorous benchmarking of CNN-BLPred using both leave-one-out cross-validation and independent test sets, CNN-BLPred performed better than the other existing algorithms. Compared with other architectures of CNN, Recurrent Neural Network, and Random Forest, the simple CNN architecture with only one convolutional layer performs best. After feature extraction, we were able to remove ~95% of the 10,912 features using Gradient Boosted Trees. During 10-fold cross-validation, we increased the accuracy of the classic BL predictions by 7%. We also increased the accuracy of Class A, Class B, Class C, and Class D performance by an average of 25.64%. The independent test results followed a similar trend. We implemented a deep learning algorithm known as a Convolutional Neural Network (CNN) to develop a classifier for BL classification. Combined with feature selection on an exhaustive feature set and using balancing methods such as Random Oversampling (ROS), Random Undersampling (RUS) and the Synthetic Minority Oversampling Technique (SMOTE), CNN-BLPred performs significantly better than existing algorithms for BL classification.
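
    A minimal sketch of the feature-selection step described above: rank an exhaustive feature set with gradient-boosted trees and keep only the features above mean importance before training the final classifier. The feature matrix is a hypothetical stand-in (much smaller than the paper's 10,912 features), and sklearn's gradient boosting stands in for the paper's exact GBFS implementation.

        # Gradient-boosted feature selection yielding a reduced feature set.
        # X and the BL / non-BL labels y are hypothetical.
        import numpy as np
        from sklearn.ensemble import GradientBoostingClassifier
        from sklearn.feature_selection import SelectFromModel

        rng = np.random.default_rng(6)
        X = rng.normal(size=(300, 200))
        y = rng.integers(0, 2, size=300)

        selector = SelectFromModel(GradientBoostingClassifier(random_state=6))
        X_reduced = selector.fit_transform(X, y)  # keeps high-importance features
        print(X.shape, "->", X_reduced.shape)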

  10. Feature extraction using convolutional neural network for classifying breast density in mammographic images

    NASA Astrophysics Data System (ADS)

    Thomaz, Ricardo L.; Carneiro, Pedro C.; Patrocinio, Ana C.

    2017-03-01

    Breast cancer is the leading cause of death for women in most countries. The high levels of mortality relate mostly to late diagnosis and to the directly proportional relationship between breast density and breast cancer development. Therefore, the correct assessment of breast density is important to provide better screening for higher-risk patients. However, in modern digital mammography the discrimination among breast densities is highly complex due to increased contrast and visual information for all densities. Thus, a computational system for classifying breast density might be a useful tool for aiding medical staff. Several machine-learning algorithms are already capable of classifying a small number of classes with good accuracy. However, the main constraint of machine-learning algorithms is the set of features extracted and used for classification. Although well-known feature extraction techniques might provide a good set of features, it is a complex task to select an initial set during the design of a classifier. Thus, we propose feature extraction using a Convolutional Neural Network (CNN) for classifying breast density with a conventional machine-learning classifier. We used 307 mammographic images downsampled to 260x200 pixels to train a CNN and extract features from a deep layer. After training, the activations of 8 neurons from a deep fully connected layer are extracted and used as features. These features are then fed forward to a single-hidden-layer neural network that is cross-validated using 10 folds to classify among four classes of breast density. The global accuracy of this method is 98.4%, with only 1.6% misclassification. However, the small set of samples and memory constraints required the reuse of data in both the CNN and the MLP-NN, so overfitting might have influenced the results even though we cross-validated the network. Thus, although we present a promising method for extracting features and classifying breast density, a larger database is still required for evaluating the results.
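
    A minimal sketch of the final classification stage described above: the 8 activations taken from the CNN's deep fully connected layer are fed to a single-hidden-layer network and evaluated with 10-fold cross-validation. The extracted features and density labels below are hypothetical stand-ins.

        # 10-fold cross-validation of a single-hidden-layer network on 8
        # CNN-extracted features. Features and labels are hypothetical.
        import numpy as np
        from sklearn.model_selection import cross_val_score
        from sklearn.neural_network import MLPClassifier

        rng = np.random.default_rng(7)
        features = rng.normal(size=(307, 8))     # deep-layer activations
        density = rng.integers(0, 4, size=307)   # four breast-density classes

        mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=7)
        acc = cross_val_score(mlp, features, density, cv=10)
        print(f"mean accuracy: {acc.mean():.3f}")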

  11. Mechanical restoration of large-scale folded multilayers using the finite element method: Application to the Zagros Simply Folded Belt, N-Iraq

    NASA Astrophysics Data System (ADS)

    Frehner, Marcel; Reif, Daniel; Grasemann, Bernhard

    2010-05-01

    There are a large number of numerical finite element studies concerned with modeling the evolution of folded geological layers through time. This body of research includes many aspects of folding and many different approaches, such as two- and three-dimensional studies, single-layer folding, detachment folding, development of chevron folds, Newtonian, power-law viscous and more complex rheologies, influence of anisotropy, pure-shear, simple-shear and other boundary conditions and so forth. In recent years, studies of multilayer folding have emerged, thanks to more advanced mesh generator software and increased computational power. Common to all of these studies is the fact that they consider a forward-directed time evolution, as in nature. Very few studies use the finite element method for reverse-time simulations. In such studies, folded geological layers are taken as initial conditions for the numerical simulation. The folding process is reversed by changing the signs of the boundary conditions that supposedly drove the folding process. In such studies, the geometry of the geological layers before the folding process is sought and the amount of shortening necessary for the final folded geometry can be calculated. In contrast to a kinematic or geometric fold restoration procedure, the described approach takes the mechanical behavior of the geological layers into account, such as rheology and the relative strength of the individual layers. This approach is therefore called mechanical restoration of folds. In this study, the concept of mechanical restoration is applied to a two-dimensional 50 km long NE-SW cross-section through the Zagros Simply Folded Belt in Iraqi Kurdistan, NE of the city of Erbil. The Simply Folded Belt is dominated by gentle to open folding and faults are either absent or record only minor offset. Therefore, this region is ideal for testing the concept of mechanical restoration. The profile used is constructed from structural field measurements and digital elevation models using the dip-domain method for balancing the cross-section. The lithology consists of Cretaceous to Cenozoic sediments. Massive carbonate rock units act as the competent layers compared to the incompetent behavior of siltstone, claystone and marl layers. We show the first results of the mechanical restoration of the Zagros cross-section and we discuss advantages and disadvantages, as well as some technical aspects of the applied method. First results indicate that a shortening of at least 50% was necessary to create the present-day folded cross-section. This value is higher than estimates of the amount of shortening solely based on kinematic or geometric restoration. One particular problem that is discussed is the presence of (unnaturally) sharp edges in a balanced cross-section produced using the dip-domain method, which need to be eliminated for mechanical restoration calculations to yield reasonable results.

  12. 3D Bragg coherent diffractive imaging of five-fold multiply twinned gold nanoparticle

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, Jong Woo; Ulvestad, Andrew; Manna, Sohini

    The formation mechanism of five-fold multiply twinned nanoparticles has been a long-term topic because of their geometrical incompatibility, and various models have been proposed to explain how the internal structure of the multiply twinned nanoparticles accommodates the constraints of the solid-angle deficiency. Here, we investigate the internal structure, strain field and strain energy density of 600 nm sized five-fold multiply twinned gold nanoparticles quantitatively using Bragg coherent diffractive imaging, which is suitable for the study of buried defects and three-dimensional strain distribution with great precision. Our study reveals that the strain energy density in five-fold multiply twinned gold nanoparticles is an order of magnitude higher than that of single nanocrystals such as an octahedron and a triangular plate synthesized under the same conditions. This result indicates that the strain developed while accommodating an angular misfit, although partially released through the introduction of structural defects, is still large throughout the crystal.

  13. 3D Bragg coherent diffractive imaging of five-fold multiply twinned gold nanoparticle

    DOE PAGES

    Kim, Jong Woo; Ulvestad, Andrew; Manna, Sohini; ...

    2017-08-11

    The formation mechanism of five-fold multiply twinned nanoparticles has been a long-term topic because of their geometrical incompatibility, and various models have been proposed to explain how the internal structure of the multiply twinned nanoparticles accommodates the constraints of the solid-angle deficiency. Here, we investigate the internal structure, strain field and strain energy density of 600 nm sized five-fold multiply twinned gold nanoparticles quantitatively using Bragg coherent diffractive imaging, which is suitable for the study of buried defects and three-dimensional strain distribution with great precision. Our study reveals that the strain energy density in five-fold multiply twinned gold nanoparticles is an order of magnitude higher than that of single nanocrystals such as an octahedron and a triangular plate synthesized under the same conditions. This result indicates that the strain developed while accommodating an angular misfit, although partially released through the introduction of structural defects, is still large throughout the crystal.

  14. Improving accuracy of genomic prediction in Brangus cattle by adding animals with imputed low-density SNP genotypes.

    PubMed

    Lopes, F B; Wu, X-L; Li, H; Xu, J; Perkins, T; Genho, J; Ferretti, R; Tait, R G; Bauck, S; Rosa, G J M

    2018-02-01

    Reliable genomic prediction of breeding values for quantitative traits requires a sufficient number of animals with genotypes and phenotypes in the training set. As of 31 October 2016, there were 3,797 Brangus animals with genotypes and phenotypes. These Brangus animals were genotyped using different commercial SNP chips. Of them, the largest group consisted of 1,535 animals genotyped with the GGP-LDV4 SNP chip. The remaining 2,262 genotypes were imputed to the SNP content of the GGP-LDV4 chip, so that the number of animals available for training the genomic prediction models was more than doubled. The present study showed that pooling animals with either original or imputed 40K SNP genotypes substantially increased genomic prediction accuracies for the ten traits. By supplementing imputed genotypes, the relative gains in genomic prediction accuracy on estimated breeding values (EBV) were from 12.60% to 31.27%, and the relative gains in genomic prediction accuracy on de-regressed EBV were slightly smaller (0.87%-18.75%). The present study also compared the performance of five genomic prediction models and two cross-validation methods. The five genomic models predicted EBV and de-regressed EBV of the ten traits similarly well. Of the two cross-validation methods, leave-one-out cross-validation maximized the number of animals available for training in genomic prediction. Genomic prediction accuracy (GPA) on the ten quantitative traits was validated in 1,106 newly genotyped Brangus animals based on the SNP effects estimated in the previous set of 3,797 Brangus animals, and the accuracies were slightly lower than the GPA in the original data. The present study was the first to leverage currently available genotype and phenotype resources in order to harness genomic prediction in Brangus beef cattle. © 2018 Blackwell Verlag GmbH.
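
    A minimal sketch of the leave-one-out protocol described above, with ridge regression standing in for the five genomic prediction models and accuracy taken as the correlation between held-out predictions and the response; the SNP genotypes and EBV below are hypothetical.

        # Leave-one-out genomic prediction with ridge regression (a stand-in
        # for SNP-BLUP-style models). Genotypes coded 0/1/2; data hypothetical.
        import numpy as np
        from sklearn.linear_model import Ridge
        from sklearn.model_selection import LeaveOneOut, cross_val_predict

        rng = np.random.default_rng(8)
        genotypes = rng.integers(0, 3, size=(200, 500)).astype(float)
        ebv = genotypes[:, :10].sum(axis=1) + rng.normal(size=200)  # toy signal

        pred = cross_val_predict(Ridge(alpha=100.0), genotypes, ebv,
                                 cv=LeaveOneOut())
        accuracy = np.corrcoef(pred, ebv)[0, 1]
        print(f"genomic prediction accuracy: {accuracy:.2f}")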

  15. Evolution of robot-assisted orthotopic ileal neobladder formation: a step-by-step update to the University of Southern California (USC) technique.

    PubMed

    Chopra, Sameer; de Castro Abreu, Andre Luis; Berger, Andre K; Sehgal, Shuchi; Gill, Inderbir; Aron, Monish; Desai, Mihir M

    2017-01-01

    To describe our step-by-step technique for robotic intracorporeal neobladder formation. The main surgical steps in forming the intracorporeal orthotopic ileal neobladder are: isolation of 65 cm of small bowel; small bowel anastomosis; bowel detubularisation; suturing of the posterior wall of the neobladder; neobladder-urethral anastomosis and cross-folding of the pouch; and uretero-enteral anastomosis. Improvements have been made to these steps to enhance time efficiency without compromising neobladder configuration. Our technical improvements have reduced the operative time from 450 to 360 min. We describe an updated step-by-step technique for robot-assisted intracorporeal orthotopic ileal neobladder formation. © 2016 The Authors BJU International © 2016 BJU International Published by John Wiley & Sons Ltd.

  16. Estimating energy expenditure from heart rate in older adults: a case for calibration.

    PubMed

    Schrack, Jennifer A; Zipunnikov, Vadim; Goldsmith, Jeff; Bandeen-Roche, Karen; Crainiceanu, Ciprian M; Ferrucci, Luigi

    2014-01-01

    Accurate measurement of free-living energy expenditure is vital to understanding changes in energy metabolism with aging. The efficacy of heart rate as a surrogate for energy expenditure is rooted in the assumption of a linear function between heart rate and energy expenditure, but its validity and reliability in older adults remain unclear. Our objective was to assess the validity and reliability of the linear function between heart rate and energy expenditure in older adults using different levels of calibration. Heart rate and energy expenditure were assessed across five levels of exertion in 290 adults participating in the Baltimore Longitudinal Study of Aging. Correlation and random-effects regression analyses assessed the linearity of the relationship between heart rate and energy expenditure, and cross-validation models assessed predictive performance. Heart rate and energy expenditure were highly correlated (r = 0.98) and linear regardless of age or sex. Intra-person variability was low but inter-person variability was high, with substantial heterogeneity of the random intercept (s.d. = 0.372) despite similar slopes. Cross-validation models indicated that individual calibration data substantially improve the accuracy of energy expenditure predictions from heart rate, reducing the potential for considerable measurement bias. Although using five calibration measures provided the greatest reduction in the standard deviation of prediction errors (1.08 kcal/min), substantial improvement was also noted with two (0.75 kcal/min). These findings indicate standard regression equations may be used to make population-level inferences when estimating energy expenditure from heart rate in older adults, but caution should be exercised when making inferences at the individual level without proper calibration.
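
    A minimal sketch of the calibration point made above: predictions from one pooled heart-rate-to-energy-expenditure equation are compared with predictions that use person-specific intercepts estimated from two calibration measures. The slope, intercept spread and noise level are hypothetical (the intercept s.d. mirrors the 0.372 reported above).

        # Pooled vs. individually calibrated linear prediction of energy
        # expenditure (EE, kcal/min) from heart rate (HR, bpm). Data hypothetical.
        import numpy as np

        rng = np.random.default_rng(9)
        slope = 0.05                                   # kcal/min per bpm (assumed)
        intercepts = rng.normal(1.0, 0.372, size=20)   # person-specific offsets
        hr = rng.uniform(60.0, 140.0, size=(20, 5))    # five exertion levels
        ee = intercepts[:, None] + slope * hr + rng.normal(scale=0.3, size=hr.shape)

        # Pooled equation: a single intercept for everyone.
        pooled_pred = intercepts.mean() + slope * hr

        # Individual calibration: estimate each person's intercept from the
        # first two exertion levels, then predict the remaining three.
        est = (ee[:, :2] - slope * hr[:, :2]).mean(axis=1)
        calib_pred = est[:, None] + slope * hr[:, 2:]

        print("pooled error SD:    ", np.std(pooled_pred[:, 2:] - ee[:, 2:]))
        print("calibrated error SD:", np.std(calib_pred - ee[:, 2:]))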

  17. Dual-time point scanning of integrated FDG PET/CT for the evaluation of mediastinal and hilar lymph nodes in non-small cell lung cancer diagnosed as operable by contrast-enhanced CT.

    PubMed

    Kasai, Takami; Motoori, Ken; Horikoshi, Takuro; Uchiyama, Katsuhiro; Yasufuku, Kazuhiro; Takiguchi, Yuichi; Takahashi, Fumiaki; Kuniyasu, Yoshio; Ito, Hisao

    2010-08-01

    To evaluate whether dual-time-point scanning with integrated fluorine-18 fluorodeoxyglucose ((18)F-FDG) positron emission tomography and computed tomography (PET/CT) is useful for the evaluation of mediastinal and hilar lymph nodes in non-small cell lung cancer diagnosed as operable by contrast-enhanced CT. PET/CT data and pathological findings of 560 nodal stations in 129 patients with pathologically proven non-small cell lung cancer diagnosed as operable by contrast-enhanced CT were reviewed retrospectively. Standardized uptake values (SUVs) of each nodal station were measured on early scans (SUVe) 1 h and on delayed scans (SUVd) 2 h after FDG injection. The retention index (RI) (%) was calculated by subtracting SUVe from SUVd and dividing by SUVe. Logistic regression analysis was performed with seven kinds of models, consisting of (1) SUVe, (2) SUVd, (3) RI, (4) SUVe and SUVd, (5) SUVe and RI, (6) SUVd and RI, and (7) SUVe, SUVd and RI. The seven derived models were compared by receiver-operating characteristic (ROC) analysis. k-fold cross-validation was performed with k values of 5 and 10. p < 0.05 was considered statistically significant. Model (1), including the term for SUVe, showed the largest area under the ROC curve among the seven models. The cut-off probability of metastasis of 3.5% with an SUVe of 2.5 revealed a sensitivity of 78% and a specificity of 81% on ROC analysis, and approximately 60% and 80% on k-fold cross-validation. Single scanning with PET/CT is sufficiently useful for evaluating mediastinal and hilar nodes for metastasis. Copyright (c) 2009 Elsevier Ireland Ltd. All rights reserved.
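
    In the notation above, the retention index is RI(%) = 100 x (SUVd - SUVe) / SUVe. A minimal sketch of computing RI and fitting the best-performing single-predictor model (model 1, SUVe) with ROC analysis follows; the SUV values and pathology labels are hypothetical.

        # Retention index plus logistic regression of nodal metastasis on SUVe.
        # All values are hypothetical stand-ins for the 560 nodal stations.
        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score

        rng = np.random.default_rng(10)
        suv_e = rng.lognormal(mean=0.5, sigma=0.5, size=560)   # early SUV
        suv_d = suv_e * rng.uniform(0.9, 1.4, size=560)        # delayed SUV
        ri = 100.0 * (suv_d - suv_e) / suv_e                   # retention index (%)
        meta = (suv_e + rng.normal(scale=1.0, size=560) > 2.5).astype(int)

        model = LogisticRegression().fit(suv_e.reshape(-1, 1), meta)
        prob = model.predict_proba(suv_e.reshape(-1, 1))[:, 1]
        print(f"AUC = {roc_auc_score(meta, prob):.2f}, RI[0] = {ri[0]:.1f}%")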

  18. Development and Psychometric Evaluation of an Instrument to Assess Cross-Cultural Competence of Healthcare Professionals (CCCHP)

    PubMed Central

    Bernhard, Gerda; Knibbe, Ronald A.; von Wolff, Alessa; Dingoyan, Demet; Schulz, Holger; Mösko, Mike

    2015-01-01

    Background Cultural competence of healthcare professionals (HCPs) is recognized as a strategy to reduce cultural disparities in healthcare. However, standardised, valid and reliable instruments to assess HCPs’ cultural competence are notably lacking. The present study aims 1) to identify the core components of cultural competence from a healthcare perspective, 2) to develop a self-report instrument to assess the cultural competence of HCPs and 3) to evaluate the psychometric properties of the new instrument. Methods The conceptual model and initial item pool, which were applied to the cross-cultural competence instrument for the healthcare profession (CCCHP), were derived from an expert survey (n = 23), interviews with HCPs (n = 12), and a broad narrative review of assessment instruments and conceptual models of cultural competence. The item pool was reduced systematically, resulting in a 59-item instrument. A sample of 336 psychologists in advanced psychotherapeutic training and 409 medical students participated, in order to evaluate the construct validity and reliability of the CCCHP. Results Construct validity was supported by principal component analysis, which led to a 32-item six-component solution with 50% of the total variance explained. The different dimensions of HCPs’ cultural competence are: Cross-Cultural Motivation/Curiosity, Cross-Cultural Attitudes, Cross-Cultural Skills, Cross-Cultural Knowledge/Awareness and Cross-Cultural Emotions/Empathy. For the total instrument, the internal consistency reliability was .87, and the dimensions’ Cronbach’s α ranged from .54 to .84. The discriminating power of the CCCHP was indicated by statistically significant mean differences in CCCHP subscale scores between predefined groups. Conclusions The 32-item CCCHP exhibits acceptable psychometric properties, particularly content and construct validity, for examining HCPs’ cultural competence. The CCCHP, with its five dimensions, offers a comprehensive assessment of HCPs’ cultural competence and has the ability to distinguish between groups that are expected to differ in cultural competence. This instrument can foster professional development through systematic self-assessment and thus contributes to improving the quality of patient care. PMID:26641876

  19. Development and Psychometric Evaluation of an Instrument to Assess Cross-Cultural Competence of Healthcare Professionals (CCCHP).

    PubMed

    Bernhard, Gerda; Knibbe, Ronald A; von Wolff, Alessa; Dingoyan, Demet; Schulz, Holger; Mösko, Mike

    2015-01-01

    Cultural competence of healthcare professionals (HCPs) is recognized as a strategy to reduce cultural disparities in healthcare. However, standardised, valid and reliable instruments to assess HCPs' cultural competence are notably lacking. The present study aims 1) to identify the core components of cultural competence from a healthcare perspective, 2) to develop a self-report instrument to assess the cultural competence of HCPs and 3) to evaluate the psychometric properties of the new instrument. The conceptual model and initial item pool, which were applied to the cross-cultural competence instrument for the healthcare profession (CCCHP), were derived from an expert survey (n = 23), interviews with HCPs (n = 12), and a broad narrative review of assessment instruments and conceptual models of cultural competence. The item pool was reduced systematically, resulting in a 59-item instrument. A sample of 336 psychologists in advanced psychotherapeutic training and 409 medical students participated, in order to evaluate the construct validity and reliability of the CCCHP. Construct validity was supported by principal component analysis, which led to a 32-item six-component solution with 50% of the total variance explained. The different dimensions of HCPs' cultural competence are: Cross-Cultural Motivation/Curiosity, Cross-Cultural Attitudes, Cross-Cultural Skills, Cross-Cultural Knowledge/Awareness and Cross-Cultural Emotions/Empathy. For the total instrument, the internal consistency reliability was .87, and the dimensions' Cronbach's α ranged from .54 to .84. The discriminating power of the CCCHP was indicated by statistically significant mean differences in CCCHP subscale scores between predefined groups. The 32-item CCCHP exhibits acceptable psychometric properties, particularly content and construct validity, for examining HCPs' cultural competence. The CCCHP, with its five dimensions, offers a comprehensive assessment of HCPs' cultural competence and has the ability to distinguish between groups that are expected to differ in cultural competence. This instrument can foster professional development through systematic self-assessment and thus contributes to improving the quality of patient care.

  20. Electron-impact ionization cross sections out of the ground and 6 2P excited states of cesium

    NASA Astrophysics Data System (ADS)

    Łukomski, M.; Sutton, S.; Kedzierski, W.; Reddish, T. J.; Bartschat, K.; Bartlett, P. L.; Bray, I.; Stelbovics, A. T.; McConkey, J. W.

    2006-09-01

    An atom-trapping technique for determining absolute, total ionization cross sections (TICS) out of an excited atom is presented. The unique feature of our method is in utilizing Doppler cooling of neutral atoms to determine ionization cross sections. This fluorescence-monitoring experiment, which is a variant of the “trap loss” technique, has enabled us to obtain the experimental electron-impact ionization cross sections out of the Cs 6 2P3/2 state between 7 eV and 400 eV. CCC, RMPS, and Born theoretical results are also presented for both the ground and excited states of cesium and rubidium. In the low-energy region (<11 eV), where best agreement between these excited-state measurements and theory might be expected, a discrepancy of approximately a factor of five is observed. Above this energy there are significant contributions to the TICS from both autoionization and multiple ionization.
