Support Vector Machine-Based Endmember Extraction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Filippi, Anthony M; Archibald, Richard K
Introduced in this paper is the utilization of Support Vector Machines (SVMs) to automatically perform endmember extraction from hyperspectral data. The strengths of SVM are exploited to provide a fast and accurate calculated representation of high-dimensional data sets that may consist of multiple distributions. Once this representation is computed, the number of distributions can be determined without prior knowledge. For each distribution, an optimal transform can be determined that preserves informational content while reducing the data dimensionality, and hence, the computational cost. Finally, endmember extraction for the whole data set is accomplished. Results indicate that this Support Vector Machine-Based Endmember Extraction (SVM-BEE) algorithm has the capability of autonomously determining endmembers from multiple clusters with computational speed and accuracy, while maintaining a robust tolerance to noise.
Agricultural mapping using Support Vector Machine-Based Endmember Extraction (SVM-BEE)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Archibald, Richard K; Filippi, Anthony M; Bhaduri, Budhendra L
Extracting endmembers from remotely sensed images of vegetated areas can present difficulties. In this research, we applied a recently developed endmember-extraction algorithm based on Support Vector Machines (SVMs) to the problem of semi-autonomous estimation of vegetation endmembers from a hyperspectral image. This algorithm, referred to as Support Vector Machine-Based Endmember Extraction (SVM-BEE), accurately and rapidly yields a computed representation of hyperspectral data that can accommodate multiple distributions. The number of distributions is identified without prior knowledge, based upon this representation. Prior work established that SVM-BEE is robustly noise-tolerant and can semi-automatically and effectively estimate endmembers; synthetic data and a geologic scene were previously analyzed. Here we compared the efficacies of the SVM-BEE and N-FINDR algorithms in extracting endmembers from a predominantly agricultural scene. SVM-BEE was able to estimate vegetation and other endmembers for all classes in the image, which N-FINDR failed to do. Classifications based on SVM-BEE endmembers were markedly more accurate compared with those based on N-FINDR endmembers.
Endmember extraction from hyperspectral image based on discrete firefly algorithm (EE-DFA)
NASA Astrophysics Data System (ADS)
Zhang, Chengye; Qin, Qiming; Zhang, Tianyuan; Sun, Yuanheng; Chen, Chao
2017-04-01
This study proposed a novel method to extract endmembers from hyperspectral images based on the discrete firefly algorithm (EE-DFA). Endmembers are the input of many spectral unmixing algorithms. Hence, in this paper, endmember extraction from a hyperspectral image is regarded as a combinatorial optimization problem whose goal is the best spectral unmixing result, and which can be solved by the discrete firefly algorithm. Two series of experiments were conducted, on synthetic hyperspectral datasets with different SNRs and on the AVIRIS Cuprite dataset, respectively. The experimental results were compared with the endmembers extracted by four popular methods: the sequential maximum angle convex cone (SMACC), N-FINDR, Vertex Component Analysis (VCA), and Minimum Volume Constrained Nonnegative Matrix Factorization (MVC-NMF). In addition, the effect of the parameters of the proposed method was tested on both the synthetic hyperspectral datasets and the AVIRIS Cuprite dataset, and recommended parameter settings were proposed. The results of this study demonstrated that the proposed EE-DFA method performed better than the existing popular methods. Moreover, EE-DFA is robust under different SNR conditions.
[A spatial adaptive algorithm for endmember extraction on multispectral remote sensing image].
Zhu, Chang-Ming; Luo, Jian-Cheng; Shen, Zhan-Feng; Li, Jun-Li; Hu, Xiao-Dong
2011-10-01
Because the convex cone analysis (CCA) method can extract only a limited number of endmembers from multispectral imagery, this paper proposed a new endmember extraction method for multispectral remote sensing images, using spatially adaptive spectral feature analysis based on spatial clustering and image slicing. First, to remove spatial and spectral redundancies, the principal component analysis (PCA) algorithm was used to reduce the dimensionality of the multispectral data. Second, the iterative self-organizing data analysis technique algorithm (ISODATA) was used to cluster the image according to the spectral similarity of pixels. Then, through cluster post-processing and the merging of small clusters, the whole image was divided into several blocks (tiles). Lastly, the number of endmembers was determined according to the landscape complexity of the image blocks and an analysis of the scatter diagrams, and the hourglass algorithm was used to extract the endmembers. An endmember extraction experiment on TM multispectral imagery showed that the method can extract endmember spectra from multispectral imagery effectively. Moreover, the method overcame the limitation on the number of endmembers and improved the accuracy of endmember extraction. The method provides a new way to extract endmembers from multispectral images.
NASA Astrophysics Data System (ADS)
Su, Yuanchao; Sun, Xu; Gao, Lianru; Li, Jun; Zhang, Bing
2016-10-01
Endmember extraction is a key step in hyperspectral unmixing. A new framework is proposed for hyperspectral endmember extraction. The proposed approach is based on the swarm intelligence (SI) algorithm, where discretization is used to solve the SI algorithm because pixels in a hyperspectral image are naturally defined within a discrete space. Moreover, a "distance" factor is introduced into the objective function to limit the number of endmembers, which is generally small in real scenarios, whereas traditional SI algorithms are likely to produce superabundant spectral signatures that generally belong to the same classes. Three endmember extraction methods are proposed based on the artificial bee colony, ant colony optimization, and particle swarm optimization algorithms. Experiments with both simulated and real hyperspectral images indicate that the proposed framework can improve the accuracy of endmember extraction.
Spatial-spectral preprocessing for endmember extraction on GPU's
NASA Astrophysics Data System (ADS)
Jimenez, Luis I.; Plaza, Javier; Plaza, Antonio; Li, Jun
2016-10-01
Spectral unmixing is focused on the identification of spectrally pure signatures, called endmembers, and their corresponding abundances in each pixel of a hyperspectral image. Originally focused mainly on the spectral information contained in hyperspectral images, endmember extraction techniques have recently included spatial information to achieve more accurate results. Several algorithms have been developed for automatic or semi-automatic identification of endmembers using spatial and spectral information, including the spectral-spatial endmember extraction (SSEE), in which a preprocessing step extracts both sources of information from the hyperspectral image and uses them equally for this purpose. Previous works have implemented the SSEE technique in four main steps: 1) calculation of local eigenvectors in each of the sub-regions into which the original hyperspectral image is divided; 2) computation of the maximum and minimum projections of all eigenvectors over the entire hyperspectral image in order to obtain a set of candidate pixels; 3) expansion and averaging of the signatures of the candidate set; 4) ranking based on the spectral angle distance (SAD). The result of this method is a list of candidate signatures from which the endmembers can be extracted using various spectral-based techniques, such as orthogonal subspace projection (OSP), vertex component analysis (VCA) or N-FINDR. Considering the large volume of data and the complexity of the calculations, there is a need for efficient implementations. Latest-generation hardware accelerators such as commodity graphics processing units (GPUs) offer a good chance of improving the computational performance in this context. In this paper, we develop two different implementations of the SSEE algorithm using GPUs. Both are based on the eigenvector computation within each sub-region in the first step, one using the singular value decomposition (SVD) and the other using principal component analysis (PCA). Based
Pure endmember extraction using robust kernel archetypoid analysis for hyperspectral imagery
NASA Astrophysics Data System (ADS)
Sun, Weiwei; Yang, Gang; Wu, Ke; Li, Weiyue; Zhang, Dianfa
2017-09-01
A robust kernel archetypoid analysis (RKADA) method is proposed to extract pure endmembers from hyperspectral imagery (HSI). The RKADA assumes that each pixel is a sparse linear mixture of all endmembers and that each endmember corresponds to a real pixel in the image scene. First, it improves the regular archetypal analysis with a new binary sparse constraint, and the adoption of the kernel function constructs the principal convex hull in an infinite Hilbert space and enlarges the divergences between pairwise pixels. Second, the RKADA transforms the pure endmember extraction problem into an optimization problem by minimizing residual errors with the Huber loss function. The Huber loss function reduces the effects of large noise and outliers in the convergence procedure of RKADA and enhances the robustness of the optimization function. Third, random kernel sinks for fast kernel matrix approximation and a two-stage algorithm for optimizing the initial pure endmembers are utilized to improve computational efficiency in realistic implementations of RKADA. The optimization equation of RKADA is solved by using the block coordinate descent scheme, and the desired pure endmembers are finally obtained. Six state-of-the-art pure endmember extraction methods are compared with the RKADA on both synthetic and real Cuprite HSI datasets: three geometrical algorithms, namely vertex component analysis (VCA), alternative volume maximization (AVMAX) and orthogonal subspace projection (OSP), and three matrix factorization algorithms, namely the preconditioning for successive projection algorithm (PreSPA), hierarchical clustering based on rank-two nonnegative matrix factorization (H2NMF) and self-dictionary multiple measurement vector (SDMMV). Experimental results show that the RKADA outperforms all six methods in terms of spectral angle distance (SAD) and root-mean-square-error (RMSE). Moreover, the RKADA has short computational times in offline
NASA Astrophysics Data System (ADS)
Martin, Gabriel; Gonzalez-Ruiz, Vicente; Plaza, Antonio; Ortiz, Juan P.; Garcia, Inmaculada
2010-07-01
Lossy hyperspectral image compression has received considerable interest in recent years due to the extremely high dimensionality of the data. However, the impact of lossy compression on spectral unmixing techniques has not been widely studied. These techniques characterize mixed pixels (resulting from insufficient spatial resolution) in terms of a suitable combination of spectrally pure substances (called endmembers) weighted by their estimated fractional abundances. This paper focuses on the impact of JPEG2000-based lossy compression of hyperspectral images on the quality of the endmembers extracted by different algorithms. The three considered algorithms are the orthogonal subspace projection (OSP), which uses only spectral information, and the automatic morphological endmember extraction (AMEE) and spatial spectral endmember extraction (SSEE), which integrate both spatial and spectral information in the search for endmembers. The impact of compression on the resulting abundance estimation based on the endmembers derived by different methods is also substantiated. Experiments are conducted using a hyperspectral data set collected by NASA Jet Propulsion Laboratory over the Cuprite mining district in Nevada. The experimental results are quantitatively analyzed using reference information available from the U.S. Geological Survey, resulting in recommendations to specialists interested in applying endmember extraction and unmixing algorithms to compressed hyperspectral data.
Effects of band selection on endmember extraction for forestry applications
NASA Astrophysics Data System (ADS)
Karathanassi, Vassilia; Andreou, Charoula; Andronis, Vassilis; Kolokoussis, Polychronis
2014-10-01
In spectral unmixing theory, data reduction techniques play an important role because hyperspectral imagery contains an immense amount of data, posing many challenging problems such as data storage, computational efficiency, and the so-called "curse of dimensionality". Feature extraction and feature selection are the two main approaches to dimensionality reduction. Feature extraction techniques reduce the dimensionality of hyperspectral data by applying transforms to the data. Feature selection techniques retain the physical meaning of the data by selecting a set of bands from the input hyperspectral dataset that contain most of the information needed for spectral unmixing. Although feature selection techniques are well known for their dimensionality reduction potential, they are rarely used in the unmixing process. The majority of the existing state-of-the-art dimensionality reduction methods apply criteria to the spectral information derived from the whole wavelength range in order to define the optimum spectral subspace. These criteria are not associated with any particular application but with the data statistics, such as correlation and entropy values. However, each application is associated with specific land cover materials, whose spectral characteristics present variations at specific wavelengths. In forestry, for example, many applications focus on tree leaves, in which specific pigments such as chlorophyll, xanthophyll, etc. determine the wavelengths where tree species, diseases, etc., can be detected. For such applications, when the unmixing process is applied, the tree species, diseases, etc., are considered as the endmembers of interest. This paper focuses on investigating the effects of band selection on endmember extraction by exploiting the information of the vegetation absorbance spectral zones. More precisely, it is explored whether endmember extraction can be optimized when specific sets of initial bands related to
Impact of JPEG2000 compression on spatial-spectral endmember extraction from hyperspectral data
NASA Astrophysics Data System (ADS)
Martín, Gabriel; Ruiz, V. G.; Plaza, Antonio; Ortiz, Juan P.; García, Inmaculada
2009-08-01
Hyperspectral image compression has received considerable interest in recent years. However, an important issue that has not been investigated in the past is the impact of lossy compression on spectral mixture analysis applications, which characterize mixed pixels in terms of a suitable combination of spectrally pure substances (called endmembers) weighted by their estimated fractional abundances. In this paper, we specifically investigate the impact of JPEG2000 compression of hyperspectral images on the quality of the endmembers extracted by algorithms that incorporate both the spectral and the spatial information (useful for incorporating contextual information in the spectral endmember search). The two considered algorithms are the automatic morphological endmember extraction (AMEE) and the spatial spectral endmember extraction (SSEE) techniques. Experiments are conducted using a well-known data set collected by AVIRIS over the Cuprite mining district in Nevada, with detailed ground-truth information available from the U.S. Geological Survey. Our experiments reveal some interesting findings that may be useful to specialists applying spatial-spectral endmember extraction algorithms to compressed hyperspectral imagery.
Superpixel-Augmented Endmember Detection for Hyperspectral Images
NASA Technical Reports Server (NTRS)
Thompson, David R.; Castano, Rebecca; Gilmore, Martha
2011-01-01
Superpixels are homogeneous image regions comprised of several contiguous pixels. They are produced by shattering the image into contiguous, homogeneous regions that each cover between 20 and 100 image pixels. The segmentation aims for a many-to-one mapping from superpixels to image features; each image feature could contain several superpixels, but each superpixel occupies no more than one image feature. This conservative segmentation is relatively easy to automate in a robust fashion. Superpixel processing is related to the more general idea of improving hyperspectral analysis through spatial constraints, which can recognize subtle features at or below the level of noise by exploiting the fact that their spectral signatures are found in neighboring pixels. Recent work has explored spatial constraints for endmember extraction, showing significant advantages over techniques that ignore pixels' relative positions. Methods such as AMEE (automated morphological endmember extraction) express spatial influence using fixed isometric relationships, such as a local square window or Euclidean distance in pixel coordinates. In other words, two pixels' covariances are based on their spatial proximity, but are independent of their absolute location in the scene. These isometric spatial constraints are most appropriate when spectral variation is smooth and constant over the image. Superpixels are simple to implement, efficient to compute, and are empirically effective. They can be used as a preprocessing step with any desired endmember extraction technique. Superpixels also have a solid theoretical basis in the hyperspectral linear mixing model, making them a principled approach for improving endmember extraction. Unlike existing approaches, superpixels can accommodate non-isometric covariance between image pixels (characteristic of discrete image features separated by step discontinuities). These kinds of image features are common in natural scenes. Analysts can substitute superpixels
NASA Astrophysics Data System (ADS)
Schmalz, M.; Ritter, G.
Accurate multispectral or hyperspectral signature classification is key to the nonimaging detection and recognition of space objects. Additionally, signature classification accuracy depends on accurate spectral endmember determination [1]. Previous approaches to endmember computation and signature classification were based on linear operators or neural networks (NNs) expressed in terms of the algebra (R, +, ×) [1,2]. Unfortunately, class separation in these methods tends to be suboptimal, and the number of signatures that can be accurately classified often depends linearly on the number of NN inputs. This can lead to poor endmember distinction, as well as potentially significant classification errors in the presence of noise or densely interleaved signatures. In contrast to traditional CNNs, autoassociative morphological memories (AMM) are a construct similar to Hopfield autoassociative memories defined on the (R, +, ∨, ∧) lattice algebra [3]. Unlimited storage and perfect recall of noiseless real valued patterns has been proven for AMMs [4]. However, AMMs suffer from sensitivity to specific noise models that can be characterized as erosive and dilative noise. On the other hand, the prior definition of a set of endmembers corresponds to material spectra lying on vertices of the minimum convex region covering the image data. These vertices can be characterized as morphologically independent patterns. It has further been shown that AMMs can be based on dendritic computation [3,6]. These techniques yield improved accuracy and class segmentation/separation ability in the presence of highly interleaved signature data. In this paper, we present a procedure for endmember determination based on AMM noise sensitivity, which employs morphological dendritic computation. We show that detected endmembers can be exploited by AMM based classification techniques, to achieve accurate signature classification in the presence of noise, closely spaced or interleaved signatures, and
NASA Astrophysics Data System (ADS)
Sun, Weiwei; Ma, Jun; Yang, Gang; Du, Bo; Zhang, Liangpei
2017-06-01
A new Bayesian method named Poisson Nonnegative Matrix Factorization with Parameter Subspace Clustering Constraint (PNMF-PSCC) has been presented to extract endmembers from Hyperspectral Imagery (HSI). First, the method integrates the linear spectral mixture model with the Bayesian framework and formulates endmember extraction as a Bayesian inference problem. Second, the Parameter Subspace Clustering Constraint (PSCC) is incorporated into the statistical program to consider the clustering of all pixels in the parameter subspace. The PSCC can enlarge differences among ground objects and helps find endmembers with smaller spectrum divergences. Meanwhile, the PNMF-PSCC method utilizes the Poisson distribution as the prior knowledge of spectral signals to better explain the quantum nature of light in the imaging spectrometer. Third, the optimization problem of PNMF-PSCC is formulated as maximizing the joint density via the Maximum A Posteriori (MAP) estimator. The program is finally solved by iteratively optimizing two sub-problems via the Alternating Direction Method of Multipliers (ADMM) framework and the FURTHESTSUM initialization scheme. Five state-of-the-art methods are implemented for comparison with the performance of PNMF-PSCC on both synthetic and real HSI datasets. Experimental results show that the PNMF-PSCC outperforms all five methods in Spectral Angle Distance (SAD) and Root-Mean-Square-Error (RMSE), and that it is especially able to identify good endmembers for ground objects with smaller spectrum divergences.
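The Spectral Angle Distance (SAD) used above to score extracted endmembers can be illustrated with a short sketch. This is a generic, standard-library Python version of the usual definition (the angle between two spectra treated as vectors); the function name and the plain-list representation of a spectrum are assumptions made for illustration and are not taken from the PNMF-PSCC paper.

```python
import math

def spectral_angle_distance(spectrum_a, spectrum_b):
    """Return the angle (in radians) between two spectra of equal length.

    Hypothetical helper for illustration: spectra are plain lists of
    band values, and both must be non-zero vectors.
    """
    if len(spectrum_a) != len(spectrum_b):
        raise ValueError("spectra must have the same number of bands")
    dot = sum(x * y for x, y in zip(spectrum_a, spectrum_b))
    norm_a = math.sqrt(sum(x * x for x in spectrum_a))
    norm_b = math.sqrt(sum(y * y for y in spectrum_b))
    cos_theta = dot / (norm_a * norm_b)
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)
```

Because the angle ignores vector length, a spectrum and a brightness-scaled copy of it have a SAD near zero, which is one reason the measure is commonly used to compare an extracted endmember with a reference signature.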
Noninvasive extraction of fetal electrocardiogram based on Support Vector Machine
NASA Astrophysics Data System (ADS)
Fu, Yumei; Xiang, Shihan; Chen, Tianyi; Zhou, Ping; Huang, Weiyan
2015-10-01
The fetal electrocardiogram (FECG) signal has important clinical value for diagnosing fetal heart diseases and for helping doctors choose suitable therapeutic schemes. Consequently, the noninvasive extraction of the FECG from electrocardiogram (ECG) signals has become an active research topic. In this work, the Support Vector Machine (SVM) is used to extract the FECG from a limited amount of data. Firstly, the theory of the SVM and the principle of SVM-based extraction are studied. Secondly, the transformation of the maternal electrocardiogram (MECG) component in the abdominal composite signal is verified to be nonlinear and is fitted with the SVM. Then, the SVM is trained, and the training results are compared with the real data to verify the effect of the training. Meanwhile, the parameters of the SVM are optimized to achieve the best performance so that the learning machine can be used to fit unknown samples. Finally, the FECG is extracted by removing the optimal estimate of the MECG component from the abdominal composite signal. To evaluate the performance of SVM-based FECG extraction, the Signal-to-Noise Ratio (SNR) and a visual test are used. The experimental results show that a good-quality FECG can be extracted: its SNR is significantly increased, reaching 9.2349 dB, and the time cost is significantly decreased, to as little as 0.802 seconds. Compared with the traditional method, the noninvasive SVM-based extraction method is simpler to implement, has a shorter processing time, and gives better extraction quality under the same conditions.
Extracting Date/Time Expressions in Super-Function Based Japanese-English Machine Translation
NASA Astrophysics Data System (ADS)
Sasayama, Manabu; Kuroiwa, Shingo; Ren, Fuji
Super-Function Based Machine Translation (SFBMT), a type of Example-Based Machine Translation, can expand the coverage of its examples by changing nouns into variables. However, because only nouns and numbers were changed into variables, there were problems extracting entire date/time expressions that contain parts of speech other than nouns. We describe a method for extracting date/time expressions for SFBMT. SFBMT uses noun determination rules to extract nouns and a bilingual dictionary to obtain the correspondence of the extracted nouns between the source and target languages. In this method, we add a rule for extracting date/time expressions and then extract date/time expressions from a Japanese-English bilingual corpus. The evaluation results show that the precision of this method for Japanese sentences is 96.7%, with a recall of 98.2%, and the precision for English sentences is 94.7%, with a recall of 92.7%.
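For readers unfamiliar with the precision and recall figures quoted above, the following sketch shows how the two measures are conventionally computed for an extraction task. It is a generic illustration, not the authors' evaluation code; the function name and the representation of expressions as hashable items in sets are assumptions.

```python
def precision_recall(extracted, gold):
    """Compute (precision, recall) for a set of extracted items.

    extracted: items the system returned (hypothetical representation).
    gold:      items that should have been returned.
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    # Precision: share of extracted items that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of correct items that were extracted.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

Returning 0.0 for an empty set is a convention chosen here to avoid division by zero; other conventions exist.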
Determination of the spinel group end-members based on electron microprobe analyses
NASA Astrophysics Data System (ADS)
Ferracutti, Gabriela R.; Gargiulo, M. Florencia; Ganuza, M. Luján; Bjerg, Ernesto A.; Castro, Silvia M.
2015-04-01
The spinel group minerals have been the focus of many studies, not only because of their economic interest, but also because they are very useful as petrogenetic indicators. The application End-Members Generator (EMG) allows users to establish, based on electron microprobe analyses (EMPA), the 19 end-members of the spinel group: MgAl2O4 (Spinel sensu stricto, s.s.), FeAl2O4 (Hercynite), MnAl2O4 (Galaxite), ZnAl2O4 (Gahnite), MgFe2O4 (Magnesioferrite), Fe3O4 (Magnetite), MnFe2O4 (Jacobsite), ZnFe2O4 (Franklinite), NiFe2O4 (Trevorite), MgCr2O4 (Magnesiochromite), FeCr2O4 (Chromite), MnCr2O4 (Manganochromite), ZnCr2O4 (Zincochromite), NiCr2O4 (Nichromite), MgV2O4 (Magnesiocoulsonite), FeV2O4 (Coulsonite), MnV2O4 (Vuorelainenite), Mg2TiO4 (Qandilite) and Fe2TiO4 (Ulvöspinel). EMG is an application that does not require an installation process. It was created to perform calculations that yield cation proportions (per formula unit, p.f.u.), the end-members of the spinel group, and the redistribution proportions for the corresponding end-members in the Magnetite prism or Ulvöspinel prism, and it includes a data validation section to check the results. EMG accepts .csv data files, and the results obtained can be used to represent a given dataset with the SpinelViz program or any other 2D and/or 3D graph plotting software.
Unsupervised Unmixing of Hyperspectral Images Accounting for Endmember Variability.
Halimi, Abderrahim; Dobigeon, Nicolas; Tourneret, Jean-Yves
2015-12-01
This paper presents an unsupervised Bayesian algorithm for hyperspectral image unmixing, accounting for endmember variability. The pixels are modeled by a linear combination of endmembers weighted by their corresponding abundances. However, the endmembers are assumed random to consider their variability in the image. An additive noise is also considered in the proposed model, generalizing the normal compositional model. The proposed algorithm exploits the whole image to benefit from both spectral and spatial information. It estimates both the mean and the covariance matrix of each endmember in the image. This allows the behavior of each material to be analyzed and its variability to be quantified in the scene. A spatial segmentation is also obtained based on the estimated abundances. In order to estimate the parameters associated with the proposed Bayesian model, we propose to use a Hamiltonian Monte Carlo algorithm. The performance of the resulting unmixing strategy is evaluated through simulations conducted on both synthetic and real data.
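The linear mixing model mentioned above (each pixel as a combination of endmembers weighted by abundances, plus additive noise) can be sketched in a few lines. This is only the basic forward model with fixed endmembers; it does not implement the random-endmember Bayesian model or the Hamiltonian Monte Carlo sampler of the paper. The function name and list-based data layout are assumptions, and the common convention that abundances are nonnegative and sum to one is not enforced here.

```python
def mix_pixel(endmembers, abundances, noise=None):
    """Forward linear mixing model for one pixel.

    endmembers: list of spectra, each a list of band values.
    abundances: one weight per endmember.
    noise:      optional per-band additive noise (same length as a spectrum).
    """
    if len(endmembers) != len(abundances):
        raise ValueError("need exactly one abundance per endmember")
    n_bands = len(endmembers[0])
    # Weighted sum of the endmember spectra, band by band.
    pixel = [
        sum(a * spectrum[band] for a, spectrum in zip(abundances, endmembers))
        for band in range(n_bands)
    ]
    if noise is not None:
        pixel = [value + e for value, e in zip(pixel, noise)]
    return pixel
```

Unmixing is the inverse problem: given observed pixels, estimate the endmembers and abundances that best explain them under such a model.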
Automation of Endmember Pixel Selection in SEBAL/METRIC Model
NASA Astrophysics Data System (ADS)
Bhattarai, N.; Quackenbush, L. J.; Im, J.; Shaw, S. B.
2015-12-01
The commonly applied surface energy balance for land (SEBAL) and its variant, mapping evapotranspiration (ET) at high resolution with internalized calibration (METRIC) models require manual selection of endmember (i.e. hot and cold) pixels to calibrate sensible heat flux. Current approaches for automating this process are based on statistical methods and do not appear to be robust under varying climate conditions and seasons. In this paper, we introduce a new approach based on simple machine learning tools and search algorithms that provides an automatic and time efficient way of identifying endmember pixels for use in these models. The fully automated models were applied on over 100 cloud-free Landsat images with each image covering several eddy covariance flux sites in Florida and Oklahoma. Observed land surface temperatures at automatically identified hot and cold pixels were within 0.5% of those from pixels manually identified by an experienced operator (coefficient of determination, R2, ≥ 0.92, Nash-Sutcliffe efficiency, NSE, ≥ 0.92, and root mean squared error, RMSE, ≤ 1.67 K). Daily ET estimates derived from the automated SEBAL and METRIC models were in good agreement with their manual counterparts (e.g., NSE ≥ 0.91 and RMSE ≤ 0.35 mm day-1). Automated and manual pixel selection resulted in similar estimates of observed ET across all sites. The proposed approach should reduce time demands for applying SEBAL/METRIC models and allow for their more widespread and frequent use. This automation can also reduce potential bias that could be introduced by an inexperienced operator and extend the domain of the models to new users.
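The agreement statistics quoted above (NSE and RMSE) follow standard definitions, sketched below in standard-library Python. The function names and the use of plain lists are assumptions for illustration; in the setting of this abstract, the "observed" series would correspond to the manually derived values and the "simulated" series to the automated ones.

```python
import math

def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    1.0 is a perfect match; 0.0 means the simulation is no better
    than always predicting the mean of the observations.
    """
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

def root_mean_squared_error(observed, simulated):
    """RMSE in the same units as the inputs (e.g. K or mm per day)."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)
```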
Jiang, Min; Chen, Yukun; Liu, Mei; Rosenbloom, S Trent; Mani, Subramani; Denny, Joshua C; Xu, Hua
2011-01-01
The authors' goal was to develop and evaluate machine-learning-based approaches to extracting clinical entities (including medical problems, tests, and treatments, as well as their asserted status) from hospital discharge summaries written using natural language. This project was part of the 2010 Center of Informatics for Integrating Biology and the Bedside/Veterans Affairs (VA) natural-language-processing challenge. The authors implemented a machine-learning-based named entity recognition system for clinical text and systematically evaluated the contributions of different types of features and ML algorithms, using a training corpus of 349 annotated notes. Based on the results from training data, the authors developed a novel hybrid clinical entity extraction system, which integrated heuristic rule-based modules with the ML-based named entity recognition module. The authors applied the hybrid system to the concept extraction and assertion classification tasks in the challenge and evaluated its performance using a test data set with 477 annotated notes. Standard measures including precision, recall, and F-measure were calculated using the evaluation script provided by the Center of Informatics for Integrating Biology and the Bedside/VA challenge organizers. The overall performance for all three types of clinical entities and all six types of assertions across 477 annotated notes was considered as the primary metric in the challenge. Systematic evaluation on the training set showed that Conditional Random Fields outperformed Support Vector Machines, and semantic information from existing natural-language-processing systems largely improved performance, although contributions from different types of features varied. The authors' hybrid entity extraction system achieved a maximum overall F-score of 0.8391 for concept extraction (ranked second) and 0.9313 for assertion classification (ranked fourth, but not statistically different than the first three systems) on the test
Machine fault feature extraction based on intrinsic mode functions
NASA Astrophysics Data System (ADS)
Fan, Xianfeng; Zuo, Ming J.
2008-04-01
This work employs empirical mode decomposition (EMD) to decompose raw vibration signals into intrinsic mode functions (IMFs) that represent the oscillatory modes generated by the components that make up the mechanical systems generating the vibration signals. The motivation here is to develop vibration signal analysis programs that are self-adaptive and that can detect machine faults at the earliest onset of deterioration. The change in velocity of the amplitude of some IMFs over a particular unit time will increase when the vibration is stimulated by a component fault. Therefore, the amplitude acceleration energy in the intrinsic mode functions is proposed as an indicator of the impulsive features that are often associated with mechanical component faults. The periodicity of the amplitude acceleration energy for each IMF is extracted by spectrum analysis. A spectrum amplitude index is introduced as a method to select the optimal result. A comparison study of the method proposed here and some well-established techniques for detecting machinery faults is conducted through the analysis of both gear and bearing vibration signals. The results indicate that the proposed method has superior capability to extract machine fault features from vibration signals.
Information extraction with object based support vector machines and vegetation indices
NASA Astrophysics Data System (ADS)
Ustuner, Mustafa; Abdikan, Saygin; Balik Sanli, Fusun
2016-07-01
Information extraction from remote sensing data is important for policy and decision makers, as the extracted information provides base layers for many real-world applications. Classification of remotely sensed data is one of the most common methods of extracting information; however, it remains challenging because several factors affect classification accuracy. The resolution of the imagery, the number and homogeneity of land cover classes, the purity of the training data and the characteristics of the adopted classifier are just some of these factors. Object-based image classification has some advantages over pixel-based classification for high-resolution images, since it uses geometric and structural information in addition to spectral information. Vegetation indices are also commonly used in classification, since they provide additional spectral information for vegetation, forestry and agricultural areas. In this study, the impacts of the Normalized Difference Vegetation Index (NDVI) and Normalized Difference Red Edge Index (NDRE) on the classification accuracy of RapidEye imagery were investigated. Object-based Support Vector Machines were implemented for the classification of crop types in a study area located in the Aegean region of Turkey. Results demonstrated that incorporating NDRE increased the overall classification accuracy from 79.96% to 86.80%, whereas NDVI decreased it from 79.96% to 78.90%. Moreover, the study shows that object-based classification with RapidEye data gives promising results for crop type mapping and analysis.
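As a brief illustration of the two indices compared in this abstract, the sketch below computes NDVI and NDRE from band reflectances using the standard normalized-difference form. The function names and the sample reflectance values are assumptions for illustration and are not taken from the study.

```python
def normalized_difference(band_a, band_b):
    """Generic normalized difference (a - b) / (a + b); returns 0.0 if a + b == 0."""
    total = band_a + band_b
    if total == 0:
        return 0.0
    return (band_a - band_b) / total


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return normalized_difference(nir, red)


def ndre(nir, red_edge):
    """Normalized Difference Red Edge Index from NIR and red-edge reflectance."""
    return normalized_difference(nir, red_edge)


if __name__ == "__main__":
    # Hypothetical reflectances for a single vegetated pixel.
    print(ndvi(0.5, 0.1), ndre(0.5, 0.3))
```

Both indices range from -1 to 1. In an object-based workflow they would typically be averaged per image segment and appended to the spectral features, although the abstract does not state how the authors did this.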
Mašín, Ivan
2016-01-01
One of the important sources of biomass-based fuel is Jatropha curcas L. Great attention is paid to the biofuel produced from the oil extracted from Jatropha curcas L. seeds. Mechanised extraction is the most efficient and feasible method of oil extraction for small-scale farmers, but there is a need to extract oil more efficiently in order to increase labour productivity, decrease production costs, and increase the benefits to small-scale farmers. On the other hand, innovators should be aware that further machine development is possible only when a systematic approach and design methodology are applied in all stages of engineering design. A systematic approach in this case means that designers and development engineers rigorously apply scientific knowledge, integrate different constraints and user priorities, carefully plan product and activities, and systematically solve technical problems. This paper therefore deals with a complex approach to determining the design specification, which can bring new innovative concepts to the design of mechanical machines for oil extraction. The case study presented as the main part of the paper focuses on a new concept for the screw of a machine that mechanically extracts oil from Jatropha curcas L. seeds. PMID:27668259
NASA Astrophysics Data System (ADS)
Baasch, B.; Müller, H.; von Dobeneck, T.
2018-07-01
In this work, we present a new methodology to predict grain-size distributions from geophysical data. Specifically, electric conductivity and magnetic susceptibility of seafloor sediments recovered from electromagnetic profiling data are used to predict grain-size distributions along shelf-wide survey lines. Field data from the NW Iberian shelf are investigated and reveal a strong relation between the electromagnetic properties and grain-size distribution. The here presented workflow combines unsupervised and supervised machine-learning techniques. Non-negative matrix factorization is used to determine grain-size end-members from sediment surface samples. Four end-members were found, which well represent the variety of sediments in the study area. A radial basis function network modified for prediction of compositional data is then used to estimate the abundances of these end-members from the electromagnetic properties. The end-members together with their predicted abundances are finally back transformed to grain-size distributions. A minimum spatial variation constraint is implemented in the training of the network to avoid overfitting and to respect the spatial distribution of sediment patterns. The predicted models are tested via leave-one-out cross-validation revealing high prediction accuracy with coefficients of determination (R2) between 0.76 and 0.89. The predicted grain-size distributions represent the well-known sediment facies and patterns on the NW Iberian shelf and provide new insights into their distribution, transition and dynamics. This study suggests that electromagnetic benthic profiling in combination with machine learning techniques is a powerful tool to estimate grain-size distribution of marine sediments.
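The final back-transformation step described above, from end-members and predicted abundances to a grain-size distribution, can be sketched as a weighted mixture. This is only an illustration under stated assumptions: the function names are hypothetical, the end-members are assumed to be discrete distributions over the same grain-size bins, and the simple closure (division by the sum) stands in for whatever compositional handling the authors' network actually uses.

```python
def close_composition(values):
    """Rescale non-negative values so that they sum to one (closure)."""
    total = sum(values)
    if total <= 0:
        raise ValueError("abundances must have a positive sum")
    return [v / total for v in values]


def back_transform(abundances, end_members):
    """Mix end-member grain-size distributions using the given abundances.

    abundances  : one non-negative number per end-member
    end_members : list of distributions, each a list over the same size bins
    """
    weights = close_composition(abundances)
    n_bins = len(end_members[0])
    return [
        sum(w * em[b] for w, em in zip(weights, end_members))
        for b in range(n_bins)
    ]


if __name__ == "__main__":
    # Two hypothetical end-members over three grain-size bins.
    ems = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]
    print(back_transform([1.0, 3.0], ems))
```

Because each end-member sums to one and the weights are closed to one, the mixed distribution also sums to one.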
NASA Astrophysics Data System (ADS)
Baasch, B.; Müller, H.; von Dobeneck, T.
2018-04-01
In this work we present a new methodology to predict grain-size distributions from geophysical data. Specifically, electric conductivity and magnetic susceptibility of seafloor sediments recovered from electromagnetic profiling data are used to predict grain-size distributions along shelf-wide survey lines. Field data from the NW Iberian shelf are investigated and reveal a strong relation between the electromagnetic properties and grain-size distribution. The here presented workflow combines unsupervised and supervised machine learning techniques. Nonnegative matrix factorisation is used to determine grain-size end-members from sediment surface samples. Four end-members were found which well represent the variety of sediments in the study area. A radial-basis function network modified for prediction of compositional data is then used to estimate the abundances of these end-members from the electromagnetic properties. The end-members together with their predicted abundances are finally back transformed to grain-size distributions. A minimum spatial variation constraint is implemented in the training of the network to avoid overfitting and to respect the spatial distribution of sediment patterns. The predicted models are tested via leave-one-out cross-validation revealing high prediction accuracy with coefficients of determination (R2) between 0.76 and 0.89. The predicted grain-size distributions represent the well-known sediment facies and patterns on the NW Iberian shelf and provide new insights into their distribution, transition and dynamics. This study suggests that electromagnetic benthic profiling in combination with machine learning techniques is a powerful tool to estimate grain-size distribution of marine sediments.
Zheng, Shuai; Ghasemzadeh, Nima; Hayek, Salim S; Quyyumi, Arshed A
2017-01-01
Background: Extracting structured data from narrated medical reports is challenged by the complexity of heterogeneous structures and vocabularies and often requires significant manual effort. Traditional machine-based approaches lack the capability to take user feedback for improving the extraction algorithm in real time. Objective: Our goal was to provide a generic information extraction framework that can support diverse clinical reports and enables a dynamic interaction between a human and a machine that produces highly accurate results. Methods: A clinical information extraction system, IDEAL-X, has been built on top of online machine learning. It processes one document at a time, and user interactions are recorded as feedback to update the learning model in real time. The updated model is used to predict values for extraction in subsequent documents. Once prediction accuracy reaches a user-acceptable threshold, the remaining documents may be batch processed. A customizable controlled vocabulary may be used to support extraction. Results: Three datasets were used for experiments based on report styles: 100 cardiac catheterization procedure reports, 100 coronary angiographic reports, and 100 integrated reports, each combining a history and physical report, discharge summary, outpatient clinic notes, outpatient clinic letter, and inpatient discharge medication report. Data extraction was performed by 3 methods: online machine learning, controlled vocabularies, and a combination of these. The system delivers results with F1 scores greater than 95%. Conclusions: IDEAL-X adopts a unique online machine learning–based approach combined with controlled vocabularies to support data extraction for clinical reports. The system can quickly learn and improve; thus, it is highly adaptable. PMID:28487265
Fuzzy Nonlinear Proximal Support Vector Machine for Land Extraction Based on Remote Sensing Image
Zhong, Xiaomei; Li, Jianping; Dou, Huacheng; Deng, Shijun; Wang, Guofei; Jiang, Yu; Wang, Yongjie; Zhou, Zebing; Wang, Li; Yan, Fei
2013-01-01
Currently, remote sensing technologies are widely employed in the dynamic monitoring of land. This paper presents an algorithm named fuzzy nonlinear proximal support vector machine (FNPSVM), based on ETM+ remote sensing imagery. The algorithm is applied to extract various land types in the city of Da’an in northern China. Two multi-category strategies for this algorithm, namely “one-against-one” and “one-against-rest”, are described in detail and then compared. A fuzzy membership function is presented to reduce the effects of noise or outliers on the data samples. The approaches to feature extraction and feature selection and several key parameter settings are also given. Numerous experiments were carried out to evaluate its performance, including accuracy (overall accuracy and kappa coefficient), stability, training speed, and classification speed. The FNPSVM classifier was compared with three other classifiers, namely the maximum likelihood classifier (MLC), back propagation neural network (BPN), and the proximal support vector machine (PSVM), under different training conditions. The impacts of the selection of training samples, testing samples and features on the four classifiers were also evaluated in these experiments. PMID:23936016
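Two quantities mentioned in this abstract lend themselves to a short worked sketch: the accuracy measures (overall accuracy and Cohen's kappa coefficient) and the number of binary classifiers each multi-category strategy requires. The function names and the example confusion matrix are assumptions for illustration, not values from the paper.

```python
def overall_accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def kappa_coefficient(confusion):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = overall_accuracy(confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    return (observed - expected) / (1.0 - expected)


def binary_classifier_count(n_classes, strategy):
    """Number of binary SVMs needed for a multi-category strategy."""
    if strategy == "one-against-one":
        return n_classes * (n_classes - 1) // 2
    if strategy == "one-against-rest":
        return n_classes
    raise ValueError("unknown strategy: " + strategy)


if __name__ == "__main__":
    cm = [[50, 10], [5, 35]]  # hypothetical 2-class confusion matrix
    print(overall_accuracy(cm), kappa_coefficient(cm))
    print(binary_classifier_count(5, "one-against-one"),
          binary_classifier_count(5, "one-against-rest"))
```

One-against-one trains more classifiers, each on the samples of only two classes, whereas one-against-rest trains fewer classifiers on the full training set. This trade-off is one plausible reason for comparing training and classification speed.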
Zheng, Shuai; Lu, James J; Ghasemzadeh, Nima; Hayek, Salim S; Quyyumi, Arshed A; Wang, Fusheng
2017-05-09
Extracting structured data from narrated medical reports is challenged by the complexity of heterogeneous structures and vocabularies and often requires significant manual effort. Traditional machine-based approaches lack the capability to take user feedback for improving the extraction algorithm in real time. Our goal was to provide a generic information extraction framework that can support diverse clinical reports and enables a dynamic interaction between a human and a machine that produces highly accurate results. A clinical information extraction system, IDEAL-X, has been built on top of online machine learning. It processes one document at a time, and user interactions are recorded as feedback to update the learning model in real time. The updated model is used to predict values for extraction in subsequent documents. Once prediction accuracy reaches a user-acceptable threshold, the remaining documents may be batch processed. A customizable controlled vocabulary may be used to support extraction. Three datasets were used for experiments based on report styles: 100 cardiac catheterization procedure reports, 100 coronary angiographic reports, and 100 integrated reports, each combining a history and physical report, discharge summary, outpatient clinic notes, outpatient clinic letter, and inpatient discharge medication report. Data extraction was performed by 3 methods: online machine learning, controlled vocabularies, and a combination of these. The system delivers results with F1 scores greater than 95%. IDEAL-X adopts a unique online machine learning-based approach combined with controlled vocabularies to support data extraction for clinical reports. The system can quickly learn and improve; thus, it is highly adaptable. ©Shuai Zheng, James J Lu, Nima Ghasemzadeh, Salim S Hayek, Arshed A Quyyumi, Fusheng Wang. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 09.05.2017.
Improving Automated Endmember Identification for Linear Unmixing of HyspIRI Spectral Data.
NASA Astrophysics Data System (ADS)
Gader, P.
2016-12-01
The size of data sets produced by imaging spectrometers is increasing rapidly. There is already a processing bottleneck. Part of the reason for this bottleneck is the need for expert input using interactive software tools. This process can be very time consuming and laborious but is currently crucial to ensuring the quality of the analysis. Automated algorithms can mitigate this problem. Although it is unlikely that processing systems can become completely automated, there is an urgent need to increase the level of automation. Spectral unmixing is a key component of processing HyspIRI data. Algorithms such as MESMA have been demonstrated to achieve results but require careful, expert construction of endmember libraries. Unfortunately, many endmembers found by automated endmember-finding algorithms are deemed unsuitable by experts because they are not physically reasonable. Moreover, endmembers that are not physically reasonable can achieve very low errors between the linear mixing model with those endmembers and the original data. Therefore, this error is not a reasonable way to resolve the problem of "non-physical" endmembers. There are many potential approaches for resolving these issues, including using Bayesian priors, but very little attention has been given to this problem. The study reported on here considers a modification of the Sparsity Promoting Iterated Constrained Endmember (SPICE) algorithm. SPICE finds endmembers and abundances and estimates the number of endmembers. The SPICE algorithm seeks to minimize a quadratic objective function with respect to endmembers E and fractions P. The modified SPICE algorithm, which we refer to as SPICED, is obtained by adding the term D to the objective function. The term D pressures the algorithm to minimize the sum of the squared differences between each endmember and a weighted sum of the data. By appropriately modifying the, the endmembers are pushed towards a subset of the data with the potential for
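The argument above, that non-physical endmembers can still give a very low linear-mixing error and that a data-proximity term D can penalize them, can be made concrete with a small sketch. This is not the SPICE or SPICED objective itself. The function names, the weight matrices, and the toy two-band data are all assumptions for illustration, and the sparsity and other terms of the real objective are omitted.

```python
def squared_norm(vector):
    return sum(x * x for x in vector)


def weighted_sum(weights, vectors):
    """Return sum_k weights[k] * vectors[k], computed band by band."""
    n_bands = len(vectors[0])
    return [sum(w * v[b] for w, v in zip(weights, vectors)) for b in range(n_bands)]


def reconstruction_error(data, endmembers, fractions):
    """Sum of squared residuals of the linear mixing model x_i ~ sum_k p_ik e_k."""
    total = 0.0
    for pixel, p in zip(data, fractions):
        model = weighted_sum(p, endmembers)
        total += squared_norm([x - m for x, m in zip(pixel, model)])
    return total


def data_proximity_term(data, endmembers, weights):
    """Illustrative D: sum_k || e_k - sum_i w_ki x_i ||^2 (weights are assumed given)."""
    total = 0.0
    for endmember, w in zip(endmembers, weights):
        target = weighted_sum(w, data)
        total += squared_norm([e - t for e, t in zip(endmember, target)])
    return total


if __name__ == "__main__":
    data = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    # Endmembers with negative "reflectance" still reconstruct the data exactly.
    odd = [[2.0, -1.0], [-1.0, 2.0]]
    odd_fractions = [[2 / 3, 1 / 3], [1 / 3, 2 / 3], [0.5, 0.5]]
    print(reconstruction_error(data, odd, odd_fractions))
    print(data_proximity_term(data, odd, weights))
```

In this toy case the endmembers with negative values fit the data with essentially zero error, while the illustrative D term is clearly non-zero, which is the behaviour the abstract attributes to the added term.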
Hussain, Lal; Ahmed, Adeel; Saeed, Sharjil; Rathore, Saima; Awan, Imtiaz Ahmed; Shah, Saeed Arif; Majid, Abdul; Idris, Adnan; Awan, Anees Ahmed
2018-02-06
Prostate cancer is the second leading cause of cancer deaths among men. Early detection can effectively reduce the mortality caused by prostate cancer. Because prostate MRI is high-resolution and multiresolution, proper diagnostic systems and tools are required. In the past, researchers developed computer-aided diagnosis (CAD) systems that help the radiologist detect abnormalities. In this research paper, we employed novel machine learning techniques, namely a Bayesian approach, Support Vector Machine (SVM) kernels (polynomial, radial basis function (RBF) and Gaussian) and a Decision Tree, for detecting prostate cancer. Moreover, different feature extraction strategies are proposed to improve the detection performance. These strategies are based on texture, morphological, scale-invariant feature transform (SIFT), and elliptic Fourier descriptor (EFD) features. The performance was evaluated for single features as well as combinations of features using machine learning classification techniques. Cross-validation (jack-knife k-fold) was performed, and performance was evaluated in terms of the receiver operating characteristic (ROC) curve and specificity, sensitivity, positive predictive value (PPV), negative predictive value (NPV), and false positive rate (FPR). With single feature extraction strategies, the SVM Gaussian kernel gives the highest accuracy of 98.34% with an AUC of 0.999. With combinations of feature extraction strategies, the SVM Gaussian kernel with texture + morphological and EFDs + morphological features gives the highest accuracy of 99.71% and an AUC of 1.00.
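The evaluation measures listed in this abstract all derive from the four cells of a binary confusion matrix. The sketch below shows those definitions. The function name and the example counts are assumptions for illustration and do not reproduce the paper's results.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return common binary-classification metrics from confusion counts.

    tp/fn: positive (cancer) cases detected/missed;
    tn/fp: negative cases correctly/incorrectly classified.
    """
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "fpr": fp / (fp + tn),           # false positive rate = 1 - specificity
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


if __name__ == "__main__":
    # Hypothetical counts for one cross-validation fold.
    for name, value in diagnostic_metrics(tp=90, fp=20, tn=80, fn=10).items():
        print(name, round(value, 4))
```

In a k-fold setting these metrics are usually computed per fold and then averaged, or computed once on the pooled predictions. The abstract does not say which variant was used.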
A Gaussian Mixture Model Representation of Endmember Variability in Hyperspectral Unmixing
NASA Astrophysics Data System (ADS)
Zhou, Yuan; Rangarajan, Anand; Gader, Paul D.
2018-05-01
Hyperspectral unmixing while considering endmember variability is usually performed by the normal compositional model (NCM), where the endmembers for each pixel are assumed to be sampled from unimodal Gaussian distributions. However, in real applications, the distribution of a material is often not Gaussian. In this paper, we use Gaussian mixture models (GMM) to represent the endmember variability. We show, given the GMM starting premise, that the distribution of the mixed pixel (under the linear mixing model) is also a GMM (and this is shown from two perspectives). The first perspective originates from the random variable transformation and gives a conditional density function of the pixels given the abundances and GMM parameters. With proper smoothness and sparsity prior constraints on the abundances, the conditional density function leads to a standard maximum a posteriori (MAP) problem which can be solved using generalized expectation maximization. The second perspective originates from marginalizing over the endmembers in the GMM, which provides us with a foundation to solve for the endmembers at each pixel. Hence, our model can not only estimate the abundances and distribution parameters, but also the distinct endmember set for each pixel. We tested the proposed GMM on several synthetic and real datasets, and showed its potential by comparing it to current popular methods.
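The claim that a linear mixture of GMM-distributed endmembers is itself a GMM can be illustrated for a single spectral band. The sketch below is not the authors' implementation. It assumes independent endmembers, scalar (one-band) Gaussians, and additive zero-mean Gaussian noise, and the function name and the `(weight, mean, variance)` tuple layout are hypothetical.

```python
from itertools import product


def mixed_pixel_gmm(abundances, endmember_gmms, noise_var=0.0):
    """Return the GMM of y = sum_j a_j * e_j + noise for one band.

    endmember_gmms[j] is a list of (weight, mean, variance) components
    describing endmember j. Each combination of one component per endmember
    yields one Gaussian component of the mixed pixel.
    """
    components = []
    for combo in product(*endmember_gmms):
        weight, mean, var = 1.0, 0.0, noise_var
        for a, (w, m, v) in zip(abundances, combo):
            weight *= w        # independent endmembers: weights multiply
            mean += a * m      # mean of a linear combination
            var += a * a * v   # variance of a linear combination
        components.append((weight, mean, var))
    return components


if __name__ == "__main__":
    gmms = [
        [(0.5, 0.2, 0.01), (0.5, 0.4, 0.01)],  # bimodal endmember
        [(1.0, 0.8, 0.04)],                    # unimodal endmember
    ]
    for component in mixed_pixel_gmm([0.5, 0.5], gmms):
        print(component)
```

The number of mixed-pixel components is the product of the per-endmember component counts, so it grows quickly with the number of endmembers.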
NASA Astrophysics Data System (ADS)
Gong, Z.; Dekkers, M. J.; Heslop, D.; Mullender, T. A. T.
2009-08-01
To identify remagnetization is essential for palaeomagnetic studies and their geodynamic implications. The traditional approach is often based on directional analysis of palaeomagnetic data and field tests, which may be inconclusive if the apparent polar wander path (APWP) is poorly constrained or if the remagnetization predates folding. In several cases, rock magnetic work, particularly, the measurement of hysteresis loops allows identification of the so-called `remagnetized' and `non-remagnetized' trends. However, for weakly magnetic samples, this approach can be equivocal. Here, to improve the diagnosis of remagnetization, we investigated 192 isothermal remanent magnetization (IRM) acquisition curves (up to 700 mT) of remagnetized and non-remagnetized limestones from the Organyà Basin, northern Spain. Also, 96 IRM acquisition curves from non-remagnetized marls were studied as a cross-check for the non-remagnetized limestones. A non-parametric end-member modelling approach is used to analyse the IRM acquisition curve data sets. First, remagnetized and non-remagnetized groups were treated separately. Two or three end-members were found to adequately describe the data variability: one end-member represents the high-coercivity contribution, whereas the low-coercivity part can be described by either one end-member or two reasonably similar end-members. In the remagnetized limestones, the low-coercivity end-members tend to saturate at higher field values than in the non-remagnetized limestones. When the entire data set was processed together, a three-end-member model was judged optimal. This model consists of a high-coercivity end-member, a low-coercivity end-member that saturates at ~300-400 mT and a low-coercivity end-member that approximately saturates at 700 mT. Higher contributions of the latter end-member appear to occur dominantly in the remagnetized limestones, whereas the reverse is true for the non-remagnetized limestones, so they plot in clearly
Agarwalla, Swapna; Sarma, Kandarpa Kumar
2016-06-01
Automatic Speaker Recognition (ASR) and related issues are continuously evolving as inseparable elements of Human Computer Interaction (HCI). With the assimilation of emerging concepts like big data and the Internet of Things (IoT) as extended elements of HCI, ASR techniques are passing through a paradigm shift. Of late, learning-based techniques have started to receive greater attention from research communities related to ASR, owing to the fact that they possess a natural ability to mimic biological behavior and thereby aid ASR modeling and processing. Current learning-based ASR techniques are evolving further with the incorporation of concepts such as big data and IoT. In this paper, we report certain approaches based on machine learning (ML) used for the extraction of relevant samples from a big data space and apply them to ASR using certain soft computing techniques for Assamese speech with dialectal variations. A class of ML techniques comprising the basic Artificial Neural Network (ANN) in feedforward (FF) and Deep Neural Network (DNN) forms, using raw speech, extracted features and frequency domain forms, is considered. The Multi Layer Perceptron (MLP) is configured with inputs in several forms to learn class information obtained using clustering and manual labeling. DNNs are also used to extract specific sentence types. Initially, relevant samples are selected from a large storage and assimilated. Next, a few conventional methods are used for feature extraction of a few selected types. The features comprise both spectral and prosodic types. These are applied to Recurrent Neural Network (RNN) and Fully Focused Time Delay Neural Network (FFTDNN) structures to evaluate their performance in recognizing mood, dialect, speaker and gender variations in dialectal Assamese speech. The system is tested under several background noise conditions by considering the recognition rates (obtained using confusion matrices and manually) and computation time
Li, Yang; Li, Guoqing; Wang, Zhenhao
2015-01-01
In order to overcome the problems of poor understandability of the pattern recognition-based transient stability assessment (PRTSA) methods, a new rule extraction method based on extreme learning machine (ELM) and an improved Ant-miner (IAM) algorithm is presented in this paper. First, the basic principles of ELM and Ant-miner algorithm are respectively introduced. Then, based on the selected optimal feature subset, an example sample set is generated by the trained ELM-based PRTSA model. And finally, a set of classification rules are obtained by IAM algorithm to replace the original ELM network. The novelty of this proposal is that transient stability rules are extracted from an example sample set generated by the trained ELM-based transient stability assessment model by using IAM algorithm. The effectiveness of the proposed method is shown by the application results on the New England 39-bus power system and a practical power system--the southern power system of Hebei province.
Mathematical model of simple spalling formation during coal cutting with extracting machine
NASA Astrophysics Data System (ADS)
Gabov, V. V.; Zadkov, D. A.
2018-05-01
A single-mass model of a rotor shearer is analyzed. It is shown that rotor mining machines have large moments of inertia and load dynamics. A model of an extraction module with selective movement of the cutting tool is presented. The distinguishing feature of such extracting machines is the fluid power drive of the cutter mechanism. They can operate steadily at large shear thickness, and locking modes are not an emergency for them. Compared with shearers, they have less inertial mass but a slower average cutting speed, and its momentary values depend on load. Based on the equation of hydraulic fuel consumption balance, the operation of the fluid power drive of the extracting module cutter mechanism together with a hydropneumatic accumulator is analyzed. A model of spalling formation during coal cutting with a fluid power drive cutter mechanism and potential energy stores is suggested. When the cutter speed is matched with the speed of main crack expansion and the amount of potential energy consumption, the cutter load is determined only by the ultimate stress at the crack pole and friction. Tests of an extracting module cutter on a full-size model confirmed the stated theory.
Objective determination of image end-members in spectral mixture analysis of AVIRIS data
NASA Technical Reports Server (NTRS)
Tompkins, Stefanie; Mustard, John F.; Pieters, Carle M.; Forsyth, Donald W.
1993-01-01
Spectral mixture analysis has been shown to be a powerful, multifaceted tool for analysis of multi- and hyper-spectral data. Applications of AVIRIS data have ranged from mapping soils and bedrock to ecosystem studies. During the first phase of the approach, a set of end-members are selected from an image cube (image end-members) that best account for its spectral variance within a constrained, linear least squares mixing model. These image end-members are usually selected using a priori knowledge and successive trial and error solutions to refine the total number and physical location of the end-members. However, in many situations a more objective method of determining these essential components is desired. We approach the problem of image end-member determination objectively by using the inherent variance of the data. Unlike purely statistical methods such as factor analysis, this approach derives solutions that conform to a physically realistic model.
The optional selection of micro-motion feature based on Support Vector Machine
NASA Astrophysics Data System (ADS)
Li, Bo; Ren, Hongmei; Xiao, Zhi-he; Sheng, Jing
2017-11-01
The micro-motion of a target takes multiple forms, and different micro-motion forms are apt to be modulated, which makes feature extraction and recognition difficult. Aiming at feature extraction for cone-shaped objects with different micro-motion forms, this paper proposes a method for selecting the best micro-motion features based on a support vector machine. After the time-frequency distribution of the radar echoes is computed and the time-frequency spectra of objects with different micro-motion forms are compared, features are extracted based on the differences between the instantaneous frequency variations of the different micro-motions. The extracted features are then evaluated with methods based on SVM (Support Vector Machine), and the best features are selected. Finally, the results show that the proposed method is feasible under test conditions with a certain signal-to-noise ratio (SNR).
NASA Astrophysics Data System (ADS)
Zhou, Peng; Peng, Zhike; Chen, Shiqian; Yang, Yang; Zhang, Wenming
2018-06-01
With the development of large rotary machines for faster and more integrated performance, the condition monitoring and fault diagnosis for them are becoming more challenging. Since the time-frequency (TF) pattern of the vibration signal from the rotary machine often contains condition information and fault feature, the methods based on TF analysis have been widely-used to solve these two problems in the industrial community. This article introduces an effective non-stationary signal analysis method based on the general parameterized time-frequency transform (GPTFT). The GPTFT is achieved by inserting a rotation operator and a shift operator in the short-time Fourier transform. This method can produce a high-concentrated TF pattern with a general kernel. A multi-component instantaneous frequency (IF) extraction method is proposed based on it. The estimation for the IF of every component is accomplished by defining a spectrum concentration index (SCI). Moreover, such an IF estimation process is iteratively operated until all the components are extracted. The tests on three simulation examples and a real vibration signal demonstrate the effectiveness and superiority of our method.
Mori, Kensaku; Ota, Shunsuke; Deguchi, Daisuke; Kitasaka, Takayuki; Suenaga, Yasuhito; Iwano, Shingo; Hasegawa, Yosihnori; Takabatake, Hirotsugu; Mori, Masaki; Natori, Hiroshi
2009-01-01
This paper presents a method for the automated anatomical labeling of bronchial branches extracted from 3D CT images based on machine learning and combination optimization. We also show applications of anatomical labeling on a bronchoscopy guidance system. This paper performs automated labeling by using machine learning and combination optimization. The actual procedure consists of four steps: (a) extraction of tree structures of the bronchus regions extracted from CT images, (b) construction of AdaBoost classifiers, (c) computation of candidate names for all branches by using the classifiers, (d) selection of best combination of anatomical names. We applied the proposed method to 90 cases of 3D CT datasets. The experimental results showed that the proposed method can assign correct anatomical names to 86.9% of the bronchial branches up to the sub-segmental lobe branches. Also, we overlaid the anatomical names of bronchial branches on real bronchoscopic views to guide real bronchoscopy.
Geochemical Characterization of Endmember Mantle Components
2005-06-01
from the oceanic crust and volcanic edifice beneath Gran Canaria (Canary Islands); consequences for crustal contamination of ascending magmas, Chemical...Enriched Mantle II (EM2) Endmember: Evidence from the Samoan Volcanic Chain .................................................... 19 Abstract...DMM). On the other hand, ocean island basalts (OIBs), erupted by hotspot volcanism , are isotopically heterogeneous in terms of most radiogenic
Machine Learning for Knowledge Extraction from PHR Big Data.
Poulymenopoulou, Michaela; Malamateniou, Flora; Vassilacopoulos, George
2014-01-01
Cloud computing, Internet of things (IOT) and NoSQL database technologies can support a new generation of cloud-based PHR services that contain heterogeneous (unstructured, semi-structured and structured) patient data (health, social and lifestyle) from various sources, including automatically transmitted data from Internet connected devices of patient living space (e.g. medical devices connected to patients at home care). The patient data stored in such PHR systems constitute big data whose analysis with the use of appropriate machine learning algorithms is expected to improve diagnosis and treatment accuracy, to cut healthcare costs and, hence, to improve the overall quality and efficiency of healthcare provided. This paper describes a health data analytics engine which uses machine learning algorithms for analyzing cloud based PHR big health data towards knowledge extraction to support better healthcare delivery as regards disease diagnosis and prognosis. This engine comprises of the data preparation, the model generation and the data analysis modules and runs on the cloud taking advantage from the map/reduce paradigm provided by Apache Hadoop.
Hussain, Lal
2018-06-01
Epilepsy is a neurological disorder caused by abnormal excitability of neurons in the brain. Brain activity is monitored through the electroencephalogram (EEG) of patients suffering from seizures in order to detect epileptic seizures. The performance of EEG-based epilepsy detection depends on the feature extraction strategy. In this research, we extracted a variety of features based on time and frequency domain characteristics, nonlinear measures, wavelet-based entropy and a few statistical features. A deeper study was undertaken using novel machine learning classifiers by considering multiple factors. The support vector machine kernels were evaluated based on the multiclass kernel and the box constraint level. Likewise, for K-nearest neighbors (KNN), we varied the distance metric, the neighbor weights and the number of neighbors. Similarly, for decision trees we tuned the parameters based on the maximum number of splits and the split criterion, and ensemble classifiers were evaluated based on different ensemble methods and learning rates. For training/testing, tenfold cross-validation was employed, and performance was evaluated in terms of TPR, NPR, PPV, accuracy and AUC. In this research, a deeper analysis was performed using diverse feature extraction strategies and robust machine learning classifiers with more advanced optimal options. The Support Vector Machine with a linear kernel and KNN with the city block distance metric give the overall highest accuracy of 99.5%, which was higher than that obtained using the default parameters for these classifiers. Moreover, the highest separation (AUC = 0.9991, 0.9990) was obtained at different kernel scales using SVM. Additionally, K-nearest neighbors with inverse squared distance weighting gives higher performance at different numbers of neighbors. Moreover, in distinguishing postictal heart rate oscillations from those of epileptic ictal subjects, the highest performance of 100% was obtained using different machine learning classifiers.
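Two of the KNN options reported as performing well, the city block distance metric and inverse squared distance weighting, can be combined in a minimal sketch. This is not the authors' pipeline. The function names, the `eps` guard against division by zero, and the toy training points are assumptions for illustration.

```python
from collections import defaultdict


def city_block(a, b):
    """City block (Manhattan, L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k=3, eps=1e-12):
    """Classify `query` by a weighted vote among its k nearest neighbours.

    train : list of (feature_vector, label) pairs
    Each neighbour votes with weight 1 / distance**2 (inverse squared distance).
    """
    neighbours = sorted(train, key=lambda item: city_block(item[0], query))[:k]
    votes = defaultdict(float)
    for features, label in neighbours:
        distance = city_block(features, query)
        votes[label] += 1.0 / (distance * distance + eps)
    return max(votes, key=votes.get)


if __name__ == "__main__":
    # Hypothetical 2-D features: one close "ictal" point, two farther "normal" points.
    train = [((0.0, 0.0), "ictal"), ((2.0, 0.0), "normal"), ((0.0, 2.0), "normal")]
    print(knn_predict(train, (0.1, 0.0), k=3))
```

With unweighted voting the two farther points would win 2 to 1. The inverse squared distance weights let the single very close neighbour dominate, which is the behaviour this weighting option is meant to provide.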
Machine vision and appearance based learning
NASA Astrophysics Data System (ADS)
Bernstein, Alexander
2017-03-01
Smart algorithms are used in machine vision to organize or extract high-level information from the available data. The resulting high-level understanding of the content of images, which are received from a visual sensing system and belong to an appearance space, can be only a key first step in solving specific tasks such as mobile robot navigation in uncertain environments, road detection in autonomous driving systems, etc. Appearance-based learning has become very popular in the field of machine vision. In general, the appearance of a scene is a function of the scene content, the lighting conditions, and the camera position. The mobile robot localization problem is considered in a machine learning framework via appearance space analysis. The problem is reduced to a regression problem on an appearance manifold, and new regression-on-manifolds methods are used for its solution.
NASA Astrophysics Data System (ADS)
Schneider, P.; Roberts, D. A.
2007-12-01
The Fire Potential Index (FPI) is currently the only operationally used wildfire susceptibility index in the United States that incorporates remote sensing data in addition to meteorological information. Its remote sensing component utilizes relative greenness (RG) derived from an NDVI time series as a proxy for computing the ratio of live to dead vegetation. This study investigates the potential of Multiple Endmember Spectral Mixture Analysis (MESMA) as a more direct and physically reasonable way of computing the live ratio and applying it for the computation of the FPI. A time series of 16-day reflectance composites of Moderate Resolution Imaging Spectroradiometer (MODIS) data was used to perform the analysis. Endmember selection for green vegetation (GV), non-photosynthetic vegetation (NPV) and soil was performed in two stages. First, a subset of suitable endmembers was selected from an extensive library of reference and image spectra for each class using Endmember Average Root Mean Square Error (EAR), Minimum Average Spectral Angle (MASA) and a count-based technique. Second, the most appropriate endmembers for the specific data set were selected from the subset by running a series of 2-endmember models on representative images and choosing the ones that modeled the majority of pixels. The final set of endmembers was used for running MESMA on southern California MODIS composites from 2000 to 2006. 3- and 4-endmember models were considered. The best model was chosen on a per-pixel basis according to the minimum root mean square error of the models at each level of complexity. Endmember fractions were normalized by the shade endmember to generate realistic fractions of GV and NPV. In order to validate the MESMA-derived GV fractions they were compared against live ratio estimates from RG. A significant spatial and temporal relationship between both measures was found, indicating that GV fraction has the potential to substitute RG in computing the FPI. To further test
Collection of endmembers and their separability for spectral unmixing in rangeland applications
NASA Astrophysics Data System (ADS)
Rolfson, David
Rangelands are an important resource to Alberta. Due to their size, mapping rangeland features is difficult. However, the use of aerial and satellite data for mapping has increased the area that can be studied at one time. The recent success in applying hyperspectral data to vegetation mapping has shown promise in rangeland classification. However, classification mapping of hyperspectral data requires existing data for input into classification algorithms. The research reported in this thesis focused on acquiring a seasonal inventory of in-situ reflectance spectra of rangeland plant species (endmembers) and comparing them to evaluate their separability as an indicator of their suitability for hyperspectral image classification analysis. The goals of this research also included determining the separability of species endmembers at different times of the growing season. In 2008, reflectance spectra were collected for three shrub species (Artemisia cana, Symphoricarpos occidentalis, and Rosa acicularis), five rangeland grass species native to southern Alberta (Koeleria gracilis, Stipa comata, Bouteloua gracilis, Agropyron smithii, Festuca idahoensis) and one invasive grass species (Agropyron cristatum). A spectral library, built using the SPECCHIO spectral database software, was populated using these spectroradiometric measurements with a focus on vegetation spectra. Average endmembers of plant spectra acquired during the peak of sample greenness were compared using three separability measures -- normalized Euclidean distance (NED), correlation separability measure (CSM) and Modified Spectral Angle Mapper (MSAM) -- to establish the degree to which the species were separable. Results were normalized to values between 0 and 1 and values above the established thresholds indicate that the species were not separable. The endmembers for Agropyron cristatum, Agropyron smithii, and Rosa acicularis were not separable using CSM (threshold = 0.992) or MSAM (threshold = 0
End-Member Formulation of Solid Solutions and Reactive Transport
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lichtner, Peter C.
2015-09-01
A model for incorporating solid solutions into reactive transport equations is presented based on an end-member representation. Reactive transport equations are solved directly for the composition and bulk concentration of the solid solution. Reactions of a solid solution with an aqueous solution are formulated in terms of an overall stoichiometric reaction corresponding to a time-varying composition and exchange reactions, equivalent to reaction end-members. Reaction rates are treated kinetically using a transition state rate law for the overall reaction and a pseudo-kinetic rate law for exchange reactions. The composition of the solid solution at the onset of precipitation is assumed to correspond to the least soluble composition, equivalent to the composition at equilibrium. The stoichiometric saturation determines if the solid solution is super-saturated with respect to the aqueous solution. The method is implemented for a simple prototype batch reactor using Mathematica for a binary solid solution. Finally, the sensitivity of the results on the kinetic rate constant for a binary solid solution is investigated for reaction of an initially stoichiometric solid phase with an undersaturated aqueous solution.
Li, Tao; Sun, Guihua; Ma, Shengzhong; Liang, Kai; Yang, Chupeng; Li, Bo; Luo, Weidong
2016-11-15
Concentration, spatial distribution, composition and sources of polycyclic aromatic hydrocarbons (PAHs) were investigated based on measurements of 16 PAH compounds in surface sediments of the western Taiwan Strait. Total PAH concentrations ranged from 2.41 to 218.54 ng g⁻¹. Cluster analysis identified three site clusters representing the northern, central and southern regions. Sedimentary PAHs mainly originated from a mixture of pyrolytic and petrogenic sources in the north, from pyrolytic sources in the central region, and from petrogenic sources in the south. An end-member mixing model was applied to the PAH compound data to estimate mixing proportions for unknown end-members (i.e., extreme-value sample points) proposed by principal component analysis (PCA). The results showed that the analyzed samples can be expressed as mixtures of three end-members, and the mixing of different end-members was strongly related to the transport pathway controlled by two currents, which alternately prevail in the Taiwan Strait during different seasons.
A Method for Extracting Important Segments from Documents Using Support Vector Machines
NASA Astrophysics Data System (ADS)
Suzuki, Daisuke; Utsumi, Akira
In this paper we propose an extraction-based method for automatic summarization. The proposed method consists of two processes: important segment extraction and sentence compaction. The process of important segment extraction classifies each segment in a document as important or not by Support Vector Machines (SVMs). The process of sentence compaction then determines grammatically appropriate portions of a sentence for a summary according to its dependency structure and the classification result by SVMs. To test the performance of our method, we conducted an evaluation experiment using the Text Summarization Challenge (TSC-1) corpus of human-prepared summaries. The result was that our method achieved better performance than a segment-extraction-only method and the Lead method, especially for sentences only a part of which was included in human summaries. Further analysis of the experimental results suggests that a hybrid method that integrates sentence extraction with segment extraction may generate better summaries.
Mena, Luis J.; Orozco, Eber E.; Felix, Vanessa G.; Ostos, Rodolfo; Melgarejo, Jesus; Maestre, Gladys E.
2012-01-01
Machine learning has become a powerful tool for analysing medical domains, assessing the importance of clinical parameters, and extracting medical knowledge for outcomes research. In this paper, we present a machine learning method for extracting diagnostic and prognostic thresholds, based on a symbolic classification algorithm called REMED. We evaluated the performance of our method by determining new prognostic thresholds for well-known and potential cardiovascular risk factors that are used to support medical decisions in the prognosis of fatal cardiovascular diseases. Our approach predicted 36% of cardiovascular deaths with 80% specificity and 75% general accuracy. The new method provides an innovative approach that might be useful to support decisions about medical diagnoses and prognoses. PMID:22924062
Spatial/Spectral Identification of Endmembers from AVIRIS Data using Mathematical Morphology
NASA Technical Reports Server (NTRS)
Plaza, Antonio; Martinez, Pablo; Gualtieri, J. Anthony; Perez, Rosa M.
2001-01-01
During the last several years, a number of airborne and satellite hyperspectral sensors have been developed or improved for remote sensing applications. Imaging spectrometry allows the detection of materials, objects and regions in a particular scene with a high degree of accuracy. Hyperspectral data typically consist of hundreds of thousands of spectra, so the analysis of this information is a key issue. Mathematical morphology is a widely used nonlinear technique for image analysis and pattern recognition. Although it is especially well suited to segmenting binary or grayscale images with irregular and complex shapes, its application to the classification/segmentation of multispectral or hyperspectral images has been quite rare. In this paper, we discuss a new, completely automated methodology for finding endmembers in the hyperspectral data cube using mathematical morphology. The extension of classic morphology to the hyperspectral domain allows us to integrate spectral and spatial information in the analysis process. In Section 3, some basic concepts of mathematical morphology and the technical details of our algorithm are provided. In Section 4, the accuracy of the proposed method is tested by applying it to real hyperspectral data obtained from the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS). Some details about these data and reference results, obtained by well-known endmember extraction techniques, are provided in Section 2. Finally, in Section 5 we present our main conclusions.
Scorebox extraction from mobile sports videos using Support Vector Machines
NASA Astrophysics Data System (ADS)
Kim, Wonjun; Park, Jimin; Kim, Changick
2008-08-01
The scorebox plays an important role in understanding the content of sports videos. However, a tiny scorebox can make it uncomfortable for small-display viewers to grasp the game situation. In this paper, we propose a novel framework to extract the scorebox from sports video frames. We first extract candidates using accumulated intensity and edge information after a short learning period. Since various types of scoreboxes are inserted in sports videos, multiple attributes need to be used for efficient extraction. Based on those attributes, the information gain is computed, and the three top-ranked attributes in terms of information gain are selected as a three-dimensional feature vector for a Support Vector Machine (SVM) to distinguish the scorebox from other candidates, such as logos and advertisement boards. The proposed method is tested on various sports videos, and experimental results show its efficiency and robustness.
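Ranking attributes by information gain, as this abstract describes, can be sketched as follows. This is an illustrative version only: it assumes the attributes have already been discretized, and the attribute names and toy data are hypothetical rather than those used by the authors.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting the labels on a discrete attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def top_attributes(table, labels, k=3):
    """Names of the k attributes with the highest information gain."""
    gains = {name: information_gain(col, labels) for name, col in table.items()}
    return sorted(gains, key=gains.get, reverse=True)[:k]


# Toy candidate regions: two scoreboxes and two other overlays (hypothetical).
labels = ["box", "box", "other", "other"]
table = {
    "static": [1, 1, 0, 0],      # separates the classes perfectly
    "edge_dense": [1, 0, 1, 0],  # carries no class information
    "small": [1, 1, 1, 0],       # partially informative
}
best = top_attributes(table, labels, k=2)
```

The selected attribute columns would then form the feature vector passed to the classifier.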
NASA Technical Reports Server (NTRS)
Thompson, David R.; Bornstein, Benjamin; Bue, Brian D.; Tran, Daniel Q.; Chien, Steve A.; Castano, Rebecca
2012-01-01
We present a demonstration of onboard hyperspectral image processing with the potential to reduce mission downlink requirements. The system detects spectral endmembers and then uses them to map units of surface material. This summarizes the content of the scene, reveals spectral anomalies warranting fast response, and reduces data volume by two orders of magnitude. We have integrated this system into the Autonomous Science craft Experiment for operational use onboard the Earth Observing One (EO-1) Spacecraft. The system does not require prior knowledge about spectra of interest. We report on a series of trial overflights in which identical spacecraft commands are effective for autonomous spectral discovery and mapping for varied target features, scenes and imaging conditions.
NASA Astrophysics Data System (ADS)
Stein, N. T.; Arvidson, R. E.; O'Sullivan, J. A.; Catalano, J. G.; Guinness, E. A.; Politte, D. V.; Gellert, R.; VanBommel, S. J.
2018-01-01
The Opportunity rover investigated a gentle swale on the rim of Endeavour crater called Marathon Valley where a series of bright planar outcrops are cut into polygons by fractures. A wheel scuff performed on one of the soil-filled fracture zones revealed the presence of three end-members identified on the basis of Pancam multispectral imaging observations covering 0.4 to 1 μm: red and dark pebbles, and a bright soil clod. Multiple overlapping Alpha Particle X-ray Spectrometer (APXS) measurements were collected on three targets within the scuff zone. The field of view of each APXS measurement contained various proportions of the Pancam-based end-members. Application of a log maximum likelihood method for retrieving the composition of the end-members using the 10 APXS measurements shows that the dark pebble end-member is compositionally similar to average Mars soil, with slightly elevated S and Fe. In contrast, the red pebble end-member exhibits enrichments in Al and Si and is depleted in Fe and Mg relative to average Mars soil. The soil clod end-member is enriched in Mg, S, and Ni. Thermodynamic modeling of the soil clod end-member composition indicates a dominance of sulfate minerals. We hypothesize that acidic fluids in fractures leached and oxidized the basaltic host rock, forming the red pebbles, and then evaporated to leave behind sulfate-cemented soil.
Precise on-machine extraction of the surface normal vector using an eddy current sensor array
NASA Astrophysics Data System (ADS)
Wang, Yongqing; Lian, Meng; Liu, Haibo; Ying, Yangwei; Sheng, Xianjun
2016-11-01
To satisfy the requirements of on-machine measurement of the surface normal during complex surface manufacturing, a highly robust normal vector extraction method using an Eddy current (EC) displacement sensor array is developed, the output of which is almost unaffected by surface brightness, machining coolant and environmental noise. A precise normal vector extraction model based on a triangular-distributed EC sensor array is first established. Calibration of the effects of object surface inclination and coupling interference on measurement results, and the relative position of EC sensors, is involved. A novel apparatus employing three EC sensors and a force transducer was designed, which can be easily integrated into the computer numerical control (CNC) machine tool spindle and/or robot terminal execution. Finally, to test the validity and practicability of the proposed method, typical experiments were conducted with specified testing pieces using the developed approach and system, such as an inclined plane and cylindrical and spherical surfaces.
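The geometric core of a three-sensor arrangement is that three measured surface points define a plane, whose normal is the cross product of two edge vectors. The sketch below shows only that step, assuming ideal and already calibrated readings; it does not model the inclination and coupling corrections described in the abstract, and the function name and coordinates are hypothetical.

```python
from math import sqrt


def surface_normal(p1, p2, p3):
    """Unit normal of the plane through three 3-D points (cross product of two edges)."""
    u = [b - a for a, b in zip(p1, p2)]  # edge p1 -> p2
    v = [b - a for a, b in zip(p1, p3)]  # edge p1 -> p3
    n = [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]
    length = sqrt(sum(c * c for c in n))
    if length == 0:
        raise ValueError("points are collinear; the plane is undefined")
    return [c / length for c in n]


# Sensor (x, y) positions paired with the measured height z (illustrative values).
flat = surface_normal((0, 0, 0), (1, 0, 0), (0, 1, 0))
tilted = surface_normal((0, 0, 0), (1, 0, 1), (0, 1, 0))  # plane z = x
```

The sign of the result depends on the order in which the three points are given.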
NASA Astrophysics Data System (ADS)
Cui, Qian; Shi, Jiancheng; Xu, Yuanliu
2011-12-01
Water is a basic need of human society and a determining factor in ecosystem stability. There are many lakes on the Tibetan Plateau, which can cause floods and mudslides when their water area expands sharply. At present, water area is extracted from TM or SPOT data because of their high spatial resolution; however, their temporal resolution is insufficient. MODIS data have high temporal resolution and broad coverage, so they are a valuable resource for detecting changes in water area. Because of their low spatial resolution, mixed pixels are common. In this paper, four spectral libraries are built using the MOD09A1 product; based on these, water bodies are extracted at the sub-pixel level using Multiple Endmember Spectral Mixture Analysis (MESMA) applied to MODIS daily reflectance data (MOD09GA). The unmixing result is compared with contemporaneous TM data, showing that this method has high accuracy.
Albadr, Musatafa Abbas Abbood; Tiun, Sabrina; Al-Dhief, Fahad Taha; Sammour, Mahmoud A M
2018-01-01
Spoken Language Identification (LID) is the process of determining and classifying natural language from a given content and dataset. Typically, data must be processed to extract useful features to perform LID. According to the literature, feature extraction for LID is a mature process in which standard features have already been developed using Mel-Frequency Cepstral Coefficients (MFCC), Shifted Delta Cepstral (SDC) coefficients, the Gaussian Mixture Model (GMM) and, finally, the i-vector based framework. However, the process of learning from the extracted features remains to be improved (i.e. optimised) to capture all the knowledge embedded in them. The Extreme Learning Machine (ELM) is an effective learning model used to perform classification and regression analysis and is extremely useful for training a single-hidden-layer neural network. Nevertheless, the learning process of this model is not entirely effective (i.e. optimised) because of the random selection of weights within the input hidden layer. In this study, the ELM is selected as a learning model for LID based on standard feature extraction. One of the optimisation approaches for ELM, the Self-Adjusting Extreme Learning Machine (SA-ELM), is selected as the benchmark and improved by altering the selection phase of the optimisation process. The selection process incorporates both the Split-Ratio and K-Tournament methods; the improved SA-ELM is named the Enhanced Self-Adjusting Extreme Learning Machine (ESA-ELM). The results are generated based on LID with datasets created from eight different languages. The results showed that the ESA-ELM LID outperformed the SA-ELM LID, achieving an accuracy of 96.25% compared with 95.00% for the SA-ELM LID. PMID:29672546
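For readers unfamiliar with K-Tournament selection, a generic version of the operator is sketched below. It is the textbook form (draw k candidates at random, keep the fittest) and is not claimed to match the exact selection phase of ESA-ELM; the function name, the toy population and the fitness function are hypothetical.

```python
import random


def k_tournament(population, fitness, k, rng):
    """Draw k distinct candidates at random and return the fittest of them."""
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)


rng = random.Random(0)        # fixed seed so the sketch is repeatable
population = list(range(10))  # stand-in for candidate weight sets
winner = k_tournament(population, fitness=lambda x: x, k=3, rng=rng)
# With k equal to the population size the best candidate always wins.
champion = k_tournament(population, fitness=lambda x: x, k=10, rng=rng)
```

Smaller values of k give weaker candidates a better chance of being selected, which preserves diversity; larger values increase selection pressure.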
Using endmembers in AVIRIS images to estimate changes in vegetative biomass
NASA Technical Reports Server (NTRS)
Smith, Milton O.; Adams, John B.; Ustin, Susan L.; Roberts, Dar A.
1992-01-01
Field techniques for estimating vegetative biomass are labor intensive, and rarely are used to monitor changes in biomass over time. Remote-sensing offers an attractive alternative to field measurements; however, because there is no simple correspondence between encoded radiance in multispectral images and biomass, it is not possible to measure vegetative biomass directly from AVIRIS images. Ways to estimate vegetative biomass by identifying community types and then applying biomass scalars derived from field measurements are investigated. Field measurements of community-scale vegetative biomass can be made, at least for local areas, but it is not always possible to identify vegetation communities unambiguously using remote measurements and conventional image-processing techniques. Furthermore, even when communities are well characterized in a single image, it typically is difficult to assess the extent and nature of changes in a time series of images, owing to uncertainties introduced by variations in illumination geometry, atmospheric attenuation, and instrumental responses. Our objective is to develop an improved method based on spectral mixture analysis to characterize and identify vegetative communities, that can be applied to multi-temporal AVIRIS and other types of images. In previous studies, multi-temporal data sets (AVIRIS and TM) of Owens Valley, CA were analyzed and vegetation communities were defined in terms of fractions of reference (laboratory and field) endmember spectra. An advantage of converting an image to fractions of reference endmembers is that, although fractions in a given pixel may vary from image to image in a time series, the endmembers themselves typically are constant, thus providing a consistent frame of reference.
NASA Astrophysics Data System (ADS)
Schneider, P.; Roberts, D. A.
2008-12-01
Wildfire is a significant natural disturbance mechanism in Southern California. Assessing spatial patterns of wildfire susceptibility requires estimates of the live and dead fractions of vegetation. The Fire Potential Index (FPI), which is currently the only operationally computed fire susceptibility index incorporating remote sensing data, estimates such fractions using a relative greenness measure based on time series of vegetation index images. This contribution assesses the potential of Multiple Endmember Spectral Mixture Analysis (MESMA) for deriving such fractions from single MODIS images without the need for a long remote sensing time series, and investigates the applicability of such MESMA-derived fractions for mapping dynamic fire susceptibility in Southern California. Endmembers for MESMA were selected from a library of reference endmembers using Constrained Reference Endmember Selection (CRES), which uses field estimates of fractions to guide the selection process. Fraction images of green vegetation, non-photosynthetic vegetation, soil, and shade were then computed for all available 16-day MODIS composites between 2000 and 2006 using MESMA. Initial results indicate that MESMA of MODIS imagery is capable of providing reliable estimates of live and dead vegetation fraction. Validation against in situ observations in the Santa Ynez Mountains near Santa Barbara, California, shows that the average fraction error for two tested species was around 10%. Further validation of MODIS-derived fractions was performed against fractions from high-resolution hyperspectral data. It was shown that the fractions derived from data of both sensors correlate with R2 values greater than 0.95. MESMA-derived live and dead vegetation fractions were subsequently tested as a substitute to relative greenness in the FPI algorithm. FPI was computed for every day between 2000 and 2006 using the derived fractions. Model performance was then tested by extracting FPI values for
NASA Astrophysics Data System (ADS)
Liu, Fengjing; Conklin, Martha H.; Shaw, Glenn D.
2017-01-01
Both concentration-discharge relation and end-member mixing analysis were explored to elucidate the connectivity of hydrologic and hydrochemical processes using chemical data collected during 2006-2008 at Happy Isles (468 km2), Pohono Bridge (833 km2), and Briceburg (1873 km2) in the snowmelt-fed mid-Merced River basin, augmented by chemical data collected by the USGS during 1990-2014 at Happy Isles. Concentration-discharge (C-Q) in streamflow was dominated by a well-defined power law relation, with the magnitude of exponent (0.02-0.6) and R2 values (p < 0.001) lower on rising than falling limbs. Concentrations of conservative solutes in streamflow resulted from mixing of two end-members at Happy Isles and Pohono Bridge and three at Briceburg, with relatively constant solute concentrations in end-members. The fractional contribution of groundwater was higher on rising than falling limbs at all basin scales. The relationship between the fractional contributions of subsurface flow and groundwater and streamflow (F-Q) followed the same relation as C-Q as a result of end-member mixing. The F-Q relation was used as a simple model to simulate subsurface flow and groundwater discharges to Happy Isles from 1990 to 2014 and was successfully validated by solute concentrations measured by the USGS. It was also demonstrated that the consistency of F-Q and C-Q relations is applicable to other catchments where end-members and the C-Q relationships are well defined, suggesting hydrologic and hydrochemical processes are strongly coupled and mutually predictable. Combining concentration-discharge and end-member mixing analyses could be used as a diagnostic tool to understand streamflow generation and hydrochemical controls in catchment hydrologic studies.
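A power-law concentration-discharge relation of the form C = a·Q^b is commonly fitted by ordinary least squares in log-log space. The sketch below shows that fit under the assumption of strictly positive data; the function name and the synthetic values are illustrative and are not the Merced River measurements.

```python
from math import exp, log


def fit_power_law(q, c):
    """Least-squares fit of C = a * Q**b in log-log space; returns (a, b)."""
    x = [log(v) for v in q]
    y = [log(v) for v in c]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # Slope of the regression line log C = log a + b * log Q.
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    a = exp(my - b * mx)
    return a, b


# Synthetic, noise-free data generated from a = 5 and b = -0.3.
discharge = [1.0, 2.0, 4.0, 8.0]
concentration = [5.0 * v ** -0.3 for v in discharge]
a, b = fit_power_law(discharge, concentration)
```

Fitting rising and falling limbs separately would give one exponent per limb, which is the kind of comparison the abstract reports.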
An Android malware detection system based on machine learning
NASA Astrophysics Data System (ADS)
Wen, Long; Yu, Haiyang
2017-08-01
The Android smartphone, with its open-source character and excellent performance, has attracted many users. However, the convenience of the Android platform has also motivated the development of malware. The traditional signature-based method of malware detection is unable to detect unknown applications. This article proposes a lightweight machine learning-based system capable of identifying malware on Android devices. In this system, we extract features based on static analysis and dynamic analysis; a new feature selection approach based on principal component analysis (PCA) and Relief is then presented to reduce the dimensionality of the features. After that, a model is constructed with a support vector machine (SVM) for classification. Experimental results show that our system provides an effective method for Android malware detection.
NASA Astrophysics Data System (ADS)
Li, Tao; Li, Tuan-Jie
2018-04-01
The analysis of grain-size distribution enables us to decipher sediment transport processes and understand the causal relations between dynamic processes and grain-size distributions. In the present study, grain sizes were measured from surface sediments collected in the Pearl River Estuary and its adjacent coastal areas. End-member modeling analysis attempts to unmix the grain sizes into geologically meaningful populations. Six grain-size end-members were identified. Their dominant modes are 0 Φ, 1.5 Φ, 2.75 Φ, 4.5 Φ, 7 Φ, and 8 Φ, corresponding to coarse sand, medium sand, fine sand, very coarse silt, silt, and clay, respectively. The spatial distributions of the six end-members are influenced by sediment transport and depositional processes. The two coarsest end-members (coarse sand and medium sand) may reflect relict sediments deposited during the last glacial period. The fine sand end-member would be difficult to transport under fair weather conditions, and likely indicates storm deposits. The three remaining fine-grained end-members (very coarse silt, silt, and clay) are recognized as suspended particles transported by saltwater intrusion via the flood tidal current, the Guangdong Coastal Current, and riverine outflow. The grain-size trend analysis shows distinct transport patterns for the three fine-grained end-members. The landward transport of the very coarse silt end-member occurs in the eastern part of the estuary, the seaward transport of the silt end-member occurs in the western part, and the east-west transport of the clay end-member occurs in the coastal areas. The results show that grain-size end-member modeling analysis in combination with sediment trend analysis help to better understand sediment transport patterns and the associated transport mechanisms.
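The Φ values quoted for the end-member modes follow the Krumbein phi scale, in which Φ = −log2(d / 1 mm), so the grain diameter in millimetres is 2^(−Φ). The short sketch below converts the six reported modes; the function and variable names are illustrative.

```python
def phi_to_mm(phi):
    """Grain diameter in millimetres for a Krumbein phi value."""
    return 2.0 ** (-phi)


# Dominant modes of the six end-members reported in the abstract.
modes = [0, 1.5, 2.75, 4.5, 7, 8]
diameters_mm = {phi: phi_to_mm(phi) for phi in modes}
```

On this scale larger Φ means finer sediment, which is why the clay end-member has the highest value.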
NASA Astrophysics Data System (ADS)
Lee, Donghoon; Kim, Ye-seul; Choi, Sunghoon; Lee, Haenghwa; Jo, Byungdu; Choi, Seungyeon; Shin, Jungwook; Kim, Hee-Joung
2017-03-01
Chest digital tomosynthesis (CDT) is a recently developed medical device that has several advantages for diagnosing lung disease. For example, CDT provides depth information at a relatively low radiation dose compared with computed tomography (CT). However, a major problem with CDT is the image artifacts associated with data incompleteness resulting from limited-angle data acquisition in the CDT geometry. For this reason, its sensitivity for lung disease has not been clear compared with CT. In this study, to improve the sensitivity of lung disease detection in CDT, we developed a computer-aided diagnosis (CAD) system based on machine learning. To design the CAD system, we used 100 cropped images of lung nodules and 100 cropped images of normal lesions, acquired with lung man phantoms and a prototype CDT system. We used machine learning techniques based on a support vector machine and the Gabor filter. The Gabor filter was used to extract characteristics of lung nodules, and we compared its feature extraction performance for various scale and orientation parameters: 3, 4 and 5 scales and 4, 6 and 8 orientations. After feature extraction, a support vector machine (SVM) was used to classify the lesion features. The linear, polynomial and Gaussian kernels of the SVM were compared to decide the best SVM conditions for CDT reconstruction images. The results showed that the CAD system with machine learning is capable of automatic lung lesion detection. Furthermore, detection performance was best when a Gabor filter with 5 scales and 8 orientations and an SVM with a Gaussian kernel were used. In conclusion, the proposed CAD system improved the sensitivity of lung lesion detection in CDT, and we determined the Gabor filter and SVM conditions that achieve higher detection performance for it.
Extraction of inland Nypa fruticans (Nipa Palm) using Support Vector Machine
NASA Astrophysics Data System (ADS)
Alberto, R. T.; Serrano, S. C.; Damian, G. B.; Camaso, E. E.; Biagtan, A. R.; Panuyas, N. Z.; Quibuyen, J. S.
2017-09-01
Mangroves are considered one of the major habitats in coastal ecosystems, providing many economic and ecological services to human society. Nypa fruticans (Nipa palm) is one of the important mangrove species because of its versatility and uniqueness as a halophytic palm. However, nipas are not only adapted to saline areas; they can also thrive away from the coastline, depending on the soil types available in the area. Because of this, mapping of this species is not limited to near-shore areas but extends to all areas where the species is present. The extraction of Nypa fruticans was carried out using the available LiDAR data. A Support Vector Machine (SVM) classification process was used to extract nipas in inland areas. The SVM classification process for mapping Nypa fruticans produced a high accuracy of 95+%. The Support Vector Machine classification process for extracting inland nipas proved effective when different terrain derivatives from LiDAR data were used.
Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.
Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi
2013-01-01
The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces the fuzzy support vector machine, a learning algorithm based on a combination of fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that the fuzzy support vector machine applied in combination with filter or wrapper feature selection methods develops a robust model with higher accuracy than conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule base inferred from the fuzzy support vector machine helps extract biological knowledge from microarray data. The fuzzy support vector machine, as a new classification model with high generalization power, robustness, and good interpretability, seems to be a promising tool for gene expression microarray classification.
Hamada, Yuki; Stow, Douglas A; Roberts, Dar A; Franklin, Janet; Kyriakidis, Phaedon C
2013-04-01
Arid and semi-arid shrublands have significant biological and economic value and have been experiencing dramatic changes due to human activities. In California, California sage scrub (CSS) is one of the most endangered plant communities in the US and requires close monitoring in order to conserve this important biological resource. We investigate the utility of remote-sensing approaches--object-based image analysis applied to pansharpened QuickBird imagery (QBPS/OBIA) and multiple endmember spectral mixture analysis (MESMA) applied to SPOT imagery (SPOT/MESMA)--for estimating fractional cover of true shrub, subshrub, herb, and bare ground within CSS communities of southern California. We also explore the effectiveness of life-form cover maps for assessing CSS conditions. Overall and combined shrub cover (i.e., true shrub and subshrub) were estimated more accurately using QBPS/OBIA (mean absolute error or MAE, 8.9 %) than SPOT/MESMA (MAE, 11.4 %). Life-form cover from QBPS/OBIA at a 25 × 25 m grid cell size seems most desirable for assessing CSS because of its higher accuracy and spatial detail in cover estimates and amenability to extracting other vegetation information (e.g., size, shape, and density of shrub patches). Maps derived from SPOT/MESMA at a 50 × 50 m scale are effective for retrospective analysis of life-form cover change because of their comparable accuracies to QBPS/OBIA and the availability of SPOT archive data dating back to the mid-1980s. The framework in this study can be applied to other physiognomically comparable shrubland communities.
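The mean absolute error (MAE) used above to compare the two cover-estimation approaches is the average absolute difference between estimated and reference cover. A minimal sketch follows; the cover values are hypothetical and are not data from the study.

```python
def mean_absolute_error(estimated, reference):
    """Average absolute difference between paired estimates and references."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - r) for e, r in zip(estimated, reference)) / len(estimated)

# Hypothetical fractional shrub cover (%) for four grid cells.
estimated_cover = [42.0, 55.0, 30.0, 61.0]
reference_cover = [50.0, 48.0, 35.0, 70.0]
mae = mean_absolute_error(estimated_cover, reference_cover)
```

Because MAE is expressed in the same units as the cover fractions (percentage points), values such as 8.9 % and 11.4 % can be compared directly across methods.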
Machine vision extracted plant movement for early detection of plant water stress.
Kacira, M; Ling, P P; Short, T H
2002-01-01
A methodology was established for early, non-contact, and quantitative detection of plant water stress with machine vision extracted plant features. Top-projected canopy area (TPCA) of the plants was extracted from plant images using image-processing techniques. Water stress induced plant movement was decoupled from plant diurnal movement and plant growth using the coefficient of relative variation of TPCA, CRV(TPCA), which was found to be an effective marker for water stress detection. The threshold value of CRV(TPCA) as an indicator of water stress was determined by a parametric approach. The effectiveness of the sensing technique was evaluated against the timing of stress detection by an operator. Results of this study suggested that plant water stress detection using projected canopy area based features of the plants was feasible.
Assessing and monitoring of urban vegetation using multiple endmember spectral mixture analysis
NASA Astrophysics Data System (ADS)
Zoran, M. A.; Savastru, R. S.; Savastru, D. M.
2013-08-01
In recent years, urban vegetation, which has significant health, biological and economic value, has experienced dramatic changes due to urbanization and human activities in the metropolitan area of Bucharest, Romania. We investigated the utility of multiple endmember spectral mixture analysis (MESMA) applied to IKONOS and Landsat TM/ETM satellite data for estimating the fractional cover of urban/periurban forest, parks, and agricultural vegetation areas. Because the spectral heterogeneity of the same physical features of urban vegetation increases with image resolution, traditional statistical methods based on spectral information may not be useful for classifying land cover dynamics from high-resolution imagery such as IKONOS. We therefore used a hierarchy tree classification method for classification and MESMA for assessing vegetation land cover dynamics, based on the available IKONOS high-resolution imagery of Bucharest. This study employs thirty-two endmembers and six hundred and sixty spectral models to identify all surface features (vegetation, water, soil, impervious) and shade in the Bucharest area. The mean RMS error for the selected vegetation land cover classes ranges from 0.0027 to 0.018. The Pearson correlation between the fraction outputs from MESMA and reference data from all IKONOS 1 m panchromatic resolution images for urban/periurban vegetation ranged from 0.7048 to 0.8287. The framework in this study can be applied to other urban vegetation areas in Romania.
Automatic sub-pixel coastline extraction based on spectral mixture analysis using EO-1 Hyperion data
NASA Astrophysics Data System (ADS)
Hong, Zhonghua; Li, Xuesu; Han, Yanling; Zhang, Yun; Wang, Jing; Zhou, Ruyan; Hu, Kening
2018-06-01
Many megacities (such as Shanghai) are located in coastal areas; therefore, coastline monitoring is critical for urban security and urban development sustainability. A shoreline is defined as the intersection between coastal land and a water surface and features seawater edge movements as tides rise and fall. Remote sensing techniques have increasingly been used for coastline extraction; however, traditional hard classification methods are performed only at the pixel level, and achieving subpixel accuracy using soft classification methods is both challenging and time consuming due to the complex features in coastal regions. This paper presents an automatic sub-pixel coastline extraction method (ASPCE) for hyperspectral satellite imagery that performs coastline extraction based on spectral mixture analysis and, thus, achieves higher accuracy. The ASPCE method consists of three main components: 1) A Water-Vegetation-Impervious-Soil (W-V-I-S) model is first presented to detect mixed W-V-I-S pixels and determine the endmember spectra in coastal regions; 2) The linear spectral unmixing technique based on Fully Constrained Least Squares (FCLS) is applied to the mixed W-V-I-S pixels to estimate seawater abundance; and 3) The spatial attraction model is used to extract the coastline. We tested this new method using EO-1 images from three coastal regions in China: the South China Sea, the East China Sea, and the Bohai Sea. The results showed that the method is accurate and robust. Root mean square error (RMSE) was utilized to evaluate the accuracy by calculating the distance differences between the extracted coastline and the digitized coastline. The classifier's performance was compared with that of the Multiple Endmember Spectral Mixture Analysis (MESMA), Mixture Tuned Matched Filtering (MTMF), Sequential Maximum Angle Convex Cone (SMACC), Constrained Energy Minimization (CEM), and one classical Normalized Difference Water Index (NDWI). The results from the
Machine vision based quality inspection of flat glass products
NASA Astrophysics Data System (ADS)
Zauner, G.; Schagerl, M.
2014-03-01
This application paper presents a machine vision solution for the quality inspection of flat glass products. A contact image sensor (CIS) is used to generate digital images of the glass surfaces. The presented machine vision based quality inspection at the end of the production line aims to classify five different glass defect types. The defect images are usually characterized by very little 'image structure', i.e. homogeneous regions without distinct image texture. Additionally, these defect images usually consist of only a few pixels. At the same time the appearance of certain defect classes can be very diverse (e.g. water drops). We used simple state-of-the-art image features like histogram-based features (std. deviation, kurtosis, skewness), geometric features (form factor/elongation, eccentricity, Hu-moments) and texture features (grey level run length matrix, co-occurrence matrix) to extract defect information. The main contribution of this work lies in the systematic evaluation of various machine learning algorithms to identify appropriate classification approaches for this specific class of images. To this end, the following machine learning algorithms were compared: decision tree (J48), random forest, JRip rules, naive Bayes, Support Vector Machine (multi class), neural network (multilayer perceptron) and k-Nearest Neighbour. We used a representative image database of 2300 defect images and applied cross validation for evaluation purposes.
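The histogram-based features named above (standard deviation, skewness, kurtosis) can be computed from the grey levels of a defect region roughly as follows. This is a minimal sketch using population moments; the function name `histogram_features` and the sample grey levels are illustrative assumptions, and the paper does not state which estimator variant it used.

```python
import math

def histogram_features(pixels):
    """Population std. deviation, skewness and (non-excess) kurtosis of grey levels."""
    n = len(pixels)
    if n == 0:
        raise ValueError("pixels must not be empty")
    mean = sum(pixels) / n
    m2 = sum((p - mean) ** 2 for p in pixels) / n  # variance
    m3 = sum((p - mean) ** 3 for p in pixels) / n  # third central moment
    m4 = sum((p - mean) ** 4 for p in pixels) / n  # fourth central moment
    if m2 == 0:
        # A perfectly homogeneous region has no spread; shape measures are undefined.
        return {"std": 0.0, "skewness": 0.0, "kurtosis": 0.0}
    std = math.sqrt(m2)
    return {"std": std, "skewness": m3 / std ** 3, "kurtosis": m4 / m2 ** 2}

# Hypothetical grey levels of a tiny defect region.
features = histogram_features([1, 2, 3, 4, 5])
```

For defect images of only a few pixels, such moment estimates are noisy, which is one plausible reason the paper combines them with geometric and texture features.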
Machine Learning Based Malware Detection
2015-05-18
A Trident Scholar Project Report No. 440: Machine Learning Based Malware Detection, by Midshipman 1/C Zane A. Markel, USN ... suitably be projected into realistic performance. This work explores several aspects of machine learning based malware detection. First, we
NASA Astrophysics Data System (ADS)
Varatharajan, I.; D'Amore, M.; Maturilli, A.; Helbert, J.; Hiesinger, H.
2018-04-01
Machine learning approach to spectral unmixing of emissivity spectra of Mercury is carried out using endmember spectral library measured at simulated daytime surface conditions of Mercury. Study supports MERTIS payload onboard ESA/JAXA BepiColombo.
NASA Astrophysics Data System (ADS)
Kim, Ji-Hyun; Kim, Kyoung-Ho; Thao, Nguyen Thi; Batsaikhan, Bayartungalag; Yun, Seong-Taek
2017-06-01
In this study, we evaluated the water quality status (especially, salinity problems) and hydrogeochemical processes of an alluvial aquifer in a floodplain of the Red River delta, Vietnam, based on the hydrochemical and isotopic data of groundwater samples (n = 23) from the Kien Xuong district of the Thai Binh province. Following the historical inundation by paleo-seawater during coastal progradation, the aquifer has undergone progressive freshening and land reclamation to enable settlements and farming. The hydrochemical data of water samples showed a broad hydrochemical change, from Na-Cl through Na-HCO3 to Ca-HCO3 types, suggesting that the groundwater evolved overall through the freshening process accompanying cation exchange. The principal component analysis (PCA) of the hydrochemical data indicates three major hydrogeochemical processes occurring in the aquifer, namely: 1) progressive freshening of remaining paleo-seawater, 2) water-rock interaction (i.e., dissolution of silicates), and 3) redox processes including sulfate reduction, as indicated by heavy sulfur and oxygen isotope compositions of sulfate. To quantitatively assess the hydrogeochemical processes, the end-member mixing analysis (EMMA) and the forward mixing modeling using PHREEQC code were conducted. The EMMA results show that the hydrochemical model with the two-dimensional mixing space composed of PC 1 and PC 2 best explains the mixing in the study area; therefore, we consider that the groundwater chemistry mainly evolved by mixing among three end-members (i.e., paleo-seawater, infiltrating rain, and the K-rich groundwater). The distinct depletion of sulfate in groundwater, likely due to bacterial sulfate reduction, can also be explained by EMMA. The evaluation of mass balances using geochemical modeling supports the explanation that the freshening process accompanying direct cation exchange occurs through mixing among three end-members involving the K-rich groundwater. This
Bishop, Christopher M.
2013-02-13
Several decades of research in the field of machine learning have resulted in a multitude of different algorithms for solving a broad range of problems. To tackle a new application, a researcher typically tries to map their problem onto one of these existing methods, often influenced by their familiarity with specific algorithms and by the availability of corresponding software implementations. In this study, we describe an alternative methodology for applying machine learning, in which a bespoke solution is formulated for each new application. The solution is expressed through a compact modelling language, and the corresponding custom machine learning code is then generated automatically. This model-based approach offers several major advantages, including the opportunity to create highly tailored models for specific scenarios, as well as rapid prototyping and comparison of a range of alternative models. Furthermore, newcomers to the field of machine learning do not have to learn about the huge range of traditional methods, but instead can focus their attention on understanding a single modelling environment. In this study, we show how probabilistic graphical models, coupled with efficient inference algorithms, provide a very flexible foundation for model-based machine learning, and we outline a large-scale commercial application of this framework involving tens of millions of users. We also describe the concept of probabilistic programming as a powerful software environment for model-based machine learning, and we discuss a specific probabilistic programming language called Infer.NET, which has been widely used in practical applications. PMID:23277612
Support patient search on pathology reports with interactive online learning based data extraction.
Zheng, Shuai; Lu, James J; Appin, Christina; Brat, Daniel; Wang, Fusheng
2015-01-01
Structural reporting enables semantic understanding and prompt retrieval of clinical findings about patients. While synoptic pathology reporting provides templates for data entries, information in pathology reports remains primarily in narrative free text form. Extracting data of interest from narrative pathology reports could significantly improve the representation of the information and enable complex structured queries. However, manual extraction is tedious and error-prone, and automated tools are often constructed with a fixed training dataset and not easily adaptable. Our goal is to extract data from pathology reports to support advanced patient search with a highly adaptable semi-automated data extraction system, which can adjust and self-improve by learning from a user's interaction with minimal human effort. We have developed an online machine learning based information extraction system called IDEAL-X. With its graphical user interface, the system's data extraction engine automatically annotates values for users to review upon loading each report text. The system analyzes users' corrections regarding these annotations with online machine learning, and incrementally enhances and refines the learning model as reports are processed. The system also takes advantage of customized controlled vocabularies, which can be adaptively refined during the online learning process to further assist the data extraction. As the accuracy of automatic annotation improves over time, the effort of human annotation is gradually reduced. After all reports are processed, a built-in query engine can be applied to conveniently define queries based on extracted structured data. We have evaluated the system with a dataset of anatomic pathology reports from 50 patients. Extracted data elements include demographic data, diagnosis, genetic marker, and procedure. The system achieves F-1 scores of around 95% for the majority of tests. Extracting data from pathology reports could enable
Feature extraction algorithm for space targets based on fractal theory
NASA Astrophysics Data System (ADS)
Tian, Balin; Yuan, Jianping; Yue, Xiaokui; Ning, Xin
2007-11-01
To extend the life of satellites and reduce launch and operating costs, satellite servicing, including on-orbit repair, upgrading and refueling of spacecraft, is becoming much more frequent. Future space operations can be executed more economically and reliably using machine vision systems, which can meet the real-time and tracking-reliability requirements for image tracking in space surveillance systems. Machine vision has been applied to research on the relative pose of spacecraft, and the feature extraction algorithm is the basis of relative pose determination. In this paper, a fractal-geometry-based edge extraction algorithm is presented that can be used in a machine vision system to determine and track the relative pose of an observed satellite during proximity operations. The method obtains a gray-level image of the fractal dimension distribution using the Differential Box-Counting (DBC) approach of fractal theory, which suppresses noise. After this, continuous edges are detected using mathematical morphology. The validity of the proposed method is examined by processing and analyzing images of space targets. The edge extraction method not only extracts the outline of the target but also preserves the inner details. Meanwhile, edge extraction is performed only in the moving area, which greatly reduces computation. Simulations compared edge detection using the presented method with other detection methods. The results indicate that the presented algorithm is a valid method for solving the problem of relative pose determination for spacecraft.
Machine characterization based on an abstract high-level language machine
NASA Technical Reports Server (NTRS)
Saavedra-Barrera, Rafael H.; Smith, Alan Jay; Miya, Eugene
1989-01-01
Measurements are presented for a large number of machines ranging from small workstations to supercomputers. The authors combine these measurements into groups of parameters which relate to specific aspects of the machine implementation, and use these groups to provide overall machine characterizations. The authors also define the concept of pershapes, which represent the level of performance of a machine for different types of computation. A metric based on pershapes is introduced that provides a quantitative way of measuring how similar two machines are in terms of their performance distributions. The metric is related to the extent to which pairs of machines have varying relative performance levels depending on which benchmark is used.
Successful attack on permutation-parity-machine-based neural cryptography.
Seoane, Luís F; Ruttor, Andreas
2012-02-01
An algorithm is presented which implements a probabilistic attack on the key-exchange protocol based on permutation parity machines. Instead of imitating the synchronization of the communicating partners, the strategy consists of a Monte Carlo method to sample the space of possible weights during inner rounds and an analytic approach to convey the extracted information from one outer round to the next one. The results show that the protocol under attack fails to synchronize faster than an eavesdropper using this algorithm.
Ranjith, G; Parvathy, R; Vikas, V; Chandrasekharan, Kesavadas; Nair, Suresh
2015-04-01
With the advent of new imaging modalities, radiologists are faced with handling increasing volumes of data for diagnosis and treatment planning. The use of automated and intelligent systems is becoming essential in such a scenario. Machine learning, a branch of artificial intelligence, is increasingly being used in medical image analysis applications such as image segmentation, registration and computer-aided diagnosis and detection. Histopathological analysis is currently the gold standard for classification of brain tumors. The use of machine learning algorithms along with extraction of relevant features from magnetic resonance imaging (MRI) holds promise of replacing conventional invasive methods of tumor classification. The aim of the study is to classify gliomas into benign and malignant types using MRI data. Retrospective data from 28 patients who were diagnosed with glioma were used for the analysis. WHO Grade II (low-grade astrocytoma) was classified as benign while Grade III (anaplastic astrocytoma) and Grade IV (glioblastoma multiforme) were classified as malignant. Features were extracted from MR spectroscopy. The classification was done using four machine learning algorithms: multilayer perceptrons, support vector machine, random forest and locally weighted learning. Three of the four machine learning algorithms gave an area under ROC curve in excess of 0.80. Random forest gave the best performance in terms of AUC (0.911) while sensitivity was best for locally weighted learning (86.1%). The performance of different machine learning algorithms in the classification of gliomas is promising. An even better performance may be expected by integrating features extracted from other MR sequences.
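The two figures of merit reported above, area under the ROC curve (AUC) and sensitivity, can be computed from classifier scores as in the following sketch. The AUC is computed here through its pairwise-ranking interpretation (the probability that a malignant case scores higher than a benign one); the labels, scores and the 0.5 decision threshold are hypothetical and are not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def sensitivity(labels, predictions):
    """True positive rate: TP / (TP + FN)."""
    tp = sum(1 for l, p in zip(labels, predictions) if l == 1 and p == 1)
    fn = sum(1 for l, p in zip(labels, predictions) if l == 1 and p == 0)
    return tp / (tp + fn)

# Hypothetical data: 1 = malignant, 0 = benign.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

auc = roc_auc(labels, scores)
sens = sensitivity(labels, predictions)
```

AUC is threshold-independent while sensitivity depends on the chosen operating point, which is why different algorithms can rank best on each measure, as reported above.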
NASA Astrophysics Data System (ADS)
Antoshechkina, P. M.; Wolf, A. S.; Hamecher, E. A.; Asimow, P. D.; Ghiorso, M. S.
2013-12-01
Community databases such as EarthChem, LEPR, and AMCSD both increase demand for quantitative petrological tools, including thermodynamic models like the MELTS family of algorithms, and are invaluable in development of such tools. The need to extend existing solid solution models to include minor components such as Cr and Na has been evident for years but as the number of components increases it becomes impossible to completely separate derivation of end-member thermodynamic data from calibration of solution properties. In Hamecher et al. (2012; 2013) we developed a calibration scheme that directly interfaces with a MySQL database based on LEPR, with volume data from AMCSD and elsewhere. Here we combine that scheme with a Bayesian approach, where independent constraints on parameter values (e.g. existence of miscibility gaps) are combined with uncertainty propagation to give a more reliable best-fit along with associated model uncertainties. We illustrate the scheme with a new model of molar volume for (Ca,Fe,Mg,Mn,Na)3(Al,Cr,Fe3+,Fe2+,Mg,Mn,Si,Ti)2Si3O12 cubic garnets. For a garnet in this chemical system, the model molar volume is obtained by adding excess volume terms to a linear combination of nine independent end-member volumes. The model calibration is broken into three main stages: (1) estimation of individual end-member thermodynamic properties; (2) calibration of standard state volumes for all available independent and dependent end members; (3) fitting of binary and mixed composition data. For each calibration step, the goodness-of-fit includes weighted residuals as well as χ2-like penalty terms representing the (not necessarily Gaussian) prior constraints on parameter values. Using the Bayesian approach, uncertainties are correctly propagated forward to subsequent steps, allowing determination of final parameter values and correlated uncertainties that account for the entire calibration process. For the aluminosilicate garnets, optimal values of the bulk
NASA's online machine aided indexing system
NASA Technical Reports Server (NTRS)
Silvester, June P.; Genuardi, Michael T.; Klingbiel, Paul H.
1993-01-01
This report describes the NASA Lexical Dictionary, a machine aided indexing system used online at the National Aeronautics and Space Administration's Center for Aerospace Information (CASI). This system is comprised of a text processor that is based on the computational, non-syntactic analysis of input text, and an extensive 'knowledge base' that serves to recognize and translate text-extracted concepts. The structure and function of the various NLD system components are described in detail. Methods used for the development of the knowledge base are discussed. Particular attention is given to a statistically-based text analysis program that provides the knowledge base developer with a list of concept-specific phrases extracted from large textual corpora. Production and quality benefits resulting from the integration of machine aided indexing at CASI are discussed along with a number of secondary applications of NLD-derived systems including on-line spell checking and machine aided lexicography.
NASA Astrophysics Data System (ADS)
Yu, S. S.; Sun, Z. C.; Sun, L.; Wu, M. F.
2017-02-01
The objective of this paper is to study impervious surface extraction methods using remote sensing imagery and to monitor the spatiotemporal patterns of change in megacities. The megacity of Bombay was selected as the study area. First, pixel-based and object-oriented support vector machine (SVM) classification methods were used to produce land use/land cover (LULC) products of Bombay for 2010. The overall accuracy (OA) and overall Kappa (OK) of the pixel-based method were 94.97% and 0.96 with a running time of 78 minutes, while the OA and OK of the object-oriented method were 93.72% and 0.94 with a running time of only 17 s. Additionally, the OA and OK of the object-oriented method after post-classification improved to 95.8% and 0.94. Then, the dynamic impervious surfaces of Bombay for the period 1973-2015 were extracted and the urbanization pattern of Bombay was analysed. The results showed that both SVM classification methods could accomplish the impervious surface extraction, but the object-oriented method is the better choice. Bombay experienced rapid urban expansion during the past 42 years, implying dramatic urban sprawl in megacities of the developing countries along the One Belt and One Road (OBOR).
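Overall accuracy and the Kappa coefficient, the two measures reported above, are both derived from the classification confusion matrix. The following is a minimal sketch using the standard Cohen's kappa definition; the two-class matrix is hypothetical and is not taken from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] is the number of samples of reference class i
    that were assigned to class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    observed = sum(confusion[i][i] for i in range(k)) / total
    row_sums = [sum(confusion[i]) for i in range(k)]
    col_sums = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_sums[i] * col_sums[i] for i in range(k)) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical matrix: impervious vs. non-impervious validation samples.
confusion = [[45, 5],
             [10, 40]]
oa, kappa = overall_accuracy_and_kappa(confusion)
```

Because Kappa subtracts chance agreement, it is always at most equal to the overall accuracy expressed as a fraction for the same matrix.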
Kronholm, Scott C.; Capel, Paul D.
2015-01-01
Quantifying the relative contributions of different sources of water to a stream hydrograph is important for understanding the hydrology and water quality dynamics of a given watershed. To compare the performance of two methods of hydrograph separation, a graphical program [baseflow index (BFI)] and an end-member mixing analysis that used high-resolution specific conductance measurements (SC-EMMA) were used to estimate daily and average long-term slowflow additions of water to four small, primarily agricultural streams with different dominant sources of water (natural groundwater, overland flow, subsurface drain outflow, and groundwater from irrigation). Because the result of hydrograph separation by SC-EMMA is strongly related to the choice of slowflow and fastflow end-member values, a sensitivity analysis was conducted based on the various approaches reported in the literature to inform the selection of end-members. There were substantial discrepancies between the BFI and SC-EMMA, and neither method produced reasonable results for all four streams. Streams that had a small difference in the SC of slowflow compared with fastflow or did not have a monotonic relationship between streamflow and stream SC posed a challenge to the SC-EMMA method. The utility of the graphical BFI program was limited in the stream that had only gradual changes in streamflow. The results of this comparison suggest that the two methods may be quantifying different sources of water. Even though both methods are easy to apply, they should be applied with consideration of the streamflow and/or SC characteristics of a stream, especially where anthropogenic water sources (irrigation and subsurface drainage) are present.
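A two-end-member separation based on specific conductance (SC) rests on a simple mass balance: the stream SC is a flow-weighted mix of the slowflow and fastflow end-member SC values. The sketch below illustrates that balance only; it is not the authors' code, and the end-member values, units and function name are hypothetical.

```python
def slowflow_fraction(sc_stream, sc_slow, sc_fast):
    """Fraction of streamflow attributed to slowflow by two-component SC mixing.

    Solves sc_stream = f * sc_slow + (1 - f) * sc_fast for f,
    then clips f to the physically meaningful range [0, 1].
    """
    if sc_slow == sc_fast:
        raise ValueError("end-members must differ for the separation to be defined")
    f = (sc_stream - sc_fast) / (sc_slow - sc_fast)
    return min(1.0, max(0.0, f))

# Hypothetical values: SC in microsiemens per centimeter, flow in cubic meters per second.
sc_slow, sc_fast = 600.0, 100.0
streamflow = 2.0
fraction = slowflow_fraction(350.0, sc_slow, sc_fast)
slowflow = fraction * streamflow
```

The denominator shows why streams with a small SC difference between slowflow and fastflow are difficult for this method: small measurement errors in stream SC translate into large changes in the estimated fraction.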
The Automatic Measuring Machines and Ground-Based Astrometry
NASA Astrophysics Data System (ADS)
Sergeeva, T. P.
The introduction of automatic measuring machines into astronomical research a little more than a quarter of a century ago substantially increased the range and scale of projects that astronomers have been able to undertake since then. During that time, there have been dozens of photographic sky surveys, which have covered the whole sky more than once. Owing to the high accuracy and speed of automatic measuring machines, photographic astrometry has been able to produce high-precision catalogs such as CpC2. Investigations of the structure and kinematics of the stellar components of our Galaxy have been revolutionized in the last decade by the advent of automated plate measuring machines. But in an age of rapidly evolving electronic detectors and space-based catalogs expected soon, one could think that the twilight of astronomical photography has arrived. In opposition to that point of view, astronomers such as D. Monet (U.S.N.O.), L. G. Taff (STScI), M. K. Tsvetkov (IA BAS) and others have argued for several ways in which photographic astronomy can evolve. One of them is stated as: "...special efforts must be taken to extract useful information from the photographic archives before the plates degrade and the technology required to measure them disappears". Another is the minimization of the systematic errors of ground-based star catalogs by employing suitable reduction techniques and sufficiently dense and precise space-based reference star catalogs. In addition, the use of emulsions with higher resolution and quantum efficiency, such as Tech Pan, and some of the new methods of processing digitized information hold great promise for future deep (B<25) surveys (Bland-Hawthorn et al. 1993, AJ, 106, 2154). Thus, not only is intensive use of all existing automatic measuring machines apparently needed, but the design, development and deployment of a new generation of portable, mobile scanners is also very necessary. The
NASA Astrophysics Data System (ADS)
Borgen, M.; Spencer, R. G.; Mann, P. J.; Vonk, J. E.; Bulygina, E. B.; Holmes, R. M.
2012-12-01
Terrigenous dissolved organic matter (DOM) has historically been thought to be refractory as it is mobilized into and transported through Arctic fluvial networks. However, a growing body of evidence suggests that this DOM, largely leached from vegetation, soils, and litter during the annual freshet, is highly biolabile. This study examined DOM leached from these dominant endmembers of the Kolyma River watershed in the Siberian Arctic. As leachates progressed through time, measurements of dissolved organic carbon (DOC), optical parameters to assess DOM composition, and biodegradation incubations were undertaken. This suite of measurements allowed examination of the rate and composition of leached DOC into the aquatic system and quantification of the biolability of the DOM from the diverse range of endmembers examined. Of all the endmembers, vascular plants leached the greatest amount of DOC, and results will be presented relating DOC concentration and DOM composition to initial source material. Furthermore, controls on DOM biolability, enzymatic activity, and the ultimate fate of terrigenous DOC in Siberian fluvial systems will be discussed.
Research on intelligent machine self-perception method based on LSTM
NASA Astrophysics Data System (ADS)
Wang, Qiang; Cheng, Tao
2018-05-01
In this paper, we exploit the strengths of LSTM in feature extraction and in processing high-dimensional, complex nonlinear data, and apply it to the autonomous perception of intelligent machines. Compared with the traditional multi-layer neural network, this model has memory and can handle time-series information of any length. Since the multi-physical-domain signals of processing machines have a certain timing relationship, and there is a contextual relationship between successive states, using this deep learning method to realize the self-perception of intelligent processing machines offers strong versatility and adaptability. The experimental results show that the proposed method clearly improves the sensing accuracy under various working conditions of the intelligent machine, and that the algorithm can support self-perception in intelligent processing machines.
Ye, Qing; Pan, Hao; Liu, Changhua
2015-01-01
This research proposes a novel framework for simultaneous failure diagnosis of final drives, comprising feature extraction, training of paired diagnostic models, generation of a decision threshold, and recognition of simultaneous failure modes. In the feature extraction module, wavelet packet transform and fuzzy entropy are adopted to reduce noise interference and extract representative features of each failure mode. Single-failure samples are used to construct probability classifiers based on the paired sparse Bayesian extreme learning machine, which is trained only on single failure modes and offers the high generalization and sparsity of the sparse Bayesian learning approach. To generate the optimal decision threshold, which converts the probability output of the classifiers into final simultaneous failure modes, this research proposes using samples containing both single and simultaneous failure modes together with a grid search method, which is superior to traditional techniques in global optimization. Compared with other frequently used diagnostic approaches based on support vector machines and probabilistic neural networks, experimental results based on the F1-measure verify that the diagnostic accuracy and efficiency of the proposed framework, which are crucial for simultaneous failure diagnosis, are superior to those of the existing approaches. PMID:25722717
Chen, Zhenyu; Li, Jianping; Wei, Liwei
2007-10-01
Recently, gene expression profiling using microarray techniques has been shown to be a promising tool for improving the diagnosis and treatment of cancer. Gene expression data contain a high level of noise and an overwhelming number of genes relative to the number of available samples, which poses a great challenge for machine learning and statistical techniques. Support vector machine (SVM) has been successfully used to classify gene expression data of cancer tissue. In the medical field, it is crucial to give the user a transparent decision process. How to explain the computed solutions and present the extracted knowledge is a main obstacle for SVM. A multiple kernel support vector machine (MK-SVM) scheme, consisting of feature selection, rule extraction and prediction modeling, is proposed to improve the explanation capacity of SVM. In this scheme, we show that the feature selection problem can be translated into an ordinary multiple-parameter learning problem. A shrinkage approach, 1-norm based linear programming, is proposed to obtain the sparse parameters and the corresponding selected features. We propose a novel rule extraction approach that uses the information provided by the separating hyperplane and support vectors to improve the generalization capacity and comprehensibility of rules and to reduce the computational complexity. Two public gene expression datasets, the leukemia dataset and the colon tumor dataset, are used to demonstrate the performance of this approach. Using a small number of selected genes, MK-SVM achieves encouraging classification accuracy: more than 90% for both datasets. Moreover, very simple rules with linguistic labels are extracted. The rule sets have high diagnostic power because of their good classification performance.
NASA Astrophysics Data System (ADS)
Deng, Chengbin; Wu, Changshan
2013-12-01
Urban impervious surface information is essential for urban and environmental applications at the regional/national scales. As a popular image processing technique, spectral mixture analysis (SMA) has rarely been applied to coarse-resolution imagery due to the difficulty of deriving endmember spectra using traditional endmember selection methods, particularly within heterogeneous urban environments. To address this problem, we derived endmember signatures through a least squares solution (LSS) technique with known abundances of sample pixels, and integrated these endmember signatures into SMA for mapping large-scale impervious surface fraction. In addition, with the same sample set, we carried out objective comparative analyses among SMA (i.e. fully constrained and unconstrained SMA) and machine learning (i.e. Cubist regression tree and Random Forests) techniques. Analysis of results suggests three major conclusions. First, with the extrapolated endmember spectra from stratified random training samples, the SMA approaches performed relatively well, as indicated by small MAE values. Second, Random Forests yields more reliable results than Cubist regression tree, and its accuracy is improved with increased sample sizes. Finally, comparative analyses suggest a tentative guide for selecting an optimal approach for large-scale fractional imperviousness estimation: unconstrained SMA might be a favorable option with a small number of samples, while Random Forests might be preferred if a large number of samples are available.
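The least squares step described in this abstract can be written compactly under the usual linear mixing assumption. The notation below is assumed for illustration and is not taken from the paper: X holds the sample pixel spectra, A the known abundances, and E the unknown endmember signatures. The closed form holds only when A has full column rank.

```latex
% Assumed notation (not from the source):
%   X : n-by-b matrix of sample pixel spectra (n samples, b bands)
%   A : n-by-k matrix of known abundances (k endmembers)
%   E : k-by-b matrix of endmember signatures to be estimated
X = A E + \varepsilon,
\qquad
\hat{E} = \arg\min_{E} \lVert X - A E \rVert_F^2
        = (A^{\top} A)^{-1} A^{\top} X
```

The estimated signatures can then be held fixed while the same model is solved for abundances in unlabelled pixels, with or without the non-negativity and sum-to-one constraints.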
NASA Astrophysics Data System (ADS)
Wang, J.; Feng, B.
2016-12-01
Impervious surface area (ISA) has long been studied as an important input into moisture flux models. In general, ISA impedes groundwater recharge, increases stormflow/flood frequency, and alters in-stream and riparian habitats. Urban areas are recognized as among the richest ISA environments. Urban ISA mapping assists flood prevention and urban planning. Hyperspectral imagery (HI), with its ability to detect subtle spectral signatures, is an ideal candidate for urban ISA mapping. Mapping ISA from HI involves endmember (EM) selection. The high degree of spatial and spectral heterogeneity of the urban environment makes this task very difficult: a compromise is needed between the degree of automation and the representativeness of the method. The study tested one manual and two semi-automatic EM selection strategies. The manual and the first semi-automatic methods have been widely used in EM selection. The second semi-automatic EM selection method is rather new and has only been proposed for moderate spatial resolution satellite imagery. The manual method visually selected the EM candidates from eight landcover types in the original image. The first semi-automatic method chose the EM candidates using a threshold over the pixel purity index (PPI) map. The second semi-automatic method used the triangle shape of the HI scatter plot in the n-Dimension visualizer to identify the V-I-S (vegetation-impervious surface-soil) EM candidates: the pixels located at the triangle vertices. The initial EM candidates from the three methods were further refined by three indexes (EM average RMSE, minimum average spectral angle, and count-based EM selection) to generate three spectral libraries, which were used to classify the test image. Spectral angle mapper was applied. Accuracy reports for the classification results were generated. The overall accuracy is 85% for the manual method, 81% for the PPI method, and 87% for the V-I-S method. The V-I-S EM selection method performs best in
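The spectral angle mapper step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the study's implementation: the four-band spectra and the `library` dictionary are hypothetical values chosen only to make the example run.

```python
import math

def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_angle)

def classify(pixel, library):
    """Return the library label whose spectrum has the smallest angle to the pixel."""
    return min(library, key=lambda label: spectral_angle(pixel, library[label]))

# Hypothetical 4-band reflectance spectra, for illustration only.
library = {
    "vegetation": [0.05, 0.08, 0.04, 0.50],
    "impervious": [0.20, 0.22, 0.24, 0.26],
    "soil": [0.10, 0.15, 0.22, 0.30],
}

# A brighter copy of the vegetation spectrum: same shape, twice the magnitude.
bright_vegetation = [0.10, 0.16, 0.08, 1.00]
label = classify(bright_vegetation, library)
```

Because the angle ignores vector length, the brighter pixel is still matched to the vegetation entry, which is the property that makes the measure tolerant of illumination differences.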
NASA Astrophysics Data System (ADS)
Petermann, Eric; Knöller, Kay; Stollberg, Reiner; Scholten, Jan; Rocha, Carlos; Weiß, Holger; Schubert, Michael
2017-04-01
Submarine groundwater discharge (SGD) plays a crucial role in the water quality of coastal waters due to associated fluxes of nutrients, organic compounds and/or heavy metals. Thus, the quantification of SGD is essential for evaluating the vulnerability of coastal water bodies with regard to groundwater pollution as well as for understanding the matter cycles of the connected water bodies. Here, we present a scientific approach for quantifying discharge of fresh groundwater (GWf) and recirculated seawater (SWrec), including its short-term temporal dynamics, into the tide-affected Knysna estuary, South Africa. For a time-variant end-member mixing analysis we conducted time-series observations of radon (222Rn) and salinity within the estuary over two tidal cycles in combination with estimates of the related end-members for seawater, river water, GWf and SWrec. The mixing analysis was treated as a constrained optimization problem for finding the end-member mixing ratio that simultaneously fits the observed data for radon and salinity best at every time step. Uncertainty of each mixing ratio was quantified by Monte Carlo simulations of the optimization procedure considering uncertainty in end-member characterization. Results reveal the highest GWf and SWrec fractions in the estuary during peak low tide, with averages of 0.8 % and 1.4 %, respectively. Further, we calculated a radon mass balance that revealed a daily radon flux of 4.8 × 10⁸ Bq into the estuary, equivalent to a GWf discharge of 29,000 m³/d (9,000-59,000 m³/d for the 25th-75th percentile range) and a SWrec discharge of 80,000 m³/d (45,000-130,000 m³/d for the 25th-75th percentile range). The uncertainty of SGD reflects the end-member uncertainty, i.e. the spatial heterogeneity of groundwater composition. The presented approach allows the calculation of mixing ratios of multiple uncertain end-members for time-series measurements of multiple parameters. Linking these results with a tracer mass balance allows conversion
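The idea of fitting mixing fractions under constraints can be illustrated with a small sketch. It is a simplification, not the authors' procedure: it uses three end-members instead of four so that two tracers plus the sum-to-one constraint determine the answer uniquely, it replaces a proper optimizer with a coarse grid search, and all tracer values and the `scales` normalization are hypothetical. The Monte Carlo uncertainty step is not shown.

```python
def mix(fractions, end_members):
    """Tracer values of a mixture: fraction-weighted sum over end-members."""
    n_tracers = len(end_members[0])
    return [sum(f * em[t] for f, em in zip(fractions, end_members))
            for t in range(n_tracers)]

def fit_fractions(observed, end_members, scales, step=0.01):
    """Grid search over non-negative fractions that sum to one (three end-members)."""
    n = round(1 / step)
    best, best_cost = None, float("inf")
    for i in range(n + 1):
        for j in range(n + 1 - i):
            fractions = (i / n, j / n, (n - i - j) / n)
            modelled = mix(fractions, end_members)
            # Scale each tracer so radon and salinity contribute comparably.
            cost = sum(((m - o) / s) ** 2
                       for m, o, s in zip(modelled, observed, scales))
            if cost < best_cost:
                best, best_cost = fractions, cost
    return best

# Hypothetical end-members as (radon in Bq/m3, salinity), for illustration only.
end_members = [
    (5.0, 35.0),     # seawater
    (50.0, 0.1),     # river water
    (5000.0, 0.5),   # fresh groundwater
]
scales = (100.0, 1.0)  # assumed typical magnitudes used to normalize the misfit

# Synthetic observation built from known fractions, then recovered by the fit.
true_fractions = (0.90, 0.08, 0.02)
observed = mix(true_fractions, end_members)
estimate = fit_fractions(observed, end_members, scales)
```

Repeating the fit for each time step, and again for end-member values drawn from their uncertainty distributions, would give the time series and spread that the abstract describes.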
Vane Pump Casing Machining of Dumpling Machine Based on CAD/CAM
NASA Astrophysics Data System (ADS)
Huang, Yusen; Li, Shilong; Li, Chengcheng; Yang, Zhen
An automatic dumpling forming machine, also called a dumpling machine, makes dumplings through mechanical motions. This paper adopts a stuffing delivery mechanism featuring an improved, specially designed vane pump casing, which contributes to the formation of dumplings. Its 3D modeling in Pro/E software, machining process planning, milling path optimization, simulation based on UG and compilation of the post program are introduced and verified. The results indicate that adoption of CAD/CAM offers firms the potential to pursue new innovative strategies.
Machinability of nickel based alloys using electrical discharge machining process
NASA Astrophysics Data System (ADS)
Khan, M. Adam; Gokul, A. K.; Bharani Dharan, M. P.; Jeevakarthikeyan, R. V. S.; Uthayakumar, M.; Thirumalai Kumaran, S.; Duraiselvam, M.
2018-04-01
High-temperature materials such as nickel-based alloys and austenitic steel are frequently used for manufacturing critical aero-engine turbine components. Literature on conventional and unconventional machining of steel materials has been abundant over the past three decades. However, machining superalloys remains challenging because of their inherent properties, and these materials are difficult to cut with conventional processes. This research therefore focuses on an unconventional machining process for nickel alloys. Inconel 718 and Monel 400 are the two candidate materials used for the electrical discharge machining (EDM) process. The investigation prepares a blind hole using a copper electrode of 6 mm diameter. Electrical parameters are varied to produce the plasma spark for the diffusion process, and machining time is held constant when calculating the experimental results for both materials. The influence of process parameters on the tool wear mechanism and material removal is considered in the proposed experimental design. During machining, the tool is prone to discharging more material because of the high-energy plasma spark and the eddy current effect. The surface morphology of the machined surface was observed with high-resolution FE-SEM. Fused electrode material was found as spherical clumps on the machined surface. Surface roughness was also measured from the surface profile using a profilometer. It is confirmed that there is no deviation and that precise roundness of drilling is maintained.
Learning Machine, Vietnamese Based Human-Computer Interface.
ERIC Educational Resources Information Center
Northwest Regional Educational Lab., Portland, OR.
The sixth session of IT@EDU98 consisted of seven papers on the topic of the learning machine--Vietnamese based human-computer interface, and was chaired by Phan Viet Hoang (Informatics College, Singapore). "Knowledge Based Approach for English Vietnamese Machine Translation" (Hoang Kiem, Dinh Dien) presents the knowledge base approach,…
Forsyth, Alexander W; Barzilay, Regina; Hughes, Kevin S; Lui, Dickson; Lorenz, Karl A; Enzinger, Andrea; Tulsky, James A; Lindvall, Charlotta
2018-06-01
Clinicians document cancer patients' symptoms in free-text format within electronic health record visit notes. Although symptoms are critically important to quality of life and often herald clinical status changes, computational methods to assess the trajectory of symptoms over time are woefully underdeveloped. To create machine learning algorithms capable of extracting patient-reported symptoms from free-text electronic health record notes. The data set included 103,564 sentences obtained from the electronic clinical notes of 2695 breast cancer patients receiving paclitaxel-containing chemotherapy at two academic cancer centers between May 1996 and May 2015. We manually annotated 10,000 sentences and trained a conditional random field model to predict words indicating an active symptom (positive label), absence of a symptom (negative label), or no symptom at all (neutral label). Sentences labeled by human coder were divided into training, validation, and test data sets. Final model performance was determined on 20% test data unused in model development or tuning. The final model achieved precision of 0.82, 0.86, and 0.99 and recall of 0.56, 0.69, and 1.00 for positive, negative, and neutral symptom labels, respectively. The most common positive symptoms were pain, fatigue, and nausea. Machine-based labeling of 103,564 sentences took two minutes. We demonstrate the potential of machine learning to gather, track, and analyze symptoms experienced by cancer patients during chemotherapy. Although our initial model requires further optimization to improve the performance, further model building may yield machine learning methods suitable to be deployed in routine clinical care, quality improvement, and research applications. Copyright © 2018 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
2011-01-01
Background: Cardiotocography (CTG) is the most widely used tool for fetal surveillance. The visual analysis of fetal heart rate (FHR) traces largely depends on the expertise and experience of the clinician involved. Several approaches have been proposed for the effective interpretation of FHR. In this paper, a new approach for FHR feature extraction based on empirical mode decomposition (EMD) is proposed, which was used along with support vector machine (SVM) for the classification of FHR recordings as 'normal' or 'at risk'. Methods: The FHR were recorded from 15 subjects at a sampling rate of 4 Hz and a dataset consisting of 90 randomly selected records of 20 minutes duration was formed from these. All records were labelled as 'normal' or 'at risk' by two experienced obstetricians. A training set was formed by 60 records, the remaining 30 left as the testing set. The standard deviations of the EMD components are input as features to a support vector machine (SVM) to classify FHR samples. Results: For the training set, a five-fold cross validation test resulted in an accuracy of 86% whereas the overall geometric mean of sensitivity and specificity was 94.8%. The Kappa value for the training set was 0.923. Application of the proposed method to the testing set (30 records) resulted in a geometric mean of 81.5%. The Kappa value for the testing set was 0.684. Conclusions: Based on the overall performance of the system it can be stated that the proposed methodology is a promising new approach for the feature extraction and classification of FHR signals. PMID:21244712
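The two summary statistics reported in this abstract, the geometric mean of sensitivity and specificity and the Kappa value, can be computed from a 2x2 confusion matrix as sketched below. The counts are hypothetical and do not reproduce the study's figures, and it is an assumption here that Kappa refers to Cohen's kappa between predicted and reference labels.

```python
import math

def geometric_mean(tp, tn, fp, fn):
    """Geometric mean of sensitivity (recall on positives) and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)

def cohen_kappa(tp, tn, fp, fn):
    """Agreement between predicted and reference labels, corrected for chance."""
    total = tp + tn + fp + fn
    observed = (tp + tn) / total
    # Chance agreement from the marginal totals of each class.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical counts for a 30-record test set ('at risk' is the positive class).
tp, tn, fp, fn = 8, 18, 2, 2
g_mean = geometric_mean(tp, tn, fp, fn)
kappa = cohen_kappa(tp, tn, fp, fn)
```

The geometric mean penalizes a classifier that does well on only one class, which matters when 'at risk' records are the minority.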
NASA Astrophysics Data System (ADS)
Varatharajan, I.; D'Amore, M.; Maturilli, A.; Helbert, J.; Hiesinger, H.
2017-12-01
The Mercury Radiometer and Thermal Imaging Spectrometer (MERTIS) payload of the ESA/JAXA BepiColombo mission to Mercury will map the thermal emissivity in the wavelength range of 7-14 μm at a spatial resolution of 500 m/pixel [1]. Mercury was also imaged in the same wavelength range using Boston University's Mid-Infrared Spectrometer and Imager (MIRSI) mounted on the NASA Infrared Telescope Facility (IRTF) on Mauna Kea, Hawaii, with a minimum spatial coverage of 400-600 km per spectrum, which blends all rocks, minerals, and soil types [2]. Therefore, the study [2] used the quantitative deconvolution algorithm developed by [3] for spectral unmixing of this composite thermal emissivity spectrum from the telescope into the respective areal fractions of endmember spectra; however, the thermal emissivity of the endmembers used in [2] was derived by inverting reflectance measurements (Kirchhoff's law) of various samples measured at room temperature and pressure. For over a decade, the Planetary Spectroscopy Laboratory (PSL) at the Institute of Planetary Research (PF) at the German Aerospace Center (DLR) has facilitated thermal emissivity measurements under controlled and simulated surface conditions of Mercury by taking emissivity measurements at temperatures varying from 100 to 500°C under vacuum conditions, in support of the MERTIS payload. The measured thermal emissivity endmember spectral library therefore includes major silicates such as bytownite, anorthoclase, synthetic glass, olivine, enstatite, nepheline basanite, rocks like komatiite, tektite, Johnson Space Center lunar simulant (1A), and synthetic powdered sulfides including MgS, FeS, CaS, CrS, TiS, NaS, and MnS. Using such a specialized endmember spectral library created under Mercury's conditions significantly increases the accuracy of the deconvolution model results. In this study, we revisited the available telescope spectra and redeveloped the algorithm by [3] by only choosing the endmember spectral library created at PSL for unbiased model
NASA Astrophysics Data System (ADS)
Locock, Andrew J.; Mitchell, Roger H.
2018-04-01
Perovskite mineral oxides commonly exhibit extensive solid-solution, and are therefore classified on the basis of the proportions of their ideal end-members. A uniform sequence of calculation of the end-members is required if comparisons are to be made between different sets of analytical data. A Microsoft Excel spreadsheet has been programmed to assist with the classification and depiction of the minerals of the perovskite- and vapnikite-subgroups following the 2017 nomenclature of the perovskite supergroup recommended by the International Mineralogical Association (IMA). Compositional data for up to 36 elements are input into the spreadsheet as oxides in weight percent. For each analysis, the output includes the formula, the normalized proportions of 15 end-members, and the percentage of cations which cannot be assigned to those end-members. The data are automatically plotted onto the ternary and quaternary diagrams recommended by the IMA for depiction of perovskite compositions. Up to 200 analyses can be entered into the spreadsheet, which is accompanied by data calculated for 140 perovskite compositions compiled from the literature.
NASA Astrophysics Data System (ADS)
Fabian, Karl; Shcherbakov, Valeriy P.; Kosareva, Lina; Nourgaliev, Danis
2016-11-01
Acquisition curves of isothermal remanent magnetization for 1057 samples of core KDP-01 from Lake Hovsgul (Mongolia) are decomposed into three end-members using non-negative matrix factorization. The obtained mixing coefficients also decompose hysteresis loops, back-field, and strong-field thermomagnetic curves into their related end-member components. This proves that the end-members represent different mineralogical fractions of the Lake Hovsgul sedimentary environment. The method used for unmixing offers a new possibility to apply rock magnetism in paleoecological and paleoclimatic studies. For Lake Hovsgul, it indicates that a low-coercivity component with a covarying paramagnetic phase represents a coarse-grained magnetite fraction from terrigenous influx probably via fluvial transport. A second component with coercivities close to 50 mT is identified as a magnetite fraction related to magnetosomes of magnetotactic bacteria. The third component has coercivities near 85 mT and is identified as greigite of biotic or abiotic origin common in suboxic/anoxic sediments. Significant positive correlations between variations of intensity of all three mineralogical components along the core are found. A rapid drop in all end-member concentrations by more than one order of magnitude at about 20 m depth testifies to a major change of the environmental or geological conditions of Lake Hovsgul. It possibly is related to the onset of MIS 10 marking the termination of arid climate conditions. Short intervals of high productivity are characterized by an abundance of magnetite magnetosomes and may highlight glacial-interglacial transition intervals. For the rest of the core, greigite magnetization substantially exceeds that of magnetite, indicating a predominantly anoxic environment.
Overlaid caption extraction in news video based on SVM
NASA Astrophysics Data System (ADS)
Liu, Manman; Su, Yuting; Ji, Zhong
2007-11-01
Overlaid captions in news video often carry condensed semantic information that provides key cues for content-based video indexing and retrieval. However, extracting captions from video remains challenging because of complex backgrounds and low resolution. In this paper, we propose an effective overlaid caption extraction approach for news video. We first scan the video key frames using a small window, and then classify the blocks into text and non-text ones via support vector machine (SVM), with statistical features extracted from the gray level co-occurrence matrices, the LH and HL sub-band wavelet coefficients and the oriented edge intensity ratios. Finally, morphological filtering and projection profile analysis are employed to localize and refine the candidate caption regions. Experiments show high performance on four 30-minute news video programs.
Zhang, Yifan; Gao, Xunzhang; Peng, Xuan; Ye, Jiaqi; Li, Xiang
2018-05-16
High Resolution Range Profile (HRRP) recognition has attracted great attention in the field of Radar Automatic Target Recognition (RATR). However, traditional HRRP recognition methods fail to model high-dimensional sequential data efficiently and have poor noise tolerance. To deal with these problems, a novel stochastic neural network model named Attention-based Recurrent Temporal Restricted Boltzmann Machine (ARTRBM) is proposed in this paper. RTRBM is utilized to extract discriminative features and the attention mechanism is adopted to select major features. RTRBM models high-dimensional HRRP sequences efficiently because it can extract the temporal and spatial correlation between adjacent HRRPs. The attention mechanism is used in sequential data recognition tasks including machine translation and relation classification, where it makes the model pay more attention to the features most relevant to recognition. Therefore, the combination of RTRBM and the attention mechanism makes our model effective at extracting more internally related features and choosing the important parts of the extracted features. Additionally, the model performs well with noise-corrupted HRRP data. Experimental results on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset show that our proposed model outperforms other traditional methods, which indicates that ARTRBM extracts, selects, and utilizes the correlation information between adjacent HRRPs effectively and is suitable for high-dimensional or noise-corrupted data.
Elahian, Bahareh; Yeasin, Mohammed; Mudigoudar, Basanagoud; Wheless, James W; Babajani-Feremi, Abbas
2017-10-01
Using a novel technique based on phase locking value (PLV), we investigated the potential for features extracted from electrocorticographic (ECoG) recordings to serve as biomarkers to identify the seizure onset zone (SOZ). We computed the PLV between the phase of the amplitude of high gamma activity (80-150Hz) and the phase of lower frequency rhythms (4-30Hz) from ECoG recordings obtained from 10 patients with epilepsy (21 seizures). We extracted five features from the PLV and used a machine learning approach based on logistic regression to build a model that classifies electrodes as SOZ or non-SOZ. More than 96% of electrodes identified as the SOZ by our algorithm were within the resected area in six seizure-free patients. In four non-seizure-free patients, more than 31% of the identified SOZ electrodes by our algorithm were outside the resected area. In addition, we observed that the seizure outcome in non-seizure-free patients correlated with the number of non-resected SOZ electrodes identified by our algorithm. This machine learning approach, based on features extracted from the PLV, effectively identified electrodes within the SOZ. The approach has the potential to assist clinicians in surgical decision-making when pre-surgical intracranial recordings are utilized. Copyright © 2017 British Epilepsy Association. Published by Elsevier Ltd. All rights reserved.
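The phase locking value at the centre of this abstract has a short standard definition: the magnitude of the mean unit vector of the phase difference between two signals. The sketch below assumes the two phase series are already available (in the study they would come from filtered ECoG signals, a step not shown here); the example phase series are synthetic.

```python
import cmath
import math

def phase_locking_value(phase_a, phase_b):
    """|mean(exp(i * (phase_a - phase_b)))|, ranging from 0 (no locking) to 1."""
    vectors = [cmath.exp(1j * (a - b)) for a, b in zip(phase_a, phase_b)]
    return abs(sum(vectors) / len(vectors))

# Synthetic phases in radians, for illustration only.
phase_a = [0.1 * k for k in range(200)]

# Constant phase lag: the difference never changes, so locking is perfect.
locked = phase_locking_value(phase_a, [p - 0.5 for p in phase_a])

# Phase difference sweeping evenly around the circle: the unit vectors cancel.
sweep = [2 * math.pi * k / 100 for k in range(100)]
unlocked = phase_locking_value(sweep, [0.0] * 100)
```

Features such as the mean or variability of this value over time windows could then feed a classifier, which is the general pattern the abstract describes.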
Refining Automatically Extracted Knowledge Bases Using Crowdsourcing.
Li, Chunhua; Zhao, Pengpeng; Sheng, Victor S; Xian, Xuefeng; Wu, Jian; Cui, Zhiming
2017-01-01
Machine-constructed knowledge bases often contain noisy and inaccurate facts. There exists significant work in developing automated algorithms for knowledge base refinement. Automated approaches improve the quality of knowledge bases but are far from perfect. In this paper, we leverage crowdsourcing to improve the quality of automatically extracted knowledge bases. As human labelling is costly, an important research challenge is how we can use limited human resources to maximize the quality improvement for a knowledge base. To address this problem, we first introduce a concept of semantic constraints that can be used to detect potential errors and do inference among candidate facts. Then, based on semantic constraints, we propose rank-based and graph-based algorithms for crowdsourced knowledge refining, which judiciously select the most beneficial candidate facts to conduct crowdsourcing and prune unnecessary questions. Our experiments show that our method improves the quality of knowledge bases significantly and outperforms state-of-the-art automatic methods under a reasonable crowdsourcing cost.
Refining Automatically Extracted Knowledge Bases Using Crowdsourcing
Xian, Xuefeng; Cui, Zhiming
2017-01-01
Machine-constructed knowledge bases often contain noisy and inaccurate facts. There exists significant work in developing automated algorithms for knowledge base refinement. Automated approaches improve the quality of knowledge bases but are far from perfect. In this paper, we leverage crowdsourcing to improve the quality of automatically extracted knowledge bases. As human labelling is costly, an important research challenge is how we can use limited human resources to maximize the quality improvement for a knowledge base. To address this problem, we first introduce a concept of semantic constraints that can be used to detect potential errors and do inference among candidate facts. Then, based on semantic constraints, we propose rank-based and graph-based algorithms for crowdsourced knowledge refining, which judiciously select the most beneficial candidate facts to conduct crowdsourcing and prune unnecessary questions. Our experiments show that our method improves the quality of knowledge bases significantly and outperforms state-of-the-art automatic methods under a reasonable crowdsourcing cost. PMID:28588611
Information extraction from dynamic PS-InSAR time series using machine learning
NASA Astrophysics Data System (ADS)
van de Kerkhof, B.; Pankratius, V.; Chang, L.; van Swol, R.; Hanssen, R. F.
2017-12-01
Due to the increasing number of SAR satellites, with shorter repeat intervals and higher resolutions, SAR data volumes are exploding. Time series analyses of SAR data, i.e. Persistent Scatterer (PS) InSAR, enable the deformation monitoring of the built environment at an unprecedented scale, with hundreds of scatterers per km², updated weekly. Potential hazards, e.g. due to failure of aging infrastructure, can be detected at an early stage. Yet, this requires the operational data processing of billions of measurement points, over hundreds of epochs, updating this data set dynamically as new data come in, and testing whether points (start to) behave in an anomalous way. Moreover, the quality of PS-InSAR measurements is ambiguous and heterogeneous, which will yield false positives and false negatives. Such analyses are numerically challenging. Here we extract relevant information from PS-InSAR time series using machine learning algorithms. We cluster (group together) time series with similar behaviour, even though they may not be spatially close, such that the results can be used for further analysis. First we reduce the dimensionality of the dataset in order to be able to cluster the data, since applying clustering techniques to high-dimensional datasets often gives unsatisfactory results. Our approach is to apply t-distributed Stochastic Neighbor Embedding (t-SNE), a machine learning algorithm for dimensionality reduction of high-dimensional data to a 2D or 3D map, and cluster this result using Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The results show that we are able to detect and cluster time series with similar behaviour, which is the starting point for more extensive analysis into the underlying driving mechanisms. The results of the methods are compared to conventional hypothesis testing as well as a Self-Organising Map (SOM) approach. Hypothesis testing is robust and takes the stochastic nature of the observations into account
Kim, Dong Wook; Kim, Hwiyoung; Nam, Woong; Kim, Hyung Jun; Cha, In-Ho
2018-04-23
The aim of this study was to build and validate five types of machine learning models that can predict the occurrence of BRONJ associated with dental extraction in patients taking bisphosphonates for the management of osteoporosis. A retrospective review of the medical records was conducted to obtain cases and controls for the study. A total of 125 patients, consisting of 41 cases and 84 controls, were selected for the study. Five machine learning prediction algorithms, including a multivariable logistic regression model, decision tree, support vector machine, artificial neural network, and random forest, were implemented. The outputs of these models were compared with each other and also with conventional methods, such as serum CTX level. Area under the receiver operating characteristic (ROC) curve (AUC) was used to compare the results. The performance of machine learning models was significantly superior to conventional statistical methods and single predictors. The random forest model yielded the best performance (AUC = 0.973), followed by artificial neural network (AUC = 0.915), support vector machine (AUC = 0.882), logistic regression (AUC = 0.844), decision tree (AUC = 0.821), drug holiday alone (AUC = 0.810), and CTX level alone (AUC = 0.630). Machine learning methods showed superior performance in predicting BRONJ associated with dental extraction compared to conventional statistical methods using drug holiday and serum CTX level. Machine learning can thus be applied in a wide range of clinical studies. Copyright © 2017. Published by Elsevier Inc.
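Since the models in this abstract are ranked by AUC, a short sketch of how that metric can be computed may help. It uses the rank-based (Mann-Whitney) formulation; the `labels` and `scores` below are made-up toy values, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))

# Hypothetical classifier outputs: 1 = case, 0 = control.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls.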
NASA Astrophysics Data System (ADS)
Jia, Rui-Sheng; Sun, Hong-Mei; Peng, Yan-Jun; Liang, Yong-Quan; Lu, Xin-Ming
2017-07-01
Microseismic monitoring is an effective means for providing early warning of rock or coal dynamical disasters, and its first step is microseismic event detection, although low SNR microseismic signals often cannot effectively be detected by routine methods. To solve this problem, this paper presents permutation entropy and a support vector machine to detect low SNR microseismic events. First, an extraction method of signal features based on multi-scale permutation entropy is proposed by studying the influence of the scale factor on the signal permutation entropy. Second, the detection model of low SNR microseismic events based on the least squares support vector machine is built by performing a multi-scale permutation entropy calculation for the collected vibration signals, constructing a feature vector set of signals. Finally, a comparative analysis of the microseismic events and noise signals in the experiment proves that the different characteristics of the two can be fully expressed by using multi-scale permutation entropy. The detection model of microseismic events combined with the support vector machine, which has the features of high classification accuracy and fast real-time algorithms, can meet the requirements of online, real-time extractions of microseismic events.
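The feature-extraction step in this abstract can be sketched as follows. This is a generic multi-scale permutation entropy computation (coarse-graining followed by ordinal-pattern entropy), not the authors' implementation; the `order`, `delay`, and `scales` defaults are assumptions, and the least squares SVM stage is not shown.

```python
import math
from collections import Counter

def coarse_grain(signal, scale):
    """Average non-overlapping windows of length `scale`."""
    n = len(signal) // scale
    return [sum(signal[i * scale:(i + 1) * scale]) / scale for i in range(n)]

def permutation_entropy(signal, order=3, delay=1):
    """Permutation entropy normalized to [0, 1] by log(order!)."""
    n_patterns = len(signal) - (order - 1) * delay
    if n_patterns <= 0:
        raise ValueError("signal too short for the chosen order and delay")
    counts = Counter()
    for i in range(n_patterns):
        window = [signal[i + j * delay] for j in range(order)]
        # The ordinal pattern is the ranking of positions by value.
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        counts[pattern] += 1
    h = -sum((c / n_patterns) * math.log(c / n_patterns) for c in counts.values())
    return h / math.log(math.factorial(order))

def multiscale_permutation_entropy(signal, scales=(1, 2, 3), order=3):
    """One entropy value per scale; together they form a feature vector."""
    return [permutation_entropy(coarse_grain(signal, s), order) for s in scales]

# A steadily rising signal has a single ordinal pattern at every scale.
features = multiscale_permutation_entropy(list(range(30)))
```

In a detection pipeline of the kind described, a vector like `features` would be computed per signal window and passed to the classifier.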
A machine vision system for micro-EDM based on linux
NASA Astrophysics Data System (ADS)
Guo, Rui; Zhao, Wansheng; Li, Gang; Li, Zhiyong; Zhang, Yong
2006-11-01
Due to the high precision and good surface quality that it can give, Electrical Discharge Machining (EDM) is potentially an important process for the fabrication of micro-tools and micro-components. However, a number of issues remain unsolved before micro-EDM becomes a reliable process with repeatable results. To deal with the difficulties in on-line fabrication of micro electrodes and tool wear compensation, a micro-EDM machine vision system is developed with a Charge Coupled Device (CCD) camera, with an optical resolution of 1.61μm and an overall magnification of 113 to 729. Based on the Linux operating system, an image capturing program is developed with the V4L2 API, and an image processing program is developed using OpenCV. The contour of the micro electrodes can be extracted by means of the Canny edge detector. Through system calibration, the micro electrode diameter can be measured on-line. Experiments have been carried out to prove its performance, and the sources of measurement error are also analyzed.
NASA Astrophysics Data System (ADS)
Y Yang, M.; Wang, J.; Zhang, Q.
2017-07-01
Vegetation coverage is one of the most important indicators of ecological environment change, and is also an effective index for the assessment of land degradation and desertification. The dry-hot valley regions have sparse surface vegetation, and the spectral information about the vegetation in such regions is usually weakly represented in remote sensing data, so there are considerable limitations in applying the commonly used vegetation index method to calculate the vegetation coverage in the dry-hot valley regions. Therefore, in this paper, the Alternating Angle Minimum (AAM) algorithm, a deterministic model, is adopted to select endmembers for pixel unmixing of a MODIS image in order to extract the vegetation coverage, and an accuracy test is carried out using a Landsat TM image from the same period. The results show that, in the dry-hot valley regions with sparse vegetation, the AAM model has a high unmixing accuracy and the extracted vegetation coverage is close to the actual situation, so it is promising to apply the AAM model to the extraction of vegetation coverage in the dry-hot valley regions.
Improving the reliability of inverter-based welding machines
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schiedermayer, M.
1997-02-01
Although inverter-based welding power sources have been available since the late 1980s, many people hesitated to purchase them because of reliability issues. Unfortunately, their hesitancy had a basis, until now. Recent improvements give some inverters a reliability level that approaches that of traditional, transformer-based industrial welding machines, which have a failure rate of about 1%. Acceptance of inverter-based welding machines is important because, for many welding applications, they provide capabilities that solid-state, transformer-based machines cannot deliver. These advantages include enhanced pulsed gas metal arc welding (GMAW-P), lightweight portability, an ultrastable arc, and energy efficiency--all while producing highly aesthetic weld beads and delivering multiprocess capabilities.
Object Extraction in Cluttered Environments via a P300-Based IFCE
He, Huidong; Xian, Bin; Zeng, Ming; Zhou, Huihui; Niu, Linwei; Chen, Genshe
2017-01-01
One of the fundamental issues for robot navigation is to extract an object of interest from an image. The biggest challenges for extracting objects of interest are how to use a machine to model the objects in which a human is interested and extract them quickly and reliably under varying illumination conditions. This article develops a novel method for segmenting an object of interest in a cluttered environment by combining a P300-based brain computer interface (BCI) and an improved fuzzy color extractor (IFCE). The induced P300 potential identifies the corresponding region of interest and obtains the target of interest for the IFCE. The classification results not only represent the human mind but also deliver the associated seed pixel and fuzzy parameters to extract the specific objects in which the human is interested. Then, the IFCE is used to extract the corresponding objects. The results show that the IFCE delivers better performance than the BP network or the traditional FCE. The use of a P300-based IFCE provides a reliable solution for assisting a computer in identifying an object of interest within images taken under varying illumination intensities. PMID:28740505
Wang, Anran; Wang, Jian; Lin, Hongfei; Zhang, Jianhai; Yang, Zhihao; Xu, Kan
2017-12-20
Biomedical event extraction is a frontier domain in biomedical research. The two main subtasks of biomedical event extraction are trigger identification and argument detection, both of which can be considered classification problems. However, traditional state-of-the-art methods are based on support vector machines (SVM) with massive numbers of manually designed one-hot represented features, which require enormous work but lack semantic relations among words. In this paper, we propose a multiple distributed representation method for biomedical event extraction. The method combines context, consisting of dependency-based word embeddings, with task-based features represented in a distributed way as the input for training deep learning models. Finally, we use a softmax classifier to label the example candidates. The experimental results on the Multi-Level Event Extraction (MLEE) corpus show higher F-scores of 77.97% in trigger identification and 58.31% overall compared to the state-of-the-art SVM method. Our distributed representation method for biomedical event extraction avoids the problems of semantic gap and dimension disaster from traditional one-hot representation methods. The promising results demonstrate that our proposed method is effective for biomedical event extraction.
NASA Astrophysics Data System (ADS)
Peng, Chong; Wang, Lun; Liao, T. Warren
2015-10-01
Currently, chatter has become the critical factor in hindering machining quality and productivity in machining processes. To avoid cutting chatter, a new method based on dynamic cutting force simulation model and support vector machine (SVM) is presented for the prediction of chatter stability lobes. The cutting force is selected as the monitoring signal, and the wavelet energy entropy theory is used to extract the feature vectors. A support vector machine is constructed using the MATLAB LIBSVM toolbox for pattern classification based on the feature vectors derived from the experimental cutting data. Then combining with the dynamic cutting force simulation model, the stability lobes diagram (SLD) can be estimated. Finally, the predicted results are compared with existing methods such as zero-order analytical (ZOA) and semi-discretization (SD) method as well as actual cutting experimental results to confirm the validity of this new method.
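The feature-extraction idea in this abstract can be sketched with one common definition of wavelet energy entropy: decompose the signal, take the share of energy in each band, and compute the Shannon entropy of those shares. The Haar wavelet, the number of levels, and the function names are assumptions for illustration; the abstract does not state which wavelet or definition the authors used.

```python
import math

def haar_decompose(signal, levels):
    """Return detail bands (finest first) followed by the final approximation.

    Assumes the signal length is divisible by 2**levels.
    """
    approx = list(signal)
    bands = []
    for _ in range(levels):
        if len(approx) < 2:
            break
        evens, odds = approx[0::2], approx[1::2]
        detail = [(a - b) / math.sqrt(2) for a, b in zip(evens, odds)]
        approx = [(a + b) / math.sqrt(2) for a, b in zip(evens, odds)]
        bands.append(detail)
    bands.append(approx)
    return bands

def wavelet_energy_entropy(signal, levels=3):
    """Shannon entropy (in nats) of the relative energy per wavelet band."""
    energies = [sum(x * x for x in band) for band in haar_decompose(signal, levels)]
    total = sum(energies)
    if total == 0:
        return 0.0
    shares = [e / total for e in energies if e > 0]
    return -sum(p * math.log(p) for p in shares)

# A constant signal puts all energy in one band; an impulse spreads it out.
entropy_constant = wavelet_energy_entropy([1.0] * 8)
entropy_impulse = wavelet_energy_entropy([1.0, 0, 0, 0, 0, 0, 0, 0])
```

Low entropy indicates energy concentrated in few bands, while higher entropy indicates energy spread across scales; values like these, computed per force-signal segment, would form the classifier input.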
Machine-aided indexing at NASA
NASA Technical Reports Server (NTRS)
Silvester, June P.; Genuardi, Michael T.; Klingbiel, Paul H.
1994-01-01
This report describes the NASA Lexical Dictionary (NLD), a machine-aided indexing system used online at the National Aeronautics and Space Administration's Center for AeroSpace Information (CASI). This system automatically suggests a set of candidate terms from NASA's controlled vocabulary for any designated natural language text input. The system is comprised of a text processor that is based on the computational, nonsyntactic analysis of input text and an extensive knowledge base that serves to recognize and translate text-extracted concepts. The functions of the various NLD system components are described in detail, and production and quality benefits resulting from the implementation of machine-aided indexing at CASI are discussed.
Ibrahim, Wisam; Abadeh, Mohammad Saniee
2017-05-21
Protein fold recognition is an important problem in bioinformatics to predict three-dimensional structure of a protein. One of the most challenging tasks in protein fold recognition problem is the extraction of efficient features from the amino-acid sequences to obtain better classifiers. In this paper, we have proposed six descriptors to extract features from protein sequences. These descriptors are applied in the first stage of a three-stage framework, PCA-DELM-LDA, to extract feature vectors from the amino-acid sequences. Principal Component Analysis (PCA) has been implemented to reduce the number of extracted features. The extracted feature vectors have been used with original features to improve the performance of the Deep Extreme Learning Machine (DELM) in the second stage. Four new features have been extracted from the second stage and used in the third stage by Linear Discriminant Analysis (LDA) to classify the instances into 27 folds. The proposed framework is implemented on the independent and combined feature sets in SCOP datasets. The experimental results show that extracted feature vectors in the first stage could improve the performance of DELM in extracting new useful features in second stage. Copyright © 2017 Elsevier Ltd. All rights reserved.
Comparison of water extraction methods in Tibet based on GF-1 data
NASA Astrophysics Data System (ADS)
Jia, Lingjun; Shang, Kun; Liu, Jing; Sun, Zhongqing
2018-03-01
In this study, we compared four different water extraction methods with GF-1 data according to different water types in Tibet, including Support Vector Machine (SVM), Principal Component Analysis (PCA), Decision Tree Classifier based on False Normalized Difference Water Index (FNDWI-DTC), and PCA-SVM. The results show that all four methods can extract large water bodies, but only SVM and PCA-SVM can obtain satisfactory extraction results for small water bodies. The methods were evaluated by both overall accuracy (OAA) and Kappa coefficient (KC). The OAAs of PCA-SVM, SVM, FNDWI-DTC, and PCA are 96.68%, 94.23%, 93.99%, and 93.01%, and the KCs are 0.9308, 0.8995, 0.8962, and 0.8842, respectively, consistent with visual inspection. In summary, SVM is better for extracting narrow rivers and PCA-SVM is suitable for extracting water of various types. As for dark blue lakes, the methods using PCA can extract them more quickly and accurately.
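The two evaluation metrics named in this abstract, overall accuracy and the Kappa coefficient, can both be derived from a confusion matrix. The sketch below shows the standard formulas; the 2x2 matrix is a hypothetical water/non-water example, not data from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical validation counts: rows = reference, columns = classified.
confusion = [[45, 5],    # water
             [10, 40]]   # non-water
oaa, kc = overall_accuracy_and_kappa(confusion)
```

Kappa discounts the agreement expected by chance, which is why it is lower than the overall accuracy for the same matrix.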
Background Knowledge in Learning-Based Relation Extraction
ERIC Educational Resources Information Center
Do, Quang Xuan
2012-01-01
In this thesis, we study the importance of background knowledge in relation extraction systems. We not only demonstrate the benefits of leveraging background knowledge to improve the systems' performance but also propose a principled framework that allows one to effectively incorporate knowledge into statistical machine learning models for…
Simplex volume analysis for finding endmembers in hyperspectral imagery
NASA Astrophysics Data System (ADS)
Li, Hsiao-Chi; Song, Meiping; Chang, Chein-I.
2015-05-01
Using maximal simplex volume as an optimal criterion for finding endmembers is a common approach and has been widely studied in the literature. Interestingly, very little work has been reported on how simplex volume is calculated. It turns out that the issue of calculating simplex volume is much more complicated and involved than one might think. This paper investigates this issue from two different aspects, geometric structure and eigen-analysis. The geometric structure is derived from the simplex structure, whose volume can be calculated by multiplying its base by its height. On the other hand, eigen-analysis takes advantage of the Cayley-Menger determinant to calculate the simplex volume. The major issue with this approach arises when the matrix whose determinant is required is rank-deficient. To deal with this problem, two methods are generally considered. One is to perform data dimensionality reduction to make the matrix full rank. The drawback of this method is that the original volume is shrunk, and the volume found for a dimensionality-reduced simplex is not the real original simplex volume. Another is to use singular value decomposition (SVD) to find singular values for calculating simplex volume. The dilemma of this method is its instability in numerical calculations. This paper explores all three of these methods in simplex volume calculation. Experimental results show that the geometric structure-based method yields the most reliable simplex volume.
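The Cayley-Menger route mentioned in this abstract can be illustrated with a small sketch. It builds the bordered matrix of squared pairwise distances and applies the standard formula V^2 = (-1)^(n+1) det(CM) / (2^n (n!)^2) for an n-simplex. Exact rational arithmetic is used here only to sidestep floating-point issues in a toy example; this is an illustrative assumption and not one of the methods compared in the paper.

```python
import math
from fractions import Fraction

def determinant(matrix):
    """Exact determinant by Gaussian elimination over rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                      # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def simplex_volume(vertices):
    """Volume of the simplex spanned by the vertices, in any ambient dimension."""
    k = len(vertices)          # number of vertices
    n = k - 1                  # simplex dimension
    size = k + 1
    cm = [[Fraction(0)] * size for _ in range(size)]
    for i in range(1, size):
        cm[0][i] = cm[i][0] = Fraction(1)
    for i in range(k):
        for j in range(k):
            cm[i + 1][j + 1] = sum(
                (Fraction(a) - Fraction(b)) ** 2
                for a, b in zip(vertices[i], vertices[j])
            )
    vol_sq = Fraction((-1) ** (n + 1), 2 ** n * math.factorial(n) ** 2) * determinant(cm)
    return math.sqrt(vol_sq)

# A 3-4-5 right triangle embedded in a 5-band spectral space (hypothetical vertices).
area = simplex_volume([(0, 0, 0, 0, 0), (3, 0, 0, 0, 0), (0, 4, 0, 0, 0)])
```

Because the matrix depends only on pairwise distances, the simplex can live in a space with more bands than vertices without any prior dimensionality reduction.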
NASA Astrophysics Data System (ADS)
Attallah, Bilal; Serir, Amina; Chahir, Youssef; Boudjelal, Abdelwahhab
2017-11-01
Palmprint recognition systems are dependent on feature extraction. A method of feature extraction using higher discrimination information was developed to characterize palmprint images. In this method, two individual feature extraction techniques are applied to a discrete wavelet transform of a palmprint image, and their outputs are fused. The two techniques used in the fusion are the histogram of gradient and the binarized statistical image features. They are then evaluated using an extreme learning machine classifier before selecting a feature based on principal component analysis. Three palmprint databases, the Hong Kong Polytechnic University (PolyU) Multispectral Palmprint Database, Hong Kong PolyU Palmprint Database II, and the Delhi Touchless (IIDT) Palmprint Database, are used in this study. The study shows that our method effectively identifies and verifies palmprints and outperforms other methods based on feature extraction.
Hoang, Tuan; Tran, Dat; Huang, Xu
2013-01-01
Common Spatial Pattern (CSP) is a state-of-the-art method for feature extraction in Brain-Computer Interface (BCI) systems. However, it is designed for 2-class BCI classification problems. Current extensions of this method to multiple classes, based on subspace union and covariance matrix similarity, do not provide high performance. This paper presents a new approach to solving multi-class BCI classification problems by forming a subspace that resembles the original subspaces; the proposed method for this approach is called Approximation-based Common Principal Component (ACPC). We perform experiments on Dataset 2a used in BCI Competition IV to evaluate the proposed method. This dataset was designed for motor imagery classification with 4 classes. Preliminary experiments show that the proposed ACPC feature extraction method, when combined with Support Vector Machines, outperforms CSP-based feature extraction methods on the experimental dataset.
A Machine Learning-based Method for Question Type Classification in Biomedical Question Answering.
Sarrouti, Mourad; Ouatik El Alaoui, Said
2017-05-18
Biomedical question type classification is one of the important components of an automatic biomedical question answering system. The performance of the latter depends directly on the performance of its biomedical question type classification system, which consists of assigning a category to each question in order to determine the appropriate answer extraction algorithm. This study aims to automatically classify biomedical questions into one of the four categories: (1) yes/no, (2) factoid, (3) list, and (4) summary. In this paper, we propose a biomedical question type classification method based on machine learning approaches to automatically assign a category to a biomedical question. First, we extract features from biomedical questions using the proposed handcrafted lexico-syntactic patterns. Then, we feed these features to machine-learning algorithms. Finally, the class label is predicted using the trained classifiers. Experimental evaluations performed on large standard annotated datasets of biomedical questions, provided by the BioASQ challenge, demonstrated that our method exhibits significantly improved performance when compared to four baseline systems. The proposed method achieves a roughly 10-point increase over the best baseline in terms of accuracy. Moreover, the obtained results show that using handcrafted lexico-syntactic patterns as the feature provider for a support vector machine (SVM) leads to the highest accuracy of 89.40%. The proposed method can automatically classify BioASQ questions into one of the four categories: yes/no, factoid, list, and summary. Furthermore, the results demonstrated that our method produced the best classification performance compared to four baseline systems.
Low-power coprocessor for Haar-like feature extraction with pixel-based pipelined architecture
NASA Astrophysics Data System (ADS)
Luo, Aiwen; An, Fengwei; Fujita, Yuki; Zhang, Xiangyu; Chen, Lei; Jürgen Mattausch, Hans
2017-04-01
Intelligent analysis of image and video data requires image-feature extraction as an important processing capability for machine-vision realization. A coprocessor with pixel-based pipeline (CFEPP) architecture is developed for real-time Haar-like cell-based feature extraction. Synchronization with the image sensor’s pixel frequency and immediate usage of each input pixel for the feature-construction process avoid the dependence on memory-intensive conventional strategies like integral-image construction or frame buffers. One 180 nm CMOS prototype can extract the 1680-dimensional Haar-like feature vectors, applied in the speeded up robust features (SURF) scheme, using an on-chip memory of only 96 kb (kilobit). Additionally, a low power dissipation of only 43.45 mW at 1.8 V supply voltage is achieved during VGA video processing at 120 MHz frequency with more than 325 fps. The Haar-like feature-extraction coprocessor is further evaluated in the practical application of vehicle recognition, achieving the expected high accuracy, which is comparable to previous work.
Evaluating a Pivot-Based Approach for Bilingual Lexicon Extraction
Kim, Jae-Hoon; Kwon, Hong-Seok; Seo, Hyeong-Won
2015-01-01
A pivot-based approach for bilingual lexicon extraction is based on the similarity of context vectors represented by words in a pivot language like English. In this paper, in order to show the validity and usability of the pivot-based approach, we evaluate the approach together with two different methods for estimating context vectors: one estimates them from two parallel corpora based on word association between source words (resp., target words) and pivot words, and the other estimates them from two parallel corpora based on word alignment tools for statistical machine translation. Empirical results on two language pairs (Korean-Spanish and Korean-French) have shown that the pivot-based approach is very promising for resource-poor languages and confirm its validity and usability. Furthermore, our method also performs well for words with low frequency. PMID:25983745
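The core comparison in this abstract, matching source and target words through context vectors expressed over pivot-language words, can be sketched with cosine similarity. The association scores below are invented for illustration; how the vectors are estimated (word association or word alignment) is outside this sketch.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_translations(source_vec, target_vecs):
    """Return (similarity, target word) pairs, best candidate first."""
    return sorted(
        ((cosine(source_vec, vec), word) for word, vec in target_vecs.items()),
        reverse=True,
    )

# Hypothetical context vectors over English pivot words.
source_vec = {"house": 0.9, "home": 0.7, "water": 0.05}   # Korean source word
target_vecs = {
    "maison": {"house": 0.8, "home": 0.6},                  # French candidates
    "eau": {"water": 0.9, "river": 0.3},
}
ranked = rank_translations(source_vec, target_vecs)
```

Because both vectors are expressed in the pivot vocabulary, no direct source-target parallel corpus is needed for the comparison.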
Cepstrum based feature extraction method for fungus detection
NASA Astrophysics Data System (ADS)
Yorulmaz, Onur; Pearson, Tom C.; Çetin, A. Enis
2011-06-01
In this paper, a method for detection of popcorn kernels infected by a fungus is developed using image processing. The method is based on two dimensional (2D) mel and Mellin-cepstrum computation from popcorn kernel images. Cepstral features that were extracted from popcorn images are classified using Support Vector Machines (SVM). Experimental results show that high recognition rates of up to 93.93% can be achieved for both damaged and healthy popcorn kernels using 2D mel-cepstrum. The success rate for healthy popcorn kernels was found to be 97.41% and the recognition rate for damaged kernels was found to be 89.43%.
NASA Astrophysics Data System (ADS)
Liu, Shuang; Liu, Fei; Hu, Shaohua; Yin, Zhenbiao
The major power information of the main transmission system in machine tools (MTSMT) during the machining process includes the effective output power (i.e. cutting power), the input power and power loss of the mechanical transmission system, and the main motor power loss. This information is easy to obtain in the lab but difficult to evaluate in a manufacturing process. To solve this problem, a separation method is proposed here to extract the MTSMT power information during the machining process. In this method, the energy flow and the mathematical models of the major power information of the MTSMT during the machining process are set up first. Based on the mathematical models and the basic data tables obtained from experiments, the above-mentioned power information during the machining process can be separated just by measuring the real-time total input power of the spindle motor. The operating procedure of this method is also given.
Monitoring Urban Greenness Dynamics Using Multiple Endmember Spectral Mixture Analysis
Gan, Muye; Deng, Jinsong; Zheng, Xinyu; Hong, Yang; Wang, Ke
2014-01-01
Urban greenness is increasingly recognized as an essential constituent of the urban environment and can provide a range of services and enhance residents’ quality of life. Understanding the pattern of urban greenness and exploring its spatiotemporal dynamics would contribute valuable information for urban planning. In this paper, we investigated the pattern of urban greenness in Hangzhou, China, over the past two decades using time series Landsat-5 TM data obtained in 1990, 2002, and 2010. Multiple endmember spectral mixture analysis was used to derive vegetation cover fractions at the subpixel level. An RGB-vegetation fraction model, change intensity analysis and the concentric technique were integrated to reveal the detailed, spatial characteristics and the overall pattern of change in the vegetation cover fraction. Our results demonstrated the ability of multiple endmember spectral mixture analysis to accurately model the vegetation cover fraction in pixels despite the complex spectral confusion of different land cover types. The integration of multiple techniques revealed various changing patterns in urban greenness in this region. The overall vegetation cover has exhibited a drastic decrease over the past two decades, while no significant change occurred in the scenic spots that were studied. Meanwhile, a remarkable recovery of greenness was observed in the existing urban area. The increasing coverage of small green patches has played a vital role in the recovery of urban greenness. These changing patterns were more obvious during the period from 2002 to 2010 than from 1990 to 2002, and they revealed the combined effects of rapid urbanization and greening policies. This work demonstrates the usefulness of time series of vegetation cover fractions for conducting accurate and in-depth studies of the long-term trajectories of urban greenness to obtain meaningful information for sustainable urban development. PMID:25375176
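The idea behind multiple endmember spectral mixture analysis can be sketched in a reduced form: for each pixel, try several candidate endmember combinations and keep the one with the lowest reconstruction error. The sketch below is limited to two-endmember models with a sum-to-one constraint; the spectra, library names, and band count are hypothetical and not taken from the study.

```python
import math
from itertools import product

def unmix_two(pixel, em_a, em_b):
    """Fraction of em_a in the model pixel = f*em_a + (1-f)*em_b, plus the RMSE.

    Assumes em_a and em_b are distinct spectra.
    """
    diff = [a - b for a, b in zip(em_a, em_b)]
    denom = sum(d * d for d in diff)
    frac = sum((p - b) * d for p, b, d in zip(pixel, em_b, diff)) / denom
    frac = min(1.0, max(0.0, frac))        # keep the fraction physically meaningful
    model = [frac * a + (1 - frac) * b for a, b in zip(em_a, em_b)]
    rmse = math.sqrt(sum((p - m) ** 2 for p, m in zip(pixel, model)) / len(pixel))
    return frac, rmse

def mesma_pixel(pixel, vegetation_library, background_library):
    """Pick the vegetation/background pair that best explains the pixel."""
    best = None
    for (v_name, v), (b_name, b) in product(
        vegetation_library.items(), background_library.items()
    ):
        frac, rmse = unmix_two(pixel, v, b)
        if best is None or rmse < best[3]:
            best = (v_name, b_name, frac, rmse)
    return best

# Hypothetical 4-band reflectance spectra.
vegetation = {"grass": [0.05, 0.08, 0.04, 0.50], "tree": [0.03, 0.06, 0.03, 0.40]}
background = {"asphalt": [0.10, 0.10, 0.10, 0.12], "soil": [0.20, 0.25, 0.30, 0.35]}
pixel = [0.3 * g + 0.7 * s for g, s in zip(vegetation["grass"], background["soil"])]
veg_name, bg_name, veg_fraction, error = mesma_pixel(pixel, vegetation, background)
```

Letting the selected pair differ from pixel to pixel is what allows this family of methods to cope with spectral variability within a land cover class.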
Romaniello, Roberto; Leone, Alessandro; Tamborrino, Antonia
2017-01-01
An industrial prototype of a partial de-stoner machine was specified, built and implemented in an industrial olive oil extraction plant. The partial de-stoner machine was compared to the traditional mechanical crusher to assess its quantitative and qualitative performance. The extraction efficiency of the olive oil extraction plant, olive oil quality, sensory evaluation and rheological aspects were investigated. The results indicate that by using the partial de-stoner machine the extraction plant did not show statistical differences with respect to the traditional mechanical crushing. Moreover, the partial de-stoner machine allowed recovery of 60% of olive pits and the oils obtained were characterised by more marked green fruitiness, flavour and aroma than the oils produced using the traditional processing systems. The partial de-stoner machine removes the limitations of the traditional total de-stoner machine, opening new frontiers for the recovery of pits to be used as biomass. Moreover, the partial de-stoner machine permitted a significant reduction in the viscosity of the olive paste. © 2016 Society of Chemical Industry.
Tsipouras, Markos G; Giannakeas, Nikolaos; Tzallas, Alexandros T; Tsianou, Zoe E; Manousou, Pinelopi; Hall, Andrew; Tsoulos, Ioannis; Tsianos, Epameinondas
2017-03-01
Collagen proportional area (CPA) extraction in liver biopsy images provides the degree of fibrosis expansion in liver tissue, which is the most characteristic histological alteration in hepatitis C virus (HCV). Assessment of the fibrotic tissue is currently based on semiquantitative staging scores such as Ishak and Metavir. Since its introduction as a fibrotic tissue assessment technique, CPA calculation based on image analysis techniques has proven to be more accurate than semiquantitative scores. However, CPA has yet to reach everyday clinical practice, since the lack of standardized and robust methods for computerized image analysis for CPA assessment has proven to be a major limitation. The current work introduces a three-stage fully automated methodology for CPA extraction based on machine learning techniques. Specifically, clustering algorithms have been employed for background-tissue separation, as well as for fibrosis detection in liver tissue regions, in the first and the third stage of the methodology, respectively. Due to the existence of several types of tissue regions in the image (such as blood clots, muscle tissue, structural collagen, etc.), classification algorithms have been employed to identify liver tissue regions and exclude all other non-liver tissue regions from CPA computation. For the evaluation of the methodology, 79 liver biopsy images have been employed, obtaining 1.31% mean absolute CPA error, with 0.923 concordance correlation coefficient. The proposed methodology is designed to (i) avoid manual threshold-based and region selection processes, widely used in similar approaches presented in the literature, and (ii) minimize CPA calculation time. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Oil Spill Detection along the Gulf of Mexico Coastline based on Airborne Imaging Spectrometer Data
NASA Astrophysics Data System (ADS)
Arslan, M. D.; Filippi, A. M.; Guneralp, I.
2013-12-01
The Deepwater Horizon oil spill in the Gulf of Mexico between April and July 2010 demonstrated the importance of synoptic oil-spill monitoring in coastal environments via remote-sensing methods. This study focuses on terrestrial oil-spill detection and thickness estimation based on hyperspectral images acquired along the coastline of the Gulf of Mexico. We use AVIRIS (Airborne Visible/Infrared Imaging Spectrometer) imaging spectrometer data collected over Bay Jimmy and Wilkinson Bay within Barataria Bay, Louisiana, USA during September 2010. We also employ field-based observations of the degree of oil accumulation along the coastline, as well as in situ measurements from the literature. As part of our proposed spectroscopic approach, we operate on atmospherically- and geometrically-corrected hyperspectral AVIRIS data to extract image-derived endmembers via Minimum Noise Fraction transform, Pixel Purity Index-generation, and n-dimensional visualization. Extracted endmembers are then used as input to endmember-mapping algorithms to yield fractional-abundance images and crisp classification images. We also employ Multiple Endmember Spectral Mixture Analysis (MESMA) for oil detection and mapping in order to enable the number and types of endmembers to vary on a per-pixel basis, in contrast to simple Spectral Mixture Analysis (SMA). MESMA thus better allows accounting for spectral variability of oil (e.g., due to varying oil thicknesses, states of degradation, and the presence of different oil types, etc.) and other materials, including soils and salt marsh vegetation of varying types, which may or may not be affected by the oil spill. A decision-tree approach is also utilized for comparison. Classification results do indicate that MESMA provides advantageous capabilities for mapping several oil-thickness classes for affected vegetation and soils along the Gulf of Mexico coastline, relative to the conventional approaches tested. Oil thickness-mapping results from MESMA
Extracting laboratory test information from biomedical text
Kang, Yanna Shen; Kayaalp, Mehmet
2013-01-01
Background: No previous study reported the efficacy of current natural language processing (NLP) methods for extracting laboratory test information from narrative documents. This study investigates the pathology informatics question of how accurately such information can be extracted from text with the current tools and techniques, especially machine learning and symbolic NLP methods. The study data came from a text corpus maintained by the U.S. Food and Drug Administration, containing a rich set of information on laboratory tests and test devices. Methods: The authors developed a symbolic information extraction (SIE) system to extract device and test specific information about four types of laboratory test entities: Specimens, analytes, units of measures and detection limits. They compared the performance of SIE and three prominent machine learning based NLP systems, LingPipe, GATE and BANNER, each implementing a distinct supervised machine learning method, hidden Markov models, support vector machines and conditional random fields, respectively. Results: Machine learning systems recognized laboratory test entities with moderately high recall, but low precision rates. Their recall rates were relatively higher when the number of distinct entity values (e.g., the spectrum of specimens) was very limited or when lexical morphology of the entity was distinctive (as in units of measures), yet SIE outperformed them with statistically significant margins on extracting specimen, analyte and detection limit information in both precision and F-measure. Its high recall performance was statistically significant on analyte information extraction. Conclusions: Despite its shortcomings against machine learning methods, a well-tailored symbolic system may better discern relevancy among a pile of information of the same type and may outperform a machine learning system by tapping into lexically non-local contextual information such as the document structure. PMID:24083058
NASA Astrophysics Data System (ADS)
Serrano, Rafael; González, Luis Carlos; Martín, Francisco Jesús
2009-11-01
Under the SENSOR-IA project, funded through the Order of Incentives to the Regional Technology Centers of the Council of Innovation, Science and Enterprise of Andalusia, an architecture for real-time optimization of a machining process through a rule-based expert system has been developed. The architecture consists of a sensor data acquisition and processing system (SATD) and a rule-based expert system (SE) that communicates with the SATD. The SE has been designed as an inference engine with an algorithm for effective action, using a modus ponens rule model of goal-oriented rules. The pilot test demonstrated that it is possible to govern the machining process in real time based on rules contained in an SE. The tests were done with approximate rules. Future work includes an exhaustive collection of data with different tool materials and geometries in a database to extract more precise rules.
Dictionary Based Machine Translation from Kannada to Telugu
NASA Astrophysics Data System (ADS)
Sindhu, D. V.; Sagar, B. M.
2017-08-01
Machine translation is the task of translating from one language to another. For languages with few linguistic resources, such as Kannada and Telugu, a dictionary-based approach is the best approach. This paper focuses on dictionary-based machine translation from Kannada to Telugu. The proposed methodology uses a dictionary to translate word by word, without much correlation of semantics between the words. The dictionary-based machine translation process has the following sub-processes: morph analyzer, dictionary, transliteration, transfer grammar and morph generator. As part of this work, a bilingual dictionary with 8000 entries is developed and a suffix mapping table at the tag level is built. The system is tested on children's stories. In the near future, the system can be further improved by defining transfer grammar rules.
Assessment of multi-wildfire occurrence data for machine learning based risk modelling
NASA Astrophysics Data System (ADS)
Lim, C. H.; Kim, M.; Kim, S. J.; Yoo, S.; Lee, W. K.
2017-12-01
Wildfires in East Asia are mainly caused by human activities, but extreme droughts, which have increased due to climate change, also cause wildfires and allow them to spread into large-scale fires. Accurate occurrence location data are required for modelling wildfire probability and risk. In South Korea, occurrence data surveyed by the KFS (Korea Forest Service) and MODIS (MODerate-resolution Imaging Spectroradiometer) satellite-based active fire data can be utilized. In this study, the two sorts of wildfire occurrence data were compared in order to select suitable occurrence data for machine learning based wildfire risk modelling. The machine-learning-based MaxEnt (Maximum Entropy) model is used for wildfire risk modelling, with the two types of occurrence data and socio-economic and climate-environment data as inputs. The results with KFS survey-based data showed a weak relationship with climate-environmental factors and revealed uncertainty in the coordinate information. The MODIS-based active fire data included detections outside the forests, and many spots did not match actual wildfires. To utilize MODIS-based active fire data, it was necessary to extract the forest area and use only high-confidence-level data. For KFS data, it was necessary to separate the analysis according to damage scale to improve the modelling accuracy. Ultimately, the best way to simulate wildfire risk is considered to be combining the two sorts of wildfire occurrence data to construct more accurate information.
The research on construction and application of machining process knowledge base
NASA Astrophysics Data System (ADS)
Zhao, Tan; Qiao, Lihong; Qie, Yifan; Guo, Kai
2018-03-01
In order to realize the application of knowledge in machining process design, from the perspective of knowledge in the application of computer aided process planning(CAPP), a hierarchical structure of knowledge classification is established according to the characteristics of mechanical engineering field. The expression of machining process knowledge is structured by means of production rules and the object-oriented methods. Three kinds of knowledge base models are constructed according to the representation of machining process knowledge. In this paper, the definition and classification of machining process knowledge, knowledge model, and the application flow of the process design based on the knowledge base are given, and the main steps of the design decision of the machine tool are carried out as an application by using the knowledge base.
A survey of machine readable data bases
NASA Technical Reports Server (NTRS)
Matlock, P.
1981-01-01
Forty-two of the machine readable data bases available to the technologist and researcher in the natural sciences and engineering are described and compared with the data bases and data base services offered by NASA.
NASA Astrophysics Data System (ADS)
OMEGA Science Team; Combe, J.-Ph.; Le Mouélic, S.; Sotin, C.; Gendrin, A.; Mustard, J. F.; Le Deit, L.; Launeau, P.; Bibring, J.-P.; Gondet, B.; Langevin, Y.; Pinet, P.; OMEGA Science Team
2008-05-01
The mineralogical composition of the Martian surface is investigated by a Multiple-Endmember Linear Spectral Unmixing Model (MELSUM) of the Observatoire pour la Minéralogie, l'Eau, les Glaces et l'Activité (OMEGA) imaging spectrometer onboard Mars Express. OMEGA has fully covered the surface of the red planet at medium to low resolution (2-4 km per pixel). Several areas have been imaged at a resolution up to 300 m per pixel. One difficulty in the data processing is to extract the mineralogical composition, since rocks are mixtures of several components. MELSUM is an algorithm that selects the best linear combination of spectra among the families of minerals available in a reference library. The best fit of the observed spectrum on each pixel is calculated by the same unmixing equation used in the classical Spectral Mixture Analysis (SMA). This study shows the importance of the choice of the input library, which contains in our case 24 laboratory spectra (endmembers) of minerals that cover the diversity of the mineral families that may be found on the Martian surface. The analysis is restricted to the 1.0-2.5 μm wavelength range. Grain size variations and atmospheric scattering by aerosols induce changes in overall albedo level and continuum slopes. Synthetic flat and pure slope spectra have therefore been included in the input mineral spectral endmembers library in order to take these effects into account. The selection process for the endmembers is a systematic exploration of whole set of combinations of four components plus the straight line spectra. When negative coefficients occur, the results are discarded. This strategy is successfully tested on the terrestrial Cuprite site (Nevada, USA), for which extensive ground observations exist. It is then applied to different areas on Mars including Syrtis Major, Aram Chaos and Olympia Undae near the North Polar Cap. MELSUM on Syrtis Major reveals a region dominated by mafic minerals, with the oldest crustal regions
Xu, Huo; Jiang, Yifan; Liu, Dengyou; Liu, Kai; Zhang, Yafeng; Yu, Suhong; Shen, Zhifa; Wu, Zai-Sheng
2018-06-29
The sensitive detection of cancer-related genes is of great significance for early diagnosis and treatment of human cancers, and previous isothermal amplification sensing systems were often based on the reuse of target DNA, the amplification of enzymatic products and the accumulation of reporting probes. However, no reporting probes are able to be transformed into target species and in turn initiate the signal of other probes. Herein we reported a simple, isothermal and highly sensitive homogeneous assay system for tumor suppressor p53 gene detection based on a new autonomous DNA machine, where the signaling probe, molecular beacon (MB), was able to execute the function similar to target DNA besides providing the common signal. In the presence of target p53 gene, the operation of DNA machine can be initiated, and cyclical nucleic acid strand-displacement polymerization (CNDP) and nicking/polymerization cyclical amplification (NPCA) occur, during which the MB was opened by target species and cleaved by restriction endonuclease. In turn, the cleaved fragments could activate the next signaling process as target DNA did. According to the functional similarity, the cleaved fragment was called twin target, and the corresponding fashion to amplify the signal was named twin target self-amplification. Utilizing this newly-proposed DNA machine, the target DNA could be detected down to 0.1 pM with a wide dynamic range (6 orders of magnitude) and single-base mismatched targets were discriminated, indicating a very high assay sensitivity and good specificity. In addition, the DNA machine was not only used to screen the p53 gene in complex biological matrix but also was capable of practically detecting genomic DNA p53 extracted from A549 cell line. This indicates that the proposed DNA machine holds the potential application in biomedical research and early clinical diagnosis. Copyright © 2018 Elsevier B.V. All rights reserved.
Scientific bases of human-machine communication by voice.
Schafer, R W
1995-01-01
The scientific bases for human-machine communication by voice are in the fields of psychology, linguistics, acoustics, signal processing, computer science, and integrated circuit technology. The purpose of this paper is to highlight the basic scientific and technological issues in human-machine communication by voice and to point out areas of future research opportunity. The discussion is organized around the following major issues in implementing human-machine voice communication systems: (i) hardware/software implementation of the system, (ii) speech synthesis for voice output, (iii) speech recognition and understanding for voice input, and (iv) usability factors related to how humans interact with machines. PMID:7479802
Knowledge-based vision and simple visual machines.
Cliff, D; Noble, J
1997-01-01
The vast majority of work in machine vision emphasizes the representation of perceived objects and events: it is these internal representations that incorporate the 'knowledge' in knowledge-based vision or form the 'models' in model-based vision. In this paper, we discuss simple machine vision systems developed by artificial evolution rather than traditional engineering design techniques, and note that the task of identifying internal representations within such systems is made difficult by the lack of an operational definition of representation at the causal mechanistic level. Consequently, we question the nature and indeed the existence of representations posited to be used within natural vision systems (i.e. animals). We conclude that representations argued for on a priori grounds by external observers of a particular vision system may well be illusory, and are at best place-holders for yet-to-be-identified causal mechanistic interactions. That is, applying the knowledge-based vision approach in the understanding of evolved systems (machines or animals) may well lead to theories and models that are internally consistent, computationally plausible, and entirely wrong. PMID:9304684
NASA Astrophysics Data System (ADS)
Leighs, J. A.; Halling-Brown, M. D.; Patel, M. N.
2018-03-01
The UK currently has a national breast cancer-screening program and images are routinely collected from a number of screening sites, representing a wealth of invaluable data that is currently under-used. Radiologists evaluate screening images manually and recall suspicious cases for further analysis such as biopsy. Histological testing of biopsy samples confirms the malignancy of the tumour, along with other diagnostic and prognostic characteristics such as disease grade. Machine learning is becoming increasingly popular for clinical image classification problems, as it is capable of discovering patterns in data otherwise invisible. This is particularly true when applied to medical imaging features; however clinical datasets are often relatively small. A texture feature extraction toolkit has been developed to mine a wide range of features from medical images such as mammograms. This study analysed a dataset of 1,366 radiologist-marked, biopsy-proven malignant lesions obtained from the OPTIMAM Medical Image Database (OMI-DB). Exploratory data analysis methods were employed to better understand extracted features. Machine learning techniques including Classification and Regression Trees (CART), ensemble methods (e.g. random forests), and logistic regression were applied to the data to predict the disease grade of the analysed lesions. Prediction scores of up to 83% were achieved; sensitivity and specificity of the models trained have been discussed to put the results into a clinical context. The results show promise in the ability to predict prognostic indicators from the texture features extracted and thus enable prioritisation of care for patients at greatest risk.
Entanglement-Based Machine Learning on a Quantum Computer
NASA Astrophysics Data System (ADS)
Cai, X.-D.; Wu, D.; Su, Z.-E.; Chen, M.-C.; Wang, X.-L.; Li, Li; Liu, N.-L.; Lu, C.-Y.; Pan, J.-W.
2015-03-01
Machine learning, a branch of artificial intelligence, learns from previous experience to optimize performance, which is ubiquitous in various fields such as computer sciences, financial analysis, robotics, and bioinformatics. A challenge is that machine learning with the rapidly growing "big data" could become intractable for classical computers. Recently, quantum machine learning algorithms [Lloyd, Mohseni, and Rebentrost, arXiv.1307.0411] were proposed which could offer an exponential speedup over classical algorithms. Here, we report the first experimental entanglement-based classification of two-, four-, and eight-dimensional vectors to different clusters using a small-scale photonic quantum computer, which are then used to implement supervised and unsupervised machine learning. The results demonstrate the working principle of using quantum computers to manipulate and classify high-dimensional vectors, the core mathematical routine in machine learning. The method can, in principle, be scaled to larger numbers of qubits, and may provide a new route to accelerate machine learning.
Arrhythmia Classification Based on Multi-Domain Feature Extraction for an ECG Recognition System.
Li, Hongqiang; Yuan, Danyang; Wang, Youxi; Cui, Dianyin; Cao, Lu
2016-10-20
Automatic recognition of arrhythmias is particularly important in the diagnosis of heart diseases. This study presents an electrocardiogram (ECG) recognition system based on multi-domain feature extraction to classify ECG beats. An improved wavelet threshold method for ECG signal pre-processing is applied to remove noise interference. A novel multi-domain feature extraction method is proposed; this method employs kernel-independent component analysis in nonlinear feature extraction and uses discrete wavelet transform to extract frequency domain features. The proposed system utilises a support vector machine classifier optimized with a genetic algorithm to recognize different types of heartbeats. An ECG acquisition experimental platform, in which ECG beats are collected as ECG data for classification, is constructed to demonstrate the effectiveness of the system in ECG beat classification. The presented system, when applied to the MIT-BIH arrhythmia database, achieves a high classification accuracy of 98.8%. Experimental results based on the ECG acquisition experimental platform show that the system obtains a satisfactory classification accuracy of 97.3% and is able to classify ECG beats efficiently for the automatic identification of cardiac arrhythmias.
Arrhythmia Classification Based on Multi-Domain Feature Extraction for an ECG Recognition System
Li, Hongqiang; Yuan, Danyang; Wang, Youxi; Cui, Dianyin; Cao, Lu
2016-01-01
Automatic recognition of arrhythmias is particularly important in the diagnosis of heart diseases. This study presents an electrocardiogram (ECG) recognition system based on multi-domain feature extraction to classify ECG beats. An improved wavelet threshold method for ECG signal pre-processing is applied to remove noise interference. A novel multi-domain feature extraction method is proposed; this method employs kernel-independent component analysis in nonlinear feature extraction and uses discrete wavelet transform to extract frequency domain features. The proposed system utilises a support vector machine classifier optimized with a genetic algorithm to recognize different types of heartbeats. An ECG acquisition experimental platform, in which ECG beats are collected as ECG data for classification, is constructed to demonstrate the effectiveness of the system in ECG beat classification. The presented system, when applied to the MIT-BIH arrhythmia database, achieves a high classification accuracy of 98.8%. Experimental results based on the ECG acquisition experimental platform show that the system obtains a satisfactory classification accuracy of 97.3% and is able to classify ECG beats efficiently for the automatic identification of cardiac arrhythmias. PMID:27775596
Takada, Kenji
2016-09-01
New approach for the diagnosis of extractions with neural network machine learning. Seok-Ki Jung and Tae-Woo Kim. Am J Orthod Dentofacial Orthop 2016;149:127-33. Not reported. Mathematical modeling. Copyright © 2016 Elsevier Inc. All rights reserved.
Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines
del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J.; Raboso, Mariano
2015-01-01
Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation—based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking—to reduce the dimensions of images—and binarization—to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements. PMID:26091392
Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines.
del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J; Raboso, Mariano
2015-06-17
Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation-based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking-to reduce the dimensions of images-and binarization-to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements.
The release of dissolved actinium to the ocean: A global comparison of different end-members
Geibert, W.; Charette, M.; Kim, G.; Moore, W.S.; Street, J.; Young, M.; Paytan, A.
2008-01-01
The measurement of short-lived 223Ra often involves a second measurement for supported activities, which represents 227Ac in the sample. Here we exploit this fact, presenting a set of 284 values on the oceanic distribution of 227Ac, which was collected when analyzing water samples for short-lived radium isotopes by the radium delayed coincidence counting system. The present work compiles 227Ac data from coastal regions all over the northern hemisphere, including values from ground water, from estuaries and lagoons, and from marine end-members. Deep-sea samples from a continental slope off Puerto Rico and from an active vent site near Hawaii complete the overview of 227Ac near its potential sources. The average 227Ac activities of nearshore marine end-members range from 0.4 dpm m⁻³ at the Gulf of Mexico to 3.0 dpm m⁻³ in the coastal waters of the Korean Strait. In analogy to 228Ra, we find the extension of adjacent shelf regions to play a substantial role for 227Ac activities, although less pronounced than for radium, due to its weaker shelf source. Based on previously published values, we calculate an open ocean 227Ac inventory of 1.35 × 10¹⁸ dpm 227Acex in the ocean, which corresponds to 37 moles, or 8.4 kg. This implies a flux of 127 dpm m⁻² y⁻¹ from the deep-sea floor. For the shelf regions, we obtain a global inventory of 227Ac of 4.5 × 10¹⁵ dpm, which cannot be converted directly into a flux value, as the regional loss term of 227Ac to the open ocean would have to be included. Ac has so far been considered to behave similarly to Ra in the marine environment, with the exception of a strong Ac source in the deep-sea due to 231Paex. Here, we present evidence of geochemical differences between Ac, which is retained in a warm vent system, and Ra, which is readily released [Moore, W.S., Ussler, W. and Paull, C.K., 2008-this issue. Short-lived radium isotopes in the Hawaiian margin: Evidence for large fluid fluxes through the Puna Ridge. Marine Chemistry
A Shellcode Detection Method Based on Full Native API Sequence and Support Vector Machine
NASA Astrophysics Data System (ADS)
Cheng, Yixuan; Fan, Wenqing; Huang, Wei; An, Jing
2017-09-01
Dynamic monitoring of program behavior is widely used to discriminate between benign programs and malware. The judgment is usually based on dynamic characteristics of a program, such as its API call sequence or API call frequency. The key innovation of this paper is to consider the full Native API sequence and use a support vector machine to detect shellcode. We also use a Markov chain to extract and digitize Native API sequence features. Our experimental results show that the method proposed in this paper has high accuracy and low detection rate.
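As a rough illustration of the Markov-chain feature step mentioned in this abstract, the sketch below estimates first-order transition probabilities from a call sequence and flattens them into a fixed-length numeric vector that a classifier such as an SVM could consume. The abstract does not describe the authors' exact construction, so the row-normalised encoding, the function name `markov_features`, and the sample API names are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def markov_features(sequence, vocabulary):
    """Flatten a first-order Markov transition matrix into a feature vector.

    Assumption: features are row-normalised transition probabilities between
    consecutive calls; the paper may use a different encoding.
    """
    # Count every observed (current call, next call) pair.
    pair_counts = Counter(zip(sequence, sequence[1:]))

    # Total number of outgoing transitions per source call.
    row_totals = Counter()
    for (src, _dst), n in pair_counts.items():
        row_totals[src] += n

    # Emit probabilities in a fixed (src, dst) order so vectors are comparable.
    features = []
    for src, dst in product(vocabulary, repeat=2):
        total = row_totals[src]
        features.append(pair_counts[(src, dst)] / total if total else 0.0)
    return features


# Hypothetical Native API names, used only to exercise the function.
vocab = ["NtOpenFile", "NtReadFile", "NtClose"]
trace = ["NtOpenFile", "NtReadFile", "NtReadFile", "NtClose"]
vector = markov_features(trace, vocab)
```

The vector length is the square of the vocabulary size, so a real Native API vocabulary would likely need pruning or a sparse representation before training.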
NASA Astrophysics Data System (ADS)
Jia, S.
2015-12-01
As an effective method of extracting land cover fractions based on spectral endmembers, spectral mixture analysis (SMA) has been applied using remotely sensed imagery in different spatial, temporal, and spectral resolutions. A number of studies focused on arid/semiarid ecosystems have used SMA to obtain the land cover fractions of GV, NPV/litter, and bare soil (BS) using MODIS reflectance products to understand ecosystem phenology, track vegetation dynamics, and evaluate the impact of major disturbances. However, several challenges remain in the application of SMA in studying ecosystem phenology, including obtaining high quality endmembers and increasing computational efficiency when considering long time series that cover a broad spatial extent. Okin (2007) proposes a variation of SMA, named relative spectral mixture analysis (RSMA), to address the latter challenge by calculating the relative change of the fractions of GV, NPV/litter, and BS compared with a baseline date. This approach assumes that the baseline image contains the spectral information of the bare soil that can be used as an endmember for spectral mixture analysis, though it is mixed with the spectral reflectance of other non-soil land cover types. Using the baseline image, one can obtain the change of fractions of GV, NPV/litter, BS, and snow compared with the baseline image. However, RSMA results depend on the selection of the baseline date and the fractional components on that date. In this study, we modified the strategy of implementing RSMA by introducing a step of obtaining a soil map as the baseline image using multiple-endmember SMA (MESMA) before applying RSMA. The fractions of land cover components from this modified RSMA are also validated using field observations from two study areas in the semiarid savanna and grassland of Queensland, Australia.
Stroke-model-based character extraction from gray-level document images.
Ye, X; Cheriet, M; Suen, C Y
2001-01-01
Global gray-level thresholding techniques such as Otsu's method, and local gray-level thresholding techniques such as edge-based segmentation or the adaptive thresholding method are powerful in extracting character objects from simple or slowly varying backgrounds. However, they are found to be insufficient when the backgrounds include sharply varying contours or fonts in different sizes. A stroke-model is proposed to depict the local features of character objects as double-edges in a predefined size. This model enables us to detect thin connected components selectively, while ignoring relatively large backgrounds that appear complex. Meanwhile, since the stroke width restriction is fully factored in, the proposed technique can be used to extract characters in predefined font sizes. To process large volumes of documents efficiently, a hybrid method is proposed for character extraction from various backgrounds. Using the measurement of class separability to differentiate images with simple backgrounds from those with complex backgrounds, the hybrid method can process documents with different backgrounds by applying the appropriate methods. Experiments on extracting handwriting from a check image, as well as machine-printed characters from scene images demonstrate the effectiveness of the proposed model.
Differentiation of Glioblastoma and Lymphoma Using Feature Extraction and Support Vector Machine.
Yang, Zhangjing; Feng, Piaopiao; Wen, Tian; Wan, Minghua; Hong, Xunning
2017-01-01
Differentiation of glioblastoma multiformes (GBMs) and lymphomas using multi-sequence magnetic resonance imaging (MRI) is an important task that is valuable for treatment planning. However, this task is a challenge because GBMs and lymphomas may have a similar appearance in MRI images. This similarity may lead to misclassification and could affect the treatment results. In this paper, we propose a semi-automatic method based on multi-sequence MRI to differentiate these two types of brain tumors. Our method consists of three steps: 1) the key slice is selected from 3D MRIs and regions of interest (ROIs) are drawn around the tumor region; 2) different features are extracted based on prior clinical knowledge and validated using a t-test; and 3) features that are helpful for classification are used to build an original feature vector and a support vector machine is applied to perform classification. In total, 58 GBM cases and 37 lymphoma cases are used to validate our method. A leave-one-out cross-validation strategy is adopted in our experiments. The global accuracy of our method was determined as 96.84%, which indicates that our method is effective for the differentiation of GBM and lymphoma and can be applied in clinical diagnosis. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Zhang, Jie; Xiao, Wendong; Zhang, Sen; Huang, Shoudong
2017-04-17
Device-free localization (DFL) is becoming one of the new technologies in wireless localization field, due to its advantage that the target to be localized does not need to be attached to any electronic device. In the radio-frequency (RF) DFL system, radio transmitters (RTs) and radio receivers (RXs) are used to sense the target collaboratively, and the location of the target can be estimated by fusing the changes of the received signal strength (RSS) measurements associated with the wireless links. In this paper, we will propose an extreme learning machine (ELM) approach for DFL, to improve the efficiency and the accuracy of the localization algorithm. Different from the conventional machine learning approaches for wireless localization, in which the above differential RSS measurements are trivially used as the only input features, we introduce the parameterized geometrical representation for an affected link, which consists of its geometrical intercepts and differential RSS measurement. Parameterized geometrical feature extraction (PGFE) is performed for the affected links and the features are used as the inputs of ELM. The proposed PGFE-ELM for DFL is trained in the offline phase and performed for real-time localization in the online phase, where the estimated location of the target is obtained through the created ELM. PGFE-ELM has the advantages that the affected links used by ELM in the online phase can be different from those used for training in the offline phase, and can be more robust to deal with the uncertain combination of the detectable wireless links. Experimental results show that the proposed PGFE-ELM can improve the localization accuracy and learning speed significantly compared with a number of the existing machine learning and DFL approaches, including the weighted K-nearest neighbor (WKNN), support vector machine (SVM), back propagation neural network (BPNN), as well as the well-known radio tomographic imaging (RTI) DFL approach.
Zhang, Jie; Xiao, Wendong; Zhang, Sen; Huang, Shoudong
2017-01-01
Device-free localization (DFL) is becoming one of the new technologies in wireless localization field, due to its advantage that the target to be localized does not need to be attached to any electronic device. In the radio-frequency (RF) DFL system, radio transmitters (RTs) and radio receivers (RXs) are used to sense the target collaboratively, and the location of the target can be estimated by fusing the changes of the received signal strength (RSS) measurements associated with the wireless links. In this paper, we will propose an extreme learning machine (ELM) approach for DFL, to improve the efficiency and the accuracy of the localization algorithm. Different from the conventional machine learning approaches for wireless localization, in which the above differential RSS measurements are trivially used as the only input features, we introduce the parameterized geometrical representation for an affected link, which consists of its geometrical intercepts and differential RSS measurement. Parameterized geometrical feature extraction (PGFE) is performed for the affected links and the features are used as the inputs of ELM. The proposed PGFE-ELM for DFL is trained in the offline phase and performed for real-time localization in the online phase, where the estimated location of the target is obtained through the created ELM. PGFE-ELM has the advantages that the affected links used by ELM in the online phase can be different from those used for training in the offline phase, and can be more robust to deal with the uncertain combination of the detectable wireless links. Experimental results show that the proposed PGFE-ELM can improve the localization accuracy and learning speed significantly compared with a number of the existing machine learning and DFL approaches, including the weighted K-nearest neighbor (WKNN), support vector machine (SVM), back propagation neural network (BPNN), as well as the well-known radio tomographic imaging (RTI) DFL approach. PMID:28420187
NASA Astrophysics Data System (ADS)
Cook, Peter G.; Rodellas, Valentí; Stieglitz, Thomas C.
2018-03-01
Tracer approaches to estimate both porewater exchange (the cycling of water between surface water and sediments, with zero net water flux) and groundwater inflow (the net flow of terrestrially derived groundwater into surface water) are commonly based on solute mass balances. However, this requires appropriate characterization of tracer end-member concentrations in exchanging or discharging water. Where either porewater exchange or groundwater inflow to surface water occur in isolation, then the water flux is easily estimated from the net tracer flux if the end-member is appropriately chosen. However, in most natural systems porewater exchange and groundwater inflow will occur concurrently. Our analysis shows that if groundwater inflow (Qg) and porewater exchange (Qp) mix completely before discharging to surface water, then the combined water flux (Qg + Qp) can be approximated by dividing the combined tracer flux by the difference between the porewater and surface water concentrations, (cp - c). If Qg and Qp do not mix prior to discharge, then (Qg + Qp) can only be constrained by minimum and maximum values. The minimum value is obtained by dividing the net tracer flux by the groundwater concentration, and the maximum is obtained by dividing by (cp - c). Dividing by the groundwater concentration gives a maximum value for Qg. If porewater exchange and groundwater outflow occur concurrently, then dividing the net tracer flux by (cp - c) will provide a minimum value for Qp. Use of multiple tracers, and spatial and temporal replication should provide a more complete picture of exchange processes and the extent of subsurface mixing.
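The bounds described in the abstract above can be sketched in a few lines of Python. This is only an illustration of the stated arithmetic, not the authors' code: the function name, argument names, and the numeric values are hypothetical, and consistent units are assumed throughout.

```python
def combined_flux_bounds(tracer_flux, c_groundwater, c_porewater, c_surface):
    """Bounds on the combined water flux (Qg + Qp) when Qg and Qp do not mix.

    Follows the abstract: the minimum is the net tracer flux divided by the
    groundwater concentration, and the maximum is the net tracer flux divided
    by (cp - c). All names and units here are illustrative assumptions.
    """
    if c_groundwater <= 0:
        raise ValueError("groundwater concentration must be positive")
    if c_porewater <= c_surface:
        raise ValueError("porewater concentration must exceed surface water concentration")
    q_min = tracer_flux / c_groundwater              # also the maximum value for Qg
    q_max = tracer_flux / (c_porewater - c_surface)  # fully mixed approximation
    return q_min, q_max


# Hypothetical values: tracer flux 1000, cg = 500, cp = 120, c = 20.
low, high = combined_flux_bounds(1000.0, 500.0, 120.0, 20.0)
print(low, high)
```

For the fully mixed case, the abstract's approximation of (Qg + Qp) corresponds to the second returned value alone.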
Feng, Zhichao; Rong, Pengfei; Cao, Peng; Zhou, Qingyu; Zhu, Wenwei; Yan, Zhimin; Liu, Qianyun; Wang, Wei
2018-04-01
To evaluate the diagnostic performance of machine-learning based quantitative texture analysis of CT images to differentiate small (≤ 4 cm) angiomyolipoma without visible fat (AMLwvf) from renal cell carcinoma (RCC). This single-institutional retrospective study included 58 patients with pathologically proven small renal mass (17 in AMLwvf and 41 in RCC groups). Texture features were extracted from the largest possible tumorous regions of interest (ROIs) by manual segmentation in preoperative three-phase CT images. Interobserver reliability and the Mann-Whitney U test were applied to select features preliminarily. Then support vector machine with recursive feature elimination (SVM-RFE) and synthetic minority oversampling technique (SMOTE) were adopted to establish discriminative classifiers, and the performance of classifiers was assessed. Of the 42 extracted features, 16 candidate features showed significant intergroup differences (P < 0.05) and had good interobserver agreement. An optimal feature subset including 11 features was further selected by the SVM-RFE method. The SVM-RFE+SMOTE classifier achieved the best performance in discriminating between small AMLwvf and RCC, with the highest accuracy, sensitivity, specificity and AUC of 93.9 %, 87.8 %, 100 % and 0.955, respectively. Machine learning analysis of CT texture features can facilitate the accurate differentiation of small AMLwvf from RCC. • Although conventional CT is useful for diagnosis of SRMs, it has limitations. • Machine-learning based CT texture analysis facilitate differentiation of small AMLwvf from RCC. • The highest accuracy of SVM-RFE+SMOTE classifier reached 93.9 %. • Texture analysis combined with machine-learning methods might spare unnecessary surgery for AMLwvf.
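As a rough illustration of the oversampling idea named in the abstract above, the following Python sketch generates synthetic minority-class samples by interpolating between a sample and one of its nearest minority-class neighbours, which is the core step of SMOTE. It is a simplified sketch under stated assumptions, not the study's pipeline: the function name, the parameters `k` and `seed`, and the toy points are hypothetical, and the SVM-RFE feature selection step is not shown.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points from a list of minority-class feature tuples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_index = rng.randrange(len(minority))
        base = minority[base_index]
        # k nearest minority neighbours of the chosen sample (excluding itself)
        others = [p for i, p in enumerate(minority) if i != base_index]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical two-feature minority class (four samples).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(smote_sketch(points, n_new=3))
```

Each synthetic point lies on the line segment between two existing minority samples, so the class is enlarged without duplicating cases exactly.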
Wen, Tingxi; Zhang, Zhongnan; Qiu, Ming; Zeng, Ming; Luo, Weizhen
2017-01-01
The computer mouse is an important human-computer interaction device, but patients with physical finger disabilities are unable to operate it. Surface EMG (sEMG) can be monitored by electrodes on the skin surface and reflects neuromuscular activity. Therefore, sEMG classification can be used to control limb auxiliary equipment and help physically disabled patients operate the mouse. The objective was to develop a new method to extract sEMG generated by finger motion and to apply novel features to classify sEMG. A window-based data acquisition method was presented to extract signal samples from sEMG electrodes. Afterwards, a two-dimensional matrix image based feature extraction method, which differs from the classical methods based on the time domain or frequency domain, was employed to transform signal samples into feature maps used for classification. In the experiments, sEMG data samples produced by the index and middle fingers at the click of a mouse button were separately acquired. Then, characteristics of the samples were analyzed to generate a feature map for each sample. Finally, machine learning classification algorithms (SVM, KNN, RBF-NN) were employed to classify these feature maps on a GPU. The study demonstrated that all classifiers can identify and classify sEMG samples effectively. In particular, the accuracy of the SVM classifier reached up to 100%. The signal separation method is convenient, efficient, and quick, and can effectively extract the sEMG samples produced by fingers. In addition, unlike the classical methods, the new method extracts features by appropriately enlarging the energy of the sample signals. The classical machine learning classifiers all performed well using these features.
Yang, Zhihao; Lin, Yuan; Wu, Jiajin; Tang, Nan; Lin, Hongfei; Li, Yanpeng
2011-10-01
Knowledge about protein-protein interactions (PPIs) unveils the molecular mechanisms of biological processes. However, the volume and content of published biomedical literature on protein interactions is expanding rapidly, making it increasingly difficult for interaction database curators to detect and curate protein interaction information manually. We present a multiple kernel learning-based approach for automatic PPI extraction from biomedical literature. The approach combines the following kernels: feature-based, tree, and graph and combines their output with Ranking support vector machine (SVM). Experimental evaluations show that the features in individual kernels are complementary and the kernel combined with Ranking SVM achieves better performance than those of the individual kernels, equal weight combination and optimal weight combination. Our approach can achieve state-of-the-art performance with respect to the comparable evaluations, with 64.88% F-score and 88.02% AUC on the AImed corpus. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Support vector machine learning-based fMRI data group analysis.
Wang, Ze; Childress, Anna R; Wang, Jiongjiong; Detre, John A
2007-07-15
To explore the multivariate nature of fMRI data and to consider the inter-subject brain response discrepancies, a multivariate and brain response model-free method is fundamentally required. Two such methods are presented in this paper by integrating a machine learning algorithm, the support vector machine (SVM), and the random effect model. Without any brain response modeling, SVM was used to extract a whole brain spatial discriminance map (SDM), representing the brain response difference between the contrasted experimental conditions. Population inference was then obtained through the random effect analysis (RFX) or permutation testing (PMU) on the individual subjects' SDMs. Applied to arterial spin labeling (ASL) perfusion fMRI data, SDM RFX yielded lower false-positive rates in the null hypothesis test and higher detection sensitivity for synthetic activations with varying cluster size and activation strengths, compared to the univariate general linear model (GLM)-based RFX. For a sensory-motor ASL fMRI study, both SDM RFX and SDM PMU yielded similar activation patterns to GLM RFX and GLM PMU, respectively, but with higher t values and cluster extensions at the same significance level. Capitalizing on the absence of temporal noise correlation in ASL data, this study also incorporated PMU in the individual-level GLM and SVM analyses accompanied by group-level analysis through RFX or group-level PMU. Providing inferences on the probability of being activated or deactivated at each voxel, these individual-level PMU-based group analysis methods can be used to threshold the analysis results of GLM RFX, SDM RFX or SDM PMU.
NASA Astrophysics Data System (ADS)
Fabian, Karl; Knies, Jochen; Kosareva, Lina; Nurgaliev, Danis
2017-04-01
Room temperature magnetic initial curves, upper hysteresis curves, acquisition curves of induced remanent magnetization (IRM), and backfield (BF) curves have been measured between -1.5 T and 1.5 T for more than 430 samples from Ocean Drilling Program (ODP) Hole 910C. The core was drilled in 556.4 m water depth on the southern Yermak Plateau (80°15.896'N, 6°35.430'E), NW Svalbard. In total, 507.4 m of sediments were cored, and average recovery was 57%, with 80% between 170 and 504.7 meter below seafloor (mbsf). For this study, the borehole was re-sampled between 150 mbsf and 504.7 mbsf for environmental magnetic, inorganic geochemical, and sedimentological analyses (443 samples). The lithology is mainly silty-clay with some enrichments of fine sands in the lower section (below 400 mbsf). For all samples, a Curie express balance was used to obtain the temperature dependence of induced magnetization in air at a heating rate of 100 °C/min up to a maximum temperature of 800 °C. The hysteresis curves were used to infer classical hysteresis parameters like saturation remanence (Mrs), saturation magnetization (Ms), remanence coercivity (Hcr) or coercivity (Hc). In addition several other parameters, like hysteresis energy, high-field slope or saturation field have been determined and help to characterize the down-core variation of the magnetic fractions. Acquisition curves of isothermal remanent magnetization are decomposed into endmembers using non-negative matrix factorization. The obtained mixing coefficients decompose hysteresis loops, back-field, thermomagnetic curves, geochemistry, and sedimentological parameters into their related endmember components. Down-core variation of the endmembers enables reconstruction of sediment transport processes and in-situ formation of magnetic mineral phases.
Experimental Machine Learning of Quantum States
NASA Astrophysics Data System (ADS)
Gao, Jun; Qiao, Lu-Feng; Jiao, Zhi-Qiang; Ma, Yue-Chi; Hu, Cheng-Qiu; Ren, Ruo-Jing; Yang, Ai-Lin; Tang, Hao; Yung, Man-Hong; Jin, Xian-Min
2018-06-01
Quantum information technologies provide promising applications in communication and computation, while machine learning has become a powerful technique for extracting meaningful structures in "big data." A crossover between quantum information and machine learning represents a new interdisciplinary area stimulating progress in both fields. Traditionally, a quantum state is characterized by quantum-state tomography, which is a resource-consuming process when scaled up. Here we experimentally demonstrate a machine-learning approach to construct a quantum-state classifier for identifying the separability of quantum states. We show that it is possible to experimentally train an artificial neural network to efficiently learn and classify quantum states, without the need of obtaining the full information of the states. We also show how adding a hidden layer of neurons to the neural network can significantly boost the performance of the state classifier. These results shed new light on how classification of quantum states can be achieved with limited resources, and represent a step towards machine-learning-based applications in quantum information processing.
Khellal, Atmane; Ma, Hongbin; Fei, Qing
2018-05-09
The success of Deep Learning models, notably convolutional neural networks (CNNs), makes them the favorable solution for object recognition systems in both visible and infrared domains. However, the lack of training data in the case of maritime ships research leads to poor performance due to the problem of overfitting. In addition, the back-propagation algorithm used to train CNN is very slow and requires tuning many hyperparameters. To overcome these weaknesses, we introduce a new approach fully based on Extreme Learning Machine (ELM) to learn useful CNN features and perform a fast and accurate classification, which is suitable for infrared-based recognition systems. The proposed approach combines an ELM based learning algorithm to train CNN for discriminative features extraction and an ELM based ensemble for classification. The experimental results on VAIS dataset, which is the largest dataset of maritime ships, confirm that the proposed approach outperforms the state-of-the-art models in term of generalization performance and training speed. For instance, the proposed model is up to 950 times faster than the traditional back-propagation based training of convolutional neural networks, primarily for low-level features extraction.
Zorman, Milan; Sánchez de la Rosa, José Luis; Dinevski, Dejan
2011-12-01
Symbol-based machine learning approaches are rarely used for image classification and recognition. In this paper we present such an approach, which we first applied to follicular lymphoma images. Lymphoma is a broad term encompassing a variety of cancers of the lymphatic system. Lymphoma is differentiated by the type of cell that multiplies and how the cancer presents itself. It is very important to get an exact diagnosis regarding lymphoma and to determine the treatments that will be most effective for the patient's condition. Our work focused on the identification of lymphomas by finding follicles in microscopy images provided by the Laboratory of Pathology in the University Hospital of Tenerife, Spain. We divided our work into two stages: in the first stage we did image pre-processing and feature extraction, and in the second stage we used different symbolic machine learning approaches for pixel classification. Symbolic machine learning approaches are often neglected when looking for image analysis tools. Although they are known for a very appropriate knowledge representation, they are also claimed to lack computational power. The results we obtained are very promising and show that symbolic approaches can be successful in image analysis applications.
NASA Astrophysics Data System (ADS)
Peng, Tsung-Ren; Zhan, Wen-Jun; Tong, Lun-Tao; Chen, Chi-Tsun; Liu, Tsang-Sen; Lu, Wan-Chung
2018-03-01
A study in eastern Taiwan evaluated the importance of montane water contribution (MC) to adjacent valley-plain groundwater (VPG) in a tectonic suture zone. The evaluation used a ternary natural-tracer-based end-member mixing analysis (EMMA). With this purpose, VPG and three end-member water samples of plain precipitation (PP), mountain-front recharge (MFR), and mountain-block recharge (MBR) were collected and analyzed for stable isotopic compositions (δ 2H and δ 18O) and chemical concentrations (electrical conductivity (EC) and Cl-). After evaluation, Cl- is deemed unsuitable for EMMA in this study, and the contribution fractions of respective end members derived by the δ 18O-EC pair are similar to those derived by the δ 2H-EC pair. EMMA results indicate that the MC, including MFR and MBR, contributes at least 70% (679 × 106 m3 water volume) of the VPG, significantly greater than the approximately 30% of PP contribution, and greater than the 20-50% in equivalent humid regions worldwide. The large MC is attributable to highly fractured strata and the steep topography of studied catchments caused by active tectonism. Furthermore, the contribution fractions derived by EMMA reflect the unique hydrogeological conditions in the respective study sub-regions. A region with a large MBR fraction is indicative of active lateral groundwater flow as a result of highly fractured strata in montane catchments. On the other hand, a region characterized by a large MFR fraction may possess high-permeability stream beds or high stream gradients. Those hydrogeological implications are helpful for water resource management and protection authorities of the studied regions.
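A ternary end-member mixing analysis of the kind described above amounts to solving three linear equations: the three contribution fractions sum to one, and each of two tracers in the mixture is the fraction-weighted sum of the end-member values. The Python sketch below illustrates that calculation. It is a minimal sketch, not the study's procedure: the function names are hypothetical and the tracer values are invented for illustration only.

```python
def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def emma_fractions(end_members, mixture):
    """Solve for three end-member fractions from two tracers.

    end_members: three (tracer_a, tracer_b) tuples, e.g. (delta-18O, EC).
    mixture:     (tracer_a, tracer_b) measured in the mixed water.
    """
    a = [[1.0, 1.0, 1.0],
         [em[0] for em in end_members],
         [em[1] for em in end_members]]
    rhs = [1.0, mixture[0], mixture[1]]
    d = _det3(a)
    if abs(d) < 1e-12:
        raise ValueError("end members are collinear in tracer space")
    fractions = []
    for col in range(3):  # Cramer's rule: swap one column for the right-hand side
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = rhs[r]
        fractions.append(_det3(m) / d)
    return fractions


# Invented values for PP, MFR, and MBR as (delta-18O in per mil, EC in uS/cm).
ends = [(-5.0, 100.0), (-8.0, 300.0), (-10.0, 500.0)]
print(emma_fractions(ends, mixture=(-7.9, 320.0)))
```

In this toy case the second and third fractions together play the role of the montane contribution (MFR plus MBR); fractions outside the range 0 to 1 would indicate that the mixture falls outside the triangle spanned by the end members.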
Initial planetary base construction techniques and machine implementation
NASA Technical Reports Server (NTRS)
Crockford, William W.
1987-01-01
Conceptual designs of (1) initial planetary base structures, and (2) an unmanned machine to perform the construction of these structures using materials local to the planet are presented. Rock melting is suggested as a possible technique to be used by the machine in fabricating roads, platforms, and interlocking bricks. Identification of problem areas in machine design and materials processing is accomplished. The feasibility of the designs is contingent upon favorable results of an analysis of the engineering behavior of the product materials. The analysis requires knowledge of several parameters for solution of the constitutive equations of the theory of elasticity. An initial collection of these parameters is presented which helps to define research needed to perform a realistic feasibility study. A qualitative approach to estimating power and mass lift requirements for the proposed machine is used which employs specifications of currently available equipment. An initial, unmanned mission scenario is discussed with emphasis on identifying uncompleted tasks and suggesting design considerations for vehicles and primitive structures which use the products of the machine processing.
Face recognition using total margin-based adaptive fuzzy support vector machines.
Liu, Yi-Hung; Chen, Yen-Ting
2007-01-01
This paper presents a new classifier called total margin-based adaptive fuzzy support vector machines (TAF-SVM) that deals with several problems that may occur in support vector machines (SVMs) when applied to face recognition. The proposed TAF-SVM not only solves the overfitting problem resulting from outliers by fuzzifying the penalty, but also corrects the skew of the optimal separating hyperplane caused by very imbalanced data sets by using a different cost algorithm. In addition, by introducing the total margin algorithm to replace the conventional soft margin algorithm, a lower generalization error bound can be obtained. These three functions are embodied in the traditional SVM so that the TAF-SVM is proposed and reformulated in both linear and nonlinear cases. Using two databases, the Chung Yuan Christian University (CYCU) multiview and the facial recognition technology (FERET) face databases, and using the kernel Fisher's discriminant analysis (KFDA) algorithm to extract discriminating face features, experimental results show that the proposed TAF-SVM is superior to SVM in terms of face-recognition accuracy. The results also indicate that the proposed TAF-SVM can achieve smaller error variances than SVM over a number of tests, so that better recognition stability can be obtained.
Machine learning-based diagnosis of melanoma using macro images.
Gautam, Diwakar; Ahmed, Mushtaq; Meena, Yogesh Kumar; Ul Haq, Ahtesham
2018-05-01
Cancer bears a poisoning threat to human society. Melanoma, the skin cancer, originates from skin layers and penetrates deep into subcutaneous layers. There exists an extensive research in melanoma diagnosis using dermatoscopic images captured through a dermatoscope. While designing a diagnostic model for general handheld imaging systems is an emerging trend, this article proposes a computer-aided decision support system for macro images captured by a general-purpose camera. General imaging conditions are adversely affected by nonuniform illumination, which further affects the extraction of relevant information. To mitigate it, we process an image to define a smooth illumination surface using the multistage illumination compensation approach, and the infected region is extracted using the proposed multimode segmentation method. The lesion information is numerated as a feature set comprising geometry, photometry, border series, and texture measures. The redundancy in feature set is reduced using information theory methods, and a classification boundary is modeled to distinguish benign and malignant samples using support vector machine, random forest, neural network, and fast discriminative mixed-membership-based naive Bayesian classifiers. Moreover, the experimental outcome is supported by hypothesis testing and boxplot representation for classification losses. The simulation results prove the significance of the proposed model that shows an improved performance as compared with competing arts. Copyright © 2017 John Wiley & Sons, Ltd.
NASA Technical Reports Server (NTRS)
Roberts, Dar A.; Green, Robert O.; Sabol, Donald E.; Adams, John B.
1993-01-01
Imaging spectrometry offers a new way of deriving ecological information about vegetation communities from remote sensing. Applications include derivation of canopy chemistry, measurement of column atmospheric water vapor and liquid water, improved detectability of materials, more accurate estimation of green vegetation cover and discrimination of spectrally distinct green leaf, non-photosynthetic vegetation (NPV: litter, wood, bark, etc.) and shade spectra associated with different vegetation communities. Much of our emphasis has been on interpreting Airborne Visible/Infrared Imaging Spectrometry (AVIRIS) data as spectral mixtures. Two approaches have been used, ranging from simple models, where the data are treated as a mixture of 3 to 4 laboratory/field measured spectra, known as reference endmembers (EM's), applied uniformly to the whole image, to more complex models where both the number of EM's and the types of EM's vary on a per-pixel basis. Where simple models are applied, materials such as NPV, which are spectrally similar to soils, can be discriminated on the basis of residual spectra. One key aspect is that the data are calibrated to reflectance and modeled as mixtures of reference EM's, permitting temporal comparison of EM fractions, independent of scene location or data type. In previous studies the calibration was performed using a modified-empirical line calibration, assuming a uniform atmosphere across the scene. In this study, a Modtran-based calibration approach was used to map liquid water and atmospheric water vapor and retrieve surface reflectance from three AVIRIS scenes acquired in 1992 over the Jasper Ridge Biological Preserve. The data were acquired on June 2nd, September 4th and October 6th. Reflectance images were analyzed as spectral mixtures of reference EM's using a simple 4 EM model. Atmospheric water vapor derived from Modtran was compared to elevation and community type. Liquid water was compared to the abundance of NPV, Shade and Green Vegetation
Competency-Based Education Curriculum for Machine Shop. Teacher's Guide.
ERIC Educational Resources Information Center
Associated Educational Consultants, Inc., Pittsburgh, PA.
This teacher's guide is designed to accompany the machine shop competency-based education curriculum for secondary students in West Virginia. It has been developed to facilitate use of the curriculum by instructors of machine shop programs. The teacher's guide contains the following material: an explanation of the curriculum and suggested usage; a…
RVC-CAL library for endmember and abundance estimation in hyperspectral image analysis
NASA Astrophysics Data System (ADS)
Lazcano López, R.; Madroñal Quintín, D.; Juárez Martínez, E.; Sanz Álvaro, C.
2015-10-01
Hyperspectral imaging (HI) collects information from across the electromagnetic spectrum, covering a wide range of wavelengths. Although this technology was initially developed for remote sensing and earth observation, its multiple advantages, such as high spectral resolution, have led to its application in other fields, such as cancer detection. However, this new field has specific requirements; for instance, it must meet strict timing constraints, since all the potential applications, such as surgical guidance or in vivo tumor detection, imply real-time requirements. Meeting these time requirements is a great challenge, as hyperspectral images generate extremely high volumes of data to process. Thus, some new research lines are studying new processing techniques, and the most relevant ones are related to system parallelization. In that line, this paper describes the construction of a new hyperspectral processing library for the RVC-CAL language, which is specifically designed for multimedia applications and allows multithreading compilation and system parallelization. This paper presents the development of the library functions required to implement two of the four stages of the hyperspectral imaging processing chain: endmember and abundance estimation. The results obtained show that the library achieves speedups of approximately 30% compared with existing software for hyperspectral image analysis; specifically, the endmember estimation step reaches an average speedup of 27.6%, which saves almost 8 seconds of execution time. The results also reveal some bottlenecks, such as the communication interfaces among the different actors, due to the volume of data to transfer. Finally, it is shown that the library considerably simplifies the implementation process. Thus, experimental results show the potential of an RVC-CAL library for analyzing hyperspectral images in real time, as it provides enough resources to study the system performance.
Machine-Learning Approach for Design of Nanomagnetic-Based Antennas
NASA Astrophysics Data System (ADS)
Gianfagna, Carmine; Yu, Huan; Swaminathan, Madhavan; Pulugurtha, Raj; Tummala, Rao; Antonini, Giulio
2017-08-01
We propose a machine-learning approach for design of planar inverted-F antennas with a magneto-dielectric nanocomposite substrate. It is shown that machine-learning techniques can be efficiently used to characterize nanomagnetic-based antennas by accurately mapping the particle radius and volume fraction of the nanomagnetic material to antenna parameters such as gain, bandwidth, radiation efficiency, and resonant frequency. A modified mixing rule model is also presented. In addition, the inverse problem is addressed through machine learning as well, where given the antenna parameters, the corresponding design space of possible material parameters is identified.
Quantum Neural Network Based Machine Translator for Hindi to English
Singh, V. P.; Chakraverty, S.
2014-01-01
This paper presents the machine learning based machine translation system for Hindi to English, which learns the semantically correct corpus. The quantum neural based pattern recognizer is used to recognize and learn the pattern of corpus, using the information of part of speech of individual word in the corpus, like a human. The system performs the machine translation using its knowledge gained during the learning by inputting the pair of sentences of Devnagri-Hindi and English. To analyze the effectiveness of the proposed approach, 2600 sentences have been evaluated during simulation and evaluation. The accuracy achieved on BLEU score is 0.7502, on NIST score is 6.5773, on ROUGE-L score is 0.9233, and on METEOR score is 0.5456, which is significantly higher in comparison with Google Translation and Bing Translation for Hindi to English Machine Translation. PMID:24977198
Quantum neural network based machine translator for Hindi to English.
Narayan, Ravi; Singh, V P; Chakraverty, S
2014-01-01
This paper presents the machine learning based machine translation system for Hindi to English, which learns the semantically correct corpus. The quantum neural based pattern recognizer is used to recognize and learn the pattern of corpus, using the information of part of speech of individual word in the corpus, like a human. The system performs the machine translation using its knowledge gained during the learning by inputting the pair of sentences of Devnagri-Hindi and English. To analyze the effectiveness of the proposed approach, 2600 sentences have been evaluated during simulation and evaluation. The accuracy achieved on BLEU score is 0.7502, on NIST score is 6.5773, on ROUGE-L score is 0.9233, and on METEOR score is 0.5456, which is significantly higher in comparison with Google Translation and Bing Translation for Hindi to English Machine Translation.
NASA Astrophysics Data System (ADS)
Fan, Fenglei; Deng, Yingbin
2015-04-01
Since publication, the authors have been advised of one prior methodological article not cited in our original paper. The missed reference is "Andreou, C., Karathanassi, V., 2012. A novel multiple endmember spectral mixture analysis using spectral angle distance. Geoscience and Remote Sensing Symposium (IGARSS) 2012 IEEE International, 4110-4113." This reference should be cited in Part 3 and Table 1, specifically in the first sentence of Part 3 and the last sentence of the caption of Table 1. The following should be added (in Part 3: According to the traditional MESMA and referring to MESMA-SAD method which is reported by Andreou and Karathanassi; in caption of Table 1: Based on the MESMA-SAD method of Andreou and Karathanassi). We apologize to the authors and workers concerned for this oversight and are pleased to acknowledge their contributions.
Pereira, Sérgio; Meier, Raphael; McKinley, Richard; Wiest, Roland; Alves, Victor; Silva, Carlos A; Reyes, Mauricio
2018-02-01
Machine learning systems are achieving better performance at the cost of becoming increasingly complex. However, because of that, they become less interpretable, which may cause some distrust by the end-user of the system. This is especially important as these systems are pervasively being introduced to critical domains, such as the medical field. Representation Learning techniques are general methods for automatic feature computation. Nevertheless, these techniques are regarded as uninterpretable "black boxes". In this paper, we propose a methodology to enhance the interpretability of automatically extracted machine learning features. The proposed system is composed of a Restricted Boltzmann Machine for unsupervised feature learning, and a Random Forest classifier, which are combined to jointly consider existing correlations between imaging data, features, and target variables. We define two levels of interpretation: global and local. The former is devoted to understanding if the system learned the relevant relations in the data correctly, while the latter is focused on predictions performed on a voxel- and patient-level. In addition, we propose a novel feature importance strategy that considers both imaging data and target variables, and we demonstrate the ability of the approach to leverage the interpretability of the obtained representation for the task at hand. We evaluated the proposed methodology in brain tumor segmentation and penumbra estimation in ischemic stroke lesions. We show the ability of the proposed methodology to unveil information regarding relationships between imaging modalities and extracted features and their usefulness for the task at hand. In both clinical scenarios, we demonstrate that the proposed methodology enhances the interpretability of automatically learned features, highlighting specific learning patterns that resemble how an expert extracts relevant data from medical images. Copyright © 2017 Elsevier B.V. All rights reserved.
Prediction on sunspot activity based on fuzzy information granulation and support vector machine
NASA Astrophysics Data System (ADS)
Peng, Lingling; Yan, Haisheng; Yang, Zhigang
2018-04-01
To analyze the range of sunspot activity, a combined method for forecasting the fluctuation range of sunspots based on fuzzy information granulation (FIG) and support vector machine (SVM) is put forward. First, FIG is employed to granulate the sample data and extract the valid information of each window, namely its minimum value, general average value, and maximum value. Second, a forecasting model is built for each of these values with SVM, and a cross method is used to optimize the model parameters. Finally, the fluctuation range of sunspots is forecasted with the optimized SVM model. A case study demonstrates that the model has high accuracy and can effectively predict the fluctuation of sunspots.
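The windowing step described in the abstract above can be illustrated with a short Python sketch that reduces each window of a series to a (minimum, average, maximum) triple. This is a deliberately simplified stand-in: the function name, the non-overlapping window choice, and the toy data are assumptions, and a full fuzzy information granulation would typically fit a fuzzy membership function to each window rather than take plain statistics. The SVM forecasting stage is not shown.

```python
def granulate(series, window):
    """Summarise each non-overlapping window as (minimum, average, maximum)."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    granules = []
    # Any trailing values that do not fill a complete window are ignored.
    for start in range(0, len(series) - window + 1, window):
        chunk = series[start:start + window]
        granules.append((min(chunk), sum(chunk) / len(chunk), max(chunk)))
    return granules


# Toy series standing in for sunspot numbers.
print(granulate([1, 2, 3, 4, 5, 6, 7], window=3))
```

Each of the three resulting sequences (minima, averages, maxima) could then be passed to a separate regression model to forecast the lower bound, typical level, and upper bound of the next window.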
Virtual Machine Language Controls Remote Devices
NASA Technical Reports Server (NTRS)
2014-01-01
Kennedy Space Center worked with Blue Sun Enterprises, based in Boulder, Colorado, to enhance the company's virtual machine language (VML) to control the instruments on the Regolith and Environment Science and Oxygen and Lunar Volatiles Extraction mission. Now the NASA-improved VML is available for crewed and uncrewed spacecraft, and has potential applications on remote systems such as weather balloons, unmanned aerial vehicles, and submarines.
Shim, Miseon; Hwang, Han-Jeong; Kim, Do-Won; Lee, Seung-Hwan; Im, Chang-Hwan
2016-10-01
Recently, an increasing number of researchers have endeavored to develop practical tools for diagnosing patients with schizophrenia using machine learning techniques applied to EEG biomarkers. Although a number of studies showed that source-level EEG features can potentially be applied to the differential diagnosis of schizophrenia, most studies have used only sensor-level EEG features such as ERP peak amplitude and power spectrum for machine learning-based diagnosis of schizophrenia. In this study, we used both sensor-level and source-level features extracted from EEG signals recorded during an auditory oddball task for the classification of patients with schizophrenia and healthy controls. EEG signals were recorded from 34 patients with schizophrenia and 34 healthy controls while each subject was asked to attend to oddball tones. Our results demonstrated higher classification accuracy when source-level features were used together with sensor-level features, compared to when only sensor-level features were used. In addition, the selected sensor-level features were mostly found in the frontal area, and the selected source-level features were mostly extracted from the temporal area, which coincide well with the well-known pathological region of cognitive processing in patients with schizophrenia. Our results suggest that our approach would be a promising tool for the computer-aided diagnosis of schizophrenia. Copyright © 2016 Elsevier B.V. All rights reserved.
A Fault Alarm and Diagnosis Method Based on Sensitive Parameters and Support Vector Machine
NASA Astrophysics Data System (ADS)
Zhang, Jinjie; Yao, Ziyun; Lv, Zhiquan; Zhu, Qunxiong; Xu, Fengtian; Jiang, Zhinong
2015-08-01
The extraction of fault features and the diagnostic techniques for reciprocating compressors are among the hot research topics in the field of reciprocating machinery fault diagnosis at present. A large number of feature extraction and classification methods have been widely applied in the related research, but the practical fault alarm and the accuracy of diagnosis have not been effectively improved. Developing feature extraction and classification methods to meet the requirements of typical fault alarm and automatic diagnosis in practical engineering is an urgent task. The typical mechanical faults of reciprocating compressors are presented in the paper, and the existing data of an online monitoring system are used to extract fault feature parameters of 15 types in total; the inner sensitive connection between faults and the feature parameters has been made clear by using the distance evaluation technique, and the sensitive characteristic parameters of different faults have been obtained. On this basis, a method based on fault feature parameters and support vector machine (SVM) is developed, which will be applied to practical fault diagnosis. A better ability of early fault warning has been proved by the experiment and the practical fault cases. Automatic classification of the fault alarm data using the SVM has obtained better diagnostic accuracy.
Mining protein function from text using term-based support vector machines
Rice, Simon B; Nenadic, Goran; Stapley, Benjamin J
2005-01-01
Background Text mining has spurred huge interest in the domain of biology. The goal of the BioCreAtIvE exercise was to evaluate the performance of current text mining systems. We participated in Task 2, which addressed assigning Gene Ontology terms to human proteins and selecting relevant evidence from full-text documents. We approached it as a modified form of the document classification task. We used a supervised machine-learning approach (based on support vector machines) to assign protein function and select passages that support the assignments. As classification features, we used a protein's co-occurring terms that were automatically extracted from documents. Results The results evaluated by curators were modest, and quite variable for different problems: in many cases we have relatively good assignment of GO terms to proteins, but the selected supporting text was typically non-relevant (precision spanning from 3% to 50%). The method appears to work best when a substantial set of relevant documents is obtained, while it works poorly on single documents and/or short passages. The initial results suggest that our approach can also mine annotations from text even when an explicit statement relating a protein to a GO term is absent. Conclusion A machine learning approach to mining protein function predictions from text can yield good performance only if sufficient training data is available, and a significant amount of supporting data is used for prediction. The most promising results are for combined document retrieval and GO term assignment, which calls for the integration of methods developed in BioCreAtIvE Task 1 and Task 2. PMID:15960835
Product Quality Modelling Based on Incremental Support Vector Machine
NASA Astrophysics Data System (ADS)
Wang, J.; Zhang, W.; Qin, B.; Shi, W.
2012-05-01
Incremental support vector machine (ISVM) is a new learning method developed in recent years based on the foundations of statistical learning theory. It is suitable for the problem of sequentially arriving field data and has been widely used for product quality prediction and production process optimization. However, traditional ISVM learning does not consider the quality of the incremental data, which may contain noise and redundant data; this affects the learning speed and accuracy to a great extent. In order to improve SVM training speed and accuracy, a modified incremental support vector machine (MISVM) is proposed in this paper. Firstly, the margin vectors are extracted according to the Karush-Kuhn-Tucker (KKT) condition; then the distance from the margin vectors to the final decision hyperplane is calculated to evaluate the importance of the margin vectors, and margin vectors are removed when their distance exceeds the specified value; finally, the original SVs and the remaining margin vectors are used to update the SVM. The proposed MISVM can not only eliminate unimportant samples such as noise samples, but can also preserve the important samples. The MISVM has been tested on two public data sets and one field data set of zinc coating weight in strip hot-dip galvanizing, and the results show that the proposed method can improve the prediction accuracy and the training speed effectively. Furthermore, it can provide the necessary decision support and analysis tools for automatic control of product quality, and can also be extended to other process industries, such as chemical and manufacturing processes.
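The distance-based pruning step can be sketched as follows. This is only an illustration under a simplifying assumption: a linear decision hyperplane given by a weight vector `w` and bias `b`, so the distance of a point `x` is |w·x + b| / ||w||. The abstract does not state the kernel or the threshold used, and the names `distance_to_hyperplane`, `prune_margin_vectors` and `max_distance` are hypothetical.

```python
import math

def distance_to_hyperplane(x, w, b):
    """Geometric distance of point x from the hyperplane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(score) / norm

def prune_margin_vectors(margin_vectors, w, b, max_distance):
    """Keep only the margin vectors whose distance does not exceed
    max_distance; the rest are treated as unimportant and dropped."""
    return [x for x in margin_vectors
            if distance_to_hyperplane(x, w, b) <= max_distance]

# Made-up hyperplane and candidate margin vectors.
w, b = (3.0, 4.0), -5.0
candidates = [(3.0, 4.0), (1.0, 1.0), (2.0, 0.0)]
kept = prune_margin_vectors(candidates, w, b, max_distance=1.0)
```

In the procedure the abstract outlines, the retained vectors would be merged with the existing support vectors before the SVM is retrained on the next batch.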
Weng, Wei-Hung; Wagholikar, Kavishwar B; McCray, Alexa T; Szolovits, Peter; Chueh, Henry C
2017-12-01
The medical subdomain of a clinical note, such as cardiology or neurology, is useful content-derived metadata for developing machine learning downstream applications. To classify the medical subdomain of a note accurately, we have constructed a machine learning-based natural language processing (NLP) pipeline and developed medical subdomain classifiers based on the content of the note. We constructed the pipeline using the clinical NLP system, clinical Text Analysis and Knowledge Extraction System (cTAKES), the Unified Medical Language System (UMLS) Metathesaurus, Semantic Network, and learning algorithms to extract features from two datasets - clinical notes from Integrating Data for Analysis, Anonymization, and Sharing (iDASH) data repository (n = 431) and Massachusetts General Hospital (MGH) (n = 91,237), and built medical subdomain classifiers with different combinations of data representation methods and supervised learning algorithms. We evaluated the performance of classifiers and their portability across the two datasets. The convolutional recurrent neural network with neural word embeddings trained-medical subdomain classifier yielded the best performance measurement on iDASH and MGH datasets with area under receiver operating characteristic curve (AUC) of 0.975 and 0.991, and F1 scores of 0.845 and 0.870, respectively. Considering better clinical interpretability, linear support vector machine-trained medical subdomain classifier using hybrid bag-of-words and clinically relevant UMLS concepts as the feature representation, with term frequency-inverse document frequency (tf-idf)-weighting, outperformed other shallow learning classifiers on iDASH and MGH datasets with AUC of 0.957 and 0.964, and F1 scores of 0.932 and 0.934 respectively. We trained classifiers on one dataset, applied to the other dataset and yielded the threshold of F1 score of 0.7 in classifiers for half of the medical subdomains we studied. Our study shows that a supervised
Temperature based Restricted Boltzmann Machines
NASA Astrophysics Data System (ADS)
Li, Guoqi; Deng, Lei; Xu, Yi; Wen, Changyun; Wang, Wei; Pei, Jing; Shi, Luping
2016-01-01
Restricted Boltzmann machines (RBMs), which apply graphical models to learning a probability distribution over a set of inputs, have attracted much attention recently since being proposed as building blocks of multi-layer learning systems called deep belief networks (DBNs). Note that temperature is a key factor of the Boltzmann distribution that RBMs originate from. However, none of the existing schemes has considered the impact of temperature in the graphical model of DBNs. In this work, we propose temperature based restricted Boltzmann machines (TRBMs), which reveal that temperature is an essential parameter controlling the selectivity of the firing neurons in the hidden layers. We theoretically prove that the effect of temperature can be adjusted by setting the parameter of the sharpness of the logistic function in the proposed TRBMs. The performance of RBMs can be improved by adjusting the temperature parameter of TRBMs. This work provides comprehensive insights into deep belief networks and deep learning architectures from a physical point of view.
Chen, Lili; Hao, Yaru
2017-01-01
Preterm birth (PTB) is the leading cause of perinatal mortality and long-term morbidity, which results in significant health and economic problems. The early detection of PTB has great significance for its prevention. The electrohysterogram (EHG) related to uterine contraction is a noninvasive, real-time, and automatic novel technology which can be used to detect, diagnose, or predict PTB. This paper presents a method for feature extraction and classification of EHG between pregnancy and labour group, based on Hilbert-Huang transform (HHT) and extreme learning machine (ELM). For each sample, each channel was decomposed into a set of intrinsic mode functions (IMFs) using empirical mode decomposition (EMD). Then, the Hilbert transform was applied to IMF to obtain analytic function. The maximum amplitude of analytic function was extracted as feature. The identification model was constructed based on ELM. Experimental results reveal that the best classification performance of the proposed method can reach an accuracy of 88.00%, a sensitivity of 91.30%, and a specificity of 85.19%. The area under receiver operating characteristic (ROC) curve is 0.88. Finally, experimental results indicate that the method developed in this work could be effective in the classification of EHG between pregnancy and labour group.
Knowledge-based load leveling and task allocation in human-machine systems
NASA Technical Reports Server (NTRS)
Chignell, M. H.; Hancock, P. A.
1986-01-01
Conventional human-machine systems use task allocation policies which are based on the premise of a flexible human operator. This individual is most often required to compensate for and augment the capabilities of the machine. The development of artificial intelligence and improved technologies have allowed for a wider range of task allocation strategies. In response to these issues a Knowledge Based Adaptive Mechanism (KBAM) is proposed for assigning tasks to human and machine in real time, using a load leveling policy. This mechanism employs an online workload assessment and compensation system which is responsive to variations in load through an intelligent interface. This interface consists of a loading strategy reasoner which has access to information about the current status of the human-machine system as well as a database of admissible human/machine loading strategies. Difficulties standing in the way of successful implementation of the load leveling strategy are examined.
Xie, Hong-Bo; Huang, Hu; Wu, Jianhua; Liu, Lei
2015-02-01
We present a multiclass fuzzy relevance vector machine (FRVM) learning mechanism and evaluate its performance to classify multiple hand motions using surface electromyographic (sEMG) signals. The relevance vector machine (RVM) is a sparse Bayesian kernel method which avoids some limitations of the support vector machine (SVM). However, RVM still suffers the difficulty of possible unclassifiable regions in multiclass problems. We propose two fuzzy membership function-based FRVM algorithms to solve such problems, based on experiments conducted on seven healthy subjects and two amputees with six hand motions. Two feature sets, namely, AR model coefficients and root mean square value (AR-RMS), and wavelet transform (WT) features, are extracted from the recorded sEMG signals. Fuzzy support vector machine (FSVM) analysis was also conducted for wide comparison in terms of accuracy, sparsity, training and testing time, as well as the effect of training sample sizes. FRVM yielded comparable classification accuracy with dramatically fewer support vectors in comparison with FSVM. Furthermore, the processing delay of FRVM was much less than that of FSVM, whilst the training time of FSVM was much faster than that of FRVM. The results indicate that an FRVM classifier trained using sufficient samples can achieve generalization capability comparable to that of FSVM with significant sparsity in multi-channel sEMG classification, which is more suitable for sEMG-based real-time control applications.
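Of the features named above, the root mean square value is the simplest to illustrate. The sketch below computes one RMS value per non-overlapping window of a single channel; the window length and the sample values are made up, and the AR coefficients and wavelet features used in the study are not shown.

```python
import math

def rms(window):
    """Root mean square of one window of samples."""
    return math.sqrt(sum(x * x for x in window) / len(window))

def windowed_rms(signal, window):
    """RMS of consecutive non-overlapping windows.
    A trailing partial window is dropped."""
    return [rms(signal[i:i + window])
            for i in range(0, len(signal) - window + 1, window)]

# Made-up single-channel samples, for illustration only.
semg = [1.0, -1.0, 1.0, -1.0, 3.0, -4.0, 3.0, -4.0]
features = windowed_rms(semg, window=4)
```

For multi-channel recordings, the same function would be applied per channel and the results concatenated into one feature vector per window.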
Forecasting Solar Flares Using Magnetogram-based Predictors and Machine Learning
NASA Astrophysics Data System (ADS)
Florios, Kostas; Kontogiannis, Ioannis; Park, Sung-Hong; Guerra, Jordan A.; Benvenuto, Federico; Bloomfield, D. Shaun; Georgoulis, Manolis K.
2018-02-01
We propose a forecasting approach for solar flares based on data from Solar Cycle 24, taken by the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory (SDO) mission. In particular, we use the Space-weather HMI Active Region Patches (SHARP) product that facilitates cut-out magnetograms of solar active regions (AR) in the Sun in near-realtime (NRT), taken over a five-year interval (2012 - 2016). Our approach utilizes a set of thirteen predictors, which are not included in the SHARP metadata, extracted from line-of-sight and vector photospheric magnetograms. We exploit several machine learning (ML) and conventional statistics techniques to predict flares of peak magnitude > M1 and > C1 within a 24 h forecast window. The ML methods used are multi-layer perceptrons (MLP), support vector machines (SVM), and random forests (RF). We conclude that random forests could be the prediction technique of choice for our sample, with the second-best method being multi-layer perceptrons, subject to an entropy objective function. A Monte Carlo simulation showed that the best-performing method gives accuracy ACC=0.93(0.00), true skill statistic TSS=0.74(0.02), and Heidke skill score HSS=0.49(0.01) for > M1 flare prediction with probability threshold 15% and ACC=0.84(0.00), TSS=0.60(0.01), and HSS=0.59(0.01) for > C1 flare prediction with probability threshold 35%.
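The three scores quoted above are all functions of a binary confusion matrix. The sketch below uses the standard textbook definitions of accuracy, true skill statistic and Heidke skill score; the counts are made up and are not taken from the study, and the function name `skill_scores` is hypothetical.

```python
def skill_scores(tp, fn, fp, tn):
    """Return (ACC, TSS, HSS) for a binary forecast.

    tp: flares correctly forecast      fn: flares missed
    fp: false alarms                   tn: quiet periods correctly forecast
    """
    total = tp + fn + fp + tn
    acc = (tp + tn) / total
    # TSS: hit rate minus false-alarm rate.
    tss = tp / (tp + fn) - fp / (fp + tn)
    # HSS: improvement over the agreement expected by chance.
    hss = 2.0 * (tp * tn - fn * fp) / (
        (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    )
    return acc, tss, hss

# Made-up counts, for illustration only.
acc, tss, hss = skill_scores(tp=40, fn=10, fp=20, tn=130)
```

Because flaring samples are rare, accuracy alone can look high for a forecast that never predicts a flare, which is why TSS and HSS are reported alongside it.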
Machine learning-based in-line holographic sensing of unstained malaria-infected red blood cells.
Go, Taesik; Kim, Jun H; Byeon, Hyeokjun; Lee, Sang J
2018-04-19
Accurate and immediate diagnosis of malaria is important for medication of the infectious disease. Conventional methods for diagnosing malaria are time consuming and rely on the skill of experts. Therefore, an automatic and simple diagnostic modality is essential for healthcare in developing countries that lack the expertise of trained microscopists. In the present study, a new automatic sensing method using digital in-line holographic microscopy (DIHM) combined with machine learning algorithms was proposed to sensitively detect unstained malaria-infected red blood cells (iRBCs). To identify the RBC characteristics, 13 descriptors were extracted from segmented holograms of individual RBCs. Among the 13 descriptors, 10 features were highly statistically different between healthy RBCs (hRBCs) and iRBCs. Six machine learning algorithms were applied to effectively combine the dominant features and to greatly improve the diagnostic capacity of the present method. Among the classification models trained by the 6 tested algorithms, the model trained by the support vector machine (SVM) showed the best accuracy in separating hRBCs and iRBCs for training (n = 280, 96.78%) and testing sets (n = 120, 97.50%). This DIHM-based artificial intelligence methodology is simple and does not require blood staining. Thus, it will be beneficial and valuable in the diagnosis of malaria. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
MLACP: machine-learning-based prediction of anticancer peptides
Manavalan, Balachandran; Basith, Shaherin; Shin, Tae Hwan; Choi, Sun; Kim, Myeong Ok; Lee, Gwang
2017-01-01
Cancer is the second leading cause of death globally, and use of therapeutic peptides to target and kill cancer cells has received considerable attention in recent years. Identification of anticancer peptides (ACPs) through wet-lab experimentation is expensive and often time consuming; therefore, development of an efficient computational method is essential to identify potential ACP candidates prior to in vitro experimentation. In this study, we developed support vector machine- and random forest-based machine-learning methods for the prediction of ACPs using the features calculated from the amino acid sequence, including amino acid composition, dipeptide composition, atomic composition, and physicochemical properties. We trained our methods using the Tyagi-B dataset and determined the machine parameters by 10-fold cross-validation. Furthermore, we evaluated the performance of our methods on two benchmarking datasets, with our results showing that the random forest-based method outperformed the existing methods with an average accuracy and Matthews correlation coefficient value of 88.7% and 0.78, respectively. To assist the scientific community, we also developed a publicly accessible web server at www.thegleelab.org/MLACP.html. PMID:29100375
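Two of the sequence-derived features named above, amino acid composition and dipeptide composition, can be sketched as simple frequency vectors. This is a minimal illustration assuming the 20 standard residues and a made-up peptide; the atomic composition and physicochemical features are omitted, and the function names are hypothetical rather than taken from the MLACP implementation.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def amino_acid_composition(seq):
    """Fraction of each residue in the sequence (20 values)."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]

def dipeptide_composition(seq):
    """Fraction of each ordered residue pair among all adjacent
    pairs in the sequence (400 values)."""
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = len(seq) - 1
    return [pairs[a + b] / total for a in AMINO_ACIDS for b in AMINO_ACIDS]

peptide = "AKKA"  # made-up sequence, for illustration only
aac = amino_acid_composition(peptide)
dpc = dipeptide_composition(peptide)
```

Both vectors have a fixed length regardless of peptide length, which is what allows peptides of different sizes to be fed to the same classifier.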
Casey, M
1996-08-15
Recurrent neural networks (RNNs) can learn to perform finite state computations. It is shown that an RNN performing a finite state computation must organize its state space to mimic the states in the minimal deterministic finite state machine that can perform that computation, and a precise description of the attractor structure of such systems is given. This knowledge effectively predicts activation space dynamics, which allows one to understand RNN computation dynamics in spite of complexity in activation dynamics. This theory provides a theoretical framework for understanding finite state machine (FSM) extraction techniques and can be used to improve training methods for RNNs performing FSM computations. This provides an example of a successful approach to understanding a general class of complex systems that has not been explicitly designed, e.g., systems that have evolved or learned their internal structure.
Meng, Qier; Kitasaka, Takayuki; Nimura, Yukitaka; Oda, Masahiro; Ueno, Junji; Mori, Kensaku
2017-02-01
Airway segmentation plays an important role in analyzing chest computed tomography (CT) volumes for computerized lung cancer detection, emphysema diagnosis and pre- and intra-operative bronchoscope navigation. However, obtaining a complete 3D airway tree structure from a CT volume is quite a challenging task. Several researchers have proposed automated airway segmentation algorithms basically based on region growing and machine learning techniques. However, these methods fail to detect the peripheral bronchial branches, which results in a large amount of leakage. This paper presents a novel approach for more accurate extraction of the complex airway tree. This proposed segmentation method is composed of three steps. First, Hessian analysis is utilized to enhance the tube-like structure in CT volumes; then, an adaptive multiscale cavity enhancement filter is employed to detect the cavity-like structure with different radii. In the second step, support vector machine learning will be utilized to remove the false positive (FP) regions from the result obtained in the previous step. Finally, the graph-cut algorithm is used to refine the candidate voxels to form an integrated airway tree. A test dataset including 50 standard-dose chest CT volumes was used for evaluating our proposed method. The average extraction rate was about 79.1 % with the significantly decreased FP rate. A new method of airway segmentation based on local intensity structure and machine learning technique was developed. The method was shown to be feasible for airway segmentation in a computer-aided diagnosis system for a lung and bronchoscope guidance system.
Hu, Yu-Chuan; Li, Gang; Yang, Yang; Han, Yu; Sun, Ying-Zhi; Liu, Zhi-Cheng; Tian, Qiang; Han, Zi-Yang; Liu, Le-De; Hu, Bin-Quan; Qiu, Zi-Yu; Wang, Wen; Cui, Guang-Bin
2017-01-01
Current machine learning techniques provide the opportunity to develop noninvasive and automated glioma grading tools, by utilizing quantitative parameters derived from multi-modal magnetic resonance imaging (MRI) data. However, the efficacies of different machine learning methods in glioma grading have not been investigated. A comprehensive comparison of varied machine learning methods in differentiating low-grade gliomas (LGGs) and high-grade gliomas (HGGs) as well as WHO grade II, III and IV gliomas based on multi-parametric MRI images was proposed in the current study. The parametric histogram and image texture attributes of 120 glioma patients were extracted from the perfusion, diffusion and permeability parametric maps of preoperative MRI. Then, 25 commonly used machine learning classifiers combined with 8 independent attribute selection methods were applied and evaluated using leave-one-out cross validation (LOOCV) strategy. Besides, the influences of parameter selection on the classifying performances were investigated. We found that support vector machine (SVM) exhibited superior performance to other classifiers. By combining all tumor attributes with synthetic minority over-sampling technique (SMOTE), the highest classifying accuracy of 0.945 or 0.961 for LGG and HGG or grade II, III and IV gliomas was achieved. Application of Recursive Feature Elimination (RFE) attribute selection strategy further improved the classifying accuracies. Besides, the performances of LibSVM, SMO, IBk classifiers were influenced by some key parameters such as kernel type, c, gamma, K, etc. SVM is a promising tool in developing automated preoperative glioma grading system, especially when being combined with RFE strategy. Model parameters should be considered in glioma grading model optimization. PMID:28599282
Zhang, Xin; Yan, Lin-Feng; Hu, Yu-Chuan; Li, Gang; Yang, Yang; Han, Yu; Sun, Ying-Zhi; Liu, Zhi-Cheng; Tian, Qiang; Han, Zi-Yang; Liu, Le-De; Hu, Bin-Quan; Qiu, Zi-Yu; Wang, Wen; Cui, Guang-Bin
2017-07-18
Current machine learning techniques provide the opportunity to develop noninvasive and automated glioma grading tools, by utilizing quantitative parameters derived from multi-modal magnetic resonance imaging (MRI) data. However, the efficacies of different machine learning methods in glioma grading have not been investigated. A comprehensive comparison of varied machine learning methods in differentiating low-grade gliomas (LGGs) and high-grade gliomas (HGGs) as well as WHO grade II, III and IV gliomas based on multi-parametric MRI images was proposed in the current study. The parametric histogram and image texture attributes of 120 glioma patients were extracted from the perfusion, diffusion and permeability parametric maps of preoperative MRI. Then, 25 commonly used machine learning classifiers combined with 8 independent attribute selection methods were applied and evaluated using leave-one-out cross validation (LOOCV) strategy. Besides, the influences of parameter selection on the classifying performances were investigated. We found that support vector machine (SVM) exhibited superior performance to other classifiers. By combining all tumor attributes with synthetic minority over-sampling technique (SMOTE), the highest classifying accuracy of 0.945 or 0.961 for LGG and HGG or grade II, III and IV gliomas was achieved. Application of Recursive Feature Elimination (RFE) attribute selection strategy further improved the classifying accuracies. Besides, the performances of LibSVM, SMO, IBk classifiers were influenced by some key parameters such as kernel type, c, gamma, K, etc. SVM is a promising tool in developing automated preoperative glioma grading system, especially when being combined with RFE strategy. Model parameters should be considered in glioma grading model optimization.
Experimental investigation of the tip based micro/nano machining
NASA Astrophysics Data System (ADS)
Guo, Z.; Tian, Y.; Liu, X.; Wang, F.; Zhou, C.; Zhang, D.
2017-12-01
Based on the self-developed three dimensional micro/nano machining system, the effects of machining parameters and sample material on micro/nano machining are investigated. The micro/nano machining system is mainly composed of the probe system and micro/nano positioning stage. The former is applied to control the normal load and the latter is utilized to realize high precision motion in the xy plane. A sample examination method is firstly introduced to estimate whether the sample is placed horizontally. The machining parameters include scratching direction, speed, cycles, normal load and feed. According to the experimental results, the scratching depth is significantly affected by the normal load in all four defined scratching directions but is rarely influenced by the scratching speed. The increase of scratching cycle number can increase the scratching depth as well as smooth the groove wall. In addition, the scratching tests of silicon and copper attest that the harder material is easier to be removed. In the scratching with different feed amount, the machining results indicate that the machined depth increases as the feed reduces. Further, a cubic polynomial is used to fit the experimental results to predict the scratching depth. With the selected machining parameters of scratching direction d3/d4, scratching speed 5 μm/s and feed 0.06 μm, some more micro structures including stair, sinusoidal groove, Chinese character '田', 'TJU' and Chinese panda have been fabricated on the silicon substrate.
Analytical Model-Based Design Optimization of a Transverse Flux Machine
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hasan, Iftekhar; Husain, Tausif; Sozer, Yilmaz
This paper proposes an analytical machine design tool using magnetic equivalent circuit (MEC)-based particle swarm optimization (PSO) for a double-sided, flux-concentrating transverse flux machine (TFM). The magnetic equivalent circuit method is applied to analytically establish the relationship between the design objective and the input variables of prospective TFM designs. This is computationally less intensive and more time efficient than finite element solvers. A PSO algorithm is then used to design a machine with the highest torque density within the specified power range along with some geometric design constraints. The stator pole length, magnet length, and rotor thickness are the variables that define the optimization search space. Finite element analysis (FEA) was carried out to verify the performance of the MEC-PSO optimized machine. The proposed analytical design tool helps save computation time by at least 50% when compared to commercial FEA-based optimization programs, with results found to be in agreement with less than 5% error.
HMM for hyperspectral spectrum representation and classification with endmember entropy vectors
NASA Astrophysics Data System (ADS)
Arabi, Samir Y. W.; Fernandes, David; Pizarro, Marco A.
2015-10-01
Hyperspectral images, owing to their good spectral resolution, are extensively used for classification, but their high number of bands requires a higher bandwidth for data transmission, a higher data storage capability and a higher computational capability in processing systems. This work presents a new methodology for hyperspectral data classification that can work with a reduced number of spectral bands and achieve good results, comparable with processing methods that require all hyperspectral bands. The proposed method for hyperspectral spectra classification is based on the Hidden Markov Model (HMM) associated with each Endmember (EM) of a scene and the conditional probabilities of each EM belonging to each other EM. The EM conditional probability is transformed into an EM entropy vector, and those vectors are used as reference vectors for the classes in the scene. The conditional probability of a spectrum to be classified is also transformed into a spectrum entropy vector, which is assigned to a given class by the minimum Euclidean distance (ED) between it and the EM entropy vectors. The methodology was tested with good results using AVIRIS spectra of a scene with 13 EMs, considering the full 209 bands and the reduced sets of 128, 64 and 32 spectral bands. For the test area, it is shown that only 32 spectral bands can be used instead of the original 209 bands, without significant loss in the classification process.
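The final assignment step, choosing the class whose reference vector is closest in Euclidean distance, can be sketched as below. The entropy vectors here are made-up placeholders; how they are derived from the HMM conditional probabilities is not detailed in the abstract and is not reproduced. The names `classify` and `reference_vectors` are hypothetical.

```python
import math

def classify(vector, reference_vectors):
    """Return the label of the reference vector nearest to `vector`
    in Euclidean distance."""
    return min(reference_vectors,
               key=lambda label: math.dist(vector, reference_vectors[label]))

# Made-up entropy vectors for three endmembers, for illustration only.
reference_vectors = {
    "EM1": [0.20, 0.90, 0.40],
    "EM2": [0.80, 0.10, 0.50],
    "EM3": [0.50, 0.50, 0.90],
}
spectrum_vector = [0.75, 0.20, 0.45]
label = classify(spectrum_vector, reference_vectors)
```

Note that the length of each entropy vector depends on the number of endmembers rather than the number of spectral bands, which is consistent with the claim that the classifier itself is insensitive to band reduction.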
Machine learning and computer vision approaches for phenotypic profiling
Morris, Quaid
2017-01-01
With recent advances in high-throughput, automated microscopy, there has been an increased demand for effective computational strategies to analyze large-scale, image-based data. To this end, computer vision approaches have been applied to cell segmentation and feature extraction, whereas machine-learning approaches have been developed to aid in phenotypic classification and clustering of data acquired from biological images. Here, we provide an overview of the commonly used computer vision and machine-learning methods for generating and categorizing phenotypic profiles, highlighting the general biological utility of each approach. PMID:27940887
Machine learning and computer vision approaches for phenotypic profiling.
Grys, Ben T; Lo, Dara S; Sahin, Nil; Kraus, Oren Z; Morris, Quaid; Boone, Charles; Andrews, Brenda J
2017-01-02
With recent advances in high-throughput, automated microscopy, there has been an increased demand for effective computational strategies to analyze large-scale, image-based data. To this end, computer vision approaches have been applied to cell segmentation and feature extraction, whereas machine-learning approaches have been developed to aid in phenotypic classification and clustering of data acquired from biological images. Here, we provide an overview of the commonly used computer vision and machine-learning methods for generating and categorizing phenotypic profiles, highlighting the general biological utility of each approach. © 2017 Grys et al.
Machine learning-based methods for prediction of linear B-cell epitopes.
Wang, Hsin-Wei; Pai, Tun-Wen
2014-01-01
B-cell epitope prediction helps immunologists design peptide-based vaccines and diagnostic tests and supports disease prevention, treatment, and antibody production. Compared with T-cell epitope prediction, the performance of variable-length B-cell epitope prediction is still unsatisfactory. Fortunately, with verified epitope databases increasingly available, bioinformaticians can apply machine learning-based algorithms to all curated data to design improved prediction tools for biomedical researchers. Here, we review related epitope prediction papers, especially those on linear B-cell epitope prediction. Combining selected propensity scales and statistics of epitope residues with machine learning-based tools has become a general way of constructing linear B-cell epitope prediction systems. Most comparison results also show that the kernel-based support vector machine (SVM) classifier outperformed other machine learning-based approaches. Hence, in this chapter, besides reviewing recently published papers, we introduce the fundamentals of B-cell epitopes and SVM techniques. In addition, an example of a linear B-cell epitope prediction system based on physicochemical features and amino acid combinations is illustrated in detail.
Towards a generalized energy prediction model for machine tools
Bhinge, Raunak; Park, Jinkyoo; Law, Kincho H.; Dornfeld, David A.; Helu, Moneer; Rachuri, Sudarsan
2017-01-01
Energy prediction of machine tools can deliver many advantages to a manufacturing enterprise, ranging from energy-efficient process planning to machine tool monitoring. Physics-based, energy prediction models have been proposed in the past to understand the energy usage pattern of a machine tool. However, uncertainties in both the machine and the operating environment make it difficult to predict the energy consumption of the target machine reliably. Taking advantage of the opportunity to collect extensive, contextual, energy-consumption data, we discuss a data-driven approach to develop an energy prediction model of a machine tool in this paper. First, we present a methodology that can efficiently and effectively collect and process data extracted from a machine tool and its sensors. We then present a data-driven model that can be used to predict the energy consumption of the machine tool for machining a generic part. Specifically, we use Gaussian Process (GP) Regression, a non-parametric machine-learning technique, to develop the prediction model. The energy prediction model is then generalized over multiple process parameters and operations. Finally, we apply this generalized model with a method to assess uncertainty intervals to predict the energy consumed to machine any part using a Mori Seiki NVD1500 machine tool. Furthermore, the same model can be used during process planning to optimize the energy-efficiency of a machining process. PMID:28652687
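As a rough illustration of Gaussian Process regression with uncertainty intervals, the sketch below implements a zero-mean GP with a squared-exponential kernel in NumPy. The kernel choice, hyperparameters, and the toy feed-rate/energy numbers are all assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between the rows of a and the rows of b."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_predict(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP at x_test."""
    k = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_s = rbf_kernel(x_train, x_test)
    k_ss = rbf_kernel(x_test, x_test)
    chol = np.linalg.cholesky(k)
    # Solve K alpha = y through the Cholesky factor
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_s.T @ alpha
    v = np.linalg.solve(chol, k_s)
    var = np.diag(k_ss) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

# Hypothetical data: scaled feed rate versus measured energy
x_train = np.array([[0.0], [0.5], [1.0], [1.5], [2.0]])
y_train = np.array([1.0, 1.4, 2.1, 2.9, 4.0])
x_test = np.array([[1.0], [5.0]])

mean, std = gp_predict(x_train, y_train, x_test)
lower, upper = mean - 1.96 * std, mean + 1.96 * std  # approximate 95% interval
```

The interval is narrow at an input close to the training data and widens towards the prior far from it, which is the behaviour that makes a GP useful for attaching uncertainty to a prediction.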
Towards a generalized energy prediction model for machine tools.
Bhinge, Raunak; Park, Jinkyoo; Law, Kincho H; Dornfeld, David A; Helu, Moneer; Rachuri, Sudarsan
2017-04-01
Energy prediction of machine tools can deliver many advantages to a manufacturing enterprise, ranging from energy-efficient process planning to machine tool monitoring. Physics-based, energy prediction models have been proposed in the past to understand the energy usage pattern of a machine tool. However, uncertainties in both the machine and the operating environment make it difficult to predict the energy consumption of the target machine reliably. Taking advantage of the opportunity to collect extensive, contextual, energy-consumption data, we discuss a data-driven approach to develop an energy prediction model of a machine tool in this paper. First, we present a methodology that can efficiently and effectively collect and process data extracted from a machine tool and its sensors. We then present a data-driven model that can be used to predict the energy consumption of the machine tool for machining a generic part. Specifically, we use Gaussian Process (GP) Regression, a non-parametric machine-learning technique, to develop the prediction model. The energy prediction model is then generalized over multiple process parameters and operations. Finally, we apply this generalized model with a method to assess uncertainty intervals to predict the energy consumed to machine any part using a Mori Seiki NVD1500 machine tool. Furthermore, the same model can be used during process planning to optimize the energy-efficiency of a machining process.
Information extraction from multi-institutional radiology reports.
Hassanpour, Saeed; Langlotz, Curtis P
2016-01-01
also evaluated the generalizability of our approach across different organizations by training and testing our system on data from different organizations. Our results show the efficacy of our machine learning approach in extracting the information model's elements (10-fold cross-validation average performance: precision: 87%, recall: 84%, F1 score: 85%) and its superiority and generalizability compared to the common non-machine learning approach (p-value<0.05). Our machine learning information extraction approach provides an effective automatic method to annotate and extract clinically significant information from a large collection of free text radiology reports. This information extraction system can help clinicians better understand the radiology reports and prioritize their review process. In addition, the extracted information can be used by researchers to link radiology reports to information from other data sources such as electronic health records and the patient's genome. Extracted information also can facilitate disease surveillance, real-time clinical decision support for the radiologist, and content-based image retrieval. Copyright © 2015 Elsevier B.V. All rights reserved.
Singular value decomposition based feature extraction technique for physiological signal analysis.
Chang, Cheng-Ding; Wang, Chien-Chih; Jiang, Bernard C
2012-06-01
Multiscale entropy (MSE) is one of the popular techniques for calculating and describing the complexity of physiological signals. Many studies use this approach to detect changes in the physiological conditions of the human body. However, MSE results are easily affected by noise and trends, leading to incorrect estimation of MSE values. In this paper, singular value decomposition (SVD) is adopted in place of MSE to extract the features of physiological signals, and a support vector machine (SVM) is adopted to classify the different physiological states. A test data set based on the PhysioNet website was used, and the classification results showed that using SVD to extract features of the physiological signal could attain a classification accuracy rate of 89.157%, which is higher than that obtained using the MSE value (71.084%). The results show that the proposed analysis procedure is effective and appropriate for distinguishing different physiological states. This promising result could be used as a reference for doctors in the diagnosis of congestive heart failure (CHF).
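One common way to obtain SVD-based features from a one-dimensional signal is to take the leading singular values of its trajectory (Hankel) matrix. The abstract does not state how the SVD was applied, so the construction below, including the window length and the number of features, is an assumption for illustration only.

```python
import numpy as np

def svd_features(signal, window=8, n_features=3):
    """Leading singular values of the trajectory matrix, normalised to sum to 1.

    Each row of the trajectory matrix is a length-`window` slice of the signal.
    """
    signal = np.asarray(signal, dtype=float)
    rows = len(signal) - window + 1
    trajectory = np.stack([signal[i:i + window] for i in range(rows)])
    singular_values = np.linalg.svd(trajectory, compute_uv=False)
    return singular_values[:n_features] / singular_values.sum()

# Synthetic examples: a regular oscillation and an irregular (white-noise) signal
t = np.arange(200)
regular = np.sin(2 * np.pi * t / 20)
irregular = np.random.default_rng(0).standard_normal(200)

f_regular = svd_features(regular)
f_irregular = svd_features(irregular)
```

A pure sinusoid concentrates almost all of its energy in two singular values, while an irregular signal spreads it across many; feature vectors of this kind could then be passed to a classifier such as an SVM.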
Prediction of drug synergy in cancer using ensemble-based machine learning techniques
NASA Astrophysics Data System (ADS)
Singh, Harpreet; Rana, Prashant Singh; Singh, Urvinder
2018-04-01
Drug synergy prediction plays a significant role in the medical field for inhibiting specific cancer agents. It can be developed as a pre-processing tool for therapeutic success. Different drug-drug interactions can be examined through the drug synergy score, which calls for efficient regression-based machine learning approaches that minimize prediction errors. Numerous machine learning techniques, such as neural networks, support vector machines, random forests, LASSO and Elastic Nets, have been used in the past for this purpose. However, these techniques individually do not provide significant accuracy in predicting the drug synergy score. Therefore, the primary objective of this paper is to design a neuro-fuzzy-based ensembling approach. To achieve this, nine well-known machine learning techniques were implemented on the drug synergy data. Based on the accuracy of each model, the four most accurate techniques were selected to develop an ensemble-based machine learning model. These models are Random Forest, Fuzzy Rules Using Genetic Cooperative-Competitive Learning method (GFS.GCCL), Adaptive-Network-Based Fuzzy Inference System (ANFIS) and Dynamic Evolving Neural-Fuzzy Inference System method (DENFIS). Ensembling is achieved by evaluating the biased weighted aggregation (i.e. giving more weight to the model with a higher prediction score) of the data predicted by the selected models. The proposed and existing machine learning techniques were evaluated on drug synergy score data. The comparative analysis reveals that the proposed method outperforms the others in terms of accuracy, root mean square error and coefficient of correlation.
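The idea of biased weighted aggregation can be shown with a small sketch in which each model's weight is simply its score divided by the sum of all scores. This proportional rule is an assumption chosen for clarity; the paper's actual weighting scheme may differ, and the predictions and scores below are invented.

```python
def weighted_ensemble(predictions, scores):
    """Combine per-model predictions with weights proportional to model scores."""
    total = sum(scores.values())
    weights = {name: score / total for name, score in scores.items()}
    n_samples = len(next(iter(predictions.values())))
    return [
        sum(weights[name] * predictions[name][i] for name in predictions)
        for i in range(n_samples)
    ]

# Hypothetical synergy-score predictions from the four selected models
predictions = {
    "random_forest": [10.0, 20.0],
    "anfis": [12.0, 18.0],
    "denfis": [8.0, 22.0],
    "gfs_gccl": [10.0, 20.0],
}
# Hypothetical prediction scores; a higher score gives a model more influence
scores = {"random_forest": 0.4, "anfis": 0.3, "denfis": 0.2, "gfs_gccl": 0.1}

combined = weighted_ensemble(predictions, scores)
print(combined)
```

Because the weights sum to one, the combined prediction always lies between the smallest and largest individual predictions for each sample.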
Epileptic seizure detection in EEG signal using machine learning techniques.
Jaiswal, Abeg Kumar; Banka, Haider
2018-03-01
Epilepsy is a well-known nervous system disorder characterized by seizures. Electroencephalograms (EEGs), which capture brain neural activity, can detect epilepsy. Traditional methods for analyzing an EEG signal for epileptic seizure detection are time-consuming. Recently, several automated seizure detection frameworks using machine learning techniques have been proposed to replace these traditional methods. The two basic steps involved in machine learning are feature extraction and classification. Feature extraction reduces the input pattern space by keeping informative features, and the classifier assigns the appropriate class label. In this paper, we propose two effective approaches involving subpattern-based PCA (SpPCA) and cross-subpattern correlation-based PCA (SubXPCA) with Support Vector Machine (SVM) for automated seizure detection in EEG signals. Feature extraction was performed using SpPCA and SubXPCA. Both techniques explore the subpattern correlation of EEG signals, which helps in the decision-making process. SVM is used for classification of seizure and non-seizure EEG signals. The SVM was trained with a radial basis kernel. All the experiments were carried out on the benchmark epilepsy EEG dataset. The entire dataset consists of 500 EEG signals recorded under different scenarios. Seven different experimental cases for classification were conducted. The classification accuracy was evaluated using tenfold cross validation. The classification results of the proposed approaches were compared with the results of some of the existing techniques proposed in the literature to establish the claim.
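A minimal sketch of subpattern-based PCA, as the term is generally understood, follows: each signal is cut into equal-length sub-patterns, PCA is run on each sub-pattern set separately, and the projections are concatenated. The number of sub-patterns, the number of components, and the random stand-in data are assumptions; the paper's exact SpPCA and SubXPCA configurations are not given in the abstract.

```python
import numpy as np

def subpattern_pca(signals, n_subpatterns=4, n_components=2):
    """Run PCA independently on each sub-pattern and concatenate the projections.

    signals: array of shape (n_signals, n_samples); n_samples must be
    divisible by n_subpatterns.
    """
    signals = np.asarray(signals, dtype=float)
    blocks = np.split(signals, n_subpatterns, axis=1)
    features = []
    for block in blocks:
        centred = block - block.mean(axis=0)
        # Rows of vt are principal directions ordered by explained variance
        _, _, vt = np.linalg.svd(centred, full_matrices=False)
        features.append(centred @ vt[:n_components].T)
    return np.hstack(features)

# Hypothetical stand-in for EEG data: 50 segments of 64 samples each
rng = np.random.default_rng(0)
signals = rng.standard_normal((50, 64))
features = subpattern_pca(signals)
print(features.shape)
```

The resulting feature matrix has `n_subpatterns * n_components` columns per signal and could be fed to an SVM with a radial basis kernel, as the abstract describes.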
Vibration Sensor Monitoring of Nickel-Titanium Alloy Turning for Machinability Evaluation.
Segreto, Tiziana; Caggiano, Alessandra; Karam, Sara; Teti, Roberto
2017-12-12
Nickel-Titanium (Ni-Ti) alloys are very difficult-to-machine materials causing notable manufacturing problems due to their unique mechanical properties, including superelasticity, high ductility, and severe strain-hardening. In this framework, the aim of this paper is to assess the machinability of Ni-Ti alloys with reference to turning processes in order to realize a reliable and robust in-process identification of machinability conditions. An on-line sensor monitoring procedure based on the acquisition of vibration signals was implemented during the experimental turning tests. The detected vibration sensorial data were processed through an advanced signal processing method in time-frequency domain based on wavelet packet transform (WPT). The extracted sensorial features were used to construct WPT pattern feature vectors to send as input to suitably configured neural networks (NNs) for cognitive pattern recognition in order to evaluate the correlation between input sensorial information and output machinability conditions.
Vibration Sensor Monitoring of Nickel-Titanium Alloy Turning for Machinability Evaluation
Segreto, Tiziana; Karam, Sara; Teti, Roberto
2017-01-01
Nickel-Titanium (Ni-Ti) alloys are very difficult-to-machine materials causing notable manufacturing problems due to their unique mechanical properties, including superelasticity, high ductility, and severe strain-hardening. In this framework, the aim of this paper is to assess the machinability of Ni-Ti alloys with reference to turning processes in order to realize a reliable and robust in-process identification of machinability conditions. An on-line sensor monitoring procedure based on the acquisition of vibration signals was implemented during the experimental turning tests. The detected vibration sensorial data were processed through an advanced signal processing method in time-frequency domain based on wavelet packet transform (WPT). The extracted sensorial features were used to construct WPT pattern feature vectors to send as input to suitably configured neural networks (NNs) for cognitive pattern recognition in order to evaluate the correlation between input sensorial information and output machinability conditions. PMID:29231864
System and method for cooling a superconducting rotary machine
Ackermann, Robert Adolf [Schenectady, NY; Laskaris, Evangelos Trifon [Schenectady, NY; Huang, Xianrui [Clifton Park, NY; Bray, James William [Niskayuna, NY
2011-08-09
A system for cooling a superconducting rotary machine includes a plurality of sealed siphon tubes disposed in balanced locations around a rotor adjacent to a superconducting coil. Each of the sealed siphon tubes includes a tubular body and a heat transfer medium disposed in the tubular body that undergoes a phase change during operation of the machine to extract heat from the superconducting coil. A siphon heat exchanger is thermally coupled to the siphon tubes for extracting heat from the siphon tubes during operation of the machine.
A Sensor-Based Method for Diagnostics of Machine Tool Linear Axes.
Vogl, Gregory W; Weiss, Brian A; Donmez, M Alkan
2015-01-01
A linear axis is a vital subsystem of machine tools, which are vital systems within many manufacturing operations. When installed and operating within a manufacturing facility, a machine tool needs to stay in good condition for parts production. All machine tools degrade during operations, yet knowledge of that degradation is elusive; specifically, accurately detecting degradation of linear axes is a manual and time-consuming process. Thus, manufacturers need automated and efficient methods to diagnose the condition of their machine tool linear axes without disruptions to production. The Prognostics and Health Management for Smart Manufacturing Systems (PHM4SMS) project at the National Institute of Standards and Technology (NIST) developed a sensor-based method to quickly estimate the performance degradation of linear axes. The multi-sensor-based method uses data collected from a 'sensor box' to identify changes in linear and angular errors due to axis degradation; the sensor box contains inclinometers, accelerometers, and rate gyroscopes to capture this data. The sensors are expected to be cost effective with respect to savings in production losses and scrapped parts for a machine tool. Numerical simulations, based on sensor bandwidth and noise specifications, show that changes in straightness and angular errors could be known with acceptable test uncertainty ratios. If a sensor box resides on a machine tool and data is collected periodically, then the degradation of the linear axes can be determined and used for diagnostics and prognostics to help optimize maintenance, production schedules, and ultimately part quality.
Karthick, P A; Ghosh, Diptasree Maitra; Ramakrishnan, S
2018-02-01
Surface electromyography (sEMG) based muscle fatigue research is widely preferred in sports science and occupational/rehabilitation studies due to its noninvasiveness. However, these signals are complex, multicomponent and highly nonstationary with large inter-subject variations, particularly during dynamic contractions. Hence, time-frequency based machine learning methodologies can improve the design of automated systems for these signals. In this work, analysis based on high-resolution time-frequency methods, namely, Stockwell transform (S-transform), B-distribution (BD) and extended modified B-distribution (EMBD), is proposed to differentiate the dynamic muscle nonfatigue and fatigue conditions. The nonfatigue and fatigue segments of sEMG signals recorded from the biceps brachii of 52 healthy volunteers are preprocessed and subjected to S-transform, BD and EMBD. Twelve features are extracted from each method and prominent features are selected using genetic algorithm (GA) and binary particle swarm optimization (BPSO). Five machine learning algorithms, namely, naïve Bayes, support vector machine (SVM) with polynomial and radial basis kernels, random forest and rotation forest, are used for the classification. The results show that all the proposed time-frequency distributions (TFDs) are able to show the nonstationary variations of sEMG signals. Most of the features exhibit statistically significant difference in the muscle fatigue and nonfatigue conditions. The maximum number of features (66%) is reduced by GA and BPSO for EMBD and BD-TFD respectively. The combination of EMBD and polynomial kernel based SVM is found to be most accurate (91% accuracy) in classifying the conditions with the features selected using GA. The proposed methods are found to be capable of handling the nonstationary and multicomponent variations of sEMG signals recorded in dynamic fatiguing contractions. Particularly, the combination of EMBD and polynomial kernel based SVM could be used to
Li, Wutao; Huang, Zhigang; Lang, Rongling; Qin, Honglei; Zhou, Kai; Cao, Yongbin
2016-03-04
Interference can severely degrade the performance of Global Navigation Satellite System (GNSS) receivers. As the first step of any GNSS anti-interference measure, interference monitoring is essential. Since interference monitoring can be considered as a classification problem, a real-time interference monitoring technique based on Twin Support Vector Machine (TWSVM) is proposed in this paper. A TWSVM model is established, and TWSVM is solved by the Least Squares Twin Support Vector Machine (LSTWSVM) algorithm. The interference monitoring indicators are analyzed to extract features from the interfered GNSS signals. The experimental results show that the chosen observations can be used as the interference monitoring indicators. The interference monitoring performance of the proposed method is verified using the GPS L1 C/A code signal and compared with that of a standard SVM. The experimental results indicate that TWSVM-based interference monitoring is much faster than the conventional SVM. Furthermore, the training time of TWSVM is at the millisecond (ms) level and the monitoring time is at the microsecond (μs) level, which makes the proposed approach usable in practical interference monitoring applications.
A Real-Time Interference Monitoring Technique for GNSS Based on a Twin Support Vector Machine Method
Li, Wutao; Huang, Zhigang; Lang, Rongling; Qin, Honglei; Zhou, Kai; Cao, Yongbin
2016-01-01
Interference can severely degrade the performance of Global Navigation Satellite System (GNSS) receivers. As the first step of any GNSS anti-interference measure, interference monitoring is essential. Since interference monitoring can be considered as a classification problem, a real-time interference monitoring technique based on Twin Support Vector Machine (TWSVM) is proposed in this paper. A TWSVM model is established, and TWSVM is solved by the Least Squares Twin Support Vector Machine (LSTWSVM) algorithm. The interference monitoring indicators are analyzed to extract features from the interfered GNSS signals. The experimental results show that the chosen observations can be used as the interference monitoring indicators. The interference monitoring performance of the proposed method is verified using the GPS L1 C/A code signal and compared with that of a standard SVM. The experimental results indicate that TWSVM-based interference monitoring is much faster than the conventional SVM. Furthermore, the training time of TWSVM is at the millisecond (ms) level and the monitoring time is at the microsecond (μs) level, which makes the proposed approach usable in practical interference monitoring applications. PMID:26959020
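To show why a least squares twin SVM trains quickly, the sketch below implements the standard linear formulation, in which each of the two non-parallel hyperplanes is obtained from a single linear solve instead of a quadratic program. The penalty values, the small ridge term added for numerical stability, and the two-cluster toy data are assumptions; the paper's features and settings are not reproduced here.

```python
import numpy as np

def fit_lstwsvm(a, b, c1=1.0, c2=1.0, ridge=1e-8):
    """Linear least squares twin SVM.

    a: samples of class +1, b: samples of class -1 (rows are samples).
    Returns two augmented vectors [w; bias], one per hyperplane.
    """
    e = np.hstack([a, np.ones((len(a), 1))])  # class +1 with bias column
    f = np.hstack([b, np.ones((len(b), 1))])  # class -1 with bias column
    reg = ridge * np.eye(e.shape[1])
    # Plane 1 lies close to class +1 and roughly unit distance from class -1
    u1 = -np.linalg.solve(f.T @ f + (e.T @ e) / c1 + reg, f.T @ np.ones(len(b)))
    # Plane 2 lies close to class -1 and roughly unit distance from class +1
    u2 = np.linalg.solve(e.T @ e + (f.T @ f) / c2 + reg, e.T @ np.ones(len(a)))
    return u1, u2

def predict_lstwsvm(x, u1, u2):
    """Assign each sample to the class whose hyperplane is nearer."""
    x_aug = np.hstack([x, np.ones((len(x), 1))])
    d1 = np.abs(x_aug @ u1) / np.linalg.norm(u1[:-1])
    d2 = np.abs(x_aug @ u2) / np.linalg.norm(u2[:-1])
    return np.where(d1 <= d2, 1, -1)

# Hypothetical two-feature data: "clean" samples near (0, 0), "interfered" near (4, 4)
rng = np.random.default_rng(0)
clean = rng.normal(0.0, 0.5, size=(30, 2))
interfered = rng.normal(4.0, 0.5, size=(30, 2))

u1, u2 = fit_lstwsvm(clean, interfered)
labels = predict_lstwsvm(np.array([[0.2, -0.1], [3.8, 4.1]]), u1, u2)
print(labels)
```

Training reduces to two small linear systems whose size depends on the number of features rather than the number of samples, which is consistent with the short training times reported in the abstract.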
Voice based gender classification using machine learning
NASA Astrophysics Data System (ADS)
Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.
2017-11-01
Gender identification is one of the major problems in speech analysis today. It involves tracing gender from acoustic data, i.e., pitch, median, frequency, etc. Machine learning gives promising results for classification problems in all research domains, and several performance metrics are available to evaluate the algorithms of an area. We present a comparative model for evaluating five different machine learning algorithms on eight different metrics in gender classification from acoustic data. The aim is to identify gender with five algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM), on the basis of eight different metrics. The main parameter in evaluating any algorithm is its performance. In classification problems the misclassification rate must be low, which means that the accuracy rate must be high. The location and gender of a person have become crucial in economic markets in the form of AdSense. With this comparative model, we assess the different ML algorithms and find the best fit for gender classification of acoustic data.
Simulation and Community-Based Instruction of Vending Machines with Time Delay.
ERIC Educational Resources Information Center
Browder, Diane M.; And Others
1988-01-01
The study evaluated the use of simulated instruction on vending machine use as an adjunct to community-based instruction with two moderately retarded children. Results showed concurrent acquisition of the vending machine skills across trained and untrained sites. (Author/DB)
Interface Metaphors for Interactive Machine Learning
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jasper, Robert J.; Blaha, Leslie M.
To promote more interactive and dynamic machine learning, we revisit the notion of user-interface metaphors. User-interface metaphors provide intuitive constructs for supporting user needs through interface design elements. A user-interface metaphor provides a visual or action pattern that leverages a user’s knowledge of another domain. Metaphors suggest both the visual representations that should be used in a display as well as the interactions that should be afforded to the user. We argue that user-interface metaphors can also offer a method of extracting interaction-based user feedback for use in machine learning. Metaphors offer indirect, context-based information that can be used in addition to explicit user inputs, such as user-provided labels. Implicit information from user interactions with metaphors can augment explicit user input for active learning paradigms. Or it might be leveraged in systems where explicit user inputs are more challenging to obtain. Each interaction with the metaphor provides an opportunity to gather data and learn. We argue this approach is especially important in streaming applications, where we desire machine learning systems that can adapt to dynamic, changing data.
End-member modelling as a tool for climate reconstruction-An Eastern Mediterranean case study.
Beuscher, Sarah; Krüger, Stefan; Ehrmann, Werner; Schmiedl, Gerhard; Milker, Yvonne; Arz, Helge; Schulz, Hartmut
2017-01-01
The Eastern Mediterranean Sea is a sink for terrigenous sediments from North Africa, Europe and Asia Minor. Its sediments therefore provide valuable information on the climate dynamics in the source areas and the associated transport processes. We present a high-resolution dataset of sediment core M40/4_SL71, which was collected SW of Crete and spans the last ca. 180 kyr. We analysed the clay mineral composition, the grain size distribution within the silt fraction, and the abundance of major and trace elements. We tested the potential of end-member modelling on these sedimentological datasets as a tool for reconstructing the climate variability in the source regions and the associated detrital input. For each dataset, we modelled three end members. All end members were assigned to a specific provenance and sedimentary process. In total, three end members were related to the Saharan dust input, and five were related to the fluvial sediment input. One end member was strongly associated with the sapropel layers. The Saharan dust end members of the grain size and clay mineral datasets generally suggest enhanced dust export into the Eastern Mediterranean Sea during the dry phases with short-term increases during Heinrich events. During the African Humid Periods, dust export was reduced but may not have completely ceased. The loading patterns of two fluvial end members show a strong relationship with the Northern Hemisphere insolation, and all fluvial end members document enhanced input during the African Humid Periods. The sapropel end member most likely reflects the fixation of redox-sensitive elements within the anoxic sapropel layers. Our results exemplify that end-member modelling is a valuable tool for interpreting extensive and multidisciplinary datasets.
A Sensor-Based Method for Diagnostics of Machine Tool Linear Axes
Vogl, Gregory W.; Weiss, Brian A.; Donmez, M. Alkan
2017-01-01
A linear axis is a vital subsystem of machine tools, which are vital systems within many manufacturing operations. When installed and operating within a manufacturing facility, a machine tool needs to stay in good condition for parts production. All machine tools degrade during operations, yet knowledge of that degradation is elusive; specifically, accurately detecting degradation of linear axes is a manual and time-consuming process. Thus, manufacturers need automated and efficient methods to diagnose the condition of their machine tool linear axes without disruptions to production. The Prognostics and Health Management for Smart Manufacturing Systems (PHM4SMS) project at the National Institute of Standards and Technology (NIST) developed a sensor-based method to quickly estimate the performance degradation of linear axes. The multi-sensor-based method uses data collected from a ‘sensor box’ to identify changes in linear and angular errors due to axis degradation; the sensor box contains inclinometers, accelerometers, and rate gyroscopes to capture this data. The sensors are expected to be cost effective with respect to savings in production losses and scrapped parts for a machine tool. Numerical simulations, based on sensor bandwidth and noise specifications, show that changes in straightness and angular errors could be known with acceptable test uncertainty ratios. If a sensor box resides on a machine tool and data is collected periodically, then the degradation of the linear axes can be determined and used for diagnostics and prognostics to help optimize maintenance, production schedules, and ultimately part quality. PMID:28691039
Mazumder, Oishee; Kundu, Ananda Sankar; Lenka, Prasanna Kumar; Bhaumik, Subhasis
2016-10-01
Ambulatory activity classification is an active area of research for controlling and monitoring state initiation, termination, and transition in mobility assistive devices such as lower-limb exoskeletons. State transitions of lower-limb exoskeletons reported thus far are achieved mostly through the use of manual switches or state machine-based logic. In this paper, we propose a postural activity classifier using a 'dendogram-based support vector machine' (DSVM) which can be used to control a lower-limb exoskeleton. A pressure sensor-based wearable insole and two six-axis inertial measurement units (IMU) have been used for recognising two static and seven dynamic postural activities: sit, stand, sit-to-stand, stand-to-sit, level walk, fast walk, slope walk, stair ascent and stair descent. Most of the ambulatory activities are periodic in nature and have unique patterns of response. The proposed classification algorithm involves the recognition of activity patterns on the basis of the periodic shape of trajectories. Polynomial coefficients extracted from the hip angle trajectory and the centre-of-pressure (CoP) trajectory during an activity cycle are used as features to classify dynamic activities. The novelty of this paper lies in finding suitable instrumentation, developing post-processing techniques, and selecting shape-based features for ambulatory activity classification. The proposed activity classifier is used to identify the activity states of a lower-limb exoskeleton. The DSVM classifier algorithm achieved an overall classification accuracy of 95.2%. Copyright © 2016 Elsevier B.V. All rights reserved.
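Extracting polynomial coefficients as shape features can be sketched with a least-squares fit over one activity cycle. The polynomial degree, the normalisation of the cycle to the interval [0, 1], and the synthetic hip-angle curve are assumptions for this example; the abstract does not state the values the authors used.

```python
import numpy as np

def shape_features(trajectory, degree=3):
    """Least-squares polynomial coefficients (highest power first) for one cycle.

    The cycle is mapped onto a normalised phase in [0, 1] so that cycles of
    different durations give comparable coefficients.
    """
    phase = np.linspace(0.0, 1.0, len(trajectory))
    return np.polyfit(phase, trajectory, degree)

# Hypothetical hip-angle trajectory (degrees) sampled over one activity cycle
phase = np.linspace(0.0, 1.0, 50)
hip_angle = 30.0 * phase**2 - 20.0 * phase + 5.0

coeffs = shape_features(hip_angle, degree=2)
print(np.round(coeffs, 3))
```

Coefficient vectors computed this way from the hip angle and centre-of-pressure trajectories could be concatenated into the feature vector that a classifier receives.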
A Machine Learning Ensemble Classifier for Early Prediction of Diabetic Retinopathy.
S K, Somasundaram; P, Alli
2017-11-09
The main complication of diabetes is diabetic retinopathy (DR), a retinal vascular disease that leads to blindness. Regular screening for early DR detection is a labor- and resource-intensive task. Automatic detection of DR using computational techniques is therefore an attractive solution. An automatic method determines the presence of an abnormality in fundus images (FI) more reliably, but the classification process performs poorly. Recently, a few research works have analyzed the texture discrimination capacity in FI to distinguish healthy images. However, the feature extraction (FE) process did not perform well because of the high dimensionality. Therefore, a Machine Learning and Ensemble Classification method called the Machine Learning Bagging Ensemble Classifier (ML-BEC) is designed to identify retinal features for DR diagnosis and early detection. The ML-BEC method comprises two stages. The first stage extracts the candidate objects from retinal images (RI). The candidate objects, or features, for DR diagnosis include blood vessels, optic nerve, neural tissue, neuroretinal rim, optic disc size, thickness and variance. These features are initially extracted by applying a machine learning technique called t-distributed Stochastic Neighbor Embedding (t-SNE). t-SNE generates a probability distribution across high-dimensional images in which the images are separated into similar and dissimilar pairs. It then defines a similar probability distribution across the points in the low-dimensional map and minimizes the Kullback-Leibler divergence between the two distributions with respect to the locations of the points on the map. The second stage applies ensemble classifiers to the extracted features to provide accurate analysis of digital FI using machine learning. In this stage, an automatic detection
A system framework of inter-enterprise machining quality control based on fractal theory
NASA Astrophysics Data System (ADS)
Zhao, Liping; Qin, Yongtao; Yao, Yiyong; Yan, Peng
2014-03-01
In order to meet the quality control requirement of dynamic and complicated product machining processes among enterprises, a system framework of inter-enterprise machining quality control based on fractal was proposed. In this system framework, the fractal-specific characteristic of inter-enterprise machining quality control function was analysed, and the model of inter-enterprise machining quality control was constructed by the nature of fractal structures. Furthermore, the goal-driven strategy of inter-enterprise quality control and the dynamic organisation strategy of inter-enterprise quality improvement were constructed by the characteristic analysis on this model. In addition, the architecture of inter-enterprise machining quality control based on fractal was established by means of Web service. Finally, a case study for application was presented. The result showed that the proposed method was available, and could provide guidance for quality control and support for product reliability in inter-enterprise machining processes.
2010-01-01
Background Protein-protein interaction (PPI) plays essential roles in cellular functions. The cost, time and other limitations associated with the current experimental methods have motivated the development of computational methods for predicting PPIs. As protein interactions generally occur via domains instead of the whole molecules, predicting domain-domain interaction (DDI) is an important step toward PPI prediction. Computational methods developed so far have utilized information from various sources at different levels, from primary sequences, to molecular structures, to evolutionary profiles. Results In this paper, we propose a computational method to predict DDI using support vector machines (SVMs), based on domains represented as interaction profile hidden Markov models (ipHMM) where interacting residues in domains are explicitly modeled according to the three dimensional structural information available at the Protein Data Bank (PDB). Features about the domains are extracted first as the Fisher scores derived from the ipHMM and then selected using singular value decomposition (SVD). Domain pairs are represented by concatenating their selected feature vectors, and classified by a support vector machine trained on these feature vectors. The method is tested by leave-one-out cross validation experiments with a set of interacting protein pairs adopted from the 3DID database. The prediction accuracy has shown significant improvement as compared to InterPreTS (Interaction Prediction through Tertiary Structure), an existing method for PPI prediction that also uses the sequences and complexes of known 3D structure. Conclusions We show that domain-domain interaction prediction can be significantly enhanced by exploiting information inherent in the domain profiles via feature selection based on Fisher scores, singular value decomposition and supervised learning based on support vector machines. Datasets and source code are freely available on the web at http
NASA Astrophysics Data System (ADS)
Boghosian, A.; Child, S. F.; Kingslake, J.; Tedesco, M.; Bell, R. E.; Alexandrov, O.; McMichael, S.
2017-12-01
observing a pond and river reemerge after apparently freezing during the 2016-17 melt season. Using the ponds/rivers endmember scheme helps us to constrain the role storage and transport play on stabilizing ice shelves. By extending this analysis to other ice tongues and shelves we can better understand their vulnerability to a warming world.
AC Loss Analysis of MgB2-Based Fully Superconducting Machines
NASA Astrophysics Data System (ADS)
Feddersen, M.; Haran, K. S.; Berg, F.
2017-12-01
Superconducting electric machines have shown potential for significant increase in power density, making them attractive for size and weight sensitive applications such as offshore wind generation, marine propulsion, and hybrid-electric aircraft propulsion. Superconductors exhibit no loss under dc conditions, though ac current and field produce considerable losses due to hysteresis, eddy currents, and coupling mechanisms. For this reason, many present machines are designed to be partially superconducting, meaning that the dc field components are superconducting while the ac armature coils are conventional conductors. Fully superconducting designs can provide increases in power density with significantly higher armature current; however, a good estimate of ac losses is required to determine the feasibility under the machine's intended operating conditions. This paper aims to characterize the expected losses in a fully superconducting machine targeted towards aircraft, based on an actively-shielded, partially superconducting machine from prior work. Various factors are examined such as magnet strength, operating frequency, and machine load to produce a model for the loss in the superconducting components of the machine. This model is then used to optimize the design of the machine for minimal ac loss while maximizing power density. Important observations from the study are discussed.
Chunk Alignment for Corpus-Based Machine Translation
ERIC Educational Resources Information Center
Kim, Jae Dong
2011-01-01
Since sub-sentential alignment is critically important to the translation quality of an Example-Based Machine Translation (EBMT) system, which operates by finding and combining phrase-level matches against the training examples, we developed a new alignment algorithm for the purpose of improving the EBMT system's performance. This new…
Machine Trades. A Competency Based Articulated Curriculum.
ERIC Educational Resources Information Center
Mein, Jake; And Others
This document is a competency-based curriculum guide designed to promote articulation in machine trades vocational education programs between and among secondary and postsecondary institutions in the Indian Hills Community College and Merged Area XV high schools in Iowa. The guide is organized in 11 sections. The first six sections provide…
Rapid tomographic reconstruction based on machine learning for time-resolved combustion diagnostics
NASA Astrophysics Data System (ADS)
Yu, Tao; Cai, Weiwei; Liu, Yingzheng
2018-04-01
Optical tomography has recently attracted a surge of research effort owing to progress in both imaging concepts and sensor and laser technologies. The high spatial and temporal resolutions achievable by these methods provide an unprecedented opportunity for diagnosing complicated turbulent combustion. However, due to the high data throughput and the inefficiency of the prevailing iterative methods, the tomographic reconstructions, which are typically conducted off-line, are computationally formidable. In this work, we propose an efficient inversion method based on a machine learning algorithm, which can extract useful information from previous reconstructions and build efficient neural networks to serve as a surrogate model that rapidly predicts the reconstructions. The extreme learning machine is used here as an example for demonstration purposes, simply because of its ease of implementation, fast learning speed, and good generalization performance. Extensive numerical studies were performed, and the results show that the new method can dramatically reduce the computational time compared with the classical iterative methods. This technique is expected to be an alternative to existing methods when sufficient training data are available. Although this work is discussed in the context of tomographic absorption spectroscopy, we expect it to be useful also for other high-speed tomographic modalities such as volumetric laser-induced fluorescence and tomographic laser-induced incandescence, which have been demonstrated for combustion diagnostics.
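As a rough illustration of the extreme learning machine idea named in the abstract above (random, untrained hidden layer; output weights obtained by a single least-squares solve), the following minimal sketch fits a toy one-dimensional function. It is not the authors' surrogate model: the function names, the tanh activation, the weight range, the ridge term and the toy data are all assumptions made for illustration.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def train_elm(xs, ys, hidden=15, ridge=1e-3, seed=0):
    """Return a predictor f(x). Hidden weights are random and never trained;
    only the output weights beta are fitted (ridge-regularized least squares)."""
    rng = random.Random(seed)
    w = [rng.uniform(-4, 4) for _ in range(hidden)]   # assumed weight range
    b = [rng.uniform(-4, 4) for _ in range(hidden)]

    def features(x):
        return [math.tanh(wi * x + bi) for wi, bi in zip(w, b)]

    h = [features(x) for x in xs]
    # Normal equations: (H^T H + ridge * I) beta = H^T y
    hth = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
            for j in range(hidden)] for i in range(hidden)]
    hty = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(hidden)]
    beta = solve(hth, hty)
    return lambda x: sum(bi * fi for bi, fi in zip(beta, features(x)))


# Toy usage: learn y = sin(2*pi*x) from 50 samples.
xs = [i / 49 for i in range(50)]
ys = [math.sin(2 * math.pi * x) for x in xs]
model = train_elm(xs, ys)
mse = sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

Because training reduces to one linear solve, fitting is fast, which is the property the abstract cites as the reason for choosing this model family.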
Federal Register 2010, 2011, 2012, 2013, 2014
2011-12-28
... Determination Concerning Laser-Based Multi-Function Office Machines AGENCY: U.S. Customs and Border Protection... country of origin of laser-based multi-function office machines. Based upon the facts presented, CBP has... essential character of the laser-based multi-function office machine, and it is at their assembly and...
Application of Machine Learning in Urban Greenery Land Cover Extraction
NASA Astrophysics Data System (ADS)
Qiao, X.; Li, L. L.; Li, D.; Gan, Y. L.; Hou, A. Y.
2018-04-01
Urban greenery is a critical part of the modern city, and greenery coverage information is essential for land resource management, environmental monitoring and urban planning. Extracting urban greenery information from remote sensing images is challenging because trees and grassland are mixed with built-up areas. In this paper, we propose a new automatic pixel-based greenery extraction method using multispectral remote sensing images. The method includes three main steps. First, a small part of the images is manually interpreted to provide prior knowledge. Second, a five-layer neural network is trained and optimised with the manual extraction results, which are divided into training samples, verification samples and testing samples. Lastly, the trained neural network is applied to the unlabelled data to perform the greenery extraction. GF-2 and GJ-1 high-resolution multispectral remote sensing images were used to extract greenery coverage information in the built-up areas of city X. The method performs favourably over the 619 square kilometre area. In addition, compared with the traditional NDVI method, the proposed method gives a more accurate delineation of the greenery region. Owing to its low computational load and high accuracy, it has great potential for large-area automatic greenery extraction, which saves considerable manpower and resources.
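For context on the baseline mentioned in the abstract above, the traditional NDVI method thresholds the normalized difference between near-infrared and red reflectance, NDVI = (NIR - Red) / (NIR + Red). The sketch below shows that baseline only, not the proposed neural network; the function names and the 0.3 threshold are hypothetical choices for illustration.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom


def ndvi_mask(nir_band, red_band, threshold=0.3):
    """Label each pixel as greenery (True) when its NDVI exceeds the threshold.
    The threshold value is an assumption; in practice it is tuned per scene."""
    return [[ndvi(n, r) > threshold for n, r in zip(nir_row, red_row)]
            for nir_row, red_row in zip(nir_band, red_band)]


# Toy 1 x 2 image: a vegetated pixel and a built-up pixel.
mask = ndvi_mask([[0.5, 0.2]], [[0.1, 0.2]])
```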
Machine Learning methods for Quantitative Radiomic Biomarkers.
Parmar, Chintan; Grossmann, Patrick; Bussink, Johan; Lambin, Philippe; Aerts, Hugo J W L
2015-08-17
Radiomics extracts and mines a large number of medical imaging features quantifying tumor phenotypic characteristics. Highly accurate and reliable machine-learning approaches can drive the success of radiomic applications in clinical care. In this radiomic study, fourteen feature selection methods and twelve classification methods were examined in terms of their performance and stability for predicting overall survival. A total of 440 radiomic features were extracted from pre-treatment computed tomography (CT) images of 464 lung cancer patients. To ensure the unbiased evaluation of different machine-learning methods, publicly available implementations along with reported parameter configurations were used. Furthermore, we used two independent radiomic cohorts for training (n = 310 patients) and validation (n = 154 patients). We identified that the Wilcoxon test based feature selection method WLCX (stability = 0.84 ± 0.05, AUC = 0.65 ± 0.02) and the random forest classification method RF (RSD = 3.52%, AUC = 0.66 ± 0.03) had the highest prognostic performance with high stability against data perturbation. Our variability analysis indicated that the choice of classification method is the most dominant source of performance variation (34.21% of total variance). Identification of optimal machine-learning methods for radiomic applications is a crucial step towards stable and clinically relevant radiomic biomarkers, providing a non-invasive way of quantifying and monitoring tumor-phenotypic characteristics in clinical practice.
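The two quantities reported in the abstract above are closely related: the AUC of a single score equals the Wilcoxon rank-sum (Mann-Whitney U) statistic divided by the number of positive-negative pairs. The sketch below computes AUC that way and ranks features by how far their AUC is from 0.5. This is an illustrative univariate ranking, not the study's WLCX implementation; the function names and toy data are assumptions.

```python
def rank_sum_auc(pos_scores, neg_scores):
    """AUC = P(pos > neg) + 0.5 * P(pos == neg), i.e. U / (n_pos * n_neg)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def rank_features(features, labels):
    """Rank features (dict: name -> list of values) by |AUC - 0.5|, best first."""
    scored = []
    for name, values in features.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        scored.append((name, rank_sum_auc(pos, neg)))
    return sorted(scored, key=lambda item: abs(item[1] - 0.5), reverse=True)


# Toy data: two hypothetical features for four patients.
toy = {"f1": [0.1, 0.2, 0.8, 0.9], "f2": [0.5, 0.4, 0.6, 0.5]}
ranking = rank_features(toy, [0, 0, 1, 1])
```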
[Card-based age control mechanisms at tobacco vending machines. Effect and consequences].
Schneider, S; Meyer, C; Löber, S; Röhrig, S; Solle, D
2010-02-01
Until recently, 700,000 tobacco vending machines provided uncontrolled access to cigarettes for children and adolescents in Germany. On January 1, 2007, a card-based electronic locking device was attached to all tobacco vending machines to prevent the purchase of cigarettes by children and adolescents under 16. Starting in 2009, only persons older than 18 are able to buy cigarettes from tobacco vending machines. The aim of the present investigation (SToP Study: "Sources of Tobacco for Pupils" Study) was to assess changes in the number of tobacco vending machines after the introduction of these new technical devices (supplier's reaction). In addition, the ways smoking adolescents make purchases were assessed (consumer's reaction). We registered and mapped the total number of tobacco points of sale (tobacco POS) before and after the introduction of the card-based electronic locking device in two selected districts of the city of Cologne. Furthermore, pupils from local schools (response rate: 83%) were asked about their tobacco consumption and ways of purchase using a questionnaire. Results indicated that in the area investigated the total number of tobacco POSs decreased from 315 in 2005 to 277 in 2007. The rates of decrease were 48% for outdoor vending machines and 8% for indoor vending machines. Adolescents reported circumventing the card-based electronic locking devices (e.g., by using cards from older friends) and using other tobacco POSs (especially newspaper kiosks) or relying on their social network (mainly friends). The decreasing number of tobacco vending machines has not had a significant impact on cigarette acquisition by adolescent smokers as they tend to circumvent the newly introduced security measures.
End-member modelling as a tool for climate reconstruction—An Eastern Mediterranean case study
Krüger, Stefan; Ehrmann, Werner; Schmiedl, Gerhard; Milker, Yvonne; Arz, Helge; Schulz, Hartmut
2017-01-01
The Eastern Mediterranean Sea is a sink for terrigenous sediments from North Africa, Europe and Asia Minor. Its sediments therefore provide valuable information on the climate dynamics in the source areas and the associated transport processes. We present a high-resolution dataset of sediment core M40/4_SL71, which was collected SW of Crete and spans the last ca. 180 kyr. We analysed the clay mineral composition, the grain size distribution within the silt fraction, and the abundance of major and trace elements. We tested the potential of end-member modelling on these sedimentological datasets as a tool for reconstructing the climate variability in the source regions and the associated detrital input. For each dataset, we modelled three end members. All end members were assigned to a specific provenance and sedimentary process. In total, three end members were related to the Saharan dust input, and five were related to the fluvial sediment input. One end member was strongly associated with the sapropel layers. The Saharan dust end members of the grain size and clay mineral datasets generally suggest enhanced dust export into the Eastern Mediterranean Sea during the dry phases with short-term increases during Heinrich events. During the African Humid Periods, dust export was reduced but may not have completely ceased. The loading patterns of two fluvial end members show a strong relationship with the Northern Hemisphere insolation, and all fluvial end members document enhanced input during the African Humid Periods. The sapropel end member most likely reflects the fixation of redox-sensitive elements within the anoxic sapropel layers. Our results exemplify that end-member modelling is a valuable tool for interpreting extensive and multidisciplinary datasets. PMID:28934332
NASA Astrophysics Data System (ADS)
Jia, Xiaodong; Jin, Chao; Buzza, Matt; Di, Yuan; Siegel, David; Lee, Jay
2018-01-01
Successful applications of Diffusion Map (DM) in machine failure detection and diagnosis have been reported in several recent studies. DM provides an efficient way to visualize high-dimensional, complex and nonlinear machine data, and thus yields more knowledge about the machine under monitoring. In this paper, a DM-based methodology named DM-EVD is proposed for machine degradation assessment, abnormality detection and diagnosis in an online fashion. Several limitations and challenges of using DM for machine health monitoring have been analyzed and addressed. Based on the proposed DM-EVD, a deviation-based methodology is then proposed to include more dimension reduction methods. In this work, the incorporation of Laplacian Eigen-map and Principal Component Analysis (PCA) is explored, and the latter algorithm, named PCA-Dev, is validated in the case study. To show the successful application of the proposed methodology, case studies from diverse fields are presented and investigated in this work. Improved results are reported by benchmarking with other machine learning algorithms.
Porting Gravitational Wave Signal Extraction to Parallel Virtual Machine (PVM)
NASA Technical Reports Server (NTRS)
Thirumalainambi, Rajkumar; Thompson, David E.; Redmon, Jeffery
2009-01-01
Laser Interferometer Space Antenna (LISA) is a planned NASA-ESA mission to be launched around 2012. Gravitational wave detection is fundamentally the determination of frequency, source parameters, and waveform amplitude derived in a specific order from the interferometric time-series of the rotating LISA spacecraft. The LISA Science Team has developed a Mock LISA Data Challenge intended to promote the testing of complicated nested search algorithms to detect the 100-1 millihertz frequency signals at amplitudes of 10E-21. However, it has become clear that sequential search of the parameters is very time consuming and ultra-sensitive; hence, a new strategy has been developed. Parallelization of existing sequential search algorithms for gravitational wave signal identification consists of decomposing sequential search loops, beginning with the outermost loops and working inward. In this process, the main challenge is to detect interdependencies among loops and to partition the loops so as to preserve concurrency. Existing parallel programs are based upon either shared memory or distributed memory paradigms. In PVM, master and node programs are used to execute parallelization and process spawning. PVM can handle process management and process addressing schemes using a virtual machine configuration. The task scheduling and the messaging and signaling can be implemented efficiently for the LISA gravitational wave search process using a master and 6 nodes. This approach is accomplished using a server that is available at NASA Ames Research Center and has been dedicated to the LISA Data Challenge Competition. Historically, gravitational wave and source identification parameters have taken around 7 days on this dedicated single-threaded Linux-based server. Using the PVM approach, the parameter extraction problem can be reduced to within a day. The low frequency computation and a proxy signal-to-noise ratio are calculated in separate nodes that are controlled by the master
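The loop decomposition described in the abstract above can be sketched as follows: the outermost search loop is partitioned into independent slices, each slice is handed to a worker, and a master collects the per-slice results. This sketch does not use PVM; it uses Python's standard thread pool purely to show the structure, so it illustrates the partitioning rather than any real speed-up. The names `partition`, `search_chunk`, `parallel_search` and the toy scoring function are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_parts):
    """Split items into n_parts contiguous chunks of near-equal size."""
    base, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def search_chunk(frequencies, amplitudes, score):
    """Worker: run the inner loop for one slice of the outer (frequency) loop."""
    best = None
    for f in frequencies:
        for a in amplitudes:
            s = score(f, a)
            if best is None or s > best[0]:
                best = (s, f, a)
    return best


def parallel_search(frequencies, amplitudes, score, n_workers=6):
    """Master: partition the outer loop, dispatch slices, keep the best result.
    Slices are independent, so no loop-carried dependency crosses a boundary."""
    chunks = [c for c in partition(frequencies, n_workers) if c]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(lambda c: search_chunk(c, amplitudes, score), chunks))
    return max(results)


# Toy usage: the score peaks at frequency 3, amplitude 2.
toy_score = lambda f, a: -(f - 3) ** 2 - (a - 2) ** 2
best = parallel_search(list(range(10)), list(range(5)), toy_score)
```

The six workers mirror the "master and 6 nodes" configuration in the abstract; a real PVM port would replace the thread pool with spawned node processes and explicit message passing.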
Li, Lishuang; Zhang, Panpan; Zheng, Tianfu; Zhang, Hongying; Jiang, Zhenchao; Huang, Degen
2014-01-01
Protein-Protein Interaction (PPI) extraction is an important task in biomedical information extraction. Many machine learning methods for PPI extraction have achieved promising results, but their performance is still not satisfactory. One reason is that semantic resources have largely been ignored. In this paper, we propose a multiple-kernel learning-based approach to extract PPIs, combining a feature-based kernel, a tree kernel and a semantic kernel. In particular, we extend the shortest path-enclosed tree kernel (SPT) with a dynamic extension strategy to retrieve richer syntactic information. Our semantic kernel calculates the protein-protein pair similarity and the context similarity based on two semantic resources: WordNet and Medical Subject Heading (MeSH). We evaluate our method with a Support Vector Machine (SVM) and achieve an F-score of 69.40% and an AUC of 92.00%, which shows that, by integrating semantic information, our method outperforms most of the state-of-the-art systems.
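A common way to combine several kernels, sketched here as an assumption since the abstract above does not state the exact combination rule, is a non-negative weighted sum of their Gram matrices, which is again a valid kernel. The sketch also shows how an F-score is computed from true positives, false positives and false negatives. The matrices, weights and counts are toy values, not the paper's data.

```python
def combine_kernels(grams, weights):
    """Weighted sum of same-sized Gram matrices (each a list of lists).
    With non-negative weights the result is still a valid kernel matrix."""
    n = len(grams[0])
    return [[sum(w * g[i][j] for g, w in zip(grams, weights)) for j in range(n)]
            for i in range(n)]


def f_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy usage: two 2 x 2 Gram matrices (e.g. a feature kernel and a tree kernel).
k_feature = [[1.0, 0.0], [0.0, 1.0]]
k_tree = [[2.0, 2.0], [2.0, 2.0]]
k_combined = combine_kernels([k_feature, k_tree], [0.5, 0.5])
```

The combined matrix can then be passed to any SVM implementation that accepts a precomputed kernel.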
Detection of distorted frames in retinal video-sequences via machine learning
NASA Astrophysics Data System (ADS)
Kolar, Radim; Liberdova, Ivana; Odstrcilik, Jan; Hracho, Michal; Tornow, Ralf P.
2017-07-01
This paper describes the detection of distorted frames in retinal sequences based on a set of global features extracted from each frame. The feature vector is subsequently used in the classification step, in which three types of classifiers are tested. The best classification accuracy of 96% was achieved with the support vector machine approach.
A new machine classification method applied to human peripheral blood leukocytes
NASA Technical Reports Server (NTRS)
Rorvig, Mark E.; Fitzpatrick, Steven J.; Vitthal, Sanjay; Ladoulis, Charles T.
1994-01-01
Human beings judge images by complex mental processes, whereas computing machines extract features. By reducing scaled human judgments and machine extracted features to a common metric space and fitting them by regression, the judgments of human experts rendered on a sample of images may be imposed on an image population to provide automatic classification.
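The regression step described in the abstract above can be illustrated with an ordinary least-squares line that maps a single machine-extracted feature onto scaled human judgments, and then applies the fitted line to unjudged images. The single-feature setting, the function name and the toy numbers are assumptions; the abstract does not specify the regression form.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy sample: machine feature values and expert judgments for four images.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
# Impose the fitted judgment scale on an image outside the sample.
predicted = slope * 4 + intercept
```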
Yan, Jianjun; Shen, Xiaojing; Wang, Yiqin; Li, Fufeng; Xia, Chunming; Guo, Rui; Chen, Chunfeng; Shen, Qingwei
2010-01-01
This study aims to use the Wavelet Packet Transform (WPT) and the Support Vector Machine (SVM) algorithm for objective analysis and quantitative research on auscultation in Traditional Chinese Medicine (TCM) diagnosis. First, Wavelet Packet Decomposition (WPD) at level 6 was employed to split the auscultation signals into finer frequency bands. Then statistical analysis was performed on the Wavelet Packet Energy (WPE) features extracted from the WPD coefficients. Furthermore, pattern recognition through SVM was used to distinguish the statistical feature values of the different sample groups among mixed subjects. Finally, the experimental results showed that the classification accuracies were at a high level.
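To make the feature extraction above concrete, the sketch below performs a full wavelet packet decomposition and returns the energy of each terminal band; at level 6 this gives 2^6 = 64 energy features. The Haar wavelet is used only because it is the simplest orthonormal choice; the abstract does not name the wavelet, so that choice, the function names and the test signal are assumptions.

```python
import math


def haar_split(signal):
    """One orthonormal Haar step: returns (approximation, detail) coefficients."""
    s = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def wavelet_packet_energy(signal, level):
    """Split every node at every level (full packet tree) and return the
    energy of each terminal node. Nodes are in natural tree order, which is
    not the same as increasing-frequency order."""
    if len(signal) % (2 ** level) != 0:
        raise ValueError("signal length must be a multiple of 2**level")
    nodes = [list(signal)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            next_nodes.extend(haar_split(node))
        nodes = next_nodes
    return [sum(c * c for c in node) for node in nodes]


# Toy usage: 128-sample sinusoid decomposed to level 6.
toy_signal = [math.sin(0.3 * i) for i in range(128)]
energies = wavelet_packet_energy(toy_signal, 6)
```

Because each Haar step is orthonormal, the band energies sum to the energy of the original signal, so normalized energies can be used directly as a feature vector for a classifier.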
Feature extraction in MFL signals of machined defects in steel tubes
NASA Astrophysics Data System (ADS)
Perazzo, R.; Pignotti, A.; Reich, S.; Stickar, P.
2001-04-01
Thirty defects of various shapes were machined on the external and internal wall surfaces of a 177 mm diameter ferromagnetic steel pipe. MFL signals were digitized and recorded at a frequency of 4 kHz. Various magnetizing currents and relative tube-probe velocities of the order of 2 m/s were used. The identification of the location of the defect by a principal component/neural network analysis of the signal is shown to be more effective than the standard procedure of classification based on the average signal frequency.
An illustration of new methods in machine condition monitoring, Part I: stochastic resonance
NASA Astrophysics Data System (ADS)
Worden, K.; Antoniadou, I.; Marchesiello, S.; Mba, C.; Garibaldi, L.
2017-05-01
There have been many recent developments in the application of data-based methods to machine condition monitoring. A powerful methodology based on machine learning has emerged, where diagnostics are based on a two-step procedure: extraction of damage-sensitive features, followed by unsupervised learning (novelty detection) or supervised learning (classification). The objective of the current pair of papers is simply to illustrate one state-of-the-art procedure for each step, using synthetic data representative of reality in terms of size and complexity. The first paper in the pair will deal with feature extraction. Although some papers have appeared in the recent past considering stochastic resonance as a means of amplifying damage information in signals, they have largely relied on ad hoc specifications of the resonator used. In contrast, the current paper will adopt a principled optimisation-based approach to the resonator design. The paper will also show that a discrete dynamical system can provide all the benefits of a continuous system, but also provide a considerable speed-up in terms of simulation time in order to facilitate the optimisation approach.
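As background for the resonator discussed above, a classic stochastic resonance setup passes a weak signal plus noise through a bistable system, dx/dt = a*x - b*x^3 + s(t) + noise. The sketch below is a simple Euler-Maruyama discretization of that textbook system. It is not the paper's optimized discrete resonator; the parameter values, the function name and the toy input are assumptions chosen only so the example runs.

```python
import math
import random


def bistable_resonator(signal, a=1.0, b=1.0, noise=0.3, dt=0.05, seed=0):
    """Drive a double-well system (wells near +/- sqrt(a/b)) with a weak signal.
    `noise` is the noise intensity D; each step adds sqrt(2*D*dt) * N(0, 1)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for s in signal:
        drift = a * x - b * x ** 3 + s
        x = x + dt * drift + math.sqrt(2 * noise * dt) * rng.gauss(0, 1)
        out.append(x)
    return out


# Toy usage: a weak sinusoid that alone cannot push the state across the barrier.
weak = [0.3 * math.sin(2 * math.pi * 0.01 * n) for n in range(500)]
response = bistable_resonator(weak)
```

In an optimisation-based design, parameters such as a, b and the noise level would be tuned to maximize a feature of the response; which objective the paper uses is not stated in the abstract.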
Xu, Rong; Li, Li; Wang, QuanQiu
2013-01-01
Motivation: Systems approaches to studying phenotypic relationships among diseases are emerging as an active area of research for both novel disease gene discovery and drug repurposing. Currently, systematic study of disease phenotypic relationships on a phenome-wide scale is limited because large-scale machine-understandable disease–phenotype relationship knowledge bases are often unavailable. Here, we present an automatic approach to extract disease–manifestation (D-M) pairs (one specific type of disease–phenotype relationship) from the wide body of published biomedical literature. Data and Methods: Our method leverages external knowledge and limits the amount of human effort required. For the text corpus, we used 119 085 682 MEDLINE sentences (21 354 075 citations). First, we used D-M pairs from existing biomedical ontologies as prior knowledge to automatically discover D-M–specific syntactic patterns. We then extracted additional pairs from MEDLINE using the learned patterns. Finally, we analysed correlations between disease manifestations and disease-associated genes and drugs to demonstrate the potential of this newly created knowledge base in disease gene discovery and drug repurposing. Results: In total, we extracted 121 359 unique D-M pairs with a high precision of 0.924. Among the extracted pairs, 120 419 (99.2%) have not been captured in existing structured knowledge sources. We have shown that disease manifestations correlate positively with both disease-associated genes and drug treatments. Conclusions: The main contribution of our study is the creation of a large-scale and accurate D-M phenotype relationship knowledge base. This unique knowledge base, when combined with existing phenotypic, genetic and proteomic datasets, can have profound implications in our deeper understanding of disease etiology and in rapid drug repurposing. Availability: http://nlp.case.edu/public/data/DMPatternUMLS/ Contact: rxx@case.edu PMID:23828786
Objective Auscultation of TCM Based on Wavelet Packet Fractal Dimension and Support Vector Machine.
Yan, Jian-Jun; Guo, Rui; Wang, Yi-Qin; Liu, Guo-Ping; Yan, Hai-Xia; Xia, Chun-Ming; Shen, Xiaojing
2014-01-01
This study was conducted to illustrate that auscultation features based on the fractal dimension combined with wavelet packet transform (WPT) were conducive to the identification of the pattern of syndromes of Traditional Chinese Medicine (TCM). The WPT and the fractal dimension were employed to extract features of auscultation signals of 137 patients with lung Qi-deficient pattern, 49 patients with lung Yin-deficient pattern, and 43 healthy subjects. With these features, the classification model was constructed based on multiclass support vector machine (SVM). When all auscultation signals were trained by SVM to decide the patterns of TCM syndromes, the overall recognition rate of the model was 79.49%; when male and female auscultation signals were trained separately to decide the patterns, the overall recognition rate of the model reached 86.05%. The results showed that the methods proposed in this paper were effective for analyzing auscultation signals, and that the performance of the model can be greatly improved when the distinction of gender is considered.
Objective Auscultation of TCM Based on Wavelet Packet Fractal Dimension and Support Vector Machine
Yan, Jian-Jun; Wang, Yi-Qin; Liu, Guo-Ping; Yan, Hai-Xia; Xia, Chun-Ming; Shen, Xiaojing
2014-01-01
This study was conducted to illustrate that auscultation features based on the fractal dimension combined with wavelet packet transform (WPT) were conducive to the identification of the pattern of syndromes of Traditional Chinese Medicine (TCM). The WPT and the fractal dimension were employed to extract features of auscultation signals of 137 patients with lung Qi-deficient pattern, 49 patients with lung Yin-deficient pattern, and 43 healthy subjects. With these features, the classification model was constructed based on multiclass support vector machine (SVM). When all auscultation signals were trained by SVM to decide the patterns of TCM syndromes, the overall recognition rate of the model was 79.49%; when male and female auscultation signals were trained separately to decide the patterns, the overall recognition rate of the model reached 86.05%. The results showed that the methods proposed in this paper were effective for analyzing auscultation signals, and that the performance of the model can be greatly improved when the distinction of gender is considered. PMID:24883068
Support vector machine in machine condition monitoring and fault diagnosis
NASA Astrophysics Data System (ADS)
Widodo, Achmad; Yang, Bo-Suk
2007-08-01
Recently, machine condition monitoring and fault diagnosis as part of a maintenance system has become an issue of global interest because of the potential advantages to be gained from reduced maintenance costs, improved productivity and increased machine availability. This paper presents a survey of machine condition monitoring and fault diagnosis using support vector machine (SVM). It attempts to summarize and review the recent research and developments of SVM in machine condition monitoring and diagnosis. Numerous methods have been developed based on intelligent systems such as artificial neural network, fuzzy expert system, condition-based reasoning, random forest, etc. However, the use of SVM for machine condition monitoring and fault diagnosis is still rare. SVM has excellent generalization performance, so it can produce high classification accuracy for machine condition monitoring and diagnosis. Up to 2006, the use of SVM in machine condition monitoring and fault diagnosis has tended to develop towards expertise orientation and problem-oriented domains. Finally, the ability to continually change and obtain novel ideas for machine condition monitoring and fault diagnosis using SVM is left as future work.
Zhang, Xin; Cui, Jintian; Wang, Weisheng; Lin, Chao
2017-01-01
To address the problem of image texture feature extraction, a direction measure statistic based on the directionality of image texture is constructed, and a new method of texture feature extraction, based on a fusion of the direction measure and the gray level co-occurrence matrix (GLCM), is proposed in this paper. This method applies the GLCM to extract the texture feature value of an image and integrates the weight factor introduced by the direction measure to obtain the final texture feature of the image. A set of classification experiments on high-resolution remote sensing images was performed using a support vector machine (SVM) classifier with the direction measure and GLCM fusion algorithm. Both qualitative and quantitative approaches were applied to assess the classification results. The experimental results demonstrated that texture feature extraction based on the fusion algorithm achieved better image recognition, and that the accuracy of classification based on this method was significantly improved. PMID:28640181
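The GLCM component of the method above can be sketched as follows: count how often gray level i is followed by gray level j at a fixed pixel offset, normalize the counts to probabilities, and derive a texture statistic such as contrast. The direction-measure weighting is not defined in the abstract and is therefore not implemented here; the function names, the offset convention and the toy image are assumptions.

```python
def glcm(image, dx=1, dy=0, levels=4):
    """Normalized gray level co-occurrence matrix for the offset (dx, dy).
    `image` is a list of rows with integer gray levels in range(levels)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """GLCM contrast: sum over (i, j) of p[i][j] * (i - j)**2."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


# Toy 4 x 4 image with four gray levels; horizontal neighbor offset.
toy_image = [[0, 0, 1, 1],
             [0, 0, 1, 1],
             [0, 2, 2, 2],
             [2, 2, 3, 3]]
p_horizontal = glcm(toy_image, dx=1, dy=0)
texture = contrast(p_horizontal)
```

Computing the same statistic for several offsets (for example horizontal, vertical and the two diagonals) gives the per-direction values that a direction-based weight could then combine.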
NASA Astrophysics Data System (ADS)
Yu, Jianbo
2015-12-01
Prognostics is an effective means of achieving zero-downtime performance, maximum productivity and proactive maintenance of machines. Prognostics aims to assess and predict the time evolution of machine health degradation so that machine failures can be predicted and prevented. A novel prognostics system is developed based on a data-model-fusion scheme using the Bayesian inference-based self-organizing map (SOM) and an integration of logistic regression (LR) and high-order particle filtering (HOPF). In this prognostics system, a baseline SOM is constructed to model the data distribution space of a healthy machine under the assumption that predictable fault patterns are not available. A Bayesian inference-based probability (BIP) derived from the baseline SOM is developed as a quantitative indicator of machine health degradation. BIP can provide a failure probability for the monitored machine, which has an intuitive interpretation in terms of the health degradation state. Based on the historic BIPs, the constructed LR and its modeling noise constitute a high-order Markov process (HOMP) that describes machine health propagation. HOPF is used to solve the HOMP estimation and predict the evolution of the machine health in the form of a probability density function (PDF). An on-line model update scheme is developed to quickly adapt the Markov process to changes in machine health dynamics. The experimental results on a bearing test-bed illustrate the potential of the proposed system as an effective and simple tool for machine health prognostics.
English to Sanskrit Machine Translation Using Transfer Based approach
NASA Astrophysics Data System (ADS)
Pathak, Ganesh R.; Godse, Sachin P.
2010-11-01
Translation is one of the needs of a global society for communicating the thoughts and ideas of one country to another. Translation is the process of interpreting the meaning of a text and subsequently producing an equivalent text, that is, communicating the same meaning (message) in another language. In this paper we give detailed information on how to convert source-language text into target-language text using the transfer-based approach to machine translation. We implemented an English to Sanskrit machine translator using this approach. English is a global language used for business and communication, but a large part of the population in India does not use or understand English. Sanskrit is an ancient language of India, and most of the languages in India are derived from it. Sanskrit can therefore act as an intermediate language for multilingual translation.
Pre-use anesthesia machine check; certified anesthesia technician based quality improvement audit.
Al Suhaibani, Mazen; Al Malki, Assaf; Al Dosary, Saad; Al Barmawi, Hanan; Pogoku, Mahdhav
2014-01-01
This audit addresses quality assurance in providing a work-ready anesthesia machine in multiple operating rooms of a modern tertiary medical center in Riyadh. The aim of the study is to maintain a high-quality environment for workers and patients in surgical operating rooms. A technician-based audit used key performance indicators to assure that each machine was inspected and passed as worthy for use, daily and between cases, and that in case of unexpected failure it was quickly replaced by another ready-to-use anesthetic machine. The anesthetic machines in all operating rooms are inspected daily and continuously, passed as ready by technicians, and verified by an anesthesiologist consultant or assistant consultant. The daily records of each machine were collected and then reviewed by the quality improvement committee for descriptive analysis, reporting the degree of staff compliance with daily inspection as "met" items, machines replaced during use, and overall compliance. Descriptive statistics were produced using Microsoft Excel 2003 tables and graphs of sums and percentages of the items studied in this audit. The audit found a high compliance percentage and a low rate of machine replacement, the latter indicating unexpected machine states during use followed by a quick machine switch. The authors conclude that regular inspection and running the self-check recommended by the manufacturers can help avert the hazard of anesthesia machine failure during an operation. Furthermore, quick replacement of the anesthesia machine when an unexpected reason arises contributes to highly assured operative utilization of the man-machine interface in modern surgical operating rooms.
[Study on high strength mica-based machinable glass-ceramic].
Li, Hong; Ran, Junguo; Gou, Li; Wang, Fanghu
2004-02-01
The phase constitution, microstructure and properties of a new type of machinable glass-ceramic containing fluorophlogopite-type (FPT) Ca-mica for use in restorative dentistry were investigated. According to the results of X-ray diffraction (XRD) and energy-dispersive spectrometry (EDS), its main crystalline phases were FPT Ca-mica and t-ZrO2, together with small amounts of KxCa(1-x)/2Mg2Si4O10F2 and m-ZrO2. The flexural strength was 235 MPa, nearly twice that of present mica-based dental materials, and the highest fracture toughness was 2.17 MPa·m^1/2. The microstructure had a great effect on properties: glass-ceramics containing a large volume of fine crystals showed higher strength. The material possessed the typical microstructure of machinable glass-ceramics and displayed excellent machinability during drilling tests and CAD/CAM.
TEACHING PHYSICS: A computer-based revitalization of Atwood's machine
NASA Astrophysics Data System (ADS)
Trumper, Ricardo; Gelbman, Moshe
2000-09-01
Atwood's machine is used in a microcomputer-based experiment to demonstrate Newton's second law with considerable precision. The friction force on the masses and the moment of inertia of the pulley can also be estimated.
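The physics behind this experiment can be sketched with the standard Atwood relation. The form below is a textbook idealization, not the authors' analysis: it assumes friction acts as a constant effective force opposing the motion (with m1 heavier than m2), a pulley of radius `radius` and moment of inertia `inertia`, and a string that does not slip. The function name and parameters are hypothetical.

```python
def atwood_acceleration(m1, m2, g=9.81, friction=0.0, inertia=0.0, radius=1.0):
    """Acceleration of an Atwood machine with m1 > m2.

    Newton's second law for the whole system:
        (m1 - m2) * g - friction = (m1 + m2 + inertia / radius**2) * a
    """
    return ((m1 - m2) * g - friction) / (m1 + m2 + inertia / radius ** 2)

# Ideal machine (massless pulley, no friction)
print(atwood_acceleration(0.55, 0.45))
# Same masses with a pulley of inertia 1e-4 kg m^2 and radius 2 cm
print(atwood_acceleration(0.55, 0.45, inertia=1e-4, radius=0.02))
```

Under these assumptions, measuring the acceleration for several mass differences gives a straight line in (m1 - m2) whose intercept yields the friction force and whose slope yields the effective inertial mass, which is one way the two quantities mentioned in the abstract could be estimated.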
Association Rule-based Predictive Model for Machine Failure in Industrial Internet of Things
NASA Astrophysics Data System (ADS)
Kwon, Jung-Hyok; Lee, Sol-Bee; Park, Jaehoon; Kim, Eui-Jik
2017-09-01
This paper proposes an association rule-based predictive model for machine failure in industrial Internet of things (IIoT), which can accurately predict the machine failure in real manufacturing environment by investigating the relationship between the cause and type of machine failure. To develop the predictive model, we consider three major steps: 1) binarization, 2) rule creation, 3) visualization. The binarization step translates item values in a dataset into one or zero, then the rule creation step creates association rules as IF-THEN structures using the Lattice model and Apriori algorithm. Finally, the created rules are visualized in various ways for users’ understanding. An experimental implementation was conducted using R Studio version 3.3.2. The results show that the proposed predictive model realistically predicts machine failure based on association rules.
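The binarization and rule-creation steps can be sketched in a few lines. This is an illustrative, unoptimized Apriori written in Python rather than the R environment the authors used; the sensor names, thresholds and records are invented for the example, and the Lattice model and visualization step are not reproduced.

```python
from itertools import combinations

def binarize(records, thresholds):
    """Map each numeric record to the set of items whose value reaches its threshold."""
    return [frozenset(k for k, v in rec.items() if v >= thresholds[k])
            for rec in records]

def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search; returns {itemset: support}."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while level:
        support = {c: sum(c <= t for t in transactions) / n for c in level}
        kept = {c: s for c, s in support.items() if s >= min_support}
        frequent.update(kept)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in kept for b in kept if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in kept for s in combinations(c, k - 1))]
    return frequent

def rules(frequent, min_confidence):
    """IF antecedent THEN consequent rules with confidence >= min_confidence."""
    out = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                conf = supp / frequent[antecedent]
                if conf >= min_confidence:
                    out.append((set(antecedent), set(itemset - antecedent), conf))
    return out

# Hypothetical sensor records; names and thresholds are illustrative only.
records = [
    {"temp": 90, "vib": 0.9, "fail": 1},
    {"temp": 95, "vib": 0.2, "fail": 1},
    {"temp": 40, "vib": 0.8, "fail": 0},
    {"temp": 92, "vib": 0.7, "fail": 1},
]
thresholds = {"temp": 80, "vib": 0.5, "fail": 1}
transactions = binarize(records, thresholds)
frequent = frequent_itemsets(transactions, min_support=0.5)
for antecedent, consequent, conf in rules(frequent, min_confidence=0.9):
    print(sorted(antecedent), "->", sorted(consequent), conf)
```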
Building a protein name dictionary from full text: a machine learning term extraction approach.
Shi, Lei; Campagne, Fabien
2005-04-07
The majority of information in the biological literature resides in full text articles, instead of abstracts. Yet, abstracts remain the focus of many publicly available literature data mining tools. Most literature mining tools rely on pre-existing lexicons of biological names, often extracted from curated gene or protein databases. This is a limitation, because such databases have low coverage of the many name variants which are used to refer to biological entities in the literature. We present an approach to recognize named entities in full text. The approach collects high frequency terms in an article, and uses support vector machines (SVM) to identify biological entity names. It is also computationally efficient and robust to noise commonly found in full text material. We use the method to create a protein name dictionary from a set of 80,528 full text articles. Only 8.3% of the names in this dictionary match SwissProt description lines. We assess the quality of the dictionary by studying its protein name recognition performance in full text. This dictionary term lookup method compares favourably to other published methods, supporting the significance of our direct extraction approach. The method is strong in recognizing name variants not found in SwissProt.
Building a protein name dictionary from full text: a machine learning term extraction approach
Shi, Lei; Campagne, Fabien
2005-01-01
Background: The majority of information in the biological literature resides in full text articles, instead of abstracts. Yet, abstracts remain the focus of many publicly available literature data mining tools. Most literature mining tools rely on pre-existing lexicons of biological names, often extracted from curated gene or protein databases. This is a limitation, because such databases have low coverage of the many name variants which are used to refer to biological entities in the literature. Results: We present an approach to recognize named entities in full text. The approach collects high frequency terms in an article, and uses support vector machines (SVM) to identify biological entity names. It is also computationally efficient and robust to noise commonly found in full text material. We use the method to create a protein name dictionary from a set of 80,528 full text articles. Only 8.3% of the names in this dictionary match SwissProt description lines. We assess the quality of the dictionary by studying its protein name recognition performance in full text. Conclusion: This dictionary term lookup method compares favourably to other published methods, supporting the significance of our direct extraction approach. The method is strong in recognizing name variants not found in SwissProt. PMID:15817129
SPECTRa-T: machine-based data extraction and semantic searching of chemistry e-theses.
Downing, Jim; Harvey, Matt J; Morgan, Peter B; Murray-Rust, Peter; Rzepa, Henry S; Stewart, Diana C; Tonge, Alan P; Townsend, Joe A
2010-02-22
The SPECTRa-T project has developed text-mining tools to extract named chemical entities (NCEs), such as chemical names and terms, and chemical objects (COs), e.g., experimental spectral assignments and physical chemistry properties, from electronic theses (e-theses). Although NCEs were readily identified within the two major document formats studied, only the use of structured documents enabled identification of chemical objects and their association with the relevant chemical entity (e.g., systematic chemical name). A corpus of theses was analyzed and it is shown that a high degree of semantic information can be extracted from structured documents. This integrated information has been deposited in a persistent Resource Description Framework (RDF) triple-store that allows users to conduct semantic searches. The strengths and weaknesses of several document formats are reviewed.
Zeng, Xueqiang; Luo, Gang
2017-12-01
Machine learning is broadly used for clinical data analysis. Before training a model, a machine learning algorithm must be selected. Also, the values of one or more model parameters termed hyper-parameters must be set. Selecting algorithms and hyper-parameter values requires advanced machine learning knowledge and many labor-intensive manual iterations. To lower the bar to machine learning, miscellaneous automatic selection methods for algorithms and/or hyper-parameter values have been proposed. Existing automatic selection methods are inefficient on large data sets. This poses a challenge for using machine learning in the clinical big data era. To address the challenge, this paper presents progressive sampling-based Bayesian optimization, an efficient and automatic selection method for both algorithms and hyper-parameter values. We report an implementation of the method. We show that compared to a state of the art automatic selection method, our method can significantly reduce search time, classification error rate, and standard deviation of error rate due to randomization. This is major progress towards enabling fast turnaround in identifying high-quality solutions required by many machine learning-based clinical data analysis tasks.
Zhao, Yong; Hong, Wen-Xue
2011-11-01
Fast, nondestructive and accurate identification of special quality eggs is an urgent problem. The present paper proposes a new feature extraction method based on symbol entropy to identify special quality eggs from their near-infrared spectra. The authors selected normal eggs, free range eggs, selenium-enriched eggs and zinc-enriched eggs as research objects and measured the near-infrared diffuse reflectance spectra in the range of 12 000-4 000 cm^-1. Raw spectra were symbolically represented with an aggregation approximation algorithm, and symbolic entropy was extracted as the feature vector. An error-correcting output codes multiclass support vector machine classifier was designed to identify the spectra. The symbolic entropy feature is robust to parameter changes, and the highest recognition rate reaches 100%. The results show that identifying special quality eggs using near-infrared spectroscopy is feasible and that symbol entropy can be used as a new feature extraction method for near-infrared spectra.
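One plausible reading of the symbolic-entropy feature is sketched below. It assumes a piecewise aggregate approximation (segment means), equal-width binning into a small alphabet, and Shannon entropy of the resulting symbol distribution. The abstract does not specify these details, so the segmentation, binning rule and function names are all assumptions.

```python
import math
from collections import Counter

def paa(series, n_segments):
    """Piecewise aggregate approximation: mean of each of n_segments chunks.

    Assumes len(series) >= n_segments so that no chunk is empty.
    """
    n = len(series)
    means = []
    for k in range(n_segments):
        chunk = series[k * n // n_segments:(k + 1) * n // n_segments]
        means.append(sum(chunk) / len(chunk))
    return means

def symbolize(values, alphabet_size):
    """Map values to integer symbols 0..alphabet_size-1 by equal-width binning."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / alphabet_size or 1.0  # guard against a constant series
    return [min(int((v - lo) / width), alphabet_size - 1) for v in values]

def symbol_entropy(symbols):
    """Shannon entropy (in bits) of the symbol distribution."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())

spectrum = [0, 0, 1, 1, 2, 2, 3, 3]  # toy stand-in for a reflectance spectrum
symbols = symbolize(paa(spectrum, 4), alphabet_size=4)
print(symbols, symbol_entropy(symbols))
```

A feature vector could then be built by computing this entropy over several spectral windows or alphabet sizes before passing it to a classifier.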
ERIC Educational Resources Information Center
Hepburn, Larry; Shin, Masako
This document, one of eight in a multi-cultural competency-based vocational/technical curricula series, is on machine trades. This program is designed to run 36 weeks and cover 6 instructional areas: use of measuring tools; benchwork/tool bit grinding; lathe work; milling work; precision grinding; and combination machine work. A duty-task index…
Huynh-Thu, Vân Anh; Saeys, Yvan; Wehenkel, Louis; Geurts, Pierre
2012-07-01
Univariate statistical tests are widely used for biomarker discovery in bioinformatics. These procedures are simple and fast, and their output is easily interpretable by biologists, but they can only identify variables that provide a significant amount of information in isolation from the other variables. As biological processes are expected to involve complex interactions between variables, univariate methods thus potentially miss some informative biomarkers. Variable relevance scores provided by machine learning techniques, however, are potentially able to highlight multivariate interacting effects, but unlike the p-values returned by univariate tests, these relevance scores are usually not statistically interpretable. This lack of interpretability hampers the determination of a relevance threshold for extracting a feature subset from the rankings and also prevents the wide adoption of these methods by practitioners. We evaluated several procedures, both existing and novel, that extract relevant features from rankings derived from machine learning approaches. These procedures replace the relevance scores with measures that can be interpreted in a statistical way, such as p-values, false discovery rates, or family wise error rates, for which it is easier to determine a significance level. Experiments were performed on several artificial problems as well as on real microarray datasets. Although the methods differ in terms of computing times and the tradeoff they achieve in terms of false positives and false negatives, some of them greatly help in the extraction of truly relevant biomarkers and should thus be of great practical interest for biologists and physicians. As a side conclusion, our experiments also clearly highlight that using model performance as a criterion for feature selection is often counter-productive. Python source code of all tested methods, as well as the MATLAB scripts used for data simulation, can be found in the Supplementary Material.
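The idea of replacing a raw relevance score with a p-value can be illustrated with a simple label-permutation test. This sketch is not one of the procedures evaluated in the paper: it uses absolute Pearson correlation as a stand-in relevance score (a real application would plug in a machine-learning importance score), and the function names, permutation count and seed are arbitrary choices.

```python
import random

def abs_correlation(x, y):
    """Absolute Pearson correlation, used here as a stand-in relevance score."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / (sxx * syy) ** 0.5

def permutation_p_value(x, y, score=abs_correlation, n_perm=200, seed=0):
    """Fraction of label permutations whose score reaches the observed score."""
    rng = random.Random(seed)
    observed = score(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if score(x, y_perm) >= observed:
            hits += 1
    # The +1 terms count the observed labelling itself, so p is never zero.
    return (hits + 1) / (n_perm + 1)

feature = list(range(10))        # toy feature that separates the classes
labels = [0] * 5 + [1] * 5
print(permutation_p_value(feature, labels))
```

With many features, the resulting p-values would still need a multiple-testing correction, which is where the false discovery rate and family wise error rate mentioned in the abstract come in.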
Alumina additions may improve the damage tolerance of soft machined zirconia-based ceramics.
Oilo, Marit; Tvinnereim, Helene M; Gjerdet, Nils Roar
2011-01-01
The aim of this study was to evaluate the damage tolerance of different zirconia-based materials. Bars of one hard machined and one soft machined dental zirconia and an experimental 95% zirconia 5% alumina ceramic were subjected to 100,000 stress cycles (n = 10), indented to provoke cracks on the tensile stress side (n = 10), and left untreated as controls (n = 10). The experimental material demonstrated a higher relative damage tolerance, with a 40% reduction compared to 68% for the hard machined zirconia and 84% for the soft machined zirconia.
Feng, Jingwen; Feng, Tong; Yang, Chengwen; Wang, Wei; Sa, Yu; Feng, Yuanming
2018-06-01
This study explored the feasibility of predicting and classifying cells in different stages of apoptosis with a stain-free method based on diffraction images and supervised machine learning. Apoptosis was induced in human chronic myelogenous leukemia K562 cells by cis-platinum (DDP). A newly developed technique of polarization diffraction imaging flow cytometry (p-DIFC) was performed to acquire diffraction images of the cells in three different statuses (viable, early apoptotic and late apoptotic/necrotic) after cell separation through fluorescence activated cell sorting with Annexin V-PE and SYTOX® Green double staining. The texture features of the diffraction images were extracted with in-house software based on the gray-level co-occurrence matrix algorithm to generate datasets for cell classification with a supervised machine learning method. The method was further verified in a hydrogen peroxide-induced apoptosis model of HL-60 cells. Results show that an accuracy higher than 90% was achieved in independent test datasets from each cell type based on logistic regression with ridge estimators, indicating that the p-DIFC system has great potential in predicting and classifying cells in different stages of apoptosis.
Broiler weight estimation based on machine vision and artificial neural network.
Amraei, S; Abdanan Mehdizadeh, S; Salari, S
2017-04-01
1. Machine vision and artificial neural network (ANN) procedures were used to estimate live body weight of broiler chickens in 30 1-d-old broiler chickens reared for 42 d. 2. Imaging was performed two times daily. To localise chickens within the pen, an ellipse fitting algorithm was used and the chickens' head and tail removed using the Chan-Vese method. 3. The correlations between the body weight and 6 physical extracted features indicated that there were strong correlations between body weight and the 5 features including area, perimeter, convex area, major and minor axis length. 5. According to statistical analysis there was no significant difference between morning and afternoon data over 42 d. 6. In an attempt to improve the accuracy of live weight approximation different ANN techniques, including Bayesian regulation, Levenberg-Marquardt, Scaled conjugate gradient and gradient descent were used. Bayesian regulation with R² value of 0.98 was the best network for prediction of broiler weight. 7. The accuracy of the machine vision technique was examined and most errors were less than 50 g.
Applying machine learning to identify autistic adults using imitation: An exploratory study.
Li, Baihua; Sharma, Arjun; Meng, James; Purushwalkam, Senthil; Gowen, Emma
2017-01-01
Autism spectrum condition (ASC) is primarily diagnosed by behavioural symptoms including social, sensory and motor aspects. Although stereotyped, repetitive motor movements are considered during diagnosis, quantitative measures that identify kinematic characteristics in the movement patterns of autistic individuals are poorly studied, preventing advances in understanding the aetiology of motor impairment, or whether a wider range of motor characteristics could be used for diagnosis. The aim of this study was to investigate whether data-driven machine learning based methods could be used to address some fundamental problems with regard to identifying discriminative test conditions and kinematic parameters to classify between ASC and neurotypical controls. Data was based on a previous task where 16 ASC participants and 14 age- and IQ-matched controls observed then imitated a series of hand movements. 40 kinematic parameters extracted from eight imitation conditions were analysed using machine learning based methods. Two optimal imitation conditions and nine most significant kinematic parameters were identified and compared with some standard attribute evaluators. To our knowledge, this is the first attempt to apply machine learning to kinematic movement parameters measured during imitation of hand movements to investigate the identification of ASC. Although based on a small sample, the work demonstrates the feasibility of applying machine learning methods to analyse high-dimensional data and suggests the potential of machine learning for identifying kinematic biomarkers that could contribute to the diagnostic classification of autism.
NASA Astrophysics Data System (ADS)
Pan, Yifan; Zhang, Xianfeng; Tian, Jie; Jin, Xu; Luo, Lun; Yang, Ke
2017-01-01
Asphalt road reflectance spectra change as pavement ages. This provides the possibility for remote sensing to be used to monitor a change in asphalt pavement conditions. However, the relatively narrow geometry of roads and the relatively coarse spatial resolution of remotely sensed imagery result in mixtures between pavement and adjacent landcovers (e.g., vegetation, buildings, and soil), increasing uncertainties in spectral analysis. To overcome this problem, multiple endmember spectral mixture analysis (MESMA) was used to map the asphalt pavement condition using Worldview-2 satellite imagery in this study. Based on extensive field investigation and in situ measurements, aged asphalt pavements were categorized into four stages: preliminarily aged, moderately aged, heavily aged, and distressed. The spectral characteristics in the first three stages were further analyzed, and a MESMA unmixing analysis was conducted to map these three kinds of pavement conditions from the Worldview-2 image. The results showed that the road pavement conditions could be detected well and mapped with an overall accuracy of 81.71% and Kappa coefficient of 0.77. Finally, a quantitative assessment of the pavement conditions for each road segment in this study area was conducted to inform road maintenance management.
NASA Astrophysics Data System (ADS)
Taylor, Wayne R.; Tompkins, Linda A.; Haggerty, Stephen E.
1994-10-01
A suite of largely unaltered, aphanitic, mica-bearing hypabyssal kimberlites from the Koidu kimberlite complex of the West African Craton have been investigated to determine their geochemical affinity relative to Group I (nonmicaceous) and Group II (micaceous) kimberlites of southern Africa. Comparison is made with altered kimberlites from Liberia, other West African and global kimberlites. Based on major element oxides, the Koidu kimberlites, though mica-bearing, show closest compositional similarity with the Group IA kimberlites of southern Africa. Based on major and trace elements, the Koidu kimberlites show an unusual geochemical signature. This signature is similar to that of the distinctive, micaceous Aries kimberlite of northwest Australia, and includes high Nb/U (most samples > 46), Ce/Sr (>0.4), Ta/Hf (>2), and Nb/Zr (>1) ratios and low P2O5/Ce × 10^4 (<27), Ba/Rb (<32), and U/Th (<0.2) ratios compared with Group I kimberlites. Koidu kimberlites can be readily discriminated from Group II kimberlites by their higher Ti/K (>0.4) and Nb/La (>1) ratios and lower Ba/Nb (<10) and Pb/Ce (<0.06) ratios. The compositions of Liberian kimberlites are leached of mobile incompatible elements, but least affected samples show affinity to Group I. Guinea kimberlites appear to be of two types: one having affinity with Group IA and the other, micaceous variety, having affinity with the Aries kimberlite. Kimberlites with an Aries geochemical signature appear to exist on some other cratons, e.g., the Kundelungu kimberlites (Zaire) and two mica-bearing Group I kimberlites (S. Africa). The Koidu kimberlites exhibit compositionally-dependent isotopic heterogeneity though initial εNd and εSr values are broadly asthenospheric (i.e., near bulk earth) similar to Group I and Aries. A compositional spectrum appears to exist between nonmicaceous Group I kimberlites through mica-bearing Koidu kimberlites to extreme endmembers of the Aries type. This spectrum can be modelled as partial melts
Machine learning-based dual-energy CT parametric mapping
NASA Astrophysics Data System (ADS)
Su, Kuan-Hao; Kuo, Jung-Wen; Jordan, David W.; Van Hedent, Steven; Klahr, Paul; Wei, Zhouping; Helo, Rose Al; Liang, Fan; Qian, Pengjiang; Pereira, Gisele C.; Rassouli, Negin; Gilkeson, Robert C.; Traughber, Bryan J.; Cheng, Chee-Wai; Muzic, Raymond F., Jr.
2018-06-01
The aim is to develop and evaluate machine learning methods for generating quantitative parametric maps of effective atomic number (Z_eff), relative electron density (ρ_e), mean excitation energy (I_x), and relative stopping power (RSP) from clinical dual-energy CT data. The maps could be used for material identification and radiation dose calculation. Machine learning methods of historical centroid (HC), random forest (RF), and artificial neural networks (ANN) were used to learn the relationship between dual-energy CT input data and ideal output parametric maps calculated for phantoms from the known compositions of 13 tissue substitutes. After training and model selection steps, the machine learning predictors were used to generate parametric maps from independent phantom and patient input data. Precision and accuracy were evaluated using the ideal maps. This process was repeated for a range of exposure doses, and performance was compared to that of the clinically-used dual-energy, physics-based method which served as the reference. The machine learning methods generated more accurate and precise parametric maps than those obtained using the reference method. Their performance advantage was particularly evident when using data from the lowest exposure, one-fifth of a typical clinical abdomen CT acquisition. The RF method achieved the greatest accuracy. In comparison, the ANN method was only 1% less accurate but had much better computational efficiency than RF, being able to produce parametric maps in 15 s. Machine learning methods outperformed the reference method in terms of accuracy and noise tolerance when generating parametric maps, encouraging further exploration of the techniques. Among the methods we evaluated, ANN is the most suitable for clinical use due to its combination of accuracy, excellent low-noise performance, and computational efficiency.
Machine learning-based dual-energy CT parametric mapping.
Su, Kuan-Hao; Kuo, Jung-Wen; Jordan, David W; Van Hedent, Steven; Klahr, Paul; Wei, Zhouping; Al Helo, Rose; Liang, Fan; Qian, Pengjiang; Pereira, Gisele C; Rassouli, Negin; Gilkeson, Robert C; Traughber, Bryan J; Cheng, Chee-Wai; Muzic, Raymond F
2018-06-08
The aim is to develop and evaluate machine learning methods for generating quantitative parametric maps of effective atomic number (Z_eff), relative electron density (ρ_e), mean excitation energy (I_x), and relative stopping power (RSP) from clinical dual-energy CT data. The maps could be used for material identification and radiation dose calculation. Machine learning methods of historical centroid (HC), random forest (RF), and artificial neural networks (ANN) were used to learn the relationship between dual-energy CT input data and ideal output parametric maps calculated for phantoms from the known compositions of 13 tissue substitutes. After training and model selection steps, the machine learning predictors were used to generate parametric maps from independent phantom and patient input data. Precision and accuracy were evaluated using the ideal maps. This process was repeated for a range of exposure doses, and performance was compared to that of the clinically-used dual-energy, physics-based method which served as the reference. The machine learning methods generated more accurate and precise parametric maps than those obtained using the reference method. Their performance advantage was particularly evident when using data from the lowest exposure, one-fifth of a typical clinical abdomen CT acquisition. The RF method achieved the greatest accuracy. In comparison, the ANN method was only 1% less accurate but had much better computational efficiency than RF, being able to produce parametric maps in 15 s. Machine learning methods outperformed the reference method in terms of accuracy and noise tolerance when generating parametric maps, encouraging further exploration of the techniques. Among the methods we evaluated, ANN is the most suitable for clinical use due to its combination of accuracy, excellent low-noise performance, and computational efficiency.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahrens, L.
2010-11-01
The objective of this note is to (once again) explore the AGS 'ORM' (orbit response matrix) data taken (by Operations) early during the 2007 run with an AGS bare machine and gold beam. Indeed the present motivation is to extract as much information about the AGS inherent transverse coupling as possible - from general arguments and the copious ORM data. And taking this one step further, (though not accomplished yet) the goal really should be to tell the model how to describe this coupling. 'Bare' as used here means the AGS with no quadrupole, sextupole or octupole magnets powered. Only the main (combined-function) magnet string and dipole bumps necessary to optimize beam survival are powered. 'ORM data' means the systematic recording of the equilibrium orbit beam position monitor response to powering individual dipole corrector magnets. The 'matrix' results from looking at the effect of each of the (12 superperiods × 4 dipoles per superperiod) 'kicks' on each of the (12 × 6) pick up electrodes (pues) in each transverse plane. So then we have two (48 × 72) matrices of numbers from the ORM data. (Though 'pue' usually refers to the hardware in the vacuum chamber and 'bpm' to the beam position monitoring system, the two labels will be used casually here.) The exercise is carried out at two magnet rigidities, injection (AGS field ≈434 Gauss) and extraction to RHIC (≈9730 Gauss), - a ratio of rigidities of about 22.4. Since we stick with a bare machine, we are also stuck with the bare tunes which means the tunes are rather close together and near 8.75. Injection: (h,v) ≈ (8.73, 8.76).
Ma, Xiao H; Jia, Jia; Zhu, Feng; Xue, Ying; Li, Ze R; Chen, Yu Z
2009-05-01
Machine learning methods have been explored as ligand-based virtual screening tools for facilitating drug lead discovery. These methods predict compounds of specific pharmacodynamic, pharmacokinetic or toxicological properties based on their structure-derived structural and physicochemical properties. Increasing attention has been directed at these methods because of their capability in predicting compounds of diverse structures and complex structure-activity relationships without requiring knowledge of the target 3D structure. This article reviews current progress in using machine learning methods for virtual screening of pharmacodynamically active compounds from large compound libraries, and analyzes and compares the reported performances of machine learning tools with those of structure-based and other ligand-based (such as pharmacophore and clustering) virtual screening methods. The feasibility of improving the performance of machine learning methods in screening large libraries is discussed.
A Novel Approach for Lie Detection Based on F-Score and Extreme Learning Machine
Gao, Junfeng; Wang, Zhao; Yang, Yong; Zhang, Wenjia; Tao, Chunyi; Guan, Jinan; Rao, Nini
2013-01-01
A new machine learning method referred to as F-score_ELM was proposed to classify lying and truth-telling using the electroencephalogram (EEG) signals from 28 guilty and innocent subjects. Thirty-one features were extracted from the probe responses of these subjects. Then, a recently developed classifier called extreme learning machine (ELM) was combined with F-score, a simple but effective feature selection method, to jointly optimize the number of hidden nodes of the ELM and the feature subset by a grid-searching training procedure. The method was compared to two classification models combining principal component analysis with back-propagation network and support vector machine classifiers. We thoroughly assessed the performance of these classification models, including the training and testing time, sensitivity and specificity from the training and testing sets, as well as network size. The experimental results showed that the number of hidden nodes can be effectively optimized by the proposed method. Also, F-score_ELM obtained the best classification accuracy and required the shortest training and testing time. PMID:23755136
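The feature-ranking half of such a pipeline can be sketched as below. The code assumes the commonly used two-class F-score (squared distances of the class means from the overall mean, divided by the sum of the within-class sample variances). The abstract does not state the formula, so this definition, the function names and the toy data are assumptions, and the ELM and grid search are not shown.

```python
from statistics import mean, variance

def f_score(pos, neg):
    """Two-class F-score of one feature; each class needs at least two samples.

    Raises ZeroDivisionError if both classes have zero variance.
    """
    overall = mean(pos + neg)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances (n - 1 denominator)
    return between / within

def rank_features(X, y):
    """Return (feature indices sorted by decreasing F-score, list of scores)."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(f_score(pos, neg))
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order, scores

# Toy data: feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [2.0, 1.0], [9.0, 4.0], [10.0, 2.0]]
y = [0, 0, 1, 1]
order, scores = rank_features(X, y)
print(order, scores)
```

In a joint search like the one the abstract describes, the top-k features from this ranking would be fed to the classifier for each candidate k and hidden-node count.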
Deep Learning Methods for Underwater Target Feature Extraction and Recognition
Peng, Yuan; Qiu, Mengran; Shi, Jianfei; Liu, Liangliang
2018-01-01
The classification and recognition of underwater acoustic signals has always been an important research topic in the field of underwater acoustic signal processing. Currently, the wavelet transform, the Hilbert-Huang transform, and Mel frequency cepstral coefficients are used as methods of underwater acoustic signal feature extraction. In this paper, a method for feature extraction and identification of underwater noise data based on a convolutional neural network (CNN) and an extreme learning machine (ELM) is proposed. Features of underwater acoustic signals are extracted automatically using a deep convolutional network, and the underwater target recognition classifier is based on an ELM. Although convolutional neural networks can perform both feature extraction and classification, their classification mainly relies on a fully connected layer trained by gradient descent, whose generalization ability is limited and suboptimal, so an ELM was used in the classification stage. First, the CNN learns deep and robust features, after which the fully connected layers are removed. Then an ELM fed with the CNN features is used as the classifier. Experiments on an actual data set of civil ships obtained a 93.04% recognition rate, a large improvement over the traditional Mel frequency cepstral coefficient and Hilbert-Huang features. PMID:29780407
Process service quality evaluation based on Dempster-Shafer theory and support vector machine.
Pei, Feng-Que; Li, Dong-Bo; Tong, Yi-Fei; He, Fei
2017-01-01
Human involvement influences traditional service quality evaluations, leading to low accuracy, poor reliability, and weak predictability. This paper proposes a method, called SVMs-DS, that employs support vector machines (SVMs) and Dempster-Shafer evidence theory to evaluate the service quality of a production process while handling a large number of input features with a small sample data set. Features that can affect production quality are extracted by a large number of sensors. Preprocessing steps such as feature simplification and normalization are reduced. Based on three individual SVM models, the basic probability assignments (BPAs) are constructed, which can help the evaluation in a qualitative and quantitative way. The process service quality evaluation results are validated by the Dempster rules; the decision threshold to resolve conflicting results is generated from three SVM models. A case study is presented to demonstrate the effectiveness of the SVMs-DS method.
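The fusion step can be illustrated with Dempster's rule of combination. The sketch below combines two basic probability assignments over the same frame of discernment; how the paper derives BPAs from the SVM outputs is not stated in the abstract, so the masses used in the usage note are made-up illustrative values, and the function name `combine` is hypothetical.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule for two BPAs given as {frozenset_of_hypotheses: mass}.

    Returns (fused BPA, conflict mass K). Mass that falls on empty
    intersections is treated as conflict and normalised away.
    """
    fused = {}
    conflict = 0.0
    for (a, pa), (b, pb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            fused[inter] = fused.get(inter, 0.0) + pa * pb
        else:
            conflict += pa * pb
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    return {s: v / (1.0 - conflict) for s, v in fused.items()}, conflict
```

For example, with hypotheses "good" and "bad", combining masses {good: 0.7, bad: 0.1, either: 0.2} with {good: 0.6, bad: 0.2, either: 0.2} gives a conflict of 0.20 and fused masses of 0.85, 0.10 and 0.05. A third source is fused by calling `combine` again on the result.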
A Relevance Vector Machine-Based Approach with Application to Oil Sand Pump Prognostics
Hu, Jinfei; Tse, Peter W.
2013-01-01
Oil sand pumps are widely used in the mining industry for the delivery of mixtures of abrasive solids and liquids. Because they operate under highly adverse conditions, these pumps usually experience significant wear. Consequently, equipment owners are quite often forced to invest substantially in system maintenance to avoid unscheduled downtime. In this study, an approach combining relevance vector machines (RVMs) with a sum of two exponential functions was developed to predict the remaining useful life (RUL) of field pump impellers. To handle field vibration data, a novel feature extracting process was proposed to arrive at a feature varying with the development of damage in the pump impellers. A case study involving two field datasets demonstrated the effectiveness of the developed method. Compared with standalone exponential fitting, the proposed RVM-based model was much better able to predict the remaining useful life of pump impellers. PMID:24051527
A relevance vector machine-based approach with application to oil sand pump prognostics.
Hu, Jinfei; Tse, Peter W
2013-09-18
Oil sand pumps are widely used in the mining industry for the delivery of mixtures of abrasive solids and liquids. Because they operate under highly adverse conditions, these pumps usually experience significant wear. Consequently, equipment owners are quite often forced to invest substantially in system maintenance to avoid unscheduled downtime. In this study, an approach combining relevance vector machines (RVMs) with a sum of two exponential functions was developed to predict the remaining useful life (RUL) of field pump impellers. To handle field vibration data, a novel feature extracting process was proposed to arrive at a feature varying with the development of damage in the pump impellers. A case study involving two field datasets demonstrated the effectiveness of the developed method. Compared with standalone exponential fitting, the proposed RVM-based model was much better able to predict the remaining useful life of pump impellers.
Aggregation of Electric Current Consumption Features to Extract Maintenance KPIs
NASA Astrophysics Data System (ADS)
Simon, Victor; Johansson, Carl-Anders; Galar, Diego
2017-09-01
All electric powered machines offer the possibility of extracting information and calculating Key Performance Indicators (KPIs) from the electric current signal. Depending on the time window, sampling frequency and type of analysis, different indicators from the micro to macro level can be calculated for such aspects as maintenance, production, energy consumption etc. On the micro-level, the indicators are generally used for condition monitoring and diagnostics and are normally based on a short time window and a high sampling frequency. The macro indicators are normally based on a longer time window with a slower sampling frequency and are used as indicators for overall performance, cost or consumption. The indicators can be calculated directly from the current signal but can also be based on a combination of information from the current signal and operational data like rpm, position etc. One or several of those indicators can be used for prediction and prognostics of a machine's future behavior. This paper uses this technique to calculate indicators for maintenance and energy optimization in electric powered machines and fleets of machines, especially machine tools.
NASA Astrophysics Data System (ADS)
Heidari, Morteza; Zargari Khuzani, Abolfazl; Danala, Gopichandh; Mirniaharikandehei, Seyedehnafiseh; Qian, Wei; Zheng, Bin
2018-03-01
Both conventional and deep machine learning have been used to develop decision-support tools for medical imaging informatics. To take advantage of both conventional and deep learning approaches, this study investigates the feasibility of applying a locally preserving projection (LPP) based feature regeneration algorithm to build a new machine learning classifier model to predict short-term breast cancer risk. First, a computer-aided image processing scheme was used to segment and quantify breast fibro-glandular tissue volume. Next, 44 initially computed image features related to bilateral mammographic tissue density asymmetry were extracted. Then, an LPP-based feature combination method was applied to regenerate a new operational feature vector using a maximal variance approach. Last, a k-nearest neighbor (KNN) machine learning classifier using the LPP-generated feature vectors was developed to predict breast cancer risk. A testing dataset of negative mammograms acquired from 500 women was used. Among these women, 250 were positive and 250 remained negative in the next subsequent mammography screening. On this dataset, the LPP-generated feature vector reduced the number of features from 44 to 4. Using a leave-one-case-out validation method, the area under the ROC curve produced by the KNN classifier increased significantly from 0.62 to 0.68 (p < 0.05), and the odds ratio was 4.60 with a 95% confidence interval of [3.16, 6.70]. The study demonstrated that this new LPP-based feature regeneration approach was able to produce an optimal feature vector and yield improved performance in predicting the risk of women having breast cancer detected in the next subsequent mammography screening.
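The evaluation protocol described above (a KNN classifier scored by leave-one-case-out validation and summarised by the area under the ROC curve) can be sketched in a few lines. This is only an illustration of the protocol under simple assumptions: Euclidean distance, the fraction of positive neighbours as the risk score, and the function names `knn_score`, `leave_one_out_scores` and `auc` are all choices made for the example, and the LPP feature regeneration step is not reproduced.

```python
def knn_score(train, query, k):
    """Fraction of positive labels among the k nearest training cases."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    return sum(label for _, label in nearest) / k


def leave_one_out_scores(cases, k=3):
    """Score each case with a KNN model built from all the other cases.

    cases is a list of (feature_vector, label) pairs with 0/1 labels.
    """
    scores = []
    for i, (x, _) in enumerate(cases):
        rest = cases[:i] + cases[i + 1:]
        scores.append(knn_score(rest, x, k))
    return scores


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Holding out each case in turn keeps the scored case out of its own neighbour set, so the resulting AUC is not inflated by the case matching itself.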
Appraisal of geodynamic inversion results: a data mining approach
NASA Astrophysics Data System (ADS)
Baumann, T. S.
2016-11-01
Bayesian sampling based inversions require many thousands or even millions of forward models, depending on how nonlinear or non-unique the inverse problem is, and how many unknowns are involved. The result of such a probabilistic inversion is not a single `best-fit' model, but rather a probability distribution that is represented by the entire model ensemble. Often, a geophysical inverse problem is non-unique, and the corresponding posterior distribution is multimodal, meaning that the distribution consists of clusters with similar models that represent the observations equally well. In these cases, we would like to visualize the characteristic model properties within each of these clusters of models. However, even for a moderate number of inversion parameters, a manual appraisal for a large number of models is not feasible. This poses the question whether it is possible to extract end-member models that represent each of the best-fit regions including their uncertainties. Here, I show how a machine learning tool can be used to characterize end-member models, including their uncertainties, from a complete model ensemble that represents a posterior probability distribution. The model ensemble used here results from a nonlinear geodynamic inverse problem, where rheological properties of the lithosphere are constrained from multiple geophysical observations. It is demonstrated that by taking vertical cross-sections through the effective viscosity structure of each of the models, the entire model ensemble can be classified into four end-member model categories that have a similar effective viscosity structure. These classification results are helpful to explore the non-uniqueness of the inverse problem and can be used to compute representative data fits for each of the end-member models. Conversely, these insights also reveal how new observational constraints could reduce the non-uniqueness. The method is not limited to geodynamic applications and a generalized MATLAB
Imaging nanoscale lattice variations by machine learning of x-ray diffraction microscopy data
Laanait, Nouamane; Zhang, Zhan; Schlepütz, Christian M.
2016-08-09
In this paper, we present a novel methodology based on machine learning to extract lattice variations in crystalline materials, at the nanoscale, from an x-ray Bragg diffraction-based imaging technique. By employing a full-field microscopy setup, we capture real space images of materials, with imaging contrast determined solely by the x-ray diffracted signal. The data sets that emanate from this imaging technique are a hybrid of real space information (image spatial support) and reciprocal lattice space information (image contrast), and are intrinsically multidimensional (5D). By a judicious application of established unsupervised machine learning techniques and multivariate analysis to this multidimensional data cube, we show how to extract features that can be ascribed physical interpretations in terms of common structural distortions, such as lattice tilts and dislocation arrays. Finally, we demonstrate this 'big data' approach to x-ray diffraction microscopy by identifying structural defects present in an epitaxial ferroelectric thin-film of lead zirconate titanate.
Evidence of end-effector based gait machines in gait rehabilitation after CNS lesion.
Hesse, S; Schattat, N; Mehrholz, J; Werner, C
2013-01-01
A task-specific repetitive approach in gait rehabilitation after CNS lesion is well accepted nowadays. To ease the therapists' and patients' physical effort, the past two decades have seen the introduction of gait machines to intensify the amount of gait practice. Two principles have emerged, an exoskeleton-based and an end-effector-based approach. Both systems share the harness and the body weight support. With the end-effector-based devices, the patients' feet are positioned on two foot plates, whose movements simulate stance and swing phase. This article provides an overview of the effectiveness of end-effector-based machines in restoring gait. For the electromechanical gait trainer GT I, a meta-analysis identified nine controlled trials (RCTs) in stroke subjects (n = 568), which were analyzed to detect differences between end-effector-based locomotion + physiotherapy and physiotherapy alone. Patients practising with the machine achieved a superior gait ability (210 out of 319 patients, 65.8% vs. 96 out of 249 patients, 38.6%, respectively, Z = 2.29, p = 0.020), due to a larger training intensity. Only single RCTs have been reported for other devices and etiologies. The introduction of end-effector-based gait machines has opened a new successful chapter in gait rehabilitation after CNS lesion.
Repurposing mainstream CNC machine tools for laser-based additive manufacturing
NASA Astrophysics Data System (ADS)
Jones, Jason B.
2016-04-01
The advent of laser technology has been a key enabler for industrial 3D printing, known as Additive Manufacturing (AM). Despite its commercial success and unique technical capabilities, laser-based AM systems are not yet able to produce parts with the same accuracy and surface finish as CNC machining. To enable the geometry and material freedoms afforded by AM, yet achieve the precision and productivity of CNC machining, hybrid combinations of these two processes have started to gain traction. To achieve the benefits of combined processing, laser technology has been integrated into mainstream CNC machines - effectively repurposing them as hybrid manufacturing platforms. This paper reviews how this engineering challenge has prompted beam delivery innovations to allow automated changeover between laser processing and machining, using standard CNC tool changers. Handling laser-processing heads using the tool changer also enables automated change over between different types of laser processing heads, further expanding the breadth of laser processing flexibility in a hybrid CNC. This paper highlights the development, challenges and future impact of hybrid CNCs on laser processing.
Comparison of Machine Learning Methods for the Arterial Hypertension Diagnostics
Belo, David; Gamboa, Hugo
2017-01-01
The paper presents an analysis of the accuracy of machine learning approaches applied to cardiac activity. The study evaluates the possibilities for diagnosing arterial hypertension by means of short-term heart rate variability signals. Two groups were studied: 30 relatively healthy volunteers and 40 patients suffering from arterial hypertension of II-III degree. The following machine learning approaches were studied: linear and quadratic discriminant analysis, k-nearest neighbors, support vector machine with radial basis, decision trees, and naive Bayes classifier. In addition, different methods of feature extraction were analyzed: statistical, spectral, wavelet, and multifractal. In total, 53 features were investigated. The results show that discriminant analysis achieves the highest classification accuracy. The suggested approach of searching for a noncorrelated feature set achieved higher results than a data set based on the principal components. PMID:28831239
Barua, Shaibal; Begum, Shahina; Ahmed, Mobyen Uddin
2015-01-01
Machine learning algorithms play an important role in computer science research. Recent advances in sensor data collection in the clinical sciences have led to complex, heterogeneous data processing and analysis for patient diagnosis and prognosis. Diagnosis and treatment of patients based on manual analysis of these sensor data are difficult and time consuming. Therefore, the development of knowledge-based systems to support clinicians in decision-making is important. However, experimental work is needed to compare the performance of different machine learning methods and so help select an appropriate method for the specific characteristics of a data set. This paper compares the classification performance of three popular machine learning methods, i.e., case-based reasoning, neural networks and support vector machines, in diagnosing stress of vehicle drivers using finger temperature and heart rate variability. The experimental results show that case-based reasoning outperforms the other two methods in terms of classification accuracy. Case-based reasoning achieved 80% and 86% accuracy in classifying stress using finger temperature and heart rate variability. In contrast, both the neural network and the support vector machine achieved less than 80% accuracy using both physiological signals.
Confabulation Based Sentence Completion for Machine Reading
2010-11-01
making sentence completion an indispensible component of machine reading. Cogent confabulation is a bio-inspired computational model that mimics the...thus making sentence completion an indispensible component of machine reading. Cogent confabulation is a bio-inspired computational model that mimics...University Press, 1992. [2] H. Motoda and K. Yoshida, “Machine learning techniques to make computers easier to use,” Proceedings of the Fifteenth
Fingerprint recognition of alien invasive weeds based on the texture character and machine learning
NASA Astrophysics Data System (ADS)
Yu, Jia-Jia; Li, Xiao-Li; He, Yong; Xu, Zheng-Hao
2008-11-01
A multi-spectral imaging technique based on texture analysis and machine learning was proposed to discriminate alien invasive weeds that have similar outlines but belong to different categories. The objectives of this study were to investigate the feasibility of using multi-spectral imaging, especially the near-infrared (NIR) channel (800 nm ± 10 nm), to find the weeds' fingerprints, and to validate the performance with specific eigenvalues from the co-occurrence matrix. Veronica polita Fries, Veronica persica Poir, longtube ground ivy, and Lamium amplexicaule Linn. were selected for this study; they have different effects in the field and are alien invasive species in China. Images of 307 weed leaves were randomly selected for the calibration set, and the remaining 207 samples formed the prediction set. All images were pretreated with a Wallis filter to correct the noise caused by uneven lighting. The gray level co-occurrence matrix was applied to extract the texture character, which shows density, randomness correlation, contrast and homogeneity of texture with different algorithms. Three channels (green channel at 550 nm ± 10 nm, red channel at 650 nm ± 10 nm and NIR channel at 800 nm ± 10 nm) were each used to calculate the eigenvalues. Least-squares support vector machines (LS-SVM) were applied to discriminate the categories of weeds using the eigenvalues from the co-occurrence matrix. Finally, a recognition ratio of 83.35% was obtained with the NIR channel, better than the results with the green channel (76.67%) and red channel (69.46%). The prediction results of 81.35% indicated that the selected eigenvalues reflected the main characteristics of the weeds' fingerprint based on multi-spectral imaging (especially the NIR channel) and the LS-SVM model.
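To make the texture step concrete, the sketch below builds a gray level co-occurrence matrix (GLCM) for one pixel offset and derives three commonly used descriptors from it. The abstract does not specify offsets, number of gray levels, or exact feature formulas, so the symmetric normalisation, the default horizontal offset and the textbook definitions of contrast, homogeneity and energy used here are assumptions; `glcm` and `texture_features` are illustrative names.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised symmetric co-occurrence matrix for pixel offset (dx, dy).

    image is a list of rows of integer gray levels in range(levels).
    The image must contain at least one pixel pair at the given offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Contrast, homogeneity and energy of a normalised GLCM p."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(p[i][j] * (i - j) ** 2 for i, j in cells),
        "homogeneity": sum(p[i][j] / (1 + (i - j) ** 2) for i, j in cells),
        "energy": sum(p[i][j] ** 2 for i, j in cells),
    }
```

A flat patch yields zero contrast and homogeneity of 1, while a two-level checkerboard yields contrast of 1 and homogeneity of 0.5. Feature vectors of this kind, computed per spectral channel, are what a classifier such as LS-SVM would then receive.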
Gehrmann, Sebastian; Dernoncourt, Franck; Li, Yeran; Carlson, Eric T; Wu, Joy T; Welt, Jonathan; Foote, John; Moseley, Edward T; Grant, David W; Tyler, Patrick D; Celi, Leo A
2018-01-01
In secondary analysis of electronic health records, a crucial task consists in correctly identifying the patient cohort under investigation. In many cases, the most valuable and relevant information for an accurate classification of medical conditions exists only in clinical narratives. Therefore, it is necessary to use natural language processing (NLP) techniques to extract and evaluate these narratives. The most commonly used approach to this problem relies on extracting a number of clinician-defined medical concepts from text and using machine learning techniques to identify whether a particular patient has a certain condition. However, recent advances in deep learning and NLP enable models to learn a rich representation of (medical) language. Convolutional neural networks (CNN) for text classification can augment the existing techniques by leveraging the representation of language to learn which phrases in a text are relevant for a given medical condition. In this work, we compare concept extraction based methods with CNNs and other commonly used models in NLP in ten phenotyping tasks using 1,610 discharge summaries from the MIMIC-III database. We show that CNNs outperform concept extraction based methods in almost all of the tasks, with an improvement in F1-score of up to 26 and up to 7 percentage points in area under the ROC curve (AUC). We additionally assess the interpretability of both approaches by presenting and evaluating methods that calculate and extract the most salient phrases for a prediction. The results indicate that CNNs are a valid alternative to existing approaches in patient phenotyping and cohort identification, and should be further investigated. Moreover, the deep learning approach presented in this paper can be used to assist clinicians during chart review or support the extraction of billing codes from text by identifying and highlighting relevant phrases for various medical conditions.
Insect-machine interface based neurocybernetics.
Bozkurt, Alper; Gilmour, Robert F; Sinha, Ayesa; Stern, David; Lal, Amit
2009-06-01
We present details of a novel bioelectric interface formed by placing microfabricated probes into insects during metamorphic growth cycles. The inserted microprobes emerge with the insect, and the development of tissue around the electronics during pupal development yields mechanically stable and electrically reliable structures coupled to the insect. Remarkably, the insects do not react adversely or otherwise to the inserted electronics in the pupal stage, as is true when the electrodes are inserted in adult stages. We report on the electrical and mechanical characteristics of this novel bioelectronic interface, which we believe would be adopted by many investigators trying to investigate biological behavior in insects with negligible or minimal traumatic effect encountered when probes are inserted in adult stages. This novel insect-machine interface also allows for hybrid insect-machine platforms for further studies. As an application, we demonstrate our first results toward navigation of flight in moths. When instrumented with equipment to gather information for environmental sensing, such insects potentially can assist man to monitor the ecosystems that we share with them for sustainability. The simplicity of the optimized surgical procedure we invented allows for batch insertions into insects for automatic and mass production of such hybrid insect-machine platforms. Therefore, our bioelectronic interface and hybrid insect-machine platform enable multidisciplinary scientific and engineering studies not only to investigate the details of insect behavioral physiology but also to control it.
Machine learning in soil classification.
Bhattacharya, B; Solomatine, D P
2006-03-01
In a number of engineering problems, e.g. in geotechnics and petroleum engineering, intervals of measured series data (signals) must be assigned a class while maintaining the constraint of contiguity, and standard classification methods can be inadequate. Classification in this case requires the involvement of an expert who observes the magnitude and trends of the signals in addition to any a priori information that might be available. In this paper, an approach for automating this classification procedure is presented. First, a segmentation algorithm is developed and applied to segment the measured signals. Second, the salient features of these segments are extracted using the boundary energy method. Classifiers are then built to assign classes to the segments based on the measured data and extracted features; they employ Decision Trees, ANN and Support Vector Machines. The methodology was tested in classifying sub-surface soil using measured data from Cone Penetration Testing, and satisfactory results were obtained.
A Machine Reading System for Assembling Synthetic Paleontological Databases
Peters, Shanan E.; Zhang, Ce; Livny, Miron; Ré, Christopher
2014-01-01
Many aspects of macroevolutionary theory and our understanding of biotic responses to global environmental change derive from literature-based compilations of paleontological data. Existing manually assembled databases are, however, incomplete and difficult to assess and enhance with new data types. Here, we develop and validate the quality of a machine reading system, PaleoDeepDive, that automatically locates and extracts data from heterogeneous text, tables, and figures in publications. PaleoDeepDive performs comparably to humans in several complex data extraction and inference tasks and generates congruent synthetic results that describe the geological history of taxonomic diversity and genus-level rates of origination and extinction. Unlike traditional databases, PaleoDeepDive produces a probabilistic database that systematically improves as information is added. We show that the system can readily accommodate sophisticated data types, such as morphological data in biological illustrations and associated textual descriptions. Our machine reading approach to scientific data integration and synthesis brings within reach many questions that are currently underdetermined and does so in ways that may stimulate entirely new modes of inquiry. PMID:25436610
Modal identification of spindle-tool unit in high-speed machining
NASA Astrophysics Data System (ADS)
Gagnol, Vincent; Le, Thien-Phu; Ray, Pascal
2011-10-01
The accurate knowledge of high-speed motorised spindle dynamic behaviour during machining is important in order to ensure the reliability of machine tools in service and the quality of machined parts. More specifically, the prediction of stable cutting regions, which is a critical requirement for high-speed milling operations, requires the accurate estimation of tool/holder/spindle set dynamic modal parameters. These estimations are generally obtained through Frequency Response Function (FRF) measurements of the non-rotating spindle. However, significant changes in modal parameters are expected to occur during operation, due to high-speed spindle rotation. The spindle's modal variations are highlighted through an integrated finite element model of the dynamic high-speed spindle-bearing system, taking into account rotor dynamics effects. The dependency of dynamic behaviour on speed range is then investigated and determined with accuracy. The objective of the proposed paper is to validate these numerical results through an experiment-based approach. Hence, an experimental setup is elaborated to measure rotating tool vibration during the machining operation in order to determine the spindle's modal frequency variation with respect to spindle speed in an industrial environment. The identification of natural frequencies of the spindle under rotating conditions is challenging, due to the low number of sensors and the presence of many harmonics in the measured signals. In order to overcome these issues and to extract the characteristics of the system, the spindle modes are determined through a 3-step procedure. First, spindle modes are highlighted using the Frequency Domain Decomposition (FDD) technique, with a new formulation at the considered rotating speed. These extracted modes are then analysed through the value of their respective damping ratios in order to separate the harmonics component from structural spindle natural frequencies. Finally, the stochastic
Jung, Haejoon; Lee, In-Ho
2018-01-12
As an intrinsic part of the Internet of Things (IoT) ecosystem, machine-to-machine (M2M) communications are expected to provide ubiquitous connectivity between machines. Millimeter-wave (mmWave) communication is another promising technology for future communication systems to alleviate the pressure of scarce spectrum resources. For this reason, in this paper, we consider multi-hop M2M communications, where a machine-type communication (MTC) device with limited transmit power relays to help other devices using mmWave. Specifically, we focus on hop distance statistics and their impact on system performance in multi-hop wireless networks (MWNs) with directional antenna arrays in mmWave for M2M communications. Unlike in microwave systems, the wireless channel in mmWave communications suffers from blockage by obstacles that heavily attenuate line-of-sight signals, which may result in limited per-hop progress in MWNs. We consider two routing strategies aimed at different types of applications and derive the probability distributions of their hop distances. Moreover, we provide their baseline statistics assuming the blockage-free scenario to quantify the impact of blockages. Based on the hop distance analysis, we propose a method to estimate the end-to-end performance (e.g., outage probability, hop count, and transmit energy) of mmWave MWNs, which provides important insights into mmWave MWN design without time-consuming and repetitive end-to-end simulation. PMID:29329248
Support vector machines-based fault diagnosis for turbo-pump rotor
NASA Astrophysics Data System (ADS)
Yuan, Sheng-Fa; Chu, Fu-Lei
2006-05-01
Most artificial intelligence methods used in fault diagnosis are based on the empirical risk minimisation principle and generalise poorly when fault samples are few. The support vector machine (SVM) is a new general machine-learning tool based on the structural risk minimisation principle that exhibits good generalisation even when fault samples are few. Fault diagnosis based on SVM is discussed. Since the basic SVM was originally designed for two-class classification, while most fault diagnosis problems are multi-class cases, a new multi-class SVM classification algorithm named 'one to others' is presented to solve multi-class recognition problems. It is a binary tree classifier composed of several two-class classifiers organised by fault priority; it is simple, requires little repeated training, and speeds up both training and recognition. The effectiveness of the method is verified by its application to fault diagnosis of a turbo-pump rotor.
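One plausible reading of the 'one to others' scheme is a chain of two-class decisions ordered by fault priority: the first node separates the highest-priority class from all remaining classes, the second node separates the next class from what is left, and so on. The sketch below follows that reading and is not the authors' code. To stay dependency-free it uses a nearest-centroid rule as a stand-in binary classifier, which is NOT an SVM; any trainer with the same interface, such as a two-class SVM, could be passed as `train_binary`. All function names are hypothetical.

```python
def train_centroid(pos, neg):
    """Stand-in two-class learner (nearest class centroid, not an SVM)."""
    def centroid(points):
        return [sum(col) / len(points) for col in zip(*points)]

    cp, cn = centroid(pos), centroid(neg)

    def is_positive(x):
        dp = sum((a - b) ** 2 for a, b in zip(x, cp))
        dn = sum((a - b) ** 2 for a, b in zip(x, cn))
        return dp <= dn

    return is_positive


def train_one_to_others(data, priority, train_binary=train_centroid):
    """Build the decision chain.

    data maps each class label to its training points; priority lists the
    labels from highest to lowest fault priority. Node k separates class
    priority[k] from all lower-priority classes, so k classes need k - 1
    two-class classifiers.
    """
    nodes = []
    for k, label in enumerate(priority[:-1]):
        others = [p for other in priority[k + 1:] for p in data[other]]
        nodes.append((label, train_binary(data[label], others)))
    return nodes, priority[-1]


def classify(model, x):
    """Walk the chain and return the first class whose node accepts x."""
    nodes, fallback = model
    for label, is_positive in nodes:
        if is_positive(x):
            return label
    return fallback
```

Each node is trained on a shrinking subset of the data, which is consistent with the abstract's remark that the scheme needs little repeated training compared with training every class against every other.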
Maraschin, Marcelo; Somensi-Zeggio, Amélia; Oliveira, Simone K; Kuhnen, Shirley; Tomazzoli, Maíra M; Raguzzoni, Josiane C; Zeri, Ana C M; Carreira, Rafael; Correia, Sara; Costa, Christopher; Rocha, Miguel
2016-01-22
The chemical composition of propolis is affected by environmental factors and harvest season, making it difficult to standardize its extracts for medicinal usage. By detecting a typical chemical profile associated with propolis from a specific production region or season, certain types of propolis may be used to obtain a specific pharmacological activity. In this study, propolis from three agroecological regions (plain, plateau, and highlands) from southern Brazil, collected over the four seasons of 2010, were investigated through a novel NMR-based metabolomics data analysis workflow. Chemometrics and machine learning algorithms (PLS-DA and RF), including methods to estimate variable importance in classification, were used in this study. The machine learning and feature selection methods permitted construction of models for propolis sample classification with high accuracy (>75%, reaching ∼90% in the best case), better discriminating samples regarding their collection seasons comparatively to the harvest regions. PLS-DA and RF allowed the identification of biomarkers for sample discrimination, expanding the set of discriminating features and adding relevant information for the identification of the class-determining metabolites. The NMR-based metabolomics analytical platform, coupled to bioinformatic tools, allowed characterization and classification of Brazilian propolis samples regarding the metabolite signature of important compounds, i.e., chemical fingerprint, harvest seasons, and production regions.
A New Type of Tea Baking Machine Based on Pro/E Design
NASA Astrophysics Data System (ADS)
Lin, Xin-Ying; Wang, Wei
2017-11-01
In this paper, the production process of wulong tea is discussed, mainly the effect of baking on the quality of the tea. The suitable baking temperatures for different teas are introduced. Based on Pro/E, a new type of baking machine suitable for wulong tea baking was designed. The working principle, mechanical structure and constant-temperature, timed intelligent control system of the baking machine are described. Finally, the characteristics and innovations of the new baking machine are discussed. The mechanical structure of this baking machine is simpler and more reasonable, and it can reuse the heat of the inlet and outlet, which saves energy and is better for the environment. The temperature control part adopts fuzzy PID control, which can improve the accuracy and response speed of temperature control and reduce the dependence of the baking operation on operator skill and experience.
Spectral Unmixing Based Construction of Lunar Mineral Abundance Maps
NASA Astrophysics Data System (ADS)
Bernhardt, V.; Grumpe, A.; Wöhler, C.
2017-07-01
In this study we apply a nonlinear spectral unmixing algorithm to a nearly global lunar spectral reflectance mosaic derived from hyper-spectral image data acquired by the Moon Mineralogy Mapper (M3) instrument. Corrections for topographic effects and for thermal emission were performed. A set of 19 laboratory-based reflectance spectra of lunar samples published by the Lunar Soil Characterization Consortium (LSCC) were used as a catalog of potential endmember spectra. For a given spectrum, the multi-population population-based incremental learning (MPBIL) algorithm was used to determine the subset of endmembers actually contained in it. However, as the MPBIL algorithm is computationally expensive, it cannot be applied to all pixels of the reflectance mosaic. Hence, the reflectance mosaic was clustered into a set of 64 prototype spectra, and the MPBIL algorithm was applied to each prototype spectrum. Each pixel of the mosaic was assigned to the most similar prototype, and the set of endmembers previously determined for that prototype was used for pixel-wise nonlinear spectral unmixing using the Hapke model, implemented as linear unmixing of the single-scattering albedo spectrum. This procedure yields maps of the fractional abundances of the 19 endmembers. Based on the known modal abundances of a variety of mineral species in the LSCC samples, a conversion from endmember abundances to mineral abundances was performed. We present maps of the fractional abundances of plagioclase, pyroxene and olivine and compare our results with previously published lunar mineral abundance maps.
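The two-stage procedure described above (assign each pixel to its most similar prototype, then unmix it using only that prototype's endmember subset) can be sketched as follows. This is a simplified illustration under stated assumptions: spectra are assumed to be already converted to single-scattering albedo so that linear unmixing applies, similarity is taken to be Euclidean distance, the nonnegative least-squares step uses plain projected gradient descent, and the three-band catalog, prototypes and subsets are invented. The MPBIL subset selection itself is not shown.

```python
def nearest_prototype(pixel, prototypes):
    """Index of the prototype spectrum closest to the pixel (squared Euclidean)."""
    return min(
        range(len(prototypes)),
        key=lambda k: sum((p - q) ** 2 for p, q in zip(pixel, prototypes[k])),
    )

def unmix_nonneg(spectrum, endmembers, iters=500):
    """Nonnegative least squares by projected gradient descent.

    endmembers: list of spectra; returns one abundance per endmember.
    """
    n = len(endmembers)
    # 1 / trace(E^T E) is a conservative step size (trace >= largest eigenvalue).
    step = 1.0 / sum(v * v for e in endmembers for v in e)
    a = [0.0] * n
    for _ in range(iters):
        model = [sum(a[j] * endmembers[j][b] for j in range(n))
                 for b in range(len(spectrum))]
        resid = [m - s for m, s in zip(model, spectrum)]
        grad = [sum(r * endmembers[j][b] for b, r in enumerate(resid))
                for j in range(n)]
        a = [max(0.0, a[j] - step * grad[j]) for j in range(n)]  # project onto a >= 0
    return a

# Hypothetical three-band catalog and two prototypes with their endmember subsets.
catalog = {"A": [1.0, 0.0, 0.5], "B": [0.0, 1.0, 0.5], "C": [0.5, 0.5, 1.0]}
prototypes = [[0.3, 0.7, 0.5], [0.5, 0.5, 1.0]]
subsets = [["A", "B"], ["C"]]  # as if selected per prototype beforehand

pixel = [0.3, 0.7, 0.5]
k = nearest_prototype(pixel, prototypes)
abundances = unmix_nonneg(pixel, [catalog[name] for name in subsets[k]])
```

Running the expensive subset search once per prototype rather than once per pixel is what makes the per-pixel step cheap; here it reduces to a small constrained least-squares problem.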
ERIC Educational Resources Information Center
Chen, Hsinchun
2003-01-01
Discusses information retrieval techniques used on the World Wide Web. Topics include machine learning in information extraction; relevance feedback; information filtering and recommendation; text classification and text clustering; Web mining, based on data mining techniques; hyperlink structure; and Web size. (LRW)
Methodology for creating dedicated machine and algorithm on sunflower counting
NASA Astrophysics Data System (ADS)
Muracciole, Vincent; Plainchault, Patrick; Mannino, Maria-Rosaria; Bertrand, Dominique; Vigouroux, Bertrand
2007-09-01
In order to sell grain lots in European countries, seed industries need a government certification. This certification requires purity testing, seed counting in order to quantify specified seed species and other impurities in lots, and germination testing. These analyses are carried out within the framework of international trade according to the methods of the International Seed Testing Association. At present, these analyses are still performed manually by skilled operators. Previous work has shown that seeds can be characterized by around 110 visual features (morphology, colour, texture), and several identification algorithms have been presented. Until now, most of the work in this domain has been computer based. The approach presented in this article is based on the design of a dedicated electronic vision machine intended to identify and sort seeds. This machine is composed of an FPGA (Field Programmable Gate Array), a DSP (Digital Signal Processor) and a PC bearing the GUI (Human Machine Interface) of the system. Its operation relies on the stroboscopic image acquisition of a seed falling in front of a camera. A first machine was designed according to this approach, in order to simulate the whole vision chain (image acquisition, feature extraction, identification) under the Matlab environment. In order to port this task to dedicated hardware, all these algorithms were developed without the use of the Matlab toolbox. The objective of this article is to present a design methodology for implementing a special-purpose identification algorithm, based on the distance between groups, in a dedicated hardware machine for seed counting.
On Intelligent Design and Planning Method of Process Route Based on Gun Breech Machining Process
NASA Astrophysics Data System (ADS)
Hongzhi, Zhao; Jian, Zhang
2018-03-01
This paper presents an approach to the intelligent design and planning of process routes based on the gun breech machining process, addressing several problems: the complexity of the machining process, tedious route design, and the long lead time and poor manageability of the traditional process route. Based on the gun breech machining process, an intelligent process route design and planning system is developed using DEST and VC++. The system includes two functional modules: intelligent process route design and process route planning. The intelligent design module analyses the gun breech machining process and summarizes breech process knowledge in order to build the knowledge base and inference engine, and then outputs the gun breech process route automatically. Building on the intelligent route design module, the final process route is created, edited and managed in the process route planning module.
An active role for machine learning in drug development
Murphy, Robert F.
2014-01-01
Due to the complexity of biological systems, cutting-edge machine-learning methods will be critical for future drug development. In particular, machine-vision methods to extract detailed information from imaging assays and active-learning methods to guide experimentation will be required to overcome the dimensionality problem in drug development. PMID:21587249
Scalable Machine Learning for Massive Astronomical Datasets
NASA Astrophysics Data System (ADS)
Ball, Nicholas M.; Gray, A.
2014-04-01
We present the ability to perform data mining and machine learning operations on a catalog of half a billion astronomical objects. This is the result of the combination of robust, highly accurate machine learning algorithms with linear scalability that renders the applications of these algorithms to massive astronomical data tractable. We demonstrate the core algorithms kernel density estimation, K-means clustering, linear regression, nearest neighbors, random forest and gradient-boosted decision tree, singular value decomposition, support vector machine, and two-point correlation function. Each of these is relevant for astronomical applications such as finding novel astrophysical objects, characterizing artifacts in data, object classification (including for rare objects), object distances, finding the important features describing objects, density estimation of distributions, probabilistic quantities, and exploring the unknown structure of new data. The software, Skytree Server, runs on any UNIX-based machine, a virtual machine, or cloud-based and distributed systems including Hadoop. We have integrated it on the cloud computing system of the Canadian Astronomical Data Centre, the Canadian Advanced Network for Astronomical Research (CANFAR), creating the world's first cloud computing data mining system for astronomy. We demonstrate results showing the scaling of each of our major algorithms on large astronomical datasets, including the full 470,992,970 objects of the 2 Micron All-Sky Survey (2MASS) Point Source Catalog. We demonstrate the ability to find outliers in the full 2MASS dataset utilizing multiple methods, e.g., nearest neighbors. This is likely of particular interest to the radio astronomy community given, for example, that survey projects contain groups dedicated to this topic. 2MASS is used as a proof-of-concept dataset due to its convenience and availability. These results are of interest to any astronomical project with large and/or complex
Scalable Machine Learning for Massive Astronomical Datasets
NASA Astrophysics Data System (ADS)
Ball, Nicholas M.; Astronomy Data Centre, Canadian
2014-01-01
We present the ability to perform data mining and machine learning operations on a catalog of half a billion astronomical objects. This is the result of the combination of robust, highly accurate machine learning algorithms with linear scalability that renders the applications of these algorithms to massive astronomical data tractable. We demonstrate the core algorithms kernel density estimation, K-means clustering, linear regression, nearest neighbors, random forest and gradient-boosted decision tree, singular value decomposition, support vector machine, and two-point correlation function. Each of these is relevant for astronomical applications such as finding novel astrophysical objects, characterizing artifacts in data, object classification (including for rare objects), object distances, finding the important features describing objects, density estimation of distributions, probabilistic quantities, and exploring the unknown structure of new data. The software, Skytree Server, runs on any UNIX-based machine, a virtual machine, or cloud-based and distributed systems including Hadoop. We have integrated it on the cloud computing system of the Canadian Astronomical Data Centre, the Canadian Advanced Network for Astronomical Research (CANFAR), creating the world's first cloud computing data mining system for astronomy. We demonstrate results showing the scaling of each of our major algorithms on large astronomical datasets, including the full 470,992,970 objects of the 2 Micron All-Sky Survey (2MASS) Point Source Catalog. We demonstrate the ability to find outliers in the full 2MASS dataset utilizing multiple methods, e.g., nearest neighbors, and the local outlier factor. 2MASS is used as a proof-of-concept dataset due to its convenience and availability. These results are of interest to any astronomical project with large and/or complex datasets that wishes to extract the full scientific value from its data.
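The nearest-neighbor outlier search mentioned above can be illustrated with a brute-force sketch. It is not the Skytree Server implementation, whose API is not described here. Each point is scored by the distance to its k-th nearest neighbor, so isolated points score highest. The cost is quadratic in the number of points, so this is only a statement of the idea and would not scale to a catalog such as 2MASS. The sample points are invented.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Four clustered points and one isolated point (hypothetical 2-D features).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = knn_outlier_scores(points, k=2)
most_isolated = scores.index(max(scores))
```

A scalable version would replace the inner loop with a spatial index such as a k-d tree, which is the kind of structure that makes near-linear scaling plausible.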
Device for Extracting Flavors and Fragrances
NASA Technical Reports Server (NTRS)
Chang, F. R.
1986-01-01
Machine for making coffee and tea in weightless environment may prove even more valuable on Earth as general extraction apparatus. Zero-gravity beverage maker uses piston instead of gravity to move hot water and beverage from one chamber to other and dispense beverage. Machine functions like conventional coffeemaker during part of operating cycle and includes additional features that enable operation not only in zero gravity but also extraction under pressure in presence or absence of gravity.
Integrated feature extraction and selection for neuroimage classification
NASA Astrophysics Data System (ADS)
Fan, Yong; Shen, Dinggang
2009-02-01
Feature extraction and selection are of great importance in neuroimage classification for identifying informative features and reducing feature dimensionality, which are generally implemented as two separate steps. This paper presents an integrated feature extraction and selection algorithm with two iterative steps: constrained subspace learning based feature extraction and support vector machine (SVM) based feature selection. The subspace learning based feature extraction focuses on the brain regions with higher possibility of being affected by the disease under study, while the possibility of brain regions being affected by disease is estimated by the SVM based feature selection, in conjunction with SVM classification. This algorithm can not only take into account the inter-correlation among different brain regions, but also overcome the limitation of traditional subspace learning based feature extraction methods. To achieve robust performance and optimal selection of parameters involved in feature extraction, selection, and classification, a bootstrapping strategy is used to generate multiple versions of training and testing sets for parameter optimization, according to the classification performance measured by the area under the ROC (receiver operating characteristic) curve. The integrated feature extraction and selection method is applied to a structural MR image based Alzheimer's disease (AD) study with 98 non-demented and 100 demented subjects. Cross-validation results indicate that the proposed algorithm can improve performance of the traditional subspace learning based classification.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chikkagoudar, Satish; Chatterjee, Samrat; Thomas, Dennis G.
The absence of a robust and unified theory of cyber dynamics presents challenges and opportunities for using machine learning based data-driven approaches to further the understanding of the behavior of such complex systems. Analysts can also use machine learning approaches to gain operational insights. In order to be operationally beneficial, cybersecurity machine learning based models need to have the ability to: (1) represent a real-world system, (2) infer system properties, and (3) learn and adapt based on expert knowledge and observations. Probabilistic models and probabilistic graphical models provide these necessary properties and are further explored in this chapter. Bayesian Networks and Hidden Markov Models are introduced as examples of a widely used data-driven classification/modeling strategy.
GWAS-based machine learning approach to predict duloxetine response in major depressive disorder.
Maciukiewicz, Malgorzata; Marshe, Victoria S; Hauschild, Anne-Christin; Foster, Jane A; Rotzinger, Susan; Kennedy, James L; Kennedy, Sidney H; Müller, Daniel J; Geraci, Joseph
2018-04-01
Major depressive disorder (MDD) is one of the most prevalent psychiatric disorders and is commonly treated with antidepressant drugs. However, large variability is observed in terms of response to antidepressants. Machine learning (ML) models may be useful to predict treatment outcomes. A sample of 186 MDD patients who received treatment with duloxetine for up to 8 weeks were categorized as "responders" based on a MADRS change >50% from baseline, or "remitters" based on a MADRS score ≤10 at end point. The initial dataset (N = 186) was randomly divided into training and test sets in a nested 5-fold cross-validation, where 80% was used as a training set and 20% made up five independent test sets. We performed genome-wide logistic regression to identify potentially significant variants related to duloxetine response/remission and extracted the most promising predictors using LASSO regression. Subsequently, classification-regression trees (CRT) and support vector machines (SVM) were applied to construct models, using ten-fold cross-validation. With regard to response, none of the pairs performed significantly better than chance (accuracy p > .1). For remission, SVM achieved moderate performance with an accuracy = 0.52, a sensitivity = 0.58, and a specificity = 0.46, and 0.51 for all coefficients for CRT. The best performing SVM fold was characterized by an accuracy = 0.66 (p = .071), a sensitivity = 0.70, and a specificity = 0.61. In this study, the potential of using GWAS data to predict duloxetine outcomes was examined using ML models. The models were characterized by a promising sensitivity, but specificity remained moderate at best. The inclusion of additional non-genetic variables to create integrated models may improve prediction. Copyright © 2017. Published by Elsevier Ltd.
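The outcome definitions and performance measures quoted above are simple enough to state as code. The following sketch assumes the "MADRS change >50%" criterion means a relative reduction from baseline, and uses the standard confusion-matrix definitions of accuracy, sensitivity and specificity. The function names, example scores and labels are hypothetical and are not study data.

```python
def is_responder(baseline, endpoint):
    """Response: MADRS score reduced by more than 50% relative to baseline."""
    return (baseline - endpoint) / baseline > 0.5

def is_remitter(endpoint):
    """Remission: MADRS score of 10 or less at end point."""
    return endpoint <= 10

def classification_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for boolean labels."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    tn = sum((not t) and (not p) for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn)  # fraction of true remitters found
    specificity = tn / (tn + fp)  # fraction of non-remitters correctly rejected
    return accuracy, sensitivity, specificity

# Hypothetical labels for five patients: actual remission vs. model prediction.
y_true = [True, True, True, False, False]
y_pred = [True, True, False, False, True]
accuracy, sensitivity, specificity = classification_metrics(y_true, y_pred)
```

Reporting sensitivity and specificity separately matters here because a model with accuracy near 0.5 on a balanced outcome can still be lopsided between the two.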
Predicting the Performance of Chain Saw Machines Based on Shore Scleroscope Hardness
NASA Astrophysics Data System (ADS)
Tumac, Deniz
2014-03-01
Shore hardness has been used to estimate several physical and mechanical properties of rocks over the last few decades. However, the number of studies correlating Shore hardness with rock cutting performance is quite limited. Also, rather limited research has been carried out on predicting the performance of chain saw machines. This study differs from previous investigations in that Shore hardness values (SH1, SH2, and deformation coefficient) are used to determine the field performance of chain saw machines. The measured Shore hardness values are correlated with the physical and mechanical properties of natural stone samples, cutting parameters (normal force, cutting force, and specific energy) obtained from linear cutting tests in unrelieved cutting mode, and areal net cutting rate of chain saw machines. Two empirical models developed previously are improved for the prediction of the areal net cutting rate of chain saw machines. The first model is based on a revised chain saw penetration index, which uses SH1, machine weight, and useful arm cutting depth as predictors. The second model is based on the power consumed for only cutting the stone, arm thickness, and specific energy as a function of the deformation coefficient. While cutting force has a strong relationship with Shore hardness values, the normal force has a weak or moderate correlation. Uniaxial compressive strength, Cerchar abrasivity index, and density can also be predicted by Shore hardness values.
NASA Astrophysics Data System (ADS)
Pathak, Jaideep; Wikner, Alexander; Fussell, Rebeckah; Chandra, Sarthak; Hunt, Brian R.; Girvan, Michelle; Ott, Edward
2018-04-01
A model-based approach to forecasting chaotic dynamical systems utilizes knowledge of the mechanistic processes governing the dynamics to build an approximate mathematical model of the system. In contrast, machine learning techniques have demonstrated promising results for forecasting chaotic systems purely from past time series measurements of system state variables (training data), without prior knowledge of the system dynamics. The motivation for this paper is the potential of machine learning for filling in the gaps in our underlying mechanistic knowledge that cause widely-used knowledge-based models to be inaccurate. Thus, we here propose a general method that leverages the advantages of these two approaches by combining a knowledge-based model and a machine learning technique to build a hybrid forecasting scheme. Potential applications for such an approach are numerous (e.g., improving weather forecasting). We demonstrate and test the utility of this approach using a particular illustrative version of machine learning known as reservoir computing, and we apply the resulting hybrid forecaster to a low-dimensional chaotic system, as well as to a high-dimensional spatiotemporal chaotic system. These tests yield extremely promising results in that our hybrid technique is able to accurately predict for a much longer period of time than either its machine-learning component or its model-based component alone.
Jaya, T; Dheeba, J; Singh, N Albert
2015-12-01
Diabetic retinopathy is a major cause of vision loss in diabetic patients. Currently, there is a need for making decisions using intelligent computer algorithms when screening a large volume of data. This paper presents an expert decision-making system designed using a fuzzy support vector machine (FSVM) classifier to detect hard exudates in fundus images. The optic discs in the colour fundus images are segmented using morphological operations and the circular Hough transform to avoid false alarms. To discriminate between exudate and non-exudate pixels, colour and texture features are extracted from the images. These features are given as input to the FSVM classifier. The classifier analysed 200 retinal images collected from diabetic retinopathy screening programmes. The tests made on the retinal images show that the proposed detection system has better discriminating power than the conventional support vector machine. With the best combination of FSVM and feature sets, the area under the receiver operating characteristic curve reached 0.9606, which corresponds to a sensitivity of 94.1% with a specificity of 90.0%. The results suggest that detecting hard exudates using FSVM contributes to computer-assisted detection of diabetic retinopathy and can serve as a decision support system for ophthalmologists.
Vision-Based People Detection System for Heavy Machine Applications
Fremont, Vincent; Bui, Manh Tuan; Boukerroui, Djamal; Letort, Pierrick
2016-01-01
This paper presents a vision-based people detection system for improving safety in heavy machines. We propose a perception system composed of a monocular fisheye camera and a LiDAR. Fisheye cameras have the advantage of a wide field-of-view, but the strong distortions that they create must be handled at the detection stage. Since people detection in fisheye images has not been well studied, we focus on investigating and quantifying the impact that strong radial distortions have on the appearance of people, and we propose approaches for handling this specificity, adapted from state-of-the-art people detection approaches. These adaptive approaches nevertheless have the drawback of high computational cost and complexity. Consequently, we also present a framework for harnessing the LiDAR modality in order to enhance the detection algorithm for different camera positions. A sequential LiDAR-based fusion architecture is used, which addresses directly the problem of reducing false detections and computational cost in an exclusively vision-based system. A heavy machine dataset was built, and different experiments were carried out to evaluate the performance of the system. The results are promising, in terms of both processing speed and performance. PMID:26805838
Prior-knowledge-based spectral mixture analysis for impervious surface mapping
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Jinshui; He, Chunyang; Zhou, Yuyu
2014-01-03
In this study, we developed a prior-knowledge-based spectral mixture analysis (PKSMA) to map impervious surfaces by using endmembers derived separately for high- and low-density urban regions. First, an urban area was categorized into high- and low-density urban areas, using a multi-step classification method. Next, in high-density urban areas that were assumed to have only vegetation and impervious surfaces (ISs), the Vegetation-Impervious model (V-I) was used in a spectral mixture analysis (SMA) with three endmembers: vegetation, high albedo, and low albedo. In low-density urban areas, the Vegetation-Impervious-Soil model (V-I-S) was used in an SMA analysis with four endmembers: high albedo, low albedo, soil, and vegetation. The fraction of IS with high and low albedo in each pixel was combined to produce the final IS map. The root mean-square error (RMSE) of the IS map produced using PKSMA was about 11.0%, compared to 14.52% using four-endmember SMA. Particularly in high-density urban areas, PKSMA (RMSE = 6.47%) showed better performance than four-endmember (15.91%). The results indicate that PKSMA can improve IS mapping compared to traditional SMA by using appropriately selected endmembers and is particularly strong in high-density urban areas.
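The final two steps described above, combining the high- and low-albedo fractions into one impervious-surface (IS) fraction per pixel and scoring the map by RMSE, can be written out as a short sketch. It assumes that "combined" means a simple sum of the two fractions, which the abstract does not state explicitly. The pixel fractions and reference values are invented for illustration.

```python
import math

def impervious_fraction(high_albedo, low_albedo):
    """IS fraction of a pixel, assumed here to be the sum of both albedo fractions."""
    return high_albedo + low_albedo

def rmse(estimated, reference):
    """Root mean-square error between estimated and reference fractions."""
    squared = [(e - r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical (high albedo, low albedo) fractions for two pixels.
pixels = [(0.30, 0.20), (0.10, 0.60)]
reference = [0.40, 0.90]  # hypothetical reference IS fractions
estimated = [impervious_fraction(h, l) for h, l in pixels]
error = rmse(estimated, reference)
```

An RMSE computed this way on fractions in the range 0 to 1 is what gets reported as a percentage, so a value of 0.11 corresponds to the 11.0% quoted for PKSMA.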
21 CFR 172.585 - Sugar beet extract flavor base.
Code of Federal Regulations, 2014 CFR
2014-04-01
... 21 Food and Drugs 3 2014-04-01 2014-04-01 false Sugar beet extract flavor base. 172.585 Section... Related Substances § 172.585 Sugar beet extract flavor base. Sugar beet extract flavor base may be safely used in food in accordance with the provisions of this section. (a) Sugar beet extract flavor base is...
ERIC Educational Resources Information Center
Mississippi Research and Curriculum Unit for Vocational and Technical Education, State College.
This document, which reflects Mississippi's statutory requirement that instructional programs be based on core curricula and performance-based assessment, contains outlines of the instructional units required in local instructional management plans and daily lesson plans for machine tool operation/machine shop I and II. Presented first are a…
Automatic vetting of planet candidates from ground based surveys: Machine learning with NGTS
NASA Astrophysics Data System (ADS)
Armstrong, David J.; Günther, Maximilian N.; McCormac, James; Smith, Alexis M. S.; Bayliss, Daniel; Bouchy, François; Burleigh, Matthew R.; Casewell, Sarah; Eigmüller, Philipp; Gillen, Edward; Goad, Michael R.; Hodgkin, Simon T.; Jenkins, James S.; Louden, Tom; Metrailler, Lionel; Pollacco, Don; Poppenhaeger, Katja; Queloz, Didier; Raynard, Liam; Rauer, Heike; Udry, Stéphane; Walker, Simon R.; Watson, Christopher A.; West, Richard G.; Wheatley, Peter J.
2018-05-01
State-of-the-art exoplanet transit surveys are producing ever increasing quantities of data. Making the best use of this resource, whether in detecting interesting planetary systems or in determining accurate planetary population statistics, requires new automated methods. Here we describe a machine learning algorithm that forms an integral part of the pipeline for the NGTS transit survey, demonstrating the efficacy of machine learning in selecting planetary candidates from multi-night ground based survey data. Our method uses a combination of random forests and self-organising maps to rank planetary candidates, achieving an AUC score of 97.6% in ranking 12368 injected planets against 27496 false positives in the NGTS data. We build on past examples by using injected transit signals to form a training set, a necessary development for applying similar methods to upcoming surveys. We also make the autovet code used to implement the algorithm publicly accessible. autovet is designed to perform machine learned vetting of planetary candidates, and can utilise a variety of methods. The apparent robustness of machine learning techniques, whether on space-based or the qualitatively different ground-based data, highlights their importance to future surveys such as TESS and PLATO and the need to better understand their advantages and pitfalls in an exoplanetary context.
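The AUC score quoted above has a direct ranking interpretation: it is the probability that a randomly chosen injected planet is ranked above a randomly chosen false positive. The sketch below computes it that way by comparing every pair, with ties counted as half. It is not part of the autovet code, and the candidate scores are invented. The pairwise loop is quadratic, so a sorting-based rank formula would be preferable for sample sizes like those in the abstract.

```python
def auc(scores_pos, scores_neg):
    """Probability that a random positive outranks a random negative (ties count half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical ranking scores: higher means "more planet-like".
planet_scores = [0.9, 0.8, 0.4]
false_positive_scores = [0.5, 0.3]
value = auc(planet_scores, false_positive_scores)
```

In this toy case five of the six planet/false-positive pairs are ordered correctly, so the AUC is 5/6.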
Real-time machine vision system using FPGA and soft-core processor
NASA Astrophysics Data System (ADS)
Malik, Abdul Waheed; Thörnberg, Benny; Meng, Xiaozhou; Imran, Muhammad
2012-06-01
This paper presents a machine vision system for real-time computation of distance and angle of a camera from reference points in the environment. Image pre-processing, component labeling and feature extraction modules were modeled at Register Transfer (RT) level and synthesized for implementation on field programmable gate arrays (FPGA). The extracted image component features were sent from the hardware modules to a soft-core processor, MicroBlaze, for computation of distance and angle. A CMOS imaging sensor operating at a clock frequency of 27 MHz was used in our experiments to produce a video stream at the rate of 75 frames per second. Image component labeling and feature extraction modules were running in parallel having a total latency of 13 ms. The MicroBlaze was interfaced with the component labeling and feature extraction modules through Fast Simplex Link (FSL). The latency for computing distance and angle of camera from the reference points was measured to be 2 ms on the MicroBlaze, running at 100 MHz clock frequency. In this paper, we present the performance analysis, device utilization and power consumption for the designed system. The FPGA based machine vision system that we propose has high frame speed, low latency and a power consumption that is much lower compared to commercially available smart camera solutions.
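The timing figures in this abstract can be cross-checked with a little arithmetic. The sketch below derives the frame period, the number of sensor clock cycles available per frame, and the number of MicroBlaze cycles spent in the 2 ms software stage. It assumes the hardware and software stages run one after the other for a given frame, which the abstract does not state, so the end-to-end figure is an estimate only.

```python
SENSOR_CLOCK_HZ = 27_000_000     # CMOS sensor clock (from the abstract)
FRAME_RATE_FPS = 75              # video stream rate (from the abstract)
HW_LATENCY_MS = 13               # labeling + feature extraction latency
MICROBLAZE_HZ = 100_000_000      # soft-core clock
MICROBLAZE_LATENCY_MS = 2        # distance/angle computation latency

frame_period_ms = 1000 / FRAME_RATE_FPS                       # about 13.33 ms
clocks_per_frame = SENSOR_CLOCK_HZ // FRAME_RATE_FPS          # sensor cycles per frame
microblaze_cycles = MICROBLAZE_HZ * MICROBLAZE_LATENCY_MS // 1000

# Assumption: the two stages are sequential for a single frame.
end_to_end_ms = HW_LATENCY_MS + MICROBLAZE_LATENCY_MS

# Each stage is shorter than one frame period, so a pipelined design could
# keep up with the 75 fps stream.
stages_fit = max(HW_LATENCY_MS, MICROBLAZE_LATENCY_MS) < frame_period_ms
```

Under these assumptions a result is available roughly 15 ms after a frame is captured, a little over one frame period.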
Nakai, Yasushi; Takiguchi, Tetsuya; Matsui, Gakuyo; Yamaoka, Noriko; Takada, Satoshi
2017-10-01
Abnormal prosody is often evident in the voice intonations of individuals with autism spectrum disorders. We compared a machine-learning-based voice analysis with human hearing judgments made by 10 speech therapists for classifying children with autism spectrum disorders (n = 30) and typical development (n = 51). Using stimuli limited to single-word utterances, machine-learning-based voice analysis was superior to speech therapist judgments. There was a significantly higher true-positive than false-negative rate for machine-learning-based voice analysis but not for speech therapists. Results are discussed in terms of some artificiality of clinician judgments based on single-word utterances, and the objectivity machine-learning-based voice analysis adds to judging abnormal prosody.
Accessible engineering drawings for visually impaired machine operators.
Ramteke, Deepak; Kansal, Gayatri; Madhab, Benu
2014-01-01
An engineering drawing provides manufacturing information to a machine operator. An operator plans and executes machining operations based on this information. A visually impaired (VI) operator does not have direct access to the drawings. Drawing information is provided to such operators verbally or by using sample parts. Both methods have limitations that affect the quality of output. Because the use of engineering drawings is standard practice in every industry, this lack of access hampers the employment of VI operators. Accessible engineering drawings are required to increase both the independence and the employability of VI operators. Today, Computer Aided Design (CAD) software is used for making engineering drawings, which are saved in CAD files. Required information is extracted from the CAD files and converted into Braille or voice. The authors of this article propose a method to make engineering drawing information directly accessible to a VI operator.
Evaluation of machinability and flexural strength of a novel dental machinable glass-ceramic.
Qin, Feng; Zheng, Shucan; Luo, Zufeng; Li, Yong; Guo, Ling; Zhao, Yunfeng; Fu, Qiang
2009-10-01
To evaluate the machinability and flexural strength of a novel dental machinable glass-ceramic (named PMC), and to compare the machinability property with that of Vita Mark II and human enamel. The raw batch materials were selected and mixed. Four groups of novel glass-ceramics were formed at different nucleation temperatures, and were assigned to Group 1, Group 2, Group 3 and Group 4. The machinability of the four groups of novel glass-ceramics, Vita Mark II ceramic and freshly extracted human premolars were compared by means of drilling depth measurement. A three-point bending test was used to measure the flexural strength of the novel glass-ceramics. The crystalline phases of the group with the best machinability were identified by X-ray diffraction. In terms of the drilling depth, Group 2 of the novel glass-ceramics proves to have the largest drilling depth. There was no statistical difference among Group 1, Group 4 and the natural teeth. The drilling depth of Vita MK II was statistically less than that of Group 1, Group 4 and the natural teeth. Group 3 had the least drilling depth. In respect of the flexural strength, Group 2 exhibited the maximum flexural strength; Group 1 was statistically weaker than Group 2; there was no statistical difference between Group 3 and Group 4, and they were the weakest materials. XRD of Group 2 ceramic showed that a new type of dental machinable glass-ceramic containing calcium-mica had been developed by the present study and was named PMC. PMC is promising for application as a dental machinable ceramic due to its good machinability and relatively high strength.
NASA Astrophysics Data System (ADS)
Tu, Shu-Ju; Wang, Chih-Wei; Pan, Kuang-Tse; Wu, Yi-Cheng; Wu, Chen-Te
2018-03-01
Lung cancer screening aims to detect small pulmonary nodules and decrease the mortality rate of those affected. However, studies from large-scale clinical trials of lung cancer screening have shown that the false-positive rate is high and positive predictive value is low. To address these problems, a technical approach is greatly needed for accurate malignancy differentiation among these early-detected nodules. We studied the clinical feasibility of an additional protocol of localized thin-section CT for further assessment on recalled patients from lung cancer screening tests. Our approach of localized thin-section CT was integrated with radiomics feature extraction and machine learning classification which was supervised by pathological diagnosis. Localized thin-section CT images of 122 nodules were retrospectively reviewed and 374 radiomics features were extracted. In this study, 48 nodules were benign and 74 malignant. There were nine patients with multiple nodules and four with synchronous multiple malignant nodules. Different machine learning classifiers with a stratified ten-fold cross-validation were used and repeated 100 times to evaluate classification accuracy. Of the image features extracted from the thin-section CT images, 238 (64%) were useful in differentiating between benign and malignant nodules. These useful features include CT density (p = 0.002518), sigma (p = 0.002781), uniformity (p = 0.03241), and entropy (p = 0.006685). The highest classification accuracy was 79% by the logistic classifier. The performance metrics of this logistic classification model were 0.80 for the positive predictive value, 0.36 for the false-positive rate, and 0.80 for the area under the receiver operating characteristic curve. Our approach of direct risk classification supervised by the pathological diagnosis with localized thin-section CT and radiomics feature extraction may support clinical physicians in determining
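The stratified ten-fold split mentioned above can be illustrated with the class sizes from the abstract (48 benign, 74 malignant). This is a minimal standard-library sketch of one way to build stratified folds; the function name and the round-robin dealing scheme are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import Counter


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin; the running offset keeps fold sizes balanced.
        for i, idx in enumerate(indices):
            folds[(i + offset) % k].append(idx)
        offset += len(indices)
    return folds


labels = ["benign"] * 48 + ["malignant"] * 74
folds = stratified_folds(labels, k=10)
for fold in folds:
    print(len(fold), dict(Counter(labels[i] for i in fold)))
```

Repeating the procedure with 100 different seeds would give the kind of repeated cross-validation the abstract describes; each fold then serves once as the held-out set.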
Machine-learning in grading of gliomas based on multi-parametric magnetic resonance imaging at 3T.
Citak-Er, Fusun; Firat, Zeynep; Kovanlikaya, Ilhami; Ture, Ugur; Ozturk-Isik, Esin
2018-06-15
The objective of this study was to assess the contribution of multi-parametric (mp) magnetic resonance imaging (MRI) quantitative features in the machine learning-based grading of gliomas with a multi-region-of-interests approach. Forty-three patients who were newly diagnosed as having a glioma were included in this study. The patients were scanned prior to any therapy using a standard brain tumor magnetic resonance (MR) imaging protocol that included T1 and T2-weighted, diffusion-weighted, diffusion tensor, MR perfusion and MR spectroscopic imaging. Three different regions-of-interest were drawn for each subject to encompass tumor, immediate tumor periphery, and distant peritumoral edema/normal. The normalized mp-MRI features were used to build machine-learning models for differentiating low-grade gliomas (WHO grades I and II) from high grades (WHO grades III and IV). In order to assess the contribution of regional mp-MRI quantitative features to the classification models, a support vector machine-based recursive feature elimination method was applied prior to classification. A machine-learning model based on support vector machine algorithm with linear kernel achieved an accuracy of 93.0%, a specificity of 86.7%, and a sensitivity of 96.4% for the grading of gliomas using ten-fold cross validation based on the proposed subset of the mp-MRI features. In this study, machine-learning based on multiregional and multi-parametric MRI data has proven to be an important tool in grading glial tumors accurately even in this limited patient population. Future studies are needed to investigate the use of machine learning algorithms for brain tumor classification in a larger patient cohort. Copyright © 2018. Published by Elsevier Ltd.
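The three reported figures are the usual confusion-matrix ratios. The counts in the sketch below are not given in the abstract: they are one set of whole-number counts that reproduces the quoted percentages for 43 patients, and treating high-grade glioma as the positive class is likewise an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
    }


# Inferred (hypothetical) counts: 28 positive and 15 negative cases, 43 in total.
metrics = classification_metrics(tp=27, fn=1, tn=13, fp=2)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```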
Araki, Tadashi; Ikeda, Nobutaka; Shukla, Devarshi; Jain, Pankaj K; Londhe, Narendra D; Shrivastava, Vimal K; Banchhor, Sumit K; Saba, Luca; Nicolaides, Andrew; Shafique, Shoaib; Laird, John R; Suri, Jasjit S
2016-05-01
Percutaneous coronary interventional procedures need advance planning prior to stenting or an endarterectomy. Cardiologists use intravascular ultrasound (IVUS) for screening, risk assessment and stratification of coronary artery disease (CAD). We hypothesize that plaque components are vulnerable to rupture due to plaque progression. Currently, there are no standard grayscale IVUS tools for risk assessment of plaque rupture. This paper presents a novel strategy for risk stratification based on plaque morphology embedded with principal component analysis (PCA) for plaque feature dimensionality reduction and dominant feature selection technique. The risk assessment utilizes 56 grayscale coronary features in a machine learning framework while linking information from carotid and coronary plaque burdens due to their common genetic makeup. This system consists of a machine learning paradigm which uses a support vector machine (SVM) combined with PCA for optimal and dominant coronary artery morphological feature extraction. The proven carotid intima-media thickness (cIMT) biomarker is adopted as the gold standard during the training phase of the machine learning system. For the performance evaluation, a K-fold cross-validation protocol is adopted with 20 trials per fold. For choosing the dominant features out of the 56 grayscale features, a polling strategy of PCA is adopted in which the original values of the features are unaltered. Different protocols are designed for establishing the stability and reliability criteria of the coronary risk assessment system (cRAS). Using the PCA-based machine learning paradigm and cross-validation protocol, a classification accuracy of 98.43% (AUC 0.98) with K=10 folds using an SVM radial basis function (RBF) kernel was achieved. A reliability index of 97.32% and machine learning stability criteria of 5% were met for the cRAS. This is the first computer-aided diagnosis (CADx) system of its kind that is able to demonstrate the ability of coronary
Jian, Yulin; Huang, Daoyu; Yan, Jia; Lu, Kun; Huang, Ying; Wen, Tailai; Zeng, Tanyue; Zhong, Shijie; Xie, Qilong
2017-01-01
A novel classification model, named the quantum-behaved particle swarm optimization (QPSO)-based weighted multiple kernel extreme learning machine (QWMK-ELM), is proposed in this paper. Experimental validation is carried out with two different electronic nose (e-nose) datasets. Unlike in existing multiple kernel extreme learning machine (MK-ELM) algorithms, the combination coefficients of base kernels are regarded as external parameters of single-hidden layer feedforward neural networks (SLFNs). The combination coefficients of base kernels, the model parameters of each base kernel, and the regularization parameter are optimized by QPSO simultaneously before implementing the kernel extreme learning machine (KELM) with the composite kernel function. Four types of common single kernel functions (Gaussian kernel, polynomial kernel, sigmoid kernel, and wavelet kernel) are utilized to constitute different composite kernel functions. Moreover, the method is also compared with other existing classification methods: extreme learning machine (ELM), kernel extreme learning machine (KELM), k-nearest neighbors (KNN), support vector machine (SVM), multi-layer perceptron (MLP), radial basis function neural network (RBFNN), and probabilistic neural network (PNN). The results have demonstrated that the proposed QWMK-ELM outperforms the aforementioned methods, not only in precision, but also in efficiency for gas classification. PMID:28629202
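A weighted composite kernel of the kind described above is a weighted sum of base kernels. The sketch below combines three of the four named kernel types (the wavelet kernel is omitted for brevity); the weights and kernel parameters are fixed, hypothetical values, whereas the abstract says they are tuned by QPSO.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def gaussian_kernel(x, y, sigma=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    return (dot(x, y) + coef0) ** degree


def sigmoid_kernel(x, y, alpha=0.1, coef0=0.0):
    return math.tanh(alpha * dot(x, y) + coef0)


BASE_KERNELS = (gaussian_kernel, polynomial_kernel, sigmoid_kernel)


def composite_kernel(x, y, weights):
    """Weighted sum of the base kernels; one weight per base kernel."""
    return sum(w * k(x, y) for w, k in zip(weights, BASE_KERNELS))


# Hypothetical sensor-response vectors and combination coefficients.
x = [0.2, 0.5, 0.1]
y = [0.3, 0.4, 0.0]
weights = (0.5, 0.3, 0.2)
print(composite_kernel(x, y, weights))
```

In a kernel ELM the composite kernel would be evaluated on all training pairs to build the kernel matrix; that step is not shown here.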
Ikushima, Koujiro; Arimura, Hidetaka; Jin, Ze; Yabu-Uchi, Hidetake; Kuwazuru, Jumpei; Shioyama, Yoshiyuki; Sasaki, Tomonari; Honda, Hiroshi; Sasaki, Masayuki
2017-01-01
We have proposed a computer-assisted framework for machine-learning-based delineation of gross tumor volumes (GTVs) following an optimum contour selection (OCS) method. The key idea of the proposed framework was to feed image features around GTV contours (determined based on the knowledge of radiation oncologists) into a machine-learning classifier during the training step, after which the classifier produces the 'degree of GTV' for each voxel in the testing step. Initial GTV regions were extracted using a support vector machine (SVM) that learned the image features inside and outside each tumor region (determined by radiation oncologists). The leave-one-out-by-patient test was employed for the training and testing steps of the proposed framework. The final GTV regions were determined using the OCS method, which can be used to select a global optimum object contour based on multiple active delineations with a level set method (LSM) around the GTV. The efficacy of the proposed framework was evaluated in 14 lung cancer cases [solid: 6, ground-glass opacity (GGO): 4, mixed GGO: 4] using the 3D Dice similarity coefficient (DSC), which denotes the degree of region similarity between the GTVs contoured by radiation oncologists and those determined using the proposed framework. The proposed framework achieved an average DSC of 0.777 for 14 cases, whereas the OCS-based framework produced an average DSC of 0.507. The average DSCs for GGO and mixed GGO were 0.763 and 0.701, respectively, obtained by the proposed framework. The proposed framework can be employed as a tool to assist radiation oncologists in delineating various GTV regions. © The Author 2016. Published by Oxford University Press on behalf of The Japan Radiation Research Society and Japanese Society for Radiation Oncology.
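The 3D Dice similarity coefficient used for the evaluation is twice the overlap of two regions divided by the sum of their sizes. A minimal sketch on voxel coordinate sets follows; the two cubic regions are made-up test shapes, not data from the study.

```python
def dice_coefficient(region_a, region_b):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    a, b = set(region_a), set(region_b)
    if not a and not b:
        return 1.0  # two empty regions are treated as identical
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical regions: a 4x4x4 cube and the same cube shifted by one voxel in x.
reference = {(x, y, z) for x in range(4) for y in range(4) for z in range(4)}
predicted = {(x + 1, y, z) for (x, y, z) in reference}

print(dice_coefficient(reference, predicted))
```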
Grouin, Cyril; Zweigenbaum, Pierre
2013-01-01
In this paper, we present a comparison of two approaches to automatically de-identify medical records written in French: a rule-based system and a machine-learning based system using a conditional random fields (CRF) formalism. Both systems have been designed to process nine identifiers in a corpus of medical records in cardiology. We performed two evaluations: first, on 62 documents in cardiology, and second, on 10 documents in foetopathology, produced by optical character recognition (OCR), to evaluate the robustness of our systems. We achieved a 0.843 (rule-based) and 0.883 (machine-learning) exact match overall F-measure in cardiology. While the rule-based system allowed us to achieve good results on nominative (first and last names) and numerical data (dates, phone numbers, and zip codes), the machine-learning approach performed best on more complex categories (postal addresses, hospital names, medical devices, and towns). On the foetopathology corpus, although our systems have not been designed for this corpus and despite OCR character recognition errors, we obtained promising results: a 0.681 (rule-based) and 0.638 (machine-learning) exact-match overall F-measure. This demonstrates that existing tools can be applied to process new documents of lower quality.
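An exact-match F-measure counts a predicted identifier as correct only if its boundaries and category both match the gold annotation. The sketch below shows one common way to compute it; the annotation tuples (start offset, end offset, category) are invented for illustration and do not come from the corpora in the abstract.

```python
def exact_match_scores(gold, predicted):
    """Precision, recall and F-measure under exact span-and-label matching."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical annotations: (start offset, end offset, category).
gold = {(0, 5, "NAME"), (10, 20, "DATE"), (30, 42, "ADDRESS")}
predicted = {(0, 5, "NAME"), (10, 20, "DATE"), (30, 40, "ADDRESS"), (50, 55, "TOWN")}

precision, recall, f_measure = exact_match_scores(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F={f_measure:.3f}")
```

The address span with a wrong end offset counts as both a false positive and a false negative, which is why exact matching is stricter than overlap-based scoring.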
Extracting microRNA-gene relations from biomedical literature using distant supervision
Lamurias, Andre; Clarke, Luka A.; Couto, Francisco M.
2017-01-01
Many biomedical relation extraction approaches are based on supervised machine learning, requiring an annotated corpus. Distant supervision aims at training a classifier by combining a knowledge base with a corpus, reducing the amount of manual effort necessary. This is particularly useful for biomedicine because many databases and ontologies have been made available for many biological processes, while the availability of annotated corpora is still limited. We studied the extraction of microRNA-gene relations from text. MicroRNA regulation is an important biological process due to its close association with human diseases. The proposed method, IBRel, is based on distantly supervised multi-instance learning. We evaluated IBRel on three datasets, and the results were compared with a co-occurrence approach as well as a supervised machine learning algorithm. While supervised learning outperformed on two of those datasets, IBRel obtained an F-score 28.3 percentage points higher on the dataset for which there was no training set developed specifically. To demonstrate the applicability of IBRel, we used it to extract 27 miRNA-gene relations from recently published papers about cystic fibrosis. Our results demonstrate that our method can be successfully used to extract relations from literature about a biological process without an annotated corpus. The source code and data used in this study are available at https://github.com/AndreLamurias/IBRel. PMID:28263989
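The core idea of distant supervision with multi-instance learning can be sketched in a few lines: sentences mentioning the same entity pair are grouped into a bag, and the bag is labeled positive when the pair appears in the knowledge base. Everything below (entity names, sentences, knowledge-base entries, function names) is a hypothetical placeholder and is not taken from the IBRel code.

```python
from collections import defaultdict

# Hypothetical knowledge base of known miRNA-gene relations.
knowledge_base = {("miR-A", "GENE1"), ("miR-B", "GENE2")}

# Hypothetical corpus: (sentence, miRNA mentions, gene mentions).
sentences = [
    ("miR-A directly targets GENE1 in epithelial cells.", ["miR-A"], ["GENE1"]),
    ("GENE1 expression dropped when miR-A was overexpressed.", ["miR-A"], ["GENE1"]),
    ("miR-A and GENE3 were both measured in the cohort.", ["miR-A"], ["GENE3"]),
]


def build_bags(corpus):
    """Group sentences by the (miRNA, gene) pair they mention together."""
    bags = defaultdict(list)
    for text, mirnas, genes in corpus:
        for mirna in mirnas:
            for gene in genes:
                bags[(mirna, gene)].append(text)
    return bags


def label_bags(bags, kb):
    """A bag is positive if its entity pair is listed in the knowledge base."""
    return {pair: pair in kb for pair in bags}


bags = build_bags(sentences)
bag_labels = label_bags(bags, knowledge_base)
for pair, is_positive in bag_labels.items():
    print(pair, is_positive, len(bags[pair]))
```

A classifier would then be trained on the bags rather than on individually annotated sentences, which is what removes the need for a hand-labeled corpus.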
NASA Technical Reports Server (NTRS)
Brackett, Robert A.; Arvidson, Raymond E.
1993-01-01
A technique is presented that allows extraction of compositional and textural information from visible, near and thermal infrared remotely sensed data. Using a library of both emissivity and reflectance spectra, endmember abundances and endmember thermal inertias are extracted from AVIRIS (Airborne Visible and Infrared Imaging Spectrometer) and TIMS (Thermal Infrared Mapping Spectrometer) data over Lunar Crater Volcanic Field, Nevada, using a dual inversion. The inversion technique is motivated by upcoming Mars Observer data and the need for separation of composition and texture parameters from sub pixel mixtures of bedrock and dust. The model employed offers the opportunity to extract compositional and textural information for a variety of endmembers within a given pixel. Geologic inferences concerning grain size, abundance, and source of endmembers can be made directly from the inverted data. These parameters are of direct relevance to Mars exploration, both for Mars Observer and for follow-on missions.
21 CFR 172.585 - Sugar beet extract flavor base.
Code of Federal Regulations, 2010 CFR
2010-04-01
... 21 Food and Drugs 3 2010-04-01 2009-04-01 true Sugar beet extract flavor base. 172.585 Section 172... CONSUMPTION Flavoring Agents and Related Substances § 172.585 Sugar beet extract flavor base. Sugar beet...) Sugar beet extract flavor base is the concentrated residue of soluble sugar beet extractives from which...
Zhang, Hong-Guang; Yang, Qin-Min; Lu, Jian-Gang
2014-04-01
In this paper, a novel discriminant methodology based on near infrared spectroscopic analysis and the least square support vector machine is proposed for rapid and nondestructive discrimination of different types of Polyacrylamide. The diffuse reflectance spectra of samples of Non-ionic Polyacrylamide, Anionic Polyacrylamide and Cationic Polyacrylamide were measured. Principal component analysis was then applied to reduce the dimension of the spectral data and extract the principal components. The first three principal components were used for cluster analysis of the three different types of Polyacrylamide. Those principal components were also used as inputs of the least square support vector machine model. The parameters, and the number of principal components used as inputs of the least square support vector machine model, were optimized through cross validation based on grid search. 60 samples of each type of Polyacrylamide were collected, giving a total of 180 samples. 135 samples, 45 for each type of Polyacrylamide, were randomly assigned to a training set to build the calibration model, and the remaining 45 samples were used as a test set to evaluate the performance of the developed model. In addition, 5 Cationic Polyacrylamide samples and 5 Anionic Polyacrylamide samples adulterated with different proportions of Non-ionic Polyacrylamide were prepared to show the feasibility of the proposed method for discriminating adulterated Polyacrylamide samples. The prediction error threshold for each type of Polyacrylamide was determined by an F statistical significance test based on the prediction error of the training set of the corresponding type of Polyacrylamide in cross validation. The discrimination accuracy of the built model was 100% for prediction of the test set. The prediction of the model for the 10 mixed samples was also presented, and all mixed samples were accurately discriminated as adulterated samples. The
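The sample split described above (60 samples per type, 45 per type for training, the rest for testing) can be sketched with the standard library. The sample identifiers and the function name are placeholders; the abstract does not say how the random split was implemented.

```python
import random


def split_per_class(samples_by_class, n_train, seed=0):
    """Randomly assign n_train samples of each class to training, the rest to test."""
    rng = random.Random(seed)
    train, test = [], []
    for label, samples in samples_by_class.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        train += [(sample, label) for sample in shuffled[:n_train]]
        test += [(sample, label) for sample in shuffled[n_train:]]
    return train, test


classes = ["non-ionic", "anionic", "cationic"]
# Placeholder sample identifiers: 60 per class, 180 in total.
samples_by_class = {c: [f"{c}-{i:02d}" for i in range(60)] for c in classes}

train, test = split_per_class(samples_by_class, n_train=45)
print(len(train), len(test))
```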
Machine Learning Based Evaluation of Reading and Writing Difficulties.
Iwabuchi, Mamoru; Hirabayashi, Rumi; Nakamura, Kenryu; Dim, Nem Khan
2017-01-01
The possibility of automatic evaluation of reading and writing difficulties was investigated using a non-parametric machine learning (ML) regression technique on URAWSS (Understanding Reading and Writing Skills of Schoolchildren) [1] test data from 168 children in grades 1-9. The results showed that the ML approach gave better predictions than the ordinary rule-based decision.
Competency-Based Education Curriculum for Machine Shop. Student Material.
ERIC Educational Resources Information Center
Associated Educational Consultants, Inc., Pittsburgh, PA.
This publication contains the student material for the machine shop competency-based education curriculum for secondary students in West Virginia. It has been developed to facilitate the learning of skills necessary for a career as a machinist. The tasks in the curriculum are those actually performed on the job. The materials are intended for use…
Light-operated machines based on threaded molecular structures.
Credi, Alberto; Silvi, Serena; Venturi, Margherita
2014-01-01
Rotaxanes and related species represent the most common implementation of the concept of artificial molecular machines, because the supramolecular nature of the interactions between the components and their interlocked architecture allow a precise control on the position and movement of the molecular units. The use of light to power artificial molecular machines is particularly valuable because it can play the dual role of "writing" and "reading" the system. Moreover, light-driven machines can operate without accumulation of waste products, and photons are the ideal inputs to enable autonomous operation mechanisms. In appropriately designed molecular machines, light can be used to control not only the stability of the system, which affects the relative position of the molecular components but also the kinetics of the mechanical processes, thereby enabling control on the direction of the movements. This step forward is necessary in order to make a leap from molecular machines to molecular motors.
ALE: automated label extraction from GEO metadata.
Giles, Cory B; Brown, Chase A; Ripperger, Michael; Dennis, Zane; Roopnarinesingh, Xiavan; Porter, Hunter; Perz, Aleksandra; Wren, Jonathan D
2017-12-28
NCBI's Gene Expression Omnibus (GEO) is a rich community resource containing millions of gene expression experiments from human, mouse, rat, and other model organisms. However, information about each experiment (metadata) is in the format of an open-ended, non-standardized textual description provided by the depositor. Thus, classification of experiments for meta-analysis by factors such as gender, age of the sample donor, and tissue of origin is not feasible without assigning labels to the experiments. Automated approaches are preferable for this, primarily because of the size and volume of the data to be processed, but also because they ensure standardization and consistency. While some of these labels can be extracted directly from the textual metadata, many of the data available do not contain explicit text informing the researcher about the age and gender of the subjects in the study. To bridge this gap, machine-learning methods can be trained to use the gene expression patterns associated with the text-derived labels to refine label-prediction confidence. Our analysis shows only 26% of metadata text contains information about gender and 21% about age. In order to ameliorate the lack of available labels for these data sets, we first extract labels from the textual metadata for each GEO RNA dataset and evaluate the performance against a gold standard of manually curated labels. We then use machine-learning methods to predict labels, based upon gene expression of the samples, and compare this to the text-based method. Here we present an automated method to extract labels for age, gender, and tissue from textual metadata and GEO data using both a heuristic approach and machine learning. We show the two methods together improve accuracy of label assignment to GEO samples.
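A heuristic text-based extractor of the kind described can be approximated with regular expressions over the free-text metadata. The patterns and example strings below are illustrative assumptions only; they are not the rules used by ALE, and real GEO metadata is far less regular.

```python
import re

# Hypothetical patterns for "key: value" style metadata fields.
GENDER_PATTERN = re.compile(r"\b(?:sex|gender)\s*[:=]\s*(male|female|m|f)\b", re.I)
AGE_PATTERN = re.compile(
    r"\bage\s*[:=]\s*(\d+(?:\.\d+)?)\s*(years?|yrs?|y|months?|weeks?)?\b", re.I
)


def extract_labels(text):
    """Return whichever of gender/age can be read directly from the text."""
    labels = {}
    gender = GENDER_PATTERN.search(text)
    if gender:
        value = gender.group(1).lower()
        labels["gender"] = "male" if value in ("male", "m") else "female"
    age = AGE_PATTERN.search(text)
    if age:
        labels["age"] = float(age.group(1))
        labels["age_unit"] = (age.group(2) or "unspecified").lower()
    return labels


print(extract_labels("tissue: liver; gender: Female; age: 54 years"))
print(extract_labels("Sex: M, age=8 weeks, strain: C57BL/6"))
print(extract_labels("RNA from cultured fibroblasts, passage 4"))
```

Samples for which such rules return nothing are the ones where, per the abstract, expression-based prediction has to fill the gap.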
Machine Vision-Based Measurement Systems for Fruit and Vegetable Quality Control in Postharvest.
Blasco, José; Munera, Sandra; Aleixos, Nuria; Cubero, Sergio; Molto, Enrique
Individual items of any agricultural commodity are different from each other in terms of colour, shape or size. Furthermore, as they are living things, their quality attributes change over time, thereby making the development of accurate automatic inspection machines a challenging task. Machine vision-based systems and new optical technologies make it feasible to create non-destructive control and monitoring tools for quality assessment to ensure adequate compliance with food standards. Such systems are much faster than any manual non-destructive examination of fruit and vegetable quality, thus allowing the whole production to be inspected with objective and repeatable criteria. Moreover, current technology makes it possible to inspect the fruit in spectral ranges beyond the sensitivity of the human eye, for instance in the ultraviolet and near-infrared regions. Machine vision-based applications require the use of multiple technologies and knowledge, ranging from those related to image acquisition (illumination, cameras, etc.) to the development of algorithms for spectral image analysis. Machine vision-based systems for inspecting fruit and vegetables are targeted towards different purposes, from in-line sorting into commercial categories to the detection of contaminants or the distribution of specific chemical compounds on the product's surface. This chapter summarises the current state of the art in these techniques, starting with systems based on colour images for the inspection of conventional colour, shape or external defects, and then goes on to consider recent developments in spectral image analysis for internal quality assessment or contaminant detection.
Programming and machining of complex parts based on CATIA solid modeling
NASA Astrophysics Data System (ADS)
Zhu, Xiurong
2017-09-01
This paper describes the programming and simulated machining of complex parts using CATIA solid modeling, and discusses the importance of programming and process technology in the field of CNC machining. In the part design process, the working principle is first analyzed in depth; the dimensions are then designed so that the dimension chains are connected to each other, and the final dimensions of the parts are calculated using backstepping and a variety of other methods. After careful study and repeated testing, 6061 aluminum alloy was chosen as the part material. The machining process must take comprehensive account of the actual conditions at the processing site, and the simulation should be based on the actual processing rather than attending only to shape. The work can be used as a reference for machining.
Virtual screening by a new Clustering-based Weighted Similarity Extreme Learning Machine approach
Kudisthalert, Wasu
2018-01-01
Machine learning techniques are becoming popular in virtual screening tasks. One of the powerful machine learning algorithms is the Extreme Learning Machine (ELM), which has been applied to many applications and has recently been applied to virtual screening. We propose the Weighted Similarity ELM (WS-ELM), which is based on a single-layer feed-forward neural network that uses 16 different similarity coefficients as activation functions in the hidden layer. It is known that the performance of conventional ELM is not robust due to random weight selection in the hidden layer. Thus, we propose a Clustering-based WS-ELM (CWS-ELM) that deterministically assigns weights by utilising clustering algorithms, i.e. k-means clustering and support vector clustering. The experiments were conducted on one of the most challenging datasets, the Maximum Unbiased Validation Dataset, which contains 17 activity classes carefully selected from PubChem. The proposed algorithms were then compared with other machine learning techniques such as support vector machine, random forest, and similarity searching. The results show that CWS-ELM in conjunction with support vector clustering yields the best performance when utilised together with the Sokal/Sneath(1) coefficient. Furthermore, the ECFP_6 fingerprint presents the best results in our framework compared to the other types of fingerprints, namely ECFP_4, FCFP_4, and FCFP_6. PMID:29652912
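Similarity coefficients of the kind used as hidden-layer activations compare two binary fingerprints by their shared and unshared bits. The sketch below shows the Tanimoto coefficient and the Sokal/Sneath(1) coefficient in the form commonly given in the chemoinformatics literature; the exact definitions used in the paper should be checked against the source, and the fingerprints are made-up sets of on-bit positions.

```python
def bit_counts(fp_a, fp_b):
    """Return (shared on-bits, bits only in a, bits only in b)."""
    a, b = set(fp_a), set(fp_b)
    return len(a & b), len(a - b), len(b - a)


def tanimoto(fp_a, fp_b):
    common, only_a, only_b = bit_counts(fp_a, fp_b)
    denominator = common + only_a + only_b
    return common / denominator if denominator else 0.0


def sokal_sneath_1(fp_a, fp_b):
    # Mismatched bits are weighted twice as heavily as in Tanimoto.
    common, only_a, only_b = bit_counts(fp_a, fp_b)
    denominator = common + 2 * only_a + 2 * only_b
    return common / denominator if denominator else 0.0


# Hypothetical fingerprints given as sets of on-bit positions.
molecule_a = {1, 4, 7, 9}
molecule_b = {1, 4, 8}

print(tanimoto(molecule_a, molecule_b))
print(sokal_sneath_1(molecule_a, molecule_b))
```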
NASA Astrophysics Data System (ADS)
Petermann, E.; Knöller, K.; Rocha, C.; Scholten, J.; Stollberg, R.; Weiß, H.; Schubert, M.
2018-02-01
Quantification of submarine groundwater discharge (SGD) is essential for evaluating the vulnerability of coastal water bodies to groundwater pollution and for understanding how water body material cycles respond to the potential discharge of nutrients, organic compounds, or heavy metals. Here we present an environmental tracer-based methodology for quantifying SGD into Knysna Estuary, South Africa. Both components of SGD, (1) fresh, terrestrial (FSGD) and (2) saline, recirculated (RSGD), were differentiated. We conducted an end-member mixing analysis for radon (222Rn) and salinity time series of estuary water over two tidal cycles to determine fractions of seawater, riverwater, FSGD, and RSGD. The mixing analysis was treated as a constrained optimization problem for finding the end-member mixing ratio that produces the best fit to observations at every time step. Results revealed the highest FSGD and RSGD fractions in the estuary during peak low tide. Over a 24 h time series, the portions of FSGD and RSGD in the estuary water were 0.2% and 0.8% near the estuary mouth and the FSGD/RSGD ratio was 1:3.3. We determined a median FSGD of 41,000 m³ d⁻¹ (1.4 m³ d⁻¹ per m shoreline) and a median RSGD of 135,000 m³ d⁻¹ (4.5 m³ d⁻¹ per m shoreline), which suggests that SGD exceeds river discharge by a factor of 1.0-2.1. By comparison to other sources, this implies that SGD is responsible for 28-73% of total DIN fluxes into Knysna Estuary.
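The constrained optimization described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the end-member tracer values are hypothetical, only three end-members are used (rather than the four in the study) so that two tracers plus the sum-to-one constraint determine a unique answer, and a coarse grid search over the admissible fractions stands in for whatever solver the study used.

```python
def mixing_fractions(obs, end_members, step=0.01):
    """Grid-search three mixing fractions (non-negative, summing to 1)
    that best reproduce the observed tracer values.

    obs         -- observed tracer values, e.g. (radon, salinity)
    end_members -- three tuples of tracer values, one per end-member
    """
    n = round(1 / step)
    # Scale each tracer by its end-member range so both tracers weigh equally.
    scales = [
        max(em[k] for em in end_members) - min(em[k] for em in end_members)
        for k in range(len(obs))
    ]
    best, best_err = None, float("inf")
    for i in range(n + 1):
        for j in range(n + 1 - i):
            fractions = (i * step, j * step, (n - i - j) * step)
            err = 0.0
            for k in range(len(obs)):
                mixed = sum(f * em[k] for f, em in zip(fractions, end_members))
                err += ((mixed - obs[k]) / scales[k]) ** 2
            if err < best_err:
                best, best_err = fractions, err
    return best


# Hypothetical end-members as (radon, salinity): seawater, river water, groundwater.
END_MEMBERS = [(10.0, 35.0), (50.0, 0.0), (5000.0, 5.0)]
# Synthetic observation mixed from 70% seawater, 25% river water, 5% groundwater.
OBSERVED = (269.5, 24.75)
fractions = mixing_fractions(OBSERVED, END_MEMBERS)
```

Running the search once per time step would give a time series of fractions, which is the quantity the abstract reports for the tidal cycles.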
Reliability Evaluation of Machine Center Components Based on Cascading Failure Analysis
NASA Astrophysics Data System (ADS)
Zhang, Ying-Zhi; Liu, Jin-Tong; Shen, Gui-Xiang; Long, Zhe; Sun, Shu-Guang
2017-07-01
Traditional reliability evaluation of machine center components overlooks failure propagation, so the component reliability model is biased and the evaluation result is too low. To rectify these problems, a new reliability evaluation method based on cascading failure analysis and failure influenced degree assessment is proposed. A directed graph model of cascading failure among components is established according to cascading failure mechanism analysis and graph theory. The failure influenced degrees of the system components are assessed by the adjacency matrix and its transposition, combined with the PageRank algorithm. Based on the comprehensive failure probability function and total probability formula, the inherent failure probability function is determined to realize the reliability evaluation of the system components. Finally, the method is applied to a machine center, and the results show the following: 1) The reliability evaluation values of the proposed method are at least 2.5% higher than those of the traditional method; 2) The difference between the comprehensive and inherent reliability of the system component presents a positive correlation with the failure influenced degree of the system component, which provides a theoretical basis for reliability allocation of machine center system.
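As a rough illustration of ranking components with PageRank on an adjacency matrix and its transpose, the sketch below uses a hypothetical three-component failure graph. The abstract does not say how the two rankings are combined into a failure influenced degree, so the sketch only computes them separately; the damping factor and iteration count are conventional defaults, not values from the paper.

```python
def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank.

    adj[i][j] == 1 means a failure of component i can trigger a failure of j.
    """
    n = len(adj)
    rank = [1.0 / n] * n
    out_deg = [sum(row) for row in adj]
    for _ in range(iters):
        new = [(1.0 - damping) / n] * n
        for i in range(n):
            if out_deg[i] == 0:
                # Dangling component: spread its rank evenly over all components.
                for j in range(n):
                    new[j] += damping * rank[i] / n
            else:
                for j in range(n):
                    if adj[i][j]:
                        new[j] += damping * rank[i] / out_deg[i]
        rank = new
    return rank


def transpose(adj):
    """Reverse every edge of the directed graph."""
    return [list(col) for col in zip(*adj)]


# Hypothetical cascade: A -> B, A -> C, B -> C.
ADJ = [
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]
influenced = pagerank(ADJ)              # high score: often hit by others' failures
influencing = pagerank(transpose(ADJ))  # high score: often the cause of failures
```

On the original graph the score accumulates at components that receive failures; on the transposed graph it accumulates at components that cause them.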
NASA Technical Reports Server (NTRS)
Genuardi, Michael T.
1993-01-01
One strategy for machine-aided indexing (MAI) is to provide a concept-level analysis of the textual elements of documents or document abstracts. In such systems, natural-language phrases are analyzed in order to identify and classify concepts related to a particular subject domain. The overall performance of these MAI systems is largely dependent on the quality and comprehensiveness of their knowledge bases. These knowledge bases function to (1) define the relations between a controlled indexing vocabulary and natural language expressions; (2) provide a simple mechanism for disambiguation and the determination of relevancy; and (3) allow the extension of concept-hierarchical structure to all elements of the knowledge file. After a brief description of the NASA Machine-Aided Indexing system, concerns related to the development and maintenance of MAI knowledge bases are discussed. Particular emphasis is given to statistically-based text analysis tools designed to aid the knowledge base developer. One such tool, the Knowledge Base Building (KBB) program, presents the domain expert with a well-filtered list of synonyms and conceptually-related phrases for each thesaurus concept. Another tool, the Knowledge Base Maintenance (KBM) program, functions to identify areas of the knowledge base affected by changes in the conceptual domain (for example, the addition of a new thesaurus term). An alternate use of the KBM as an aid in thesaurus construction is also discussed.
Functional networks inference from rule-based machine learning models.
Lazzarini, Nicola; Widera, Paweł; Williamson, Stuart; Heer, Rakesh; Krasnogor, Natalio; Bacardit, Jaume
2016-01-01
Functional networks play an important role in the analysis of biological processes and systems. The inference of these networks from high-throughput (-omics) data is an area of intense research. So far, the similarity-based inference paradigm (e.g. gene co-expression) has been the most popular approach. It assumes a functional relationship between genes which are expressed at similar levels across different samples. An alternative to this paradigm is the inference of relationships from the structure of machine learning models. These models are able to capture complex relationships between variables, that often are different/complementary to the similarity-based methods. We propose a protocol to infer functional networks from machine learning models, called FuNeL. It assumes, that genes used together within a rule-based machine learning model to classify the samples, might also be functionally related at a biological level. The protocol is first tested on synthetic datasets and then evaluated on a test suite of 8 real-world datasets related to human cancer. The networks inferred from the real-world data are compared against gene co-expression networks of equal size, generated with 3 different methods. The comparison is performed from two different points of view. We analyse the enriched biological terms in the set of network nodes and the relationships between known disease-associated genes in a context of the network topology. The comparison confirms both the biological relevance and the complementary character of the knowledge captured by the FuNeL networks in relation to similarity-based methods and demonstrates its potential to identify known disease associations as core elements of the network. Finally, using a prostate cancer dataset as a case study, we confirm that the biological knowledge captured by our method is relevant to the disease and consistent with the specialised literature and with an independent dataset not used in the inference process. The
Classification of older adults with/without a fall history using machine learning methods.
Lin Zhang; Ou Ma; Fabre, Jennifer M; Wood, Robert H; Garcia, Stephanie U; Ivey, Kayla M; McCann, Evan D
2015-01-01
Falling is a serious problem in an aged society such that assessment of the risk of falls for individuals is imperative for the research and practice of falls prevention. This paper introduces an application of several machine learning methods for training a classifier which is capable of classifying individual older adults into a high risk group and a low risk group (distinguished by whether or not the members of the group have a recent history of falls). Using a 3D motion capture system, significant gait features related to falls risk are extracted. By training these features, classification hypotheses are obtained based on machine learning techniques (K Nearest-neighbour, Naive Bayes, Logistic Regression, Neural Network, and Support Vector Machine). Training and test accuracies with sensitivity and specificity of each of these techniques are assessed. The feature adjustment and tuning of the machine learning algorithms are discussed. The outcome of the study will benefit the prediction and prevention of falls.
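The accuracy, sensitivity, and specificity mentioned above are all derived from the confusion matrix of a binary classifier. A minimal sketch, with made-up labels in which 1 stands for the high risk group and 0 for the low risk group:

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn)  # share of high risk adults correctly flagged
    specificity = tn / (tn + fp)  # share of low risk adults correctly cleared
    return accuracy, sensitivity, specificity


# Illustrative labels only; not data from the study.
Y_TRUE = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
Y_PRED = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
accuracy, sensitivity, specificity = binary_metrics(Y_TRUE, Y_PRED)
```

Computing the same three numbers on the training set and on a held-out test set gives the training and test figures the abstract refers to.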
GeneRIF indexing: sentence selection based on machine learning.
Jimeno-Yepes, Antonio J; Sticco, J Caitlin; Mork, James G; Aronson, Alan R
2013-05-31
A Gene Reference Into Function (GeneRIF) describes novel functionality of genes. GeneRIFs are available from the National Center for Biotechnology Information (NCBI) Gene database. GeneRIF indexing is performed manually, and the intention of our work is to provide methods to support creating the GeneRIF entries. The creation of GeneRIF entries involves the identification of the genes mentioned in MEDLINE® citations and the sentences describing a novel function. We have compared several learning algorithms and several features extracted or derived from MEDLINE sentences to determine if a sentence should be selected for GeneRIF indexing. Features are derived from the sentences or using mechanisms to augment the information provided by them: assigning a discourse label using a previously trained model, for example. We show that machine learning approaches with specific feature combinations achieve results close to one of the annotators. We have evaluated different feature sets and learning algorithms. In particular, Naïve Bayes achieves better performance with a selection of features similar to one used in related work, which considers the location of the sentence, the discourse of the sentence and the functional terminology in it. The current performance is at a level similar to human annotation and it shows that machine learning can be used to automate the task of sentence selection for GeneRIF annotation. The current experiments are limited to the human species. We would like to see how the methodology can be extended to other species, specifically the normalization of gene mentions in other species.
Machine Reading for Extraction of Bacteria and Habitat Taxonomies
Kordjamshidi, Parisa; Massa, Wouter; Provoost, Thomas; Moens, Marie-Francine
2015-01-01
There is a vast amount of scientific literature available from various resources such as the internet. Automating the extraction of knowledge from these resources is very helpful for biologists to easily access this information. This paper presents a system to extract the bacteria and their habitats, as well as the relations between them. We investigate to what extent current techniques are suited for this task and test a variety of models in this regard. We detect entities in a biological text and map the habitats into a given taxonomy. Our model uses a linear chain Conditional Random Field (CRF). For the prediction of relations between the entities, a model based on logistic regression is built. Designing a system upon these techniques, we explore several improvements for both the generation and selection of good candidates. One contribution to this lies in the extended flexibility of our ontology mapper that uses an advanced boundary detection and assigns the taxonomy elements to the detected habitats. Furthermore, we discover value in the combination of several distinct candidate generation rules. Using these techniques, we show results that significantly improve upon the state of the art for the BioNLP Bacteria Biotopes task. PMID:27077141
SAD-Based Stereo Vision Machine on a System-on-Programmable-Chip (SoPC)
Zhang, Xiang; Chen, Zhangwei
2013-01-01
This paper proposes a novel solution for a stereo vision machine based on the System-on-Programmable-Chip (SoPC) architecture. The SoPC technology provides great convenience for accessing many hardware devices such as DDRII, SSRAM, Flash, etc., by IP reuse. The system hardware is implemented in a single FPGA chip involving a 32-bit Nios II microprocessor, which is a configurable soft IP core in charge of managing the image buffer and users' configuration data. The Sum of Absolute Differences (SAD) algorithm is used for dense disparity map computation. The circuits of the algorithmic module are modeled by the Matlab-based DSP Builder. With a set of configuration interfaces, the machine can process many different sizes of stereo pair images. The maximum image size is up to 512 K pixels. This machine is designed to focus on real time stereo vision applications. The stereo vision machine offers good performance and high efficiency in real time. Considering a hardware FPGA clock of 90 MHz, 23 frames of 640 × 480 disparity maps can be obtained in one second with 5 × 5 matching window and maximum 64 disparity pixels. PMID:23459385
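A software sketch of SAD block matching for a single pixel may help make the disparity computation concrete. It is a plain sequential illustration, not the FPGA design described above; the default window size and disparity range simply mirror the 5 × 5 window and 64-pixel maximum quoted in the abstract, and the small test images are synthetic.

```python
def sad_cost(left, right, row, col, d, half):
    """Sum of absolute differences between the window centred at (row, col)
    in the left image and the window shifted d pixels leftwards in the right."""
    total = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            total += abs(left[row + dr][col + dc] - right[row + dr][col + dc - d])
    return total


def best_disparity(left, right, row, col, window=5, max_disp=64):
    """Return the disparity with the lowest SAD cost at one pixel.

    The caller must keep the window around (row, col) inside the image.
    """
    half = window // 2
    best_d, best_cost = 0, None
    for d in range(max_disp + 1):
        if col - d - half < 0:
            break  # shifted window would leave the right image
        cost = sad_cost(left, right, row, col, d, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Synthetic pair: the left image is the right image shifted by two pixels.
RIGHT = [[(7 * r + 13 * c) % 31 for c in range(12)] for r in range(5)]
LEFT = [[RIGHT[r][c - 2] if c >= 2 else 0 for c in range(12)] for r in range(5)]
disparity = best_disparity(LEFT, RIGHT, row=2, col=6, window=3, max_disp=4)
```

A dense disparity map repeats this search for every pixel, which is why the per-pixel cost loop is a natural target for parallel hardware.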
Manifold learning in machine vision and robotics
NASA Astrophysics Data System (ADS)
Bernstein, Alexander
2017-02-01
Smart algorithms are used in machine vision and robotics to organize or extract high-level information from the available data. Nowadays, machine learning is an essential and ubiquitous tool to automate the extraction of patterns or regularities from data (images in machine vision; camera, laser, and sonar sensor data in robotics) in order to solve various subject-oriented tasks such as understanding and classification of image content, navigation of mobile autonomous robots in uncertain environments, robot manipulation in medical robotics and computer-assisted surgery, and others. Usually such data have high dimensionality; however, due to various dependencies between their components and constraints caused by physical reasons, all "feasible and usable data" occupy only a very small part of the high dimensional "observation space" and have a smaller intrinsic dimensionality. The generally accepted model of such data is the manifold model, according to which the data lie on or near an unknown manifold (surface) of lower dimensionality embedded in an ambient high dimensional observation space; real-world high-dimensional data obtained from "natural" sources meet, as a rule, this model. The use of manifold learning techniques in machine vision and robotics, which discover a low-dimensional structure of high dimensional data and result in effective algorithms for solving a large number of various subject-oriented tasks, is the subject of the conference plenary speech, some topics of which are covered in the paper.
Kernel machines for epilepsy diagnosis via EEG signal classification: a comparative study.
Lima, Clodoaldo A M; Coelho, André L V
2011-10-01
We carry out a systematic assessment on a suite of kernel-based learning machines while coping with the task of epilepsy diagnosis through automatic electroencephalogram (EEG) signal classification. The kernel machines investigated include the standard support vector machine (SVM), the least squares SVM, the Lagrangian SVM, the smooth SVM, the proximal SVM, and the relevance vector machine. An extensive series of experiments was conducted on publicly available data, whose clinical EEG recordings were obtained from five normal subjects and five epileptic patients. The performance levels delivered by the different kernel machines are contrasted in terms of the criteria of predictive accuracy, sensitivity to the kernel function/parameter value, and sensitivity to the type of features extracted from the signal. For this purpose, 26 values for the kernel parameter (radius) of two well-known kernel functions (namely, Gaussian and exponential radial basis functions) were considered as well as 21 types of features extracted from the EEG signal, including statistical values derived from the discrete wavelet transform, Lyapunov exponents, and combinations thereof. We first quantitatively assess the impact of the choice of the wavelet basis on the quality of the features extracted. Four wavelet basis functions were considered in this study. Then, we provide the average accuracy (i.e., cross-validation error) values delivered by 252 kernel machine configurations; in particular, 40%/35% of the best-calibrated models of the standard and least squares SVMs reached 100% accuracy rate for the two kernel functions considered. Moreover, we show the sensitivity profiles exhibited by a large sample of the configurations whereby one can visually inspect their levels of sensitiveness to the type of feature and to the kernel function/parameter value. Overall, the results evidence that all kernel machines are competitive in terms of accuracy, with the standard and least squares SVMs
Zhang, Yanjun; Zhang, Xiangmin; Liu, Wenhui; Luo, Yuxi; Yu, Enjia; Zou, Keju; Liu, Xiaoliang
2014-01-01
This paper employed clinical Polysomnographic (PSG) data, mainly including all-night Electroencephalogram (EEG), Electrooculogram (EOG) and Electromyogram (EMG) signals of subjects, and adopted the American Academy of Sleep Medicine (AASM) clinical staging manual as the standard to realize automatic sleep staging. The authors extracted eighteen different features of EEG, EOG and EMG in the time and frequency domains to construct the feature vectors according to the existing literature as well as clinical experience. Through self-learning on sleep samples, the linear combination of weights and parameters of multiple kernels of the fuzzy support vector machine (FSVM) was learned and the multi-kernel FSVM (MK-FSVM) was constructed. The overall agreement between the experts' scores and the results presented was 82.53%. Compared with previous results, the accuracy of N1 was improved to some extent while the accuracies of other stages were approximate, which well reflected the sleep structure. The staging algorithm proposed in this paper is transparent, and worth further investigation.
Deng, Li; Wang, Guohua; Yu, Suihuai
2016-01-01
In order to consider the psychological cognitive characteristics affecting operating comfort and realize the automatic layout design, cognitive ergonomics and GA-ACA (genetic algorithm and ant colony algorithm) were introduced into the layout design of human-machine interaction interface. First, from the perspective of cognitive psychology, according to the information processing process, the cognitive model of human-machine interaction interface was established. Then, the human cognitive characteristics were analyzed, and the layout principles of human-machine interaction interface were summarized as the constraints in layout design. Again, the expression form of fitness function, pheromone, and heuristic information for the layout optimization of cabin was studied. The layout design model of human-machine interaction interface was established based on GA-ACA. At last, a layout design system was developed based on this model. For validation, the human-machine interaction interface layout design of drilling rig control room was taken as an example, and the optimization result showed the feasibility and effectiveness of the proposed method. PMID:26884745
Murugesan, Gurusamy; Abdulkadhar, Sabenabanu; Natarajan, Jeyakumar
2017-01-01
Automatic extraction of protein-protein interaction (PPI) pairs from biomedical literature is a widely examined task in biological information extraction. Currently, many kernel based approaches such as linear kernel, tree kernel, graph kernel and combinations of multiple kernels have achieved promising results in the PPI task. However, most of these kernel methods fail to capture the semantic relation information between two entities. In this paper, we present a special type of tree kernel for PPI extraction which exploits both syntactic (structural) and semantic vector information, known as the Distributed Smoothed Tree kernel (DSTK). DSTK comprises distributed trees with syntactic information along with distributional semantic vectors representing semantic information of the sentences or phrases. To generate a robust machine learning model, a feature based kernel and DSTK were combined using an ensemble support vector machine (SVM). Five different corpora (AIMed, BioInfer, HPRD50, IEPA, and LLL) were used for evaluating the performance of our system. Experimental results show that our system achieves a better f-score with five different corpora compared to other state-of-the-art systems. PMID:29099838
Alcaide-Leon, P; Dufort, P; Geraldo, A F; Alshafai, L; Maralani, P J; Spears, J; Bharatha, A
2017-06-01
Accurate preoperative differentiation of primary central nervous system lymphoma and enhancing glioma is essential to avoid unnecessary neurosurgical resection in patients with primary central nervous system lymphoma. The purpose of the study was to evaluate the diagnostic performance of a machine-learning algorithm by using texture analysis of contrast-enhanced T1-weighted images for differentiation of primary central nervous system lymphoma and enhancing glioma. Seventy-one adult patients with enhancing gliomas and 35 adult patients with primary central nervous system lymphomas were included. The tumors were manually contoured on contrast-enhanced T1WI, and the resulting volumes of interest were mined for textural features and subjected to a support vector machine-based machine-learning protocol. Three readers classified the tumors independently on contrast-enhanced T1WI. Areas under the receiver operating characteristic curves were estimated for each reader and for the support vector machine classifier. A noninferiority test for diagnostic accuracy based on paired areas under the receiver operating characteristic curve was performed with a noninferiority margin of 0.15. The mean areas under the receiver operating characteristic curve were 0.877 (95% CI, 0.798-0.955) for the support vector machine classifier; 0.878 (95% CI, 0.807-0.949) for reader 1; 0.899 (95% CI, 0.833-0.966) for reader 2; and 0.845 (95% CI, 0.757-0.933) for reader 3. The mean area under the receiver operating characteristic curve of the support vector machine classifier was significantly noninferior to the mean area under the curve of reader 1 ( P = .021), reader 2 ( P = .035), and reader 3 ( P = .007). Support vector machine classification based on textural features of contrast-enhanced T1WI is noninferior to expert human evaluation in the differentiation of primary central nervous system lymphoma and enhancing glioma. © 2017 by American Journal of Neuroradiology.
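The area under the ROC curve reported for each reader and for the classifier can be computed directly from scores as the probability that a randomly chosen positive case is scored above a randomly chosen negative one. The sketch below uses invented scores and only computes the point estimate; the noninferiority test in the abstract additionally requires a variance estimate for the paired AUC difference, which is not reproduced here.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve from the scores of positive and negative cases.

    Counts the share of (positive, negative) pairs ranked correctly,
    with ties counted as half a correct pair.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Invented classifier scores; higher means "more likely lymphoma".
LYMPHOMA_SCORES = [0.9, 0.8, 0.4]
GLIOMA_SCORES = [0.7, 0.3, 0.2, 0.1]
area = auc(LYMPHOMA_SCORES, GLIOMA_SCORES)
```

Of the 12 possible pairs here, 11 are ranked correctly, so the estimate is 11/12.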
Zhang, Sa; Li, Zhou; Xin, Xue-Gang
2017-12-20
To achieve differential diagnosis of normal and malignant gastric tissues based on discrepancies in their dielectric properties using support vector machine. The dielectric properties of normal and malignant gastric tissues at frequencies ranging from 42.58 to 500 MHz were measured by the coaxial probe method, and the Cole-Cole model was used to fit the measured data. Receiver-operating characteristic (ROC) curve analysis was used to evaluate the discrimination capability with respect to permittivity, conductivity, and Cole-Cole fitting parameters. Support vector machine was used for discriminating normal and malignant gastric tissues, and the discrimination accuracy was calculated using k-fold cross-validation. The area under the ROC curve was above 0.8 for permittivity at the 5 frequencies at the lower end of the measured frequency range. The support vector machine combined with the permittivity at all 5 of these frequencies achieved the highest discrimination accuracy of 84.38% with a MATLAB runtime of 3.40 s. The support vector machine-assisted diagnosis is feasible for human malignant gastric tissues based on the dielectric properties.
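For readers unfamiliar with the Cole-Cole model, the sketch below evaluates its standard single-dispersion form with an optional static conductivity term. The abstract does not state how many dispersion terms the authors fitted, so the single-pole form and all parameter values here are assumptions chosen for illustration.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def cole_cole(freq_hz, eps_inf, delta_eps, tau, alpha, sigma_s=0.0):
    """Complex relative permittivity from a single-pole Cole-Cole model.

    eps_inf   -- permittivity at the high-frequency limit
    delta_eps -- magnitude of the dispersion (static minus high-frequency value)
    tau       -- relaxation time in seconds
    alpha     -- broadening parameter, 0 <= alpha < 1 (0 gives the Debye model)
    sigma_s   -- static ionic conductivity in S/m
    """
    omega = 2.0 * math.pi * freq_hz
    eps = eps_inf + delta_eps / (1.0 + (1j * omega * tau) ** (1.0 - alpha))
    eps += sigma_s / (1j * omega * EPS0)
    return eps


def conductivity(freq_hz, eps_complex):
    """Effective conductivity in S/m from the imaginary part of the permittivity."""
    return -eps_complex.imag * 2.0 * math.pi * freq_hz * EPS0


# Hypothetical parameters, evaluated where omega * tau = 1.
FREQ = 100e6
TAU = 1.0 / (2.0 * math.pi * FREQ)
eps = cole_cole(FREQ, eps_inf=4.0, delta_eps=50.0, tau=TAU, alpha=0.0)
```

Fitting the model means adjusting these parameters until the real part and the derived conductivity match the measured permittivity and conductivity across the frequency range.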
A general-purpose machine learning framework for predicting properties of inorganic materials
Ward, Logan; Agrawal, Ankit; Choudhary, Alok; ...
2016-08-26
A very active area of materials research is to devise methods that use machine learning to automatically extract predictive models from existing materials data. While prior examples have demonstrated successful models for some applications, many more applications exist where machine learning can make a strong impact. To enable faster development of machine-learning-based models for such applications, we have created a framework capable of being applied to a broad range of materials data. Our method works by using a chemically diverse list of attributes, which we demonstrate are suitable for describing a wide variety of properties, and a novel method for partitioning the data set into groups of similar materials to boost the predictive accuracy. In this manuscript, we demonstrate how this new method can be used to predict diverse properties of crystalline and amorphous materials, such as band gap energy and glass-forming ability.
A general-purpose machine learning framework for predicting properties of inorganic materials
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ward, Logan; Agrawal, Ankit; Choudhary, Alok
A very active area of materials research is to devise methods that use machine learning to automatically extract predictive models from existing materials data. While prior examples have demonstrated successful models for some applications, many more applications exist where machine learning can make a strong impact. To enable faster development of machine-learning-based models for such applications, we have created a framework capable of being applied to a broad range of materials data. Our method works by using a chemically diverse list of attributes, which we demonstrate are suitable for describing a wide variety of properties, and a novel method for partitioning the data set into groups of similar materials to boost the predictive accuracy. In this manuscript, we demonstrate how this new method can be used to predict diverse properties of crystalline and amorphous materials, such as band gap energy and glass-forming ability.
CD-REST: a system for extracting chemical-induced disease relation in literature.
Xu, Jun; Wu, Yonghui; Zhang, Yaoyun; Wang, Jingqi; Lee, Hee-Jin; Xu, Hua
2016-01-01
Mining chemical-induced disease relations embedded in the vast biomedical literature could facilitate a wide range of computational biomedical applications, such as pharmacovigilance. The BioCreative V organized a Chemical Disease Relation (CDR) Track regarding chemical-induced disease relation extraction from biomedical literature in 2015. We participated in all subtasks of this challenge. In this article, we present our participation system Chemical Disease Relation Extraction SysTem (CD-REST), an end-to-end system for extracting chemical-induced disease relations in biomedical literature. CD-REST consists of two main components: (1) a chemical and disease named entity recognition and normalization module, which employs the Conditional Random Fields algorithm for entity recognition and a Vector Space Model-based approach for normalization; and (2) a relation extraction module that classifies both sentence-level and document-level candidate drug-disease pairs by support vector machines. Our system achieved the best performance on the chemical-induced disease relation extraction subtask in the BioCreative V CDR Track, demonstrating the effectiveness of our proposed machine learning-based approaches for automatic extraction of chemical-induced disease relations in biomedical literature. The CD-REST system provides web services using HTTP POST request. The web services can be accessed from http://clinicalnlptool.com/cdr. The online CD-REST demonstration system is available at http://clinicalnlptool.com/cdr/cdr.html. Database URL: http://clinicalnlptool.com/cdr; http://clinicalnlptool.com/cdr/cdr.html. © The Author(s) 2016. Published by Oxford University Press.
NASA Astrophysics Data System (ADS)
Schwab, M. P.; Klaus, J.; Pfister, L.; Weiler, M.
2016-12-01
Over the past decades, stream sampling protocols for hydro-geochemical parameters were often limited by logistical and technological constraints. While long-term monitoring protocols were typically based on weekly sampling intervals, high frequency sampling was commonly limited to a few single events. In this contribution, we combined high frequency and long-term measurements to understand DOC and nitrate dynamics in a forest headwater for different runoff events and seasons. Our study area is the forested Weierbach catchment (0.47 km2) in Luxembourg, where the fractured schist bedrock is covered by cambisol soils. The runoff response is characterized by a double peak behaviour. The first peak occurs during or right after a rainfall event triggered by fast near surface runoff generation processes, while a second delayed peak lasts several days and is generated by subsurface flow. This second peak occurs only if a distinct storage threshold of the catchment is exceeded. Our observations were carried out with a field deployable UV-Vis spectrometer measuring DOC and nitrate concentrations in-situ at 15 min intervals for more than two years. In addition, a long-term validation was carried out with data obtained from the analysis of water collected with grab samples. The long-term, high-frequency measurements allowed us to calculate a complete and detailed balance of DOC and nitrate export over two years. Transport behaviour of the DOC and nitrate showed different dynamics between the first and second hydrograph peaks. DOC is mainly exported during the first peaks, while nitrate is mostly exported during the delayed second peaks. Biweekly end-member measurement of soil and groundwater over several years enables us to link the behaviour of DOC and nitrate export to various end-members in the catchment. Altogether, the long-term and high-frequency time series provides the opportunity to study DOC and nitrate export processes without having to just rely only on either a
A Cooperative Approach to Virtual Machine Based Fault Injection
DOE Office of Scientific and Technical Information (OSTI.GOV)
Naughton III, Thomas J; Engelmann, Christian; Vallee, Geoffroy R
Resilience investigations often employ fault injection (FI) tools to study the effects of simulated errors on a target system. It is important to keep the target system under test (SUT) isolated from the controlling environment in order to maintain control of the experiment. Virtual machines (VMs) have been used to aid these investigations due to the strong isolation properties of system-level virtualization. A key challenge in fault injection tools is to gain proper insight and context about the SUT. In VM-based FI tools, this challenge of target context is increased due to the separation between host and guest (VM). We discuss an approach to VM-based FI that leverages virtual machine introspection (VMI) methods to gain insight into the target's context running within the VM. The key to this environment is the ability to provide basic information to the FI system that can be used to create a map of the target environment. We describe a proof-of-concept implementation and a demonstration of its use to introduce simulated soft errors into an iterative solver benchmark running in user-space of a guest VM.
Research on intrusion detection based on Kohonen network and support vector machine
NASA Astrophysics Data System (ADS)
Shuai, Chunyan; Yang, Hengcheng; Gong, Zeweiyi
2018-05-01
Applying a support vector machine (SVM) directly to a network intrusion detection system suffers from low detection accuracy and long detection time. Optimization of the SVM parameters can greatly improve the detection accuracy, but it cannot be applied to high-speed networks because of the long detection time. A method based on Kohonen neural network feature selection is proposed to reduce the optimization time of the SVM parameters. First, the weights of the KDD99 network intrusion data are calculated by a Kohonen network and features are selected by weight. Then, after feature selection is completed, a genetic algorithm (GA) and the grid search method are used to find appropriate parameters, and the data are classified by SVMs. Comparative experiments show that feature selection reduces the time of parameter optimization while having little influence on classification accuracy. The experiments suggest that the SVM can be used in a network intrusion detection system and can reduce the missing rate.
A human-machine cooperation route planning method based on improved A* algorithm
NASA Astrophysics Data System (ADS)
Zhang, Zhengsheng; Cai, Chao
2011-12-01
To avoid the limitation of common route planning methods, which blindly pursue higher machine intelligence and automation, this paper presents a human-machine cooperation route planning method. The proposed method includes a new A* path searching strategy based on dynamic heuristic searching and a human-cooperated decision strategy to prune the search area. It can overcome the tendency of the A* algorithm to fall into a long local search. Experiments showed that this method can quickly plan a feasible route that meets the macro-policy thinking.
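For orientation, the baseline that the abstract above builds on can be sketched as a plain A* search on a grid. This is a minimal illustration only, not the authors' improved method: the dynamic heuristic and the human-guided pruning are not reproduced, and the grid encoding ('#' for blocked cells), the Manhattan heuristic, and the function name `astar` are assumptions made for the example.

```python
import heapq


def astar(grid, start, goal):
    """Return a shortest 4-connected path from start to goal, or None.

    grid is a list of equal-length strings; '#' marks a blocked cell
    (an encoding assumed for this sketch).
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance: admissible and consistent for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]  # entries are (f = g + h, g, cell)
    came_from = {start: None}
    best_cost = {start: 0}

    while open_heap:
        _, cost, current = heapq.heappop(open_heap)
        if current == goal:
            # Walk the parent links back to the start.
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if cost > best_cost[current]:
            continue  # stale heap entry; a cheaper route was already found
        r, c = current
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = current
                    heapq.heappush(
                        open_heap,
                        (new_cost + heuristic((nr, nc)), new_cost, (nr, nc)),
                    )
    return None


if __name__ == "__main__":
    demo_grid = ["....",
                 ".##.",
                 "...."]
    print(astar(demo_grid, (0, 0), (2, 3)))
```

A human-in-the-loop variant in the spirit of the abstract could restrict the neighbour expansion to a region chosen by the operator, which shrinks the set of cells the search has to visit.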
Effects of Toy Crane Design-Based Learning on Simple Machines
ERIC Educational Resources Information Center
Korur, Fikret; Efe, Gülfem; Erdogan, Fisun; Tunç, Berna
2017-01-01
The aim of this 2-group study was to investigate the following question: Are there significant differences between scaffolded design-based learning controlled using 7 forms and teacher-directed instruction methods for the toy crane project on grade 7 students' posttest scores on the simple machines achievement test, attitude toward simple…
Concrete Condition Assessment Using Impact-Echo Method and Extreme Learning Machines
Zhang, Jing-Kui; Yan, Weizhong; Cui, De-Mi
2016-01-01
The impact-echo (IE) method is a popular non-destructive testing (NDT) technique widely used for measuring the thickness of plate-like structures and for detecting certain defects inside concrete elements or structures. However, the IE method is not effective for full condition assessment (i.e., defect detection, defect diagnosis, defect sizing and location), because the simple frequency spectrum analysis involved in the existing IE method is not sufficient to capture the IE signal patterns associated with different conditions. In this paper, we attempt to enhance the IE technique and enable it for full condition assessment of concrete elements by introducing advanced machine learning techniques for performing comprehensive analysis and pattern recognition of IE signals. Specifically, we use wavelet decomposition for extracting signatures or features out of the raw IE signals and apply extreme learning machine, one of the recently developed machine learning techniques, as classification models for full condition assessment. To validate the capabilities of the proposed method, we build a number of specimens with various types, sizes, and locations of defects and perform IE testing on these specimens in a lab environment. Based on analysis of the collected IE signals using the proposed machine learning based IE method, we demonstrate that the proposed method is effective in performing full condition assessment of concrete elements or structures. PMID:27023563
21 CFR 172.585 - Sugar beet extract flavor base.
Code of Federal Regulations, 2013 CFR
2013-04-01
... 21 Food and Drugs 3 2013-04-01 2013-04-01 false Sugar beet extract flavor base. 172.585 Section... HUMAN CONSUMPTION Flavoring Agents and Related Substances § 172.585 Sugar beet extract flavor base. Sugar beet extract flavor base may be safely used in food in accordance with the provisions of this...
21 CFR 172.585 - Sugar beet extract flavor base.
Code of Federal Regulations, 2012 CFR
2012-04-01
... 21 Food and Drugs 3 2012-04-01 2012-04-01 false Sugar beet extract flavor base. 172.585 Section... HUMAN CONSUMPTION Flavoring Agents and Related Substances § 172.585 Sugar beet extract flavor base. Sugar beet extract flavor base may be safely used in food in accordance with the provisions of this...
21 CFR 172.585 - Sugar beet extract flavor base.
Code of Federal Regulations, 2011 CFR
2011-04-01
... 21 Food and Drugs 3 2011-04-01 2011-04-01 false Sugar beet extract flavor base. 172.585 Section... HUMAN CONSUMPTION Flavoring Agents and Related Substances § 172.585 Sugar beet extract flavor base. Sugar beet extract flavor base may be safely used in food in accordance with the provisions of this...
Caggiano, Alessandra
2018-03-09
Machining of titanium alloys is characterised by extremely rapid tool wear due to the high cutting temperature and the strong adhesion at the tool-chip and tool-workpiece interface, caused by the low thermal conductivity and high chemical reactivity of Ti alloys. With the aim to monitor the tool conditions during dry turning of Ti-6Al-4V alloy, a machine learning procedure based on the acquisition and processing of cutting force, acoustic emission and vibration sensor signals during turning is implemented. A number of sensorial features are extracted from the acquired sensor signals in order to feed machine learning paradigms based on artificial neural networks. To reduce the large dimensionality of the sensorial features, an advanced feature extraction methodology based on Principal Component Analysis (PCA) is proposed. PCA allowed to identify a smaller number of features (k = 2 features), the principal component scores, obtained through linear projection of the original d features into a new space with reduced dimensionality k = 2, sufficient to describe the variance of the data. By feeding artificial neural networks with the PCA features, an accurate diagnosis of tool flank wear (VBmax) was achieved, with predicted values very close to the measured tool wear values.
2018-01-01
Machining of titanium alloys is characterised by extremely rapid tool wear due to the high cutting temperature and the strong adhesion at the tool-chip and tool-workpiece interface, caused by the low thermal conductivity and high chemical reactivity of Ti alloys. With the aim to monitor the tool conditions during dry turning of Ti-6Al-4V alloy, a machine learning procedure based on the acquisition and processing of cutting force, acoustic emission and vibration sensor signals during turning is implemented. A number of sensorial features are extracted from the acquired sensor signals in order to feed machine learning paradigms based on artificial neural networks. To reduce the large dimensionality of the sensorial features, an advanced feature extraction methodology based on Principal Component Analysis (PCA) is proposed. PCA allowed to identify a smaller number of features (k = 2 features), the principal component scores, obtained through linear projection of the original d features into a new space with reduced dimensionality k = 2, sufficient to describe the variance of the data. By feeding artificial neural networks with the PCA features, an accurate diagnosis of tool flank wear (VBmax) was achieved, with predicted values very close to the measured tool wear values. PMID:29522443
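The linear projection of d original features onto k = 2 principal component scores can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: it finds the leading eigenvectors of the sample covariance matrix by power iteration with deflation in pure Python, and the function names (`top_eigenvector`, `pca_scores`) and the toy data are hypothetical.

```python
import math
import random


def top_eigenvector(matrix, iterations=500, seed=0):
    """Power iteration: dominant eigenpair of a symmetric positive semidefinite matrix."""
    d = len(matrix)
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # the matrix annihilates v; keep the last unit vector
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(v[i] * mv[i] for i in range(d))  # Rayleigh quotient of a unit vector
    return eigenvalue, v


def pca_scores(rows, k=2):
    """Return (components, scores): k unit directions and the projected data."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    components = []
    for _ in range(k):
        eigenvalue, v = top_eigenvector(cov)
        components.append(v)
        # Deflation: remove the variance already explained by v.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in components]
              for r in centered]
    return components, scores


if __name__ == "__main__":
    # Toy stand-in for d = 3 sensor features measured on five cuts.
    data = [[2.0, 4.1, 3.9], [1.0, 2.2, 1.8], [0.0, -0.1, 0.2],
            [-1.0, -2.0, -2.1], [-2.0, -4.2, -3.8]]
    _, demo_scores = pca_scores(data, k=2)
    for row in demo_scores:
        print(row)
```

In a pipeline like the one described, the two score columns would replace the original d features as the inputs to the neural network.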
Machine learning based Intelligent cognitive network using fog computing
NASA Astrophysics Data System (ADS)
Lu, Jingyang; Li, Lun; Chen, Genshe; Shen, Dan; Pham, Khanh; Blasch, Erik
2017-05-01
In this paper, a Cognitive Radio Network (CRN) based on artificial intelligence is proposed to distribute the limited radio spectrum resources more efficiently. The CRN framework can analyze the time-sensitive signal data close to the signal source using fog computing with different types of machine learning techniques. Depending on the computational capabilities of the fog nodes, different features and machine learning techniques are chosen to optimize spectrum allocation. Also, the computing nodes send the periodic signal summary which is much smaller than the original signal to the cloud so that the overall system spectrum source allocation strategies are dynamically updated. Applying fog computing, the system is more adaptive to the local environment and robust to spectrum changes. As most of the signal data is processed at the fog level, it further strengthens the system security by reducing the communication burden of the communications network.
Conrad, Leanne F; Oliver, Michele L; Jack, Robert J; Dickey, James P; Eger, Tammy R
2014-01-01
The purpose of this work was to help a steel industry partner select the most appropriate of three high end heavy equipment seats to retrofit a number of their heavy mobile machines used in the steel making process. The participants included 8 males (22.3 ± 2.0 yrs.) and 8 females (23.5 ± 1.8 yrs.) with no experience operating heavy mobile equipment. Previously recorded 6-DOF chassis acceleration data from a Pot Hauler (a machine which picks up and transports pots of slag) were used to extract six 20-second representative profiles for implementation on a lab-based heavy machine simulator (6-DOF Parallel Robotics System Corporation robot). Subjects sat on three heavy equipment seats (BeGe7150, Grammar MSG 95G1721, and a 6801 Isringhausen with the seat pan cushion retrofitted with a Skydex cushion) mounted on the simulator. Each subject completed three trials for each combination of seat (n=3) and vibration profile (n=6). Chassis and operator/seat interface vibration were measured by two 6-DOF vibration transducers. Variables included Seat Effective Amplitude Transmissibility (SEAT) (X, Y, Z, Roll, Pitch, Yaw, 6DOF Vector Sum) to determine if the seat was attenuating or amplifying the vibration, 6-degree of freedom (DOF) vibration total value weighted predicted comfort (Avc) (according to ISO 2631-1) and operator reported comfort (ORC). Factorial ANOVAs revealed significant differences (p ≤ 0.05) between seats for all SEAT variables but different seats performed better than others depending on the axis. Significant differences between males and females were observed for SEAT in X, Y, and Pitch as well as for Avc. As expected there were significant differences between vibration profiles for all assessed variables. A number of interaction effects were observed, the most frequently occurring of which was between seat and vibration profile. Based upon the number of seat and vibration profile interactions, results suggest that a single seat is not suited for all tested
Chan, Chung-Hung; Yusoff, Rozita; Ngoh, Gek-Cheng
2013-09-01
A modeling technique based on absorbed microwave energy was proposed to model microwave-assisted extraction (MAE) of antioxidant compounds from cocoa (Theobroma cacao L.) leaves. By adapting suitable extraction model at the basis of microwave energy absorbed during extraction, the model can be developed to predict extraction profile of MAE at various microwave irradiation power (100-600 W) and solvent loading (100-300 ml). Verification with experimental data confirmed that the prediction was accurate in capturing the extraction profile of MAE (R-square value greater than 0.87). Besides, the predicted yields from the model showed good agreement with the experimental results with less than 10% deviation observed. Furthermore, suitable extraction times to ensure high extraction yield at various MAE conditions can be estimated based on absorbed microwave energy. The estimation is feasible as more than 85% of active compounds can be extracted when compared with the conventional extraction technique. Copyright © 2013 Elsevier Ltd. All rights reserved.
A Machine Learning-based Rainfall System for GPM Dual-frequency Radar
NASA Astrophysics Data System (ADS)
Tan, H.; Chandrasekar, V.; Chen, H.
2017-12-01
Precipitation measurement produced by the Global Precipitation Measurement (GPM) Dual-frequency Precipitation Radar (DPR) plays an important role in researching the water cycle and forecasting extreme weather events. Compared with its predecessor, the Tropical Rainfall Measuring Mission (TRMM) Precipitation Radar (PR), the GPM DPR measures precipitation in two different frequencies (i.e., Ku and Ka band), which can provide detailed information on the microphysical properties of precipitation particles, quantify particle size distribution and quantitatively measure light rain and falling snow. This paper presents a novel Machine Learning system for ground-based and space borne radar rainfall estimation. The system first trains ground radar data for rainfall estimation using rainfall measurements from gauges and subsequently uses the ground radar based rainfall estimates to train GPM DPR data in order to get space based rainfall product. Therein, data alignment between space DPR and ground radar is conducted using the methodology proposed by Bolen and Chandrasekar (2013), which can minimize the effects of potential geometric distortion of GPM DPR observations. For demonstration purposes, rainfall measurements from three rain gauge networks near Melbourne, Florida, are used for training and validation purposes. These three gauge networks, which are located in Kennedy Space Center (KSC), South Florida Water Management District (SFL), and St. Johns Water Management District (STJ), include 33, 46, and 99 rain gauge stations, respectively. Collocated ground radar observations from the National Weather Service (NWS) Weather Surveillance Radar - 1988 Doppler (WSR-88D) in Melbourne (i.e., KMLB radar) are trained with the gauge measurements. The trained model is then used to derive KMLB radar based rainfall product, which is used to train GPM DPR data collected from coincident overpasses events. The machine learning based rainfall product is compared against the GPM standard products
A rule-based approach to model checking of UML state machines
NASA Astrophysics Data System (ADS)
Grobelna, Iwona; Grobelny, Michał; Stefanowicz, Łukasz
2016-12-01
In the paper a new approach to formal verification of control process specification expressed by means of UML state machines in version 2.x is proposed. In contrast to other approaches from the literature, we use an abstract and universal rule-based logical model suitable both for model checking (using the nuXmv model checker) and for logical synthesis in the form of rapid prototyping. Hence, a prototype implementation in the hardware description language VHDL can be obtained that fully reflects the primary, already formally verified specification in the form of UML state machines. The presented approach increases the assurance that the implemented system meets the user-defined requirements.
Automatic Extraction of Metadata from Scientific Publications for CRIS Systems
ERIC Educational Resources Information Center
Kovacevic, Aleksandar; Ivanovic, Dragan; Milosavljevic, Branko; Konjovic, Zora; Surla, Dusan
2011-01-01
Purpose: The aim of this paper is to develop a system for automatic extraction of metadata from scientific papers in PDF format for the information system for monitoring the scientific research activity of the University of Novi Sad (CRIS UNS). Design/methodology/approach: The system is based on machine learning and performs automatic extraction…
Zhang, Heng; Pan, Zhongming; Zhang, Wenna
2018-06-07
An acoustic-seismic mixed feature extraction method based on the wavelet coefficient energy ratio (WCER) of the target signal is proposed in this study for classifying vehicle targets in wireless sensor networks. The signal was decomposed into a set of wavelet coefficients using the à trous algorithm, which is a concise method used to implement the wavelet transform of a discrete signal sequence. After the wavelet coefficients of the target acoustic and seismic signals were obtained, the energy ratio of each layer coefficient was calculated as the feature vector of the target signals. Subsequently, the acoustic and seismic features were merged into an acoustic-seismic mixed feature to improve the target classification accuracy after the acoustic and seismic WCER features of the target signal were simplified using the hierarchical clustering method. We selected the support vector machine method for classification and utilized the data acquired from a real-world experiment to validate the proposed method. The calculated results show that the WCER feature extraction method can effectively extract the target features from target signals. Feature simplification can reduce the time consumption of feature extraction and classification, with no effect on the target classification accuracy. The use of acoustic-seismic mixed features effectively improved target classification accuracy by approximately 12% compared with either acoustic signal or seismic signal alone.
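The idea of an energy-ratio feature over à trous wavelet layers can be sketched as below. This is a rough illustration under several assumptions that the abstract does not confirm: a Haar-style two-tap smoothing filter, periodic boundary handling, and inclusion of the final approximation layer in the ratio. The function names `a_trous_haar` and `energy_ratios` are hypothetical.

```python
def a_trous_haar(signal, levels):
    """Undecimated ("with holes") decomposition using a two-tap averaging filter.

    At level j the filter taps are 2**j samples apart; indices wrap around
    (periodic boundary, an assumption of this sketch).
    Returns (details, approximation), where the details plus the final
    approximation sum back to the input signal.
    """
    n = len(signal)
    approx = list(signal)
    details = []
    for level in range(levels):
        step = 2 ** level
        smoother = [0.5 * (approx[i] + approx[(i - step) % n]) for i in range(n)]
        details.append([a - s for a, s in zip(approx, smoother)])
        approx = smoother
    return details, approx


def energy_ratios(layers):
    """Share of the total squared-coefficient energy carried by each layer."""
    energies = [sum(x * x for x in layer) for layer in layers]
    total = sum(energies)
    if total == 0:
        return [0.0] * len(layers)
    return [e / total for e in energies]


if __name__ == "__main__":
    toy_signal = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0, 2.0, 1.0]
    detail_layers, final_approx = a_trous_haar(toy_signal, levels=3)
    print(energy_ratios(detail_layers + [final_approx]))
```

Computed separately for an acoustic and a seismic recording, two such ratio vectors could be concatenated to form a mixed feature of the kind the abstract describes.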
Machine Learning and Radiology
Wang, Shijun; Summers, Ronald M.
2012-01-01
In this paper, we give a short introduction to machine learning and survey its applications in radiology. We focused on six categories of applications in radiology: medical image segmentation, registration, computer aided detection and diagnosis, brain function or activity analysis and neurological disease diagnosis from fMR images, content-based image retrieval systems for CT or MRI images, and text analysis of radiology reports using natural language processing (NLP) and natural language understanding (NLU). This survey shows that machine learning plays a key role in many radiology applications. Machine learning identifies complex patterns automatically and helps radiologists make intelligent decisions on radiology data such as conventional radiographs, CT, MRI, and PET images and radiology reports. In many applications, the performance of machine learning-based automatic detection and diagnosis systems has shown to be comparable to that of a well-trained and experienced radiologist. Technology development in machine learning and radiology will benefit from each other in the long run. Key contributions and common characteristics of machine learning techniques in radiology are discussed. We also discuss the problem of translating machine learning applications to the radiology clinical setting, including advantages and potential barriers. PMID:22465077
A Comparison Study of Machine Learning Based Algorithms for Fatigue Crack Growth Calculation.
Wang, Hongxun; Zhang, Weifang; Sun, Fuqiang; Zhang, Wei
2017-05-18
The relationships between the fatigue crack growth rate (da/dN) and stress intensity factor range (ΔK) are not always linear even in the Paris region. The stress ratio effects on fatigue crack growth rate are diverse in different materials. However, most existing fatigue crack growth models cannot handle these nonlinearities appropriately. The machine learning method provides a flexible approach to the modeling of fatigue crack growth because of its excellent nonlinear approximation and multivariable learning ability. In this paper, a fatigue crack growth calculation method is proposed based on three different machine learning algorithms (MLAs): extreme learning machine (ELM), radial basis function network (RBFN) and genetic algorithms optimized back propagation network (GABP). The MLA based method is validated using testing data of different materials. The three MLAs are compared with each other as well as the classical two-parameter model (K* approach). The results show that the predictions of MLAs are superior to those of K* approach in accuracy and effectiveness, and the ELM based algorithms show overall the best agreement with the experimental data out of the three MLAs, for its global optimization and extrapolation ability.
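For context, the linear behaviour that the abstract says does not always hold is usually written as the Paris relation, which is a straight line on log-log axes. The form below is the standard textbook statement rather than anything given in the abstract; C and m are empirical material constants, and R is the stress ratio whose effect the abstract mentions.

```latex
\frac{da}{dN} = C\,(\Delta K)^{m},
\qquad
\log\frac{da}{dN} = \log C + m\,\log \Delta K,
\qquad
\Delta K = K_{\max} - K_{\min},
\quad
R = \frac{K_{\min}}{K_{\max}}
```

A learned model of the kind compared in the abstract would replace the fixed pair (C, m) with a nonlinear mapping from inputs such as ΔK and R to da/dN.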
Machine vision for digital microfluidics
NASA Astrophysics Data System (ADS)
Shin, Yong-Jun; Lee, Jeong-Bong
2010-01-01
Machine vision is widely used in an industrial environment today. It can perform various tasks, such as inspecting and controlling production processes, that may require humanlike intelligence. The importance of imaging technology for biological research or medical diagnosis is greater than ever. For example, fluorescent reporter imaging enables scientists to study the dynamics of gene networks with high spatial and temporal resolution. Such high-throughput imaging is increasingly demanding the use of machine vision for real-time analysis and control. Digital microfluidics is a relatively new technology with expectations of becoming a true lab-on-a-chip platform. Utilizing digital microfluidics, only small amounts of biological samples are required and the experimental procedures can be automatically controlled. There is a strong need for the development of a digital microfluidics system integrated with machine vision for innovative biological research today. In this paper, we show how machine vision can be applied to digital microfluidics by demonstrating two applications: machine vision-based measurement of the kinetics of biomolecular interactions and machine vision-based droplet motion control. It is expected that digital microfluidics-based machine vision system will add intelligence and automation to high-throughput biological imaging in the future.
Romeo, Valeria; Maurea, Simone; Cuocolo, Renato; Petretta, Mario; Mainenti, Pier Paolo; Verde, Francesco; Coppola, Milena; Dell'Aversana, Serena; Brunetti, Arturo
2018-01-17
Adrenal adenomas (AA) are the most common benign adrenal lesions, often characterized based on intralesional fat content as either lipid-rich (LRA) or lipid-poor (LPA). The differentiation of AA, particularly LPA, from nonadenoma adrenal lesions (NAL) may be challenging. Texture analysis (TA) can extract quantitative parameters from MR images. Machine learning is a technique for recognizing patterns that can be applied to medical images by identifying the best combination of TA features to create a predictive model for the diagnosis of interest. To assess the diagnostic efficacy of TA-derived parameters extracted from MR images in characterizing LRA, LPA, and NAL using a machine-learning approach. Retrospective, observational study. Sixty MR examinations, including 20 LRA, 20 LPA, and 20 NAL. Unenhanced T1-weighted in-phase (IP) and out-of-phase (OP) as well as T2-weighted (T2-w) MR images acquired at 3T. Adrenal lesions were manually segmented, placing a spherical volume of interest on IP, OP, and T2-w images. Different selection methods were trained and tested using the J48 machine-learning classifiers. The feature selection method that obtained the highest diagnostic performance using the J48 classifier was identified; the diagnostic performance was also compared with that of a senior radiologist by means of McNemar's test. A total of 138 TA-derived features were extracted; among these, four features were selected, extracted from the IP (Short_Run_High_Gray_Level_Emphasis), OP (Mean_Intensity and Maximum_3D_Diameter), and T2-w (Standard_Deviation) images; the J48 classifier obtained a diagnostic accuracy of 80%. The expert radiologist obtained a diagnostic accuracy of 73%. McNemar's test did not show significant differences in terms of diagnostic performance between the J48 classifier and the expert radiologist. Machine learning conducted on MR TA-derived features is a potential tool to characterize adrenal lesions. 4 Technical Efficacy: Stage 2 J
A review on solid phase extraction of actinides and lanthanides with amide based extractants.
Ansari, Seraj A; Mohapatra, Prasanta K
2017-05-26
Solid phase extraction is gaining attention from separation scientists due to its high chromatographic utility. Though both grafted and impregnated forms of solid phase extraction resins are popular, the latter is easy to make by impregnating a given organic extractant onto an inert solid support. Solid phase extraction on an impregnated support, also known as extraction chromatography, combines the advantages of liquid-liquid extraction and ion exchange chromatography methods. On the flip side, the impregnated extraction chromatographic resins are less stable against leaching of the organic extractant from the pores of the support material. Grafted resins, on the other hand, have a higher stability, which allows their prolonged use. The goal of this article is a brief literature review of reported actinide and lanthanide separation methods based on solid phase extractants of both types, i.e., (i) ligand impregnation on the solid support or (ii) ligand functionalized polymers (chemically bonded resins). Though the literature survey reveals an enormous volume of studies on the extraction chromatographic separation of actinides and lanthanides using several extractants, the focus of the present article is limited to the work carried out with amide based ligands, viz. monoamides, diamides and diglycolamides. The emphasis will be on reported applied experimental results rather than on data pertaining to fundamental metal complexation. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Zhang, Huibin; Wang, Yuqiao; Chen, Haoran; Zhao, Yongli; Zhang, Jie
2017-12-01
In software defined optical networks (SDON), the centralized control plane may encounter numerous intrusion threats which compromise the security level of provisioned services. In this paper, the issue of control plane security is studied and two machine-learning-based control plane intrusion detection techniques are proposed for SDON with properly selected features such as bandwidth, route length, etc. We validate the feasibility and efficiency of the proposed techniques by simulations. Results show that an accuracy of 83% for intrusion detection can be achieved with the proposed machine-learning-based control plane intrusion detection techniques.
NASA Astrophysics Data System (ADS)
Hong, Haibo; Yin, Yuehong; Chen, Xing
2016-11-01
Despite the rapid development of computer science and information technology, an efficient human-machine integrated enterprise information system for designing complex mechatronic products is still not fully accomplished, partly because of the inharmonious communication among collaborators. Therefore, one challenge in human-machine integration is how to establish an appropriate knowledge management (KM) model to support integration and sharing of heterogeneous product knowledge. Aiming at the diversity of design knowledge, this article proposes an ontology-based model to reach an unambiguous and normative representation of knowledge. First, an ontology-based human-machine integrated design framework is described, then corresponding ontologies and sub-ontologies are established according to different purposes and scopes. Second, a similarity calculation-based ontology integration method composed of ontology mapping and ontology merging is introduced. The ontology searching-based knowledge sharing method is then developed. Finally, a case of human-machine integrated design of a large ultra-precision grinding machine is used to demonstrate the effectiveness of the method.
NASA Astrophysics Data System (ADS)
Wang, X.; Xu, L.
2018-04-01
One of the most important applications of remote sensing classification is water extraction. The water index (WI) based on Landsat images is one of the most common ways to distinguish water bodies from other land surface features. But conventional WI methods take into account spectral information only from a limited number of bands, and therefore the accuracy of those WI methods may be constrained in some areas which are covered with snow/ice, clouds, etc. An accurate and robust water extraction method is the key to the study at present. The support vector machine (SVM) using all bands' spectral information can reduce these classification errors to some extent. Nevertheless, SVM, which barely considers spatial information, is relatively sensitive to noise in local regions. The conditional random field (CRF), which considers both spatial information and spectral information, has proven to be able to compensate for these limitations. Hence, in this paper, we develop a systematic water extraction method by taking advantage of the complementarity between the SVM and a water index-guided stochastic fully-connected conditional random field (SVM-WIGSFCRF) to address the above issues. In addition, we comprehensively evaluate the reliability and accuracy of the proposed method using Landsat-8 operational land imager (OLI) images of one test site. We assess the method's performance by calculating the following accuracy metrics: Omission Errors (OE) and Commission Errors (CE); Kappa coefficient (KP) and Total Error (TE). Experimental results show that the new method can improve target detection accuracy under complex and changeable environments.
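Three of the accuracy metrics named above can be computed from a binary water / non-water confusion matrix, as sketched below. The sketch assumes the conventional definitions (omission error as missed water over reference water, commission error as false water over predicted water, Cohen's kappa for chance-corrected agreement); the abstract does not spell out its formulas, and Total Error is left out because its definition varies between studies. The function name and the label encoding (1 = water, 0 = other) are hypothetical.

```python
def water_accuracy_metrics(reference, predicted):
    """Omission error, commission error and Cohen's kappa for binary labels.

    reference and predicted are equal-length sequences of 0/1 labels,
    with 1 meaning water (an encoding assumed for this sketch).
    """
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)
    n = tp + fn + fp + tn

    def safe_div(num, den):
        return num / den if den else 0.0

    omission = safe_div(fn, tp + fn)    # reference water that was missed
    commission = safe_div(fp, tp + fp)  # predicted water that is not water
    observed = safe_div(tp + tn, n)
    # Agreement expected by chance from the row and column totals.
    expected = safe_div((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn), n * n)
    kappa = safe_div(observed - expected, 1.0 - expected)
    return {"OE": omission, "CE": commission, "kappa": kappa}


if __name__ == "__main__":
    truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    guess = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    print(water_accuracy_metrics(truth, guess))
```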
Elicitation of neurological knowledge with argument-based machine learning.
Groznik, Vida; Guid, Matej; Sadikov, Aleksander; Možina, Martin; Georgiev, Dejan; Kragelj, Veronika; Ribarič, Samo; Pirtošek, Zvezdan; Bratko, Ivan
2013-02-01
The paper describes the use of expert's knowledge in practice and the efficiency of a recently developed technique called argument-based machine learning (ABML) in the knowledge elicitation process. We are developing a neurological decision support system to help the neurologists differentiate between three types of tremors: Parkinsonian, essential, and mixed tremor (comorbidity). The system is intended to act as a second opinion for the neurologists, and most importantly to help them reduce the number of patients in the "gray area" that require a very costly further examination (DaTSCAN). We strive to elicit comprehensible and medically meaningful knowledge in such a way that it does not come at the cost of diagnostic accuracy. To alleviate the difficult problem of knowledge elicitation from data and domain experts, we used ABML. ABML guides the expert to explain critical special cases which cannot be handled automatically by machine learning. This very efficiently reduces the expert's workload, and combines expert's knowledge with learning data. 122 patients were enrolled into the study. The classification accuracy of the final model was 91%. Equally important, the initial and the final models were also evaluated for their comprehensibility by the neurologists. All 13 rules of the final model were deemed as appropriate to be able to support its decisions with good explanations. The paper demonstrates ABML's advantage in combining machine learning and expert knowledge. The accuracy of the system is very high with respect to the current state-of-the-art in clinical practice, and the system's knowledge base is assessed to be very consistent from a medical point of view. This opens up the possibility to use the system also as a teaching tool. Copyright © 2012 Elsevier B.V. All rights reserved.
Articulated, Performance-Based Instruction Objectives Guide for Machine Shop Technology.
ERIC Educational Resources Information Center
Henderson, William Edward, Jr., Ed.
This articulation guide contains 21 units of instruction for two years of machine shop. The objectives of the program are to provide the student with the basic terminology and fundamental knowledge and skills in machining (year 1) and to teach him/her to set up and operate machine tools and make or repair metal parts, tools, and machines (year 2).…
Xia, Jiaqi; Peng, Zhenling; Qi, Dawei; Mu, Hongbo; Yang, Jianyi
2017-03-15
Protein fold classification is a critical step in protein structure prediction. There are two possible ways to classify protein folds. One is through template-based fold assignment and the other is ab-initio prediction using machine learning algorithms. Combining the two to improve prediction accuracy has not been explored before. We developed two algorithms, HH-fold and SVM-fold, for protein fold classification. HH-fold is a template-based fold assignment algorithm using the HHsearch program. SVM-fold is a support vector machine-based ab-initio classification algorithm, in which a comprehensive set of features is extracted from three complementary sequence profiles. These two algorithms are then combined, resulting in the ensemble approach TA-fold. We performed a comprehensive assessment of the proposed methods by comparing them with ab-initio methods and template-based threading methods on six benchmark datasets. An accuracy of 0.799 was achieved by TA-fold on the DD dataset, which consists of proteins from 27 folds. This represents an improvement of 5.4-11.7% over ab-initio methods. After updating this dataset to include more proteins in the same folds, the accuracy increased to 0.971. In addition, TA-fold achieved >0.9 accuracy on a large dataset consisting of 6451 proteins from 184 folds. Experiments on the LE dataset show that TA-fold consistently outperforms other threading methods at the family, superfamily and fold levels. The success of TA-fold is attributed to the combination of template-based fold assignment and ab-initio classification using features from complementary sequence profiles that contain rich evolutionary information. http://yanglab.nankai.edu.cn/TA-fold/. yangjy@nankai.edu.cn or mhb-506@163.com. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Information based universal feature extraction
NASA Astrophysics Data System (ADS)
Amiri, Mohammad; Brause, Rüdiger
2015-02-01
In many real-world image-based pattern recognition tasks, the extraction and usage of task-relevant features are the most crucial part of the diagnosis. In the standard approach, they mostly remain task-specific, although humans who perform such a task always use the same image features, trained in early childhood. It seems that universal feature sets exist, but they have not yet been found systematically. In our contribution, we tried to find those universal image feature sets that are valuable for most image-related tasks. In our approach, we trained a neural network with natural and non-natural images of objects and background, using a Shannon information-based algorithm and learning constraints. The goal was to extract those features that give the most valuable information for classification of visual objects hand-written digits. This will give a good start and performance increase for all other image learning tasks, implementing a transfer learning approach. As a result, we found that we could indeed extract features which are valid in all three kinds of tasks.
An efficient scheme for automatic web pages categorization using the support vector machine
NASA Astrophysics Data System (ADS)
Bhalla, Vinod Kumar; Kumar, Neeraj
2016-07-01
In the past few years, with the evolution of the Internet and related technologies, the number of Internet users has grown exponentially. These users demand access to relevant web pages within a fraction of a second. To achieve this goal, web page contents must be categorized efficiently. Manual categorization of these billions of web pages to achieve high accuracy is a challenging task. Most of the existing techniques reported in the literature are semi-automatic. Using these techniques, a higher level of accuracy cannot be achieved. To achieve these goals, this paper proposes automatic categorization of web pages by domain category. The proposed scheme is based on the identification of specific and relevant features of the web pages. In the proposed scheme, features are first extracted and evaluated, and the feature set is then filtered for categorization of domain web pages. A feature extraction tool based on the HTML document object model of the web page is developed in the proposed scheme. Feature extraction and weight assignment are based on a domain-specific keyword list developed by considering various domain pages. Moreover, the keyword list is reduced on the basis of the ids of the keywords in the list. Also, stemming of keywords and tag text is done to achieve a higher accuracy. An extensive feature set is generated to develop a robust classification technique. The proposed scheme was evaluated using a machine learning method in combination with feature extraction and statistical analysis, with a support vector machine kernel as the classification tool. The results obtained confirm the effectiveness of the proposed scheme in terms of its accuracy in different categories of web pages.
NASA Astrophysics Data System (ADS)
Zhao, Fei; Zhang, Chi; Yang, Guilin; Chen, Chinyin
2016-12-01
This paper presents an online method for estimating cutting error by analyzing internal sensor readings. The internal sensors of a numerical control (NC) machine tool are used to avoid installation problems. A mathematical estimation model of cutting error is proposed to compute the relative position of the cutting point and the tool center point (TCP) from internal sensor readings, based on gear cutting theory. To verify the effectiveness of the proposed model, it was tested in simulation and experiment on a gear generating grinding process. The cutting error of the gear was estimated and the factors that induce cutting error were analyzed. The simulation and experiments verify that the proposed approach is an efficient way to estimate the cutting error of a workpiece during machining.
Time-Frequency Learning Machines for Nonstationarity Detection Using Surrogates
NASA Astrophysics Data System (ADS)
Borgnat, Pierre; Flandrin, Patrick; Richard, Cédric; Ferrari, André; Amoud, Hassan; Honeine, Paul
2012-03-01
Time-frequency representations provide a powerful tool for nonstationary signal analysis and classification, supporting a wide range of applications [12]. As opposed to conventional Fourier analysis, these techniques reveal the evolution in time of the spectral content of signals. In Ref. [7,38], time-frequency analysis is used to test stationarity of any signal. The proposed method consists of a comparison between global and local time-frequency features. The originality is to make use of a family of stationary surrogate signals for defining the null hypothesis of stationarity and, based upon this information, to derive statistical tests. An open question remains, however, about how to choose relevant time-frequency features. Over the last decade, a number of new pattern recognition methods based on reproducing kernels have been introduced. These learning machines have gained popularity due to their conceptual simplicity and their outstanding performance [30]. Initiated by Vapnik’s support vector machines (SVM) [35], they offer now a wide class of supervised and unsupervised learning algorithms. In Ref. [17-19], the authors have shown how the most effective and innovative learning machines can be tuned to operate in the time-frequency domain. This chapter follows this line of research by taking advantage of learning machines to test and quantify stationarity. Based on one-class SVM, our approach uses the entire time-frequency representation and does not require arbitrary feature extraction. Applied to a set of surrogates, it provides the domain boundary that includes most of these stationarized signals. This allows us to test the stationarity of the signal under investigation. This chapter is organized as follows. In Section 22.2, we introduce the surrogate data method to generate stationarized signals, namely, the null hypothesis of stationarity. The concept of time-frequency learning machines is presented in Section 22.3, and applied to one-class SVM in order
Dynamics of continental rift propagation: the end-member modes
NASA Astrophysics Data System (ADS)
Van Wijk, J. W.; Blackman, D. K.
2005-01-01
An important aspect of continental rifting is the progressive variation of deformation style along the rift axis during rift propagation. In regions of rift propagation, specifically transition zones from continental rifting to seafloor spreading, it has been observed that contrasting styles of deformation along the axis of rift propagation are bounded by shear zones. The focus of this numerical modeling study is to look at dynamic processes near the tip of a weak zone in continental lithosphere. More specifically, this study explores how modeled rift behavior depends on the value of rheological parameters of the crust. A three-dimensional finite element model is used to simulate lithosphere deformation in an extensional regime. The chosen approach emphasizes understanding the tectonic forces involved in rift propagation. Dependent on plate strength, two end-member modes are distinguished. The stalled rift phase is characterized by absence of rift propagation for a certain amount of time. Extension beyond the edge of the rift tip is no longer localized but occurs over a very wide zone, which requires a buildup of shear stresses near the rift tip and significant intra-plate deformation. This stage represents a situation in which a rift meets a locked zone. Localized deformation changes to distributed deformation in the locked zone, and the two different deformation styles are balanced by a shear zone oriented perpendicular to the trend. In the alternative rift propagation mode, rift propagation is a continuous process when the initial crust is weak. The extension style does not change significantly along the rift axis and lengthening of the rift zone is not accompanied by a buildup of shear stresses. Model predictions address aspects of previously unexplained rift evolution in the Laptev Sea, and its contrast with the tectonic evolution of, for example, the Gulf of Aden and Woodlark Basin.
Fiot, Jean-Baptiste; Cohen, Laurent D; Raniga, Parnesh; Fripp, Jurgen
2013-09-01
Support vector machines (SVM) are machine learning techniques that have been used for segmentation and classification of medical images, including segmentation of white matter hyper-intensities (WMH). Current approaches using SVM for WMH segmentation extract features from the brain and classify these followed by complex post-processing steps to remove false positives. The method presented in this paper combines advanced pre-processing, tissue-based feature selection and SVM classification to obtain efficient and accurate WMH segmentation. Features from 125 patients, generated from up to four MR modalities [T1-w, T2-w, proton-density and fluid attenuated inversion recovery(FLAIR)], differing neighbourhood sizes and the use of multi-scale features were compared. We found that although using all four modalities gave the best overall classification (average Dice scores of 0.54 ± 0.12, 0.72 ± 0.06 and 0.82 ± 0.06 respectively for small, moderate and severe lesion loads); this was not significantly different (p = 0.50) from using just T1-w and FLAIR sequences (Dice scores of 0.52 ± 0.13, 0.71 ± 0.08 and 0.81 ± 0.07). Furthermore, there was a negligible difference between using 5 × 5 × 5 and 3 × 3 × 3 features (p = 0.93). Finally, we show that careful consideration of features and pre-processing techniques not only saves storage space and computation time but also leads to more efficient classification, which outperforms the one based on all features with post-processing. Copyright © 2013 John Wiley & Sons, Ltd.
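The Dice scores quoted in the abstract above measure the overlap between an automatic segmentation and a reference mask. A minimal sketch of that measure follows, assuming both masks are flat sequences of 0/1 voxel labels; the function name and the toy masks are illustrative and are not taken from the paper.

```python
def dice_score(predicted, reference):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as lesion in both masks.
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    # Total lesion voxels across the two masks.
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * overlap / total


if __name__ == "__main__":
    manual = [0, 1, 1, 1, 0, 0]      # hypothetical reference mask
    automatic = [0, 1, 1, 0, 1, 0]   # hypothetical classifier output
    print(dice_score(automatic, manual))
```

A score of 1 means the two masks coincide and 0 means they share no lesion voxels, which is why small lesion loads, where a few voxels of disagreement matter more, tend to score lower.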
An Event-Triggered Machine Learning Approach for Accelerometer-Based Fall Detection.
Putra, I Putu Edy Suardiyana; Brusey, James; Gaura, Elena; Vesilo, Rein
2017-12-22
The fixed-size non-overlapping sliding window (FNSW) and fixed-size overlapping sliding window (FOSW) approaches are the most commonly used data-segmentation techniques in machine learning-based fall detection using accelerometer sensors. However, these techniques do not segment by fall stages (pre-impact, impact, and post-impact) and thus useful information is lost, which may reduce the detection rate of the classifier. Aligning the segment with the fall stage is difficult, as the segment size varies. We propose an event-triggered machine learning (EvenT-ML) approach that aligns each fall stage so that the characteristic features of the fall stages are more easily recognized. To evaluate our approach, two publicly accessible datasets were used. Classification and regression tree (CART), k -nearest neighbor ( k -NN), logistic regression (LR), and the support vector machine (SVM) were used to train the classifiers. EvenT-ML gives classifier F-scores of 98% for a chest-worn sensor and 92% for a waist-worn sensor, and significantly reduces the computational cost compared with the FNSW- and FOSW-based approaches, with reductions of up to 8-fold and 78-fold, respectively. EvenT-ML achieves a significantly better F-score than existing fall detection approaches. These results indicate that aligning feature segments with fall stages significantly increases the detection rate and reduces the computational cost.
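For readers unfamiliar with the two baseline segmentation schemes named above, the following sketch shows fixed-size windows with and without overlap. It illustrates only FNSW and FOSW, not the event-triggered EvenT-ML approach; the function name, the `overlap` parameter and the sample values are assumptions made for illustration.

```python
def sliding_windows(samples, size, overlap=0.0):
    """Split a sequence into fixed-size windows.

    overlap=0.0 gives non-overlapping windows (FNSW);
    0 < overlap < 1 gives overlapping windows (FOSW).
    Trailing samples that do not fill a whole window are dropped.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # Distance between the starts of consecutive windows.
    step = max(1, int(size * (1.0 - overlap)))
    return [samples[start:start + size]
            for start in range(0, len(samples) - size + 1, step)]


if __name__ == "__main__":
    accel = list(range(8))  # stand-in for accelerometer magnitudes
    print(sliding_windows(accel, 4))        # FNSW
    print(sliding_windows(accel, 4, 0.5))   # FOSW with 50% overlap
```

Because window boundaries here depend only on sample index, a fall's pre-impact, impact and post-impact stages can be split across windows, which is the limitation the abstract describes.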
Probabilistic and machine learning-based retrieval approaches for biomedical dataset retrieval
Karisani, Payam; Qin, Zhaohui S; Agichtein, Eugene
2018-01-01
Abstract The bioCADDIE dataset retrieval challenge brought together different approaches to retrieval of biomedical datasets relevant to a user’s query, expressed as a text description of a needed dataset. We describe experiments in applying a data-driven, machine learning-based approach to biomedical dataset retrieval as part of this challenge. We report on a series of experiments carried out to evaluate the performance of both probabilistic and machine learning-driven techniques from information retrieval, as applied to this challenge. Our experiments with probabilistic information retrieval methods, such as query term weight optimization, automatic query expansion and simulated user relevance feedback, demonstrate that automatically boosting the weights of important keywords in a verbose query is more effective than other methods. We also show that although there is a rich space of potential representations and features available in this domain, machine learning-based re-ranking models are not able to improve on probabilistic information retrieval techniques with the currently available training data. The models and algorithms presented in this paper can serve as a viable implementation of a search engine to provide access to biomedical datasets. The retrieval performance is expected to be further improved by using additional training data that is created by expert annotation, or gathered through usage logs, clicks and other processes during natural operation of the system. Database URL: https://github.com/emory-irlab/biocaddie PMID:29688379
Rule-based Approach on Extraction of Malay Compound Nouns in Standard Malay Document
NASA Astrophysics Data System (ADS)
Abu Bakar, Zamri; Kamal Ismail, Normaly; Rawi, Mohd Izani Mohamed
2017-08-01
A Malay compound noun is defined as a word form that arises when two or more words are combined into a single syntactic unit with a specific meaning. A compound noun acts as one unit and is spelled as separate words, unless it is an established compound that is written as a single word. A basic characteristic of a compound noun, visible in Malay sentences, is the frequency of that word in the text itself. Thus, the extraction of compound nouns is significant for the following research areas: text summarization, grammar checking, sentiment analysis, machine translation and word categorization. Many research efforts have proposed linguistic approaches to extracting Malay compound nouns. Most existing methods address the extraction of bi-gram noun+noun compounds. However, their results still leave room for improvement. This paper explores a linguistic method for extracting compound nouns from a standard Malay corpus. A standard dataset is used to provide a common platform for evaluating research on the recognition of compound nouns in Malay sentences. An improvement in the effectiveness of compound noun extraction is therefore needed, because the results can be compromised. Thus, this study proposes a modification of the linguistic approach in order to enhance compound noun extraction. Several pre-processing steps are involved, including normalization, tokenization and tagging. The first step that uses the linguistic approach in this study is Part-of-Speech (POS) tagging. Finally, we describe several rules and modify them to capture the most relevant relation between the first word and the second word, which helps solve these problems. The effectiveness of the relations used in our study can be measured using recall, precision and F1-score. Comparison with the baseline values is essential because it can show whether there has been an improvement
Barrington, Luke; Turnbull, Douglas; Lanckriet, Gert
2012-01-01
Searching for relevant content in a massive amount of multimedia information is facilitated by accurately annotating each image, video, or song with a large number of relevant semantic keywords, or tags. We introduce game-powered machine learning, an integrated approach to annotating multimedia content that combines the effectiveness of human computation, through online games, with the scalability of machine learning. We investigate this framework for labeling music. First, a socially-oriented music annotation game called Herd It collects reliable music annotations based on the “wisdom of the crowds.” Second, these annotated examples are used to train a supervised machine learning system. Third, the machine learning system actively directs the annotation games to collect new data that will most benefit future model iterations. Once trained, the system can automatically annotate a corpus of music much larger than what could be labeled using human computation alone. Automatically annotated songs can be retrieved based on their semantic relevance to text-based queries (e.g., “funky jazz with saxophone,” “spooky electronica,” etc.). Based on the results presented in this paper, we find that actively coupling annotation games with machine learning provides a reliable and scalable approach to making searchable massive amounts of multimedia data. PMID:22460786
Game-powered machine learning.
Barrington, Luke; Turnbull, Douglas; Lanckriet, Gert
2012-04-24
Searching for relevant content in a massive amount of multimedia information is facilitated by accurately annotating each image, video, or song with a large number of relevant semantic keywords, or tags. We introduce game-powered machine learning, an integrated approach to annotating multimedia content that combines the effectiveness of human computation, through online games, with the scalability of machine learning. We investigate this framework for labeling music. First, a socially-oriented music annotation game called Herd It collects reliable music annotations based on the "wisdom of the crowds." Second, these annotated examples are used to train a supervised machine learning system. Third, the machine learning system actively directs the annotation games to collect new data that will most benefit future model iterations. Once trained, the system can automatically annotate a corpus of music much larger than what could be labeled using human computation alone. Automatically annotated songs can be retrieved based on their semantic relevance to text-based queries (e.g., "funky jazz with saxophone," "spooky electronica," etc.). Based on the results presented in this paper, we find that actively coupling annotation games with machine learning provides a reliable and scalable approach to making searchable massive amounts of multimedia data.
2005-12-09
decision making logic that respond to the environment (concentration of operands - the state vector), and bias or "mood" as established by its history of...mentioned in the chart, there is no need for file management in a ABC Machine. Information is distributed, no history is maintained. The instruction set... Postgresql ) for collection of cluster samples/snapshots over intervals of time. An prototypical example of an XML file to configure and launch the ABC
NASA Astrophysics Data System (ADS)
Miller, Matthew P.; Tesoriero, Anthony J.; Hood, Krista; Terziotti, Silvia; Wolock, David M.
2017-12-01
The myriad hydrologic and biogeochemical processes taking place in watersheds occurring across space and time are integrated and reflected in the quantity and quality of water in streams and rivers. Collection of high-frequency water quality data with sensors in surface waters provides new opportunities to disentangle these processes and quantify sources and transport of water and solutes in the coupled groundwater-surface water system. A new approach for separating the streamflow hydrograph into three components was developed and coupled with high-frequency nitrate data to estimate time-variable nitrate loads from chemically dilute quick flow, chemically concentrated quick flow, and slowflow groundwater end-member pathways for periods of up to 2 years in a groundwater-dominated and a quick-flow-dominated stream in central Wisconsin, using only streamflow and in-stream water quality data. The dilute and concentrated quick flow end-members were distinguished using high-frequency specific conductance data. Results indicate that dilute quick flow contributed less than 5% of the nitrate load at both sites, whereas 89 ± 8% of the nitrate load at the groundwater-dominated stream was from slowflow groundwater, and 84 ± 25% of the nitrate load at the quick-flow-dominated stream was from concentrated quick flow. Concentrated quick flow nitrate concentrations varied seasonally at both sites, with peak concentrations in the winter that were 2-3 times greater than minimum concentrations during the growing season. Application of this approach provides an opportunity to assess stream vulnerability to nonpoint source nitrate loading and expected stream responses to current or changing conditions and practices in watersheds.
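The abstract above uses specific conductance to tell dilute from concentrated quick flow. As a generic illustration of end-member reasoning, and not the authors' three-component hydrograph separation, a textbook two-end-member mixing calculation can be sketched as follows; the function name and the conductance values are hypothetical.

```python
def end_member_fractions(mixture, end_member_a, end_member_b):
    """Fractions of two end-members in a mixture of a conservative tracer.

    Solves mixture = f_a * end_member_a + (1 - f_a) * end_member_b for f_a,
    then clips to [0, 1] so measurement noise cannot give negative shares.
    """
    if end_member_a == end_member_b:
        raise ValueError("end-members must have distinct tracer values")
    fraction_a = (mixture - end_member_b) / (end_member_a - end_member_b)
    fraction_a = min(1.0, max(0.0, fraction_a))
    return fraction_a, 1.0 - fraction_a


if __name__ == "__main__":
    # Hypothetical specific conductance values in microsiemens per cm.
    dilute, concentrated, stream = 50.0, 550.0, 200.0
    f_dilute, f_concentrated = end_member_fractions(stream, dilute, concentrated)
    print(f_dilute, f_concentrated)
```

Multiplying each fraction by streamflow and by that pathway's nitrate concentration would give a pathway load, under the assumption that the tracer mixes conservatively.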
Ding, Xueqin; Li, Li; Wang, Yuzhi; Chen, Jing; Huang, Yanhua; Xu, Kaijia
2014-12-01
A series of novel tetramethylguanidinium ionic liquids and hexaalkylguanidinium ionic liquids have been synthesized based on 1,1,3,3-tetramethylguanidine. The structures of the ionic liquids were confirmed by ¹H NMR spectroscopy and mass spectrometry. A green guanidinium ionic liquid based microwave-assisted extraction method has been developed with these guanidinium ionic liquids for the effective extraction of Praeruptorin A from Radix peucedani. After extraction, reversed-phase high-performance liquid chromatography with UV detection was employed for the analysis of Praeruptorin A. Several significant operating parameters were systematically optimized by single-factor and L9(3^4) orthogonal array experiments. The amount of Praeruptorin A extracted by [1,1,3,3-tetramethylguanidine]CH2CH(OH)COOH is the highest, reaching 11.05 ± 0.13 mg/g. Guanidinium ionic liquid based microwave-assisted extraction presents unique advantages in Praeruptorin A extraction compared with guanidinium ionic liquid based maceration extraction, guanidinium ionic liquid based heat reflux extraction and guanidinium ionic liquid based ultrasound-assisted extraction. The precision, stability, and repeatability of the process were investigated. The mechanisms of guanidinium ionic liquid based microwave-assisted extraction were researched by scanning electron microscopy and IR spectroscopy. All the results show that guanidinium ionic liquid based microwave-assisted extraction has a huge potential in the extraction of bioactive compounds from complex samples. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Component Pin Recognition Using Algorithms Based on Machine Learning
NASA Astrophysics Data System (ADS)
Xiao, Yang; Hu, Hong; Liu, Ze; Xu, Jiangchang
2018-04-01
The purpose of machine vision for a plug-in machine is to improve the machine’s stability and accuracy, and recognition of the component pin is an important part of the vision. This paper focuses on component pin recognition using three different techniques. The first technique involves traditional image processing using the core algorithm for binary large object (BLOB) analysis. The second technique uses the histogram of oriented gradients (HOG) to experimentally compare the effect of the support vector machine (SVM) and adaptive boosting (AdaBoost) meta-algorithm classifiers. The third technique is a deep learning method known as the convolutional neural network (CNN), which identifies the pin by comparing a sample with its training data. The main purpose of the research presented in this paper is to increase the knowledge of learning methods used in the plug-in machine industry in order to achieve better results.
Residual Error Based Anomaly Detection Using Auto-Encoder in SMD Machine Sound.
Oh, Dong Yul; Yun, Il Dong
2018-04-24
Detecting an anomaly or an abnormal situation from given noise is highly useful in an environment where constantly verifying and monitoring a machine is required. As deep learning algorithms are further developed, current studies have focused on this problem. However, there are too many variables to define anomalies, and the human annotation for a large collection of abnormal data labeled at the class-level is very labor-intensive. In this paper, we propose to detect abnormal operation sounds or outliers in a very complex machine along with reducing the data-driven annotation cost. The architecture of the proposed model is based on an auto-encoder, and it uses the residual error, which stands for its reconstruction quality, to identify the anomaly. We assess our model using Surface-Mounted Device (SMD) machine sound, which is very complex, as experimental data, and state-of-the-art performance is successfully achieved for anomaly detection.
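The residual-error idea in the abstract above can be sketched without a neural network: reconstruct each input, measure how far the reconstruction is from the input, and flag inputs whose error exceeds a threshold learned from normal data. In the sketch below the trained auto-encoder is replaced by a deliberately trivial stand-in that always returns the average normal frame, and the mean-plus-k-standard-deviations threshold is an assumed rule; neither is taken from the paper.

```python
from statistics import mean, stdev


def residual_error(frame, reconstruction):
    """Mean squared difference between a frame and its reconstruction."""
    return mean([(x - y) ** 2 for x, y in zip(frame, reconstruction)])


def make_mean_reconstructor(normal_frames):
    """Stand-in for a trained auto-encoder: always outputs the average normal frame."""
    template = [mean(column) for column in zip(*normal_frames)]
    return lambda frame: list(template)


def fit_threshold(normal_frames, reconstruct, k=3.0):
    """Assumed rule: mean + k * std of residual errors on normal data."""
    errors = [residual_error(f, reconstruct(f)) for f in normal_frames]
    return mean(errors) + k * stdev(errors)


def is_anomalous(frame, reconstruct, threshold):
    """Flag a frame whose reconstruction error exceeds the threshold."""
    return residual_error(frame, reconstruct(frame)) > threshold


if __name__ == "__main__":
    # Hypothetical feature frames from normal machine sound.
    normal = [[1.0, 2.0, 3.0], [1.1, 1.9, 3.0], [0.9, 2.1, 3.1], [1.0, 2.0, 2.9]]
    reconstruct = make_mean_reconstructor(normal)
    threshold = fit_threshold(normal, reconstruct)
    print(is_anomalous([1.0, 2.0, 3.0], reconstruct, threshold))
    print(is_anomalous([5.0, -2.0, 9.0], reconstruct, threshold))
```

Only normal data is needed to fit the threshold, which is what lets this style of detector avoid class-level labels for abnormal sounds.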
Machine learning and radiology.
Wang, Shijun; Summers, Ronald M
2012-07-01
In this paper, we give a short introduction to machine learning and survey its applications in radiology. We focused on six categories of applications in radiology: medical image segmentation, registration, computer aided detection and diagnosis, brain function or activity analysis and neurological disease diagnosis from fMR images, content-based image retrieval systems for CT or MRI images, and text analysis of radiology reports using natural language processing (NLP) and natural language understanding (NLU). This survey shows that machine learning plays a key role in many radiology applications. Machine learning identifies complex patterns automatically and helps radiologists make intelligent decisions on radiology data such as conventional radiographs, CT, MRI, and PET images and radiology reports. In many applications, the performance of machine learning-based automatic detection and diagnosis systems has shown to be comparable to that of a well-trained and experienced radiologist. Technology development in machine learning and radiology will benefit from each other in the long run. Key contributions and common characteristics of machine learning techniques in radiology are discussed. We also discuss the problem of translating machine learning applications to the radiology clinical setting, including advantages and potential barriers. Copyright © 2012. Published by Elsevier B.V.
NASA Astrophysics Data System (ADS)
Gao, Wei; Li, Xiang-ru
2017-07-01
Multi-task learning analyzes multiple tasks jointly so as to uncover the correlations among them, and thereby to improve the accuracy of the results. Methods of this kind have been widely applied in machine learning, pattern recognition, computer vision, and other related fields. This paper investigates the application of multi-task learning to estimating the stellar atmospheric parameters, including the surface temperature (Teff), surface gravitational acceleration (log g), and chemical abundance ([Fe/H]). First, the spectral features of the three stellar atmospheric parameters are extracted using the multi-task sparse group Lasso algorithm; then a support vector machine is used to estimate the atmospheric physical parameters. The proposed scheme is evaluated on both the Sloan stellar spectra and the theoretical spectra computed from Kurucz's New Opacity Distribution Function (NEWODF) model. The mean absolute errors (MAEs) on the Sloan spectra are 0.0064 for log (Teff/K), 0.1622 for log (g/(cm s^-2)), and 0.1221 dex for [Fe/H]; the MAEs on the synthetic spectra are 0.0006 for log (Teff/K), 0.0098 for log (g/(cm s^-2)), and 0.0082 dex for [Fe/H]. Experimental results show that the proposed scheme has a rather high accuracy for the estimation of stellar atmospheric parameters.
Streamlining machine learning in mobile devices for remote sensing
NASA Astrophysics Data System (ADS)
Coronel, Andrei D.; Estuar, Ma. Regina E.; Garcia, Kyle Kristopher P.; Dela Cruz, Bon Lemuel T.; Torrijos, Jose Emmanuel; Lim, Hadrian Paulo M.; Abu, Patricia Angela R.; Victorino, John Noel C.
2017-09-01
Mobile devices have been at the forefront of Intelligent Farming because of their ubiquitous nature. Applications for precision farming have been developed on smartphones to allow small farms to monitor environmental parameters surrounding crops. Mobile devices are used for most of these applications, collecting data to be sent to the cloud for storage, analysis, modeling and visualization. However, with the issue of weak and intermittent connectivity in geographically challenged areas of the Philippines, the solution is to provide analysis on the phone itself. Given this, the farmer gets a real-time response after data submission. Though Machine Learning is promising, hardware constraints in mobile devices limit the computational capabilities, making model development on the phone restricted and challenging. This study discusses the development of a Machine Learning based mobile application using OpenCV libraries. The objective is to enable the detection of Fusarium oxysporum cubense (Foc) in juvenile and asymptomatic bananas using images of plant parts and microscopic samples as input. Image datasets of attached, unattached, dorsal, and ventral views of leaves were acquired through sampling protocols. Images of specimens from soil surrounding the plant and of sap from the plant yielded stained and unstained samples, respectively. Segmentation and feature extraction techniques were applied to all images. Initial findings show no significant differences among the different feature extraction techniques. For differentiating infected from non-infected leaves, KNN yields the highest average accuracy, as opposed to Naive Bayes and SVM. For microscopic images using MSER feature extraction, KNN showed better accuracy than SVM or Naive Bayes.
The Integration of Project-Based Methodology into Teaching in Machine Translation
ERIC Educational Resources Information Center
Madkour, Magda
2016-01-01
This quantitative-qualitative analytical research aimed at investigating the effect of integrating project-based teaching methodology into teaching machine translation on students' performance. Data was collected from the graduate students in the College of Languages and Translation, at Imam Muhammad Ibn Saud Islamic University, Riyadh, Saudi…
Method and system for fault accommodation of machines
NASA Technical Reports Server (NTRS)
Goebel, Kai Frank (Inventor); Subbu, Rajesh Venkat (Inventor); Rausch, Randal Thomas (Inventor); Frederick, Dean Kimball (Inventor)
2011-01-01
A method for multi-objective fault accommodation using predictive modeling is disclosed. The method includes using a simulated machine that simulates a faulted actual machine, and using a simulated controller that simulates an actual controller. A multi-objective optimization process is performed, based on specified control settings for the simulated controller and specified operational scenarios for the simulated machine controlled by the simulated controller, to generate a Pareto frontier-based solution space relating performance of the simulated machine to settings of the simulated controller, including adjustment to the operational scenarios to represent a fault condition of the simulated machine. Control settings of the actual controller are adjusted, represented by the simulated controller, for controlling the actual machine, represented by the simulated machine, in response to a fault condition of the actual machine, based on the Pareto frontier-based solution space, to maximize desirable operational conditions and minimize undesirable operational conditions while operating the actual machine in a region of the solution space defined by the Pareto frontier.
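The Pareto frontier mentioned in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the candidate settings, the two objectives, and the convention that both objectives are costs to be minimised are assumptions made only for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one.

    Assumption: every objective is a cost, so smaller is better.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical (cost_1, cost_2) outcomes of five simulated controller settings.
candidates = [(1.0, 9.0), (2.0, 7.0), (3.0, 8.0), (4.0, 3.0), (5.0, 4.0)]
front = pareto_front(candidates)
print(front)
```

Here (3.0, 8.0) is dominated by (2.0, 7.0) and (5.0, 4.0) by (4.0, 3.0), so three non-dominated settings remain. A desirable quantity to be maximised can be handled in this sketch by negating it.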
Differential spatial activity patterns of acupuncture by a machine learning based analysis
NASA Astrophysics Data System (ADS)
You, Youbo; Bai, Lijun; Xue, Ting; Zhong, Chongguang; Liu, Zhenyu; Tian, Jie
2011-03-01
Acupoint specificity, lying at the core of Traditional Chinese Medicine, underlies the theoretical basis of acupuncture application. However, recent studies have reported that acupuncture stimulation at nonacupoints and acupoints can both evoke similar signal intensity decreases in multiple regions, and that these regions overlap spatially. We used a machine learning based Support Vector Machine (SVM) approach to elucidate the specific neural response pattern induced by acupuncture stimulation. Group analysis demonstrated that stimulation at two different acupoints (belonging to the same nerve segment but different meridians) could elicit distinct neural response patterns. Our findings may provide evidence for acupoint specificity.
NASA Astrophysics Data System (ADS)
Teffahi, Hanane; Yao, Hongxun; Belabid, Nasreddine; Chaib, Souleyman
2018-02-01
Satellite images with very high spatial resolution have recently been widely used in image classification, which has become a challenging task in the remote sensing field. Due to a number of limitations such as the redundancy of features and the high dimensionality of the data, different classification methods have been proposed for remote sensing image classification, particularly methods using feature extraction techniques. This paper proposes a simple, efficient method exploiting the capability of extended multi-attribute profiles (EMAP) with a sparse autoencoder (SAE) for remote sensing image classification. The proposed method is used to classify various remote sensing datasets, including hyperspectral and multispectral images, by extracting spatial and spectral features based on the combination of EMAP and SAE and linking them to a kernel support vector machine (SVM) for classification. Experiments on the new hyperspectral image "Huston data" and the multispectral image "Washington DC data" show that this new scheme can achieve better feature-learning performance than the primitive features, traditional classifiers and an ordinary autoencoder, and has huge potential to achieve higher classification accuracy in a short running time.
NASA Astrophysics Data System (ADS)
Wen, Hongwei; Liu, Yue; Wang, Jieqiong; Zhang, Jishui; Peng, Yun; He, Huiguang
2016-03-01
Tourette syndrome (TS) is a developmental neuropsychiatric disorder with the cardinal symptoms of motor and vocal tics, which emerges in early childhood and fluctuates in severity in later years. To date, the neural basis of TS is not fully understood, and TS has a long-term prognosis that is difficult to estimate accurately. Few studies have looked at the potential of using diffusion tensor imaging (DTI) in conjunction with machine learning algorithms to automate the classification of healthy children and TS children. Here we apply the Tract-Based Spatial Statistics (TBSS) method to 44 TS children and 48 age- and gender-matched healthy children in order to extract the diffusion values from each voxel in the white matter (WM) skeleton, and a feature selection algorithm (ReliefF) was used to select the most salient voxels for subsequent classification with a support vector machine (SVM). We use nested cross validation to yield an unbiased assessment of the classification method and prevent overestimation. Peak performance of the SVM classifier, with an accuracy of 88.04%, a sensitivity of 88.64% and a specificity of 87.50%, was achieved using the axial diffusion (AD) metric, demonstrating the potential of a joint TBSS and SVM pipeline for fast, objective classification of healthy and TS children. These results suggest that our methods may be useful for the early identification of subjects with TS, and hold promise for predicting prognosis and treatment outcome for individuals with TS.
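The three reported rates can be checked against the group sizes with a short worked example. The confusion-matrix counts below are not given in the abstract; they are inferred as the only whole-number counts for 44 TS and 48 healthy children that reproduce the reported percentages, and treating TS as the positive class is an assumption.

```python
def rates(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # share of TS children correctly flagged
    specificity = tn / (tn + fp)          # share of healthy children correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Inferred counts (assumption): 39 of 44 TS and 42 of 48 healthy children correct.
acc, sens, spec = rates(tp=39, fn=5, tn=42, fp=6)
summary = tuple(round(100 * v, 2) for v in (acc, sens, spec))
print(summary)  # (88.04, 88.64, 87.5)
```

That is, 81 of 92 children classified correctly gives 88.04%, 39/44 gives 88.64%, and 42/48 gives 87.50%.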
Modeling and simulation of five-axis virtual machine based on NX
NASA Astrophysics Data System (ADS)
Li, Xiaoda; Zhan, Xianghui
2018-04-01
Virtual technology plays a growing role in the machinery manufacturing industry. In this paper, the Siemens NX software is used to model a virtual CNC machine tool, and the parameters of the virtual machine are defined according to the actual parameters of the machine tool so that the virtual simulation can be carried out without loss of accuracy. The use of the machine builder of the CAM module to define the kinematic chain and machine components is described. The simulation of the virtual machine can provide users with alarm information on tool collision and overcutting during the process, and can evaluate and forecast the rationality of the technological process.
Thutmose - Investigation of Machine Learning-Based Intrusion Detection Systems
2016-06-01
research is being done to incorporate the field of machine learning into intrusion detection. Machine learning is a branch of artificial intelligence (AI...adversarial drift." Proceedings of the 2013 ACM workshop on Artificial intelligence and security. ACM. (2013) Kantarcioglu, M., Xi, B., and Clifton, C. "A..." Proceedings of the 4th ACM workshop on Security and artificial intelligence. ACM. (2011) Dua, S., and Du, X. Data Mining and Machine Learning in
Automatic detection system of shaft part surface defect based on machine vision
NASA Astrophysics Data System (ADS)
Jiang, Lixing; Sun, Kuoyuan; Zhao, Fulai; Hao, Xiangyang
2015-05-01
Surface physical damage detection is an important part of shaft part quality inspection, and traditional detection methods mostly rely on human visual identification, which has many disadvantages such as low efficiency and poor reliability. In order to improve the automation level of shaft part quality inspection and establish a relevant industry quality standard, a machine vision inspection system connected to an MCU was designed to perform surface inspection of shaft parts. The system adopts a monochrome line-scan digital camera and uses dark-field and forward illumination to acquire high-contrast images; after image filtering and enhancement, the images are segmented into binary images by the maximum between-cluster variance method; the main contours are then extracted based on aspect ratio and area criteria; next, the coordinates of the centre of gravity of each defect area, namely the locating point coordinates, are calculated; finally, the locations of the defect areas are marked by a coding pen that communicates with the MCU. Experiments show that no defect was missed and the false alarm rate was lower than 5%, which indicates that the designed system meets the demands of on-line, real-time detection of shaft parts.
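The segmentation and locating steps described above can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: it assumes 8-bit grey levels, assumes defects appear brighter than the background, uses a tiny hypothetical image, and omits the filtering and contour-selection steps.

```python
def otsu_threshold(pixels, levels=256):
    """Pick the grey level that maximises the between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with grey level <= t
    sum0 = 0    # sum of grey levels of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2   # proportional to the variance
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def centroid(mask):
    """Centre of gravity (row, column) of the non-zero cells of a binary mask."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    n = len(coords)
    return (sum(r for r, _ in coords) / n, sum(c for _, c in coords) / n)


# Hypothetical 3x4 image: dark background with a bright 2x2 defect.
image = [
    [10, 12, 10, 12],
    [10, 200, 210, 12],
    [12, 200, 210, 10],
]
threshold = otsu_threshold([p for row in image for p in row])
mask = [[1 if p > threshold else 0 for p in row] for row in image]
locating_point = centroid(mask)
print(threshold, locating_point)
```

The centroid here covers all pixels above the threshold; a real system would compute one centroid per connected defect region.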
NASA Astrophysics Data System (ADS)
Sizov, Gennadi Y.
In this dissertation, a model-based multi-objective optimal design of permanent magnet ac machines, supplied by sine-wave current regulated drives, is developed and implemented. The design procedure uses an efficient electromagnetic finite element-based solver to accurately model nonlinear material properties and complex geometric shapes associated with magnetic circuit design. Application of an electromagnetic finite element-based solver allows for accurate computation of intricate performance parameters and characteristics. The first contribution of this dissertation is the development of a rapid computational method that allows accurate and efficient exploration of large multi-dimensional design spaces in search of optimum design(s). The computationally efficient finite element-based approach developed in this work provides a framework of tools that allow rapid analysis of synchronous electric machines operating under steady-state conditions. In the developed modeling approach, major steady-state performance parameters such as, winding flux linkages and voltages, average, cogging and ripple torques, stator core flux densities, core losses, efficiencies and saturated machine winding inductances, are calculated with minimum computational effort. In addition, the method includes means for rapid estimation of distributed stator forces and three-dimensional effects of stator and/or rotor skew on the performance of the machine. The second contribution of this dissertation is the development of the design synthesis and optimization method based on a differential evolution algorithm. The approach relies on the developed finite element-based modeling method for electromagnetic analysis and is able to tackle large-scale multi-objective design problems using modest computational resources. Overall, computational time savings of up to two orders of magnitude are achievable, when compared to current and prevalent state-of-the-art methods. These computational savings allow
Perspectives on Machine Learning for Classification of Schizotypy Using fMRI Data.
Madsen, Kristoffer H; Krohne, Laerke G; Cai, Xin-Lu; Wang, Yi; Chan, Raymond C K
2018-03-15
Functional magnetic resonance imaging is capable of estimating functional activation and connectivity in the human brain, and lately there has been increased interest in the use of these functional modalities combined with machine learning for identification of psychiatric traits. While these methods bear great potential for early diagnosis and better understanding of disease processes, there are wide ranges of processing choices and pitfalls that may severely hamper interpretation and generalization performance unless carefully considered. In this perspective article, we aim to motivate the use of machine learning in schizotypy research. To this end, we describe common data processing steps while commenting on best practices and procedures. First, we introduce the important role of schizotypy to motivate the importance of reliable classification, and summarize existing machine learning literature on schizotypy. Then, we describe procedures for extraction of features based on fMRI data, including statistical parametric mapping, parcellation, complex network analysis, and decomposition methods, as well as classification with a special focus on support vector classification and deep learning. We provide more detailed descriptions and software as supplementary material. Finally, we present current challenges in machine learning for classification of schizotypy and comment on future trends and perspectives.
Skipping the real world: Classification of PolSAR images without explicit feature extraction
NASA Astrophysics Data System (ADS)
Hänsch, Ronny; Hellwich, Olaf
2018-06-01
The typical processing chain for pixel-wise classification from PolSAR images starts with an optional preprocessing step (e.g. speckle reduction), continues with extracting features projecting the complex-valued data into the real domain (e.g. by polarimetric decompositions) which are then used as input for a machine-learning based classifier, and ends in an optional postprocessing (e.g. label smoothing). The extracted features are usually hand-crafted as well as preselected and represent (a somewhat arbitrary) projection from the complex to the real domain in order to fit the requirements of standard machine-learning approaches such as Support Vector Machines or Artificial Neural Networks. This paper proposes to adapt the internal node tests of Random Forests to work directly on the complex-valued PolSAR data, which makes any explicit feature extraction obsolete. This approach leads to a classification framework with a significantly decreased computation time and memory footprint since no image features have to be computed and stored beforehand. The experimental results on one fully-polarimetric and one dual-polarimetric dataset show that, despite the simpler approach, accuracy can be maintained (decreased by only less than 2 % for the fully-polarimetric dataset) or even improved (increased by roughly 9 % for the dual-polarimetric dataset).
Zhao, Jiangsan; Bodner, Gernot; Rewald, Boris
2016-01-01
Phenotyping local crop cultivars is becoming more and more important, as they are an important genetic source for breeding – especially in regard to inherent root system architectures. Machine learning algorithms are promising tools to assist in the analysis of complex data sets; novel approaches are needed to apply them to root phenotyping data of mature plants. A greenhouse experiment was conducted in large, sand-filled columns to differentiate 16 European Pisum sativum cultivars based on 36 manually derived root traits. Through combining random forest and support vector machine models, machine learning algorithms were successfully used for unbiased identification of most distinguishing root traits and subsequent pairwise cultivar differentiation. Up to 86% of pea cultivar pairs could be distinguished based on the top five important root traits (Timp5) – Timp5 differed widely between cultivar pairs. Selecting top important root traits (Timp) provided a significantly improved classification compared to using all available traits or randomly selected trait sets. The most frequent Timp of mature pea cultivars was total surface area of lateral roots originating from tap root segments at 0–5 cm depth. The high classification rate implies that culturing did not lead to a major loss of variability in root system architecture in the studied pea cultivars. Our results illustrate the potential of machine learning approaches for unbiased (root) trait selection and cultivar classification based on rather small, complex phenotypic data sets derived from pot experiments. Powerful statistical approaches are essential to make use of the increasing amount of (root) phenotyping information, integrating the complex trait sets describing crop cultivars. PMID:27999587
Sevenster, M; Buurman, J; Liu, P; Peters, J F; Chang, P J
2015-01-01
Accumulating quantitative outcome parameters may contribute to constructing a healthcare organization in which outcomes of clinical procedures are reproducible and predictable. In imaging studies, measurements are the principal category of quantitative parameters. The purpose of this work is to develop and evaluate two natural language processing engines that extract finding and organ measurements from narrative radiology reports and to categorize extracted measurements by their "temporality". The measurement extraction engine is developed as a set of regular expressions. The engine was evaluated against a manually created ground truth. Automated categorization of measurement temporality is defined as a machine learning problem. A ground truth was manually developed based on a corpus of radiology reports. A maximum entropy model was created using features that characterize the measurement itself and its narrative context. The model was evaluated in a ten-fold cross validation protocol. The measurement extraction engine has precision 0.994 and recall 0.991. Accuracy of the measurement classification engine is 0.960. The work contributes to machine understanding of radiology reports and may find application in software applications that process medical data.
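A regular-expression measurement extractor of the kind described above might look like the following sketch. The pattern, the supported units (mm and cm), the function name, and the sample report sentence are all hypothetical; the authors' actual expressions are not given in the abstract and are certainly more extensive.

```python
import re

NUMBER = r"\d+(?:\.\d+)?"
# One to three dimensions separated by "x", followed by a unit (assumed: mm or cm).
MEASUREMENT = re.compile(
    rf"({NUMBER}(?:\s*x\s*{NUMBER}){{0,2}})\s*(mm|cm)\b",
    re.IGNORECASE,
)


def extract_measurements(text):
    """Return a list of (dimensions, unit) pairs found in a report sentence."""
    results = []
    for match in MEASUREMENT.finditer(text):
        dims = [float(d) for d in re.findall(NUMBER, match.group(1))]
        results.append((dims, match.group(2).lower()))
    return results


# Hypothetical report text, written for this example.
report = "Liver lesion measures 2.3 x 1.7 cm, previously 19 mm. No new nodules."
found = extract_measurements(report)
print(found)
```

In this sentence the word "previously" is the kind of narrative context a temporality classifier could use to separate a prior measurement from a current one; that classification step is not sketched here.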
Feature Extraction and Machine Learning for the Classification of Brazilian Savannah Pollen Grains
Souza, Junior Silva; da Silva, Gercina Gonçalves
2016-01-01
The classification of pollen species and types is an important task in many areas like forensic palynology, archaeological palynology and melissopalynology. This paper presents the first annotated image dataset for the Brazilian Savannah pollen types that can be used to train and test computer vision based automatic pollen classifiers. A first baseline human and computer performance for this dataset has been established using 805 pollen images of 23 pollen types. In order to assess the computer performance, a combination of three feature extractors and four machine learning techniques has been implemented, fine tuned and tested. The results of these tests are also presented in this paper. PMID:27276196
Quantification of uncertainty in machining operations for on-machine acceptance.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Claudet, Andre A.; Tran, Hy D.; Su, Jiann-Chemg
2008-09-01
Manufactured parts are designed with acceptance tolerances, i.e. deviations from ideal design conditions, due to unavoidable errors in the manufacturing process. It is necessary to measure and evaluate the manufactured part, compared to the nominal design, to determine whether the part meets design specifications. The scope of this research project is dimensional acceptance of machined parts; specifically, parts machined using numerically controlled (NC, or also CNC for Computer Numerically Controlled) machines. In the design/build/accept cycle, the designer will specify both a nominal value, and an acceptable tolerance. As part of the typical design/build/accept business practice, it is required to verify that the part did meet acceptable values prior to acceptance. Manufacturing cost must include not only raw materials and added labor, but also the cost of ensuring conformance to specifications. Ensuring conformance is a substantial portion of the cost of manufacturing. In this project, the costs of measurements were approximately 50% of the cost of the machined part. In production, cost of measurement would be smaller, but still a substantial proportion of manufacturing cost. The results of this research project will point to a science-based approach to reducing the cost of ensuring conformance to specifications. The approach that we take is to determine, a priori, how well a CNC machine can manufacture a particular geometry from stock. Based on the knowledge of the manufacturing process, we are then able to decide features which need further measurements from features which can be accepted 'as is' from the CNC. By calibration of the machine tool, and establishing a machining accuracy ratio, we can validate the ability of CNC to fabricate to a particular level of tolerance. This will eliminate the costs of checking for conformance for relatively large tolerances.
Kavuluru, Ramakanth; Han, Sifei; Harris, Daniel
2017-01-01
Diagnosis codes are extracted from medical records for billing and reimbursement and for secondary uses such as quality control and cohort identification. In the US, these codes come from the standard terminology ICD-9-CM derived from the international classification of diseases (ICD). ICD-9 codes are generally extracted by trained human coders by reading all artifacts available in a patient’s medical record following specific coding guidelines. To assist coders in this manual process, this paper proposes an unsupervised ensemble approach to automatically extract ICD-9 diagnosis codes from textual narratives included in electronic medical records (EMRs). Earlier attempts on automatic extraction focused on individual documents such as radiology reports and discharge summaries. Here we use a more realistic dataset and extract ICD-9 codes from EMRs of 1000 inpatient visits at the University of Kentucky Medical Center. Using named entity recognition (NER), graph-based concept-mapping of medical concepts, and extractive text summarization techniques, we achieve an example based average recall of 0.42 with average precision 0.47; compared with a baseline of using only NER, we notice a 12% improvement in recall with the graph-based approach and a 7% improvement in precision using the extractive text summarization approach. Although diagnosis codes are complex concepts often expressed in text with significant long range non-local dependencies, our present work shows the potential of unsupervised methods in extracting a portion of codes. As such, our findings are especially relevant for code extraction tasks where obtaining large amounts of training data is difficult. PMID:28748227
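The "example based" averages quoted above are computed per visit and then averaged across visits. The sketch below uses one common multi-label definition of these scores, which is assumed (not confirmed by the abstract) to match the paper's; the two visits and their code sets are toy data invented for illustration.

```python
def example_based_scores(gold, predicted):
    """Average per-example precision and recall over all examples.

    gold and predicted are parallel lists of sets of codes, one set per visit.
    An empty prediction or empty gold set contributes 0.0 (a convention chosen
    for this sketch).
    """
    precisions, recalls = [], []
    for g, p in zip(gold, predicted):
        hits = len(g & p)
        precisions.append(hits / len(p) if p else 0.0)
        recalls.append(hits / len(g) if g else 0.0)
    n = len(gold)
    return sum(precisions) / n, sum(recalls) / n


# Toy data: two hypothetical inpatient visits.
gold = [{"250.00", "401.9", "428.0"}, {"486", "518.81"}]
predicted = [{"250.00", "401.9"}, {"486", "496", "585.9", "599.0"}]
precision, recall = example_based_scores(gold, predicted)
print(round(precision, 3), round(recall, 3))
```

Visit one scores precision 2/2 and recall 2/3; visit two scores precision 1/4 and recall 1/2; the averages are 0.625 and about 0.583.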
Kim, Jongin; Park, Hyeong-jun
2016-01-01
The purpose of this study is to classify EEG data on imagined speech in a single trial. We recorded EEG data while five subjects imagined different vowels, /a/, /e/, /i/, /o/, and /u/. We divided each single trial dataset into thirty segments and extracted features (mean, variance, standard deviation, and skewness) from all segments. To reduce the dimension of the feature vector, we applied a feature selection algorithm based on the sparse regression model. These features were classified using a support vector machine with a radial basis function kernel, an extreme learning machine, and two variants of an extreme learning machine with different kernels. Because each single trial consisted of thirty segments, our algorithm decided the label of the single trial by selecting the most frequent output among the outputs of the thirty segments. As a result, we observed that the extreme learning machine and its variants achieved better classification rates than the support vector machine with a radial basis function kernel and linear discrimination analysis. Thus, our results suggested that EEG responses to imagined speech could be successfully classified in a single trial using an extreme learning machine with a radial basis function and linear kernel. This study with classification of imagined speech might contribute to the development of silent speech BCI systems. PMID:28097128
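The per-segment features and the trial-level majority vote described above can be sketched as follows. The classifier itself is omitted; the segment labels passed to `vote` stand in for its outputs. The equal-length segmentation, the use of population (not sample) variance, and the tie-breaking rule are assumptions of this sketch.

```python
from collections import Counter
from statistics import fmean, pvariance


def split_into_segments(trial, n_segments=30):
    """Cut one trial into equal-length segments; trailing samples are dropped."""
    size = len(trial) // n_segments
    return [trial[i * size:(i + 1) * size] for i in range(n_segments)]


def segment_features(samples):
    """Mean, variance, standard deviation and skewness of one segment."""
    mu = fmean(samples)
    var = pvariance(samples, mu)
    std = var ** 0.5
    if std == 0:
        skew = 0.0
    else:
        skew = fmean([(x - mu) ** 3 for x in samples]) / std ** 3
    return [mu, var, std, skew]


def vote(segment_labels):
    """Label the trial with the most frequent segment label.

    Ties go to the label encountered first (a choice made for this sketch).
    """
    return Counter(segment_labels).most_common(1)[0][0]


# Hypothetical single-channel trial of 60 samples.
trial = [float(i % 7) for i in range(60)]
features = [segment_features(s) for s in split_into_segments(trial)]
print(len(features), vote(["a", "e", "a", "i", "a"]))
```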
Machine vision based teleoperation aid
NASA Technical Reports Server (NTRS)
Hoff, William A.; Gatrell, Lance B.; Spofford, John R.
1991-01-01
When teleoperating a robot using video from a remote camera, it is difficult for the operator to gauge depth and orientation from a single view. In addition, there are situations where a camera mounted for viewing by the teleoperator during a teleoperation task may not be able to see the tool tip, or the viewing angle may not be intuitive (requiring extensive training to reduce the risk of incorrect or dangerous moves by the teleoperator). A machine vision based teleoperator aid is presented which uses the operator's camera view to compute an object's pose (position and orientation), and then overlays onto the operator's screen information on the object's current and desired positions. The operator can choose to display orientation and translation information as graphics and/or text. This aid provides easily assimilated depth and relative orientation information to the teleoperator. The camera may be mounted at any known orientation relative to the tool tip. A preliminary experiment with human operators was conducted and showed that task accuracies were significantly greater with than without this aid.
Cooperating reduction machines
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kluge, W.E.
1983-11-01
This paper presents a concept and a system architecture for the concurrent execution of program expressions of a concrete reduction language based on lambda-expressions. If formulated appropriately, these expressions are well-suited for concurrent execution, following a demand-driven model of computation. In particular, recursive program expressions with nonlinear expansion may, at run time, recursively be partitioned into a hierarchy of independent subexpressions which can be reduced by a corresponding hierarchy of virtual reduction machines. This hierarchy unfolds and collapses dynamically, with virtual machines recursively assuming the role of masters that create and eventually terminate, or synchronize with, slaves. The paper also proposes a nonhierarchically organized system of reduction machines, each featuring a stack architecture, that effectively supports the allocation of virtual machines to the real machines of the system in compliance with their hierarchical order of creation and termination. 25 references.
Machine Learning: A Crucial Tool for Sensor Design
Zhao, Weixiang; Bhushan, Abhinav; Santamaria, Anthony D.; Simon, Melinda G.; Davis, Cristina E.
2009-01-01
Sensors have been widely used for disease diagnosis, environmental quality monitoring, food quality control, industrial process analysis and control, and other related fields. As a key tool for sensor data analysis, machine learning is becoming a core part of novel sensor design. Dividing a complete machine learning process into three steps: data pre-treatment, feature extraction and dimension reduction, and system modeling, this paper provides a review of the methods that are widely used for each step. For each method, the principles and the key issues that affect modeling results are discussed. After reviewing the potential problems in machine learning processes, this paper gives a summary of current algorithms in this field and provides some feasible directions for future studies. PMID:20191110
Machine learning for Big Data analytics in plants.
Ma, Chuang; Zhang, Hao Helen; Wang, Xiangfeng
2014-12-01
Rapid advances in high-throughput genomic technology have enabled biology to enter the era of 'Big Data' (large datasets). The plant science community not only needs to build its own Big-Data-compatible parallel computing and data management infrastructures, but also to seek novel analytical paradigms to extract information from the overwhelming amounts of data. Machine learning offers promising computational and analytical solutions for the integrative analysis of large, heterogeneous and unstructured datasets on the Big-Data scale, and is gradually gaining popularity in biology. This review introduces the basic concepts and procedures of machine-learning applications and envisages how machine learning could interface with Big Data technology to facilitate basic research and biotechnology in the plant sciences. Copyright © 2014 Elsevier Ltd. All rights reserved.
Heuristic for Critical Machine Based a Lot Streaming for Two-Stage Hybrid Production Environment
NASA Astrophysics Data System (ADS)
Vivek, P.; Saravanan, R.; Chandrasekaran, M.; Pugazhenthi, R.
2017-03-01
Lot streaming in a hybrid flowshop (HFS) is encountered in many real-world problems. This paper deals with a heuristic approach for lot streaming based on critical machine consideration for a two-stage hybrid flowshop. The first stage has two identical parallel machines and the second stage has only one machine. The second-stage machine is considered critical for valid reasons; problems of this kind are known to be NP-hard. A mathematical model was developed for the selected problem. The simulation modelling and analysis were carried out in Extend V6 software. A heuristic was developed for obtaining an optimal lot streaming schedule. Eleven cases of lot streaming were considered. The proposed heuristic was verified and validated by real-time simulation experiments. All possible lot streaming strategies and possible sequences under each lot streaming strategy were simulated and examined. The heuristic consistently yielded the optimal schedule in all eleven cases. A procedure for identifying the best lot streaming strategy was suggested.
ClearTK 2.0: Design Patterns for Machine Learning in UIMA.
Bethard, Steven; Ogren, Philip; Becker, Lee
2014-05-01
ClearTK adds machine learning functionality to the UIMA framework, providing wrappers to popular machine learning libraries, a rich feature extraction library that works across different classifiers, and utilities for applying and evaluating machine learning models. Since its inception in 2008, ClearTK has evolved in response to feedback from developers and the community. This evolution has followed a number of important design principles including: conceptually simple annotator interfaces, readable pipeline descriptions, minimal collection readers, type system agnostic code, modules organized for ease of import, and assisting user comprehension of the complex UIMA framework.
MLBCD: a machine learning tool for big clinical data.
Luo, Gang
2015-01-01
Predictive modeling is fundamental for extracting value from large clinical data sets, or "big clinical data," advancing clinical research, and improving healthcare. Machine learning is a powerful approach to predictive modeling. Two factors make machine learning challenging for healthcare researchers. First, before training a machine learning model, the values of one or more model parameters called hyper-parameters must typically be specified. Due to their inexperience with machine learning, it is hard for healthcare researchers to choose an appropriate algorithm and hyper-parameter values. Second, many clinical data are stored in a special format. These data must be iteratively transformed into the relational table format before conducting predictive modeling. This transformation is time-consuming and requires computing expertise. This paper presents our vision for and design of MLBCD (Machine Learning for Big Clinical Data), a new software system aiming to address these challenges and facilitate building machine learning predictive models using big clinical data. The paper describes MLBCD's design in detail. By making machine learning accessible to healthcare researchers, MLBCD will open the use of big clinical data and increase the ability to foster biomedical discovery and improve care.
S V, Mahesh Kumar; R, Gunasundari
2018-06-02
Eye disease is a major health problem among elderly people. Cataract and corneal arcus are the major abnormalities that exist in the anterior segment eye region of aged people. Hence, computer-aided diagnosis of anterior segment eye abnormalities will be helpful for mass screening and grading in ophthalmology. In this paper, we propose a multiclass computer-aided diagnosis (CAD) system using visible wavelength (VW) eye images to diagnose anterior segment eye abnormalities. In the proposed method, the input VW eye images are pre-processed for specular reflection removal and the iris circle region is segmented using a circular Hough Transform (CHT)-based approach. The first-order statistical features and wavelet-based features are extracted from the segmented iris circle and used for classification. A Support Vector Machine (SVM) trained with the Sequential Minimal Optimization (SMO) algorithm was used for the classification. In experiments, we used 228 VW eye images that belong to three different classes of anterior segment eye abnormalities. The proposed method achieved a predictive accuracy of 96.96% with 97% sensitivity and 99% specificity. The experimental results show that the proposed method has significant potential for use in clinical applications.
Bisgin, Halil; Bera, Tanmay; Ding, Hongjian; Semey, Howard G; Wu, Leihong; Liu, Zhichao; Barnes, Amy E; Langley, Darryl A; Pava-Ripoll, Monica; Vyas, Himansu J; Tong, Weida; Xu, Joshua
2018-04-25
Insect pests, such as pantry beetles, are often associated with food contamination and public health risks. Machine learning has the potential to provide a more accurate and efficient solution for detecting their presence in food products, which is currently done manually. In our previous research, we demonstrated the feasibility of implementing Artificial Neural Network (ANN) based pattern recognition techniques for species identification in the context of food safety. In this study, we present a Support Vector Machine (SVM) model which improved the average accuracy up to 85%. In contrast, the ANN method yielded ~80% accuracy after extensive parameter optimization. Both methods showed excellent genus level identification, but SVM showed slightly better accuracy for most species. Highly accurate species level identification remains a challenge, especially in distinguishing between species from the same genus, which may require improvements in both imaging and machine learning techniques. In summary, our work illustrates a new SVM based technique and provides a good comparison with the ANN model in our context. We believe such insights will pave a better way forward for the application of machine learning towards species identification and food safety.
Classification and machine recognition of severe weather patterns
NASA Technical Reports Server (NTRS)
Wang, P. P.; Burns, R. C.
1976-01-01
Forecasting and warning of severe weather conditions are treated from the vantage point of pattern recognition by machine. Pictorial patterns and waveform patterns are distinguished. Time series data on sferics are dealt with by considering waveform patterns. A severe storm patterns recognition machine is described, along with schemes for detection via cross-correlation of time series (same channel or different channels). Syntactic and decision-theoretic approaches to feature extraction are discussed. Active and decayed tornados and thunderstorms, lightning discharges, and funnels and their related time series data are studied.
A machine learning-based framework to identify type 2 diabetes through electronic health records
Zheng, Tao; Xie, Wei; Xu, Liling; He, Xiaoying; Zhang, Ya; You, Mingrong; Yang, Gong; Chen, You
2016-01-01
Objective: To discover diverse genotype-phenotype associations affiliated with Type 2 Diabetes Mellitus (T2DM) via genome-wide association study (GWAS) and phenome-wide association study (PheWAS), more cases (T2DM subjects) and controls (subjects without T2DM) are required to be identified (e.g., via Electronic Health Records (EHR)). However, existing expert based identification algorithms often suffer from a low recall rate and could miss a large number of valuable samples under conservative filtering standards. The goal of this work is to develop a semi-automated framework based on machine learning as a pilot study to liberalize filtering criteria and improve the recall rate while keeping a low false positive rate. Materials and methods: We propose a data informed framework for identifying subjects with and without T2DM from EHR via feature engineering and machine learning. We evaluate and contrast the identification performance of widely-used machine learning models within our framework, including k-Nearest-Neighbors, Naïve Bayes, Decision Tree, Random Forest, Support Vector Machine and Logistic Regression. Our framework was conducted on 300 patient samples (161 cases, 60 controls and 79 unconfirmed subjects), randomly selected from 23,281 diabetes related cohort retrieved from a regional distributed EHR repository ranging from 2012 to 2014. Results: We apply top-performing machine learning algorithms on the engineered features. We benchmark and contrast the accuracy, precision, AUC, sensitivity and specificity of classification models against the state-of-the-art expert algorithm for identification of T2DM subjects. Our results indicate that the framework achieved high identification performances (∼0.98 in average AUC), which are much higher than the state-of-the-art algorithm (0.71 in AUC). Discussion: Expert algorithm-based identification of T2DM subjects from EHR is often hampered by the high missing rates due to their conservative selection criteria. Our
A machine learning-based framework to identify type 2 diabetes through electronic health records.
Zheng, Tao; Xie, Wei; Xu, Liling; He, Xiaoying; Zhang, Ya; You, Mingrong; Yang, Gong; Chen, You
2017-01-01
To discover diverse genotype-phenotype associations affiliated with Type 2 Diabetes Mellitus (T2DM) via genome-wide association study (GWAS) and phenome-wide association study (PheWAS), more cases (T2DM subjects) and controls (subjects without T2DM) are required to be identified (e.g., via Electronic Health Records (EHR)). However, existing expert based identification algorithms often suffer from a low recall rate and could miss a large number of valuable samples under conservative filtering standards. The goal of this work is to develop a semi-automated framework based on machine learning as a pilot study to liberalize filtering criteria and improve the recall rate while keeping a low false positive rate. We propose a data informed framework for identifying subjects with and without T2DM from EHR via feature engineering and machine learning. We evaluate and contrast the identification performance of widely-used machine learning models within our framework, including k-Nearest-Neighbors, Naïve Bayes, Decision Tree, Random Forest, Support Vector Machine and Logistic Regression. Our framework was conducted on 300 patient samples (161 cases, 60 controls and 79 unconfirmed subjects), randomly selected from 23,281 diabetes related cohort retrieved from a regional distributed EHR repository ranging from 2012 to 2014. We apply top-performing machine learning algorithms on the engineered features. We benchmark and contrast the accuracy, precision, AUC, sensitivity and specificity of classification models against the state-of-the-art expert algorithm for identification of T2DM subjects. Our results indicate that the framework achieved high identification performances (∼0.98 in average AUC), which are much higher than the state-of-the-art algorithm (0.71 in AUC). Expert algorithm-based identification of T2DM subjects from EHR is often hampered by the high missing rates due to their conservative selection criteria. Our framework leverages machine learning and feature
Feature extraction and selection strategies for automated target recognition
NASA Astrophysics Data System (ADS)
Greene, W. Nicholas; Zhang, Yuhan; Lu, Thomas T.; Chao, Tien-Hsin
2010-04-01
Several feature extraction and selection methods for an existing automatic target recognition (ATR) system using JPL's Grayscale Optical Correlator (GOC) and Optimal Trade-Off Maximum Average Correlation Height (OT-MACH) filter were tested using MATLAB. The ATR system is composed of three stages: a cursory region-of-interest (ROI) search using the GOC and OT-MACH filter, a feature extraction and selection stage, and a final classification stage. Feature extraction and selection concerns transforming potential target data into more useful forms as well as selecting important subsets of that data which may aid in detection and classification. The strategies tested were built around two popular extraction methods: Principal Component Analysis (PCA) and Independent Component Analysis (ICA). Performance was measured based on the classification accuracy and free-response receiver operating characteristic (FROC) output of a support vector machine (SVM) and a neural net (NN) classifier.
Nano Mechanical Machining Using AFM Probe
NASA Astrophysics Data System (ADS)
Mostofa, Md. Golam
Complex miniaturized components with high form accuracy will play key roles in the future development of many products, as they provide portability, disposability, lower material consumption in production, low power consumption during operation, lower sample requirements for testing, and higher heat transfer due to their very high surface-to-volume ratio. Given the high market demand for such micro and nano featured components, different manufacturing methods have been developed for their fabrication. Some of the common technologies in micro/nano fabrication are photolithography, electron beam lithography, X-ray lithography and other semiconductor processing techniques. Although these methods are capable of fabricating micro/nano structures with a resolution of less than a few nanometers, some of their shortcomings, such as high production costs for customized products and limited material choices, necessitate the development of other fabrication techniques. Micro/nano mechanical machining, such as atomic force microscope (AFM) probe based nano fabrication, has, therefore, been used to overcome some of the major restrictions of the traditional processes. This technique removes material from the workpiece by engaging a micro/nano-sized cutting tool (i.e. the AFM probe) and is applicable to a wider range of materials than the photolithographic process. In spite of the unique benefits of nano mechanical machining, the reduced scale also brings challenges, such as size effects, burr formation, chip adhesion, fragility of tools and tool wear. Moreover, AFM based machining does not have any rotational movement, which makes fabrication of 3D features more difficult. Thus, vibration-assisted machining is introduced into AFM probe based nano mechanical machining to overcome the limitations associated with the conventional AFM probe based scratching method. Vibration-assisted machining reduced the cutting forces
Kim, Jongin; Lee, Boreom
2018-05-07
Different modalities such as structural MRI, FDG-PET, and CSF have complementary information, which is likely to be very useful for diagnosis of AD and MCI. Therefore, it is possible to develop a more effective and accurate AD/MCI automatic diagnosis method by integrating complementary information of different modalities. In this paper, we propose a multi-modal sparse hierarchical extreme learning machine (MSH-ELM). We used volume and mean intensity extracted from 93 regions of interest (ROIs) as features of MRI and FDG-PET, respectively, and used p-tau, t-tau, and Aβ42 as CSF features. In detail, a high-level representation was individually extracted from each of MRI, FDG-PET, and CSF using a stacked sparse extreme learning machine auto-encoder (sELM-AE). Then, another stacked sELM-AE was devised to acquire a joint hierarchical feature representation by fusing the high-level representations obtained from each modality. Finally, we classified the joint hierarchical feature representation using a kernel-based extreme learning machine (KELM). The results of MSH-ELM were compared with those of conventional ELM, single kernel support vector machine (SK-SVM), multiple kernel support vector machine (MK-SVM) and stacked auto-encoder (SAE). Performance was evaluated through 10-fold cross-validation. In the AD vs. HC and MCI vs. HC classification problems, the proposed MSH-ELM method showed mean balanced accuracies of 96.10% and 86.46%, respectively, which are much better than those of competing methods. In summary, the proposed algorithm exhibits consistently better performance than SK-SVM, ELM, MK-SVM and SAE in the two binary classification problems (AD vs. HC and MCI vs. HC). © 2018 Wiley Periodicals, Inc.
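As a reading aid for the metric reported above: balanced accuracy for a binary problem is the mean of sensitivity and specificity. The following is a minimal Python sketch of that definition only; the function name and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def balanced_accuracy(y_true, y_pred, positive=1):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical labels: 1 = patient, 0 = healthy control.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
score = balanced_accuracy(y_true, y_pred)  # sensitivity 3/4, specificity 1/2
print(score)
```

Unlike plain accuracy, this score does not reward a classifier for simply predicting the majority class, which is presumably why it is preferred when patient and control groups differ in size.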
Park, Eunjeong; Chang, Hyuk-Jae; Nam, Hyo Suk
2017-04-18
The pronator drift test (PDT), a neurological examination, is widely used in clinics to measure motor weakness of stroke patients. The aim of this study was to develop a PDT tool with machine learning classifiers to detect stroke symptoms based on quantification of proximal arm weakness using inertial sensors and signal processing. We extracted features of drift and pronation from accelerometer signals of wearable devices on the inner wrists of 16 stroke patients and 10 healthy controls. Signal processing and feature selection approach were applied to discriminate PDT features used to classify stroke patients. A series of machine learning techniques, namely support vector machine (SVM), radial basis function network (RBFN), and random forest (RF), were implemented to discriminate stroke patients from controls with leave-one-out cross-validation. Signal processing by the PDT tool extracted a total of 12 PDT features from sensors. Feature selection abstracted the major attributes from the 12 PDT features to elucidate the dominant characteristics of proximal weakness of stroke patients using machine learning classification. Our proposed PDT classifiers had an area under the receiver operating characteristic curve (AUC) of .806 (SVM), .769 (RBFN), and .900 (RF) without feature selection, and feature selection improves the AUCs to .913 (SVM), .956 (RBFN), and .975 (RF), representing an average performance enhancement of 15.3%. Sensors and machine learning methods can reliably detect stroke signs and quantify proximal arm weakness. Our proposed solution will facilitate pervasive monitoring of stroke patients. ©Eunjeong Park, Hyuk-Jae Chang, Hyo Suk Nam. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 18.04.2017.
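The reported average enhancement can be checked from the AUC values quoted in the abstract. The sketch below assumes that the 15.3% figure is the mean of the per-classifier relative AUC gains; that interpretation is an assumption, although it does reproduce the figure to one decimal place.

```python
# AUC values as quoted in the abstract above.
auc_without_selection = {"SVM": 0.806, "RBFN": 0.769, "RF": 0.900}
auc_with_selection = {"SVM": 0.913, "RBFN": 0.956, "RF": 0.975}


def relative_gain_percent(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return 100.0 * (after - before) / before


gains = {
    name: relative_gain_percent(auc_without_selection[name], auc_with_selection[name])
    for name in auc_without_selection
}
average_gain = sum(gains.values()) / len(gains)

for name, gain in gains.items():
    print(f"{name}: {gain:.1f}%")
print(f"average: {average_gain:.1f}%")
```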
Improving Machining Accuracy of CNC Machines with Innovative Design Methods
NASA Astrophysics Data System (ADS)
Yemelyanov, N. V.; Yemelyanova, I. V.; Zubenko, V. L.
2018-03-01
The article considers achieving the machining accuracy of CNC machines by applying innovative methods in modelling and design of machining systems, drives and machine processes. The topological method of analysis involves visualizing the system as matrices of block graphs with a varying degree of detail between the upper and lower hierarchy levels. This approach combines the advantages of graph theory and the efficiency of decomposition methods. It also has the visual clarity inherent in both topological models and structural matrices, as well as the resiliency of linear algebra as part of the matrix-based research. The focus of the study is on the design of automated machine workstations, systems, machines and units, which can be broken into interrelated parts and presented as algebraic, topological and set-theoretical models. Every model can be transformed into a model of another type and, as a result, can be interpreted as a system of linear and non-linear equations whose solutions determine the system parameters. This paper analyses the dynamic parameters of the 1716PF4 machine at the stages of design and operation. Having researched the impact of the system dynamics on the component quality, the authors have developed a range of practical recommendations which made it possible to reduce considerably the amplitude of relative motion, exclude some resonance zones within the spindle speed range of 0...6000 min⁻¹ and improve machining accuracy.
Metalworking and machining fluids
Erdemir, Ali; Sykora, Frank; Dorbeck, Mark
2010-10-12
Improved boron-based metal working and machining fluids. Boric acid and boron-based additives that, when mixed with certain carrier fluids, such as water, cellulose and/or cellulose derivatives, polyhydric alcohol, polyalkylene glycol, polyvinyl alcohol, starch, dextrin, in solid and/or solvated forms result in improved metalworking and machining of metallic work pieces. Fluids manufactured with boric acid or boron-based additives effectively reduce friction, prevent galling and severe wear problems on cutting and forming tools.
NASA Astrophysics Data System (ADS)
Scarth, P.; Trevithick, B.; Beutel, T.
2016-12-01
VegMachine Online is a freely available browser application that allows ranchers across Australia to view and interact with satellite derived ground cover state and change maps on their property and extract this information in a graphical format using interactive tools. It supports the delivery and communication of a massive earth observation data set in an accessible, producer-friendly way. Around 250,000 Landsat TM, ETM and OLI images were acquired across Australia, converted to terrain corrected surface reflectance and masked for cloud, cloud shadow, terrain shadow and water. More than 2500 field sites across the Australian rangelands were used to derive endmembers used in a constrained unmixing approach to estimate the per-pixel proportion of bare, green and non-green vegetation for all images. A seasonal medoid compositing method was used to produce national fractional cover virtual mosaics for each three month period since 1988. The time series of green fraction is used to estimate the persistent green due to tree and shrub canopies, and this estimate is used to correct the fractional cover to ground cover for our mixed tree-grass rangeland systems. Finally, deciles are produced for key metrics every season to track a pixel's standing relative to the entire time series. These data are delivered through time series enabled web mapping services and customised web processing services that enable the full time series over any spatial extent to be interrogated in seconds via a RESTful interface. These services interface with a front end browser application that provides product visualization for any date in the time series, tools to draw or import polygon boundaries, plot time series ground cover comparisons, look at the effect of historical rainfall and tools to run the revised universal soil loss equation in web time to assess the effect of proposed changes in cover retention. VegMachine Online is already being used by ranchers monitoring paddock condition
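To illustrate the seasonal compositing step, here is a minimal Python sketch, assuming the method is a medoid composite: for each pixel, keep the observation whose summed distance to all other observations in the season is smallest. The function name and the sample fractions are hypothetical, and the operational system may differ in its distance measure and masking.

```python
import math


def medoid(observations):
    """Return the observation with the smallest summed Euclidean distance to all others."""
    def total_distance(candidate):
        return sum(math.dist(candidate, other) for other in observations)
    return min(observations, key=total_distance)


# Hypothetical (bare, green, non-green) fractions for one pixel within a season.
# The third observation stands in for an outlier, e.g. unmasked contamination.
season = [
    (0.30, 0.50, 0.20),
    (0.28, 0.52, 0.20),
    (0.05, 0.05, 0.90),
    (0.32, 0.48, 0.20),
]
composite = medoid(season)
print(composite)
```

Because the medoid is always one of the actual observations, the composite keeps a physically consistent set of fractions for the pixel, and a single outlier has little influence on which observation is chosen.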
Automatic sentence extraction for the detection of scientific paper relations
NASA Astrophysics Data System (ADS)
Sibaroni, Y.; Prasetiyowati, S. S.; Miftachudin, M.
2018-03-01
The relations between scientific papers help researchers see the interconnections between papers quickly. By observing inter-article relationships, researchers can identify, among other things, the weaknesses of existing research, the performance improvements achieved to date, and the tools or data typically used in research in specific fields. So far, methods that have been developed to detect paper relations include machine learning and rule-based methods. However, a problem still arises in the process of sentence extraction from scientific paper documents, which is still done manually. This manual process makes the detection of scientific paper relations slower and less efficient. To overcome this problem, this study performs automatic sentence extraction, while the paper relations are identified based on the citation sentences. The performance of the resulting system is then compared with that of the manual extraction system. The analysis results suggest that automatic sentence extraction achieves a very high level of performance in the detection of paper relations, close to that of manual sentence extraction.
Interrogating Bronchoalveolar Lavage Samples via Exclusion-Based Analyte Extraction.
Tokar, Jacob J; Warrick, Jay W; Guckenberger, David J; Sperger, Jamie M; Lang, Joshua M; Ferguson, J Scott; Beebe, David J
2017-06-01
Although average survival rates for lung cancer have improved, earlier and better diagnosis remains a priority. One promising approach to assisting earlier and safer diagnosis of lung lesions is bronchoalveolar lavage (BAL), which provides a sample of lung tissue as well as proteins and immune cells from the vicinity of the lesion, yet diagnostic sensitivity remains a challenge. Reproducible isolation of lung epithelia and multianalyte extraction have the potential to improve diagnostic sensitivity and provide new information for developing personalized therapeutic approaches. We present the use of a recently developed exclusion-based, solid-phase-extraction technique called SLIDE (Sliding Lid for Immobilized Droplet Extraction) to facilitate analysis of BAL samples. We developed a SLIDE protocol for lung epithelial cell extraction and biomarker staining of patient BALs, testing both EpCAM and Trop2 as capture antigens. We characterized captured cells using TTF1 and p40 as immunostaining biomarkers of adenocarcinoma and squamous cell carcinoma, respectively. We achieved up to 90% (EpCAM) and 84% (Trop2) extraction efficiency of representative tumor cell lines. We then used the platform to process two patient BAL samples in parallel within the same sample plate to demonstrate feasibility and observed that Trop2-based extraction potentially extracts more target cells than EpCAM-based extraction.
Zhang, Xiaodong; Zeng, Zhen; Liu, Xianlei; Fang, Fengzhou
2015-09-21
Freeform surfaces are promising candidates for next-generation optics; however, they need high form accuracy for excellent performance. A closed loop of fabrication, measurement and compensation is necessary to improve the form accuracy. It is difficult to perform an off-machine measurement during freeform machining because the remounting inaccuracy can result in significant form deviations. On the other hand, on-machine measurement may hide the systematic errors of the machine because the measuring device is placed in situ on the machine. This study proposes a new compensation strategy based on the combination of on-machine and off-machine measurement. The freeform surface is measured in off-machine mode with nanometric accuracy, and the on-machine probe establishes the accurate relative position between the workpiece and the machine after remounting. The compensation cutting path is generated according to the calculated relative position and shape errors, which avoids extra manual adjustment or a highly accurate reference-feature fixture. Experimental results verified the effectiveness of the proposed method.
Lamb wave based damage detection using Matching Pursuit and Support Vector Machine classifier
NASA Astrophysics Data System (ADS)
Agarwal, Sushant; Mitra, Mira
2014-03-01
In this paper, the suitability of using Matching Pursuit (MP) and Support Vector Machine (SVM) for damage detection using the Lamb wave response of a thin aluminium plate is explored. The Lamb wave response of a thin aluminium plate with or without damage is simulated using the finite element method. Simulations are carried out at different frequencies for various kinds of damage. The procedure is divided into two parts: signal processing and machine learning. Firstly, MP is used for denoising and to maintain the sparsity of the dataset. In this study, MP is extended by using a combination of time-frequency functions as the dictionary and is deployed in two stages. Selection of a particular type of atoms leads to extraction of important features while maintaining the sparsity of the waveform. The resultant waveform is then passed as input data to the SVM classifier. SVM is used to detect the location of the potential damage from the reduced data. The study demonstrates that SVM is a robust classifier in the presence of noise and more efficient than an Artificial Neural Network (ANN). Out-of-sample data is used for the validation of the trained and tested classifier. Trained classifiers are found successful in detecting the damage with a detection rate of more than 95%.
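The sparse decomposition step can be sketched in Python as follows. This is generic single-stage matching pursuit over a toy dictionary, shown only to illustrate the greedy idea; the paper's two-stage variant and its time-frequency dictionary are not reproduced, and all names and values here are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def matching_pursuit(signal, dictionary, n_iterations):
    """Greedy sparse approximation: repeatedly pick the unit-norm atom most
    correlated with the residual and subtract its projection."""
    residual = list(signal)
    selected = []  # (atom index, coefficient) pairs, in order of selection
    for _ in range(n_iterations):
        correlations = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(correlations[k]))
        coeff = correlations[best]
        residual = [r - coeff * a for r, a in zip(residual, dictionary[best])]
        selected.append((best, coeff))
    return selected, residual


# Hypothetical toy dictionary: three unit-norm atoms of length 4.
dictionary = [normalize(a) for a in ([1, 1, 1, 1], [1, -1, 1, -1], [1, 0, 0, 0])]
signal = [3.0, 1.0, 3.0, 1.0]  # equals 2*[1, 1, 1, 1] + 1*[1, -1, 1, -1]
selected, residual = matching_pursuit(signal, dictionary, n_iterations=2)
print(selected, residual)
```

In a pipeline like the one described, the selected atom indices and coefficients (rather than the raw waveform) would be the compact features handed to a classifier.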
Discriminative feature-rich models for syntax-based machine translation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dixon, Kevin R.
This report describes the campus executive LDRD “Discriminative Feature-Rich Models for Syntax-Based Machine Translation,” which was an effort to foster a better relationship between Sandia and Carnegie Mellon University (CMU). The primary purpose of the LDRD was to fund the research of a promising graduate student at CMU; in this case, Kevin Gimpel was selected from the pool of candidates. This report gives a brief overview of Kevin Gimpel's research.
NASA Astrophysics Data System (ADS)
Marulcu, Ismail; Barnett, Michael
2016-01-01
Background: Elementary Science Education is struggling with multiple challenges. National and State test results confirm the need for deeper understanding in elementary science education. Moreover, national policy statements and researchers call for increased exposure to engineering and technology in elementary science education. The basic motivation of this study is to suggest a solution to both improving elementary science education and increasing exposure to engineering and technology in it. Purpose/Hypothesis: This mixed-method study examined the impact of an engineering design-based curriculum compared to an inquiry-based curriculum on fifth graders' content learning of simple machines. We hypothesize that the LEGO-engineering design unit is as successful as the inquiry-based unit in terms of students' science content learning of simple machines. Design/Method: We used a mixed-methods approach to investigate our research questions; we compared the control and the experimental groups' scores from the tests and interviews by using Analysis of Covariance (ANCOVA) and compared each group's pre- and post-scores by using paired t-tests. Results: Our findings from the paired t-tests show that both the experimental and comparison groups significantly improved their scores from the pre-test to post-test on the multiple-choice, open-ended, and interview items. Moreover, ANCOVA results show that students in the experimental group, who learned simple machines with the design-based unit, performed significantly better on the interview questions. Conclusions: Our analyses revealed that the design-based Design a people mover: Simple machines unit was, if not better, as successful as the inquiry-based FOSS Levers and pulleys unit in terms of students' science content learning.
A Symbiotic Brain-Machine Interface through Value-Based Decision Making
Mahmoudi, Babak; Sanchez, Justin C.
2011-01-01
Background In the development of Brain Machine Interfaces (BMIs), there is a great need to enable users to interact with changing environments during the activities of daily life. It is expected that the number and scope of the learning tasks encountered during interaction with the environment as well as the pattern of brain activity will vary over time. These conditions, in addition to neural reorganization, pose a challenge to decoding neural commands for BMIs. We have developed a new BMI framework in which a computational agent symbiotically decoded users' intended actions by utilizing both motor commands and goal information directly from the brain through a continuous Perception-Action-Reward Cycle (PARC). Methodology The control architecture designed was based on Actor-Critic learning, which is a PARC-based reinforcement learning method. Our neurophysiology studies in rat models suggested that Nucleus Accumbens (NAcc) contained a rich representation of goal information in terms of predicting the probability of earning reward and it could be translated into an evaluative feedback for adaptation of the decoder with high precision. Simulated neural control experiments showed that the system was able to maintain high performance in decoding neural motor commands during novel tasks or in the presence of reorganization in the neural input. We then implanted a dual micro-wire array in the primary motor cortex (M1) and the NAcc of rat brain and implemented a full closed-loop system in which robot actions were decoded from the single unit activity in M1 based on an evaluative feedback that was estimated from NAcc. Conclusions Our results suggest that adapting the BMI decoder with an evaluative feedback that is directly extracted from the brain is a possible solution to the problem of operating BMIs in changing environments with dynamic neural signals. During closed-loop control, the agent was able to solve a reaching task by capturing the action and reward
Koul, Atesh; Becchio, Cristina; Cavallo, Andrea
2017-12-12
Recent years have seen an increased interest in machine learning-based predictive methods for analyzing quantitative behavioral data in experimental psychology. While these methods can achieve relatively greater sensitivity compared to conventional univariate techniques, they still lack an established and accessible implementation. The aim of the current work was to build an open-source R toolbox, "PredPsych", that could make these methods readily available to all psychologists. PredPsych is a user-friendly R toolbox based on machine-learning predictive algorithms. In this paper, we present the framework of PredPsych via the analysis of a recently published multiple-subject motion capture dataset. In addition, we discuss examples of possible research questions that can be addressed with the machine-learning algorithms implemented in PredPsych and cannot be easily addressed with univariate statistical analysis. We anticipate that PredPsych will be of use to researchers with limited programming experience not only in the field of psychology, but also in that of clinical neuroscience, enabling computational assessment of putative bio-behavioral markers for both prognosis and diagnosis.
Bond strength and interactions of machined titanium-based alloy with dental cements.
Wadhwani, Chandur; Chung, Kwok-Hung
2015-11-01
The most appropriate luting agent for restoring cement-retained implant restorations has yet to be determined. Leachable chemicals from some types of cement designed for teeth may affect metal surfaces. The purpose of this in vitro study was to evaluate the shear bond strength and interactions of machined titanium-based alloy with dental luting agents. Eight dental luting agents representative of 4 different compositional classes (resin, polycarboxylate, glass ionomer, and zinc oxide-based cements) were used to evaluate their effect on machined titanium-6 aluminum-4 vanadium (Ti-6Al-4V) alloy surfaces. Ninety-six paired disks were cemented together (n=12). After incubation in a 37°C water bath for 7 days, the shear bond strength was measured with a universal testing machine (Instron) and a custom fixture with a crosshead speed of 5 mm/min. Differences were analyzed statistically with 1-way ANOVA and Tukey HSD tests (α=.05). The debonded surfaces of the Ti alloy disks were examined under a light microscope at ×10 magnification to record the failure pattern, and the representative specimens were observed under a scanning electron microscope. The mean ±SD of shear failure loads ranged from 3.4 ±0.5 to 15.2 ±2.6 MPa. The retention provided by both polycarboxylate cements was significantly greater than that of all other groups (P<.05). The scanning electron microscope examination revealed surface pits only on the bonded surface cemented with the polycarboxylate cements. Cementation with polycarboxylate cement obtained higher shear bond strength. Some chemical interactions occurred between the machined Ti-6Al-4V alloy surface and polycarboxylate cements during cementation. Copyright © 2015 Editorial Council for the Journal of Prosthetic Dentistry. Published by Elsevier Inc. All rights reserved.
Chang, Ni-Bin; Bai, Kaixu; Chen, Chi-Farn
2017-10-01
Monitoring water quality changes in lakes, reservoirs, estuaries, and coastal waters is critical to meeting the needs of sustainable development. This study develops a remote sensing-based multiscale modeling system that integrates multi-sensor satellite data merging and image reconstruction algorithms to support feature extraction with machine learning, with the aim of automating continuous water quality monitoring in environmentally sensitive regions. This new Earth observation platform, termed "cross-mission data merging and image reconstruction with machine learning" (CDMIM), is capable of merging multiple satellite images to provide daily water quality monitoring through a series of image processing, enhancement, reconstruction, and data mining/machine learning techniques. Two existing key algorithms, the Spectral Information Adaptation and Synthesis Scheme (SIASS) and SMart Information Reconstruction (SMIR), are highlighted to support feature extraction and content-based mapping. Whereas SIASS can support various data merging efforts to merge images collected from cross-mission satellite sensors, SMIR can overcome data gaps by reconstructing the information of value-missing pixels due to impacts such as cloud obstruction. Practical implementation of CDMIM was assessed by predicting the water quality over seasons in terms of the concentrations of nutrients and chlorophyll-a, as well as water clarity in Lake Nicaragua, providing synergistic efforts to better monitor the aquatic environment and offer insightful lake watershed management strategies. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Xie, Huan; Luo, Xin; Xu, Xiong; Wang, Chen; Pan, Haiyan; Tong, Xiaohua; Liu, Shijie
2016-10-01
Water bodies are a fundamental element of urban ecosystems, and water mapping is critical for urban and landscape planning and management. Remote sensing has increasingly been used for water mapping in rural areas, but applying this spatially explicit approach in urban areas is challenging because urban water bodies are mostly small and spectral confusion between water and complex urban features is widespread. Water indices are the most common method for water extraction at the pixel level, and spectral mixture analysis (SMA) has recently been widely employed to analyze the urban environment at the subpixel level. In this paper, we introduce an automatic subpixel water mapping method for urban areas using multispectral remote sensing data. The objectives of this research are: (1) developing an automatic technique for extracting mixed land-water pixels using a water index (WI); (2) deriving the most representative water and land endmembers from neighboring water pixels and from adaptively, iteratively selected optimal neighboring land pixels, respectively; and (3) applying a linear unmixing model for subpixel water fraction estimation. Specifically, to automatically extract land-water pixels, locally weighted scatterplot smoothing is first applied to the original histogram curve of the WI image. The Otsu threshold is then used as the starting point for selecting land-water pixels based on the histogram of the WI image, with the land and water thresholds determined from the slopes of the histogram curve. Based on this pixel-level processing, the image is divided into three parts: water pixels, land pixels, and mixed land-water pixels. SMA is then applied to the mixed land-water pixels to estimate water fraction at the subpixel level. With the assumption that the endmember signature of a target pixel should be more similar to adjacent pixels due to spatial dependence, the endmember of water and land are determined
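The linear unmixing step in objective (3) can be illustrated with a minimal sketch. It assumes exactly two endmembers (water and land) and a sum-to-one constraint, so a mixed pixel is modeled as f * water + (1 - f) * land. The function name `water_fraction` and the four-band reflectance values are hypothetical and do not come from the paper; the closed-form least-squares solution shown here is a generic one, not necessarily the authors' solver.

```python
def water_fraction(pixel, water, land):
    """Least-squares water fraction f for pixel ~ f*water + (1-f)*land.

    All arguments are equal-length sequences of band reflectances.
    The result is clipped to the physically meaningful range [0, 1].
    """
    diff = [w - l for w, l in zip(water, land)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("water and land endmembers must differ")
    # Project (pixel - land) onto the direction (water - land).
    num = sum((p - l) * d for p, l, d in zip(pixel, land, diff))
    return min(1.0, max(0.0, num / denom))


# Hypothetical endmember spectra (illustrative values only).
water = [0.05, 0.04, 0.02, 0.01]
land = [0.10, 0.15, 0.30, 0.25]

# A synthetic mixed pixel that is 30% water and 70% land.
mixed = [0.3 * w + 0.7 * l for w, l in zip(water, land)]
estimated = water_fraction(mixed, water, land)
```

In practice the water and land endmembers would differ from pixel to pixel, since the abstract describes deriving them from neighboring pixels rather than using one global pair.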
Fifty years of computer analysis in chest imaging: rule-based, machine learning, deep learning.
van Ginneken, Bram
2017-03-01
Half a century ago, the term "computer-aided diagnosis" (CAD) was introduced in the scientific literature. Pulmonary imaging, with chest radiography and computed tomography, has always been one of the focus areas in this field. In this study, I describe how machine learning became the dominant technology for tackling CAD in the lungs, generally producing better results than do classical rule-based approaches, and how the field is now rapidly changing: in the last few years, we have seen how even better results can be obtained with deep learning. The key differences among rule-based processing, machine learning, and deep learning are summarized and illustrated for various applications of CAD in the chest.
Classification of Strawberry Fruit Shape by Machine Learning
NASA Astrophysics Data System (ADS)
Ishikawa, T.; Hayashi, A.; Nagamatsu, S.; Kyutoku, Y.; Dan, I.; Wada, T.; Oku, K.; Saeki, Y.; Uto, T.; Tanabata, T.; Isobe, S.; Kochi, N.
2018-05-01
Shape is one of the most important traits of agricultural products due to its relationships with the quality, quantity, and value of the products. For strawberries, the nine types of fruit shape were defined and classified by humans based on the sampler patterns of the nine types. In this study, we tested the classification of strawberry shapes by machine learning in order to increase the accuracy of the classification, and we introduce the concept of computerization into this field. Four types of descriptors were extracted from the digital images of strawberries: (1) the Measured Values (MVs) including the length of the contour line, the area, the fruit length and width, and the fruit width/length ratio; (2) the Ellipse Similarity Index (ESI); (3) Elliptic Fourier Descriptors (EFDs), and (4) Chain Code Subtraction (CCS). We used these descriptors for the classification test along with the random forest approach, and eight of the nine shape types were classified with combinations of MVs + CCS + EFDs. CCS is a descriptor that adds human knowledge to the chain codes, and it showed higher robustness in classification than the other descriptors. Our results suggest machine learning's high ability to classify fruit shapes accurately. We will attempt to increase the classification accuracy and apply the machine learning methods to other plant species.
The Value Simulation-Based Learning Added to Machining Technology in Singapore
ERIC Educational Resources Information Center
Fang, Linda; Tan, Hock Soon; Thwin, Mya Mya; Tan, Kim Cheng; Koh, Caroline
2011-01-01
This study seeks to understand the value simulation-based learning (SBL) added to the learning of Machining Technology in a 15-week core subject course offered to university students. The research questions were: (1) How did SBL enhance classroom learning? (2) How did SBL help participants in their test? (3) How did SBL prepare participants for…
Improving Energy Efficiency in CNC Machining
NASA Astrophysics Data System (ADS)
Pavanaskar, Sushrut S.
We present our work on analyzing and improving the energy efficiency of multi-axis CNC milling process. Due to the differences in energy consumption behavior, we treat 3- and 5-axis CNC machines separately in our work. For 3-axis CNC machines, we first propose an energy model that estimates the energy requirement for machining a component on a specified 3-axis CNC milling machine. Our model makes machine-specific predictions of energy requirements while also considering the geometric aspects of the machining toolpath. Our model - and the associated software tool - facilitate direct comparison of various alternative toolpath strategies based on their energy-consumption performance. Further, we identify key factors in toolpath planning that affect energy consumption in CNC machining. We then use this knowledge to propose and demonstrate a novel toolpath planning strategy that may be used to generate new toolpaths that are inherently energy-efficient, inspired by research on digital micrography -- a form of computational art. For 5-axis CNC machines, the process planning problem consists of several sub-problems that researchers have traditionally solved separately to obtain an approximate solution. After illustrating the need to solve all sub-problems simultaneously for a truly optimal solution, we propose a unified formulation based on configuration space theory. We apply our formulation to solve a problem variant that retains key characteristics of the full problem but has lower dimensionality, allowing visualization in 2D. Given the complexity of the full 5-axis toolpath planning problem, our unified formulation represents an important step towards obtaining a truly optimal solution. With this work on the two types of CNC machines, we demonstrate that without changing the current infrastructure or business practices, machine-specific, geometry-based, customized toolpath planning can save energy in CNC machining.
Summary of vulnerability related technologies based on machine learning
NASA Astrophysics Data System (ADS)
Zhao, Lei; Chen, Zhihao; Jia, Qiong
2018-04-01
As information systems grow by an order of magnitude in scale, the complexity of system software increases. The interaction of vulnerabilities across the design, development, deployment, and implementation stages greatly increases the risk that the entire information system will be attacked successfully. Considering the limitations and lag of existing mainstream security vulnerability detection techniques, this paper summarizes the development and current status of related technologies that apply machine learning methods to massive, irregular data in order to handle security vulnerabilities.
Machine learning for cardiac ultrasound time series data
NASA Astrophysics Data System (ADS)
Yuan, Baichuan; Chitturi, Sathya R.; Iyer, Geoffrey; Li, Nuoyu; Xu, Xiaochuan; Zhan, Ruohan; Llerena, Rafael; Yen, Jesse T.; Bertozzi, Andrea L.
2017-03-01
We consider the problem of identifying frames in a cardiac ultrasound video associated with left ventricular chamber end-systolic (ES, contraction) and end-diastolic (ED, expansion) phases of the cardiac cycle. Our procedure involves a simple application of non-negative matrix factorization (NMF) to a series of frames of a video from a single patient. Rank-2 NMF is performed to compute two end-members. The end members are shown to be close representations of the actual heart morphology at the end of each phase of the heart function. Moreover, the entire time series can be represented as a linear combination of these two end-member states thus providing a very low dimensional representation of the time dynamics of the heart. Unlike previous work, our methods do not require any electrocardiogram (ECG) information in order to select the end-diastolic frame. Results are presented for a data set of 99 patients including both healthy and diseased examples.
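A rank-2 factorization of this kind can be sketched as follows. The abstract does not say which NMF solver was used; this sketch assumes the Lee-Seung multiplicative updates for the Frobenius loss, and the tiny synthetic "video" (4 pixels by 6 frames) is hypothetical. Columns of `w` play the role of the two end-members, and each column of `h` gives the weights that express one frame as a combination of them.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def nmf(v, rank=2, iters=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (pixels x frames) as w @ h."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    for _ in range(iters):
        # Update the per-frame weights h with w held fixed.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(rank)]
        # Update the end-members w with h held fixed.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(m)]
    return w, h


def frobenius_error(v, w, h):
    """Frobenius norm of the reconstruction residual v - w @ h."""
    approx = matmul(w, h)
    return sum((a - b) ** 2
               for ra, rb in zip(v, approx) for a, b in zip(ra, rb)) ** 0.5


# Hypothetical data: each frame is a blend of two fixed states.
state_a = [1.0, 0.2, 0.1, 0.8]
state_b = [0.1, 0.9, 0.7, 0.2]
blend = [0.0, 0.25, 0.5, 0.75, 1.0, 0.5]
frames = [[t * a + (1 - t) * b for t in blend] for a, b in zip(state_a, state_b)]

w, h = nmf(frames)
```

In a setting like the one described, the frames whose weight vectors in `h` are most dominated by one end-member or the other would be candidates for the ES and ED phases; how the authors make that selection is not specified in the abstract.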
A Machine Learning Framework to Forecast Wave Conditions
NASA Astrophysics Data System (ADS)
Zhang, Y.; James, S. C.; O'Donncha, F.
2017-12-01
Recently, significant effort has been undertaken to quantify and extract wave energy because it is renewable, environmentally friendly, abundant, and often close to population centers. However, a major challenge is the ability to accurately and quickly predict energy production, especially across a 48-hour cycle. Accurate forecasting of wave conditions is a challenging undertaking that typically involves solving the spectral action-balance equation on a discretized grid with high spatial resolution. The nature of the computations typically demands high-performance computing infrastructure. Using a case-study site at Monterey Bay, California, a machine learning framework was trained to replicate numerically simulated wave conditions at a fraction of the typical computational cost. Specifically, the physics-based Simulating WAves Nearshore (SWAN) model, driven by measured wave conditions, nowcast ocean currents, and wind data, was used to generate training data for machine learning algorithms. The model was run between April 1st, 2013 and May 31st, 2017 generating forecasts at three-hour intervals yielding 11,078 distinct model outputs. SWAN-generated fields of 3,104 wave heights and a characteristic period could be replicated through simple matrix multiplications using the mapping matrices from machine learning algorithms. In fact, wave-height RMSEs from the machine learning algorithms (9 cm) were less than those for the SWAN model-verification exercise where those simulations were compared to buoy wave data within the model domain (>40 cm). The validated machine learning approach, which acts as an accurate surrogate for the SWAN model, can now be used to perform real-time forecasts of wave conditions for the next 48 hours using available forecasted boundary wave conditions, ocean currents, and winds. This solution has obvious applications to wave-energy generation as accurate wave conditions can be forecasted with over a three-order-of-magnitude reduction in
Chen, Qiu-Feng; Chen, Hua-Jun; Liu, Jun; Sun, Tao; Shen, Qun-Tai
2016-01-01
Machine learning-based approaches play an important role in examining functional magnetic resonance imaging (fMRI) data in a multivariate manner and extracting features predictive of group membership. This study was performed to assess the potential for measuring brain intrinsic activity to identify minimal hepatic encephalopathy (MHE) in cirrhotic patients, using the support vector machine (SVM) method. Resting-state fMRI data were acquired in 16 cirrhotic patients with MHE and 19 cirrhotic patients without MHE. The regional homogeneity (ReHo) method was used to investigate the local synchrony of intrinsic brain activity. Psychometric Hepatic Encephalopathy Score (PHES) was used to define MHE condition. SVM-classifier was then applied using leave-one-out cross-validation, to determine the discriminative ReHo-map for MHE. The discrimination map highlights a set of regions, including the prefrontal cortex, anterior cingulate cortex, anterior insular cortex, inferior parietal lobule, precentral and postcentral gyri, superior and medial temporal cortices, and middle and inferior occipital gyri. The optimized discriminative model showed total accuracy of 82.9% and sensitivity of 81.3%. Our results suggested that a combination of the SVM approach and brain intrinsic activity measurement could be helpful for detection of MHE in cirrhotic patients.
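The leave-one-out evaluation described above can be sketched in a few lines. This is only an illustration of the cross-validation loop and of how accuracy and sensitivity are computed: a nearest-centroid rule stands in for the SVM (a deliberate simplification), and the two-feature toy data are hypothetical, not ReHo values from the study.

```python
def fit_centroids(xs, ys):
    """Return the mean feature vector of each class (stand-in for an SVM)."""
    centroids = {}
    for label in set(ys):
        rows = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest in squared distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def leave_one_out(xs, ys, positive=1):
    """Hold out each sample once; return (accuracy, sensitivity)."""
    tp = fn = tn = fp = 0
    for i in range(len(xs)):
        # Train on every sample except the held-out one.
        model = fit_centroids(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        pred = predict(model, xs[i])
        if ys[i] == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    accuracy = (tp + tn) / len(xs)
    sensitivity = tp / (tp + fn)  # fraction of positives correctly detected
    return accuracy, sensitivity


# Hypothetical, well-separated toy data: label 1 = positive class.
xs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 0.9], [0.8, 1.1], [1.1, 1.0]]
ys = [0, 0, 0, 1, 1, 1]
accuracy, sensitivity = leave_one_out(xs, ys)
```

With only 35 subjects, as in the study, leave-one-out makes use of nearly all the data for training in each fold, which is a common reason for choosing it over k-fold splits on small cohorts.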