Sample records for parse research method

  1. Research on polarization imaging information parsing method

    NASA Astrophysics Data System (ADS)

    Yuan, Hongwu; Zhou, Pucheng; Wang, Xiaolong

    2016-11-01

    Polarization information parsing plays an important role in polarization imaging detection. This paper focuses on the polarization information parsing method. First, the general process of polarization information parsing is given, mainly comprising polarization image preprocessing, calculation of multiple polarization parameters, polarization image fusion, and polarization image tracking. The research achievements for each step are then presented. For polarization image preprocessing, a polarization image registration method based on maximum mutual information is designed; experiments show that it improves registration precision and satisfies the needs of polarization information parsing. For the calculation of multiple polarization parameters, an omnidirectional polarization inversion model is built, from which a variety of polarization parameter images are obtained with clearly improved inversion precision. For polarization image fusion, an adaptive optimal fusion method for multiple polarization parameters is given using fuzzy integrals and sparse representation, and target detection in complex scenes is completed with a clustering image segmentation algorithm based on fractal characteristics. For polarization image tracking, a fusion tracking algorithm combining average-displacement (mean-shift) polarization image features with auxiliary particle filtering is put forward to achieve smooth tracking of moving targets. Finally, the polarization information parsing method is applied to the polarization imaging detection of typical targets such as camouflage targets, fog, and latent fingerprints.
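
    For orientation, the "multiple polarization parameters calculation" step conventionally derives Stokes parameters from intensity images captured through polarizers at 0, 45, 90, and 135 degrees; the NumPy sketch below shows that textbook computation, not the omnidirectional inversion model the abstract refers to.

        import numpy as np

        def polarization_parameters(i0, i45, i90, i135):
            """Standard polarization parameter images from four intensity
            images taken at polarizer angles 0/45/90/135 degrees."""
            s0 = 0.5 * (i0 + i45 + i90 + i135)      # total intensity
            s1 = i0 - i90                           # horizontal vs. vertical
            s2 = i45 - i135                         # +45 vs. -45 degrees
            dolp = np.sqrt(s1**2 + s2**2) / np.maximum(s0, 1e-12)  # degree of linear polarization
            aop = 0.5 * np.arctan2(s2, s1)          # angle of polarization, radians
            return s0, s1, s2, dolp, aop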

  2. A Semantic Constraint on Syntactic Parsing.

    ERIC Educational Resources Information Center

    Crain, Stephen; Coker, Pamela L.

    This research examines how semantic information influences syntactic parsing decisions during sentence processing. In the first experiment, subjects were presented with lexical strings having syntactically identical surface structures but two possible underlying structures: "The children taught by the Berlitz method," and "The…

  3. The lived experience of doing the right thing: a parse method study.

    PubMed

    Smith, Sandra Maxwell

    2012-01-01

    The purposes of this research were to discover the structure of the experience of doing the right thing and to contribute to nursing knowledge. The Parse research method was used in this study to answer the research question: What is the structure of the lived experience of doing the right thing? Participants were 10 individuals living in the community. The central finding of this study was the following structure: The lived experience of doing the right thing is steadfast uprightness amid adversity, as honorableness with significant affiliations emerges with contentment. New knowledge extended the theory of humanbecoming and enhanced understanding of the experience of doing the right thing.

  4. Image portion identification methods, image parsing methods, image parsing systems, and articles of manufacture

    DOEpatents

    Lassahn, Gordon D.; Lancaster, Gregory D.; Apel, William A.; Thompson, Vicki S.

    2013-01-08

    Image portion identification methods, image parsing methods, image parsing systems, and articles of manufacture are described. According to one embodiment, an image portion identification method includes accessing data regarding an image depicting a plurality of biological substrates corresponding to at least one biological sample and indicating presence of at least one biological indicator within the biological sample and, using processing circuitry, automatically identifying a portion of the image depicting one of the biological substrates but not others of the biological substrates.

  5. The lived experience of serenity: using Parse's research method.

    PubMed

    Kruse, B G

    1999-04-01

    Parse's research method was used to investigate the meaning of serenity for survivors of a life-threatening illness or traumatic event. Ten survivors of cancer told their stories of the meaning of serenity as they had lived it in their lives. Descriptions were aided by photographs chosen by each participant to represent the meaning of serenity for them. The structure of serenity was generated through the extraction-synthesis process. Four main concepts--steering-yielding with the flow, savoring remembered visions of engaging surroundings, abiding with aloneness-togetherness, and attesting to a loving presence--emerged and led to a theoretical structure of serenity from the human becoming perspective. Findings confirm serenity as a multidimensional process.

  6. MEG Evidence for Incremental Sentence Composition in the Anterior Temporal Lobe

    ERIC Educational Resources Information Center

    Brennan, Jonathan R.; Pylkkänen, Liina

    2017-01-01

    Research investigating the brain basis of language comprehension has associated the left anterior temporal lobe (ATL) with sentence-level combinatorics. Using magnetoencephalography (MEG), we test the parsing strategy implemented in this brain region. The number of incremental parse steps from a predictive left-corner parsing strategy that is…

  7. Accuracy and Tuning of Flow Parsing for Visual Perception of Object Motion During Self-Motion

    PubMed Central

    Niehorster, Diederick C.

    2017-01-01

    How do we perceive object motion during self-motion using visual information alone? Previous studies have reported that the visual system can use optic flow to identify and globally subtract the retinal motion component resulting from self-motion to recover scene-relative object motion, a process called flow parsing. In this article, we developed a retinal motion nulling method to directly measure and quantify the magnitude of flow parsing (i.e., flow parsing gain) in various scenarios to examine the accuracy and tuning of flow parsing for the visual perception of object motion during self-motion. We found that flow parsing gains were below unity for all displays in all experiments; and that increasing self-motion and object motion speed did not alter flow parsing gain. We conclude that visual information alone is not sufficient for the accurate perception of scene-relative motion during self-motion. Although flow parsing performs global subtraction, its accuracy also depends on local motion information in the retinal vicinity of the moving object. Furthermore, the flow parsing gain was constant across common self-motion or object motion speeds. These results can be used to inform and validate computational models of flow parsing. PMID:28567272
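
    The flow parsing gain reported here is, in essence, the fraction of the self-motion retinal component that the visual system subtracts; under the nulling logic of the method it reduces to a ratio, sketched below with made-up numbers (an illustrative reading of the measure, not the authors' analysis code).

        def flow_parsing_gain(nulling_speed, self_motion_component):
            """Proportion of the self-motion retinal component that is
            subtracted, estimated from the retinal speed needed to null
            perceived object motion (speeds in deg/s; values invented)."""
            return nulling_speed / self_motion_component

        print(flow_parsing_gain(3.2, 4.0))  # 0.8 -> below unity, as the study reports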

  8. Trying something new.

    PubMed

    Condon, Barbara Backer

    2013-01-01

    Trying something new is a universal living experience of health. Although trying something new frequently occurs in healthcare, its meaning has never explicitly been studied. Parse's humanbecoming school of thought is the theoretical perspective for this study. The research question for this study is: What is the structure of the living experience of trying something new? The purpose of this study was to advance nursing science. Parse's qualitative phenomenological-hermeneutic research method was used to guide this study. Participants were 8 men and 2 women, ages 29 to 65, who utilized an outpatient mental health facility in the Midwest. Data were collected with dialogical engagement. The major finding of the study is the structure: Trying something new is engaging in capricious exploitations with vacillating sentiments, as wistful contemplation surfaces with disparate affiliations.

  9. Pragmatic precision oncology: the secondary uses of clinical tumor molecular profiling

    PubMed Central

    Thota, Ramya; Staggs, David B; Johnson, Douglas B; Warner, Jeremy L

    2016-01-01

    Background Precision oncology increasingly utilizes molecular profiling of tumors to determine treatment decisions with targeted therapeutics. The molecular profiling data is valuable in the treatment of individual patients as well as for multiple secondary uses. Objective To automatically parse, categorize, and aggregate clinical molecular profile data generated during cancer care, and to use this data to address multiple secondary use cases. Methods A system to parse, categorize, and aggregate molecular profile data was created. A naïve Bayesian classifier categorized results according to clinical groups. The accuracy of these systems was validated against a published, expertly curated subset of molecular profiling data. Results Following one year of operation, 819 samples have been accurately parsed and categorized to generate a data repository of 10,620 genetic variants. The database has been used for operational, clinical trial, and discovery science research. Conclusions A real-time database of molecular profiling data is a pragmatic solution to several knowledge management problems in the practice and science of precision oncology. PMID:27026612
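
    As a rough illustration of the categorization step, a naive Bayes model can map parsed result strings to clinical groups; the scikit-learn sketch below uses invented examples and labels, not the authors' pipeline.

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.naive_bayes import MultinomialNB
        from sklearn.pipeline import make_pipeline

        # Hypothetical training data: parsed result strings -> clinical category.
        results = ["BRAF V600E mutation detected", "EGFR exon 19 deletion detected",
                   "KRAS G12D mutation detected", "no mutation detected"]
        labels = ["BRAF-targetable", "EGFR-targetable", "RAS-pathway", "wild-type"]

        clf = make_pipeline(CountVectorizer(), MultinomialNB())
        clf.fit(results, labels)
        print(clf.predict(["BRAF V600K mutation detected"]))  # ['BRAF-targetable']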

  10. The Storage and Processing of Morphologically Complex Words in L2 Spanish

    ERIC Educational Resources Information Center

    Foote, Rebecca

    2017-01-01

    Research with native speakers indicates that, during word recognition, regularly inflected words undergo parsing that segments them into stems and affixes. In contrast, studies with learners suggest that this parsing may not take place in L2. This study's research questions are: Do L2 Spanish learners store and process regularly inflected,…

  11. Parsing Flowcharts and Series-Parallel Graphs

    DTIC Science & Technology

    1978-11-01

    descriptions of the graph. This possible multiplicity is undesirable in most practical applications, a fact that makes particularly useful reduction...to parse TT networks, some of the features that make this parsing method useful in other cases are more naturally introduced in the context of this...as Figure 4.5 shows. This multiplicity is due to the associativity of consecutive Two Terminal Series and Two Terminal Parallel compositions. In spite

  12. A hierarchical methodology for urban facade parsing from TLS point clouds

    NASA Astrophysics Data System (ADS)

    Li, Zhuqiang; Zhang, Liqiang; Mathiopoulos, P. Takis; Liu, Fangyu; Zhang, Liang; Li, Shuaipeng; Liu, Hao

    2017-01-01

    The effective and automated parsing of building facades from terrestrial laser scanning (TLS) point clouds of urban environments is an important research topic in the GIS and remote sensing fields. It is also challenging because of the complexity and great variety of 3D building facade layouts, as well as the noise in, and data missing from, the input TLS point clouds. In this paper, we introduce a novel methodology for the accurate and computationally efficient parsing of urban building facades from TLS point clouds. The main novelty of the proposed methodology is that it is a systematic and hierarchical approach that considers, in an adaptive way, the semantic and underlying structures of the urban facades for segmentation and subsequent accurate modeling. Firstly, the available input point cloud is decomposed into depth planes based on a data-driven method; such layer decomposition enables similarity detection in each depth plane layer. Secondly, the labeling of the facade elements is performed using the SVM classifier in combination with our proposed BieS-ScSPM algorithm. The labeling outcome is then augmented with weak architectural knowledge. Thirdly, least-squares fitted normalized gray accumulative curves are applied to detect regular structures, and a binarization dilation extraction algorithm is used to partition facade elements. A dynamic line-by-line division is further applied to extract the boundaries of the elements. The 3D geometrical facade models are then reconstructed by optimizing facade elements across depth plane layers. We have evaluated the performance of the proposed method using several TLS facade datasets. Qualitative and quantitative performance comparisons with several other state-of-the-art methods dealing with the same facade parsing problem have demonstrated its superiority in performance and its effectiveness in improving segmentation accuracy.
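
    The first stage, depth-plane decomposition, can be crudely approximated by clustering point depths; the NumPy sketch below uses histogram peaks as candidate plane layers (an assumption for illustration, not the paper's data-driven method).

        import numpy as np

        def depth_layers(points, bin_width=0.1, min_points=50):
            """points: (N, 3) array of TLS coordinates; depth is assumed to lie
            along axis 1 (normal to the facade). Returns depth values of
            candidate plane layers."""
            depths = points[:, 1]
            hist, edges = np.histogram(
                depths, bins=np.arange(depths.min(), depths.max() + bin_width, bin_width))
            return [0.5 * (edges[i] + edges[i + 1])
                    for i, count in enumerate(hist) if count >= min_points]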

  13. Comparison of Classical and Lazy Approach in SCG Compiler

    NASA Astrophysics Data System (ADS)

    Jirák, Ota; Kolář, Dušan

    2011-09-01

    The existing parsing methods for scattered context grammars usually expand nonterminals deep in the pushdown. This expansion is implemented using either a linked list or some kind of auxiliary pushdown. This paper describes a parsing algorithm for an LL(1) scattered context grammar. The algorithm merges two principles. The first is a table-driven parsing method commonly used for parsing context-free grammars. The second is delayed execution, as used in functional programming. The main part of this paper is a proof of equivalence between the common principle (the whole rule is applied at once) and our approach (execution of the rules is delayed). As a result, our approach works with the pushdown top only. In most cases, the second approach is faster than the first. Finally, future work is discussed.
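
    For contrast with the lazy variant, the classic table-driven LL(1) loop that the paper builds on fits in a few lines; the toy Python sketch below parses a small context-free grammar and leaves out the scattered-context delay mechanism itself.

        # Toy LL(1) parse table for the grammar: S -> a S b | c
        TABLE = {
            ('S', 'a'): ['a', 'S', 'b'],
            ('S', 'c'): ['c'],
        }

        def parse(tokens):
            stack = ['$', 'S']                 # pushdown: end marker + start symbol
            tokens = list(tokens) + ['$']
            pos = 0
            while stack:
                top = stack.pop()
                look = tokens[pos]
                if top == look:                # terminal (or end marker) matches input
                    pos += 1
                elif (top, look) in TABLE:     # expand nonterminal via the parse table
                    stack.extend(reversed(TABLE[(top, look)]))
                else:
                    return False               # no table entry: syntax error
            return pos == len(tokens)

        print(parse("aacbb"), parse("aacb"))   # True False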

  14. Revisiting Executive Function Measurement: Implications for Lifespan Development

    ERIC Educational Resources Information Center

    Wiebe, Sandra A.; McFall, G. Peggy

    2014-01-01

    Since Miyake and his colleagues (2000) published their seminal paper on the use of confirmatory factor analysis (CFA) to parse executive function (EF), CFA methods have become ubiquitous in EF research. In their interesting and thoughtful Focus article, "Executive Function: Formative Versus Reflective Measurement," Willoughby and…

  15. The Living Experience of Feeling Surprised.

    PubMed

    Bunkers, Sandra Schmidt

    2017-01-01

    The purpose of this article is to report the finding of a Parse research method study on the universal living experience of feeling surprised. In dialogical engagement with the researcher, eight participants described the experience. The structure of the living experience of feeling surprised was found to be: Feeling surprised is stunning amazement arising with shifting fortunes, as delight amid despair surfaces with diverse involvements.

  16. A Semantic Parsing Method for Mapping Clinical Questions to Logical Forms

    PubMed Central

    Roberts, Kirk; Patra, Braja Gopal

    2017-01-01

    This paper presents a method for converting natural language questions about structured data in the electronic health record (EHR) into logical forms. The logical forms can then subsequently be converted to EHR-dependent structured queries. The natural language processing task, known as semantic parsing, has the potential to convert questions to logical forms with extremely high precision, resulting in a system that is usable and trusted by clinicians for real-time use in clinical settings. We propose a hybrid semantic parsing method, combining rule-based methods with a machine learning-based classifier. The overall semantic parsing precision on a set of 212 questions is 95.6%. The parser’s rules furthermore allow it to “know what it does not know”, enabling the system to indicate when unknown terms prevent it from understanding the question’s full logical structure. When combined with a module for converting a logical form into an EHR-dependent query, this high-precision approach allows for a question answering system to provide a user with a single, verifiably correct answer. PMID:29854217
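
    A drastically simplified sketch of the rule-based half of such a parser follows; the patterns and logical forms are invented for illustration and are not the paper's grammar.

        import re

        # Hypothetical question patterns -> logical-form builders.
        RULES = [
            (re.compile(r"what is the latest (\w+) (?:value|result) for (\w+)", re.I),
             lambda m: "latest(lab({}), patient({}))".format(m.group(1), m.group(2))),
            (re.compile(r"is (\w+) taking (\w+)", re.I),
             lambda m: "active_med(patient({}), drug({}))".format(m.group(1), m.group(2))),
        ]

        def to_logical_form(question):
            for pattern, build in RULES:
                m = pattern.search(question)
                if m:
                    return build(m)
            return None  # "know what it does not know": refuse rather than guess

        print(to_logical_form("What is the latest creatinine value for patient42?"))
        # latest(lab(creatinine), patient(patient42))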

  17. Parser Combinators: a Practical Application for Generating Parsers for NMR Data

    PubMed Central

    Fenwick, Matthew; Weatherby, Gerard; Ellis, Heidi JC; Gryk, Michael R.

    2013-01-01

    Nuclear Magnetic Resonance (NMR) spectroscopy is a technique for acquiring protein data at atomic resolution and determining the three-dimensional structure of large protein molecules. A typical structure determination process results in the deposition of large data sets to the BMRB (Bio-Magnetic Resonance Data Bank). This data is stored and shared in a file format called NMR-Star. This format is syntactically and semantically complex, making it challenging to parse. Nevertheless, parsing these files is crucial to applying the vast amounts of biological information stored in NMR-Star files, allowing researchers to harness the results of previous studies to direct and validate future work. One powerful approach for parsing files is to apply a Backus-Naur Form (BNF) grammar, which is a high-level model of a file format. Translation of the grammatical model to an executable parser may be accomplished automatically. This paper shows how we applied a model BNF grammar of the NMR-Star format to create a free, open-source parser, using a method that originated in the functional programming world known as “parser combinators”. This paper demonstrates the effectiveness of a principled approach to file specification and parsing. It also builds upon our previous work [1], in that 1) it applies concepts from functional programming (which is relevant even though the implementation language, Java, is more mainstream), and 2) all work and accomplishments from this project will be made available under standard open source licenses to provide the community with the opportunity to learn from our techniques and methods. PMID:24352525
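
    The essence of parser combinators is that tiny parsers (functions from an input position to a result) compose into larger ones; the minimal Python sketch below shows the pattern without any tie to the NMR-Star grammar.

        def char(c):
            """Parser matching one literal character."""
            def p(s, i):
                return (c, i + 1) if i < len(s) and s[i] == c else None
            return p

        def seq(*parsers):
            """Run parsers in order; succeed only if all succeed."""
            def p(s, i):
                values = []
                for q in parsers:
                    r = q(s, i)
                    if r is None:
                        return None
                    v, i = r
                    values.append(v)
                return (values, i)
            return p

        def many(parser):
            """Zero-or-more repetition (Kleene star)."""
            def p(s, i):
                values = []
                while (r := parser(s, i)) is not None:
                    v, i = r
                    values.append(v)
                return (values, i)
            return p

        ab_list = many(seq(char('a'), char('b')))   # grammar: (ab)*
        print(ab_list("ababx", 0))                  # ([['a', 'b'], ['a', 'b']], 4)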

  18. The Experience of Feeling Disrespected: A Humanbecoming Perspective.

    PubMed

    Hawkins, Kim

    2017-04-01

    The concept of feeling disrespected was explored using the Parse research method. Ten women living with embodied largeness were asked, "What is the experience of feeling disrespected?" The structure of the living experience was feeling disrespected is mortifying disheartenment arising with disquieting irreverence, as distancing affiliations surface while enduring hardship. The findings provided new knowledge of living quality, advanced nursing practice, and presented future direction for research.

  19. Parsing GML data based on integrative GML syntactic and semantic schemas database

    NASA Astrophysics Data System (ADS)

    Miao, Lizhi; Zhang, Shuliang; Lu, Guonian; Gao, Xiaoli; Jiao, Donglai; Gan, Jiayan

    2007-06-01

    This paper proposes a new method for parsing the various application schemas of Geography Markup Language (GML), so that the syntax and semantics of their elements and types can be understood and the same GML instance data interpreted uniformly by diverse users. The proposed method generates an Integrative GML Syntactic and Semantic Schemas Database (IGSSSDB) from the GML 3.1 core schemas and the corresponding application schema. GML data is then parsed against the IGSSSDB, which holds the syntactic and semantic information, nesting information, and mapping rules of the GML core schemas and application schemas. Three kinds of relational tables are designed to store schema information when constructing the IGSSSDB: tables for the schemas included and namespaces imported by application schemas, tables for information related to the schemas themselves, and catalog tables for the core schemas. Within these tables, we propose using homologous regular expressions to describe the models of elements and complex types in the schemas, which keeps the models complete and readable. On top of the IGSSSDB we design and develop APIs that implement GML data parsing and can process the syntactic and semantic information of GML data from diverse fields and users. In the latter part of this paper, a test study shows that the proposed method is feasible and appropriate for parsing GML data. It also provides a good basis for future GML data studies such as storage, indexing, and querying.

  20. Hermes, the Information Messenger, Integrating Information Services and Delivering Them to the End User.

    ERIC Educational Resources Information Center

    Coello-Coutino, Gerardo; Ainsworth, Shirley; Escalante-Gonzalbo, Ana Marie

    2002-01-01

    Describes Hermes, a research tool that uses specially designed acquisition, parsing and presentation methods to integrate information resources on the Internet, from searching in disparate bibliographic databases, to accessing full text articles online, and developing a web of information associated with each reference via one common interface.…

  1. Memory mechanisms supporting syntactic comprehension.

    PubMed

    Caplan, David; Waters, Gloria

    2013-04-01

    Efforts to characterize the memory system that supports sentence comprehension have historically drawn extensively on short-term memory as a source of mechanisms that might apply to sentences. The focus of these efforts has changed significantly in the past decade. As a result of changes in models of short-term working memory (ST-WM) and developments in models of sentence comprehension, the effort to relate entire components of an ST-WM system, such as those in the model developed by Baddeley (Nature Reviews Neuroscience 4: 829-839, 2003) to sentence comprehension has largely been replaced by an effort to relate more specific mechanisms found in modern models of ST-WM to memory processes that support one aspect of sentence comprehension--the assignment of syntactic structure (parsing) and its use in determining sentence meaning (interpretation) during sentence comprehension. In this article, we present the historical background to recent studies of the memory mechanisms that support parsing and interpretation and review recent research into this relation. We argue that the results of this research do not converge on a set of mechanisms derived from ST-WM that apply to parsing and interpretation. We argue that the memory mechanisms supporting parsing and interpretation have features that characterize another memory system that has been postulated to account for skilled performance--long-term working memory. We propose a model of the relation of different aspects of parsing and interpretation to ST-WM and long-term working memory.

  2. Automatic Parsing of Parental Verbal Input

    PubMed Central

    Sagae, Kenji; MacWhinney, Brian; Lavie, Alon

    2006-01-01

    To evaluate theoretical proposals regarding the course of child language acquisition, researchers often need to rely on the processing of large numbers of syntactically parsed utterances, both from children and their parents. Because it is so difficult to do this by hand, there are currently no parsed corpora of child language input data. To automate this process, we developed a system that combined the MOR tagger, a rule-based parser, and statistical disambiguation techniques. The resultant system obtained nearly 80% correct parses for the sentences spoken to children. To achieve this level, we had to construct a particular processing sequence that minimizes problems caused by the coverage/ambiguity trade-off in parser design. These procedures are particularly appropriate for use with the CHILDES database, an international corpus of transcripts. The data and programs are now freely available over the Internet. PMID:15190707

  3. Applying Semantic-based Probabilistic Context-Free Grammar to Medical Language Processing – A Preliminary Study on Parsing Medication Sentences

    PubMed Central

    Xu, Hua; AbdelRahman, Samir; Lu, Yanxin; Denny, Joshua C.; Doan, Son

    2011-01-01

    Semantic-based sublanguage grammars have been shown to be an efficient method for medical language processing. However, given the complexity of the medical domain, parsers using such grammars inevitably encounter ambiguous sentences, which could be interpreted by different groups of production rules and consequently result in two or more parse trees. One possible solution, which has not been extensively explored previously, is to augment productions in medical sublanguage grammars with probabilities to resolve the ambiguity. In this study, we associated probabilities with production rules in a semantic-based grammar for medication findings and evaluated its performance on reducing parsing ambiguity. Using the existing data set from 2009 i2b2 NLP (Natural Language Processing) challenge for medication extraction, we developed a semantic-based CFG (Context Free Grammar) for parsing medication sentences and manually created a Treebank of 4,564 medication sentences from discharge summaries. Using the Treebank, we derived a semantic-based PCFG (probabilistic Context Free Grammar) for parsing medication sentences. Our evaluation using a 10-fold cross validation showed that the PCFG parser dramatically improved parsing performance when compared to the CFG parser. PMID:21856440
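
    The disambiguation principle is compact: score each candidate parse by the product of the probabilities of the rules it uses and keep the argmax; the sketch below uses invented medication-grammar rules, not the paper's PCFG.

        import math

        # Hypothetical treebank-derived rule probabilities, P(rhs | lhs).
        RULE_PROB = {
            ("DOSE", ("NUM", "UNIT")): 0.9,
            ("DOSE", ("NUM",)): 0.1,
            ("FREQ", ("NUM", "PER", "DAY")): 0.8,
        }

        def parse_log_prob(rules_used):
            """Log-probability of a parse tree given the rules it applies."""
            return sum(math.log(RULE_PROB[r]) for r in rules_used)

        def best_parse(candidates):
            """candidates: (tree, rules_used) pairs for one ambiguous sentence."""
            return max(candidates, key=lambda c: parse_log_prob(c[1]))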

  4. A structural SVM approach for reference parsing.

    PubMed

    Zhang, Xiaoli; Zou, Jie; Le, Daniel X; Thoma, George R

    2011-06-09

    Automated extraction of bibliographic data, such as article titles, author names, abstracts, and references, is essential to the affordable creation of large citation databases. References, typically appearing at the end of journal articles, can also provide valuable information for extracting other bibliographic data. Therefore, parsing individual references to extract author, title, journal, year, etc. is sometimes a necessary preprocessing step in building citation-indexing systems. The regular structure of references enables us to treat reference parsing as a sequence learning problem and to study structural Support Vector Machines (structural SVM), a newly developed structured learning algorithm, on parsing references. In this study, we implemented structural SVM and used two types of contextual features to compare structural SVM with conventional SVM. Both methods achieve above 98% token classification accuracy and above 95% overall chunk-level accuracy for reference parsing. We also compared SVM and structural SVM to Conditional Random Fields (CRF). The experimental results show that structural SVM and CRF achieve similar accuracies at the token and chunk levels. When only basic observation features are used for each token, structural SVM achieves higher performance than SVM since it utilizes the contextual label features. However, when the contextual observation features from neighboring tokens are combined, SVM performance improves greatly and comes close to that of structural SVM after the second-order contextual observation features are added. The comparison of these two methods with CRF using the same set of binary features shows that both structural SVM and CRF perform better than SVM, indicating their stronger sequence learning ability in reference parsing.

  5. Integrating high dimensional bi-directional parsing models for gene mention tagging.

    PubMed

    Hsu, Chun-Nan; Chang, Yu-Ming; Kuo, Cheng-Ju; Lin, Yu-Shi; Huang, Han-Shen; Chung, I-Fang

    2008-07-01

    Tagging gene and gene product mentions in scientific text is an important initial step of literature mining. In this article, we describe in detail our gene mention tagger that participated in the BioCreative 2 challenge and analyze what contributes to its good performance. Our tagger is based on the conditional random fields (CRF) model, the most prevalent method for the gene mention tagging task in BioCreative 2. Our tagger is interesting because it accomplished the highest F-score among CRF-based methods and the second highest overall. Moreover, we obtained our results mostly by applying open source packages, making it easy to duplicate them. We first describe in detail how we developed our CRF-based tagger. We designed a very high dimensional feature set that includes most of the information that may be relevant. We trained bi-directional CRF models with the same set of features, one applying forward parsing and the other backward, and integrated the two models based on their output scores and dictionary filtering. One of the most prominent factors contributing to the good performance of our tagger is the integration of the additional backward parsing model. However, from the definition of CRF, it appears that a CRF model is symmetric and that bi-directional parsing models should produce the same results. We show that, due to different feature settings, a CRF model can be asymmetric, and that the feature setting of our tagger in BioCreative 2 not only produces different results but also gives backward parsing models a slight but consistent advantage over forward parsing models. To fully explore the potential of integrating bi-directional parsing models, we applied different asymmetric feature settings to generate many bi-directional parsing models and integrated them based on their output scores. Experimental results show that this integrated model can achieve an even higher F-score based solely on the training corpus for gene mention tagging. Data sets, programs, and an on-line service of our gene mention tagger can be accessed at http://aiia.iis.sinica.edu.tw/biocreative2.htm.
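
    The integration step can be caricatured as score-level fusion of the forward and backward taggers followed by dictionary filtering; the sketch below uses invented scores and names and is not the authors' system.

        def integrate(forward, backward, stoplist, threshold=1.0):
            """forward/backward: {mention: confidence} from the two parsing
            directions; keep mentions whose combined score clears the
            threshold and that survive dictionary filtering."""
            kept = []
            for m in set(forward) | set(backward):
                combined = forward.get(m, 0.0) + backward.get(m, 0.0)
                if combined >= threshold and m.lower() not in stoplist:
                    kept.append((m, combined))
            return sorted(kept, key=lambda x: -x[1])

        stop = {"protein", "gene"}   # hypothetical false-positive dictionary
        print(integrate({"BRCA1": 0.9}, {"BRCA1": 0.8, "protein": 0.7}, stop))
        # [('BRCA1', 1.7)]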

  6. Memory mechanisms supporting syntactic comprehension

    PubMed Central

    Waters, Gloria

    2013-01-01

    Efforts to characterize the memory system that supports sentence comprehension have historically drawn extensively on short-term memory as a source of mechanisms that might apply to sentences. The focus of these efforts has changed significantly in the past decade. As a result of changes in models of short-term working memory (ST-WM) and developments in models of sentence comprehension, the effort to relate entire components of an ST-WM system, such as those in the model developed by Baddeley (Nature Reviews Neuroscience 4: 829–839, 2003) to sentence comprehension has largely been replaced by an effort to relate more specific mechanisms found in modern models of ST-WM to memory processes that support one aspect of sentence comprehension—the assignment of syntactic structure (parsing) and its use in determining sentence meaning (interpretation) during sentence comprehension. In this article, we present the historical background to recent studies of the memory mechanisms that support parsing and interpretation and review recent research into this relation. We argue that the results of this research do not converge on a set of mechanisms derived from ST-WM that apply to parsing and interpretation. We argue that the memory mechanisms supporting parsing and interpretation have features that characterize another memory system that has been postulated to account for skilled performance—long-term working memory. We propose a model of the relation of different aspects of parsing and interpretation to ST-WM and long-term working memory. PMID:23319178

  7. The Humanbecoming theory as a reinterpretation of the symbolic interactionism: a critique of its specific nature and scientific underpinnings.

    PubMed

    Tapp, Diane; Lavoie, Mireille

    2017-04-01

    Discussions about the real knowledge contained in grand theories and models remain an active quest in the academic sphere. Among the most fervent of their defenders is Rosemarie Parse with her Humanbecoming School of Thought (1981, 1998). This article first highlights the similarities between Parse's theory and Blumer's symbolic interactionism (1969). This comparison acts as a counterargument to Parse's assertion that her theory is original 'nursing' material. Drawing on contemporary philosophy of science, the very possibility of discovering specific nursing knowledge is questioned. Second, Parse's scientific assumptions are thoroughly addressed and contrasted with Blumer's more moderate view of knowledge. This leads to the recognition that valorizing the social nature of existence and reality does not necessarily entail requirements and methods such as those proposed by Parse; from Blumer's point of view, her perspective may not even be desirable. Recommendations are raised about the necessity of a distanced relationship to knowledge, which is the key to the pursuit of its improvement rather than its circular contemplation. © 2016 John Wiley & Sons Ltd.

  8. Stochastic Time Models of Syllable Structure

    PubMed Central

    Shaw, Jason A.; Gafos, Adamantios I.

    2015-01-01

    Drawing on phonology research within the generative linguistics tradition, stochastic methods, and notions from complex systems, we develop a modelling paradigm linking phonological structure, expressed in terms of syllables, to speech movement data acquired with 3D electromagnetic articulography and X-ray microbeam methods. The essential variable in the models is syllable structure. When mapped to discrete coordination topologies, syllabic organization imposes systematic patterns of variability on the temporal dynamics of speech articulation. We simulated these dynamics under different syllabic parses and evaluated simulations against experimental data from Arabic and English, two languages claimed to parse similar strings of segments into different syllabic structures. Model simulations replicated several key experimental results, including the fallibility of past phonetic heuristics for syllable structure, and exposed the range of conditions under which such heuristics remain valid. More importantly, the modelling approach consistently diagnosed syllable structure proving resilient to multiple sources of variability in experimental data including measurement variability, speaker variability, and contextual variability. Prospects for extensions of our modelling paradigm to acoustic data are also discussed. PMID:25996153

  9. Down the SoTL Rabbit Hole: Using a Phenomenological Approach to Parse the Development of Student Actors

    ERIC Educational Resources Information Center

    Perkins, Kathleen M.

    2016-01-01

    Theatre is a multi-dimensional discipline encompassing aspects of several domains in the arts and humanities. Therefore, an array of scholarly practices, pedagogies, and methods might be available to a SoTL researcher from the close reading of texts in script analysis to portfolio critiques in set, costume, and lighting design--approaches shared…

  10. Generic Detection of Register Realignment

    NASA Astrophysics Data System (ADS)

    Ďurfina, Lukáš; Kolář, Dušan

    2011-09-01

    Register realignment is a method of binary obfuscation used by malware writers. The paper introduces a method by which register realignment can be recognized through analysis based on scattered context grammars. Such an analysis includes exploring the bytes affected by realignment, finding new valid values for them, building the scattered context grammar, and parsing the obfuscated code with this grammar. The created grammar has the LL property--the ability to be parsed as this type of grammar.

  11. Development of an HL7 interface engine, based on tree structure and streaming algorithm, for large-size messages which include image data.

    PubMed

    Um, Ki Sung; Kwak, Yun Sik; Cho, Hune; Kim, Il Kon

    2005-11-01

    A basic assumption of the Health Level Seven (HL7) protocol is 'no limitation of message length'. However, most existing commercial HL7 interface engines do limit message length because they use the string-array method, which runs the HL7 message parsing process in main memory. Specifically, messages with image and multimedia data create a long string array and thus cause critical, fatal failures in the computer system. Consequently, HL7 messages cannot carry the image and multimedia data necessary in modern medical records. This study aims to solve this problem with a 'streaming algorithm' method. This new method for HL7 message parsing applies a character-stream object that processes data character by character between main memory and the hard disk, so that the processing load on main memory is alleviated. The main functions of this new engine are generating, parsing, validating, browsing, sending, and receiving HL7 messages. The engine can also parse and generate XML-formatted HL7 messages. This new HL7 engine successfully exchanged HL7 messages containing 10-megabyte images and discharge summary information between two university hospitals.
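
    The contrast drawn here is between slurping the whole message into one string array and consuming it as a character stream; a simplified Python sketch of stream-based field splitting follows ('|' is HL7's standard field separator, but this is not the authors' engine).

        import io

        def stream_fields(reader, sep='|'):
            """Yield message fields one at a time, reading the input character
            by character so the full message never has to sit in memory."""
            buf = []
            while True:
                ch = reader.read(1)
                if not ch:                 # end of stream
                    break
                if ch == sep:
                    yield ''.join(buf)
                    buf = []
                else:
                    buf.append(ch)
            if buf:
                yield ''.join(buf)

        for field in stream_fields(io.StringIO("MSH|^~\\&|SENDER|RECEIVER")):
            print(field)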

  12. An automatic method to generate domain-specific investigator networks using PubMed abstracts.

    PubMed

    Yu, Wei; Yesupriya, Ajay; Wulf, Anja; Qu, Junfeng; Gwinn, Marta; Khoury, Muin J

    2007-06-20

    Collaboration among investigators has become critical to scientific research. This includes ad hoc collaboration established through personal contacts as well as formal consortia established by funding agencies. Continued growth in online resources for scientific research and communication has promoted the development of highly networked research communities. Extending these networks globally requires identifying additional investigators in a given domain, profiling their research interests, and collecting current contact information. We present a novel strategy for building investigator networks dynamically and producing detailed investigator profiles using data available in PubMed abstracts. We developed a novel strategy to obtain detailed investigator information by automatically parsing the affiliation string in PubMed records. We illustrated the results by using a published literature database in human genome epidemiology (HuGE Pub Lit) as a test case. Our parsing strategy extracted country information from 92.1% of the affiliation strings in a random sample of PubMed records and in 97.0% of HuGE records, with accuracies of 94.0% and 91.0%, respectively. Institution information was parsed from 91.3% of the general PubMed records (accuracy 86.8%) and from 94.2% of HuGE PubMed records (accuracy 87.0%). We demonstrated the application of our approach to dynamic creation of investigator networks by creating a prototype information system containing a large database of PubMed abstracts relevant to human genome epidemiology (HuGE Pub Lit), indexed using PubMed medical subject headings converted to Unified Medical Language System concepts. Our method was able to identify 70-90% of the investigators/collaborators in three different human genetics fields; it also successfully identified 9 of 10 genetics investigators within the PREBIC network, an existing preterm birth research network. We successfully created a web-based prototype capable of creating domain-specific investigator networks based on an application that accurately generates detailed investigator profiles from PubMed abstracts combined with robust standard vocabularies. This approach could be used for other biomedical fields to efficiently establish domain-specific investigator networks.
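
    A toy version of the affiliation-string heuristic: the first comma-separated segment usually names the department or institution and the last the country; the regular expression below is illustrative, not the authors' rule set.

        import re

        def parse_affiliation(affil):
            """Split a PubMed affiliation string into rough components."""
            # Drop a trailing e-mail address, which often follows the country.
            affil = re.sub(r"\s*\S+@\S+\s*$", "", affil).rstrip(". ")
            parts = [p.strip() for p in affil.split(",")]
            return {"institution": parts[0] if parts else "",
                    "country": parts[-1] if parts else ""}

        print(parse_affiliation(
            "Department of Genetics, Some University, Atlanta, GA, USA. a@b.org"))
        # {'institution': 'Department of Genetics', 'country': 'USA'}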

  13. An automatic method to generate domain-specific investigator networks using PubMed abstracts

    PubMed Central

    Yu, Wei; Yesupriya, Ajay; Wulf, Anja; Qu, Junfeng; Gwinn, Marta; Khoury, Muin J

    2007-01-01

    Background Collaboration among investigators has become critical to scientific research. This includes ad hoc collaboration established through personal contacts as well as formal consortia established by funding agencies. Continued growth in online resources for scientific research and communication has promoted the development of highly networked research communities. Extending these networks globally requires identifying additional investigators in a given domain, profiling their research interests, and collecting current contact information. We present a novel strategy for building investigator networks dynamically and producing detailed investigator profiles using data available in PubMed abstracts. Results We developed a novel strategy to obtain detailed investigator information by automatically parsing the affiliation string in PubMed records. We illustrated the results by using a published literature database in human genome epidemiology (HuGE Pub Lit) as a test case. Our parsing strategy extracted country information from 92.1% of the affiliation strings in a random sample of PubMed records and in 97.0% of HuGE records, with accuracies of 94.0% and 91.0%, respectively. Institution information was parsed from 91.3% of the general PubMed records (accuracy 86.8%) and from 94.2% of HuGE PubMed records (accuracy 87.0%). We demonstrated the application of our approach to dynamic creation of investigator networks by creating a prototype information system containing a large database of PubMed abstracts relevant to human genome epidemiology (HuGE Pub Lit), indexed using PubMed medical subject headings converted to Unified Medical Language System concepts. Our method was able to identify 70–90% of the investigators/collaborators in three different human genetics fields; it also successfully identified 9 of 10 genetics investigators within the PREBIC network, an existing preterm birth research network. Conclusion We successfully created a web-based prototype capable of creating domain-specific investigator networks based on an application that accurately generates detailed investigator profiles from PubMed abstracts combined with robust standard vocabularies. This approach could be used for other biomedical fields to efficiently establish domain-specific investigator networks. PMID:17584920

  14. Hope in "Rita Hayworth and Shawshank Redemption": a human becoming hermeneutic study.

    PubMed

    Parse, Rosemarie Rizzo

    2007-04-01

    This article reports a human becoming hermeneutic method study of "Rita Hayworth and Shawshank Redemption" (the short story, the screenplay, and the film). The study unfolded during the Parse-King dialogue, which answered the research question: What is hope as humanly lived? Emergent meanings were discovered that enhanced knowledge and understanding of hope in general and expanded the human becoming school of thought.

  15. Learning for Semantic Parsing and Natural Language Generation Using Statistical Machine Translation Techniques

    DTIC Science & Technology

    2007-08-01

    In this domain, queries typically show a deeply nested structure, which makes the semantic parsing task rather challenging, e.g.: What states border...only 80% of the GEOQUERY queries are semantically tractable, which shows that GEOQUERY is indeed a more challenging domain than ATIS. Note that none...a particularly challenging task, because of the inherent ambiguity of natural languages on both sides. It has inspired a large body of research. In

  16. Using the Longman Mini-concordancer on Tagged and Parsed Corpora, with Special Reference to Their Use as an Aid to Grammar Learning.

    ERIC Educational Resources Information Center

    Qiao, Hong Liang; Sussex, Roland

    1996-01-01

    Presents methods for using the Longman Mini-Concordancer on tagged and parsed corpora rather than plain text corpora. The article discusses several aspects with models to be applied in the classroom as an aid to grammar learning. This paper suggests exercises suitable for teaching English to both native and nonnative speakers. (13 references)…

  17. ParseCNV integrative copy number variation association software with quality tracking

    PubMed Central

    Glessner, Joseph T.; Li, Jin; Hakonarson, Hakon

    2013-01-01

    A number of copy number variation (CNV) calling algorithms exist; however, comprehensive software tools for CNV association studies are lacking. We describe ParseCNV, unique software that takes CNV calls and creates probe-based statistics for CNV occurrence in both case–control designs and in family based studies addressing both de novo and inheritance events, which are then summarized based on CNV regions (CNVRs). CNVRs are defined in a dynamic manner to allow for complex CNV overlap while maintaining precise association regions. Using this approach, we avoid the failure-to-converge and non-monotonic curve-fitting weaknesses of programs such as CNVtools and CNVassoc; Plink, although easy to use, only provides combined CNV-state probe-based statistics, not state-specific CNVRs. Existing CNV association methods do not provide any quality tracking information to filter confident associations, a key issue which is fully addressed by ParseCNV. In addition, uncertainty in the CNV calls underlying CNV associations is evaluated to verify significant results, including CNV overlap profiles, genomic context, the number of probes supporting the CNV and single-probe intensities. When the optimal quality control parameters are followed using ParseCNV, 90% of CNVs validate by polymerase chain reaction, an often problematic stage because of inadequate review of significant associations. ParseCNV is freely available at http://parsecnv.sourceforge.net. PMID:23293001
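
    The region logic can be approximated as: collapse overlapping calls into CNV regions, then test case/control carrier counts per region; a bare-bones sketch follows, with SciPy's Fisher's exact test standing in for ParseCNV's actual statistics.

        from scipy.stats import fisher_exact

        def merge_cnvrs(calls):
            """calls: (start, end) CNV calls on one chromosome; overlapping
            calls collapse into CNV regions (CNVRs)."""
            regions = []
            for start, end in sorted(calls):
                if regions and start <= regions[-1][1]:
                    regions[-1][1] = max(regions[-1][1], end)
                else:
                    regions.append([start, end])
            return regions

        def region_p_value(case_hits, case_n, ctrl_hits, ctrl_n):
            """2x2 Fisher's exact test for one CNVR: carriers vs. non-carriers."""
            table = [[case_hits, case_n - case_hits],
                     [ctrl_hits, ctrl_n - ctrl_hits]]
            return fisher_exact(table)[1]

        print(merge_cnvrs([(100, 200), (150, 300), (500, 600)]))  # [[100, 300], [500, 600]]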

  18. ParseCNV integrative copy number variation association software with quality tracking.

    PubMed

    Glessner, Joseph T; Li, Jin; Hakonarson, Hakon

    2013-03-01

    A number of copy number variation (CNV) calling algorithms exist; however, comprehensive software tools for CNV association studies are lacking. We describe ParseCNV, unique software that takes CNV calls and creates probe-based statistics for CNV occurrence in both case-control designs and in family based studies addressing both de novo and inheritance events, which are then summarized based on CNV regions (CNVRs). CNVRs are defined in a dynamic manner to allow for complex CNV overlap while maintaining precise association regions. Using this approach, we avoid the failure-to-converge and non-monotonic curve-fitting weaknesses of programs such as CNVtools and CNVassoc; Plink, although easy to use, only provides combined CNV-state probe-based statistics, not state-specific CNVRs. Existing CNV association methods do not provide any quality tracking information to filter confident associations, a key issue which is fully addressed by ParseCNV. In addition, uncertainty in the CNV calls underlying CNV associations is evaluated to verify significant results, including CNV overlap profiles, genomic context, the number of probes supporting the CNV and single-probe intensities. When the optimal quality control parameters are followed using ParseCNV, 90% of CNVs validate by polymerase chain reaction, an often problematic stage because of inadequate review of significant associations. ParseCNV is freely available at http://parsecnv.sourceforge.net.

  19. Comparing the Concept of Caring in Islamic Perspective with Watson and Parse's Nursing Theories

    PubMed Central

    Sadat-Hoseini, Akram-Sadat; Khosropanah, Abdoul-Hosein

    2017-01-01

    Background: In the nursing profession, it is apparent that the definition of caring differs between various perspectives. This article compares the concept of caring in Islam with the Parse and Watson theories. Materials and Methods: In this study, we used Walker and Avant's concept analysis together with comparative research methods. The material used comprises Islamic documents. Results: According to Islamic documents, there are four major types of caring, namely, (1) God taking care of humans, (2) humans taking care of themselves, (3) other humans taking care of humans, and (4) the universe taking care of humans and vice versa. God caring for humans affects the three other types of caring. All three definitions of caring have a humanistic and holistic view. According to Watson's and Parse's definitions, the development of the caring theory is based on the person's experiences, which result from human interactions with, and experiences of, their environment. In the Islamic definition, although the caring process is affected by environmental experiences and interactions, the human does not develop only through the effects of the environment; rather, development also rests on human nature and divine commands. God taking care of humans is specific to the Islamic perspective and is not found in the other definitions. The Islamic perspective maintains that God is the creator of humanity and is in charge of guiding humans, and that a superior form of the human can always be discovered. Conclusions: Thus, nursing implementation for Muslims must be based on Islamic commands, which are held superior to human experiences; yet Islamic commands, interpreted with human wisdom and thought, can strive toward excellence. PMID:28584543

  1. The lived experience of feeling sad.

    PubMed

    Bunkers, Sandra Schmidt

    2010-07-01

    The purpose of this study was to enhance understanding of the lived experience of feeling sad. Parse's phenomenological-hermeneutic research method was used to answer the research question: What is the structure of the lived experience of feeling sad? Participants were 7 elders who had lost a pet. Data were collected with dialogical engagement. The major finding of the study is the structure: Feeling sad is penetrating anguish surfacing with contemplating absent-yet-present intimacies, while prevailing amid misfortune. Feeling sad is discussed in relation to the principles of humanbecoming and in relation to how it can inform future nursing research and nursing practice.

  2. Representations of the language recognition problem for a theorem prover

    NASA Technical Reports Server (NTRS)

    Minker, J.; Vanderbrug, G. J.

    1972-01-01

    Two representations of the language recognition problem for a theorem prover in first order logic are presented and contrasted. One of the representations is based on the familiar method of generating sentential forms of the language, and the other is based on the Cocke parsing algorithm. An augmented theorem prover is described which permits recognition of recursive languages. The state-transformation method developed by Cordell Green to construct problem solutions in resolution-based systems can be used to obtain the parse tree. In particular, the end-order traversal of the parse tree is derived in one of the representations. An inference system, termed the cycle inference system, is defined which makes it possible for the theorem prover to model the method on which the representation is based. The general applicability of the cycle inference system to state space problems is discussed. Given an unsatisfiable set S, where each clause has at most one positive literal, it is shown that there exists an input proof. The clauses for the two representations satisfy these conditions, as do many state space problems.
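
    The Cocke (CYK) algorithm underlying the second representation is short enough to state directly; below is the standard dynamic-programming recognizer for a grammar in Chomsky normal form.

        def cyk_recognize(tokens, lexical, binary, start='S'):
            """lexical: {terminal: {A}} for rules A -> terminal;
            binary: {(B, C): {A}} for rules A -> B C."""
            n = len(tokens)
            # table[i][j]: nonterminals deriving tokens[i : i + j + 1]
            table = [[set() for _ in range(n)] for _ in range(n)]
            for i, tok in enumerate(tokens):
                table[i][0] = set(lexical.get(tok, ()))
            for span in range(2, n + 1):
                for i in range(n - span + 1):
                    for split in range(1, span):
                        for B in table[i][split - 1]:
                            for C in table[i + split][span - split - 1]:
                                table[i][span - 1] |= binary.get((B, C), set())
            return start in table[0][n - 1]

        # Grammar: S -> A B, A -> 'a', B -> 'b'
        print(cyk_recognize(['a', 'b'], {'a': {'A'}, 'b': {'B'}}, {('A', 'B'): {'S'}}))  # True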

  3. Learning for Semantic Parsing with Kernels under Various Forms of Supervision

    DTIC Science & Technology

    2007-08-01

    natural language sentences to their formal executable meaning representations. This is a challenging problem and is critical for developing computing...sentences are semantically tractable. This indicates that Geoquery is a more challenging domain for semantic parsing than ATIS. In the past, there have been a...Combining parsers. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Very Large Corpora (EMNLP/VLC-99), pp. 187–194

  4. Experimental Evaluation of Processing Time for the Synchronization of XML-Based Business Objects

    NASA Astrophysics Data System (ADS)

    Ameling, Michael; Wolf, Bernhard; Springer, Thomas; Schill, Alexander

    Business objects (BOs) are data containers for complex data structures used in business applications such as Supply Chain Management and Customer Relationship Management. Because application logic is replicated, multiple copies of BOs are created that have to be synchronized and updated. This is a complex and time-consuming task because BOs vary widely in structure according to the distribution, number, and size of their elements. Since BOs are internally represented as XML documents, the parsing of XML is one major cost factor that has to be considered when minimizing the processing time during synchronization. Predicting the parsing time of BOs is thus a significant input for selecting an efficient synchronization mechanism. In this paper, we present a method to evaluate the influence of the structure of BOs on their parsing time. The results of our experimental evaluation, incorporating four different XML parsers, examine the dependencies between the distribution of elements and the parsing time. Finally, a general cost model is validated and simplified according to the results of the experimental setup.

  5. voevent-parse: Parse, manipulate, and generate VOEvent XML packets

    NASA Astrophysics Data System (ADS)

    Staley, Tim D.

    2014-11-01

    voevent-parse, written in Python, parses, manipulates, and generates VOEvent XML packets; it is built atop lxml.objectify. Details of transients detected by many projects, including Fermi, Swift, and the Catalina Sky Survey, are currently made available as VOEvents, which is also the standard alert format adopted by future facilities such as LSST and SKA. However, working with XML and adhering to the sometimes lengthy VOEvent schema can be a tricky process. voevent-parse provides convenience routines for common tasks, while allowing the user to utilise the full power of the lxml library when required. An earlier version of voevent-parse was part of the pysovo (ascl:1411.002) library.
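
    Because voevent-parse is built atop lxml.objectify, the flavor of the underlying API can be shown with objectify alone; the sketch below uses a made-up, namespace-free XML snippet rather than a schema-valid VOEvent packet.

        from lxml import objectify

        xml = b"""<VOEvent ivorn="ivo://example/event#1" role="test">
          <Who><AuthorIVORN>ivo://example</AuthorIVORN></Who>
        </VOEvent>"""

        pkt = objectify.fromstring(xml)
        print(pkt.attrib["ivorn"])   # XML attributes via a dict-like interface
        print(pkt.Who.AuthorIVORN)   # child elements accessed as Python attributes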

  6. Incremental Learning of Context Free Grammars by Parsing-Based Rule Generation and Rule Set Search

    NASA Astrophysics Data System (ADS)

    Nakamura, Katsuhiko; Hoshina, Akemi

    This paper discusses recent improvements and extensions to the Synapse system for inductive inference of context-free grammars (CFGs) from sample strings. Synapse uses incremental learning, rule generation based on bottom-up parsing, and search over rule sets. The form of production rules in the previous system is extended from Revised Chomsky Normal Form, A→βγ, to Extended Chomsky Normal Form, which also includes A→B, where each of β and γ is either a terminal or a nonterminal symbol. From the result of bottom-up parsing, a rule generation mechanism synthesizes the minimum production rules required for parsing positive samples. Instead of the inductive CYK algorithm of the previous version of Synapse, the improved version uses a novel rule generation method, called ``bridging,'' which bridges the missing part of the derivation tree for a positive string. The improved version also employs a novel search strategy, called serial search, in addition to minimum-rule-set search. Synthesis of grammars by the serial search is faster than by the minimum-set search in most cases. On the other hand, the size of the generated CFGs is generally larger than that from the minimum-set search, and for some CFLs the system can find no appropriate grammar by the serial search. The paper shows experimental results on incremental learning of several fundamental CFGs and compares the rule generation methods and search strategies.

  7. A Discriminative Sentence Compression Method as Combinatorial Optimization Problem

    NASA Astrophysics Data System (ADS)

    Hirao, Tsutomu; Suzuki, Jun; Isozaki, Hideki

    In the study of automatic summarization, the main research topic used to be `important sentence extraction', but nowadays `sentence compression' is a hot research topic. Conventional sentence compression methods usually transform a given sentence into a parse tree or a dependency tree and modify it to get a shorter sentence. However, this approach is sometimes too rigid. In this paper, we regard sentence compression as a combinatorial optimization problem that extracts an optimal subsequence of words. Hori et al. also proposed a similar method, but they used only a small number of features whose weights were tuned by hand. We introduce a large number of features, such as part-of-speech bigrams and word position in the sentence. Furthermore, we train the system by discriminative learning. According to our experiments, our method obtained better scores than other methods, with statistical significance.
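
    In this formulation a compression is a 0/1 keep decision per word, scored by features over what is kept; the brute-force toy below searches all fixed-length subsequences with hand-set bigram weights (real systems use dynamic programming and learned weights, not this exhaustive search).

        from itertools import combinations

        # Hypothetical learned scores for kept word bigrams.
        BIGRAM_SCORE = {("the", "cat"): 2.0, ("cat", "sat"): 2.0,
                        ("sat", "down"): 1.5, ("the", "sat"): -1.0}

        def score(words):
            return sum(BIGRAM_SCORE.get(p, 0.0) for p in zip(words, words[1:]))

        def compress(words, keep):
            """Exhaustively pick the best-scoring subsequence of `keep` words."""
            return list(max(combinations(words, keep), key=score))

        print(compress(["the", "big", "cat", "sat", "down"], 3))
        # ['the', 'cat', 'sat']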

  8. Attribute And-Or Grammar for Joint Parsing of Human Pose, Parts and Attributes.

    PubMed

    Park, Seyoung; Nie, Xiaohan; Zhu, Song-Chun

    2017-07-25

    This paper presents an attribute and-or grammar (A-AOG) model for jointly inferring human body pose and human attributes in a parse graph with attributes augmented to nodes in the hierarchical representation. In contrast to other popular methods in the current literature that train separate classifiers for poses and individual attributes, our method explicitly represents the decomposition and articulation of body parts and accounts for the correlations between poses and attributes. The A-AOG model is an amalgamation of three traditional grammar formulations: (i) phrase structure grammar, representing the hierarchical decomposition of the human body from whole to parts; (ii) dependency grammar, modeling the geometric articulation by a kinematic graph of the body pose; and (iii) attribute grammar, accounting for the compatibility relations between different parts in the hierarchy so that their appearances follow a consistent style. The parse graph outputs human detection, pose estimation, and attribute prediction simultaneously, in a form that is intuitive and interpretable. We conduct experiments on two tasks using two datasets, and the experimental results demonstrate the advantage of joint modeling over computing poses and attributes independently. Furthermore, our model obtains better performance than existing methods on both pose estimation and attribute prediction tasks.

  9. The Living Experience of Feeling Playful.

    PubMed

    Baumann, Steven L; Tanzi, Donna; Lewis, Tricia A

    2017-07-01

    The purpose of this study was to investigate the living experience of feeling playful. Parse's research method was used to answer the question: What is the structure of the living experience of feeling playful? The participants were 10 persons, ages 9 to 83, living in the United States. The central finding of the study is the structure: the living experience of feeling playful is entertaining amusements amid burdens, with uplifting endeavors strengthening affiliations with blissful moments of unfettered unfolding. The living experience of feeling playful is discussed in relation to the principles of the humanbecoming paradigm and in relation to how it can inform further research.

  10. Feeling Peaceful: A Universal Living Experience.

    PubMed

    Doucet, Thomas J

    2018-01-01

    The purpose of this study was to investigate the living experience of feeling peaceful. Parse's research method was used to answer the question: What is the structure of the living experience of feeling peaceful? Twelve participants living in a community consented to partake in the study. The central finding of the study is the structure: feeling peaceful is contentedness amid tribulation, as unburdening surfaces with devout involvements. The findings are discussed in relation to the humanbecoming school of thought and extant literature.

  11. Flexible Parsing.

    DTIC Science & Technology

    1986-06-30

    Only citation fragments are recoverable from this record, including: Minton, S. N., Hayes, P. J., and Fain, J. E., "Controlling Search in Flexible Parsing," Proc. Ninth Int. Jt. Conf. on Artificial...; "...interaction through the COUSIN command interface," International Journal of Man-Machine Studies, Vol. 19, No. 3, September 1983, pp. 285-305; "Dynamic strategy selection in flexible parsing"; and "Parsing spoken language: a semantic case frame..."

  12. PyParse: a semiautomated system for scoring spoken recall data.

    PubMed

    Solway, Alec; Geller, Aaron S; Sederberg, Per B; Kahana, Michael J

    2010-02-01

    Studies of human memory often generate data on the sequence and timing of recalled items, but scoring such data using conventional methods is difficult or impossible. We describe a Python-based semiautomated system that greatly simplifies this task. This software, called PyParse, can easily be used in conjunction with many common experiment authoring systems. Scored data is output in a simple ASCII format and can be accessed with the programming language of choice, allowing for the identification of features such as correct responses, prior-list intrusions, extra-list intrusions, and repetitions.

  13. Parsing clinical text: how good are the state-of-the-art parsers?

    PubMed Central

    2015-01-01

    Background Parsing, which generates a syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain, including medicine. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there have been no formal evaluations and comparisons of their performance in the medical domain. Methods In this study, we investigated the performance of three state-of-the-art parsers: the Stanford parser, the Bikel parser, and the Charniak parser, using the following two datasets: (1) a Treebank containing 1,100 sentences that were randomly selected from progress notes used in the 2010 i2b2 NLP challenge and manually annotated according to a Penn Treebank based guideline; and (2) the MiPACQ Treebank, which was developed from pathology notes and clinical notes and contains 13,091 sentences. We conducted three experiments on both datasets. First, we measured the performance of the three state-of-the-art parsers on the clinical Treebanks with their default settings. Then we re-trained the parsers using the clinical Treebanks and evaluated their performance using 10-fold cross validation. Finally, we re-trained the parsers by combining the clinical Treebanks with the Penn Treebank. Results Our results showed that the original parsers achieved lower performance on clinical text (Bracketing F-measure in the range of 66.6%-70.3%) compared to general English text. After retraining on the clinical Treebanks, all parsers achieved better performance, with the best performance from the Stanford parser, which reached the highest Bracketing F-measure of 73.68% on progress notes and 83.72% on the MiPACQ corpus using 10-fold cross validation. When the clinical Treebanks were combined with the Penn Treebank, the Charniak parser achieved the highest Bracketing F-measure of 73.53% on progress notes and the Stanford parser reached the highest F-measure of 84.15% on the MiPACQ corpus. Conclusions Our study demonstrates that re-training on clinical Treebanks is critical for improving general English parsers' performance on clinical text, and that combining clinical and open-domain corpora might achieve optimal performance for parsing clinical text. PMID:26045009

  14. Introduction of statistical information in a syntactic analyzer for document image recognition

    NASA Astrophysics Data System (ADS)

    Maroneze, André O.; Coüasnon, Bertrand; Lemaitre, Aurélie

    2011-01-01

    This paper presents an improvement to document layout analysis systems, offering a possible solution to Sayre's paradox (which states that an element "must be recognized before it can be segmented; and it must be segmented before it can be recognized"). This improvement, based on stochastic parsing, allows statistical information obtained from recognizers to be integrated during syntactic layout analysis. We present how this fusion of numeric and symbolic information in a feedback loop can be applied to syntactic methods to improve the expressiveness of document descriptions. To limit combinatorial explosion during the exploration of solutions, we devised an operator that allows optional activation of the stochastic parsing mechanism. Our evaluation on 1250 handwritten business letters shows that this method improves global recognition scores.

  15. Research on complex 3D tree modeling based on L-system

    NASA Astrophysics Data System (ADS)

    Gang, Chen; Bin, Chen; Yuming, Liu; Hui, Li

    2018-03-01

    The L-system, as a fractal iterative rewriting system, can simulate complex geometric patterns. Based on field observations of trees and the knowledge of forestry experts, this paper extracted modeling constraint rules and obtained an L-system rule set. Using self-developed L-system modeling software, the rule set was parsed to generate complex 3D tree models. The results showed that a geometrical modeling method based on L-systems can describe the morphological structure of complex trees and generate 3D tree models.
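
    The string-rewriting core of an L-system is compact. The sketch below uses an invented toy rule set rather than the paper's forestry-derived rules; a turtle-graphics interpreter would turn the output string into 3D geometry:

        # Minimal L-system rewriter: repeatedly apply production rules to an
        # axiom string. Symbols without a rule are copied through unchanged.
        def lsystem(axiom, rules, iterations):
            s = axiom
            for _ in range(iterations):
                s = ''.join(rules.get(ch, ch) for ch in s)
            return s

        # Classic branching pattern: F = grow a segment, [ and ] = push/pop
        # the branch state, + and - = turn.
        rules = {'F': 'F[+F]F[-F]F'}
        print(lsystem('F', rules, 2))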

  16. Nonschematic drawing recognition: a new approach based on attributed graph grammar with flexible embedding

    NASA Astrophysics Data System (ADS)

    Lee, Kyu J.; Kunii, T. L.; Noma, T.

    1993-01-01

    In this paper, we propose a syntactic pattern recognition method for non-schematic drawings, based on a new attributed graph grammar with flexible embedding. In our graph grammar, the embedding rule permits the nodes of a guest graph to be arbitrarily connected with the nodes of a host graph. The ambiguity caused by this flexible embedding is controlled by evaluating synthesized attributes and checking context sensitivity. To integrate parsing with synthesized-attribute evaluation and the context-sensitivity check, we also develop a bottom-up parsing algorithm.

  17. FastaValidator: an open-source Java library to parse and validate FASTA formatted sequences.

    PubMed

    Waldmann, Jost; Gerken, Jan; Hankeln, Wolfgang; Schweer, Timmy; Glöckner, Frank Oliver

    2014-06-14

    Advances in sequencing technologies challenge the efficient importing and validation of FASTA formatted sequence data, which is still a prerequisite for most bioinformatic tools and pipelines. Comparative analysis of commonly used Bio*-frameworks (BioPerl, BioJava and Biopython) shows that their scalability and accuracy are hampered. FastaValidator represents a platform-independent, standardized, lightweight software library written in the Java programming language. It targets computer scientists and bioinformaticians writing software that needs to parse large amounts of sequence data quickly and accurately. For end-users, FastaValidator includes an interactive out-of-the-box validation of FASTA formatted files, as well as a non-interactive mode designed for high-throughput validation in software pipelines. The accuracy and performance of the FastaValidator library qualify it for large data sets, such as those commonly produced by massively parallel sequencing (NGS) technologies. It offers scientists a fast, accurate and standardized method for parsing and validating FASTA formatted sequence data.
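
    FastaValidator itself is a Java library; purely as an illustration of the validation task it performs, a rough Python sketch (nucleotide alphabet only, hypothetical file path) might look like this:

        # Sketch of FASTA validation: every record needs a '>' header and at
        # least one sequence line of legal characters. Nucleotide alphabet
        # (with IUPAC ambiguity codes) only, for brevity.
        VALID = set('ACGTUNRYSWKMBDHVacgtunryswkmbdhv-*')

        def validate_fasta(path):
            errors, has_header, has_seq = [], False, False
            with open(path) as fh:
                for lineno, line in enumerate(fh, 1):
                    line = line.rstrip('\n')
                    if not line:
                        continue
                    if line.startswith('>'):
                        if has_header and not has_seq:
                            errors.append(f'line {lineno}: header without sequence')
                        has_header, has_seq = True, False
                    elif not has_header:
                        errors.append(f'line {lineno}: sequence before any header')
                    elif not set(line) <= VALID:
                        errors.append(f'line {lineno}: illegal characters')
                    else:
                        has_seq = True
            if has_header and not has_seq:
                errors.append('last record has no sequence')
            return errors

        print(validate_fasta('reads.fasta'))   # hypothetical input file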

  18. Parsing interindividual drug variability: an emerging role for systems pharmacology

    PubMed Central

    Turner, Richard M; Park, B Kevin; Pirmohamed, Munir

    2015-01-01

    There is notable interindividual heterogeneity in drug response, affecting both drug efficacy and toxicity, resulting in patient harm and the inefficient utilization of limited healthcare resources. Pharmacogenomics is at the forefront of research to understand interindividual drug response variability, but although many genotype-drug response associations have been identified, translation of pharmacogenomic associations into clinical practice has been hampered by inconsistent findings and inadequate predictive values. These limitations are in part due to the complex interplay between drug-specific, human body and environmental factors influencing drug response and therefore pharmacogenomics, whilst intrinsically necessary, is by itself unlikely to adequately parse drug variability. The emergent, interdisciplinary and rapidly developing field of systems pharmacology, which incorporates but goes beyond pharmacogenomics, holds significant potential to further parse interindividual drug variability. Systems pharmacology broadly encompasses two distinct research efforts, pharmacologically-orientated systems biology and pharmacometrics. Pharmacologically-orientated systems biology utilizes high throughput omics technologies, including next-generation sequencing, transcriptomics and proteomics, to identify factors associated with differential drug response within the different levels of biological organization in the hierarchical human body. Increasingly complex pharmacometric models are being developed that quantitatively integrate factors associated with drug response. Although distinct, these research areas complement one another and continual development can be facilitated by iterating between dynamic experimental and computational findings. Ultimately, quantitative data-derived models of sufficient detail will be required to help realize the goal of precision medicine. WIREs Syst Biol Med 2015, 7:221–241. doi: 10.1002/wsbm.1302 PMID:25950758

  19. Parsing clinical text: how good are the state-of-the-art parsers?

    PubMed

    Jiang, Min; Huang, Yang; Fan, Jung-wei; Tang, Buzhou; Denny, Josh; Xu, Hua

    2015-01-01

    Parsing, which generates a syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain, including medicine. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there have been no formal evaluations and comparisons of their performance in the medical domain. In this study, we investigated the performance of three state-of-the-art parsers: the Stanford parser, the Bikel parser, and the Charniak parser, using the following two datasets: (1) a Treebank containing 1,100 sentences that were randomly selected from progress notes used in the 2010 i2b2 NLP challenge and manually annotated according to a Penn Treebank based guideline; and (2) the MiPACQ Treebank, which was developed from pathology notes and clinical notes and contains 13,091 sentences. We conducted three experiments on both datasets. First, we measured the performance of the three state-of-the-art parsers on the clinical Treebanks with their default settings. Then we re-trained the parsers using the clinical Treebanks and evaluated their performance using 10-fold cross validation. Finally, we re-trained the parsers by combining the clinical Treebanks with the Penn Treebank. Our results showed that the original parsers achieved lower performance on clinical text (Bracketing F-measure in the range of 66.6%-70.3%) compared to general English text. After retraining on the clinical Treebanks, all parsers achieved better performance, with the best performance from the Stanford parser, which reached the highest Bracketing F-measure of 73.68% on progress notes and 83.72% on the MiPACQ corpus using 10-fold cross validation. When the clinical Treebanks were combined with the Penn Treebank, the Charniak parser achieved the highest Bracketing F-measure of 73.53% on progress notes and the Stanford parser reached the highest F-measure of 84.15% on the MiPACQ corpus. Our study demonstrates that re-training on clinical Treebanks is critical for improving general English parsers' performance on clinical text, and that combining clinical and open-domain corpora might achieve optimal performance for parsing clinical text.

  20. Intelligent Text Retrieval and Knowledge Acquisition from Texts for NASA Applications: Preprocessing Issues

    NASA Technical Reports Server (NTRS)

    2002-01-01

    A system that retrieves problem reports from a NASA database is described. The database is queried with natural language questions. Part-of-speech tags are first assigned to each word in the question using a rule-based tagger. A partial parse of the question is then produced with independent sets of deterministic finite-state automata. Using the partial parse information, a look-up strategy searches the database for problem reports relevant to the question. A bigram stemmer and irregular verb conjugates have been incorporated into the system to improve accuracy. The system is evaluated on a set of fifty-five questions posed by NASA engineers. A discussion of future research is also presented.
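
    The NASA system itself is not public; the following sketch reproduces the pipeline idea (part-of-speech tagging followed by a finite-state partial parse) using NLTK's regexp chunker, with a made-up question:

        # POS tagging + shallow (partial) parsing with NLTK. Requires the
        # NLTK data models to be downloaded first, e.g.
        # nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')
        import nltk

        question = "Which reports describe valve failures on the shuttle?"
        tokens = nltk.word_tokenize(question)
        tagged = nltk.pos_tag(tokens)

        # A single noun-phrase rule standing in for the report's sets of
        # deterministic finite-state automata:
        grammar = "NP: {<DT>?<JJ>*<NN.*>+}"
        chunker = nltk.RegexpParser(grammar)
        print(chunker.parse(tagged))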

  1. Automated Program Recognition by Graph Parsing

    DTIC Science & Technology

    1992-07-01

    The recognition of standard computational structures (clichés) in a program can help an experienced programmer understand the program. Based on the known relationships between the clichés, a...

  2. Deriving pathway maps from automated text analysis using a grammar-based approach.

    PubMed

    Olsson, Björn; Gawronska, Barbara; Erlendsson, Björn

    2006-04-01

    We demonstrate how automated text analysis can be used to support the large-scale analysis of metabolic and regulatory pathways by deriving pathway maps from textual descriptions found in the scientific literature. The main assumption is that correct syntactic analysis combined with domain-specific heuristics provides a good basis for relation extraction. Our method uses an algorithm that searches through the syntactic trees produced by a parser based on a Referent Grammar formalism, identifies relations mentioned in the sentence, and classifies them with respect to their semantic class and epistemic status (facts, counterfactuals, hypotheses). The semantic categories used in the classification are based on the relation set used in KEGG (Kyoto Encyclopedia of Genes and Genomes), so that pathway maps using KEGG notation can be automatically generated. We present the current version of the relation extraction algorithm and an evaluation based on a corpus of abstracts obtained from PubMed. The results indicate that the method is able to combine a reasonable coverage with high accuracy. We found that 61% of all sentences were parsed, and 97% of the parse trees were judged to be correct. The extraction algorithm was tested on a sample of 300 parse trees and was found to produce correct extractions in 90.5% of the cases.

  3. Parsing English. Course Notes for a Tutorial on Computational Semantics, March 17-22, 1975.

    ERIC Educational Resources Information Center

    Wilks, Yorick

    The course in parsing English is essentially a survey and comparison of several of the principal systems used for understanding natural language. The basic procedure of parsing is described. The discussion of the principal systems is based on the idea that "meaning is procedures," that is, that the procedures of application give a parsed…

  4. Integrated Japanese Dependency Analysis Using a Dialog Context

    NASA Astrophysics Data System (ADS)

    Ikegaya, Yuki; Noguchi, Yasuhiro; Kogure, Satoru; Itoh, Toshihiko; Konishi, Tatsuhiro; Kondo, Makoto; Asoh, Hideki; Takagi, Akira; Itoh, Yukihiro

    This paper describes how to perform syntactic parsing and semantic analysis in a dialog system, and in particular how to disambiguate potentially ambiguous sentences using contextual information. Although syntactic parsing and semantic analysis are often studied independently of each other, correct parsing of a sentence often requires semantic information about the input and/or contextual information preceding it. Accordingly, we merge syntactic parsing with semantic analysis, which enables parsing to take advantage of the semantic content of an input and its context. One of the biggest problems in semantic analysis is how to interpret dependency structures. We employ a framework for semantic representations that circumvents this problem: the meaning of any predicate is converted into a semantic representation that permits only a single type of predicate, the identifying predicate "aru". The semantic representations are expressed as sets of attribute-value pairs, and these representations are stored in the context information. Our system resolves syntactic/semantic ambiguities in inputs by referring to the attribute-value pairs in the context information. We have experimentally confirmed the effectiveness of our approach; specifically, the experiment confirmed high parsing accuracy and the correctness of the generated semantic representations.

  5. High-frequency neural activity predicts word parsing in ambiguous speech streams.

    PubMed

    Kösem, Anne; Basirat, Anahita; Azizi, Leila; van Wassenhove, Virginie

    2016-12-01

    During speech listening, the brain parses a continuous acoustic stream of information into computational units (e.g., syllables or words) necessary for speech comprehension. Recent neuroscientific hypotheses have proposed that neural oscillations contribute to speech parsing, but whether they do so on the basis of acoustic cues (bottom-up acoustic parsing) or as a function of available linguistic representations (top-down linguistic parsing) is unknown. In this magnetoencephalography study, we contrasted acoustic and linguistic parsing using bistable speech sequences. While listening to the speech sequences, participants were asked to maintain one of the two possible speech percepts through volitional control. We predicted that the tracking of speech dynamics by neural oscillations would not only follow the acoustic properties but also shift in time according to the participant's conscious speech percept. Our results show that the latency of high-frequency activity (specifically, beta and gamma bands) varied as a function of the perceptual report. In contrast, the phase of low-frequency oscillations was not strongly affected by top-down control. Whereas changes in low-frequency neural oscillations were compatible with the encoding of prelexical segmentation cues, high-frequency activity specifically informed on an individual's conscious speech percept. Copyright © 2016 the American Physiological Society.

  6. High-frequency neural activity predicts word parsing in ambiguous speech streams

    PubMed Central

    Basirat, Anahita; Azizi, Leila; van Wassenhove, Virginie

    2016-01-01

    During speech listening, the brain parses a continuous acoustic stream of information into computational units (e.g., syllables or words) necessary for speech comprehension. Recent neuroscientific hypotheses have proposed that neural oscillations contribute to speech parsing, but whether they do so on the basis of acoustic cues (bottom-up acoustic parsing) or as a function of available linguistic representations (top-down linguistic parsing) is unknown. In this magnetoencephalography study, we contrasted acoustic and linguistic parsing using bistable speech sequences. While listening to the speech sequences, participants were asked to maintain one of the two possible speech percepts through volitional control. We predicted that the tracking of speech dynamics by neural oscillations would not only follow the acoustic properties but also shift in time according to the participant's conscious speech percept. Our results show that the latency of high-frequency activity (specifically, beta and gamma bands) varied as a function of the perceptual report. In contrast, the phase of low-frequency oscillations was not strongly affected by top-down control. Whereas changes in low-frequency neural oscillations were compatible with the encoding of prelexical segmentation cues, high-frequency activity specifically informed on an individual's conscious speech percept. PMID:27605528

  7. Extraction of CYP chemical interactions from biomedical literature using natural language processing methods.

    PubMed

    Jiao, Dazhi; Wild, David J

    2009-02-01

    This paper proposes a system that automatically extracts CYP protein and chemical interactions from journal article abstracts, using natural language processing (NLP) and text mining methods. In our system, we employ a maximum-entropy-based learning method that draws on the results of syntactic, semantic, and lexical analysis of texts. We first present our system architecture and then discuss the data set for training our machine-learning-based models and the methods used to build the components of our system, such as part-of-speech (POS) tagging, named entity recognition (NER), dependency parsing, and relation extraction. An evaluation of the system yields very promising results: the POS tagging, dependency parsing, and NER components achieve a very high level of accuracy as measured by precision, ranging from 85.9% to 98.5%; the precision and recall of the interaction extraction component are 76.0% and 82.6%, respectively, and those of the overall system are 68.4% and 72.2%.

  8. Feeling understood: a melody of human becoming.

    PubMed

    Jonas-Simpson, C M

    2001-07-01

    This phenomenological-hermeneutic study centered on the phenomenon of feeling understood, which was conceptualized by the researcher as a melody of human becoming significant to quality of life. For the first time the Parse research method was used with music as part of the dialogical engagement. The study was conducted with 10 women living with an enduring health situation who volunteered to be in tape-recorded dialogue with the researcher to discuss feeling understood and to create a musical expression of this phenomenon. The finding of this study, which is the structure of the lived experience of feeling understood, surfaced from the dialogues and musical expressions: Feeling understood is an unburdening quietude with triumphant bliss arising with the attentive reverence of nurturing engagements, while fortifying integrity emerges amid potential disregard.

  9. An All-Fragments Grammar for Simple and Accurate Parsing

    DTIC Science & Technology

    2012-03-21

    Only reference-list fragments are recoverable from this record, including: Tsujii, "Probabilistic CFG with latent annotations," Proceedings of ACL, 2005; Slav Petrov and Dan Klein, "Improved Inference for Unlexicalized Parsing," Proceedings of NAACL-HLT, 2007; Slav Petrov and Dan Klein, "Sparse Multi-Scale Grammars for Discriminative Latent Variable Parsing," Proceedings of EMNLP, 2008; and Slav Petrov, Leon Barrett, Romain Thibaux, and Dan Klein, "Learning Accurate, Compact, and Interpretable Tree Annotation," Proceedings...

  10. High-content image informatics of the structural nuclear protein NuMA parses trajectories for stem/progenitor cell lineages and oncogenic transformation.

    PubMed

    Vega, Sebastián L; Liu, Er; Arvind, Varun; Bushman, Jared; Sung, Hak-Joon; Becker, Matthew L; Lelièvre, Sophie; Kohn, Joachim; Vidi, Pierre-Alexandre; Moghe, Prabhas V

    2017-02-01

    Stem and progenitor cells that exhibit significant regenerative potential and critical roles in cancer initiation and progression remain difficult to characterize. Cell fates are determined by reciprocal signaling between the cell microenvironment and the nucleus; hence parameters derived from nuclear remodeling are ideal candidates for stem/progenitor cell characterization. Here we applied high-content, single cell analysis of nuclear shape and organization to examine stem and progenitor cells destined to distinct differentiation endpoints, yet undistinguishable by conventional methods. Nuclear descriptors defined through image informatics classified mesenchymal stem cells poised to either adipogenic or osteogenic differentiation, and oligodendrocyte precursors isolated from different regions of the brain and destined to distinct astrocyte subtypes. Nuclear descriptors also revealed early changes in stem cells after chemical oncogenesis, allowing the identification of a class of cancer-mitigating biomaterials. To capture the metrology of nuclear changes, we developed a simple and quantitative "imaging-derived" parsing index, which reflects the dynamic evolution of the high-dimensional space of nuclear organizational features. A comparative analysis of parsing outcomes via either nuclear shape or textural metrics of the nuclear structural protein NuMA indicates the nuclear shape alone is a weak phenotypic predictor. In contrast, variations in the NuMA organization parsed emergent cell phenotypes and discerned emergent stages of stem cell transformation, supporting a prognosticating role for this protein in the outcomes of nuclear functions. Copyright © 2017 Elsevier Inc. All rights reserved.

  11. Solving LR Conflicts Through Context Aware Scanning

    NASA Astrophysics Data System (ADS)

    Leon, C. Rodriguez; Forte, L. Garcia

    2011-09-01

    This paper presents a new algorithm to compute the exact list of tokens expected by an LR syntax analyzer at any point in the scanning process. The lexer can, at any time, compute the exact set of valid tokens and return only tokens in this set. When more than one matching token is in the valid set, the lexer can resort to a nested LR parser to disambiguate. Allowing nested LR parsing requires some slight modifications when building the LR parsing tables. We also show how LR parsers can parse conflicting and inherently ambiguous languages using a combination of nested parsing and context-aware scanning. These expanded lexical analyzers can be generated from high-level specifications.
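
    A minimal sketch of the context-aware scanning idea (invented token names and patterns, not the paper's generator): the parser hands the lexer its expected-token set, and the lexer tries only those patterns:

        # Context-aware scanning toy: an otherwise ambiguous lexeme is
        # resolved by the set of token types the LR parser currently expects.
        import re

        TOKENS = {
            'KEYWORD_IF': r'if\b',
            'IDENT':      r'[A-Za-z_]\w*',
            'NUMBER':     r'\d+',
        }

        def next_token(text, pos, expected):
            """Return (token_type, lexeme), trying only expected types."""
            for ttype in expected:              # parser-supplied valid set
                m = re.match(TOKENS[ttype], text[pos:])
                if m:
                    return ttype, m.group(0)
            raise SyntaxError(f'no expected token matches at position {pos}')

        # 'if' matches both KEYWORD_IF and IDENT; the expected set decides.
        print(next_token('if x', 0, ['KEYWORD_IF', 'IDENT']))  # KEYWORD_IF
        print(next_token('if x', 0, ['IDENT']))                # IDENT 'if'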

  12. High-content image informatics of the structural nuclear protein NuMA parses trajectories for stem/progenitor cell lineages and oncogenic transformation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vega, Sebastián L.; Liu, Er; Arvind, Varun

    Stem and progenitor cells that exhibit significant regenerative potential and critical roles in cancer initiation and progression remain difficult to characterize. Cell fates are determined by reciprocal signaling between the cell microenvironment and the nucleus; hence parameters derived from nuclear remodeling are ideal candidates for stem/progenitor cell characterization. Here we applied high-content, single cell analysis of nuclear shape and organization to examine stem and progenitor cells destined to distinct differentiation endpoints, yet undistinguishable by conventional methods. Nuclear descriptors defined through image informatics classified mesenchymal stem cells poised to either adipogenic or osteogenic differentiation, and oligodendrocyte precursors isolated from different regions of the brain and destined to distinct astrocyte subtypes. Nuclear descriptors also revealed early changes in stem cells after chemical oncogenesis, allowing the identification of a class of cancer-mitigating biomaterials. To capture the metrology of nuclear changes, we developed a simple and quantitative "imaging-derived" parsing index, which reflects the dynamic evolution of the high-dimensional space of nuclear organizational features. A comparative analysis of parsing outcomes via either nuclear shape or textural metrics of the nuclear structural protein NuMA indicates the nuclear shape alone is a weak phenotypic predictor. In contrast, variations in the NuMA organization parsed emergent cell phenotypes and discerned emergent stages of stem cell transformation, supporting a prognosticating role for this protein in the outcomes of nuclear functions. Highlights: • High-content analysis of nuclear shape and organization classify stem and progenitor cells poised for distinct lineages. • Early oncogenic changes in mesenchymal stem cells (MSCs) are also detected with nuclear descriptors. • A new class of cancer-mitigating biomaterials was identified based on image informatics. • Textural metrics of the nuclear structural protein NuMA are sufficient to parse emergent cell phenotypes.

  13. Performance evaluation of continuity of care records (CCRs): parsing models in a mobile health management system.

    PubMed

    Chen, Hung-Ming; Liou, Yong-Zan

    2014-10-01

    In a mobile health management system, mobile devices act as the hosting devices for personal health record (PHR) applications, and healthcare servers are built to exchange and analyze PHRs. One of the most popular PHR standards is the continuity of care record (CCR), which is expressed in XML. However, parsing is an expensive operation that can degrade XML processing performance. Hence, the objective of this study was to identify the operational and performance characteristics of several CCR parsing models: the XML DOM parser, the SAX parser, the PULL parser, and the JSON parser applied to JSON data converted from XML-based CCRs. Developers can thus make sensible choices about how their target PHR applications parse CCRs when using mobile devices or servers with different system resources. Furthermore, simulation experiments on four case studies were conducted to compare parsing performance on Android mobile devices and on a server with large quantities of CCR data.
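
    Two of the compared parsing models are available in the Python standard library. The sketch below contrasts DOM and SAX on a simplified CCR-like snippet; the element names are illustrative, not the full CCR schema:

        # DOM materializes the whole tree (random access, memory grows with
        # document size); SAX streams event callbacks in constant memory.
        from xml.dom.minidom import parseString
        import xml.sax

        ccr = ("<ContinuityOfCareRecord><Body>"
               "<Result>120/80</Result>"
               "</Body></ContinuityOfCareRecord>")

        # DOM model:
        doc = parseString(ccr)
        print(doc.getElementsByTagName('Result')[0].firstChild.data)

        # SAX model:
        class ResultHandler(xml.sax.ContentHandler):
            def __init__(self):
                super().__init__()
                self.in_result = False
            def startElement(self, name, attrs):
                self.in_result = (name == 'Result')
            def characters(self, content):
                if self.in_result:
                    print(content)

        xml.sax.parseString(ccr.encode(), ResultHandler())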

  14. Pragmatic precision oncology: the secondary uses of clinical tumor molecular profiling.

    PubMed

    Rioth, Matthew J; Thota, Ramya; Staggs, David B; Johnson, Douglas B; Warner, Jeremy L

    2016-07-01

    Precision oncology increasingly utilizes molecular profiling of tumors to guide treatment decisions with targeted therapeutics. The molecular profiling data are valuable in the treatment of individual patients as well as for multiple secondary uses. Our objective was to automatically parse, categorize, and aggregate clinical molecular profile data generated during cancer care, and to use these data to address multiple secondary use cases. A system to parse, categorize and aggregate molecular profile data was created. A naïve Bayesian classifier categorized results according to clinical groups. The accuracy of these systems was validated against a published expertly-curated subset of molecular profiling data. Following one year of operation, 819 samples have been accurately parsed and categorized to generate a data repository of 10,620 genetic variants. The database has been used for operational, clinical trial, and discovery science research. A real-time database of molecular profiling data is a pragmatic solution to several knowledge management problems in the practice and science of precision oncology. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved.

  15. Context-free parsing with connectionist networks

    NASA Astrophysics Data System (ADS)

    Fanty, M. A.

    1986-08-01

    This paper presents a simple algorithm which converts any context-free grammar into a connectionist network that parses strings (of arbitrary but fixed maximum length) in the language defined by that grammar. The network is fast, O(n), and deterministic. It consists of binary units which compute a simple function of their input. When the grammar is put in Chomsky normal form, O(n³) units are needed to parse inputs of length up to n.

  16. A reusability and efficiency oriented software design method for mobile land inspection

    NASA Astrophysics Data System (ADS)

    Cai, Wenwen; He, Jun; Wang, Qing

    2008-10-01

    To meet the requirements of real-time land inspection, this paper presents a handheld land inspection system. To increase the reusability of the system, a design-pattern-based framework was developed. Command-like actions are encapsulated with the COMMAND pattern to manage complex UI interactions. Several GPS-log parsing engines are integrated into a general parsing framework by introducing the STRATEGY pattern. Network middleware based on a transmission module was constructed; to reduce the high coupling typical of complex network communication code, the FACTORY pattern was applied to facilitate decoupling. Moreover, to manipulate huge GIS datasets efficiently, a multi-scale representation method based on the VISITOR pattern and quad-trees is presented. Practical use has shown that these design patterns reduce coupling between the subsystems and improve extensibility.

  17. Using Parse's humanbecoming theory in Japan.

    PubMed

    Tanaka, Junko; Katsuno, Towako; Takahashi, Teruko

    2012-01-01

    In this paper the authors discuss the use of Parse's humanbecoming theory in Japan. Elements of the theory are used in the nursing approach to an 88-year-old Japanese man who had complications following surgery. Process recordings of the dialogues between the patient, the patient's wife, and the nurse were made and considered in light of the three methodologies of Parse's theory: illuminating meaning, synchronizing rhythms, and mobilizing transcendence. The theory is seen as useful in Japan.

  18. Modeling Syntax for Parsing and Translation

    DTIC Science & Technology

    2003-12-15

    Only fragments of this record are recoverable, including figure residue ("Figure 2.1: Part of a dictionary") and abstract excerpts: "...along with their training algorithms: a monolingual generative model of sentence structure, and a model of the relationship between the structure of a..." and "...tasks of monolingual parsing and word-level bilingual corpus alignment, they are demonstrated in two additional applications. First, a new statistical..."

  19. Video content parsing based on combined audio and visual information

    NASA Astrophysics Data System (ADS)

    Zhang, Tong; Kuo, C.-C. Jay

    1999-08-01

    While previous research on audiovisual data segmentation and indexing has primarily focused on the pictorial part, significant clues contained in the accompanying audio flow are often ignored. A fully functional system for video content parsing can be achieved more successfully through a proper combination of audio and visual information. By investigating the data structure of different video types, we present tools for both audio and visual content analysis and a scheme for video segmentation and annotation. In the proposed system, video data are segmented into audio scenes and visual shots by detecting abrupt changes in audio and visual features, respectively. Each audio scene is then categorized and indexed as one of the basic audio types, while a visual shot is represented by keyframes and associated image features. An index table is then generated automatically for each video clip by integrating the outputs of the audio and visual analysis. It is shown that the proposed system provides satisfactory video indexing results.

  20. Research in Knowledge Representation for Natural Language Understanding

    DTIC Science & Technology

    1980-11-01

    Keywords: artificial intelligence, natural language understanding, parsing, syntax, semantics, speaker meaning, knowledge representation, semantic networks. Report No. 4513, "Research in Knowledge Representation for Natural Language Understanding," Annual Report, 1 September 1979 to 31... Excerpt: "...understanding, knowledge representation, and knowledge based inference. The work that we have been doing falls into three classes, successively motivated by..."

  1. Rapid transcriptome characterization and parsing of sequences in a non-model host-pathogen interaction; pea-Sclerotinia sclerotiorum

    PubMed Central

    2012-01-01

    Background White mold, caused by Sclerotinia sclerotiorum, is one of the most important diseases of pea (Pisum sativum L.); however, little is known about the genetics and biochemistry of this interaction. Identification of genes underlying resistance in the host or pathogenicity and virulence factors in the pathogen will increase our knowledge of the pea-S. sclerotiorum interaction and facilitate the introgression of new resistance genes into commercial pea varieties. Although the S. sclerotiorum genome sequence is available, no pea genome is available, due in part to its large genome size (~3500 Mb) and extensive repeated motifs. Here we present an EST data set specific to the interaction between S. sclerotiorum and pea, and a method to distinguish pathogen and host sequences without a species-specific reference genome. Results 10,158 contigs were obtained by de novo assembly of 128,720 high-quality reads generated by 454 pyrosequencing of the pea-S. sclerotiorum interactome. A method based on the tBLASTx program was modified to distinguish pea and S. sclerotiorum ESTs. To test this strategy, a mixture of known ESTs (18,490 pea and 17,198 S. sclerotiorum ESTs) from public databases was pooled and parsed; the tBLASTx method successfully separated 90.1% of the artificial EST mix with 99.9% accuracy. The tBLASTx method successfully parsed 89.4% of the 454-derived EST contigs, as validated by PCR, into pea (6,299 contigs) and S. sclerotiorum (2,780 contigs) categories. Two thousand eight hundred and forty pea ESTs and 996 S. sclerotiorum ESTs were predicted to be expressed specifically during the pea-S. sclerotiorum interaction, as determined by homology search against 81,449 pea ESTs (from flowers, leaves, cotyledons, epi- and hypocotyl, and etiolated and light treated etiolated seedlings) and 57,751 S. sclerotiorum ESTs (from mycelia at neutral pH, developing apothecia and developing sclerotia). Among those ESTs specifically expressed, 277 (9.8%) pea ESTs were predicted to be involved in plant defense and response to biotic or abiotic stress, and 93 (9.3%) S. sclerotiorum ESTs were predicted to be involved in pathogenicity/virulence. Additionally, 142 S. sclerotiorum ESTs were identified as secretory/signal peptides, of which only 21 were previously reported. Conclusions We present and characterize an EST resource specific to the pea-S. sclerotiorum interaction. Additionally, the tBLASTx method used to parse S. sclerotiorum and pea ESTs was demonstrated to be a reliable and accurate method to distinguish ESTs without a reference genome. PMID:23181755

  2. The Body That Speaks: Recombining Bodies and Speech Sources in Unscripted Face-to-Face Communication.

    PubMed

    Gillespie, Alex; Corti, Kevin

    2016-01-01

    This article examines advances in research methods that enable experimental substitution of the speaking body in unscripted face-to-face communication. A taxonomy of six hybrid social agents is presented by combining three types of bodies (mechanical, virtual, and human) with either an artificial or human speech source. Our contribution is to introduce and explore the significance of two particular hybrids: (1) the cyranoid method that enables humans to converse face-to-face through the medium of another person's body, and (2) the echoborg method that enables artificial intelligence to converse face-to-face through the medium of a human body. These two methods are distinct in being able to parse the unique influence of the human body when combined with various speech sources. We also introduce a new framework for conceptualizing the body's role in communication, distinguishing three levels: self's perspective on the body, other's perspective on the body, and self's perspective of other's perspective on the body. Within each level the cyranoid and echoborg methodologies make important research questions tractable. By conceptualizing and synthesizing these methods, we outline a novel paradigm of research on the role of the body in unscripted face-to-face communication.

  3. The Body That Speaks: Recombining Bodies and Speech Sources in Unscripted Face-to-Face Communication

    PubMed Central

    Gillespie, Alex; Corti, Kevin

    2016-01-01

    This article examines advances in research methods that enable experimental substitution of the speaking body in unscripted face-to-face communication. A taxonomy of six hybrid social agents is presented by combining three types of bodies (mechanical, virtual, and human) with either an artificial or human speech source. Our contribution is to introduce and explore the significance of two particular hybrids: (1) the cyranoid method that enables humans to converse face-to-face through the medium of another person's body, and (2) the echoborg method that enables artificial intelligence to converse face-to-face through the medium of a human body. These two methods are distinct in being able to parse the unique influence of the human body when combined with various speech sources. We also introduce a new framework for conceptualizing the body's role in communication, distinguishing three levels: self's perspective on the body, other's perspective on the body, and self's perspective of other's perspective on the body. Within each level the cyranoid and echoborg methodologies make important research questions tractable. By conceptualizing and synthesizing these methods, we outline a novel paradigm of research on the role of the body in unscripted face-to-face communication. PMID:27660616

  4. [Pilot study of domain-specific terminology adaptation for morphological analysis: research on unknown terms in national examination documents of radiological technologists].

    PubMed

    Tsuji, Shintarou; Nishimoto, Naoki; Ogasawara, Katsuhiko

    2008-07-20

    Although large volumes of medical text are stored in electronic format, they are seldom reused because of the difficulty of processing narrative text by computer. Morphological analysis is a key technology for extracting medical terms correctly and automatically; this process parses a sentence into its smallest units, morphemes. Phrases consisting of two or more technical terms, however, cause morphological analysis software to fail in parsing the sentence and to output unprocessed terms as "unknown words." The purpose of this study was to reduce the number of unknown words in medical narrative text processing. The results of parsing the text with additional dictionaries were compared with an analysis of the number of unknown words in the national examination for radiological technologists. The ratio of unknown words was reduced from 1.0% to 0.36% by adding terminologies of radiological technology, MeSH, and ICD-10 labels. The terminology of radiological technology was the most effective resource, accounting for a 0.62 percentage-point reduction. This result clearly showed the necessity of careful dictionary selection and revealed trends in unknown words. The potential of this investigation is to make available a large body of clinical information that would otherwise be inaccessible for applications other than manual health care review by personnel.

  5. Object-oriented parsing of biological databases with Python.

    PubMed

    Ramu, C; Gemünd, C; Gibson, T J

    2000-07-01

    While database activities in the biological area are increasing rapidly, rather little has been done to parse these databases in a simple and object-oriented way. We present here an elegant, simple yet powerful way of parsing biological flat-file databases, taking EMBL, SWISS-PROT and GENBANK as examples. EMBL and SWISS-PROT do not differ much in format structure; GENBANK has a very different format structure from EMBL and SWISS-PROT. Extracting the desired fields in an entry (for example, a sub-sequence with an associated feature) for later analysis is a constant need in the biological sequence-analysis community; this is illustrated with tools to make new splice-site databases. The interface to the parser is abstract in the sense that access to all the databases is independent of their different formats, since parsing instructions are hidden.
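
    In the same spirit, though not the authors' package, a minimal object-oriented parser can hide the two-letter line codes shared by EMBL/SWISS-PROT-style flat files behind a class:

        # Object-oriented flat-file parsing sketch: callers work with
        # attributes (id, description, sequence) and never see line codes.
        # Models the 'ID', 'DE', 'SQ' and '//' line types; an illustration,
        # not the library described in the paper.
        class FlatFileEntry:
            def __init__(self, lines):
                self.id, self.description, seq = None, [], []
                in_seq = False
                for line in lines:
                    code, _, rest = line.partition('   ')
                    if code == 'ID':
                        self.id = rest.split()[0]
                    elif code == 'DE':
                        self.description.append(rest.strip())
                    elif code == 'SQ':
                        in_seq = True          # sequence block follows
                    elif line.startswith('//'):
                        break                  # end of entry
                    elif in_seq:
                        seq.append(''.join(line.split()))
                self.sequence = ''.join(seq)

        entry = FlatFileEntry([
            'ID   TEST_HUMAN',
            'DE   An example protein.',
            'SQ   SEQUENCE   12 AA;',
            '     MKTAYIAKQR QL',
            '//',
        ])
        print(entry.id, entry.sequence)   # TEST_HUMAN MKTAYIAKQRQL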

  6. GBParsy: a GenBank flatfile parser library with high speed.

    PubMed

    Lee, Tae-Ho; Kim, Yeon-Ki; Nahm, Baek Hie

    2008-07-25

    The GenBank flatfile (GBF) format is one of the most popular sequence file formats because of its detailed sequence features and readability. For a computer to use the data, a parsing process is required, performed according to a given grammar for the sequence and the descriptions in a GBF. Several parser libraries for the GBF have been developed. However, with the accumulation of DNA sequence information from eukaryotic chromosomes, parsing a eukaryotic genome sequence with these libraries inevitably takes a long time, due to the large GBF file and its correspondingly large genomic nucleotide sequence and related feature information. Thus, there is a significant need for a parsing program with high speed and efficient use of system memory. We developed GBParsy, a C-language library that parses GBF files. Parsing speed was maximized by using content-specific functions in place of regular expressions, which are flexible but slow. In addition, we optimized an algorithm related to memory usage, which also increased parsing performance and efficiency. GBParsy is at least 5-100x faster than current parsers in benchmark tests and is estimated to extract annotated information from almost 100 Mb of a GenBank flatfile of chromosomal sequence information within a second. Thus, it is suited to applications such as real-time visualization of a genome at a web site.
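
    GBParsy is a C library; as a correctness reference (far slower, but only a few lines), the same kind of extraction can be done in pure Python with Biopython's SeqIO, assuming a hypothetical input file:

        # Parse a GenBank flatfile with Biopython and pull out the CDS
        # features; 'chromosome.gb' is a hypothetical input file.
        from Bio import SeqIO

        for record in SeqIO.parse('chromosome.gb', 'genbank'):
            print(record.id, len(record.seq))
            for feature in record.features:
                if feature.type == 'CDS':
                    print(feature.location, feature.qualifiers.get('gene'))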

  7. FRED: a program development tool

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shilling, J.

    1985-09-01

    The structured, screen-based editor FRED is introduced. FRED provides incremental parsing and semantic analysis. The parsing is based on an LL(1) top-down algorithm which has been modified to provide follow-the-cursor parsing and soft templates. The languages accepted by the editor are LL(1) languages with the addition of the Unknown and preferred production non-terminal classes. The semantic analysis is based on the incremental update of attribute grammar equations. We briefly describe the interface between FRED and an automated reference librarian system that is under development.

  8. Foreign Language Translation of Chemical Nomenclature by Computer

    PubMed Central

    2009-01-01

    Chemical compound names remain the primary method for conveying molecular structures between chemists and researchers. In research articles, patents, chemical catalogues, government legislation, and textbooks, the use of IUPAC and traditional compound names is universal, despite efforts to introduce more machine-friendly representations such as identifiers and line notations. Fortunately, advances in computing power now allow chemical names to be parsed and generated (read and written) with almost the same ease as conventional connection tables. A significant complication, however, is that although the vast majority of chemistry uses English nomenclature, a significant fraction is in other languages. This complicates the task of filing and analyzing chemical patents, purchasing from compound vendors, and text mining research articles or Web pages. We describe some issues with manipulating chemical names in various languages, including British, American, German, Japanese, Chinese, Spanish, Swedish, Polish, and Hungarian, and describe the current state-of-the-art in software tools to simplify the process. PMID:19239237

  9. Detecting modification of biomedical events using a deep parsing approach

    PubMed Central

    2012-01-01

    Background This work describes a system for identifying event mentions in bio-molecular research abstracts that are either speculative (e.g. analysis of IkappaBalpha phosphorylation, where it is not specified whether phosphorylation did or did not occur) or negated (e.g. inhibition of IkappaBalpha phosphorylation, where phosphorylation did not occur). The data comes from a standard dataset created for the BioNLP 2009 Shared Task. The system uses a machine-learning approach, where the features used for classification are a combination of shallow features derived from the words of the sentences and more complex features based on the semantic outputs produced by a deep parser. Method To detect event modification, we use a Maximum Entropy learner with features extracted from the data relative to the trigger words of the events. The shallow features are bag-of-words features based on a small sliding context window of 3-4 tokens on either side of the trigger word. The deep parser features are derived from parses produced by the English Resource Grammar and the RASP parser. The outputs of these parsers are converted into the Minimal Recursion Semantics formalism, and from this, we extract features motivated by linguistics and the data itself. All of these features are combined to create training or test data for the machine learning algorithm. Results Over the test data, our methods produce approximately a 4% absolute increase in F-score for detection of event modification compared to a baseline based only on the shallow bag-of-words features. Conclusions Our results indicate that grammar-based techniques can enhance the accuracy of methods for detecting event modification. PMID:22595089
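
    A sketch of the shallow half of this feature set, bag-of-words features from a small window around the event trigger as fed to the Maximum Entropy learner, might look like the following (tokenization and feature naming simplified):

        # Bag-of-words features from a 3-token window either side of the
        # trigger word, as a feature dict suitable for a maxent classifier.
        def window_features(tokens, trigger_idx, width=3):
            feats = {}
            lo = max(0, trigger_idx - width)
            hi = min(len(tokens), trigger_idx + width + 1)
            for i in range(lo, hi):
                if i != trigger_idx:
                    feats[f'bow={tokens[i].lower()}'] = 1.0
            feats[f'trigger={tokens[trigger_idx].lower()}'] = 1.0
            return feats

        sent = 'analysis of IkappaBalpha phosphorylation in T cells'.split()
        print(window_features(sent, sent.index('phosphorylation')))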

  10. Recognition of Equations Using a Two-Dimensional Stochastic Context-Free Grammar

    NASA Astrophysics Data System (ADS)

    Chou, Philip A.

    1989-11-01

    We propose using two-dimensional stochastic context-free grammars for image recognition, in a manner analogous to using hidden Markov models for speech recognition. The value of the approach is demonstrated in a system that recognizes printed, noisy equations. The system uses a two-dimensional probabilistic version of the Cocke-Younger-Kasami parsing algorithm to find the most likely parse of the observed image, and then traverses the corresponding parse tree in accordance with translation formats associated with each production rule, to produce eqn | troff commands for the imaged equation. In addition, it uses two-dimensional versions of the Inside/Outside and Baum re-estimation algorithms for learning the parameters of the grammar from a training set of examples. Parsing the image of a simple noisy equation currently takes about one second of CPU time on an Alliant FX/80.

  11. Prosody and parsing in coordination structures.

    PubMed

    Schepman, A; Rodway, P

    2000-05-01

    The effect of prosodic boundary cues on the off-line disambiguation and on-line parsing of coordination structures was examined. It was found that relative clauses were attached to coordinated object noun phrases in preference to second conjuncts in sentences like: The lawyer greeted the powerful barrister and the wise judge who was/were walking to the courtroom. Naive speakers signalled the syntactic contrast between the two structures by a prosodic break between the conjuncts when the relative clause was attached to the second conjunct. Listeners were able to use this prosodic information in both off-line syntactic disambiguation and on-line syntactic parsing. The findings are compatible with a model in which prosody has a strong immediate effect on parsing. It is argued that the current experimental design has avoided confounds present in earlier studies on the on-line integration of prosodic and syntactic information.

  12. HDM/PASCAL Verification System User's Manual

    NASA Technical Reports Server (NTRS)

    Hare, D.

    1983-01-01

    The HDM/Pascal verification system is a tool for proving the correctness of programs written in PASCAL and specified in the Hierarchical Development Methodology (HDM). This document assumes an understanding of PASCAL, HDM, program verification, and the STP system. The steps toward verification which this tool provides are parsing programs and specifications, checking the static semantics, and generating verification conditions. Some support functions are provided such as maintaining a data base, status management, and editing. The system runs under the TOPS-20 and TENEX operating systems and is written in INTERLISP. However, no knowledge is assumed of these operating systems or of INTERLISP. The system requires three executable files, HDMVCG, PARSE, and STP. Optionally, the editor EMACS should be on the system in order for the editor to work. The file HDMVCG is invoked to run the system. The files PARSE and STP are used as lower forks to perform the functions of parsing and proving.

  13. Perception of scene-relative object movement: Optic flow parsing and the contribution of monocular depth cues.

    PubMed

    Warren, Paul A; Rushton, Simon K

    2009-05-01

    We have recently suggested that the brain uses its sensitivity to optic flow in order to parse retinal motion into components arising due to self and object movement (e.g. Rushton, S. K., & Warren, P. A. (2005). Moving observers, 3D relative motion and the detection of object movement. Current Biology, 15, R542-R543). Here, we explore whether stereo disparity is necessary for flow parsing or whether other sources of depth information, which could theoretically constrain flow-field interpretation, are sufficient. Stationary observers viewed large field of view stimuli containing textured cubes, moving in a manner that was consistent with a complex observer movement through a stationary scene. Observers made speeded responses to report the perceived direction of movement of a probe object presented at different depths in the scene. Across conditions we varied the presence or absence of different binocular and monocular cues to depth order. In line with previous studies, results consistent with flow parsing (in terms of both perceived direction and response time) were found in the condition in which motion parallax and stereoscopic disparity were present. Observers were poorer at judging object movement when depth order was specified by parallax alone. However, as more monocular depth cues were added to the stimulus the results approached those found when the scene contained stereoscopic cues. We conclude that both monocular and binocular static depth information contribute to flow parsing. These findings are discussed in the context of potential architectures for a model of the flow parsing mechanism.

  14. CTEPP STANDARD OPERATING PROCEDURE FOR ENTERING OR IMPORTING ELECTRONIC DATA INTO THE CTEPP DATABASE (SOP-4.12)

    EPA Science Inventory

    This SOP describes the method used to automatically parse analytical data generated from gas chromatography/mass spectrometry (GC/MS) analyses into CTEPP summary spreadsheets and to electronically import the summary spreadsheets into the CTEPP study database.

  15. Automated vocabulary discovery for geo-parsing online epidemic intelligence.

    PubMed

    Keller, Mikaela; Freifeld, Clark C; Brownstein, John S

    2009-11-24

    Automated surveillance of the Internet provides a timely and sensitive method for alerting on global emerging infectious disease threats. HealthMap is part of a new generation of online systems designed to monitor and visualize, on a real-time basis, disease outbreak alerts as reported by online news media and public health sources. HealthMap is of specific interest for national and international public health organizations and international travelers. A particular task that makes such surveillance useful is the automated discovery of the geographic references contained in the retrieved outbreak alerts. This task is sometimes referred to as "geo-parsing". A typical approach to geo-parsing would demand an expensive training corpus of alerts manually tagged by a human. Given that human readers perform this kind of task by using both their lexical and contextual knowledge, we developed an approach which relies on a relatively small expert-built gazetteer, thus limiting the need for human input, but which focuses on learning the context in which geographic references appear. We show in a set of experiments that this approach exhibits a substantial capacity to discover geographic locations outside of its initial lexicon. The results of this analysis provide a framework for future automated global surveillance efforts that reduce manual input and improve timeliness of reporting.
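
    A rough sketch of the flavor of approach the abstract describes, a small gazetteer plus learned context, is given below in Python. The gazetteer entries, the hand-written context rule, and the example sentence are all illustrative; HealthMap's actual system learns its context features from data.

        import re

        # Toy gazetteer standing in for the paper's small expert-built lexicon.
        GAZETTEER = {"jakarta", "geneva", "lagos"}

        def context_features(tokens, i, window=2):
            """Bag of words to the left/right of token i, used as context clues."""
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            return {"L:" + w.lower() for w in left} | {"R:" + w.lower() for w in right}

        def candidate_locations(text):
            """Flag gazetteer hits, plus capitalized tokens in location-like contexts."""
            tokens = re.findall(r"[A-Za-z]+", text)
            hits = []
            for i, tok in enumerate(tokens):
                if tok.lower() in GAZETTEER:
                    hits.append((tok, "gazetteer"))
                elif tok[0].isupper() and {"L:in", "L:near"} & context_features(tokens, i):
                    # A trained model would weigh many such context features;
                    # this single hand-written rule only illustrates the idea.
                    hits.append((tok, "context"))
            return hits

        print(candidate_locations("Cases were reported in Bandung, west of Jakarta."))
        # -> [('Bandung', 'context'), ('Jakarta', 'gazetteer')]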

  16. Detecting modification of biomedical events using a deep parsing approach.

    PubMed

    Mackinlay, Andrew; Martinez, David; Baldwin, Timothy

    2012-04-30

    This work describes a system for identifying event mentions in bio-molecular research abstracts that are either speculative (e.g. analysis of IkappaBalpha phosphorylation, where it is not specified whether phosphorylation did or did not occur) or negated (e.g. inhibition of IkappaBalpha phosphorylation, where phosphorylation did not occur). The data comes from a standard dataset created for the BioNLP 2009 Shared Task. The system uses a machine-learning approach, where the features used for classification are a combination of shallow features derived from the words of the sentences and more complex features based on the semantic outputs produced by a deep parser. To detect event modification, we use a Maximum Entropy learner with features extracted from the data relative to the trigger words of the events. The shallow features are bag-of-words features based on a small sliding context window of 3-4 tokens on either side of the trigger word. The deep parser features are derived from parses produced by the English Resource Grammar and the RASP parser. The outputs of these parsers are converted into the Minimal Recursion Semantics formalism, and from this, we extract features motivated by linguistics and the data itself. All of these features are combined to create training or test data for the machine learning algorithm. Over the test data, our methods produce approximately a 4% absolute increase in F-score for detection of event modification compared to a baseline based only on the shallow bag-of-words features. Our results indicate that grammar-based techniques can enhance the accuracy of methods for detecting event modification.
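
    The shallow feature extraction described above is easy to sketch; a minimal Python version of the sliding-window bag-of-words step might look like the following. Feature names and the example sentence are illustrative, and the deep-parser features and Maximum Entropy learner are omitted.

        def trigger_window_bow(tokens, trigger_idx, window=4):
            """Bag-of-words features from a sliding window of tokens on either
            side of an event trigger word, per the shallow features above."""
            lo = max(0, trigger_idx - window)
            hi = min(len(tokens), trigger_idx + window + 1)
            return {"bow=" + tokens[i].lower(): 1.0
                    for i in range(lo, hi) if i != trigger_idx}

        tokens = "inhibition of IkappaBalpha phosphorylation in these cells".split()
        features = trigger_window_bow(tokens, tokens.index("phosphorylation"))
        print(sorted(features))  # e.g. ['bow=cells', 'bow=ikappabalpha', 'bow=in', ...]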

  17. Event Structure and Cognitive Control

    ERIC Educational Resources Information Center

    Reimer, Jason F.; Radvansky, Gabriel A.; Lorsbach, Thomas C.; Armendarez, Joseph J.

    2015-01-01

    Recently, a great deal of research has demonstrated that although everyday experience is continuous in nature, it is parsed into separate events. The aim of the present study was to examine whether event structure can influence the effectiveness of cognitive control. Across 5 experiments we varied the structure of events within the AX-CPT by…

  18. Toward a Dynamic, Multidimensional Research Framework for Strategic Processing

    ERIC Educational Resources Information Center

    Dinsmore, Daniel L.

    2017-01-01

    While the empirical literature on strategic processing is vast, understanding how and why certain strategies work for certain learners is far from clear. The purpose of this review is to systematically examine the theoretical and empirical literature on strategic processing to parse out current conceptual and methodological progress to inform new…

  19. How Teachers Teach: Mapping the Terrain of Practice

    ERIC Educational Resources Information Center

    Sykes, Gary; Wilson, Suzanne

    2015-01-01

    This paper--conceived as a framework for competencies in teaching--represents an interpretive synthesis by the authors of main and contemporary currents in the research on teaching and learning. The framework resulting from this review parses teaching into two main domains--instruction and role responsibilities--within each of which a set of broad…

  20. Graduation Rates: Real Kids, Real Numbers

    ERIC Educational Resources Information Center

    Swanson, Christopher B.

    2004-01-01

    Controversies over graduation rates and No Child Left Behind have raged in research, media and political circles for almost a year. All too often, though, when complex issues of social and economic importance collide with policy and politics, heat is generated but little light. As a result, it may be difficult for local educators to parse the…

  1. Hardware independence checkout software

    NASA Technical Reports Server (NTRS)

    Cameron, Barry W.; Helbig, H. R.

    1990-01-01

    ACSI has developed a program utilizing CLIPS to assess compliance with various programming standards. Essentially, the program parses C code to extract the names of all function calls. These are asserted as CLIPS facts, which also include information about line numbers, source file names, and called functions. Rules have been devised to identify called functions that are not defined anywhere in the parsed source. These are compared against lists of standards (represented as facts) using rules that check intersections and/or unions of these lists. By piping the output into other processes, the source is appropriately commented through generated scripts that are then executed.
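
    A minimal Python sketch of the call-extraction and audit step follows; the regular expression, the allowed-function list, and the sample source are all illustrative, and the CLIPS fact/rule machinery the abstract describes is replaced by plain set operations.

        import re

        ALLOWED = {"printf", "malloc", "free"}          # hypothetical standard
        KEYWORDS = {"if", "for", "while", "switch", "return", "sizeof"}
        CALL_RE = re.compile(r"\b([A-Za-z_]\w*)\s*\(")

        def called_functions(c_source):
            """Crudely extract identifiers that appear in call position."""
            return {m.group(1) for m in CALL_RE.finditer(c_source)} - KEYWORDS

        def audit(c_source, defined):
            """Report calls neither defined locally nor on the allowed list."""
            return sorted(called_functions(c_source) - defined - ALLOWED)

        src = 'int main(void) { log_msg(1); printf("ok"); return 0; }'
        print(audit(src, defined={"main"}))             # -> ['log_msg']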

  2. Metacoder: An R package for visualization and manipulation of community taxonomic diversity data.

    PubMed

    Foster, Zachary S L; Sharpton, Thomas J; Grünwald, Niklaus J

    2017-02-01

    Community-level data, the type generated by an increasing number of metabarcoding studies, is often graphed as stacked bar charts or pie graphs that use color to represent taxa. These graph types do not convey the hierarchical structure of taxonomic classifications and are limited by the use of color for categories. As an alternative, we developed metacoder, an R package for easily parsing, manipulating, and graphing publication-ready plots of hierarchical data. Metacoder includes a dynamic and flexible function that can parse most text-based formats that contain taxonomic classifications, taxon names, taxon identifiers, or sequence identifiers. Metacoder can then subset, sample, and order this parsed data using a set of intuitive functions that take into account the hierarchical nature of the data. Finally, an extremely flexible plotting function enables quantitative representation of up to 4 arbitrary statistics simultaneously in a tree format by mapping statistics to the color and size of tree nodes and edges. Metacoder also allows exploration of barcode primer bias by integrating functions to run digital PCR. Although it has been designed for data from metabarcoding research, metacoder can easily be applied to any data that has a hierarchical component such as gene ontology or geographic location data. Our package complements currently available tools for community analysis and is provided open source with an extensive online user manual.
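
    The parsing step metacoder automates can be illustrated in a few lines of Python: classification strings are split on a delimiter and folded into a nested tree. The delimiter and sample rows are illustrative; metacoder's own parser handles many more formats.

        def parse_classifications(rows, sep=";"):
            """Fold 'A;B;C' classification strings into a nested taxonomy tree."""
            tree = {}
            for row in rows:
                node = tree
                for taxon in row.split(sep):
                    node = node.setdefault(taxon.strip(), {})
            return tree

        rows = ["Bacteria;Firmicutes;Bacilli", "Bacteria;Proteobacteria"]
        print(parse_classifications(rows))
        # -> {'Bacteria': {'Firmicutes': {'Bacilli': {}}, 'Proteobacteria': {}}}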

  3. Metacoder: An R package for visualization and manipulation of community taxonomic diversity data

    PubMed Central

    Foster, Zachary S. L.; Sharpton, Thomas J.

    2017-01-01

    Community-level data, the type generated by an increasing number of metabarcoding studies, is often graphed as stacked bar charts or pie graphs that use color to represent taxa. These graph types do not convey the hierarchical structure of taxonomic classifications and are limited by the use of color for categories. As an alternative, we developed metacoder, an R package for easily parsing, manipulating, and graphing publication-ready plots of hierarchical data. Metacoder includes a dynamic and flexible function that can parse most text-based formats that contain taxonomic classifications, taxon names, taxon identifiers, or sequence identifiers. Metacoder can then subset, sample, and order this parsed data using a set of intuitive functions that take into account the hierarchical nature of the data. Finally, an extremely flexible plotting function enables quantitative representation of up to 4 arbitrary statistics simultaneously in a tree format by mapping statistics to the color and size of tree nodes and edges. Metacoder also allows exploration of barcode primer bias by integrating functions to run digital PCR. Although it has been designed for data from metabarcoding research, metacoder can easily be applied to any data that has a hierarchical component such as gene ontology or geographic location data. Our package complements currently available tools for community analysis and is provided open source with an extensive online user manual. PMID:28222096

  4. Acoustic landmarks drive delta-theta oscillations to enable speech comprehension by facilitating perceptual parsing

    PubMed Central

    Doelling, Keith; Arnal, Luc; Ghitza, Oded; Poeppel, David

    2013-01-01

    A growing body of research suggests that intrinsic neuronal slow (< 10 Hz) oscillations in auditory cortex appear to track incoming speech and other spectro-temporally complex auditory signals. Within this framework, several recent studies have identified critical-band temporal envelopes as the specific acoustic feature being reflected by the phase of these oscillations. However, how this alignment between speech acoustics and neural oscillations might underpin intelligibility is unclear. Here we test the hypothesis that the ‘sharpness’ of temporal fluctuations in the critical band envelope acts as a temporal cue to speech syllabic rate, driving delta-theta rhythms to track the stimulus and facilitate intelligibility. Using magnetoencephalographic recordings, we show that by removing temporal fluctuations that occur at the syllabic rate, envelope-tracking activity is reduced. By artificially reinstating these temporal fluctuations, envelope-tracking activity is regained. These changes in tracking correlate with intelligibility of the stimulus. Together, the results suggest that the sharpness of fluctuations in the stimulus, as reflected in the cochlear output, drives oscillatory activity to track and entrain to the stimulus at its syllabic rate. We interpret our findings as evidence that sharp events in the stimulus cause cortical rhythms to re-align and parse the stimulus into syllable-sized chunks for further decoding. This process likely facilitates parsing of the stimulus into meaningful chunks appropriate for subsequent decoding, enhancing perception and intelligibility. PMID:23791839

  5. Yes! An object-oriented compiler compiler (YOOCC)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Avotins, J.; Mingins, C.; Schmidt, H.

    1995-12-31

    Grammar-based processor generation is one of the most widely studied areas in language processor construction. However, there have been very few approaches to date that reconcile object-oriented principles, processor generation, and an object-oriented language. Pertinent here also is that, currently, developing a processor using the Eiffel Parse libraries requires far too much time to be expended on tasks that can be automated. For these reasons, we have developed YOOCC (Yes! an Object-Oriented Compiler Compiler), which produces a processor framework from a grammar using an enhanced version of the Eiffel Parse libraries, incorporating the ideas hypothesized by Meyer, and by Grape and Walden, as well as many others. Various essential changes have been made to the Eiffel Parse libraries. Examples are presented to illustrate the development of a processor using YOOCC, and it is concluded that the Eiffel Parse libraries are now not only an intelligent, but also a productive, option for processor construction.

  6. Construction of a robust, large-scale, collaborative database for raw data in computational chemistry: the Collaborative Chemistry Database Tool (CCDBT).

    PubMed

    Chen, Mingyang; Stott, Amanda C; Li, Shenggang; Dixon, David A

    2012-04-01

    A robust metadata database called the Collaborative Chemistry Database Tool (CCDBT) for massive amounts of computational chemistry raw data has been designed and implemented. It performs data synchronization and simultaneously extracts the metadata. Computational chemistry data in various formats from different computing sources, software packages, and users can be parsed into uniform metadata for storage in a MySQL database. Parsing is performed by a parsing pyramid: parsers written for different levels of data types and data sets, created by the parser loader after it loads the parser engines and configurations. Copyright © 2011 Elsevier Inc. All rights reserved.
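
    One way to picture the parser-loader design is a registry that dispatches each raw-output format to its own parser and returns uniform metadata records; the Python sketch below assumes hypothetical format names and fields, not CCDBT's actual schema.

        PARSERS = {}

        def register(fmt):
            """Decorator that files a parser under its format name."""
            def wrap(fn):
                PARSERS[fmt] = fn
                return fn
            return wrap

        @register("nwchem")
        def parse_nwchem(text):
            # A real parser would walk the full output; we fake one field.
            energy = None
            for line in text.splitlines():
                if line.startswith("Total energy"):
                    energy = float(line.split()[-1])
            return {"code": "nwchem", "total_energy": energy}

        def to_metadata(fmt, text):
            """Dispatch raw text to the registered parser for its format."""
            if fmt not in PARSERS:
                raise ValueError("no parser registered for format: " + fmt)
            return PARSERS[fmt](text)

        print(to_metadata("nwchem", "Total energy = -76.0267"))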

  7. Speed up of XML parsers with PHP language implementation

    NASA Astrophysics Data System (ADS)

    Georgiev, Bozhidar; Georgieva, Adriana

    2012-11-01

    In this paper, the authors introduce PHP5's XML implementation and show how to read, parse, and write a short and uncomplicated XML file using SimpleXML in a PHP environment. The possibilities for combined use of the PHP5 language and the XML standard are described. The details of the parsing process with SimpleXML are also explained. A practical PHP-XML-MySQL project presents the advantages of XML implementation in PHP modules. This approach allows comparatively simple searching of XML hierarchical data by means of PHP software tools. The proposed project includes a database, which can be extended with new data and new XML parsing functions.
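
    For readers outside the PHP ecosystem, the same read-parse-modify-write round trip can be sketched with Python's standard library; the document content below is made up, and the point is only the shape of the workflow.

        import xml.etree.ElementTree as ET

        doc = ET.fromstring(
            "<books><book id='1'><title>XML Basics</title></book></books>")
        for book in doc.findall("book"):                # read and parse
            print(book.get("id"), book.findtext("title"))
        doc.append(ET.fromstring(                       # modify
            "<book id='2'><title>PHP and XML</title></book>"))
        ET.ElementTree(doc).write("books.xml")          # write back out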

  8. Integrated Processing in Planning and Understanding.

    DTIC Science & Technology

    1986-12-01

    to language analysis seemed necessary. The second observation was the rather commonsense one that it is easier to understand a foreign language ...syntactic analysis. Probably the most widely employed method for natural language analysis is augmented transition network parsing, or ATNs (Thorne, Bratley... accomplished. It is for this reason that the programming language Prolog, which implements that general method, has proven so well-suited to writing ATN

  9. Mention Detection: Heuristics for the OntoNotes Annotations

    DTIC Science & Technology

    2011-01-01

    Mention Detection: Heuristics for the OntoNotes annotations. Jonathan K. Kummerfeld, Mohit Bansal, David Burkett and Dan Klein, Computer Science...considered the provided parses and parses produced by the Berkeley parser (Petrov et al., 2006) trained on the provided training data. We added a

  10. Simultaneous Translation: Idiom Interpretation and Parsing Heuristics.

    ERIC Educational Resources Information Center

    McDonald, Janet L.; Carpenter, Patricia A.

    1981-01-01

    Presents a model of interpretation, parsing and error recovery in simultaneous translation using two experts and two amateur German-English bilingual translators orally translating from English to German. Argues that the translator first comprehends the text in English and divides it into meaningful units before translating. Study also…

  11. Combining Natural Language Processing and Statistical Text Mining: A Study of Specialized versus Common Languages

    ERIC Educational Resources Information Center

    Jarman, Jay

    2011-01-01

    This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms,…

  12. Statistical Clustering and the Contents of the Infant Vocabulary

    ERIC Educational Resources Information Center

    Swingley, Daniel

    2005-01-01

    Infants parse speech into word-sized units according to biases that develop in the first year. One bias, present before the age of 7 months, is to cluster syllables that tend to co-occur. The present computational research demonstrates that this statistical clustering bias could lead to the extraction of speech sequences that are actual words,…

  13. Effects of Tasks on BOLD Signal Responses to Sentence Contrasts: Review and Commentary

    ERIC Educational Resources Information Center

    Caplan, David; Gow, David

    2012-01-01

    Functional neuroimaging studies of syntactic processing have been interpreted as identifying the neural locations of parsing and interpretive operations. However, current behavioral studies of sentence processing indicate that many operations occur simultaneously with parsing and interpretation. In this review, we point to issues that arise in…

  14. Applications of Parsing Theory to Computer-Assisted Instruction.

    ERIC Educational Resources Information Center

    Markosian, Lawrence Z.; Ager, Tryg A.

    1983-01-01

    Applications of an LR-1 parsing algorithm to intelligent programs for computer assisted instruction in symbolic logic and foreign languages are discussed. The system has been adequately used for diverse instructional applications, including analysis of student input, generation of pattern drills, and modeling the student's understanding of the…

  15. Parsing the Practice of Teaching

    ERIC Educational Resources Information Center

    Kennedy, Mary

    2016-01-01

    Teacher education programs typically teach novices about one part of teaching at a time. We might offer courses on different topics--cultural foundations, learning theory, or classroom management--or we may parse teaching practice itself into a set of discrete techniques, such as core teaching practices, that can be taught individually. Missing…

  16. Learning for Semantic Parsing Using Statistical Syntactic Parsing Techniques

    DTIC Science & Technology

    2010-05-01

    Workshop on Supervisory Control of Learning and Adaptive Systems. San Jose, CA. Roland Kuhn and Renato De Mori (1995). The application of semantic...Processing (EMNLP-09), pp. 1–10. Suntec, Singapore. Ana-Maria Popescu, Alex Armanasu, Oren Etzioni, David Ko and Alexander Yates (2004). Modern natural

  17. Observations on positivism and pseudoscience in qualitative nursing research.

    PubMed

    Johnson, M

    1999-07-01

    In this paper I will examine the boundaries between positivism, interpretivism and pseudoscience, arguing that some qualitative researchers may risk the credibility of nursing research by utilizing concepts from the margins of science. There are two major threats to the perceived rigour and credibility of qualitative research in its many forms. First is a trend in some work towards a mystical view of both the methods and the content of the qualitative enterprise. This can be detected, I will argue, in the work of Rosemary Parse in particular. The second potentially damaging trend is almost its epistemological opposite, towards excessive reliance on precise procedures, strict definitions and verification exemplified by Juliet Corbin and others. I will suggest that this is nothing to fear, but something to be clear about. This is not social constructionism or interpretivism but a 'qualitative' version of positivism. The paper concludes that students and researchers should be cautious in the uncritical acceptance of theories and 'research' which approach the boundaries of pseudoscience on the one hand, and 'hard' science on the other.

  18. A Semantic Analysis Method for Scientific and Engineering Code

    NASA Technical Reports Server (NTRS)

    Stewart, Mark E. M.

    1998-01-01

    This paper develops a procedure to statically analyze aspects of the meaning or semantics of scientific and engineering code. The analysis involves adding semantic declarations to a user's code and parsing this semantic knowledge with the original code using multiple expert parsers. These semantic parsers are designed to recognize formulae in different disciplines including physical and mathematical formulae and geometrical position in a numerical scheme. In practice, a user would submit code with semantic declarations of primitive variables to the analysis procedure, and its semantic parsers would automatically recognize and document some static, semantic concepts and locate some program semantic errors. A prototype implementation of this analysis procedure is demonstrated. Further, the relationship between the fundamental algebraic manipulations of equations and the parsing of expressions is explained. This ability to locate some semantic errors and document semantic concepts in scientific and engineering code should reduce the time, risk, and effort of developing and using these codes.
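
    A toy example of the kind of static semantic check the paper describes is a dimensional-consistency test: declared dimensions for primitive variables are propagated through a formula and verified. The variable names and dimension vectors below are illustrative, not the paper's notation.

        DECLS = {
            "force": {"mass": 1, "length": 1, "time": -2},
            "mass":  {"mass": 1},
            "accel": {"length": 1, "time": -2},
        }

        def mul(a, b):
            """Dimension of a product: add exponents, drop zero entries."""
            out = dict(a)
            for dim, exp in b.items():
                out[dim] = out.get(dim, 0) + exp
            return {d: e for d, e in out.items() if e}

        # Statically check the assignment: force = mass * accel
        assert mul(DECLS["mass"], DECLS["accel"]) == DECLS["force"]
        print("force = mass * accel is dimensionally consistent")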

  19. Weighting Statistical Inputs for Data Used to Support Effective Decision Making During Severe Emergency Weather and Environmental Events

    NASA Technical Reports Server (NTRS)

    Gardner, Adrian

    2010-01-01

    National Aeronautics and Space Administration (NASA) weather and atmospheric environmental organizations are insatiable consumers of geophysical, hydrometeorological and solar weather statistics. The expanding array of internetworked sensors producing targeted physical measurements has generated an almost factorial explosion of near real-time inputs to topical statistical datasets. Normalizing and value-based parsing of such statistical datasets in support of time-constrained weather and environmental alerts and warnings is essential, even with dedicated high-performance computational capabilities. What are the optimal indicators for advanced decision making? How do we recognize the line between sufficient statistical sampling and excessive, mission-destructive sampling? How do we assure that the normalization and parsing process, when interpolated through numerical models, yields accurate and actionable alerts and warnings? This presentation will address the integrated means and methods to achieve desired outputs for NASA and consumers of its data.

  20. Context Modulates Attention to Social Scenes in Toddlers with Autism

    ERIC Educational Resources Information Center

    Chawarska, Katarzyna; Macari, Suzanne; Shic, Frederick

    2012-01-01

    Background: In typical development, the unfolding of social and communicative skills hinges upon the ability to allocate and sustain attention toward people, a skill present moments after birth. Deficits in social attention have been well documented in autism, though the underlying mechanisms are poorly understood. Methods: In order to parse the…

  1. Interactive Cohort Identification of Sleep Disorder Patients Using Natural Language Processing and i2b2

    PubMed Central

    Chen, W.; Kowatch, R.; Lin, S.; Splaingard, M.

    2015-01-01

    Summary Nationwide Children’s Hospital established an i2b2 (Informatics for Integrating Biology & the Bedside) application for sleep disorder cohort identification. Discrete data were gleaned from semi-structured sleep study reports. The system was shown to work more efficiently than the traditional manual chart review method, and it also enabled searching capabilities that were previously not possible. Objective We report on the development and implementation of the sleep disorder i2b2 cohort identification system using natural language processing of semi-structured documents. Methods We developed a natural language processing approach to automatically parse concepts and their values from semi-structured sleep study documents. Two parsers were developed: a regular expression parser for extracting numeric concepts and an NLP-based tree parser for extracting textual concepts. Concepts were further organized into i2b2 ontologies based on document structures and in-domain knowledge. Results 26,550 concepts were extracted, with 99% being textual concepts. 1.01 million facts were extracted from sleep study documents, such as demographic information, sleep study lab results, medications, procedures, and diagnoses, among others. The average accuracy of terminology parsing was over 83% when compared against that of experts. The system is capable of capturing both standard and non-standard terminologies. The time for cohort identification has been reduced significantly, from a few weeks to a few seconds. Conclusion Natural language processing was shown to be powerful for quickly converting large amounts of semi-structured or unstructured clinical data into discrete concepts, which, in combination with intuitive domain-specific ontologies, allows fast and effective interactive cohort identification through the i2b2 platform for research and clinical use. PMID:26171080
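
    The regular-expression parser for numeric concepts can be pictured as below; the concept names, patterns, and sample report line are hypothetical, not the ontology used at Nationwide Children's Hospital.

        import re

        PATTERNS = {
            "AHI": re.compile(r"AHI[:\s]+(\d+(?:\.\d+)?)", re.I),
            "SpO2_nadir": re.compile(r"SpO2 nadir[:\s]+(\d+)\s*%", re.I),
        }

        def extract_numeric_concepts(report_text):
            """Pull (concept, value) facts out of semi-structured report text."""
            facts = {}
            for concept, pattern in PATTERNS.items():
                m = pattern.search(report_text)
                if m:
                    facts[concept] = float(m.group(1))
            return facts

        print(extract_numeric_concepts("Overall AHI: 12.4 events/hr. SpO2 nadir: 84 %."))
        # -> {'AHI': 12.4, 'SpO2_nadir': 84.0}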

  2. SU-E-T-473: A Patient-Specific QC Paradigm Based On Trajectory Log Files and DICOM Plan Files

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    DeMarco, J; McCloskey, S; Low, D

    Purpose: To evaluate a remote QC tool for monitoring treatment machine parameters and treatment workflow. Methods: The Varian TrueBeamTM linear accelerator is a digital machine that records machine axis parameters and MLC leaf positions as a function of delivered monitor unit or control point. This information is saved to a binary trajectory log file for every treatment or imaging field in the patient treatment session. A MATLAB analysis routine was developed to parse the trajectory log files for a given patient, compare the expected versus actual machine and MLC positions, and perform a cross-comparison with the DICOM-RT plan file exported from the treatment planning system. The parsing routine sorts the trajectory log files based on the time and date stamp and generates a sequential report file listing treatment parameters and providing a match relative to the DICOM-RT plan file. Results: The trajectory log parsing routine was compared against a standard record-and-verify listing for patients undergoing initial IMRT dosimetry verification and weekly and final chart QC. The complete treatment course was independently verified for 10 patients of varying treatment sites, and a total of 1267 treatment fields were evaluated, including pre-treatment imaging fields where applicable. In the context of IMRT plan verification, eight prostate SBRT plans with 4 arcs per plan were evaluated based on expected versus actual machine axis parameters. The average value for the maximum RMS MLC error was 0.067±0.001mm and 0.066±0.002mm for leaf banks A and B, respectively. Conclusion: A real-time QC analysis program was tested using trajectory log files and DICOM-RT plan files. The parsing routine is efficient and able to evaluate all relevant machine axis parameters during a patient treatment course, including MLC leaf positions and table positions at the time of image acquisition and during treatment.
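
    The expected-versus-actual comparison at the heart of the routine reduces to a per-leaf RMS over control points; a small Python/NumPy sketch with synthetic positions follows (the paper's own implementation is in MATLAB, and the array shapes here are assumptions).

        import numpy as np

        def max_rms_leaf_error(expected, actual):
            """Worst per-leaf RMS of (expected - actual) MLC positions, in mm.
            Arrays are (control_points, leaves); data below is synthetic."""
            rms_per_leaf = np.sqrt(np.mean((expected - actual) ** 2, axis=0))
            return rms_per_leaf.max()

        rng = np.random.default_rng(0)
        expected = rng.uniform(-100, 100, size=(200, 60))   # 200 control points
        actual = expected + rng.normal(0.0, 0.05, size=expected.shape)
        print(f"max RMS leaf error: {max_rms_leaf_error(expected, actual):.3f} mm")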

  3. Domain Adaptation of Parsing for Operative Notes

    PubMed Central

    Wang, Yan; Pakhomov, Serguei; Ryan, James O.; Melton, Genevieve B.

    2016-01-01

    Background Full syntactic parsing of clinical text as a part of clinical natural language processing (NLP) is critical for a wide range of applications, such as identification of adverse drug reactions, patient cohort identification, and gene interaction extraction. Several robust syntactic parsers are publicly available to produce linguistic representations for sentences. However, these existing parsers are mostly trained on general English text and often require adaptation for optimal performance on clinical text. Our objective was to adapt an existing general English parser for the clinical text of operative reports via lexicon augmentation, statistics adjustment, and grammar rule modification based on a set of biomedical text. Method The Stanford unlexicalized probabilistic context-free grammar (PCFG) parser lexicon was expanded with the SPECIALIST lexicon, along with statistics collected from a limited set of operative notes tagged with two POS taggers (GENIA tagger and MedPost). The most frequently occurring verb entries of the SPECIALIST lexicon were adjusted based on manual review of verb usage in operative notes. Stanford parser grammar production rules were also modified based on linguistic features of operative reports. An analogous approach was then applied to the GENIA corpus to test the generalizability of this approach to biomedical text. Results The new unlexicalized PCFG parser, extended with the extra lexicon from SPECIALIST along with accurate statistics collected from an operative note corpus tagged with the GENIA POS tagger, improved the parser performance by 2.26%, from 87.64% to 89.90%. There was a progressive improvement with the addition of multiple approaches. Most of the improvement occurred with lexicon augmentation combined with statistics from the operative notes corpus. Application of this approach to the GENIA corpus showed that parsing performance was boosted by 3.81% with a simple new grammar and the addition of the GENIA corpus lexicon. Conclusion Using statistics collected from clinical text tagged with POS taggers, along with proper modification of the grammars and lexicons of an unlexicalized PCFG parser, can improve parsing performance. PMID:25661593

  4. Harmony Search Algorithm for Word Sense Disambiguation.

    PubMed

    Abed, Saad Adnan; Tiun, Sabrina; Omar, Nazlia

    2015-01-01

    Word Sense Disambiguation (WSD) is the task of determining which sense of an ambiguous word (a word with multiple meanings) is chosen in a particular use of that word, by considering its context. A sentence is considered ambiguous if it contains ambiguous word(s). Practically, any sentence that has been classified as ambiguous usually has multiple interpretations, but just one of them presents the correct interpretation. We propose an unsupervised method that exploits knowledge-based approaches for word sense disambiguation using the Harmony Search Algorithm (HSA) based on a Stanford dependencies generator (HSDG). The role of the dependency generator is to parse sentences to obtain their dependency relations, whereas the goal of using the HSA is to maximize the overall semantic similarity of the set of parsed words. HSA invokes a combination of semantic similarity and relatedness measurements, i.e., Jiang and Conrath (jcn) and an adapted Lesk algorithm, to perform the HSA fitness function. Our proposed method was evaluated on benchmark datasets, which yielded results comparable to the state-of-the-art WSD methods. In order to evaluate the effectiveness of the dependency generator, we perform the same methodology without the parser, but with a window of words. The empirical results demonstrate that the proposed method is able to produce effective solutions for most instances of the datasets used.
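
    A generic harmony-search skeleton for the sense-selection step is sketched below; the fitness function is a stand-in, where the paper scores harmonies with jcn and adapted-Lesk similarity over the parsed dependencies.

        import random

        def harmony_search(sense_counts, fitness, iters=500, hmcr=0.9, par=0.3, hms=10):
            """Each 'harmony' assigns one sense index per ambiguous word; the
            search keeps a memory of good harmonies and improvises new ones."""
            memory = [[random.randrange(n) for n in sense_counts] for _ in range(hms)]
            for _ in range(iters):
                new = [random.choice(memory)[i] if random.random() < hmcr
                       else random.randrange(n) for i, n in enumerate(sense_counts)]
                if random.random() < par:               # pitch adjustment
                    j = random.randrange(len(sense_counts))
                    new[j] = random.randrange(sense_counts[j])
                worst = min(range(hms), key=lambda k: fitness(memory[k]))
                if fitness(new) > fitness(memory[worst]):
                    memory[worst] = new
            return max(memory, key=fitness)

        # Toy run: a fitness that simply prefers low sense indices everywhere.
        print(harmony_search([3, 4, 2], fitness=lambda h: -sum(h)))  # tends toward [0, 0, 0]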

  5. Harmony Search Algorithm for Word Sense Disambiguation

    PubMed Central

    Abed, Saad Adnan; Tiun, Sabrina; Omar, Nazlia

    2015-01-01

    Word Sense Disambiguation (WSD) is the task of determining which sense of an ambiguous word (a word with multiple meanings) is chosen in a particular use of that word, by considering its context. A sentence is considered ambiguous if it contains ambiguous word(s). Practically, any sentence that has been classified as ambiguous usually has multiple interpretations, but just one of them presents the correct interpretation. We propose an unsupervised method that exploits knowledge-based approaches for word sense disambiguation using the Harmony Search Algorithm (HSA) based on a Stanford dependencies generator (HSDG). The role of the dependency generator is to parse sentences to obtain their dependency relations, whereas the goal of using the HSA is to maximize the overall semantic similarity of the set of parsed words. HSA invokes a combination of semantic similarity and relatedness measurements, i.e., Jiang and Conrath (jcn) and an adapted Lesk algorithm, to perform the HSA fitness function. Our proposed method was evaluated on benchmark datasets, which yielded results comparable to the state-of-the-art WSD methods. In order to evaluate the effectiveness of the dependency generator, we perform the same methodology without the parser, but with a window of words. The empirical results demonstrate that the proposed method is able to produce effective solutions for most instances of the datasets used. PMID:26422368

  6. Recursive Optimization of Digital Circuits

    DTIC Science & Technology

    1990-12-14

    Obverse Specification ... A-23. A.14 Non-MDS Optimization of SAMPLE ... A-24. Appendix B. BORIS Recursive Optimization System Software ... B-i. B.1 DESIGN.S File ... B-2. B.2 PARSE.S File ... B-11. B.3 TABULAR.S File ... B-22. B.4 MDS.S File ... B-28. B.5 COST.S File

  7. Time-Driven Effects on Parsing during Reading

    ERIC Educational Resources Information Center

    Roll, Mikael; Lindgren, Magnus; Alter, Kai; Horne, Merle

    2012-01-01

    The phonological trace of perceived words starts fading away in short-term memory after a few seconds. Spoken utterances are usually 2-3 s long, possibly to allow the listener to parse the words into coherent prosodic phrases while they still have a clear representation. Results from this brain potential study suggest that even during silent…

  8. The Effect of Exposure on Syntactic Parsing in Spanish-English Bilinguals

    ERIC Educational Resources Information Center

    Dussias, Paola E.; Sagarra, Nuria

    2007-01-01

    An eye tracking experiment examined how exposure to a second language (L2) influences sentence parsing in the first language. Forty-four monolingual Spanish speakers, 24 proficient Spanish-English bilinguals with limited immersion experience in the L2 environment and 20 proficient Spanish-English bilinguals with extensive L2 immersion experience…

  9. A python tool for the implementation of domain-specific languages

    NASA Astrophysics Data System (ADS)

    Dejanović, Igor; Vaderna, Renata; Milosavljević, Gordana; Simić, Miloš; Vuković, Željko

    2017-07-01

    In this paper we describe textX, a meta-language and a tool for building Domain-Specific Languages. It is implemented in Python using Arpeggio PEG (Parsing Expression Grammar) parser library. From a single language description (grammar) textX will build a parser and a meta-model (a.k.a. abstract syntax) of the language. The parser is used to parse textual representations of models conforming to the meta-model. As a result of parsing, a Python object graph will be automatically created. The structure of the object graph will conform to the meta-model defined by the grammar. This approach frees a developer from the need to manually analyse a parse tree and transform it to other suitable representation. The textX library is independent of any integrated development environment and can be easily integrated in any Python project. The textX tool works as a grammar interpreter. The parser is configured at run-time using the grammar. The textX tool is a free and open-source project available at GitHub.
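
    A minimal end-to-end example of the workflow described above is shown below; the little point-list language is made up for illustration, while the textX calls are the library's documented entry points.

        from textx import metamodel_from_str

        grammar = """
        Model: points+=Point;
        Point: 'point' x=INT ',' y=INT ';';
        """

        mm = metamodel_from_str(grammar)        # grammar -> meta-model + parser
        model = mm.model_from_str("point 1, 2; point 3, 4;")   # text -> objects
        for p in model.points:
            print(p.x, p.y)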

  10. Advances in Hispanic Linguistics: Papers from the Hispanic Linguistics Symposium (2nd, Columbus, OH, October 9-11, 1998). Volumes 1-2.

    ERIC Educational Resources Information Center

    Gutierrez-Rexach, Javier, Ed.; Martinez-Gil, Fernando, Ed.

    Papers from the 1998 Hispanic Linguistics Symposium include: "Patterns of Gender Agreement in the Speech of Second Language Learners"; "'Nomas' in Mexican American Dialect"; "Parsing Spanish 'solo'"; "On Levels of Processing and Levels of Comprehension"; "The Role of Attention in Second/Foreign Language Classroom Research: Methodological Issues";…

  11. Freedom: A Promise of Possibility.

    PubMed

    Bunkers, Sandra Schmidt

    2015-10-01

    The idea of freedom as a promise of possibility is explored in this column. The core concepts from a research study on considering tomorrow (Bunkers, 1998) coupled with humanbecoming community change processes (Parse, 2003) are used to illuminate this notion. The importance of intentionality in human freedom is discussed from both a human science and a natural science perspective. © The Author(s) 2015.

  12. "It Could Have Been so Much Better": The Aesthetic and Social Work of Theatre

    ERIC Educational Resources Information Center

    Gallagher, Kathleen; Freeman, Barry; Wessells, Anne

    2010-01-01

    In this paper, the authors consider early results from their ethnographic research in urban drama classrooms by parsing the aesthetic and social imperatives at play in the classroom. Moved by the observation that teachers and students alike seem to be pursuing elusive aesthetic and social ideals, the authors draw on Judith Butler's notion of…

  13. Some Educational Implications from Research on Story Grammar and Story Comprehension.

    ERIC Educational Resources Information Center

    Freedman, Jonathan M.; Owings, Richard A.

    Folk tales were read to 32 kindergarten children of varying levels of language ability, as measured by the language scale of the Metropolitan Readiness Test. Recall protocols were parsed into the categories described by N. L. Stein and C. G. Glenn. Low ability children were found to be less likely to recall details of "internal plan" and…

  14. RadSearch: a RIS/PACS integrated query tool

    NASA Astrophysics Data System (ADS)

    Tsao, Sinchai; Documet, Jorge; Moin, Paymann; Wang, Kevin; Liu, Brent J.

    2008-03-01

    Radiology Information Systems (RIS) contain a wealth of information that can be used for research, education, and practice management. However, the sheer amount of information available makes querying specific data difficult and time consuming. Previous work has shown that a clinical RIS database and its RIS text reports can be extracted, duplicated and indexed for searches while complying with HIPAA and IRB requirements. This project's intent is to provide a software tool, the RadSearch Toolkit, to allow intelligent indexing and parsing of RIS reports for easy yet powerful searches. In addition, the project aims to seamlessly query and retrieve associated images from the Picture Archiving and Communication System (PACS) in situations where an integrated RIS/PACS is in place - even subselecting individual series, such as in an MRI study. RadSearch's application of simple text parsing techniques to index text-based radiology reports will allow the search engine to quickly return relevant results. This powerful combination will be useful in both private practice and academic settings; administrators can easily obtain complex practice management information such as referral patterns; researchers can conduct retrospective studies with specific, multiple criteria; teaching institutions can quickly and effectively create thorough teaching files.
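
    A toy inverted index conveys the kind of simple text parsing and indexing the abstract mentions (this is not RadSearch's actual implementation; report IDs and text are invented).

        import re
        from collections import defaultdict

        def build_index(reports):
            """Map each word to the set of report IDs containing it."""
            index = defaultdict(set)
            for rid, text in reports.items():
                for word in set(re.findall(r"[a-z0-9]+", text.lower())):
                    index[word].add(rid)
            return index

        reports = {101: "MRI brain: no acute infarct.",
                   102: "CT chest: small right pleural effusion."}
        index = build_index(reports)
        print(sorted(index["mri"]), sorted(index["effusion"]))   # [101] [102]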

  15. A Method for Analyzing Commonalities in Clinical Trial Target Populations

    PubMed Central

    He, Zhe; Carini, Simona; Hao, Tianyong; Sim, Ida; Weng, Chunhua

    2014-01-01

    ClinicalTrials.gov presents great opportunities for analyzing commonalities in clinical trial target populations to facilitate knowledge reuse when designing eligibility criteria of future trials or to reveal potential systematic biases in selecting population subgroups for clinical research. Towards this goal, this paper presents a novel data resource for enabling such analyses. Our method includes two parts: (1) parsing and indexing eligibility criteria text; and (2) mining common eligibility features and attributes of common numeric features (e.g., A1c). We designed and built a database called “Commonalities in Target Populations of Clinical Trials” (COMPACT), which stores structured eligibility criteria and trial metadata in a readily computable format. We illustrate its use in an example analytic module called CONECT using COMPACT as the backend. Type 2 diabetes is used as an example to analyze commonalities in the target populations of 4,493 clinical trials on this disease. PMID:25954450
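
    Parsing a numeric eligibility feature such as A1c can be sketched with a single regular expression; the pattern, feature names, and criteria text below are hypothetical, not COMPACT's actual parsing rules.

        import re

        NUM_CRIT = re.compile(
            r"(?P<feature>HbA1c|A1c|BMI)\s*(?P<op><=|>=|<|>|between)\s*"
            r"(?P<v1>\d+(?:\.\d+)?)(?:%?\s*and\s*(?P<v2>\d+(?:\.\d+)?))?", re.I)

        def parse_numeric_criteria(text):
            """Extract (feature, operator, value1, value2) tuples from criteria."""
            return [(m["feature"], m["op"].lower(), m["v1"], m["v2"])
                    for m in NUM_CRIT.finditer(text)]

        print(parse_numeric_criteria("Inclusion: HbA1c between 7.0 and 10.5; BMI < 45"))
        # -> [('HbA1c', 'between', '7.0', '10.5'), ('BMI', '<', '45', None)]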

  16. Perception of object trajectory: parsing retinal motion into self and object movement components.

    PubMed

    Warren, Paul A; Rushton, Simon K

    2007-08-16

    A moving observer needs to be able to estimate the trajectory of other objects moving in the scene. Without the ability to do so, it would be difficult to avoid obstacles or catch a ball. We hypothesized that neural mechanisms sensitive to the patterns of motion generated on the retina during self-movement (optic flow) play a key role in this process, "parsing" motion due to self-movement from that due to object movement. We investigated this "flow parsing" hypothesis by measuring the perceived trajectory of a moving probe placed within a flow field that was consistent with movement of the observer. In the first experiment, the flow field was consistent with an eye rotation; in the second experiment, it was consistent with a lateral translation of the eyes. We manipulated the distance of the probe in both experiments and assessed the consequences. As predicted by the flow parsing hypothesis, manipulating the distance of the probe had differing effects on the perceived trajectory of the probe in the two experiments. The results were consistent with the scene geometry and the type of simulated self-movement. In a third experiment, we explored the contribution of local and global motion processing to the results of the first two experiments. The data suggest that the parsing process involves global motion processing, not just local motion contrast. The findings of this study support a role for optic flow processing in the perception of object movement during self-movement.

  17. Error-Correcting Parsing for Syntactic Pattern Recognition

    DTIC Science & Technology

    1977-08-01

    1971. 55. Siromoney, G., Siromoney, R., and K. Krithivasan, "Abstract Families of Matrices and Picture Languages," Computer Graphics and Image... [remainder of snippet is OCR-garbled parser trace output, roughly: TIME USED FOR LINKING A TREE .186 SEC ... INPUT CHARACTER IS A DISTANCE FROM NORMAL A ... TIME USED FOR PARSING ... SEC]

  18. Perceiving Event Dynamics and Parsing Hollywood Films

    ERIC Educational Resources Information Center

    Cutting, James E.; Brunick, Kaitlin L.; Candan, Ayse

    2012-01-01

    We selected 24 Hollywood movies released from 1940 through 2010 to serve as a film corpus. Eight viewers, three per film, parsed them into events, which are best termed subscenes. While watching a film a second time, viewers scrolled through frames and recorded the frame number where each event began. Viewers agreed about 90% of the time. We then…

  19. Parsing Protocols Using Problem Solving Grammars. AI Memo 385.

    ERIC Educational Resources Information Center

    Miller, Mark L.; Goldstein, Ira P.

    A theory of the planning and debugging of computer programs is formalized as a context free grammar, which is used to reveal the constituent structure of problem solving episodes by parsing protocols in which programs are written, tested, and debugged. This is illustrated by the detailed analysis of an actual session with a beginning student…

  20. Parsing in a Dynamical System: An Attractor-Based Account of the Interaction of Lexical and Structural Constraints in Sentence Processing.

    ERIC Educational Resources Information Center

    Tabor, Whitney; And Others

    1997-01-01

    Proposes a dynamical systems approach to parsing in which syntactic hypotheses are associated with attractors in a metric space. The experiments discussed documented various contingent frequency effects that cut across traditional linguistic grains, each of which was predicted by the dynamical systems model. (47 references) (Author/CK)

  1. Effects of Prosodic and Lexical Constraints on Parsing in Young Children (and Adults)

    ERIC Educational Resources Information Center

    Snedeker, Jesse; Yuan, Sylvia

    2008-01-01

    Prior studies of ambiguity resolution in young children have found that children rely heavily on lexical information but persistently fail to use referential constraints in online parsing [Trueswell, J.C., Sekerina, I., Hill, N.M., & Logrip, M.L, (1999). The kindergarten-path effect: Studying on-line sentence processing in young children.…

  2. Semantics Boosts Syntax in Artificial Grammar Learning Tasks with Recursion

    ERIC Educational Resources Information Center

    Fedor, Anna; Varga, Mate; Szathmary, Eors

    2012-01-01

    Center-embedded recursion (CER) in natural language is exemplified by sentences such as "The malt that the rat ate lay in the house." Parsing center-embedded structures is in the focus of attention because this could be one of the cognitive capacities that make humans distinct from all other animals. The ability to parse CER is usually…

  3. Writing filter processes for the SAGA editor, appendix G

    NASA Technical Reports Server (NTRS)

    Kirslis, Peter A.

    1985-01-01

    The SAGA editor provides a mechanism by which separate processes can be invoked during an editing session to traverse portions of the parse tree being edited. These processes, termed filter processes, read, analyze, and possibly transform the parse tree, returning the result to the editor. By defining new commands that invoke filter processes with the editor's user-defined command facility, authors of filters can provide complex operations as simple commands. A tree plotter, a pretty printer, and a Pascal tree transformation program have already been written using this facility. The filter processes are introduced, the parse tree structure is described, and the library interface made available to the programmer is presented. Also discussed is how to compile and run filter processes. Examples are presented to illustrate aspects of each of these areas.

  4. Using topography to meet wildlife and fuels treatment objectives in fire-suppressed landscapes

    Treesearch

    Emma C. Underwood; Joshua H. Viers; James F. Quinn; Malcolm North

    2010-01-01

    Past forest management practices, fire suppression, and climate change are increasing the need to actively manage California Sierra Nevada forests for multiple environmental amenities. Here we present a relatively low-cost, repeatable method for spatially parsing the landscape to help the U.S. Forest Service manage for different forest and fuel conditions to meet...

  5. Robust Deep Semantics for Language Understanding

    DTIC Science & Technology

    focus on five areas: deep learning, textual inferential relations, relation and event extraction by distant supervision, semantic parsing and...ontology expansion, and coreference resolution. As time went by, the program focus converged towards emphasizing technologies for knowledge base...natural logic methods for text understanding, improved mention coreference algorithms, and the further development of multilingual tools in CoreNLP.

  6. Parsing Heterogeneity in Autism Spectrum Disorders: Visual Scanning of Dynamic Social Scenes in School-Aged Children

    ERIC Educational Resources Information Center

    Rice, Katherine; Moriuchi, Jennifer M.; Jones, Warren; Klin, Ami

    2012-01-01

    Objective: To examine patterns of variability in social visual engagement and their relationship to standardized measures of social disability in a heterogeneous sample of school-aged children with autism spectrum disorders (ASD). Method: Eye-tracking measures of visual fixation during free-viewing of dynamic social scenes were obtained for 109…

  7. Exploiting graph kernels for high performance biomedical relation extraction.

    PubMed

    Panyam, Nagesh C; Verspoor, Karin; Cohn, Trevor; Ramamohanarao, Kotagiri

    2018-01-30

    Relation extraction from biomedical publications is an important task in the area of semantic mining of text. Kernel methods for supervised relation extraction are often preferred over manual feature engineering methods when classifying highly ordered structures such as trees and graphs obtained from syntactic parsing of a sentence. Tree kernels such as the Subset Tree Kernel and Partial Tree Kernel have been shown to be effective for classifying constituency parse trees and basic dependency parse graphs of a sentence. Graph kernels such as the All Path Graph kernel (APG) and Approximate Subgraph Matching (ASM) kernel have been shown to be suitable for classifying general graphs with cycles, such as the enhanced dependency parse graph of a sentence. In this work, we present a high-performance Chemical-Induced Disease (CID) relation extraction system. We present a comparative study of kernel methods for the CID task and also extend our study to the Protein-Protein Interaction (PPI) extraction task, an important biomedical relation extraction task. We discuss novel modifications to the ASM kernel to boost its performance and a method to apply graph kernels for extracting relations expressed in multiple sentences. Our system for CID relation extraction attains an F-score of 60%, without using external knowledge sources or task-specific heuristics or rules. In comparison, the state-of-the-art Chemical-Disease Relation Extraction system achieves an F-score of 56% using an ensemble of multiple machine learning methods, which is then boosted to 61% with a rule-based system employing task-specific post-processing rules. For the CID task, graph kernels outperform tree kernels substantially, and the best performance is obtained with the APG kernel, which attains an F-score of 60%, followed by the ASM kernel at 57%. The performance difference between the ASM and APG kernels for CID sentence-level relation extraction is not significant. In our evaluation of ASM for the PPI task, ASM performed better than the APG kernel for the BioInfer dataset in the Area Under Curve (AUC) measure (74% vs 69%). However, for all the other PPI datasets, namely AIMed, HPRD50, IEPA and LLL, ASM is substantially outperformed by the APG kernel in F-score and AUC measures. We demonstrate high-performance Chemical-Induced Disease relation extraction without employing external knowledge sources or task-specific heuristics. Our work shows that graph kernels are effective in extracting relations that are expressed in multiple sentences. We also show that the graph kernels, namely the ASM and APG kernels, substantially outperform the tree kernels. Among the graph kernels, we showed the ASM kernel to be effective for biomedical relation extraction, with comparable performance to the APG kernel for datasets such as CID sentence-level relation extraction and BioInfer in PPI. Overall, the APG kernel is shown to be significantly more accurate than the ASM kernel, achieving better performance on most datasets.
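
    The APG and ASM kernels are involved; the classic shortest-path graph kernel, a simpler member of the same family, conveys the basic idea of comparing parse graphs by their labelled paths. The toy graphs and labels below are invented, not drawn from the paper.

        def sp_lengths(adj):
            """All-pairs shortest path lengths via Floyd-Warshall (small graphs)."""
            nodes, inf = list(adj), float("inf")
            d = {(u, v): 0 if u == v else (1 if v in adj[u] else inf)
                 for u in nodes for v in nodes}
            for k in nodes:
                for u in nodes:
                    for v in nodes:
                        if d[u, k] + d[k, v] < d[u, v]:
                            d[u, v] = d[u, k] + d[k, v]
            return d

        def features(adj, lab):
            """(label, label, path-length) triples for all connected node pairs."""
            d = sp_lengths(adj)
            return [(lab[u], lab[v], l) for (u, v), l in d.items()
                    if u != v and l != float("inf")]

        def sp_kernel(adj1, lab1, adj2, lab2):
            """Count matching triples between the two graphs."""
            f1, f2 = features(adj1, lab1), features(adj2, lab2)
            return sum(f1.count(t) for t in f2)

        g = {0: {1}, 1: {0, 2}, 2: {1}}                 # CHEM - induces - DIS
        lab = {0: "CHEM", 1: "induces", 2: "DIS"}
        print(sp_kernel(g, lab, g, lab))                # self-similarity: 6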

  8. The Effects of Age of Immersion and Working Memory on Second Language Processing of Island Constraints: An Eye-Movement Study

    ERIC Educational Resources Information Center

    Jung, Sehoon

    2017-01-01

    One of the central questions in recent second language processing research is whether the types of parsing heuristics and linguistic resources adult L2 learners compute during online processing are qualitatively similar or different from those used by native speakers of the target language. While the current L2 processing literature provides…

  9. Infants' Use of Category Knowledge and Object Attributes when Segregating Objects at 8.5 Months of Age

    ERIC Educational Resources Information Center

    Needham, Amy; Cantlon, Jessica F.; Ormsbee Holley, Susan M.

    2006-01-01

    The current research investigates infants' perception of a novel object from a category that is familiar to young infants: key rings. We ask whether experiences obtained outside the lab would allow young infants to parse the visible portions of a partly occluded key ring display into one single unit, presumably as a result of having categorized it…

  10. They Said So on the News: Parsing Media Reports About Birth

    PubMed Central

    Romano, Amy M.; Lythgoe, Andrea; Goer, Henci

    2010-01-01

    In this column, the authors reprise recent selections from the Lamaze International research blog, Science & Sensibility. Each selection discusses shortcomings in the news media coverage of childbirth issues. The authors demonstrate how to identify misleading claims in the media and highlight how childbirth educators can apply a common-sense approach and careful fact checking to help women understand the whole story. PMID:20174490

  11. A Network-Based Algorithm for Clustering Multivariate Repeated Measures Data

    NASA Technical Reports Server (NTRS)

    Koslovsky, Matthew; Arellano, John; Schaefer, Caroline; Feiveson, Alan; Young, Millennia; Lee, Stuart

    2017-01-01

    The National Aeronautics and Space Administration (NASA) Astronaut Corps is a unique occupational cohort for which vast amounts of measures data have been collected repeatedly in research or operational studies pre-, in-, and post-flight, as well as during multiple clinical care visits. In exploratory analyses aimed at generating hypotheses regarding physiological changes associated with spaceflight exposure, such as impaired vision, it is of interest to identify anomalies and trends across these expansive datasets. Multivariate clustering algorithms for repeated measures data may help parse the data to identify homogeneous groups of astronauts that have higher risks for a particular physiological change. However, available clustering methods may not be able to accommodate the complex data structures found in NASA data, since the methods often rely on strict model assumptions, require equally-spaced and balanced assessment times, cannot accommodate missing data or differing time scales across variables, and cannot process continuous and discrete data simultaneously. To fill this gap, we propose a network-based, multivariate clustering algorithm for repeated measures data that can be tailored to fit various research settings. Using simulated data, we demonstrate how our method can be used to identify patterns in complex data structures found in practice.

  12. Is human sentence parsing serial or parallel? Evidence from event-related brain potentials.

    PubMed

    Hopf, Jens-Max; Bader, Markus; Meng, Michael; Bayer, Josef

    2003-01-01

    In this ERP study we investigate the processes that occur in syntactically ambiguous German sentences at the point of disambiguation. Whereas most psycholinguistic theories agree on the view that processing difficulties arise when parsing preferences are disconfirmed (so-called garden-path effects), important differences exist with respect to theoretical assumptions about the parser's recovery from a misparse. A key distinction can be made between parsers that compute all alternative syntactic structures in parallel (parallel parsers) and parsers that compute only a single preferred analysis (serial parsers). To distinguish empirically between parallel and serial parsing models, we compare ERP responses to garden-path sentences with ERP responses to truly ungrammatical sentences. Garden-path sentences contain a temporary and ultimately curable ungrammaticality, whereas truly ungrammatical sentences remain so permanently--a difference which gives rise to different predictions in the two classes of parsing architectures. At the disambiguating word, ERPs in both sentence types show negative shifts of similar onset latency, amplitude, and scalp distribution in an initial time window between 300 and 500 ms. In a following time window (500-700 ms), the negative shift to garden-path sentences disappears at right central parietal sites, while it continues in permanently ungrammatical sentences. These data are taken as evidence for a strictly serial parser. The absence of a difference in the early time window indicates that temporary and permanent ungrammaticalities trigger the same kind of parsing responses. Later differences can be related to successful reanalysis in garden-path but not in ungrammatical sentences. Copyright 2003 Elsevier Science B.V.

  13. Pippi — Painless parsing, post-processing and plotting of posterior and likelihood samples

    NASA Astrophysics Data System (ADS)

    Scott, Pat

    2012-11-01

    Interpreting samples from likelihood or posterior probability density functions is rarely as straightforward as it seems it should be. Producing publication-quality graphics of these distributions is often similarly painful. In this short note I describe pippi, a simple, publicly available package for parsing and post-processing such samples, as well as generating high-quality PDF graphics of the results. Pippi is easily and extensively configurable and customisable, both in its options for parsing and post-processing samples, and in the visual aspects of the figures it produces. I illustrate some of these using an existing supersymmetric global fit, performed in the context of a gamma-ray search for dark matter. Pippi can be downloaded and followed at http://github.com/patscott/pippi.
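
    As a generic illustration of the parse-and-plot workflow (not pippi's actual API), the sketch below reads a hypothetical whitespace-delimited chain file whose columns are multiplicity, -lnL, and parameters, and plots a weighted 1-D marginal:

        import numpy as np
        import matplotlib
        matplotlib.use("Agg")  # write files without a display
        import matplotlib.pyplot as plt

        # Hypothetical chain file: rows of (multiplicity, -lnL, param1, param2, ...).
        chain = np.loadtxt("chain.txt")
        weights, param = chain[:, 0], chain[:, 2]

        # 1-D marginal posterior of the chosen parameter, weighted by multiplicity.
        plt.hist(param, bins=60, weights=weights, density=True)
        plt.xlabel("parameter")
        plt.ylabel("marginal posterior density")
        plt.savefig("marginal.pdf")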

  14. The Parsing Syllable Envelopes Test for Assessment of Amplitude Modulation Discrimination Skills in Children: Development, Normative Data, and Test-Retest Reliability Studies.

    PubMed

    Cameron, Sharon; Chong-White, Nicky; Mealings, Kiri; Beechey, Tim; Dillon, Harvey; Young, Taegan

    2018-02-01

    Intensity peaks and valleys in the acoustic signal are salient cues to syllable structure, which is accepted to be a crucial early step in phonological processing. As such, the ability to detect low-rate (envelope) modulations in signal amplitude is essential to parse an incoming speech signal into smaller phonological units. The Parsing Syllable Envelopes (ParSE) test was developed to quantify the ability of children to recognize syllable boundaries using an amplitude modulation detection paradigm. The envelope of a 750-msec steady-state /a/ vowel is modulated into two or three pseudo-syllables using notches with modulation depths varying between 0% and 100% along an 11-step continuum. In an adaptive three-alternative forced-choice procedure, the participant identified whether one, two, or three pseudo-syllables were heard. The study comprised development of the ParSE stimuli and test protocols, and collection of normative and test-retest reliability data. Participants were eleven adults (aged 23 yr 10 mo to 50 yr 9 mo, mean 32 yr 10 mo) and 134 typically developing primary-school children (aged 6 yr 0 mo to 12 yr 4 mo, mean 9 yr 3 mo); there were 73 males and 72 females. Data were collected using a touchscreen computer. Psychometric functions (PFs) were automatically fit to individual data by the ParSE software. Performance was related to the modulation depth at which syllables can be detected with 88% accuracy (referred to as the upper boundary of the uncertainty region [UBUR]). A shallower PF slope reflected a greater level of uncertainty. Age effects were determined based on raw scores. z scores were calculated to account for the effect of age on performance. Outliers, and individual data for which the confidence interval of the UBUR exceeded a maximum allowable value, were removed. Nonparametric tests were used as the data were skewed toward negative performance. Across participants, the performance criterion (UBUR) was met with a median modulation depth of 42%. The effect of age on the UBUR was significant (p < 0.00001). The UBUR ranged from 50% modulation depth for 6-yr-olds to 25% for adults. Children aged 6-10 had significantly higher uncertainty region boundaries than adults. A skewed distribution toward negative performance occurred (p = 0.00007). There was no significant difference in performance on the ParSE between males and females (p = 0.60). Test-retest z scores were strongly correlated (r = 0.68, p < 0.0000001). The ParSE normative data show that the ability to identify syllable boundaries based on changes in amplitude modulation improves with age, and that some children in the general population have performance much worse than their age peers. The test is suitable for use in planned studies in a clinical population. American Academy of Audiology
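
    A hedged sketch of the kind of psychometric-function fitting described above (not the ParSE software itself): it fits a logistic function, rising from 3AFC chance to 1, to hypothetical depth/accuracy data and solves for the 88%-correct depth, analogous to the UBUR. The data values and parameterization are invented for illustration:

        import numpy as np
        from scipy.optimize import curve_fit, brentq

        # Hypothetical data: modulation depth (%) vs proportion correct.
        depth = np.array([0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100], dtype=float)
        p_correct = np.array([0.35, 0.33, 0.40, 0.48, 0.60, 0.75,
                              0.86, 0.93, 0.97, 0.98, 0.99])

        def psychometric(x, midpoint, slope):
            """Logistic function rising from chance (1/3 in a 3AFC task) to 1."""
            chance = 1.0 / 3.0
            return chance + (1.0 - chance) / (1.0 + np.exp(-slope * (x - midpoint)))

        (midpoint, slope), _ = curve_fit(psychometric, depth, p_correct, p0=[40.0, 0.1])

        # Depth at which accuracy reaches 88%, analogous to the UBUR criterion.
        ubur = brentq(lambda x: psychometric(x, midpoint, slope) - 0.88, 0.0, 100.0)
        print(f"midpoint={midpoint:.1f}%, slope={slope:.3f}, 88%-correct depth={ubur:.1f}%")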

  15. Phoneme Restoration Methods Reveal Prosodic Influences on Syntactic Parsing: Data from Bulgarian

    ERIC Educational Resources Information Center

    Stoyneshka-Raleva, Iglika

    2013-01-01

    This dissertation introduces and evaluates a new methodology for studying aspects of human language processing and the factors to which it is sensitive. It makes use of the phoneme restoration illusion (Warren, 1970). A small portion of a spoken sentence is replaced by a burst of noise. Listeners typically mentally restore the missing phoneme(s),…

  16. Design of a Low-Cost Adaptive Question Answering System for Closed Domain Factoid Queries

    ERIC Educational Resources Information Center

    Toh, Huey Ling

    2010-01-01

    Closed domain question answering (QA) systems achieve precision and recall at the cost of complex language processing techniques to parse the answer corpus. We propose a "query-based" model for indexing answers in a closed domain factoid QA system. Further, we use a phrase term inference method for improving the ranking order of related questions.…

  17. A Development System for Augmented Transition Network Grammars and a Large Grammar for Technical Prose. Technical Report No. 25.

    ERIC Educational Resources Information Center

    Mayer, John; Kieras, David E.

    This report describes a technique for the rapid development of natural language parsers, the High-Level Grammar Specification Language (HGSL), built on a system that uses the standard augmented transition network (ATN) parsing approach. The first part of the report describes the syntax and semantics of HGSL and the network implementation of each of its…

  18. Conceptual Plural Information Is Used to Guide Early Parsing Decisions: Evidence from Garden-Path Sentences with Reciprocal Verbs

    ERIC Educational Resources Information Center

    Patson, Nikole D.; Ferreira, Fernanda

    2009-01-01

    In three eyetracking studies, we investigated the role of conceptual plurality in initial parsing decisions in temporarily ambiguous sentences with reciprocal verbs (e.g., "While the lovers kissed the baby played alone"). We varied the subject of the first clause using three types of plural noun phrases: conjoined noun phrases ("the bride and the…

  19. "gnparser": a powerful parser for scientific names based on Parsing Expression Grammar.

    PubMed

    Mozzherin, Dmitry Y; Myltsev, Alexander A; Patterson, David J

    2017-05-26

    Scientific names in biology act as universal links. They allow us to cross-reference information about organisms globally. However, variations in the spelling of scientific names greatly diminish their ability to interconnect data. Such variations may include abbreviations, annotations, misspellings, etc. Authorship is part of a scientific name and may also differ significantly. To match all possible variations of a name we need to divide names into their elements and classify each element according to its role. We refer to this as 'parsing' the name. Parsing categorizes a name's elements into those that are stable and those that are prone to change. Names are matched first by combining them according to their stable elements. Matches are then refined by examining their varying elements. This two-stage process dramatically improves the number and quality of matches. It is especially useful for automatic data exchange within the context of "Big Data" in biology. We introduce the Global Names Parser (gnparser). It is a tool for the Java Virtual Machine, written in Scala, that parses scientific names. It is based on a Parsing Expression Grammar. The parser can be applied to scientific names of any complexity. It assigns a semantic meaning (such as genus name, species epithet, rank, year of publication, authorship, annotations, etc.) to all elements of a name. It is able to work with nested structures such as the names of hybrids. gnparser performs with ≈99% accuracy and processes 30 million name-strings/hour per CPU thread. The gnparser library is compatible with Scala, Java, R, Jython, and JRuby. The parser can be used as a command-line application, a socket server, a web app, or a RESTful HTTP service. It is released under an open-source MIT license. Global Names Parser (gnparser) is a fast, high-precision tool for biodiversity informaticians and biologists working with large numbers of scientific names. It can replace expensive and error-prone manual parsing and standardization of scientific names in many situations, and can quickly enhance the interoperability of distributed biological information.
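
    As a toy illustration of the element-labelling idea, using a regular expression as a stand-in for gnparser's full Parsing Expression Grammar, the sketch below splits simple binomial names into semantically labelled elements; the grammar shown is hypothetical and far less general than gnparser's:

        import re

        # Toy grammar (not gnparser's) for names like "Homo sapiens Linnaeus, 1758":
        #   name <- genus epithet authorship? year?
        TOKEN = re.compile(
            r"(?P<genus>[A-Z][a-z]+)\s+(?P<epithet>[a-z]+)"
            r"(?:\s+(?P<author>[A-Z][A-Za-z.\- ]+?))?(?:,\s*(?P<year>\d{4}))?\s*$"
        )

        def parse_name(s):
            """Split a binomial name into semantically labelled elements."""
            m = TOKEN.match(s.strip())
            if not m:
                raise ValueError(f"unparseable name: {s!r}")
            return {k: v for k, v in m.groupdict().items() if v}

        print(parse_name("Homo sapiens Linnaeus, 1758"))
        # {'genus': 'Homo', 'epithet': 'sapiens', 'author': 'Linnaeus', 'year': '1758'}
        print(parse_name("Parus major"))
        # {'genus': 'Parus', 'epithet': 'major'}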

  20. Parse, simulation, and prediction of NOx emission across the Midwestern United States

    NASA Astrophysics Data System (ADS)

    Fang, H.; Michalski, G. M.; Spak, S.

    2017-12-01

    Accurately constraining N emissions in space and time has been a challenge for atmospheric scientists. It has been suggested that 15N isotopes may be a way of tracking N emission sources across various spatial and temporal scales. However, the complexity of multiple N sources that can quickly change in intensity has made this a difficult problem. We have used a SMOKE emission model to parse NOx emissions across the Midwestern United States for a one-year simulation. An isotope mass balance method was used to assign 15N values to road, non-road, point, and area sources. The SMOKE emissions and isotope mass balance were then combined to predict the 15N of NOx emissions (Figure 1). This 15N NOx emissions model was then incorporated into CMAQ to assess how transport and chemistry impact the 15N value of NOx through mixing and removal processes. The predicted 15N values of NOx were compared to recent measurements of NOx and atmospheric nitrate.

  1. Software Development Of XML Parser Based On Algebraic Tools

    NASA Astrophysics Data System (ADS)

    Georgiev, Bozhidar; Georgieva, Adriana

    2011-12-01

    This paper presents the development and implementation of an algebraic method for XML data processing that accelerates the XML parsing process. The proposed nontraditional approach to fast XML navigation with algebraic tools contributes to ongoing efforts to build an easier, user-friendly API for XML transformations. The proposed XML-processing software (parser) is easy to use and can manage files with a strictly defined data structure. The purpose of the presented algorithm is to offer a new approach to searching and restructuring hierarchical XML data. This approach permits fast processing of XML documents, using an algebraic model developed in detail in previous works by the same authors. The proposed parsing mechanism is thus accessible to the web consumer, who can control XML file processing, search for different elements (tags), and delete or add XML content. Various tests show higher speed and lower resource consumption in comparison with some existing commercial parsers.

  2. Parsing fear: A reassessment of the evidence for fear deficits in psychopathy.

    PubMed

    Hoppenbrouwers, Sylco S; Bulten, Berend H; Brazil, Inti A

    2016-06-01

    Psychopathy is a personality disorder characterized by interpersonal manipulation and callousness, and by reckless and impulsive antisocial behavior. It is often seen as a disorder in which profound emotional disturbances lead to antisocial behavior. A lack of fear in particular has been proposed as an etiologically salient factor. In this review, we employ a conceptual model in which fear is parsed into separate subcomponents. Important historical conceptualizations of psychopathy and the neuroscientific and empirical evidence for fear deficits in psychopathy are compared against this model. The empirical evidence is also subjected to a meta-analysis. We conclude that most studies have used the term "fear" generically, amassing different methods and levels of measurement under the umbrella term "fear." Unlike earlier claims that psychopathy is related to general fearlessness, we show there is evidence that psychopathic individuals have deficits in threat detection and responsivity, but that the evidence for reduced subjective experience of fear in psychopathy is far less compelling. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  3. The Analysis of Nominal Compounds,

    DTIC Science & Technology

    1985-12-01

    "Phenomenologically plausible parsing," in Proceedings of the 1984 American Association for Artificial Intelligence Conference, pp. 335-339. Wilensky, R.... ...December 1985 - CPTM #8. This series of internal memos describes research in artificial intelligence conducted under... ...Given the representational techniques for natural language that have evolved in linguistics and artificial intelligence, it is difficult to find much uniformity in the

  4. Texture Mixing via Universal Simulation

    DTIC Science & Technology

    2005-08-01

    ...classes and universal simulation. Based on the well-known Lempel and Ziv (LZ) universal compression scheme, the universal type class of a one... ...length that produce the same tree (dictionary) under the Lempel-Ziv (LZ) incremental parsing defined in the well-known LZ78 universal compression... ...the well-known Lempel-Ziv parsing algorithm. The goal is not just to synthesize mixed textures, but to understand what texture is. We are currently

  5. Parsing and Tagging of Bilingual Dictionary

    DTIC Science & Technology

    2003-09-01

    LAMP-TR-106, CAR-TR-991, CS-TR-4529, UMIACS-TR-2003-97: PARSING AND TAGGING OF BILINGUAL DICTIONARY. Huanfeng Ma, Burcu Karagol-Ayan, David... ...Bilingual dictionaries hold great potential as a source of lexical resources for training and testing automated systems for optical character recognition, machine translation, and cross-language information retrieval. In this paper, we describe a system for extracting term lexicons from printed bilingual dictionaries

  6. Annotating Socio-Cultural Structures in Text

    DTIC Science & Technology

    2012-10-31

    ...parts of speech (POS) within text, using the Stanford Part of Speech Tagger (Stanford Log-Linear, 2011). The ERDC-CERL taxonomy is then used to... ...annotated NP/VP pane: shows the sentence parsed using the part-of-speech tagger. Document view pane: specifies the document (being annotated) in three... ...first parsed using the Stanford part-of-speech tagger and converted to an XML document, both of which are done through the Import function

  7. Text-to-phonemic transcription and parsing into mono-syllables of English text

    NASA Astrophysics Data System (ADS)

    Jusgir Mullick, Yugal; Agrawal, S. S.; Tayal, Smita; Goswami, Manisha

    2004-05-01

    The present paper describes a program that converts English text (entered through the normal computer keyboard) into its phonemic representation and then parses it into mono-syllables. For every letter a set of context-based rules is defined in lexical order; a default rule is also defined separately for each letter. Beginning from the first letter of the word, the rules are checked and the most appropriate rule is applied to the letter to find its actual orthographic representation. If no matching rule is found, the default rule is applied. The current rule sets the next position to be analyzed. Proceeding in the same manner, the orthographic representation of each word can be found. For example, "reading" is represented as "rEdiNX" by applying the following rules: r --> r (move 1 position ahead); ead --> Ed (move 3 positions ahead); i --> i (move 1 position ahead); ng --> NX (move 2 positions ahead, i.e., end of word). The phonemic representations obtained from this procedure are parsed into mono-syllabic representations for various combinations such as CVC, CVCC, CV, CVCVC, etc. For example, the phonemic representation above is parsed as rEdiNX --> /rE/ /diNX/. This study is part of developing TTS for Indian English.
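
    A minimal sketch of the longest-match rule-application loop described above, using only the four example rules quoted in the abstract plus a letter-to-itself default; the rule table and data structures are assumptions, not the paper's implementation:

        # Ordered rules from the abstract's example; longer patterns are tried first.
        # Each rule maps a letter sequence to phonemes and advances the cursor by the
        # length of the matched sequence ("current rule sets the next position").
        RULES = [("ead", "Ed"), ("ng", "NX"), ("r", "r"), ("i", "i")]

        def to_phonemes(word):
            out, i = [], 0
            while i < len(word):
                for pattern, phonemes in RULES:
                    if word.startswith(pattern, i):
                        out.append(phonemes)
                        i += len(pattern)
                        break
                else:
                    # default rule (illustrative): keep the letter unchanged
                    out.append(word[i])
                    i += 1
            return "".join(out)

        print(to_phonemes("reading"))  # -> rEdiNX, matching the paper's example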

  8. CACTI: free, open-source software for the sequential coding of behavioral interactions.

    PubMed

    Glynn, Lisa H; Hallgren, Kevin A; Houck, Jon M; Moyers, Theresa B

    2012-01-01

    The sequential analysis of client and clinician speech in psychotherapy sessions can help to identify and characterize potential mechanisms of treatment and behavior change. Previous studies required coding systems that were time-consuming, expensive, and error-prone. Existing software can be expensive and inflexible, and furthermore, no single package allows for pre-parsing, sequential coding, and assignment of global ratings. We developed a free, open-source, and adaptable program to meet these needs: The CASAA Application for Coding Treatment Interactions (CACTI). Without transcripts, CACTI facilitates the real-time sequential coding of behavioral interactions using WAV-format audio files. Most elements of the interface are user-modifiable through a simple XML file, and can be further adapted using Java through the terms of the GNU Public License. Coding with this software yields interrater reliabilities comparable to previous methods, but at greatly reduced time and expense. CACTI is a flexible research tool that can simplify psychotherapy process research, and has the potential to contribute to the improvement of treatment content and delivery.

  9. What is it that lingers? Garden-path (mis)interpretations in younger and older adults.

    PubMed

    Malyutina, Svetlana; den Ouden, Dirk-Bart

    2016-01-01

    Previous research has shown that comprehenders do not always conduct a full (re)analysis of temporarily ambiguous "garden-path" sentences. The present study used a sentence-picture matching task to investigate what kind of representations are formed when full reanalysis is not performed: Do comprehenders "blend" two incompatible representations as a result of shallow syntactic processing or do they erroneously maintain the initial incorrect parsing without incorporating new information, and does this vary with age? Twenty-five younger and 15 older adults performed a multiple-choice sentence-picture matching task with stimuli including early-closure garden-path sentences. The results suggest that the type of erroneous representation is affected by linguistic variables, such as sentence structure, verb type, and semantic plausibility, as well as by age. Older adults' response patterns indicate an increased reliance on inferencing based on lexical and semantic cues, with a lower bar for accepting an initial parse and with a weaker drive to reanalyse a syntactic representation. Among younger adults, there was a tendency to blend two representations into a single interpretation, even if this was not licensed by the syntax.

  10. MASC: Multiprocessor Architecture for Symbolic Processing

    DTIC Science & Technology

    1989-08-01

    ...accomplish the ultimate goal of reducing the time to develop the correct parse. Effect of sentence length: Figure 5-6 shows the relationship... ...later; this effectively restricts the breadth of study rather severely. Objective: the tools described here have been developed in response to... ...either expressed or implied, of the Defense Advanced Research Projects Agency or the U.S. Government. ROME AIR DEVELOPMENT CENTER, Air Force Systems

  11. Prony Ringdown GUI (CERTS Prony Ringdown, part of the DSI Tool Box)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tuffner, Francis; Marinovici, PNNL Laurentiu; Hauer, PNNL John

    2014-02-21

    The PNNL Prony Ringdown graphical user interface is one analysis tool included in the Dynamic System Identification toolbox (DSI Toolbox). The DSI Toolbox is a MATLAB-based collection of tools for parsing and analyzing phasor measurement unit data, especially in regard to small signal stability. It includes tools to read the data, preprocess it, and perform small signal analysis. Method of solution: the DSI Toolbox is designed to provide a research environment for examining phasor measurement unit data and performing small signal stability analysis. The software uses a series of text-driven menus to help guide users and organize the toolbox features. Methods for reading in phasor measurement unit data are provided, with appropriate preprocessing options for small-signal-stability analysis. The toolbox includes the Prony Ringdown GUI and basic algorithms to estimate information on oscillatory modes of the system, such as modal frequency and damping ratio.
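
    As a hedged sketch of the underlying technique (classical Prony analysis, not the DSI Toolbox code), the following fits a linear-prediction model to a synthetic ringdown signal and recovers modal frequency and damping ratio from the roots of the prediction polynomial; the sampling rate, model order, and signal are assumptions:

        import numpy as np

        # Synthetic ringdown: one damped oscillation, as seen after a grid disturbance.
        fs = 30.0                          # samples per second (PMU-like rate)
        t = np.arange(0, 10, 1 / fs)
        x = np.exp(-0.25 * t) * np.cos(2 * np.pi * 0.8 * t)

        p = 2                              # model order: two roots for one mode
        # Linear prediction: x[n] = a1*x[n-1] + ... + ap*x[n-p], via least squares.
        A = np.column_stack([x[p - k - 1:len(x) - k - 1] for k in range(p)])
        a, *_ = np.linalg.lstsq(A, x[p:], rcond=None)

        # Roots of the prediction polynomial give modal frequency and damping.
        roots = np.roots(np.concatenate(([1.0], -a)))
        s = np.log(roots) * fs             # continuous-time poles
        for pole in s:
            if pole.imag > 0:              # report each oscillatory mode once
                freq = pole.imag / (2 * np.pi)
                damping = -pole.real / abs(pole)
                print(f"mode: {freq:.2f} Hz, damping ratio {damping:.3f}")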

  12. A field ornithologist’s guide to genomics: Practical considerations for ecology and conservation

    USGS Publications Warehouse

    Oyler-McCance, Sara J.; Oh, Kevin; Langin, Kathryn; Aldridge, Cameron L.

    2016-01-01

    Vast improvements in sequencing technology have made it practical to simultaneously sequence millions of nucleotides distributed across the genome, opening the door for genomic studies in virtually any species. Ornithological research stands to benefit in three substantial ways. First, genomic methods enhance our ability to parse and simultaneously analyze both neutral and non-neutral genomic regions, thus providing insight into adaptive evolution and divergence. Second, the sheer quantity of sequence data generated by current sequencing platforms allows increased precision and resolution in analyses. Third, high-throughput sequencing can benefit applications that focus on a small number of loci that are otherwise prohibitively expensive, time-consuming, and technically difficult using traditional sequencing methods. These advances have improved our ability to understand evolutionary processes like speciation and local adaptation, but they also offer many practical applications in the fields of population ecology, migration tracking, conservation planning, diet analyses, and disease ecology. This review provides a guide for field ornithologists interested in incorporating genomic approaches into their research program, with an emphasis on techniques related to ecology and conservation. We present a general overview of contemporary genomic approaches and methods, as well as important considerations when selecting a genomic technique. We also discuss research questions that are likely to benefit from utilizing high-throughput sequencing instruments, highlighting select examples from recent avian studies.

  13. Ground Operations Aerospace Language (GOAL). Volume 2: Compiler

    NASA Technical Reports Server (NTRS)

    1973-01-01

    The principal elements and functions of the Ground Operations Aerospace Language (GOAL) compiler are presented. The technique used to transcribe the syntax diagrams into machine processable format for use by the parsing routines is described. An explanation of the parsing technique used to process GOAL source statements is included. The compiler diagnostics and the output reports generated during a GOAL compilation are explained. A description of the GOAL program package is provided.

  14. Imitation as behaviour parsing.

    PubMed Central

    Byrne, R W

    2003-01-01

    Non-human great apes appear to be able to acquire elaborate skills partly by imitation, raising the possibility of the transfer of skill by imitation in animals that have only rudimentary mentalizing capacities: in contrast to the frequent assumption that imitation depends on prior understanding of others' intentions. Attempts to understand the apes' behaviour have led to the development of a purely mechanistic model of imitation, the 'behaviour parsing' model, in which the statistical regularities that are inevitable in planned behaviour are used to decipher the organization of another agent's behaviour, and thence to imitate parts of it. Behaviour can thereby be understood statistically in terms of its correlations (circumstances of use, effects on the environment) without understanding of intentions or the everyday physics of cause-and-effect. Thus, imitation of complex, novel behaviour may not require mentalizing, but conversely behaviour parsing may be a necessary preliminary to attributing intention and cause. PMID:12689378

  15. Perceived visual speed constrained by image segmentation

    NASA Technical Reports Server (NTRS)

    Verghese, P.; Stone, L. S.

    1996-01-01

    Little is known about how or where the visual system parses the visual scene into objects or surfaces. However, it is generally assumed that the segmentation and grouping of pieces of the image into discrete entities is due to 'later' processing stages, after the 'early' processing of the visual image by local mechanisms selective for attributes such as colour, orientation, depth, and motion. Speed perception is also thought to be mediated by early mechanisms tuned for speed. Here we show that manipulating the way in which an image is parsed changes the way in which local speed information is processed. Manipulations that cause multiple stimuli to appear as parts of a single patch degrade speed discrimination, whereas manipulations that perceptually divide a single large stimulus into parts improve discrimination. These results indicate that processes as early as speed perception may be constrained by the parsing of the visual image into discrete entities.

  16. Revise and resubmit: How real-time parsing limitations influence grammar acquisition

    PubMed Central

    Pozzan, Lucia; Trueswell, John C.

    2015-01-01

    We present the results from a three-day artificial language learning study on adults. The study examined whether sentence-parsing limitations, in particular, difficulties revising initial syntactic/semantic commitments during comprehension, shape learners’ ability to acquire a language. Findings show that both comprehension and production of morphology pertaining to sentence argument structure are delayed when this morphology consistently appears at the end, rather than at the beginning, of sentences in otherwise identical grammatical systems. This suggests that real-time processing constraints impact acquisition; morphological cues that tend to guide linguistic analyses are easier to learn than cues that revise these analyses. Parallel performance in production and comprehension indicates that parsing constraints affect grammatical acquisition, not just real-time commitments. Properties of the linguistic system (e.g., ordering of cues within a sentence) interact with the properties of the cognitive system (cognitive control and conflict-resolution abilities) and together affect language acquisition. PMID:26026607

  17. Research on Chinese characters display of airborne MFD based on GL studio

    NASA Astrophysics Data System (ADS)

    Wang, Zhile; Dong, Junyu; Hu, Wenting; Cui, Yipeng

    2018-04-01

    Because GL Studio cannot display Chinese characters when developing an airborne multifunction display (MFD), this paper proposes a method of establishing a Chinese character font with GB2312 encoding, building the font table and the Chinese-character display unit in GL Studio. The method abstracts a storage and display data model for Chinese characters: it parses the GB encoding of the Chinese characters received by the MFD, finds the coordinates of those characters in the font table, and establishes dynamic control and display models built on the Chinese-character display unit. In a GL Studio and VC++.NET environment, this model has been successfully applied to develop airborne MFDs in a variety of mission simulators. The method solves the problem that GL Studio cannot develop MFD software for Chinese domestic aircraft, and it can also be used with other professional airborne MFD development tools such as IDATA. Experiments show that this is a fast, effective, scalable, and reconfigurable method for developing both actual equipment and simulators.
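
    The font-table lookup described above rests on the standard GB2312 layout of 94 rows ("qu") by 94 columns ("wei"). A minimal sketch of that coordinate computation follows; the glyph size and the helper name are assumptions, and nothing here is GL Studio-specific:

        def gb2312_font_offset(ch, bytes_per_glyph=32):
            """Locate a character's glyph in a 16x16 GB2312 bitmap font table.

            GB2312 arranges characters in 94 rows ("qu") x 94 columns ("wei");
            each encoded byte is the row/column index plus 0xA0.
            """
            b1, b2 = ch.encode("gb2312")
            row, col = b1 - 0xA1, b2 - 0xA1
            index = row * 94 + col          # position of the glyph in the font table
            return index * bytes_per_glyph  # byte offset of the 16x16 bitmap

        # Example: find the glyph offset for a character received by the display.
        print(f"glyph byte offset: {gb2312_font_offset('中')}")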

  18. Effects of Tasks on BOLD Signal Responses to Sentence Contrasts: Review and Commentary

    PubMed Central

    Caplan, David; Gow, David

    2010-01-01

    Functional neuroimaging studies of syntactic processing have been interpreted as identifying the neural locations of parsing and interpretive operations. However, current behavioral studies of sentence processing indicate that many operations occur simultaneously with parsing and interpretation. In this review, we point to issues that arise in discriminating the effects of these concurrent processes from those of the parser/interpreter in neural measures and to approaches that may help resolve them. PMID:20932562

  19. Design, Implementation and Testing of a Common Data Model Supporting Autonomous Vehicle Compatibility and Interoperability

    DTIC Science & Technology

    2006-09-01

    ...is that it is universally applicable. That is, it can be used to parse an instance of any Chomsky Normal Form context-free grammar. This relative... ...a Chomsky Normal Form grammar corresponding to the vehicle-specific data format, use of the Cocke-Younger-Kasami algorithm to generate a parse tree... ...The productions of a Chomsky Normal Form context-free grammar have three significant characteristics: there are no useless symbols (i.e.
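
    For illustration, a compact Cocke-Younger-Kasami (CYK) recognizer over a tiny hypothetical Chomsky-Normal-Form grammar (standing in for a vehicle-specific data format) is sketched below; the grammar and sentence are invented:

        from itertools import product

        # Tiny CNF grammar: every production is A -> B C or A -> terminal.
        GRAMMAR = {
            "S": [("NP", "VP")],
            "NP": [("Det", "N")],
            "VP": [("V", "NP")],
            "Det": [("the",)],
            "N": [("vehicle",)],
            "V": [("tracks",)],
        }

        def cyk(words):
            n = len(words)
            # table[i][j] holds the nonterminals deriving words[i:i+j+1]
            table = [[set() for _ in range(n)] for _ in range(n)]
            for i, w in enumerate(words):
                table[i][0] = {A for A, rules in GRAMMAR.items() if (w,) in rules}
            for span in range(2, n + 1):
                for i in range(n - span + 1):
                    for split in range(1, span):
                        left = table[i][split - 1]
                        right = table[i + split][span - split - 1]
                        for A, rules in GRAMMAR.items():
                            if any((B, C) in rules for B, C in product(left, right)):
                                table[i][span - 1].add(A)
            return "S" in table[0][n - 1]

        print(cyk("the vehicle tracks the vehicle".split()))  # True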

  20. A Database Design for the Brazilian Air Force Flying Unit Operational Control System.

    DTIC Science & Technology

    1984-12-14

    Company, 1980. 23. Pereira Filho, Jorge da Cunha. "Banco de Dados Hoje" ["Databases Today"], Dados e Ideias (Brazilian magazine), 99: 55-63 (February 1979). 24. Rich... ..."QUAIS SAO AS CONDICOES DE EMPREGO OPERACIONAL DO MIRAGE 2120?" ("WHAT ARE THE OPERATIONAL EMPLOYMENT CONDITIONS OF MIRAGE 2120?"), like "WHAT IS THE FORCESTATUS OF MIRAGE 2120?" % Parsed! % [Production added to system... ...4 - QUAIS SAO AS CONDICOES DE EMPREGO OPERACIONAL DO MIRAGE 2120? % Parsed! % (S DEPLOC) = sbbr % (S-TIME) = 1000 % Endurance = 0200 % (SITCODE

  1. 'Visual’ parsing can be taught quickly without visual experience during critical periods

    PubMed Central

    Reich, Lior; Amedi, Amir

    2015-01-01

    Cases of invasive sight restoration in congenitally blind adults demonstrated that acquiring visual abilities is extremely challenging, presumably because visual experience during critical periods is crucial for learning visual-unique concepts (e.g. size constancy). Visual rehabilitation can also be achieved using sensory substitution devices (SSDs), which convey visual information non-invasively through sounds. We tested whether one critical concept – visual parsing, which is highly impaired in sight-restored patients – can be learned using SSD. To this end, congenitally blind adults participated in a unique, relatively short (~70 hours), SSD-'vision' training. Following this, participants successfully parsed 2D and 3D visual objects. Control individuals naïve to SSDs demonstrated that while some aspects of parsing with SSD are intuitive, the blind participants' success could not be attributed to auditory processing alone. Furthermore, we had a unique opportunity to compare the SSD users' abilities to those reported for sight-restored patients who performed similar tasks visually, and who had months of eyesight. Intriguingly, the SSD users outperformed the patients on most criteria tested. These results suggest that with adequate training and technologies, key high-order visual features can be quickly acquired in adulthood, and lack of visual experience during critical periods can be somewhat compensated for. Practically, these results highlight the potential of SSDs as standalone aids or combined with invasive restoration approaches. PMID:26482105

  2. Research Activities of the Northwest Laboratory for Integrated Systems

    DTIC Science & Technology

    1987-04-06

    ...table, and composite table (to assist evaluation of objects) are each built. The parse tree is also checked to make sure there are no meaningless... ...Stanford) as well as the Apollo DN series. All of these implementations require eight bit planes for effective use of color. Also supported are AED... ...the time of intersection had not yet passed, the queuing of the segment was delayed until that time. This algorithm had the effect of preserving the slope of

  3. A fusion network for semantic segmentation using RGB-D data

    NASA Astrophysics Data System (ADS)

    Yuan, Jiahui; Zhang, Kun; Xia, Yifan; Qi, Lin; Dong, Junyu

    2018-04-01

    Semantic scene parsing matters in many intelligent fields, including perceptual robotics. In the past few years, pixel-wise prediction tasks such as semantic segmentation of RGB images have been extensively studied and have reached remarkable parsing levels, thanks to convolutional neural networks (CNNs) and large scene datasets. With the development of stereo cameras and RGB-D sensors, additional depth information is expected to improve accuracy. In this paper, we propose a semantic segmentation framework incorporating RGB and complementary depth information. Motivated by the success of fully convolutional networks (FCNs) in semantic segmentation, we design a fully convolutional network consisting of two branches that extract features from RGB and depth data simultaneously and fuse them as the network goes deeper. Instead of aggregating multiple models, our goal is to use RGB and depth data more effectively in a single model. We evaluate our approach on the NYU-Depth V2 dataset, which consists of 1449 cluttered indoor scenes, and achieve results competitive with state-of-the-art methods.
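
    A schematic sketch of the two-branch, fuse-as-you-go idea (not the paper's actual architecture or layer sizes), written in Python with PyTorch; channel counts, depths, and the fusion point are assumptions:

        import torch
        import torch.nn as nn

        class FusionSegNet(nn.Module):
            """Schematic two-branch FCN: RGB and depth encoders fused mid-network."""

            def __init__(self, num_classes=40):
                super().__init__()
                def encoder(in_ch):
                    # small downsampling encoder (stride-2 convolutions)
                    return nn.Sequential(
                        nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
                        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                    )
                self.rgb_branch = encoder(3)       # colour image branch
                self.depth_branch = encoder(1)     # depth map branch
                self.fuse = nn.Conv2d(128, 64, 1)  # concat channels, mix with 1x1 conv
                self.head = nn.Sequential(         # decode back to input resolution
                    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                    nn.ConvTranspose2d(32, num_classes, 4, stride=2, padding=1),
                )

            def forward(self, rgb, depth):
                feats = torch.cat([self.rgb_branch(rgb), self.depth_branch(depth)], dim=1)
                return self.head(self.fuse(feats))

        net = FusionSegNet()
        out = net(torch.randn(1, 3, 64, 64), torch.randn(1, 1, 64, 64))
        print(out.shape)  # torch.Size([1, 40, 64, 64])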

  4. Interactive Cohort Identification of Sleep Disorder Patients Using Natural Language Processing and i2b2.

    PubMed

    Chen, W; Kowatch, R; Lin, S; Splaingard, M; Huang, Y

    2015-01-01

    Nationwide Children's Hospital established an i2b2 (Informatics for Integrating Biology & the Bedside) application for sleep disorder cohort identification. Discrete data were gleaned from semi-structured sleep study reports. The system was shown to work more efficiently than the traditional manual chart review method, and it also enabled searching capabilities that were previously not possible. We report on the development and implementation of the sleep disorder i2b2 cohort identification system using natural language processing of semi-structured documents. We developed a natural language processing approach to automatically parse concepts and their values from semi-structured sleep study documents. Two parsers were developed: a regular expression parser for extracting numeric concepts and an NLP-based tree parser for extracting textual concepts. Concepts were further organized into i2b2 ontologies based on document structures and in-domain knowledge. 26,550 concepts were extracted, 99% of them textual. 1.01 million facts were extracted from sleep study documents, such as demographic information, sleep study lab results, medications, procedures, and diagnoses, among others. The average accuracy of terminology parsing was over 83% when compared against expert annotations. The system is capable of capturing both standard and non-standard terminologies. The time for cohort identification has been reduced significantly, from a few weeks to a few seconds. Natural language processing was shown to be powerful for quickly converting large amounts of semi-structured or unstructured clinical data into discrete concepts, which, in combination with intuitive domain-specific ontologies, allows fast and effective interactive cohort identification through the i2b2 platform for research and clinical use.
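
    As a hedged illustration of a regular-expression parser for numeric concepts (not the hospital's actual patterns), the sketch below pulls label/value/unit triples from a hypothetical semi-structured report fragment:

        import re

        # Hypothetical snippet of a semi-structured sleep study report.
        report = """
        Sleep efficiency: 84.2 %
        Apnea-Hypopnea Index (AHI): 12.7 /hr
        Total sleep time: 402 min
        """

        # One pattern for numeric concepts: a label, a number, and an optional unit.
        NUMERIC = re.compile(r"^\s*(?P<name>[^:\n]+):\s*(?P<value>[\d.]+)\s*(?P<unit>\S*)",
                             re.M)

        facts = {m["name"].strip(): (float(m["value"]), m["unit"])
                 for m in NUMERIC.finditer(report)}
        for name, (value, unit) in facts.items():
            print(f"{name} = {value} {unit}")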

  5. Deriving a probabilistic syntacto-semantic grammar for biomedicine based on domain-specific terminologies

    PubMed Central

    Fan, Jung-Wei; Friedman, Carol

    2011-01-01

    Biomedical natural language processing (BioNLP) is a useful technique that unlocks valuable information stored in textual data for practice and/or research. Syntactic parsing is a critical component of BioNLP applications that rely on correctly determining the sentence and phrase structure of free text. In addition to dealing with the vast amount of domain-specific terms, a robust biomedical parser needs to model the semantic grammar to obtain viable syntactic structures. With either a rule-based or corpus-based approach, the grammar engineering process requires substantial time and knowledge from experts, and does not always yield a semantically transferable grammar. To reduce the human effort and to promote semantic transferability, we propose an automated method for deriving a probabilistic grammar based on a training corpus consisting of concept strings and semantic classes from the Unified Medical Language System (UMLS), a comprehensive terminology resource widely used by the community. The grammar is designed to specify noun phrases only, due to the nominal nature of the majority of biomedical terminological concepts. Evaluated on manually parsed clinical notes, the derived grammar achieved a recall of 0.644, precision of 0.737, and average cross-bracketing of 0.61, which demonstrated better performance than a control grammar with the semantic information removed. Error analysis revealed shortcomings that could be addressed to improve performance. The results indicated the feasibility of an approach which automatically incorporates terminology semantics in the building of an operational grammar. Although the current performance of the unsupervised solution does not adequately replace manual engineering, we believe that once the performance issues are addressed, it could serve as an aid in a semi-supervised solution. PMID:21549857
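
    The relative-frequency estimation that underlies a probabilistic grammar can be sketched in a few lines; the toy productions below are invented stand-ins for UMLS-derived noun-phrase training data, not the paper's grammar:

        from collections import Counter, defaultdict

        # Hypothetical mini-corpus: terminology strings pre-labelled with UMLS-style
        # semantic classes, each flattened to a noun-phrase production.
        productions = [
            ("NP", ("Disease",)), ("NP", ("BodyPart", "Disease")),
            ("NP", ("BodyPart", "Disease")), ("NP", ("Modifier", "BodyPart", "Disease")),
        ]

        counts = Counter(productions)
        lhs_totals = defaultdict(int)
        for (lhs, _), c in counts.items():
            lhs_totals[lhs] += c

        # Relative-frequency estimate: P(lhs -> rhs) = count(lhs -> rhs) / count(lhs)
        for (lhs, rhs), c in sorted(counts.items(), key=lambda kv: -kv[1]):
            print(f"P({lhs} -> {' '.join(rhs)}) = {c / lhs_totals[lhs]:.2f}")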

  6. Parsing Heterogeneity in the Brain Connectivity of Depressed and Healthy Adults During Positive Mood.

    PubMed

    Price, Rebecca B; Lane, Stephanie; Gates, Kathleen; Kraynak, Thomas E; Horner, Michelle S; Thase, Michael E; Siegle, Greg J

    2017-02-15

    There is well-known heterogeneity in affective mechanisms in depression that may extend to positive affect. We used data-driven parsing of neural connectivity to reveal subgroups present across depressed and healthy individuals during positive processing, informing targets for mechanistic intervention. Ninety-two individuals (68 depressed patients, 24 never-depressed control subjects) completed a sustained positive mood induction during functional magnetic resonance imaging. Directed functional connectivity paths within a depression-relevant network were characterized using Group Iterative Multiple Model Estimation (GIMME), a method shown to accurately recover the direction and presence of connectivity paths in individual participants. During model selection, individuals were clustered using community detection on neural connectivity estimates. Subgroups were externally tested across multiple levels of analysis. Two connectivity-based subgroups emerged: subgroup A, characterized by weaker connectivity overall, and subgroup B, exhibiting hyperconnectivity (relative to subgroup A), particularly among ventral affective regions. Subgroup predicted diagnostic status (subgroup B contained 81% of patients; 50% of control subjects; χ² = 8.6, p = .003) and default mode network connectivity during a separate resting-state task. Among patients, subgroup B members had higher self-reported symptoms, lower sustained positive mood during the induction, and higher negative bias on a reaction-time task. Symptom-based depression subgroups did not predict these external variables. Neural connectivity-based categorization travels with diagnostic category and is clinically predictive, but not clinically deterministic. Both patients and control subjects showed heterogeneous, and overlapping, profiles. The larger and more severely affected patient subgroup was characterized by ventrally driven hyperconnectivity during positive processing. Data-driven parsing suggests heterogeneous substrates of depression and possible resilience in control subjects in spite of biological overlap. Copyright © 2016 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.

  7. Integrating Syntax, Semantics, and Discourse DARPA Natural Language Understanding Program. Volume 3. Papers

    DTIC Science & Technology

    1989-09-30

    ...parses, in a second experiment. This procedure used PUNDIT's Selection Pattern Query and Response (SPQR) component [Lang 1988]. We first used SPQR in... ...messages pattern. SPQR continues the analysis of the ISR from each domain, and the resulting output is... and the parsing of the sentence is allowed to... UNISYS, P.O. Box 517, Paoli, PA 19301. ABSTRACT: This paper presents SPQR (Selectional Pattern Queries...). One obvious benefit of acquiring domain knowledge...

  8. owlcpp: a C++ library for working with OWL ontologies.

    PubMed

    Levin, Mikhail K; Cowell, Lindsay G

    2015-01-01

    The increasing use of ontologies highlights the need for a library for working with ontologies that is efficient, accessible from various programming languages, and compatible with common computational platforms. We developed owlcpp, a library for storing and searching RDF triples, parsing RDF/XML documents, converting triples into OWL axioms, and reasoning. The library is written in ISO-compliant C++ to facilitate efficiency, portability, and accessibility from other programming languages. Internally, owlcpp uses the Raptor RDF Syntax library for parsing RDF/XML and the FaCT++ library for reasoning. The current version of owlcpp is supported under Linux, OSX, and Windows platforms and provides an API for Python. The results of our evaluation show that, compared to other commonly used libraries, owlcpp is significantly more efficient in terms of memory usage and searching RDF triple stores. owlcpp performs strict parsing and detects errors ignored by other libraries, thus reducing the possibility of incorrect semantic interpretation of ontologies. owlcpp is available at http://owl-cpp.sf.net/ under the Boost Software License, Version 1.0.
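
    owlcpp itself is C++, but the triple-store workflow it implements can be illustrated with the Python rdflib library (a swapped-in stand-in, not owlcpp's API): parse an RDF/XML file, then search the triples for declared OWL classes. The file name is hypothetical:

        from rdflib import Graph
        from rdflib.namespace import RDF, OWL

        g = Graph()
        g.parse("pizza.owl", format="xml")  # hypothetical local RDF/XML ontology file

        # Search the triple store: list every class declared in the ontology.
        for s, _, _ in g.triples((None, RDF.type, OWL.Class)):
            print(s)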

  9. An expert system for natural language processing

    NASA Technical Reports Server (NTRS)

    Hennessy, John F.

    1988-01-01

    A solution to the natural language processing problem is proposed that uses a rule-based system, written in OPS5, to replace the traditional parsing method. The advantages of using a rule-based system are explored. Specifically, the extensibility of a rule-based solution is discussed, as well as the value of maintaining rules that function independently. Finally, the power of using semantics to supplement the syntactic analysis of a sentence is considered.

  10. Criteria for Evaluating the Performance of Compilers

    DTIC Science & Technology

    1974-10-01

    ...cannot be made to fit, then an auxiliary mechanism outside the parser might be used. Finally, changing the choice of parsing technique to a... ...was not useful in providing a basis for compiler evaluation. The study of the first question established criteria and methods for assigning four... ...program. The study of the second question established criteria for defining a "compiler Gibson mix", and established methods for using this "mix" to

  11. Multivariate Analysis for the Choice and Evasion of the Student in a Higher Educational Institution from Southern of Santa Catarina, in Brazil

    ERIC Educational Resources Information Center

    Queiroz, Fernanda Cristina Barbosa Pereira; Samohyl, Robert Wayne; Queiroz, Jamerson Viegas; Lima, Nilton Cesar; de Souza, Gustavo Henrique Silva

    2014-01-01

    This paper aims to develop and implement a method to identify the causes of course choice and the reasons for evasion in higher education. To this end, we sought to identify the factors that influence students' choice of the higher education institution analyzed, as well as the factors influencing their evasion. The methodology employed was…

  12. Automated extraction of Biomarker information from pathology reports.

    PubMed

    Lee, Jeongeun; Song, Hyun-Je; Yoon, Eunsil; Park, Seong-Bae; Park, Sung-Hye; Seo, Jeong-Wook; Park, Peom; Choi, Jinwook

    2018-05-21

    Pathology reports are written in free-text form, which precludes efficient data gathering. We aimed to overcome this limitation and design an automated system for extracting biomarker profiles from accumulated pathology reports. We designed a new data model for representing biomarker knowledge. The automated system parses immunohistochemistry reports based on a "slide paragraph" unit defined as a set of immunohistochemistry findings obtained for the same tissue slide. Pathology reports are parsed using context-free grammar for immunohistochemistry, and using a tree-like structure for surgical pathology. The performance of the approach was validated on manually annotated pathology reports of 100 randomly selected patients managed at Seoul National University Hospital. High F-scores were obtained for parsing biomarker names and corresponding test results (0.999 and 0.998, respectively) from the immunohistochemistry reports, compared to relatively poor performance for parsing surgical pathology findings. However, applying the proposed approach to our single-center dataset revealed information on 221 unique biomarkers, which represents a richer result than biomarker profiles obtained based on the published literature. Owing to the data representation model, the proposed approach can associate biomarker profiles extracted from an immunohistochemistry report with corresponding pathology findings listed in one or more surgical pathology reports. Term variations are resolved by normalization to corresponding preferred terms determined by expanded dictionary look-up and text similarity-based search. Our proposed approach for biomarker data extraction addresses key limitations regarding data representation and can handle reports prepared in the clinical setting, which often contain incomplete sentences, typographical errors, and inconsistent formatting.

  13. Integrating Syntax, Semantics, and Discourse DARPA (Defense Advanced Research Projects Agency) Natural Language Understanding Program

    DTIC Science & Technology

    1988-08-01

    ...heavily on the original SPQR component, and uses the same context-free grammar to analyze the ISR. The main difference is that, where before SPQR... ...ISR is semantically coherent. This has been tested thoroughly on the CASREPS domain, and selects the same parses that SPQR did, in less time. There... ...were a few SPQR patterns that reflected semantic information that could only be provided by time analysis, such as the fact that [pressure during

  14. Foundations for the Development of a Simple Natural Language Interface for Task Knowledge Elicitation and Representation.

    DTIC Science & Technology

    1982-01-01

    ...the best known being ELIZA, a simulated Rogerian psychotherapist (Weizenbaum 1966), and PARRY, a simulated paranoid patient (Colby 1968). These... ...derived from the syntactic aspects of the input, that is, the word classes (noun, verb, etc.) rather than the word meanings. The concept of parsing is... ...captures the "full" meaning of a word or concept; consequently few researchers actually seek "absolute" definitions of words. The definition of a word, as

  15. Online Object Tracking, Learning and Parsing with And-Or Graphs.

    PubMed

    Wu, Tianfu; Lu, Yang; Zhu, Song-Chun

    2017-12-01

    This paper presents a method, called AOGTracker, for simultaneous tracking, learning and parsing (TLP) of unknown objects in video sequences with a hierarchical and compositional And-Or graph (AOG) representation. The TLP method is formulated in the Bayesian framework with spatial and temporal dynamic programming (DP) algorithms inferring object bounding boxes on the fly. During online learning, the AOG is discriminatively learned using latent SVM [1] to account for appearance variations (e.g., lighting and partial occlusion) and structural variations (e.g., different poses and viewpoints) of a tracked object, as well as distractors (e.g., similar objects) in the background. Three key issues in online inference and learning are addressed: (i) maintaining purity of positive and negative examples collected online, (ii) controlling model complexity in latent structure learning, and (iii) identifying critical moments to re-learn the structure of the AOG based on its intrackability. The intrackability measures the uncertainty of an AOG based on its score maps in a frame. In experiments, our AOGTracker is tested on two popular tracking benchmarks with the same parameter setting: the TB-100/50/CVPR2013 benchmarks [3], and the VOT benchmarks [4]: VOT 2013, 2014, 2015 and TIR2015 (thermal imagery tracking). In the former, our AOGTracker outperforms state-of-the-art tracking algorithms, including two trackers based on deep convolutional networks [5], [6]. In the latter, our AOGTracker outperforms all other trackers in VOT2013 and is comparable to the state-of-the-art methods in VOT2014, 2015 and TIR2015.

  16. Omen: identifying potential spear-phishing targets before the email is sent.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wendt, Jeremy Daniel.

    2013-07-01

    We present the results of a two-year project focused on a common social engineering attack method called "spear phishing". In a spear phishing attack, the user receives an email with information specifically focused on the user. This email contains either a malware-laced attachment or a link to download the malware that has been disguised as a useful program. Spear phishing attacks have been one of the most effective avenues for attackers to gain initial entry into a target network. This project focused on a proactive approach to spear phishing. To create an effective, user-specific spear phishing email, the attacker must research the intended recipient. We believe that much of the information used by the attacker is provided by the target organization's own external website. Thus when researching potential targets, the attacker leaves signs of his research in the webserver's logs. We created tools and visualizations to improve cybersecurity analysts' abilities to quickly understand a visitor's visit patterns and interests. Given these suspicious visitors and log-parsing tools, analysts can more quickly identify truly suspicious visitors, search for potential spear-phishing targeted users, and improve security around those users before the spear phishing email is sent.
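
    A minimal sketch of the log-parsing idea (not the project's actual tools): parse Common Log Format lines and flag hosts that enumerate many personnel pages, a pattern consistent with target research; the log format details, paths, and threshold are assumptions:

        import re
        from collections import defaultdict

        # Common Log Format: host, identity, user, timestamp, request, status, size.
        LOG = re.compile(
            r'(?P<host>\S+) \S+ \S+ \[(?P<when>[^\]]+)\] "GET (?P<path>\S+)[^"]*" \d+ \d+'
        )

        sample = [
            '203.0.113.9 - - [01/Jul/2013:10:00:01 -0600] "GET /staff/alice HTTP/1.1" 200 5120',
            '203.0.113.9 - - [01/Jul/2013:10:00:09 -0600] "GET /staff/bob HTTP/1.1" 200 4807',
            '203.0.113.9 - - [01/Jul/2013:10:00:15 -0600] "GET /staff/carol HTTP/1.1" 200 3990',
            '198.51.100.4 - - [01/Jul/2013:10:02:44 -0600] "GET /index.html HTTP/1.1" 200 1204',
        ]

        visits = defaultdict(set)
        for line in sample:
            if (m := LOG.match(line)) and m["path"].startswith("/staff/"):
                visits[m["host"]].add(m["path"])

        # Flag visitors who enumerate many personnel pages in one session.
        for host, pages in visits.items():
            if len(pages) >= 3:
                print(f"possible target research from {host}: {sorted(pages)}")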

  17. CACTI: Free, Open-Source Software for the Sequential Coding of Behavioral Interactions

    PubMed Central

    Glynn, Lisa H.; Hallgren, Kevin A.; Houck, Jon M.; Moyers, Theresa B.

    2012-01-01

    The sequential analysis of client and clinician speech in psychotherapy sessions can help to identify and characterize potential mechanisms of treatment and behavior change. Previous studies required coding systems that were time-consuming, expensive, and error-prone. Existing software can be expensive and inflexible, and furthermore, no single package allows for pre-parsing, sequential coding, and assignment of global ratings. We developed a free, open-source, and adaptable program to meet these needs: The CASAA Application for Coding Treatment Interactions (CACTI). Without transcripts, CACTI facilitates the real-time sequential coding of behavioral interactions using WAV-format audio files. Most elements of the interface are user-modifiable through a simple XML file, and can be further adapted using Java through the terms of the GNU Public License. Coding with this software yields interrater reliabilities comparable to previous methods, but at greatly reduced time and expense. CACTI is a flexible research tool that can simplify psychotherapy process research, and has the potential to contribute to the improvement of treatment content and delivery. PMID:22815713

  18. [Advances of portable electrocardiogram monitor design].

    PubMed

    Ding, Shenping; Wang, Yinghai; Wu, Weirong; Deng, Lingli; Lu, Jidong

    2014-06-01

    Portable electrocardiogram monitors are important equipment in the clinical diagnosis of cardiovascular diseases due to their portable, real-time features, and they have broad application and development prospects in China. In the present review, previous research on portable electrocardiogram monitors is arranged, analyzed, and summarized. According to the characteristics of the electrocardiogram (ECG), this paper discusses the ergonomic design of the portable electrocardiogram monitor, including hardware and software. The circuit components and software modules are parsed from the ECG features and system functions. Finally, development trends and references are provided for portable electrocardiogram monitors and for subsequent research and product design.

  19. Natural-Language Parser for PBEM

    NASA Technical Reports Server (NTRS)

    James, Mark

    2010-01-01

    A computer program called "Hunter" accepts, as input, a colloquial-English description of a set of policy-based-management rules, and parses that description into a form useable by policy-based enterprise management (PBEM) software. PBEM is a rules-based approach suitable for automating some management tasks. PBEM simplifies the management of a given enterprise through establishment of policies addressing situations that are likely to occur. Hunter was developed to have a unique capability to extract the intended meaning instead of focusing on parsing the exact ways in which individual words are used.

  20. Translation lexicon acquisition from bilingual dictionaries

    NASA Astrophysics Data System (ADS)

    Doermann, David S.; Ma, Huanfeng; Karagol-Ayan, Burcu; Oard, Douglas W.

    2001-12-01

    Bilingual dictionaries hold great potential as a source of lexical resources for training automated systems for optical character recognition, machine translation, and cross-language information retrieval. In this work we describe a system for extracting term lexicons from printed copies of bilingual dictionaries. We describe our approach to page and definition segmentation and entry parsing. We have used the approach to parse a number of dictionaries, and we demonstrate the results for retrieval using a French-English dictionary to generate a translation lexicon, with a corpus of English queries applied to French documents to evaluate cross-language IR.

  1. RGSS-ID: an approach to new radiologic reporting system.

    PubMed

    Ikeda, M; Sakuma, S; Maruyama, K

    1990-01-01

    RGSS-ID is a developmental computer system that applies artificial intelligence (AI) methods to a reporting system. A representation scheme called Generalized Finding Representation (GFR) is proposed to bridge the gap between natural language expressions in the radiology report and AI methods. The entry process of RGSS-ID consists mainly of selecting items; the system allows a radiologist to compose a sentence that can be completely parsed by the computer. Further, RGSS-ID encodes findings into the expression corresponding to GFR and stores this expression in the knowledge data base. The final printed report is produced in natural language.

  2. Parsing and Quantification of Raw Orbitrap Mass Spectrometer Data Using RawQuant.

    PubMed

    Kovalchik, Kevin A; Moggridge, Sophie; Chen, David D Y; Morin, Gregg B; Hughes, Christopher S

    2018-06-01

    Effective analysis of protein samples by mass spectrometry (MS) requires careful selection and optimization of a range of experimental parameters. As the output from the primary detection device, the "raw" MS data file can be used to gauge the success of a given sample analysis. However, the closed-source nature of the standard raw MS file can complicate effective parsing of the data contained within. To ease and increase the range of analyses possible, the RawQuant tool was developed to enable parsing of raw MS files derived from Thermo Orbitrap instruments to yield meta and scan data in an openly readable text format. RawQuant can be commanded to export user-friendly files containing MS1, MS2, and MS3 metadata as well as matrices of quantification values based on isobaric tagging approaches. In this study, the utility of RawQuant is demonstrated in several scenarios: (1) reanalysis of shotgun proteomics data for the identification of the human proteome, (2) reanalysis of experiments utilizing isobaric tagging for whole-proteome quantification, and (3) analysis of a novel bacterial proteome and synthetic peptide mixture for assessing quantification accuracy when using isobaric tags. Together, these analyses successfully demonstrate RawQuant for the efficient parsing and quantification of data from raw Thermo Orbitrap MS files acquired in a range of common proteomics experiments. In addition, the individual analyses using RawQuant highlight parametric considerations in the different experimental sets and suggest targetable areas to improve depth of coverage in identification-focused studies and quantification accuracy when using isobaric tags.

  3. Paradigms for Assessing Hedonic Processing and Motivation in Humans: Relevance to Understanding Negative Symptoms in Psychopathology.

    PubMed

    Barch, Deanna M; Gold, James M; Kring, Ann M

    2017-07-01

    Clinicians and researchers have long known that one of the debilitating aspects of psychotic disorders is the presence of "negative symptoms," which involve impairments in hedonic and motivational function, and/or alterations in expressive affect. We have a number of excellent clinical tools available for assessing the presence and severity of negative symptoms. However, to better understand the mechanisms that may give rise to negative symptoms, we need tools and methods that can help distinguish among different potential contributing causes, as a means to develop more targeted intervention pathways. Using such paradigms is particularly important if we wish to understand whether the causes are the same or different across disorders that may share surface features of negative symptoms. This approach is in line with the goals of the Research Diagnostic Criteria Initiative, which advocates understanding the nature of core dimensions of brain-behavior relationships transdiagnostically. Here we highlight some of the emerging measures and paradigms that may help us to parse the nature and causes of negative symptoms, illustrating both the research approaches from which they emerge and the types of constructs that they can help elucidate. © The Author 2017. Published by Oxford University Press on behalf of the Maryland Psychiatric Research Center. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  4. Deriving Competencies for Mentors of Clinical and Translational Scholars

    PubMed Central

    Abedin, Zainab; Biskup, Ewelina; Silet, Karin; Garbutt, Jane M.; Kroenke, Kurt; Feldman, Mitchell D.; McGee Jr., Richard; Fleming, Michael; Pincus, Harold Alan

    2012-01-01

    Although the importance of research mentorship has been well established, the role of mentors of junior clinical and translational science investigators is not clearly defined. The authors attempt to derive a list of actionable competencies for mentors from a series of complementary methods. We examined focus groups, the literature, competencies derived for clinical and translational scholars, mentor training curricula, and mentor evaluation forms, and finally conducted an expert panel process in order to compose this list. These efforts resulted in a set of competencies that include generic competencies expected of all mentors, competencies specific to scientists, and competencies that are specific to clinical and translational research. They are divided into six thematic areas: (1) Communication and managing the relationship, (2) Psychosocial support, (3) Career and professional development, (4) Professional enculturation and scientific integrity, (5) Research development, and (6) Clinical and translational investigator development. For each thematic area, we have listed the associated competencies, 19 in total. For each competency, we list examples that are actionable and measurable. Although a comprehensive approach was used to derive this list of competencies, further work will be required to parse out how to apply and adapt them, as well as future research directions and evaluation processes. Clin Trans Sci 2012; Volume 5: 273–280 PMID:22686206

  5. On the relation between dependency distance, crossing dependencies, and parsing. Comment on "Dependency distance: a new perspective on syntactic patterns in natural languages" by Haitao Liu et al.

    NASA Astrophysics Data System (ADS)

    Gómez-Rodríguez, Carlos

    2017-07-01

    Liu et al. [1] provide a comprehensive account of research on dependency distance in human languages. While the article is a very rich and useful report on this complex subject, here I will expand on a few specific issues where research in computational linguistics (specifically natural language processing) can inform DDM research, and vice versa. These aspects have not been explored much in [1] or elsewhere, probably due to the little overlap between both research communities, but they may provide interesting insights for improving our understanding of the evolution of human languages, the mechanisms by which the brain processes and understands language, and the construction of effective computer systems to achieve this goal.

  6. The Unification Space implemented as a localist neural net: predictions and error-tolerance in a constraint-based parser.

    PubMed

    Vosse, Theo; Kempen, Gerard

    2009-12-01

    We introduce a novel computer implementation of the Unification-Space parser (Vosse and Kempen in Cognition 75:105-143, 2000) in the form of a localist neural network whose dynamics is based on interactive activation and inhibition. The wiring of the network is determined by Performance Grammar (Kempen and Harbusch in Verb constructions in German and Dutch. Benjamins, Amsterdam, 2003), a lexicalist formalism with feature unification as binding operation. While the network is processing input word strings incrementally, the evolving shape of parse trees is represented in the form of changing patterns of activation in nodes that code for syntactic properties of words and phrases, and for the grammatical functions they fulfill. The system is capable, at least qualitatively and rudimentarily, of simulating several important dynamic aspects of human syntactic parsing, including garden-path phenomena and reanalysis, effects of complexity (various types of clause embeddings), fault-tolerance in case of unification failures and unknown words, and predictive parsing (expectation-based analysis, surprisal effects). English is the target language of the parser described.
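    The activation dynamics involved are easy to sketch in a generic form. The toy below implements interactive activation and inhibition between two mutually exclusive analysis nodes; it is a schematic of the general mechanism, not the Unification-Space implementation itself.

    ```python
    # Toy interactive activation/inhibition: two nodes coding competing
    # syntactic analyses inhibit each other while receiving external evidence.
    import numpy as np

    def step(a, W, ext, rate=0.1, decay=0.1, lo=0.0, hi=1.0):
        net = W @ a + ext                          # weighted input per node
        grow = np.where(net > 0, hi - a, a - lo)   # room left to move
        a = a + rate * (grow * net - decay * (a - lo))
        return np.clip(a, lo, hi)

    W = np.array([[0.0, -0.8],                     # mutual inhibition
                  [-0.8, 0.0]])
    ext = np.array([0.30, 0.25])                   # evidence favors analysis 0
    a = np.array([0.1, 0.1])
    for _ in range(200):
        a = step(a, W, ext)
    print(a)                                       # analysis 0 should dominate
    ```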

  7. KEGGParser: parsing and editing KEGG pathway maps in Matlab.

    PubMed

    Arakelyan, Arsen; Nersisyan, Lilit

    2013-02-15

    The KEGG pathway database is a collection of manually drawn pathway maps accompanied by KGML-format files intended for use in automatic analysis. KGML files, however, do not contain the information required to completely reproduce all the events indicated in the static image of a pathway map. Several parsers and editors of KEGG pathways exist for processing KGML files. We introduce KEGGParser, a MATLAB-based tool for KEGG pathway parsing, semiautomatic fixing, editing, visualization, and analysis in the MATLAB environment. It also works with Scilab. The source code is available at http://www.mathworks.com/matlabcentral/fileexchange/37561.
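    For orientation, the core of any KGML reader is plain XML traversal. The sketch below is a minimal Python reader, not KEGGParser itself (which is MATLAB/Scilab code); the file name is a placeholder.

    ```python
    # Minimal KGML reader: collect pathway entries and their relations.
    import xml.etree.ElementTree as ET

    def parse_kgml(path):
        root = ET.parse(path).getroot()               # the <pathway> element
        entries = {e.get("id"): (e.get("name"), e.get("type"))
                   for e in root.findall("entry")}
        relations = [(r.get("entry1"), r.get("entry2"),
                      [s.get("name") for s in r.findall("subtype")])
                     for r in root.findall("relation")]
        return entries, relations

    entries, relations = parse_kgml("hsa04110.xml")   # e.g., a cell-cycle map
    print(len(entries), "entries;", len(relations), "relations")
    ```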

  8. A novel argument for the Universality of Parsing principles.

    PubMed

    Grillo, Nino; Costa, João

    2014-10-01

    Previous work on Relative Clause attachment has overlooked a crucial grammatical distinction across both the languages and structures tested: the selective availability of Pseudo Relatives. We reconsider the literature in light of this observation and argue that, all else being equal, local attachment is found with genuine Relative Clauses and that non-local attachment emerges when their surface-identical imposters, Pseudo Relatives, are available. Hence, apparent cross-linguistic variation in parsing preferences is reducible to grammatical factors. The results from two novel experiments in Italian are presented in support of these conclusions. Copyright © 2014 Elsevier B.V. All rights reserved.

  9. (abstract) Modeling Protein Families and Human Genes: Hidden Markov Models and a Little Beyond

    NASA Technical Reports Server (NTRS)

    Baldi, Pierre

    1994-01-01

    We will first give a brief overview of Hidden Markov Models (HMMs) and their use in Computational Molecular Biology. In particular, we will describe a detailed application of HMMs to the G-Protein-Coupled-Receptor Superfamily. We will also describe a number of analytical results on HMMs that can be used in discrimination tests and database mining. We will then discuss the limitations of HMMs and some new directions of research. We will conclude with some recent results on the application of HMMs to human gene modeling and parsing.
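    As a concrete reminder of the decoding machinery involved, here is a minimal Viterbi decoder over a toy two-state exon/intron HMM; the probabilities are invented for illustration and are not from the work described.

    ```python
    # Viterbi decoding: most likely hidden-state path for an observed sequence.
    import math

    def viterbi(obs, states, start_p, trans_p, emit_p):
        V = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
        back = [{}]
        for t in range(1, len(obs)):
            V.append({})
            back.append({})
            for s in states:
                prev, score = max(((p, V[t - 1][p] + math.log(trans_p[p][s]))
                                   for p in states), key=lambda x: x[1])
                V[t][s] = score + math.log(emit_p[s][obs[t]])
                back[t][s] = prev
        state = max(V[-1], key=V[-1].get)
        path = [state]
        for t in range(len(obs) - 1, 0, -1):
            state = back[t][state]
            path.append(state)
        return path[::-1]

    states = ("exon", "intron")
    start = {"exon": 0.5, "intron": 0.5}
    trans = {"exon": {"exon": 0.9, "intron": 0.1},
             "intron": {"exon": 0.1, "intron": 0.9}}
    emit = {"exon":   {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
            "intron": {"A": 0.40, "C": 0.10, "G": 0.10, "T": 0.40}}
    print(viterbi("GCGCAATATT", states, start, trans, emit))
    ```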

  10. Connections and lingering presence as cocreated art.

    PubMed

    Dempsey, Leona F

    2008-10-01

    Parse described nursing practice as a performing art where the nurse is like a dancer. Just as in any dance performance, unplanned events may occur. When a nurse is artistically living, unique and meaningful performances might emerge from unplanned events. In this practice column, the author describes how shifting experiences surfaced with unforeseen connections and lingering presence during her study of feeling confined. In her study she was in true presence with men living in prison, who were diagnosed with severe mental illness. The humanbecoming school of thought was the nursing perspective guiding the research study.

  11. Codestream-Based Identification of JPEG 2000 Images with Different Coding Parameters

    NASA Astrophysics Data System (ADS)

    Watanabe, Osamu; Fukuhara, Takahiro; Kiya, Hitoshi

    A method of identifying JPEG 2000 images with different coding parameters, such as code-block sizes, quantization-step sizes, and resolution levels, is presented. It does not produce false-negative matches regardless of different coding parameters (compression rate, code-block size, and discrete wavelet transform (DWT) resolution levels) or quantization-step sizes. This feature is not provided by conventional methods. Moreover, the proposed approach is fast because it uses the number of zero bit-planes, which can be extracted from the JPEG 2000 codestream by parsing only the header information, without embedded block coding with optimized truncation (EBCOT) decoding. The experimental results revealed the effectiveness of image identification based on the new method.

  12. Locating and parsing bibliographic references in HTML medical articles

    PubMed Central

    Zou, Jie; Le, Daniel; Thoma, George R.

    2010-01-01

    The set of references that typically appear toward the end of journal articles is sometimes, though not always, a field in bibliographic (citation) databases. But even if references do not constitute such a field, they can be useful as a preprocessing step in the automated extraction of other bibliographic data from articles, as well as in computer-assisted indexing of articles. Automation in data extraction and indexing to minimize human labor is key to the affordable creation and maintenance of large bibliographic databases. Extracting the components of references, such as author names, article title, journal name, publication date and other entities, is therefore a valuable and sometimes necessary task. This paper describes a two-step process using statistical machine learning algorithms, to first locate the references in HTML medical articles and then to parse them. Reference locating identifies the reference section in an article and then decomposes it into individual references. We formulate this step as a two-class classification problem based on text and geometric features. An evaluation conducted on 500 articles drawn from 100 medical journals achieves near-perfect precision and recall rates for locating references. Reference parsing identifies the components of each reference. For this second step, we implement and compare two algorithms. One relies on sequence statistics and trains a Conditional Random Field. The other focuses on local feature statistics and trains a Support Vector Machine to classify each individual word, followed by a search algorithm that systematically corrects low confidence labels if the label sequence violates a set of predefined rules. The overall performance of these two reference-parsing algorithms is about the same: above 99% accuracy at the word level, and over 97% accuracy at the chunk level. PMID:20640222
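    The sequence-labeling half of the second step is easy to sketch with the third-party sklearn-crfsuite package; the token features and the tiny training pair below are illustrative stand-ins, not the paper's feature set.

    ```python
    # CRF-based reference parsing sketch: label each token with a field name.
    import sklearn_crfsuite

    def token_features(tokens, i):
        tok = tokens[i]
        return {
            "lower": tok.lower(),
            "is_digit": tok.isdigit(),
            "is_capitalized": tok[:1].isupper(),
            "has_period": "." in tok,
            "position": i / len(tokens),        # fields appear in rough order
            "prev": tokens[i - 1].lower() if i else "<BOS>",
        }

    train_tokens = [["Smith", "LH", ",", "Wilbur", "WJ", ".",
                     "Parsing", "citations", "."]]
    train_labels = [["AUTHOR"] * 6 + ["TITLE"] * 3]
    X = [[token_features(seq, i) for i in range(len(seq))] for seq in train_tokens]

    crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1,
                               max_iterations=100)
    crf.fit(X, train_labels)
    print(crf.predict(X)[0])
    ```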

  13. Marginal Space Deep Learning: Efficient Architecture for Volumetric Image Parsing.

    PubMed

    Ghesu, Florin C; Krubasik, Edward; Georgescu, Bogdan; Singh, Vivek; Zheng, Yefeng; Hornegger, Joachim; Comaniciu, Dorin

    2016-05-01

    Robust and fast solutions for anatomical object detection and segmentation support the entire clinical workflow from diagnosis, patient stratification, therapy planning, intervention, and follow-up. Current state-of-the-art techniques for parsing volumetric medical image data are typically based on machine learning methods that exploit large annotated image databases. Two main challenges need to be addressed: the efficiency of scanning high-dimensional parametric spaces, and the need for representative image features, which otherwise require significant manual engineering effort. We propose a pipeline for object detection and segmentation in the context of volumetric image parsing, solving a two-step learning problem: anatomical pose estimation and boundary delineation. For this task we introduce Marginal Space Deep Learning (MSDL), a novel framework exploiting both the strengths of efficient object parametrization in hierarchical marginal spaces and the automated feature design of Deep Learning (DL) network architectures. In the 3D context, the application of deep learning systems is limited by the very high complexity of the parametrization. More specifically, 9 parameters are necessary to describe a restricted affine transformation in 3D, resulting in a prohibitive number of billions of scanning hypotheses. The mechanism of marginal space learning provides excellent run-time performance by learning classifiers in clustered, high-probability regions in spaces of gradually increasing dimensionality. To further increase computational efficiency and robustness, our system learns sparse adaptive data sampling patterns that automatically capture the structure of the input. Given the object localization, we propose a DL-based active shape model to estimate the non-rigid object boundary. Experimental results are presented on the aortic valve in ultrasound, using an extensive dataset of 2891 volumes from 869 patients, showing significant improvements of up to 45.2% over the state of the art. To our knowledge, this is the first successful demonstration of the potential of DL for detection and segmentation in full 3D data with parametrized representations.
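    The coarse-to-fine search over marginal spaces is easy to illustrate schematically. The sketch below is a toy, not the authors' MSDL system: the classifier score is a random stand-in and the candidate grids are invented, but the pruning pattern over spaces of increasing dimensionality is the point.

    ```python
    # Marginal-space search sketch: prune pose hypotheses stage by stage
    # (position -> position+orientation -> full similarity transform).
    import itertools
    import random

    def score(candidate):          # stand-in for a learned classifier
        return random.random()

    def top_k(candidates, k):
        return sorted(candidates, key=score, reverse=True)[:k]

    positions = list(itertools.product(range(0, 64, 4), repeat=3))
    stage1 = top_k(positions, k=100)                       # 3-D marginal space
    stage2 = top_k([p + o for p in stage1                  # add orientation
                    for o in itertools.product((0, 45, 90), repeat=3)], k=100)
    stage3 = top_k([p + s for p in stage2                  # add anisotropic scale
                    for s in itertools.product((8, 16), repeat=3)], k=10)
    print(len(stage3), "surviving 9-D pose hypotheses")
    ```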

  14. [Measurement and performance analysis of functional neural network].

    PubMed

    Li, Shan; Liu, Xinyu; Chen, Yan; Wan, Hong

    2018-04-01

    Network measurement is an important research problem in resolving the information-processing mechanisms of neuronal populations using complex network theory. For the quantitative measurement of functional neural networks, the relation between the measurement indexes, i.e., the clustering coefficient, the global efficiency, the characteristic path length, and the transitivity, and the network topology was analyzed. Then, a spike-based functional neural network was established, and the simulation results showed that the measured network could represent the original neural connections among neurons. On this basis, the coding of the functional neural network in the nidopallium caudolaterale (NCL) for the pigeon's motion behaviors was studied. We found that the NCL functional neural network effectively encoded the motion behaviors of the pigeon, and there were significant differences in the four indexes among left-turning, forward, and right-turning behaviors. Overall, the method for establishing a spike-based functional neural network is feasible, and it is an effective tool for parsing the brain's information-processing mechanisms.
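    The four indexes named above are standard graph measures and can be computed directly with the networkx package on a small undirected graph; this is a generic illustration, not the authors' pipeline.

    ```python
    # Toy functional network: nodes are neurons, edges are significant
    # spike-train correlations; compute the four measures discussed above.
    import networkx as nx

    G = nx.Graph([(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)])

    print("clustering coefficient:    ", nx.average_clustering(G))
    print("global efficiency:         ", nx.global_efficiency(G))
    print("characteristic path length:", nx.average_shortest_path_length(G))
    print("transitivity:              ", nx.transitivity(G))
    ```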

  15. Drug related webpages classification using images and text information based on multi-kernel learning

    NASA Astrophysics Data System (ADS)

    Hu, Ruiguang; Xiao, Liping; Zheng, Wenjuan

    2015-12-01

    In this paper, multi-kernel learning (MKL) is used for drug-related webpage classification. First, body text and image-label text are extracted through HTML parsing, and valid images are chosen by the FOCARSS algorithm. Second, a text-based bag-of-words (BOW) model is used to generate the text representation, and an image-based BOW model is used to generate the image representation. Last, the text and image representations are fused with several methods. Experimental results demonstrate that the classification accuracy of MKL is higher than that of all other fusion methods at the decision level and feature level, and much higher than the accuracy of single-modal classification.

  16. Processing of ICARTT Data Files Using Fuzzy Matching and Parser Combinators

    NASA Technical Reports Server (NTRS)

    Rutherford, Matthew T.; Typanski, Nathan D.; Wang, Dali; Chen, Gao

    2014-01-01

    In this paper, the task of parsing and matching inconsistent, poorly formed text data through the use of parser combinators and fuzzy matching is discussed. An object-oriented implementation of the parser combinator technique is used to provide a relatively simple interface for adapting base parsers. For matching tasks, a fuzzy matching algorithm with Levenshtein distance calculations is implemented to match string pairs that are otherwise difficult to match due to the aforementioned irregularities and errors in one or both pair members. Used in concert, the two techniques allow parsing and matching operations to be performed that had previously only been done manually.
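    Both ingredients are compact enough to sketch. The toy below is not the NASA implementation, and the "DATA_INTERVAL" header is an invented stand-in for an ICARTT field; it shows a sequencing combinator built from small base parsers, plus the Levenshtein distance used for fuzzy pairing.

    ```python
    # Parser combinators: each parser returns (value, rest) or None; `seq`
    # composes parsers, which is how larger grammars are assembled.
    def literal(s):
        def parse(text):
            return (s, text[len(s):]) if text.startswith(s) else None
        return parse

    def seq(*parsers):
        def parse(text):
            out = []
            for p in parsers:
                r = p(text)
                if r is None:
                    return None
                value, text = r
                out.append(value)
            return out, text
        return parse

    def levenshtein(a, b):           # edit distance for fuzzy string matching
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                               prev[j - 1] + (ca != cb)))
            prev = cur
        return prev[-1]

    header = seq(literal("DATA_INTERVAL"), literal(":"))
    print(header("DATA_INTERVAL: 60"))   # (['DATA_INTERVAL', ':'], ' 60')
    print(levenshtein("ozone", "ozne"))  # 1, close enough to fuzzy-match
    ```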

  17. Thermo-msf-parser: an open source Java library to parse and visualize Thermo Proteome Discoverer msf files.

    PubMed

    Colaert, Niklaas; Barsnes, Harald; Vaudel, Marc; Helsens, Kenny; Timmerman, Evy; Sickmann, Albert; Gevaert, Kris; Martens, Lennart

    2011-08-05

    The Thermo Proteome Discoverer program integrates both peptide identification and quantification into a single workflow for peptide-centric proteomics. Furthermore, its close integration with Thermo mass spectrometers has made it increasingly popular in the field. Here, we present a Java library to parse the msf files that constitute the output of Proteome Discoverer. The parser is also implemented in a graphical user interface allowing convenient access to the information found in the msf files, and in Rover, a program to analyze and validate quantitative proteomics information. All code, binaries, and documentation are freely available at http://thermo-msf-parser.googlecode.com.
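    An msf file is an SQLite container, so a quick look is possible from any language with SQLite bindings, not only through the Java library. The query below is commented out because its table and column names are hypothetical; the real schema should be discovered from sqlite_master first.

    ```python
    # Inspect a Proteome Discoverer msf file as an SQLite database.
    import sqlite3

    con = sqlite3.connect("experiment.msf")
    tables = [row[0] for row in con.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    print(tables)   # discover the actual schema before querying

    # Hypothetical query once the schema is known (names are assumptions):
    # for seq, score in con.execute("SELECT Sequence, Score FROM Peptides"):
    #     print(seq, score)
    con.close()
    ```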

  18. ANTLR Tree Grammar Generator and Extensions

    NASA Technical Reports Server (NTRS)

    Craymer, Loring

    2005-01-01

    A computer program implements two extensions of ANTLR (Another Tool for Language Recognition), which is a set of software tools for translating source codes between different computing languages. ANTLR supports predicated-LL(k) lexer and parser grammars, a notation for annotating parser grammars to direct tree construction, and predicated tree grammars. [LL(k) signifies left-right, leftmost derivation with k tokens of look-ahead, referring to certain characteristics of a grammar.] One of the extensions is a syntax for tree transformations. The other extension is the generation of tree grammars from annotated parser or input tree grammars. These extensions can simplify the process of generating source-to-source language translators and they make possible an approach, called "polyphase parsing," to translation between computing languages. The typical approach to translator development is to identify high-level semantic constructs such as "expressions," "declarations," and "definitions" as fundamental building blocks in the grammar specification used for language recognition. The polyphase approach is to lump ambiguous syntactic constructs during parsing and then disambiguate the alternatives in subsequent tree transformation passes. Polyphase parsing is believed to be useful for generating efficient recognizers for C++ and other languages that, like C++, have significant ambiguities.

  19. Acoustic facilitation of object movement detection during self-motion

    PubMed Central

    Calabro, F. J.; Soto-Faraco, S.; Vaina, L. M.

    2011-01-01

    In humans, as well as most animal species, perception of object motion is critical to successful interaction with the surrounding environment. Yet, as the observer also moves, the retinal projections of the various motion components add to each other and extracting accurate object motion becomes computationally challenging. Recent psychophysical studies have demonstrated that observers use a flow-parsing mechanism to estimate and subtract self-motion from the optic flow field. We investigated whether concurrent acoustic cues for motion can facilitate visual flow parsing, thereby enhancing the detection of moving objects during simulated self-motion. Participants identified an object (the target) that moved either forward or backward within a visual scene containing nine identical textured objects simulating forward observer translation. We found that spatially co-localized, directionally congruent, moving auditory stimuli enhanced object motion detection. Interestingly, subjects who performed poorly on the visual-only task benefited more from the addition of moving auditory stimuli. When auditory stimuli were not co-localized to the visual target, improvements in detection rates were weak. Taken together, these results suggest that parsing object motion from self-motion-induced optic flow can operate on multisensory object representations. PMID:21307050

  1. Hierarchical parsing and semantic navigation of full body CT data

    NASA Astrophysics Data System (ADS)

    Seifert, Sascha; Barbu, Adrian; Zhou, S. Kevin; Liu, David; Feulner, Johannes; Huber, Martin; Suehling, Michael; Cavallaro, Alexander; Comaniciu, Dorin

    2009-02-01

    Whole body CT scanning is a common diagnosis technique for discovering early signs of metastasis or for differential diagnosis. Automatic parsing and segmentation of multiple organs and semantic navigation inside the body can help the clinician in efficiently obtaining accurate diagnosis. However, dealing with the large amount of data of a full body scan is challenging and techniques are needed for the fast detection and segmentation of organs, e.g., heart, liver, kidneys, bladder, prostate, and spleen, and body landmarks, e.g., bronchial bifurcation, coccyx tip, sternum, lung tips. Solving the problem becomes even more challenging if partial body scans are used, where not all organs are present. We propose a new approach to this problem, in which a network of 1D and 3D landmarks is trained to quickly parse the 3D CT data and estimate which organs and landmarks are present as well as their most probable locations and boundaries. Using this approach, the segmentation of seven organs and detection of 19 body landmarks can be obtained in about 20 seconds with state-of-the-art accuracy and has been validated on 80 CT full or partial body scans.

  2. Variations in Medical Subject Headings (MeSH) mapping: from the natural language of patron terms to the controlled vocabulary of mapped lists

    PubMed Central

    Gault, Lora V.; Shultz, Mary; Davies, Kathy J.

    2002-01-01

    Objectives: This study compared the mapping of natural language patron terms to the Medical Subject Headings (MeSH) across six MeSH interfaces for the MEDLINE database. Methods: Test data were obtained from search requests submitted by patrons to the Library of the Health Sciences, University of Illinois at Chicago, over a nine-month period. Search request statements were parsed into separate terms or phrases. Using print sources from the National Library of Medicine, each parsed patron term was assigned corresponding MeSH terms. Each patron term was entered into each of the selected interfaces to determine how effectively they mapped to MeSH. Data were collected for mapping success, accessibility of the MeSH term within the mapped list, and total number of MeSH choices within each list. Results: The selected MEDLINE interfaces do not map the same patron term in the same way, nor do they consistently lead to what is considered the appropriate MeSH term. Conclusions: If searchers utilize the MEDLINE database to its fullest potential by mapping to MeSH, the results of the mapping will vary between interfaces. This variance may ultimately impact the search results. These differences should be considered when choosing a MEDLINE interface and when instructing end users. PMID:11999175

  3. PDB explorer -- a web based algorithm for protein annotation viewer and 3D visualization.

    PubMed

    Nayarisseri, Anuraj; Shardiwal, Rakesh Kumar; Yadav, Mukesh; Kanungo, Neha; Singh, Pooja; Shah, Pratik; Ahmed, Sheaza

    2014-12-01

    The PDB file format is a text format characterizing the three-dimensional structures of macromolecules available in the Protein Data Bank (PDB). Determined protein structures are often found in association with other molecules or ions, such as nucleic acids, water, ions, and drug molecules, all of which can be described in the PDB format and deposited in the PDB database. A PDB file is machine generated and not easily human readable; computational tools are needed to interpret it. The objective of our present study is to develop free online software for retrieval, visualization, and reading of the annotation of a protein 3D structure available in the PDB database. The main aim is to present the PDB file in a human-readable format, i.e., to convert the information in the PDB file into readable sentences. It displays all possible information from a PDB file, including the 3D structure of that file. Programming and scripting languages including Perl, CSS, JavaScript, Ajax, and HTML have been used for the development of PDB Explorer. PDB Explorer directly parses the PDB file, calling methods for each parsed element: secondary-structure elements, atoms, coordinates, etc. PDB Explorer is freely available at http://www.pdbexplorer.eminentbio.com/home, with no requirement of log-in.
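    The parsing core of such a viewer is a fixed-column reader for coordinate records. Below is a minimal version following the wwPDB format specification's column layout; the file name is a placeholder.

    ```python
    # Parse ATOM/HETATM records from a PDB file by fixed column positions.
    def parse_pdb_atoms(path):
        atoms = []
        with open(path) as fh:
            for line in fh:
                if line.startswith(("ATOM", "HETATM")):
                    atoms.append({
                        "serial":  int(line[6:11]),
                        "name":    line[12:16].strip(),
                        "resName": line[17:20].strip(),
                        "chainID": line[21],
                        "resSeq":  int(line[22:26]),
                        "x": float(line[30:38]),
                        "y": float(line[38:46]),
                        "z": float(line[46:54]),
                    })
        return atoms

    atoms = parse_pdb_atoms("1abc.pdb")
    first_ca = next((a for a in atoms if a["name"] == "CA"), None)
    print(len(atoms), "atoms; first C-alpha:", first_ca)
    ```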

  4. Building a comprehensive syntactic and semantic corpus of Chinese clinical texts.

    PubMed

    He, Bin; Dong, Bin; Guan, Yi; Yang, Jinfeng; Jiang, Zhipeng; Yu, Qiubin; Cheng, Jianyi; Qu, Chunyan

    2017-05-01

    To build a comprehensive corpus covering syntactic and semantic annotations of Chinese clinical texts with corresponding annotation guidelines and methods as well as to develop tools trained on the annotated corpus, which supplies baselines for research on Chinese texts in the clinical domain. An iterative annotation method was proposed to train annotators and to develop annotation guidelines. Then, by using annotation quality assurance measures, a comprehensive corpus was built, containing annotations of part-of-speech (POS) tags, syntactic tags, entities, assertions, and relations. Inter-annotator agreement (IAA) was calculated to evaluate the annotation quality and a Chinese clinical text processing and information extraction system (CCTPIES) was developed based on our annotated corpus. The syntactic corpus consists of 138 Chinese clinical documents with 47,426 tokens and 2612 full parsing trees, while the semantic corpus includes 992 documents that annotated 39,511 entities with their assertions and 7693 relations. IAA evaluation shows that this comprehensive corpus is of good quality, and the system modules are effective. The annotated corpus makes a considerable contribution to natural language processing (NLP) research into Chinese texts in the clinical domain. However, this corpus has a number of limitations. Some additional types of clinical text should be introduced to improve corpus coverage and active learning methods should be utilized to promote annotation efficiency. In this study, several annotation guidelines and an annotation method for Chinese clinical texts were proposed, and a comprehensive corpus with its NLP modules were constructed, providing a foundation for further study of applying NLP techniques to Chinese texts in the clinical domain. Copyright © 2017. Published by Elsevier Inc.

  5. A rapid place name locating algorithm based on ontology qualitative retrieval, ranking and recommendation

    NASA Astrophysics Data System (ADS)

    Fan, Hong; Zhu, Anfeng; Zhang, Weixia

    2015-12-01

    To support rapid locating of 12315 consumer complaints, which are expressed in natural language over the telephone, a semantic retrieval framework based on natural language parsing and place-name ontology reasoning is proposed. Within this framework, a search-result ranking and recommendation algorithm is proposed that considers both place-name conceptual similarity and spatial-relation similarity. Experiments show that this method can assist operators in quickly locating 12315 complaints, increasing customer satisfaction with the industry and commerce administration.
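    As a sketch of the ranking step, the combined score can be a weighted sum of the two similarities. The weights, field names, and candidates below are illustrative assumptions, not values from the paper.

    ```python
    # Rank candidate places by combined conceptual and spatial similarity.
    def rank_candidates(candidates, w_concept=0.6, w_spatial=0.4):
        """candidates: dicts with precomputed similarity scores in [0, 1]."""
        scored = [(w_concept * c["concept_sim"] + w_spatial * c["spatial_sim"],
                   c["name"]) for c in candidates]
        return sorted(scored, reverse=True)

    candidates = [
        {"name": "Renmin Rd. (district A)", "concept_sim": 0.9, "spatial_sim": 0.4},
        {"name": "Renmin Rd. (district B)", "concept_sim": 0.9, "spatial_sim": 0.8},
    ]
    for score, name in rank_candidates(candidates):
        print(f"{score:.2f}  {name}")
    ```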

  6. Grieving a loss: the lived experience for elders residing in an institution.

    PubMed

    Pilkington, F Beryl

    2005-07-01

    Grieving a loss is a profound and universal human experience. This phenomenological-hermeneutic study was an inquiry into the lived experience of grieving a loss. The nursing perspective was Parse's human becoming theory. Participants were 10 elderly persons residing in a long-term care facility. The study finding specifies the structure of the lived experience of grieving a loss as aching solitude amid enduring cherished affiliations, as serene acquiescence arises with sorrowful curtailments. Findings are discussed in relation to the guiding theoretical perspective and related literature. Recommendations for additional research and insights for practice are presented.

  7. The value of parsing as feature generation for gene mention recognition

    PubMed Central

    Smith, Larry H; Wilbur, W John

    2009-01-01

    We measured the extent to which information surrounding a base noun phrase reflects the presence of a gene name, and evaluated seven different parsers in their ability to provide information for that purpose. Using the GENETAG corpus as a gold standard, we performed machine learning to recognize from its context when a base noun phrase contained a gene name. Starting with the best lexical features, we assessed the gain of adding dependency or dependency-like relations from a full sentence parse. Features derived from parsers improved performance in this partial gene mention recognition task by a small but statistically significant amount. There were virtually no differences between parsers in these experiments. PMID:19345281

  8. Parsing Citations in Biomedical Articles Using Conditional Random Fields

    PubMed Central

    Zhang, Qing; Cao, Yong-Gang; Yu, Hong

    2011-01-01

    Citations are used ubiquitously in biomedical full-text articles and play an important role in representing both the rhetorical structure and the semantic content of the articles. As a result, text mining systems will significantly benefit from a tool that automatically extracts the content of a citation. In this study, we applied the supervised machine-learning algorithm Conditional Random Fields (CRFs) to automatically parse a citation into its fields (e.g., Author, Title, Journal, and Year). With a subset of HTML-format open-access PubMed Central articles, we report an overall 97.95% F1-score. The citation parser can be accessed at: http://www.cs.uwm.edu/~qing/projects/cithit/index.html. PMID:21419403

  9. Development of 3D browsing and interactive web system

    NASA Astrophysics Data System (ADS)

    Shi, Xiaonan; Fu, Jian; Jin, Chaolin

    2017-09-01

    Currently, users must download dedicated software or plug-ins to browse 3D models; such browsing can be unstable, and it typically does not support interaction with the model. To solve these problems, this paper presents a solution in which the model is parsed on the server side for interactive browsing: the user only needs to enter the system URL and upload a 3D model file in order to browse it. The server parses the 3D model in real time with fast interactive response. This follows a minimalist philosophy for the user and addresses a current obstacle to 3D content development.

  10. Violence against women: the phenomenon of workplace violence against nurses.

    PubMed

    Child, R J Howerton; Mentes, Janet C

    2010-02-01

    Registered nurses have been the recipients of an alarming increase in workplace violence (WPV). Emergency and psychiatric nurses have been found to be the most vulnerable, and yet few solid reporting procedures exist to fully account for the true number of incidents. Further compounding the problem is the lack of a standard definition of violence to guide reporting procedures, interventions, legislation, and research. While certain risk factors predispose both nurses and patients to WPV, research continues to attempt to parse out which risk factors are the key determinants of WPV and which interventions prove significant in reducing it. The nursing shortage is expected only to increase; recruitment and retention of qualified staff members may be deterred by WPV. This necessitates focused research on the phenomenon of workplace violence in health care.

  11. A Grammatical Approach to RNA-RNA Interaction Prediction

    NASA Astrophysics Data System (ADS)

    Kato, Yuki; Akutsu, Tatsuya; Seki, Hiroyuki

    2007-11-01

    Much attention has been paid to two interacting RNA molecules involved in post-transcriptional control of gene expression. Although there have been a few studies on RNA-RNA interaction prediction based on dynamic programming algorithm, no grammar-based approach has been proposed. The purpose of this paper is to provide a new modeling for RNA-RNA interaction based on multiple context-free grammar (MCFG). We present a polynomial time parsing algorithm for finding the most likely derivation tree for the stochastic version of MCFG, which is applicable to RNA joint secondary structure prediction including kissing hairpin loops. Also, elementary tests on RNA-RNA interaction prediction have shown that the proposed method is comparable to Alkan et al.'s method.

  12. MMTF-An efficient file format for the transmission, visualization, and analysis of macromolecular structures.

    PubMed

    Bradley, Anthony R; Rose, Alexander S; Pavelka, Antonín; Valasatava, Yana; Duarte, Jose M; Prlić, Andreas; Rose, Peter W

    2017-06-01

    Recent advances in experimental techniques have led to a rapid growth in the complexity, size, and number of macromolecular structures that are made available through the Protein Data Bank. This creates a challenge for macromolecular visualization and analysis. Macromolecular structure files, such as PDB or PDBx/mmCIF files, can be slow to transfer and parse, and hard to incorporate into third-party software tools. Here, we present a new binary and compressed data representation, the MacroMolecular Transmission Format, MMTF, as well as software implementations in several languages that have been developed around it, which address these issues. We describe the new format and its APIs and demonstrate that it is several times faster to parse, and about a quarter of the file size of the current standard format, PDBx/mmCIF. As a consequence of the new data representation, it is now possible to visualize structures with millions of atoms in a web browser, keep the whole PDB archive in memory, or parse it within a few minutes on average computers, which opens up a new way of thinking about how to design and implement efficient algorithms in structural bioinformatics. The PDB archive is available in MMTF file format through web services and data that are updated on a weekly basis.
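    Because MMTF is a MessagePack-encoded container, its plain header fields can be read with a generic decoder; the binary-encoded coordinate arrays additionally require the MMTF codecs, which the official mmtf-python package implements. The field names below follow the MMTF specification but should be treated as assumptions and verified against it.

    ```python
    # Peek at the top-level fields of a locally downloaded MMTF file.
    import msgpack

    with open("4HHB.mmtf", "rb") as fh:                 # placeholder file name
        data = msgpack.unpackb(fh.read(), raw=False)

    # Assumed field names per the MMTF spec; coordinate lists such as
    # "xCoordList" remain binary until run through the MMTF decoders.
    print(data["structureId"], data["numModels"], data["numAtoms"])
    ```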

  13. Motion based parsing for video from observational psychology

    NASA Astrophysics Data System (ADS)

    Kokaram, Anil; Doyle, Erika; Lennon, Daire; Joyeux, Laurent; Fuller, Ray

    2006-01-01

    In Psychology it is common to conduct studies involving the observation of humans undertaking some task. The sessions are typically recorded on video and used for subjective visual analysis. The subjective analysis is tedious and time consuming, not only because much useless video material is recorded but also because subjective measures of human behaviour are not necessarily repeatable. This paper presents tools using content-based video analysis that allow automated parsing of video from one such study involving Dyslexia. The tools rely on implicit measures of human motion that can be generalised to other applications in the domain of human observation. Results comparing quantitative assessment of human motion with subjective assessment are also presented, illustrating that the system is a useful scientific tool.

  14. GFFview: A Web Server for Parsing and Visualizing Annotation Information of Eukaryotic Genome.

    PubMed

    Deng, Feilong; Chen, Shi-Yi; Wu, Zhou-Lin; Hu, Yongsong; Jia, Xianbo; Lai, Song-Jia

    2017-10-01

    Owing to the wide application of RNA sequencing (RNA-seq) technology, more and more eukaryotic genomes have been extensively annotated, including gene structure, alternative splicing, and noncoding loci. Genome annotation information is commonly stored as plain text in General Feature Format (GFF), which can be hundreds or thousands of MB in size. Manipulating a GFF file is therefore a challenge for biologists without bioinformatics skills. In this study, we provide a web server (GFFview) for parsing the annotation information of eukaryotic genomes and generating statistical descriptions of six indices for visualization. GFFview is very useful for investigating the quality of, and differences among, de novo assembled transcriptomes in RNA-seq studies.
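    GFF's nine tab-separated columns make the server-side parsing straightforward to sketch; the minimal reader below handles GFF3-style key=value attribute strings, with the file name as a placeholder.

    ```python
    # Minimal GFF3 reader plus one summary index (feature-type counts).
    from collections import Counter

    def parse_gff3(path):
        for line in open(path):
            if line.startswith("#") or not line.strip():
                continue                       # skip headers and blank lines
            seqid, source, ftype, start, end, score, strand, phase, attrs = \
                line.rstrip("\n").split("\t")
            attributes = dict(kv.split("=", 1)
                              for kv in attrs.split(";") if "=" in kv)
            yield {"seqid": seqid, "type": ftype, "start": int(start),
                   "end": int(end), "strand": strand, "attributes": attributes}

    counts = Counter(f["type"] for f in parse_gff3("annotation.gff3"))
    print(counts.most_common(5))               # e.g., gene, mRNA, exon, CDS ...
    ```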

  15. Experiential wellbeing data from the American Time Use Survey: Comparisons with other methods and analytic illustrations with age and income.

    PubMed

    Stone, Arthur A; Schneider, Stefan; Krueger, Alan; Schwartz, Joseph E; Deaton, Angus

    2018-02-01

    There has been a recent upsurge of interest in self-reported measures of wellbeing among official statisticians and researchers in the social sciences. This paper considers data from a wellbeing supplement to the American Time Use Survey (ATUS), which parsed the previous day into episodes. Respondents provided ratings of five experiential wellbeing adjectives (happiness, stress, tiredness, sadness, and pain) for each of three randomly selected episodes. Because the ATUS Well-being module has not received very much attention, in this paper we provide the reader with details about the features of these data and our approach to analyzing them (e.g., weighting considerations), and then illustrate the applicability of these data to current issues. Specifically, we examine the association of age and income with all of the experiential wellbeing adjectives in the ATUS. Results from the ATUS wellbeing module were broadly consistent with earlier findings on age, but did not confirm all earlier findings on the relation between income and wellbeing. We conclude that the ATUS, with its measurement of time use, specific activities, and hedonic experience in a nationally representative survey, offers a unique opportunity to incorporate time use into the burgeoning field of wellbeing research.

  16. Parsing partial molar volumes of small molecules: a molecular dynamics study.

    PubMed

    Patel, Nisha; Dubins, David N; Pomès, Régis; Chalikian, Tigran V

    2011-04-28

    We used molecular dynamics (MD) simulations in conjunction with the Kirkwood-Buff theory to compute the partial molar volumes for a number of small solutes of various chemical natures. We repeated our computations using modified pair potentials, first, in the absence of the Coulombic term and, second, in the absence of the Coulombic and the attractive Lennard-Jones terms. Comparison of our results with experimental data and the volumetric results of Monte Carlo simulation with hard sphere potentials and scaled particle theory-based computations led us to conclude that, for small solutes, the partial molar volume computed with the Lennard-Jones potential in the absence of the Coulombic term nearly coincides with the cavity volume. On the other hand, MD simulations carried out with the pair interaction potentials containing only the repulsive Lennard-Jones term produce unrealistically large partial molar volumes of solutes that are close to their excluded volumes. Our simulation results are in good agreement with the reported schemes for parsing partial molar volume data on small solutes. In particular, our determined interaction volumes and the thickness of the thermal volume for individual compounds are in good agreement with empirical estimates. This work is the first computational study that supports and lends credence to the practical algorithms for parsing partial molar volume data that are currently in use for molecular interpretations of volumetric data.
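    For orientation, the Kirkwood-Buff route used in such computations relates the partial molar volume of a solute u at infinite dilution in solvent v to the solute-solvent KB integral; this is the standard relation, stated here for context rather than quoted from the paper.

    ```latex
    \bar{V}_u^{\,\circ} = k_B T \kappa_T - G_{uv},
    \qquad
    G_{uv} = \int_0^{\infty} \bigl( g_{uv}(r) - 1 \bigr)\, 4\pi r^2 \, dr
    ```

    Here $\kappa_T$ is the isothermal compressibility of the solvent and $g_{uv}(r)$ is the solute-solvent radial distribution function, so the MD task reduces to computing $g_{uv}(r)$ under each modified pair potential.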

  17. Integrated circuit test-port architecture and method and apparatus of test-port generation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Teifel, John

    A method and apparatus are provided for generating RTL code for a test-port interface of an integrated circuit. In an embodiment, a test-port table is provided as input data. A computer automatically parses the test-port table into data structures and analyzes it to determine input, output, local, and output-enable port names. The computer generates address-detect and test-enable logic constructed from combinational functions. The computer generates one-hot multiplexer logic for at least some of the output ports. The one-hot multiplexer logic for each port is generated so as to enable the port to toggle between data signals and test signals. The computer then completes the generation of the RTL code.
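    The generation step lends itself to a toy rendering. The sketch below is not the patented implementation, and the table rows and signal names are invented; it shows how parsed test-port rows can be turned into one-hot multiplexer RTL.

    ```python
    # Emit Verilog one-hot muxes from parsed test-port table rows.
    rows = [  # (port, data_signal, test_signal), illustrative only
        ("clk_out", "core_clk", "test_clk"),
        ("dat_out", "core_dat", "test_dat"),
    ]

    def emit_onehot_mux(rows):
        lines = []
        for i, (port, data, test) in enumerate(rows):
            lines.append(f"assign {port} = test_en[{i}] ? {test} : {data};"
                         f"  // one-hot bit {i} selects the test signal")
        return "\n".join(lines)

    print(emit_onehot_mux(rows))
    ```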

  18. Development of Pulsar Detection Methods for a Galactic Center Search

    NASA Astrophysics Data System (ADS)

    Thornton, Stephen; Wharton, Robert; Cordes, James; Chatterjee, Shami

    2018-01-01

    Finding pulsars within the inner parsec of the galactic center would be incredibly beneficial: for pulsars sufficiently close to Sagittarius A*, extremely precise tests of general relativity in the strong field regime could be performed through measurement of post-Keplerian parameters. Binary pulsar systems with sufficiently short orbital periods could provide the same laboratories with which to test existing theories. Fast and efficient methods are needed to parse large sets of time-domain data from different telescopes to search for periodicity in signals and differentiate radio frequency interference (RFI) from pulsar signals. Here we demonstrate several techniques to reduce red noise (low-frequency interference), generate signals from pulsars in binary orbits, and create plots that allow for fast detection of both RFI and pulsars.

  19. morph

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goodall, John; Iannacone, Mike; Athalye, Anish

    2013-08-01

    Morph is a framework and domain-specific language (DSL) that helps parse and transform structured documents. It currently supports several file formats including XML, JSON, and CSV, and custom formats are usable as well.

  1. NASA Taxonomies for Searching Problem Reports and FMEAs

    NASA Technical Reports Server (NTRS)

    Malin, Jane T.; Throop, David R.

    2006-01-01

    Many types of hazard and risk analyses are used during the life cycle of complex systems, including Failure Modes and Effects Analysis (FMEA), Hazard Analysis, Fault Tree and Event Tree Analysis, Probabilistic Risk Assessment, Reliability Analysis, and analysis of Problem Reporting and Corrective Action (PRACA) databases. The success of these methods depends on the availability of input data and the analysts' knowledge. Standard nomenclature can increase the reusability of hazard, risk, and problem data. When nomenclature in the source texts is not standard, taxonomies with mapping words (sets of rough synonyms) can be combined with semantic search to identify items and tag them with metadata based on a rich standard nomenclature. Semantic search uses word meanings in the context of parsed phrases to find matches. The NASA taxonomies provide the word meanings. Spacecraft taxonomies and ontologies (generalization hierarchies with attributes and relationships, based on term meanings) are being developed for types of subsystems, functions, entities, hazards, and failures. The ontologies are broad and general, covering hardware, software, and human systems. Semantic search of Space Station texts was used to validate and extend the taxonomies. The taxonomies have also been used to extract system connectivity (interaction) models and functions from requirements text. Now the Reconciler semantic search tool and the taxonomies are being applied to improve search in the Space Shuttle PRACA database, to discover recurring patterns of failure. Usual methods of string search and keyword search fall short because the entries are terse and have numerous shortcuts (irregular abbreviations, nonstandard acronyms, cryptic codes), and modifier words cannot be used in sentence context to refine the search. The limited and fixed FMEA categories associated with the entries do not make the fine distinctions needed in the search. The approach assigns PRACA report titles to problem classes in the taxonomy. Each ontology class includes mapping words, near-synonyms naming different manifestations of that problem class. The mapping words for Problems, Entities, and Functions are converted to a canonical form plus any of a small set of modifier words (e.g., non-uniformity becomes NOT + UNIFORM). The report titles are parsed as sentences if possible, or treated as a flat sequence of word tokens if parsing fails. When canonical forms in the title match mapping words, the PRACA entry is associated with the corresponding Problem, Entity, or Function in the ontology. The user can search for types of failures associated with types of equipment, clustering by type of problem (e.g., all bearings found with problems of being uneven: rough, irregular, or gritty). The results could also be used for tagging PRACA report entries with rich metadata. This approach could also be applied to searching and tagging failure modes, failure effects, and mitigations in FMEAs. In the pilot work, parsing 52K+ truncated titles (the test cases that were available) has resulted in identification of both a type of equipment and a type of problem in about 75% of the cases. The results are displayed in a manner analogous to Google search results. The effort has also led to the enrichment of the taxonomy, adding some new categories and many new mapping words. Further work would make enhancements that have been identified for improving the clustering and further reducing the false alarm rate (in searching for recurring problems, good clustering is more important than reducing false alarms). Searching complete PRACA reports should lead to immediate improvement.
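    The canonical-form matching step lends itself to a small sketch; the vocabulary below is an invented miniature of the taxonomy's mapping words, not NASA's actual word lists.

    ```python
    # Tag a terse report title by reducing words to canonical forms and
    # matching them against a taxonomy's mapping words.
    MAPPING_WORDS = {
        "rough": "NON-UNIFORMITY", "irregular": "NON-UNIFORMITY",
        "gritty": "NON-UNIFORMITY", "bearing": "BEARING", "crack": "FRACTURE",
    }
    CANONICAL = {"bearings": "bearing", "cracked": "crack", "cracking": "crack"}

    def classify_title(title):
        tags = set()
        for word in title.lower().replace(",", " ").split():
            canon = CANONICAL.get(word, word)
            if canon in MAPPING_WORDS:
                tags.add(MAPPING_WORDS[canon])
        return tags

    print(classify_title("Gritty bearings on APU gearbox"))
    # -> {'NON-UNIFORMITY', 'BEARING'} (set order may vary)
    ```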

  2. Toward a theory of distributed word expert natural language parsing

    NASA Technical Reports Server (NTRS)

    Rieger, C.; Small, S.

    1981-01-01

    An approach to natural language meaning-based parsing in which the unit of linguistic knowledge is the word rather than the rewrite rule is described. In the word expert parser, knowledge about language is distributed across a population of procedural experts, each representing a word of the language, and each an expert at diagnosing that word's intended usage in context. The parser is structured around a coroutine control environment in which the generator-like word experts ask questions and exchange information in coming to collective agreement on sentence meaning. The word expert theory is advanced as a better cognitive model of human language expertise than the traditional rule-based approach. The technical discussion is organized around examples taken from the prototype LISP system which implements parts of the theory.

  3. Genes, Environments, Personality, and Successful Aging: Toward a Comprehensive Developmental Model in Later Life

    PubMed Central

    Krueger, Robert F.; South, Susan C.; Gruenewald, Tara L.; Seeman, Teresa E.; Roberts, Brent W.

    2012-01-01

    Background. Outcomes in aging and health research, such as longevity, can be conceptualized as reflecting both genetic and environmental (nongenetic) effects. Parsing genetic and environmental influences can be challenging, particularly when taking a life span perspective, but an understanding of how genetic variants and environments relate to successful aging is critical to public health and intervention efforts. Methods. We review the literature, and survey promising methods, to understand this interplay. We also propose the investigation of personality as a nexus connecting genetics, environments, and health outcomes. Results. Personality traits may reflect psychological mechanisms by which underlying etiologic (genetic and environmental) effects predispose individuals to broad propensities to engage in (un)healthy patterns of behavior across the life span. In terms of methodology, traditional behavior genetic approaches have been used profitably to understand how genetic factors and environments relate to health and personality in somewhat separate literatures; we discuss how other behavior genetic approaches can help connect these literatures and provide new insights. Conclusions. Co-twin control designs can be employed to help determine causality via a closer approximation of the idealized counterfactual design. Gene-by-environment interaction (G × E) designs can be employed to understand how individual difference characteristics, such as personality, might moderate genetic and environmental influences on successful aging outcomes. Application of such methods can clarify the interplay of genes, environments, personality, and successful aging. PMID:22454369

  4. Decision paths in complex tasks

    NASA Technical Reports Server (NTRS)

    Galanter, Eugene

    1991-01-01

    Complex real-world action and its prediction and control have escaped analysis by the classical methods of psychological research. The reason is that psychologists have no procedures to parse complex tasks into their constituents. Where such a division can be made, based say on expert judgment, there is no natural scale to measure the positive or negative values of the components. Even if we could assign numbers to task parts, we lack rules, i.e., a theory, to combine them into a total task representation. We compare here two plausible theories for the amalgamation of the value of task components. Both of these theories require a numerical representation of motivation, for motivation is the primary variable that guides choice and action in well-learned tasks. We address this problem of motivational quantification and performance prediction by developing psychophysical scales of the desirability or aversiveness of task components based on utility scaling methods (Galanter 1990). We modify methods used originally to scale sensory magnitudes (Stevens and Galanter 1957), and that have been applied recently to the measure of task 'workload' by Gopher and Braune (1984). Our modification uses utility comparison scaling techniques which avoid the unnecessary assumptions made by Gopher and Braune. Formulas for the utility of complex tasks based on the theoretical models are used to predict decision and choice of alternate paths to the same goal.

  5. Research issues of geometry-based visual languages and some solutions

    NASA Astrophysics Data System (ADS)

    Green, Thorn G.

    This dissertation addresses the problem of how to design visual language systems that are based upon Geometric Algebra and provide a visual coupling of algebraic expressions and geometric depictions. This coupling of algebraic expressions and geometric depictions provides a new means for expressing both mathematical and geometric relationships present in mathematics, physics, and Computer-Aided Geometric Design (CAGD). Another significant feature of such a system is that the result of changing a parameter (by dragging the mouse) can be seen immediately in the depiction(s) of all expressions that use that parameter. This greatly aids the cognition of the relationships between variables. Systems for representing such a coupling of algebra and geometry have characteristics of both visual language systems and systems for scientific visualization. Instead of using a parsing or dataflow paradigm for the visual language representation, the systems represent equations as manipulable constrained diagrams for their visualization. This requires that the design of such a system include (but not be limited to): a means for parsing equations entered by the user; a scheme for producing a visual representation of these equations; techniques for maintaining the coupling between the expressions entered and the diagrams displayed; algorithms for maintaining the consistency of the diagrams; and indexing capabilities that are efficient enough to allow diagrams to be created and manipulated in a short enough period of time. The author proposes solutions for how such a design can be realized.

  6. Detection and Differentiation of Frontotemporal Dementia and Related Disorders From Alzheimer Disease Using the Montreal Cognitive Assessment.

    PubMed

    Coleman, Kristy K L; Coleman, Brenda L; MacKinley, Julia D; Pasternak, Stephen H; Finger, Elizabeth C

    2016-01-01

    The Montreal Cognitive Assessment (MoCA) is a cognitive screening tool used by practitioners worldwide. The efficacy of the MoCA for screening frontotemporal dementia (FTD) and related disorders is unknown. The objectives were: (1) to determine whether the MoCA detects cognitive impairment (CI) in FTD subjects; (2) to determine whether Alzheimer disease (AD) and FTD subtypes and related disorders can be parsed using the MoCA; and (3) to describe longitudinal MoCA performance by subtype. We extracted demographic and testing data from a database of patients referred to a cognitive neurology clinic who met criteria for probable AD or FTD (N=192). Logistic regression was used to determine whether dementia subtypes were associated with overall scores, subscores, or combinations of subscores on the MoCA. Initial MoCA results demonstrated CI in the majority of FTD subjects (87%). FTD subjects (N=94) performed better than AD subjects (N=98) on the MoCA (mean scores: 18.1 vs. 16.3; P=0.02). Subscores parsed many, but not all, subtypes. FTD subjects had a larger decline on the MoCA within 13 to 36 months than AD subjects (P=0.02). The results indicate that the MoCA is a useful tool to identify and track progression of CI in FTD. Further, the data inform future research on scoring models for the MoCA to enhance cognitive screening and detection of FTD patients.

  7. Methods and means used in programming intelligent searches of technical documents

    NASA Technical Reports Server (NTRS)

    Gross, David L.

    1993-01-01

    In order to meet the data research requirements of the Safety, Reliability & Quality Assurance activities at Kennedy Space Center (KSC), a new computer search method for technical data documents was developed. By their very nature, technical documents are partially encrypted because of the author's use of acronyms, abbreviations, and shortcut notations. This problem of computerized searching is compounded at KSC by the volume of documentation that is produced during normal Space Shuttle operations. The Centralized Document Database (CDD) is designed to solve this problem. It provides a common interface to an unlimited number of files of various sizes, with the capability to perform diverse types and levels of data searches. The heart of the CDD is the nature and capability of its search algorithms. The most complex form of search the program offers uses a domain-specific database of acronyms, abbreviations, synonyms, and word-frequency tables. This database, along with basic sentence parsing, is used to convert a request for information into a relational network. This network is used as a filter on the original document file to determine the most likely locations for the data requested. This type of search will locate information that traditional techniques (i.e., Boolean structured keyword searching) would not find.
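
    As a rough illustration of this approach, the sketch below expands a query through acronym and synonym tables and then scores document lines by overlap with the expanded term set. The tables, the scoring rule, and all names are invented for illustration; the CDD's actual relational-network filter is considerably more elaborate.

    ```python
    # Illustrative domain tables; the real CDD tables are much larger.
    ACRONYMS = {"srb": "solid rocket booster", "gse": "ground support equipment"}
    SYNONYMS = {"fault": {"failure", "anomaly", "defect"}}

    def expand_terms(query):
        """Expand a query into a term set via the acronym/synonym tables."""
        terms = set()
        for word in query.lower().split():
            terms.add(word)
            if word in ACRONYMS:
                terms.update(ACRONYMS[word].split())
            terms.update(SYNONYMS.get(word, set()))
        return terms

    def likely_locations(lines, query, threshold=2):
        """Rank lines by overlap with the expanded query: a crude stand-in
        for filtering the document through the relational network."""
        terms = expand_terms(query)
        hits = [(i, sum(t in line.lower() for t in terms))
                for i, line in enumerate(lines, start=1)]
        return sorted([h for h in hits if h[1] >= threshold], key=lambda h: -h[1])
    ```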

  8. Development of a New Paradigm for Analysis of Disdrometric Data

    NASA Astrophysics Data System (ADS)

    Larsen, Michael L.; Kostinski, Alexander B.

    2017-04-01

    A number of disdrometers currently on the market are able to characterize hydrometeors on a drop-by-drop basis with arrival timestamps associated with each arriving hydrometeor. This allows an investigator to parse a time series into disjoint intervals that have equal numbers of drops, instead of the traditional subdivision into equal time intervals. Such a "fixed-N" partitioning of the data can provide several advantages over the traditional equal time binning method, especially within the context of quantifying measurement uncertainty (which typically scales with the number of hydrometeors in each sample). An added bonus is the natural elimination of sample intervals that contain no drops. This analysis method is investigated by utilizing data from a dense array of disdrometers located near Charleston, South Carolina, USA. Implications for the usefulness of this method in future studies are explored.
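
    A minimal sketch of the fixed-N idea, assuming per-drop arrival timestamps are available as an array; function and variable names are illustrative:

    ```python
    import numpy as np

    def fixed_n_partition(arrival_times, n=100):
        """Split sorted per-drop timestamps into disjoint groups of n drops
        and return each group's duration and mean arrival rate. A trailing
        partial group is discarded; empty samples cannot occur by design."""
        t = np.sort(np.asarray(arrival_times, dtype=float))
        n_groups = len(t) // n
        groups = t[: n_groups * n].reshape(n_groups, n)
        durations = groups[:, -1] - groups[:, 0]   # seconds spanned per group
        return durations, n / durations            # drops per second
    ```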

  9. Top tagging: a method for identifying boosted hadronically decaying top quarks.

    PubMed

    Kaplan, David E; Rehermann, Keith; Schwartz, Matthew D; Tweedie, Brock

    2008-10-03

    A method is introduced for distinguishing top jets (boosted, hadronically decaying top quarks) from light-quark and gluon jets using jet substructure. The procedure involves parsing the jet cluster to resolve its subjets and then imposing kinematic constraints. With this method, light-quark or gluon jets with p_T of approximately 1 TeV can be rejected with an efficiency of around 99% while retaining up to 40% of top jets. This reduces the dijet background to heavy tt̄ resonances by a factor of approximately 10,000, thereby allowing resonance searches in tt̄ to be extended into the all-hadronic channel. In addition, top tagging can be used in tt̄ events when one of the top quarks decays semileptonically, in events with missing energy, and in studies of b-tagging efficiency at high p_T.

  10. A systems approach to bone pathophysiology.

    PubMed

    Weiss, Aaron J; Lipshtat, Azi; Mechanick, Jeffrey I

    2010-11-01

    With evolving interest in multiscalar biological systems one could assume that reductionist approaches may not fully describe biological complexity. Instead, tools such as mathematical modeling, network analysis, and other multiplexed clinical- and research-oriented tests enable rapid analyses of high-throughput data parsed at the genomic, proteomic, metabolomic, and physiomic levels. A physiomic-level approach allows for recursive horizontal and vertical integration of subsystem coupling across and within spatiotemporal scales. Additionally, this methodology recognizes previously ignored subsystems and the strong, nonintuitively obvious and indirect connections among physiological events that potentially account for the uncertainties in medicine. In this review, we flip the reductionist research paradigm and review the concept of systems biology and its applications to bone pathophysiology. Specifically, a bone-centric physiome model is presented that incorporates systemic-level processes with their respective therapeutic implications. © 2010 New York Academy of Sciences.

  11. Design and Implementation of a C++ Software Package to scan for and parse Tsunami Messages issued by the Tsunami Warning Centers for Operational use at the Pacific Tsunami Warning Center

    NASA Astrophysics Data System (ADS)

    Sardina, V.

    2012-12-01

    The US Tsunami Warning Centers (TWCs) have traditionally generated their tsunami message products primarily as blocks of text then tagged with headers that identify them on each particular communications (comms) circuit. Each warning center has a primary area of responsibility (AOR) within which it has an authoritative role regarding parameters such as earthquake location and magnitude. This means that when a major tsunamigenic event occurs the other warning centers need to quickly access the earthquake parameters issued by the authoritative warning center before issuing their message products intended for customers in their own AOR. Thus, within the operational context of the TWCs the scientists on duty have an operational need to access the information contained in the message products issued by other warning centers as quickly as possible. As a solution to this operational problem we designed and implemented a C++ software package that allows scanning for and parsing the entire suite of tsunami message products issued by the Pacific Tsunami Warning Center (PTWC), the West Coast and Alaska Tsunami Warning Center (WCATWC), and the Japan Meteorological Agency (JMA). The scanning and parsing classes composing the resulting C++ software package allow parsing both non-official message products (observatory messages) routinely issued by the TWCs and all official tsunami message products such as tsunami advisories, watches, and warnings. This software package currently allows scientists on duty at the PTWC to automatically retrieve the parameters contained in tsunami messages issued by WCATWC, JMA, or PTWC itself. Extension of the capabilities of the classes composing the software package would make it possible to generate XML- and CAP-compliant versions of the TWCs' message products until new messaging software natively adds these capabilities. Customers who receive the TWCs' tsunami message products could also use the package to automatically retrieve information from messages sent via any text-based communications circuit currently used by the TWCs to disseminate their tsunami message products.
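
    A hedged sketch, in Python rather than the package's C++, of the scanning-and-parsing step: regular expressions pull earthquake parameters out of a text bulletin. The message layout and field names below are hypothetical, not the actual PTWC/WCATWC/JMA formats.

    ```python
    import re

    MAG_RE = re.compile(r"MAGNITUDE\s+([\d.]+)")
    LOC_RE = re.compile(r"COORDINATES\s+([\d.]+)([NS])\s+([\d.]+)([EW])")

    def parse_bulletin(text):
        """Pull magnitude and epicenter out of one bulletin, if present."""
        params = {}
        if m := MAG_RE.search(text):
            params["magnitude"] = float(m.group(1))
        if m := LOC_RE.search(text):
            lat = float(m.group(1)) * (1 if m.group(2) == "N" else -1)
            lon = float(m.group(3)) * (1 if m.group(4) == "E" else -1)
            params["epicenter"] = (lat, lon)
        return params

    print(parse_bulletin("PRELIMINARY MAGNITUDE 7.8 COORDINATES 19.7S 172.1W"))
    # {'magnitude': 7.8, 'epicenter': (-19.7, -172.1)}
    ```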

  12. Seabed mapping and characterization of sediment variability using the usSEABED data base

    USGS Publications Warehouse

    Goff, J.A.; Jenkins, C.J.; Jeffress, Williams S.

    2008-01-01

    We present a methodology for statistical analysis of randomly located marine sediment point data, and apply it to the US continental shelf portions of usSEABED mean grain size records. The usSEABED database, like many modern, large environmental datasets, is heterogeneous and interdisciplinary. We statistically test the database as a source of mean grain size data, and from it provide a first examination of regional seafloor sediment variability across the entire US continental shelf. Data derived from laboratory analyses ("extracted") and from word-based descriptions ("parsed") are treated separately, and they are compared statistically and deterministically. Data records are selected for spatial analysis by their location within sample regions: polygonal areas defined in ArcGIS chosen by geography, water depth, and data sufficiency. We derive isotropic, binned semivariograms from the data, and invert these for estimates of noise variance, field variance, and decorrelation distance. The highly erratic nature of the semivariograms is a result both of the random locations of the data and of the high level of data uncertainty (noise). This decorrelates the data covariance matrix for the inversion, and largely prevents robust estimation of the fractal dimension. Our comparison of the extracted and parsed mean grain size data demonstrates important differences between the two. In particular, extracted measurements generally produce finer mean grain sizes, lower noise variance, and lower field variance than parsed values. Such relationships can be used to derive a regionally dependent conversion factor between the two. Our analysis of sample regions on the US continental shelf revealed considerable geographic variability in the estimated statistical parameters of field variance and decorrelation distance. Some regional relationships are evident, and overall there is a tendency for field variance to be higher where the average mean grain size is finer grained. Surprisingly, parsed and extracted noise magnitudes correlate with each other, which may indicate that some portion of the data variability that we identify as "noise" is caused by real grain size variability at very short scales. Our analyses demonstrate that by applying a bias-correction proxy, usSEABED data can be used to generate reliable interpolated maps of regional mean grain size and sediment character. 
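
    The semivariogram workflow can be sketched as follows; the exponential model form, bin edges, and starting values are illustrative assumptions, with the fitted nugget, sill, and range playing the roles of noise variance, field variance, and decorrelation distance.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def empirical_semivariogram(xy, z, bin_edges):
        """Binned semivariogram estimate from scattered points xy (n x 2)
        with values z (n,)."""
        d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
        sq = 0.5 * (z[:, None] - z[None, :]) ** 2
        iu = np.triu_indices(len(z), k=1)          # unique pairs only
        d, sq = d[iu], sq[iu]
        gamma = []
        for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
            vals = sq[(d >= lo) & (d < hi)]
            gamma.append(vals.mean() if vals.size else np.nan)
        h = 0.5 * (bin_edges[:-1] + bin_edges[1:])
        return h, np.array(gamma)

    def exp_model(h, nugget, sill, rng):
        """Exponential model: nugget ~ noise variance, sill ~ field
        variance, rng ~ decorrelation distance."""
        return nugget + sill * (1.0 - np.exp(-h / rng))

    # h, gamma = empirical_semivariogram(xy, z, np.linspace(0.0, 5000.0, 21))
    # params, _ = curve_fit(exp_model, h, gamma, p0=[0.1, 1.0, 1000.0])
    ```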

  13. Evaluating the Reliability of Emergency Response Systems for Large-Scale Incident Operations

    PubMed Central

    Jackson, Brian A.; Faith, Kay Sullivan; Willis, Henry H.

    2012-01-01

    The ability to measure emergency preparedness—to predict the likely performance of emergency response systems in future events—is critical for policy analysis in homeland security. Yet it remains difficult to know how prepared a response system is to deal with large-scale incidents, whether it be a natural disaster, terrorist attack, or industrial or transportation accident. This research draws on the fields of systems analysis and engineering to apply the concept of system reliability to the evaluation of emergency response systems. The authors describe a method for modeling an emergency response system; identifying how individual parts of the system might fail; and assessing the likelihood of each failure and the severity of its effects on the overall response effort. The authors walk the reader through two applications of this method: a simplified example in which responders must deliver medical treatment to a certain number of people in a specified time window, and a more complex scenario involving the release of chlorine gas. The authors also describe an exploratory analysis in which they parsed a set of after-action reports describing real-world incidents, to demonstrate how this method can be used to quantitatively analyze data on past response performance. The authors conclude with a discussion of how this method of measuring emergency response system reliability could inform policy discussion of emergency preparedness, how system reliability might be improved, and the costs of doing so. PMID:28083267

  14. Biomedical discovery acceleration, with applications to craniofacial development.

    PubMed

    Leach, Sonia M; Tipney, Hannah; Feng, Weiguo; Baumgartner, William A; Kasliwal, Priyanka; Schuyler, Ronald P; Williams, Trevor; Spritz, Richard A; Hunter, Lawrence

    2009-03-01

    The profusion of high-throughput instruments and the explosion of new results in the scientific literature, particularly in molecular biomedicine, is both a blessing and a curse to the bench researcher. Even knowledgeable and experienced scientists can benefit from computational tools that help navigate this vast and rapidly evolving terrain. In this paper, we describe a novel computational approach to this challenge, a knowledge-based system that combines reading, reasoning, and reporting methods to facilitate analysis of experimental data. Reading methods extract information from external resources, either by parsing structured data or using biomedical language processing to extract information from unstructured data, and track knowledge provenance. Reasoning methods enrich the knowledge that results from reading by, for example, noting two genes that are annotated to the same ontology term or database entry. Reasoning is also used to combine all sources into a knowledge network that represents the integration of all sorts of relationships between a pair of genes, and to calculate a combined reliability score. Reporting methods combine the knowledge network with a congruent network constructed from experimental data and visualize the combined network in a tool that facilitates the knowledge-based analysis of that data. An implementation of this approach, called the Hanalyzer, is demonstrated on a large-scale gene expression array dataset relevant to craniofacial development. The use of the tool was critical in the creation of hypotheses regarding the roles of four genes never previously characterized as involved in craniofacial development; each of these hypotheses was validated by further experimental work.

  15. The design and implementation of an automated system for logging clinical experiences using an anesthesia information management system.

    PubMed

    Simpao, Allan; Heitz, James W; McNulty, Stephen E; Chekemian, Beth; Brenn, B Randall; Epstein, Richard H

    2011-02-01

    Residents in anesthesia training programs throughout the world are required to document their clinical cases to help ensure that they receive adequate training. Current systems involve self-reporting, are subject to delayed updates and misreported data, and do not provide a practicable method of validation. Anesthesia information management systems (AIMS) are being used increasingly in training programs and are a logical source for verifiable documentation. We hypothesized that case logs generated automatically from an AIMS would be sufficiently accurate to replace the current manual process. We based our analysis on the data reporting requirements of the Accreditation Council for Graduate Medical Education (ACGME). We conducted a systematic review of ACGME requirements and our AIMS record, and made modifications after identifying data element and attribution issues. We studied 2 methods (parsing of free-text procedure descriptions and CPT4 procedure code mapping) to automatically determine ACGME case categories and generated AIMS-based case logs and compared these to assignments made by manual inspection of the anesthesia records. We also assessed under- and overreporting of cases entered manually by our residents into the ACGME website. The parsing and mapping methods assigned cases to a majority of the ACGME categories with accuracies of 95% and 97%, respectively, as compared with determinations made by 2 residents and 1 attending who manually reviewed all procedure descriptions. Comparison of AIMS-based case logs with reports from the ACGME Resident Case Log System website showed that >50% of residents either underreported or overreported their total case counts by at least 5%. The AIMS database is a source of contemporaneous documentation of resident experience that can be queried to generate valid, verifiable case logs. The extent of AIMS adoption by academic anesthesia departments should encourage accreditation organizations to support uploading of AIMS-based case log files to improve accuracy and to decrease the clerical burden on anesthesia residents.
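
    A minimal sketch of the two assignment strategies, assuming placeholder category names, CPT codes, and keywords; the real ACGME definitions and institutional mapping tables are far more extensive.

    ```python
    # Hypothetical mapping tables for illustration only.
    CPT_TO_CATEGORY = {"00562": "cardiac_with_bypass", "01967": "obstetric"}
    KEYWORD_RULES = [("cabg", "cardiac_with_bypass"), ("cesarean", "obstetric")]

    def categorize(cpt_codes, procedure_text):
        """Prefer CPT code mapping; fall back to parsing the free-text
        procedure description for category keywords."""
        for code in cpt_codes:
            if code in CPT_TO_CATEGORY:
                return CPT_TO_CATEGORY[code]
        text = procedure_text.lower()
        for keyword, category in KEYWORD_RULES:
            if keyword in text:
                return category
        return "uncategorized"

    print(categorize(["00562"], "CABG x3 with cardiopulmonary bypass"))
    print(categorize([], "Repeat low transverse cesarean section"))
    ```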

  16. Arcus v. 1.0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Englehardt, Robert; Steele, Andrew

    Arcus, developed by Sandia National Laboratories, is a library for calculating, parsing, formatting, converting and comparing both IPv4 and IPv6 addresses and subnets. It accounts for 128-bit numbers on 32-bit platforms.
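
    Arcus itself is a Sandia library; purely as an illustration of the same kinds of operations (parsing, comparing, and handling 128-bit IPv6 values), Python's standard ipaddress module can be used:

    ```python
    import ipaddress

    net = ipaddress.ip_network("2001:db8::/32")      # parse an IPv6 subnet
    addr = ipaddress.ip_address("2001:db8::1")       # parse an IPv6 address
    print(addr in net)                               # membership test -> True
    print(ipaddress.ip_address("10.0.0.1") <
          ipaddress.ip_address("10.0.0.2"))          # comparison -> True
    print(int(addr))                                 # full 128-bit integer value
    # Python ints are arbitrary precision, so 128-bit IPv6 arithmetic works
    # even on 32-bit platforms, the case Arcus handles explicitly.
    ```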

  17. Guidance of visual attention by semantic information in real-world scenes

    PubMed Central

    Wu, Chia-Chien; Wick, Farahnaz Ahmed; Pomplun, Marc

    2014-01-01

    Recent research on attentional guidance in real-world scenes has focused on object recognition within the context of a scene. This approach has been valuable for determining some factors that drive the allocation of visual attention and determine visual selection. This article provides a review of experimental work on how different components of context, especially semantic information, affect attentional deployment. We review work from the areas of object recognition, scene perception, and visual search, highlighting recent studies examining semantic structure in real-world scenes. A better understanding of how humans parse scene representations will not only improve current models of visual attention but also advance next-generation computer vision systems and human-computer interfaces. PMID:24567724

  18. Progress in The Semantic Analysis of Scientific Code

    NASA Technical Reports Server (NTRS)

    Stewart, Mark

    2000-01-01

    This paper concerns a procedure that analyzes aspects of the meaning or semantics of scientific and engineering code. This procedure involves taking a user's existing code, adding semantic declarations for some primitive variables, and parsing this annotated code using multiple, independent expert parsers. These semantic parsers encode domain knowledge and recognize formulae in different disciplines including physics, numerical methods, mathematics, and geometry. The parsers will automatically recognize and document some static, semantic concepts and help locate some program semantic errors. These techniques may apply to a wider range of scientific codes. If so, the techniques could reduce the time, risk, and effort required to develop and modify scientific codes.
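
    A toy sketch of the underlying idea of semantic declarations: primitive variables are annotated with base-unit exponents, and a checker verifies the dimensional consistency of an expression. The declaration format and checker are illustrative inventions, far simpler than the multi-parser system described above.

    ```python
    from collections import Counter

    # Base-unit exponents (kg, m, s) for each declared primitive variable.
    DECLS = {
        "rho": Counter(kg=1, m=-3),        # density: kg m^-3
        "u":   Counter(m=1, s=-1),         # velocity: m s^-1
        "p":   Counter(kg=1, m=-1, s=-2),  # pressure: kg m^-1 s^-2
    }

    def product_dims(*names):
        """Accumulate unit exponents for a product of declared variables."""
        total = Counter()
        for name in names:
            total.update(DECLS[name])      # Counter.update adds exponents
        return Counter({k: v for k, v in total.items() if v})

    # Dynamic pressure rho*u*u must be dimensionally a pressure:
    assert product_dims("rho", "u", "u") == DECLS["p"]
    ```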

  19. Geo-Distinctive Comorbidity Networks of Pediatric Asthma.

    PubMed

    Shin, Eun Kyong; Shaban-Nejad, Arash

    2018-01-01

    Most pediatric asthma cases involve complex interdependencies, manifesting as multiple co-occurring symptoms. Studying asthma comorbidities can help to better understand the etiologic pathway of the disease. Although such relations of co-expressed symptoms and their interactions have been highlighted recently, they have not been rigorously investigated in pediatric asthma cases. In this study, we use computational network modeling and analysis to reveal the links and associations between diseases/conditions commonly co-observed with asthma among children in Memphis, Tennessee. We present a novel method for geo-parsed comorbidity network analysis to show the distinctive patterns of comorbidity networks in urban and suburban areas of Memphis.
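
    A minimal sketch of constructing such a comorbidity network with networkx: nodes are diagnoses, and edge weights count the patients in whom two diagnoses co-occur. The records below are illustrative, and the geographic parsing step that splits urban from suburban networks is omitted.

    ```python
    import networkx as nx
    from itertools import combinations

    # Illustrative per-patient diagnosis sets.
    records = [
        {"asthma", "allergic rhinitis", "eczema"},
        {"asthma", "obesity", "allergic rhinitis"},
        {"asthma", "eczema"},
    ]

    G = nx.Graph()
    for diagnoses in records:
        for a, b in combinations(sorted(diagnoses), 2):
            if G.has_edge(a, b):
                G[a][b]["weight"] += 1   # another patient with both conditions
            else:
                G.add_edge(a, b, weight=1)

    print(sorted(G.edges(data="weight")))
    ```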

  20. Rapid automatic keyword extraction for information retrieval and analysis

    DOEpatents

    Rose, Stuart J [Richland, WA; Cowley,; E, Wendy [Richland, WA; Crow, Vernon L [Richland, WA; Cramer, Nicholas O [Richland, WA

    2012-03-06

    Methods and systems for rapid automatic keyword extraction for information retrieval and analysis. Embodiments can include parsing words in an individual document by delimiters, stop words, or both in order to identify candidate keywords. Word scores for each word within the candidate keywords are then calculated based on a function of co-occurrence degree, co-occurrence frequency, or both. Based on a function of the word scores for words within the candidate keyword, a keyword score is calculated for each of the candidate keywords. A portion of the candidate keywords are then extracted as keywords based, at least in part, on the candidate keywords having the highest keyword scores.
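
    A compact sketch of the procedure described above: candidate keywords come from splitting on delimiters and stop words, word scores are the ratio of co-occurrence degree to frequency, and each candidate's score is the sum of its word scores. The stop-word list is truncated for brevity.

    ```python
    import re
    from collections import defaultdict

    STOP = {"a", "an", "and", "of", "the", "in", "for", "is", "are", "to", "on", "or"}

    def rake(text, top_k=5):
        # Delimit on punctuation first, then break chunks on stop words to
        # get candidate phrases (maximal runs of content words).
        phrases = []
        for chunk in re.split(r"[.,;:!?()\n]", text.lower()):
            cur = []
            for w in re.findall(r"[a-z]+", chunk):
                if w in STOP:
                    if cur:
                        phrases.append(cur)
                    cur = []
                else:
                    cur.append(w)
            if cur:
                phrases.append(cur)
        # Word scores: ratio of co-occurrence degree to frequency.
        freq, degree = defaultdict(int), defaultdict(int)
        for ph in phrases:
            for w in ph:
                freq[w] += 1
                degree[w] += len(ph)
        score = {w: degree[w] / freq[w] for w in freq}
        # Keyword score: sum of member word scores; keep the top candidates.
        ranked = {" ".join(p): sum(score[w] for w in p) for p in phrases}
        return sorted(ranked.items(), key=lambda kv: -kv[1])[:top_k]

    print(rake("Methods and systems for rapid automatic keyword extraction, "
               "for information retrieval and analysis of documents."))
    ```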

  1. Integration of Dakota into the NEAMS Workbench

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Swiler, Laura Painton; Lefebvre, Robert A.; Langley, Brandon R.

    2017-07-01

    This report summarizes a NEAMS (Nuclear Energy Advanced Modeling and Simulation) project focused on integrating Dakota into the NEAMS Workbench. The NEAMS Workbench, developed at Oak Ridge National Laboratory, is a new software framework that provides a graphical user interface, input file creation, parsing, validation, job execution, workflow management, and output processing for a variety of nuclear codes. Dakota is a tool developed at Sandia National Laboratories that provides a suite of uncertainty quantification and optimization algorithms. Providing Dakota within the NEAMS Workbench allows users of nuclear simulation codes to perform uncertainty and optimization studies on their nuclear codes from within a common, integrated environment. Details of the integration and parsing are provided, along with an example of Dakota running a sampling study on the fuels performance code, BISON, from within the NEAMS Workbench.

  2. Punctuation and Implicit Prosody in Silent Reading: An ERP Study Investigating English Garden-Path Sentences.

    PubMed

    Drury, John E; Baum, Shari R; Valeriote, Hope; Steinhauer, Karsten

    2016-01-01

    This study presents the first two ERP reading studies of comma-induced effects of covert (implicit) prosody on syntactic parsing decisions in English. The first experiment used a balanced 2 × 2 design in which the presence/absence of commas determined plausibility (e.g., John, said Mary, was the nicest boy at the party vs. John said Mary was the nicest boy at the party). The second reading experiment replicated a previous auditory study investigating the role of overt prosodic boundaries in closure ambiguities (Pauker et al., 2011). In both experiments, commas reliably elicited CPS components and generally played a dominant role in determining parsing decisions in the face of input ambiguity. The combined set of findings provides further evidence supporting the claim that mechanisms subserving speech processing play an active role during silent reading.

  3. Modeling the Arden Syntax for medical decisions in XML.

    PubMed

    Kim, Sukil; Haug, Peter J; Rocha, Roberto A; Choi, Inyoung

    2008-10-01

    A new model expressing the Arden Syntax in the eXtensible Markup Language (XML) was developed to increase its portability. Every example was manually parsed and reviewed until the schema and the style sheet were considered optimized. When the first schema was finished, several MLMs in Arden Syntax Markup Language (ArdenML) were validated against the schema. They were then transformed to HTML format with the style sheet, during which they were compared to the original text versions of their MLMs. When faults were found in a transformed MLM, the schema and/or style sheet was fixed. This cycle continued until all the examples were encoded into XML documents. The original MLMs were encoded in XML according to the proposed XML schema, and reverse-parsed MLMs in ArdenML were checked using a public-domain Arden Syntax checker. Two hundred seventy-seven examples of MLMs were successfully transformed into XML documents using the model, and the reverse parse yielded the original text versions of the MLMs. Two hundred sixty-five of the 277 MLMs showed the same error patterns before and after transformation, and all 11 errors related to statement structure were resolved in the XML version. The model uses two syntax-checking mechanisms: first, an XML validation process, and second, a syntax check using an XSL style sheet. Now that we have a schema for ArdenML, we can also begin the development of style sheets for transforming ArdenML into other languages.
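
    A minimal sketch of the two checking mechanisms using the lxml library; the file names here are placeholders, standing in for the schema and style sheet developed in the study.

    ```python
    from lxml import etree

    schema = etree.XMLSchema(etree.parse("ardenml.xsd"))       # hypothetical files
    transform = etree.XSLT(etree.parse("ardenml_to_html.xsl"))

    doc = etree.parse("my_mlm.xml")
    if schema.validate(doc):           # mechanism 1: XML schema validation
        html = transform(doc)          # mechanism 2: syntax check via XSLT
        print(str(html))
    else:
        print(schema.error_log)
    ```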

  4. Incremental Refinement of FAÇADE Models with Attribute Grammar from 3d Point Clouds

    NASA Astrophysics Data System (ADS)

    Dehbi, Y.; Staat, C.; Mandtler, L.; Plümer, L.

    2016-06-01

    Data acquisition using unmanned aerial vehicles (UAVs) has received increasing attention in recent years. Especially in the field of building reconstruction, the incremental interpretation of such data is a demanding task. In this context, formal grammars play an important role for the top-down identification and reconstruction of building objects. Up to now, the available approaches expect offline data in order to parse an a priori known grammar. For mapping on demand, an on-the-fly reconstruction based on UAV data is required, and an incremental interpretation of the data stream is inevitable. This paper presents an incremental parser of grammar rules for automatic 3D building reconstruction. The parser enables model refinement based on new observations with respect to a weighted attribute context-free grammar (WACFG). The falsification or rejection of hypotheses is supported as well. The parser can deal with and adapt available parse trees acquired from previous interpretations or predictions. Parse trees derived so far are updated iteratively using transformation rules. A diagnostic step searches for mismatches between current and new nodes. Prior knowledge on façades is incorporated; it is given by probability densities as well as architectural patterns. Since we cannot always assume normal distributions, the derivation of location and shape parameters of building objects is based on a kernel density estimation (KDE). While the level of detail is continuously improved, geometrical, semantic, and topological consistency is ensured.

  5. Performance of Lempel-Ziv compressors with deferred innovation

    NASA Technical Reports Server (NTRS)

    Cohn, Martin

    1989-01-01

    The noiseless data-compression algorithms introduced by Lempel and Ziv (LZ) parse an input data string into successive substrings, each consisting of two parts: the citation, which is the longest prefix that has appeared earlier in the input, and the innovation, which is the symbol immediately following the citation. In extremal versions of the LZ algorithm the citation may have begun anywhere in the input; in incremental versions it must have begun at a previous parse position. Originally the citation and the innovation were encoded, either individually or jointly, into an output word to be transmitted or stored. Subsequently, it was speculated that the cost of this encoding may be excessively high because the innovation contributes roughly lg(A) bits, where A is the size of the input alphabet, regardless of the compressibility of the source. To remedy this excess, it was suggested to store the parsed substring as usual, but to encode for output only the citation, leaving the innovation to be encoded as the first symbol of the next substring. Being thus included in the next substring, the innovation can participate in whatever compression that substring enjoys. This strategy is called deferred innovation. It is exemplified in the algorithm described by Welch and implemented in the C program compress, which has widely displaced adaptive Huffman coding (compact) as a UNIX system utility. The excessive expansion is explained, and an implicit warning is given against using deferred-innovation compressors on nearly incompressible data.
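
    A minimal LZW-style parse illustrating deferred innovation: only the citation's dictionary index is emitted, and the innovation symbol becomes the first character of the next substring, where it is absorbed into that substring's match.

    ```python
    def lzw_parse(data):
        """LZW-style incremental parse of a bytes object into dictionary
        indices, with the innovation deferred to the next substring."""
        dictionary = {bytes([i]): i for i in range(256)}   # seed with alphabet
        out, current = [], b""
        for byte in data:
            candidate = current + bytes([byte])
            if candidate in dictionary:
                current = candidate                        # extend the citation
            else:
                out.append(dictionary[current])            # emit citation only
                dictionary[candidate] = len(dictionary)    # store full substring
                current = bytes([byte])                    # innovation deferred
        if current:
            out.append(dictionary[current])
        return out

    print(lzw_parse(b"abababab"))   # [97, 98, 256, 258, 98]
    ```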

  6. Statistical learning of movement.

    PubMed

    Ongchoco, Joan Danielle Khonghun; Uddenberg, Stefan; Chun, Marvin M

    2016-12-01

    The environment is dynamic, but objects move in predictable and characteristic ways, whether they are a dancer in motion, or a bee buzzing around in flight. Sequences of movement are composed of simpler motion trajectory elements chained together. But how do we know where one trajectory element ends and another begins, much like we parse words from continuous streams of speech? As a novel test of statistical learning, we explored the ability to parse continuous movement sequences into simpler element trajectories. Across four experiments, we showed that people can robustly parse such sequences from a continuous stream of trajectories under increasingly stringent tests of segmentation ability and statistical learning. Observers viewed a single dot as it moved along simple sequences of paths, and were later able to discriminate these sequences from novel and partial ones shown at test. Observers demonstrated this ability when there were potentially helpful trajectory-segmentation cues such as a common origin for all movements (Experiment 1); when the dot's motions were entirely continuous and unconstrained (Experiment 2); when sequences were tested against partial sequences as a more stringent test of statistical learning (Experiment 3); and finally, even when the element trajectories were in fact pairs of trajectories, so that abrupt directional changes in the dot's motion could no longer signal inter-trajectory boundaries (Experiment 4). These results suggest that observers can automatically extract regularities in movement - an ability that may underpin our capacity to learn more complex biological motions, as in sport or dance.

  7. Mining protein phosphorylation information from biomedical literature using NLP parsing and Support Vector Machines.

    PubMed

    Raja, Kalpana; Natarajan, Jeyakumar

    2018-07-01

    Extraction of protein phosphorylation information from biomedical literature has gained much attention because of its importance in numerous biological processes. In this study, we propose a text-mining methodology consisting of two phases, NLP parsing and SVM classification, to extract phosphorylation information from the literature. First, using NLP parsing, we divide the data into three base-forms depending on the biomedical entities related to phosphorylation, and further classify them into ten sub-forms based on their distribution with the phosphorylation keyword. Next, we extract the phosphorylation entity singles/pairs/triplets and apply SVM to classify the extracted singles/pairs/triplets using a set of features applicable to each sub-form. The performance of our methodology was evaluated on three corpora, namely the PLC, iProLink, and hPP corpora. We obtained promising results of >85% F-score on the ten sub-forms of the training datasets in cross-validation tests. Our system achieved an overall F-score of 93.0% on the iProLink and 96.3% on the hPP corpus test datasets. Furthermore, our proposed system achieved the best performance in cross-corpus evaluation and outperformed the existing system with a recall of 90.1%. The performance analysis of our unique system on three corpora reveals that it extracts protein phosphorylation information efficiently from both non-organism-specific general datasets such as PLC and iProLink, and a human-specific dataset such as the hPP corpus. Copyright © 2018 Elsevier B.V. All rights reserved.
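
    A hedged sketch of the second (SVM) phase using scikit-learn; a real system of this kind would use parse-tree and entity features rather than the bag-of-words stand-in shown here, and the toy training sentences are invented.

    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # Toy labeled sentences: 1 = describes phosphorylation, 0 = does not.
    train_texts = [
        "kinase A phosphorylates substrate B at ser-15",
        "phosphorylation of C by kinase D regulates binding",
        "protein E binds protein F",
        "gene G is expressed in liver tissue",
    ]
    train_labels = [1, 1, 0, 0]

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
    clf.fit(train_texts, train_labels)
    print(clf.predict(["substrate B is phosphorylated by kinase A"]))
    ```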

  8. PPI-IRO: a two-stage method for protein-protein interaction extraction based on interaction relation ontology.

    PubMed

    Li, Chuan-Xi; Chen, Peng; Wang, Ru-Jing; Wang, Xiu-Jie; Su, Ya-Ru; Li, Jinyan

    2014-01-01

    Mining Protein-Protein Interactions (PPIs) from the fast-growing biomedical literature has been proven an effective approach for the identification of biological regulatory networks. This paper presents a novel method based on an Interaction Relation Ontology (IRO), which specifies and organises words describing various protein interaction relationships. Our method is a two-stage PPI extraction method. First, the IRO is applied in a binary classifier to determine whether sentences contain a relation or not. Then, the IRO is used to guide PPI extraction by building the sentence dependency parse tree. Comprehensive and quantitative evaluations and detailed analyses demonstrate the significant performance of IRO on relation-sentence classification and PPI extraction. Our PPI extraction method yielded a recall of around 80% and 90% and an F1 of around 54% and 66% on the AIMed and BioInfer corpora, respectively, which is superior to most existing extraction methods.

  9. Serious Games that Improve Performance

    NASA Technical Reports Server (NTRS)

    McGowan, Clement, III; Pecheux, Benjamin

    2010-01-01

    Serious games can help people function more effectively in complex settings, facilitate their role as team members, and provide insight into their team's mission. In such games, coordination and cooperation among team members are foundational to the mission's success and provide a preview of what individuals and the team as a whole could choose to do in a real scenario. Serious games often model events requiring life-or-death choices, such as civilian rescue during chemical warfare. How the players communicate and what actions they take can determine the number of lives lost or saved. However, merely playing a game is not enough to realize its most practical value, which is in learning what actions and communication methods are closest to what the mission requires. Teams often play serious games in isolation, so when the game is complete, an analytical stage is needed to extract the strategies used and examine each strategy's success relative to the others chosen. Recognizing the importance of this next stage, Noblis has been developing Game Analysis, software that parses individual game play into meaningful units and generates a strategic analysis. Trainers create a custom game-specific grammar that reflects the objects and range of actions allowable in a particular game, which Game Analysis then uses to parse the data and generate a practical analysis. Trainers then have enough information to represent strategies in tools, such as Gantt and heat map charts. First-responder trainees in North Carolina have already used Hot-Zone together with Game Analysis with great success.

  10. Probabilistic grammatical model for helix‐helix contact site classification

    PubMed Central

    2013-01-01

    Background: Hidden Markov Models power many state-of-the-art tools in the field of protein bioinformatics. While excelling in their tasks, these methods of protein analysis do not directly convey information on medium- and long-range residue-residue interactions. This requires an expressive power of at least context-free grammars. However, application of more powerful grammar formalisms to protein analysis has been surprisingly limited. Results: In this work, we present a probabilistic grammatical framework for problem-specific protein languages and apply it to classification of transmembrane helix-helix pair configurations. The core of the model consists of a probabilistic context-free grammar, automatically inferred by a genetic algorithm from only a generic set of expert-based rules and positive training samples. The model was applied to produce sequence-based descriptors of four classes of transmembrane helix-helix contact site configurations. The highest performance of the classifiers reached an AUC-ROC of 0.70. The analysis of grammar parse trees revealed their ability to represent structural features of helix-helix contact sites. Conclusions: We demonstrated that our probabilistic context-free framework for analysis of protein sequences outperforms the state of the art in the task of helix-helix contact site classification. However, this is achieved without necessarily requiring modeling long-range dependencies between interacting residues. A significant feature of our approach is that grammar rules and parse trees are human-readable. Thus they could provide biologically meaningful information for molecular biologists. PMID:24350601
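
    As a toy illustration of probabilistic context-free parsing over a protein-like alphabet, the grammar and symbols below are invented, not the inferred transmembrane grammar; NLTK's ViterbiParser returns the most probable parse tree for a token sequence.

    ```python
    import nltk

    # An invented two-symbol grammar: S spans a pair of helix segments H,
    # each H being a run of 'h' residues.
    grammar = nltk.PCFG.fromstring("""
        S -> H H [1.0]
        H -> 'h' H [0.6] | 'h' [0.4]
    """)
    parser = nltk.ViterbiParser(grammar)
    for tree in parser.parse(list("hhhh")):   # most probable parse of "hhhh"
        tree.pretty_print()
        print("P(tree) =", tree.prob())
    ```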

  11. Perceptual learning improves visual performance in juvenile amblyopia.

    PubMed

    Li, Roger W; Young, Karen G; Hoenig, Pia; Levi, Dennis M

    2005-09-01

    To determine whether practicing a position-discrimination task improves visual performance in children with amblyopia and to determine the mechanism(s) of improvement. Five children (age range, 7-10 years) with amblyopia practiced a positional acuity task in which they had to judge which of three pairs of lines was misaligned. Positional noise was produced by distributing the individual patches of each line segment according to a Gaussian probability function. Observers were trained at three noise levels (including 0), with each observer performing between 3000 and 4000 responses in 7 to 10 sessions. Trial-by-trial feedback was provided. Four of the five observers showed significant improvement in positional acuity. In those four observers, on average, positional acuity with no noise improved by approximately 32% and with high noise by approximately 26%. A position-averaging model was used to parse the improvement into an increase in efficiency or a decrease in equivalent input noise. Two observers showed increased efficiency (51% and 117% improvements) with no significant change in equivalent input noise across sessions. The other two observers showed both a decrease in equivalent input noise (18% and 29%) and an increase in efficiency (17% and 71%). All five observers showed substantial improvement in Snellen acuity (approximately 26%) after practice. Perceptual learning can improve visual performance in amblyopic children. The improvement can be parsed into two important factors: decreased equivalent input noise and increased efficiency. Perceptual learning techniques may add an effective new method to the armamentarium of amblyopia treatments.

  12. A Bifactor Approach to Model Multifaceted Constructs in Statistical Mediation Analysis.

    PubMed

    Gonzalez, Oscar; MacKinnon, David P

    Statistical mediation analysis allows researchers to identify the most important mediating constructs in the causal process studied. Identifying specific mediators is especially relevant when the hypothesized mediating construct consists of multiple related facets. The general definition of the construct and its facets might relate differently to an outcome. However, current methods do not allow researchers to study the relationships between general and specific aspects of a construct to an outcome simultaneously. This study proposes a bifactor measurement model for the mediating construct as a way to parse variance and represent the general aspect and specific facets of a construct simultaneously. Monte Carlo simulation results are presented to help determine the properties of mediated effect estimation when the mediator has a bifactor structure and a specific facet of a construct is the true mediator. This study also investigates the conditions when researchers can detect the mediated effect when the multidimensionality of the mediator is ignored and treated as unidimensional. Simulation results indicated that the mediation model with a bifactor mediator measurement model had unbiased and adequate power to detect the mediated effect with a sample size greater than 500 and medium a- and b-paths. Also, results indicate that parameter bias and detection of the mediated effect in both the data-generating model and the misspecified model varies as a function of the amount of facet variance represented in the mediation model. This study contributes to the largely unexplored area of measurement issues in statistical mediation analysis.

  13. Using Informatics-, Bioinformatics- and Genomics-Based Approaches for the Molecular Surveillance and Detection of Biothreat Agents

    NASA Astrophysics Data System (ADS)

    Seto, Donald

    The convergence and wealth of informatics, bioinformatics and genomics methods and associated resources allow a comprehensive and rapid approach for the surveillance and detection of bacterial and viral organisms. Coupled with the continuing race for the fastest, most cost-efficient and highest-quality DNA sequencing technology, that is, "next generation sequencing", the detection of biological threat agents by 'cheaper and faster' means is possible. With the application of improved bioinformatic tools for the understanding of these genomes and for parsing unique pathogen genome signatures, along with 'state-of-the-art' informatics which include faster computational methods, equipment and databases, it is feasible to apply new algorithms to biothreat agent detection. Two such methods are high-throughput DNA sequencing-based and resequencing microarray-based identification. These are illustrated and validated by two examples involving human adenoviruses, both from real-world test beds.

  14. Unlocking Index Animalium: From paper slips to bytes and bits

    PubMed Central

    Pilsk, Suzanne C.; Kalfatovic, Martin R.; Richard, Joel M.

    2016-01-01

    In 1996 Smithsonian Libraries (SIL) embarked on the digitization of its collections. By 1999, a full-scale digitization center was in place and rare volumes from the natural history collections, often of high illustrative value, were the focus for the first years of the program. The resulting beautiful books made available for online display were successful to a certain extent, but it soon became clear that the data locked within the texts needed to be converted to more usable and re-purposable form via digitization methods that went beyond simple page imaging and included text conversion elements. Library staff met with researchers from the taxonomic community to understand their path to the literature and identified tools (indexes and bibliographies) used to connect to the library holdings. The traditional library metadata describing the titles, which made them easily retrievable from the shelves of libraries, was not meeting the needs of the researcher looking for more detailed and granular data within the texts. The result was to identify proper print tools that could potentially assist researchers in digital form. This paper outlines the project undertaken to convert Charles Davies Sherborn’s Index Animalium into a tool to connect researchers to the library holdings: from a print index to a database to eventually a dataset. Sherborn’s microcitation of a species name and his bibliographies help bridge the gap between taxonomist and literature holdings of libraries. In 2004, SIL received funding from the Smithsonian’s Atherton Seidell Endowment to create an online version of Sherborn’s Index Animalium. The initial project was to digitize the page images and re-key the data into a simple data structure. As the project evolved, a more complex database was developed which enabled quality field searching to retrieve species names and to search the bibliography. Problems with inconsistent abbreviations and styling of his bibliographies made the parsing of the data difficult. Coinciding with the development of the Biodiversity Heritage Library (BHL) in 2005, it became obvious there was a need to integrate the database-converted Index Animalium, BHL’s scanned taxonomic literature, and taxonomic intelligence (the algorithmic identification of binomial, Latinate name-strings). The challenges of working with legacy taxonomic citation, computer matching algorithms, and making connections have brought us to today’s goal of making Sherborn available and linked to other datasets. In partnership with others to allow machine-to-machine communication, the data are being examined for possible transformation into RDF markup that meets the standards of Linked Open Data. SIL staff have partnered with Thomson Reuters and the Global Names Initiative to further enhance the Index Animalium data set. Thomson Reuters’ staff is now working on integrating the species microcitation and species name in the ION: Index to Organism Names project; Richard Pyle (The Bishop Museum) is also working on further parsing of the text. The Index Animalium collaborative project’s ultimate goal is for researchers to go seamlessly from the species name in either ION or the scanned pages of Index Animalium to the digitized original description in BHL - connecting taxonomic researchers to original authored species descriptions with just a click. PMID:26877657

  16. TopFed: TCGA tailored federated query processing and linking to LOD.

    PubMed

    Saleem, Muhammad; Padmanabhuni, Shanmukha S; Ngomo, Axel-Cyrille Ngonga; Iqbal, Aftab; Almeida, Jonas S; Decker, Stefan; Deus, Helena F

    2014-01-01

    The Cancer Genome Atlas (TCGA) is a multidisciplinary, multi-institutional effort to catalogue genetic mutations responsible for cancer using genome analysis techniques. One of the aims of this project is to create a comprehensive and open repository of cancer-related molecular analysis, to be exploited by bioinformaticians towards advancing cancer knowledge. However, devising bioinformatics applications to analyse such a large dataset is still challenging, as it often requires downloading large archives and parsing the relevant text files, making it difficult to enable virtual data integration and to collect the critical co-variates necessary for analysis. We address these issues by transforming the TCGA data into the Semantic Web standard Resource Description Format (RDF), linking it to relevant datasets in the Linked Open Data (LOD) cloud, and further proposing an efficient data distribution strategy to host the resulting 20.4 billion triples via several SPARQL endpoints. Having the TCGA data distributed across multiple SPARQL endpoints, we enable biomedical scientists to query and retrieve information from these SPARQL endpoints by proposing a TCGA-tailored federated SPARQL query processing engine named TopFed. We compare TopFed with a well-established federation engine, FedX, in terms of source selection and query execution time by using 10 different federated SPARQL queries with varying requirements. Our evaluation results show that TopFed selects on average less than half of the sources (with 100% recall) with query execution time equal to one third that of FedX. With TopFed, we aim to offer biomedical scientists a single point of access through which distributed TCGA data can be accessed in unison. We believe the proposed system can greatly help researchers in the biomedical domain to carry out their research effectively with TCGA, as the amount and diversity of the data exceed the ability of local resources to handle their retrieval and parsing.
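
    A hedged sketch of querying a single SPARQL endpoint with the SPARQLWrapper library; TopFed's actual contribution, selecting sources and joining results across the distributed endpoints, is not reproduced here, and the endpoint URL and query are placeholders.

    ```python
    from SPARQLWrapper import SPARQLWrapper, JSON

    endpoint = SPARQLWrapper("http://example.org/tcga/sparql")  # hypothetical
    endpoint.setQuery("""
        SELECT ?patient ?value WHERE {
            ?patient <http://example.org/tcga/result> ?value .
        } LIMIT 10
    """)
    endpoint.setReturnFormat(JSON)
    for row in endpoint.query().convert()["results"]["bindings"]:
        print(row["patient"]["value"], row["value"]["value"])
    ```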

  17. "Parsing the heterogeneity of psychopathy and aggression: Differential associations across dimensions and gender": Correction to Hecht et al. (2016).

    PubMed

    2017-01-01

    Reports an error in "Parsing the heterogeneity of psychopathy and aggression: Differential associations across dimensions and gender" by Lisa K. Hecht, Joanna M. Berg, Scott O. Lilienfeld and Robert D. Latzman (Personality Disorders: Theory, Research, and Treatment, 2016[Jan], Vol 7[1], 2-14). In the article, there was an error in Table 3 and in the fifth paragraph of the Results. The correct information has been provided. (The following abstract of the original article appeared in record 2015-29370-001.) Psychopathy is a multidimensional construct that is broadly associated with both reactive (RA) and proactive (PA) aggression. Nevertheless, a consistent pattern of associations between psychopathy and these 2 aggression subtypes has yet to emerge because of methodological differences across studies. Moreover, research has yet to examine gender differences in the relation between dimensions of psychopathy and RA/PA. Accordingly, we examined the associations between psychopathy dimensions, as operationalized by 2 self-report instruments, and subtypes of aggression within a diverse sample of undergraduates (N = 1,158). Results confirmed that psychopathy is broadly associated with PA, as well as RA, with dimensions of psychopathy evidencing common and distinct associations with both raw and residual RA and PA scores. In both models of psychopathy, PA was significantly and positively associated with all dimensions, whereas RA was significantly negatively associated with interpersonal and affective dimensions, and significantly positively associated with dimensions related to an antisocial and impulsive lifestyle. Gender significantly moderated associations among dimensions of psychopathy and RA/PA, such that the antisocial/behavioral dimension of psychopathy was positively associated with PA for males, whereas the antisocial/behavioral dimension was positively associated with RA for females. Results suggest both generality and specificity of psychopathy dimensions as related to subtypes of aggression, as well as possible differential pathways from psychopathy to different subtypes of aggression in men and women. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  18. Automating Flood Hazard Mapping Methods for Near Real-time Storm Surge Inundation and Vulnerability Assessment

    NASA Astrophysics Data System (ADS)

    Weigel, A. M.; Griffin, R.; Gallagher, D.

    2015-12-01

    Storm surge has enough destructive power to damage buildings and infrastructure, erode beaches, and threaten human life across large geographic areas, hence posing the greatest threat of all the hurricane hazards. The United States Gulf of Mexico has proven vulnerable to hurricanes as it has been hit by some of the most destructive hurricanes on record. With projected rises in sea level and increases in hurricane activity, there is a need to better understand the associated risks for disaster mitigation, preparedness, and response. GIS has become a critical tool in enhancing disaster planning, risk assessment, and emergency response by communicating spatial information through a multi-layer approach. However, there is a need for a near real-time method of identifying areas with a high risk of being impacted by storm surge. Research was conducted alongside Baron, a private industry weather enterprise, to facilitate automated modeling and visualization of storm surge inundation and vulnerability on a near real-time basis. This research successfully automated current flood hazard mapping techniques using a GIS framework written in a Python programming environment, and displayed resulting data through an Application Program Interface (API). Data used for this methodology included high resolution topography, NOAA Probabilistic Surge model outputs parsed from Rich Site Summary (RSS) feeds, and the NOAA Census tract level Social Vulnerability Index (SoVI). The development process required extensive data processing and management to provide high resolution visualizations of potential flooding and population vulnerability in a timely manner. The accuracy of the developed methodology was assessed using Hurricane Isaac as a case study, which through a USGS and NOAA partnership, contained ample data for statistical analysis. This research successfully created a fully automated, near real-time method for mapping high resolution storm surge inundation and vulnerability for the Gulf of Mexico, and improved the accuracy and resolution of the Probabilistic Storm Surge model.
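
    A minimal sketch of the RSS ingestion step using the feedparser library; the feed URL and entry fields are placeholders standing in for the actual NOAA Probabilistic Surge products.

    ```python
    import feedparser

    # Poll a (hypothetical) feed announcing new surge model outputs and
    # collect the links; each link would be handed to the GIS pipeline.
    feed = feedparser.parse("https://example.org/psurge.rss")
    for entry in feed.entries:
        print(entry.title, entry.link)
    ```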

  19. Multidimensional incremental parsing for universal source coding.

    PubMed

    Bae, Soo Hyun; Juang, Biing-Hwang

    2008-10-01

    A multidimensional incremental parsing algorithm (MDIP) for multidimensional discrete sources, as a generalization of the Lempel-Ziv coding algorithm, is investigated. It consists of three essential component schemes: maximum decimation matching, a hierarchical structure of multidimensional source coding, and dictionary augmentation. As a counterpart of the longest match search in the Lempel-Ziv algorithm, two classes of maximum decimation matching are studied. Also, an underlying behavior of the dictionary augmentation scheme for estimating the source statistics is examined. For an m-dimensional source, m augmentative patches are appended into the dictionary at each coding epoch, thus requiring the transmission of a substantial amount of information to the decoder. The hierarchical structure of the source coding algorithm resolves this issue by successively incorporating lower dimensional coding procedures in the scheme. In regard to universal lossy source coders, we propose two distortion functions: the local average distortion, and the local minimax distortion with a set of threshold levels for each source symbol. For performance evaluation, we implemented three image compression algorithms based upon the MDIP; one is lossless and the others are lossy. The lossless image compression algorithm does not perform better than Lempel-Ziv-Welch coding, but experimentally shows efficiency in capturing the source structure. The two lossy image compression algorithms are implemented using the two distortion functions, respectively. The algorithm based on the local average distortion is efficient at minimizing the signal distortion, but the images produced with the local minimax distortion show good perceptual fidelity relative to other compression algorithms. Our insights inspire future research on feature extraction of multidimensional discrete sources.

  20. Predicting fecal indicator organism contamination in Oregon coastal streams.

    PubMed

    Pettus, Paul; Foster, Eugene; Pan, Yangdong

    2015-12-01

    In this study, we used publicly available GIS layers and statistical tree-based modeling (CART and Random Forest) to predict pathogen indicator counts at a regional scale using 88 spatially explicit landscape predictors and 6657 samples from non-estuarine streams in the Oregon Coast Range. A total of 532 frequently sampled sites were parsed down to 93 pathogen sampling sites to control for spatial and temporal biases. The model explained 56.5% of the variance, comparable to other regional models, while still including a large number of variables. Analysis showed the most important predictors of bacteria counts to be forest and natural riparian zones, cattle-related activities, and urban land uses. This research confirmed linkages to anthropogenic activities, with the prediction mapping showing increased bacteria counts in agricultural and urban land use areas and lower counts under more natural riparian conditions. Copyright © 2015 Elsevier Ltd. All rights reserved.
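
    A minimal sketch of the tree-based modeling step using scikit-learn's random forest; the three predictor names and the synthetic response are stand-ins for the 88 GIS landscape layers and observed indicator counts.

    ```python
    # Random-forest regression sketch with invented landscape predictors.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    n = 500
    X = rng.random((n, 3))   # riparian_forest, cattle_density, urban_pct
    y = 2.0 - 3.0 * X[:, 0] + 2.5 * X[:, 1] + 1.5 * X[:, 2] + rng.normal(0, 0.3, n)

    model = RandomForestRegressor(n_estimators=200, oob_score=True, random_state=0)
    model.fit(X, y)
    for name, imp in zip(["riparian_forest", "cattle_density", "urban_pct"],
                         model.feature_importances_):
        print(f"{name}: {imp:.2f}")                 # variable importance ranking
    print("OOB R^2:", round(model.oob_score_, 2))   # analogous to variance explained
    ```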

  1. SkData: data sets and algorithm evaluation protocols in Python

    NASA Astrophysics Data System (ADS)

    Bergstra, James; Pinto, Nicolas; Cox, David D.

    2015-01-01

    Machine learning benchmark data sets come in all shapes and sizes, whereas classification algorithms assume sanitized input, such as (x, y) pairs with vector-valued input x and integer class label y. Researchers and practitioners know all too well how tedious it can be to get from the URL of a new data set to a NumPy ndarray suitable for e.g. pandas or sklearn. The SkData library handles that work for a growing number of benchmark data sets (small and large) so that one-off in-house scripts for downloading and parsing data sets can be replaced with library code that is reliable, community-tested, and documented. The SkData library also introduces an open-ended formalization of training and testing protocols that facilitates direct comparison with published research. This paper describes the usage and architecture of the SkData library.
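
    The kind of one-off chore the library is meant to replace looks roughly like the following: hand-parsing a raw dump into (x, y) arrays. The data is inlined here to keep the sketch self-contained; a real script would first download from the data set's URL.

    ```python
    # The one-off parsing script SkData abstracts away: raw text to
    # NumPy arrays suitable for sklearn. Data inlined for self-containment.
    import io
    import numpy as np

    RAW = """5.1,3.5,0
    4.9,3.0,0
    6.2,3.4,1
    5.9,3.0,1"""

    data = np.loadtxt(io.StringIO(RAW), delimiter=",")
    X, y = data[:, :-1], data[:, -1].astype(int)   # features and class labels
    print(X.shape, y)
    ```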

  2. Dissecting social cell biology and tumors using Drosophila genetics.

    PubMed

    Pastor-Pareja, José Carlos; Xu, Tian

    2013-01-01

    Cancer was seen for a long time as a strictly cell-autonomous process in which oncogenes and tumor-suppressor mutations drive clonal cell expansions. Research in the past decade, however, paints a more integrative picture of communication and interplay between neighboring cells in tissues. It is increasingly clear as well that tumors, far from being homogenous lumps of cells, consist of different cell types that function together as complex tissue-level communities. The repertoire of interactive cell behaviors and the quantity of cellular players involved call for a social cell biology that investigates these interactions. Research into this social cell biology is critical for understanding development of normal and tumoral tissues. Such complex social cell biology interactions can be parsed in Drosophila. Techniques in Drosophila for analysis of gene function and clonal behavior allow us to generate tumors and dissect their complex interactive biology with cellular resolution. Here, we review recent Drosophila research aimed at understanding tissue-level biology and social cell interactions in tumors, highlighting the principles these studies reveal.

  3. Medical and Transmission Vector Vocabulary Alignment with Schema.org

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smith, William P.; Chappell, Alan R.; Corley, Courtney D.

    Available biomedical ontologies and knowledge bases currently lack formal and standards-based interconnections between disease, disease vector, and drug treatment vocabularies. The PNNL Medical Linked Dataset (PNNL-MLD) addresses this gap. This paper describes the PNNL-MLD, which provides a unified vocabulary and dataset of drug, disease, side effect, and vector transmission background information. Currently, the PNNL-MLD combines and curates data from the following research projects: DrugBank, DailyMed, Diseasome, DisGeNet, Wikipedia Infobox, Sider, and PharmGKB. The main outcomes of this effort are a dataset aligned to Schema.org, including a parsing framework, and extensible hooks ready for integration with selected medical ontologies. The PNNL-MLD enables researchers to query distinct datasets more quickly and easily. Future extensions to the PNNL-MLD will include Traditional Chinese Medicine, broader interlinks across genetic structures, a larger thesaurus of synonyms and hypernyms, explicit coding of diseases and drugs across research systems, and incorporation of vector-borne transmission vocabularies.

  4. Adding part-of-speech information to the SUBTLEX-US word frequencies.

    PubMed

    Brysbaert, Marc; New, Boris; Keuleers, Emmanuel

    2012-12-01

    The SUBTLEX-US corpus has been parsed with the CLAWS tagger, so that researchers have information about the possible word classes (parts-of-speech, or PoSs) of the entries. Five new columns have been added to the SUBTLEX-US word frequency list: the dominant (most frequent) PoS for the entry, the frequency of the dominant PoS, the frequency of the dominant PoS relative to the entry's total frequency, all PoSs observed for the entry, and the respective frequencies of these PoSs. Because the current definition of lemma frequency does not seem to provide word recognition researchers with useful information (as illustrated by a comparison of the lemma frequencies and the word form frequencies from the Corpus of Contemporary American English), we have not provided a column with this variable. Instead, we hope that the full list of PoS frequencies will help researchers to collectively determine which combination of frequencies is the most informative.
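
    A sketch of how the five added columns can be derived from per-word tag counts with pandas; the toy table and column names are illustrative, not the actual SUBTLEX-US layout.

    ```python
    # Deriving dominant-PoS summary columns from tagged frequency counts.
    # Toy data and column names are illustrative only.
    import pandas as pd

    pos_freq = pd.DataFrame({
        "word": ["run", "run", "fly", "fly"],
        "pos": ["verb", "noun", "verb", "noun"],
        "freq": [900, 100, 300, 700],
    })

    def summarize(group):
        total = group["freq"].sum()
        dom = group.loc[group["freq"].idxmax()]   # most frequent PoS row
        return pd.Series({
            "dom_pos": dom["pos"],
            "freq_dom_pos": dom["freq"],
            "rel_freq_dom_pos": dom["freq"] / total,
            "all_pos": ".".join(group["pos"]),
            "all_pos_freq": ".".join(group["freq"].astype(str)),
        })

    print(pos_freq.groupby("word").apply(summarize))
    ```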

  5. A U-Shaped Relative Clause Attachment Preference in Japanese.

    ERIC Educational Resources Information Center

    Miyamoto, Edson T.; Gibson, Edward; Pearlmutter, Neal J.; Aikawa, Takako; Miyagawa, Shigeru

    1999-01-01

    Presents results from a self-paced reading experiment in Japanese investigating attachment preferences for relative clauses to three ensuing potential nominal heads. Results are discussed in light of two types of parsing models. (Author/VWL)

  6. Peripheral Visual Cues Contribute to the Perception of Object Movement During Self-Movement

    PubMed Central

    Rogers, Cassandra; Warren, Paul A.

    2017-01-01

    Safe movement through the environment requires us to monitor our surroundings for moving objects or people. However, identification of moving objects in the scene is complicated by self-movement, which adds motion across the retina. To identify world-relative object movement, the brain thus has to ‘compensate for’ or ‘parse out’ the components of retinal motion that are due to self-movement. We have previously demonstrated that retinal cues arising from central vision contribute to solving this problem. Here, we investigate the contribution of peripheral vision, commonly thought to provide strong cues to self-movement. Stationary participants viewed a large field of view display, with radial flow patterns presented in the periphery, and judged the trajectory of a centrally presented probe. Across two experiments, we demonstrate and quantify the contribution of peripheral optic flow to flow parsing during forward and backward movement. PMID:29201335

  7. RNA-Seq-Based Transcript Structure Analysis with TrBorderExt.

    PubMed

    Wang, Yejun; Sun, Ming-An; White, Aaron P

    2018-01-01

    RNA-Seq has become a routine strategy for genome-wide gene expression comparisons in bacteria. Despite lower resolution in transcript border parsing compared with dRNA-Seq, TSS-EMOTE, Cappable-seq, Term-seq, and others, directional RNA-Seq retains clear advantages: low cost, quantification, and transcript border analysis at medium resolution (±10-20 nt). To facilitate mining of directional RNA-Seq datasets, especially with respect to transcript structure analysis, we developed a tool, TrBorderExt, which can parse transcript start sites and termination sites accurately in bacteria. A detailed protocol is described in this chapter for using the software package step by step to identify bacterial transcript borders from raw RNA-Seq data. The package was developed in the Perl and R programming languages and is freely accessible at http://www.szu-bioinf.org/TrBorderExt .

  8. Parsing with logical variables (logic-based programming systems)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Finin, T.W.; Stone Palmer, M.

    1983-01-01

    Logic based programming systems have enjoyed an increasing popularity in applied AI work in the last few years. One of the contributions to computational linguistics made by the logic programming paradigm has been the definite clause grammar (DCG). In comparing DCGs with previous parsing mechanisms such as ATNs, certain clear advantages are seen. The authors feel that the most important of these advantages are due to the use of logical variables with unification as the fundamental operation on them. To illustrate the power of the logical variable, they have implemented an experimental ATN system which treats ATN registers as logical variables and provides a unification operation over them. They aim to simultaneously encourage the use of the powerful mechanisms available in DCGs and demonstrate that some of these techniques can be captured without reference to a resolution theorem prover. 14 references.
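
    To make the central idea concrete, here is a minimal unification routine over variable-bearing terms, written in Python rather than the 1983 system's implementation language; the Var class and term encoding are this sketch's own.

    ```python
    # Minimal unification of terms containing logical variables. Terms are
    # tuples or atoms; Var marks an unbound logical variable. (No occurs
    # check, for brevity.)
    class Var:
        def __init__(self, name): self.name = name
        def __repr__(self): return "?" + self.name

    def walk(t, subst):
        """Follow variable bindings until a non-bound term is reached."""
        while isinstance(t, Var) and t.name in subst:
            t = subst[t.name]
        return t

    def unify(a, b, subst):
        """Return an extended substitution, or None on failure."""
        a, b = walk(a, subst), walk(b, subst)
        if isinstance(a, Var):
            return {**subst, a.name: b}
        if isinstance(b, Var):
            return {**subst, b.name: a}
        if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
            for x, y in zip(a, b):
                subst = unify(x, y, subst)
                if subst is None:
                    return None
            return subst
        return subst if a == b else None

    # Unifying two noun-phrase terms binds both registers at once.
    print(unify(("np", Var("Number"), "dog"), ("np", "singular", Var("N")), {}))
    ```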

  9. The metamorphosis of the statistical segmentation output: lexicalization during artificial language learning.

    PubMed

    Fernandes, Tânia; Kolinsky, Régine; Ventura, Paulo

    2009-09-01

    This study combined artificial language learning (ALL) with conventional experimental techniques to test whether statistical speech segmentation outputs are integrated into adult listeners' mental lexicon. Lexicalization was assessed through inhibitory effects of novel neighbors (created by the parsing process) on auditory lexical decisions to real words. Both immediately after familiarization and post-one week, ALL outputs were lexicalized only when the cues available during familiarization (transitional probabilities and wordlikeness) suggested the same parsing (Experiments 1 and 3). No lexicalization effect occurred with incongruent cues (Experiments 2 and 4). Yet, ALL differed from chance, suggesting a dissociation between item knowledge and lexicalization. Similarly contrasted results were found when frequency of occurrence of the stimuli was equated during familiarization (Experiments 3 and 4). Our findings thus indicate that ALL outputs may be lexicalized as far as the segmentation cues are congruent, and that this process cannot be accounted for by raw frequency.

  10. UniGene Tabulator: a full parser for the UniGene format.

    PubMed

    Lenzi, Luca; Frabetti, Flavia; Facchin, Federica; Casadei, Raffaella; Vitale, Lorenza; Canaider, Silvia; Carinci, Paolo; Zannotti, Maria; Strippoli, Pierluigi

    2006-10-15

    UniGene Tabulator 1.0 provides a solution for full parsing of the UniGene flat file format; it implements a structured graphical representation of each data field present in UniGene following import into a common database management system usable on a personal computer. This database includes related tables for sequence, protein similarity, sequence-tagged site (STS) and transcript map interval (TXMAP) data, plus a summary table where each record represents a UniGene cluster. UniGene Tabulator enables full local management of UniGene data, allowing parsing, querying, indexing, retrieving, exporting and analysis of UniGene data in relational database form on computers running Mac OS X (10.3.9 or later) or Windows (2000 with Service Pack 4, or XP with Service Pack 2 or later). The current release, including both FileMaker runtime applications, is freely available at http://apollo11.isto.unibo.it/software/
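
    A rough Python sketch of the flat-file parsing task the tool performs: records are runs of FIELD/value lines terminated by //. The abridged sample record is illustrative, not a verbatim UniGene entry.

    ```python
    # Flat-file parsing in the UniGene style: "FIELD value" lines,
    # records terminated by "//". The sample is abridged and illustrative.
    SAMPLE = """\
    ID          Hs.1
    TITLE       alpha-2-macroglobulin
    GENE        A2M
    //
    ID          Hs.2
    TITLE       N-acetyltransferase 2
    GENE        NAT2
    //
    """

    def parse_unigene(text):
        records, current = [], {}
        for line in text.splitlines():
            if line.strip().startswith("//"):    # record terminator
                records.append(current)
                current = {}
            elif line.strip():
                field, _, value = line.strip().partition(" ")
                current.setdefault(field, []).append(value.strip())
        return records

    for rec in parse_unigene(SAMPLE):
        print(rec["ID"][0], rec["TITLE"][0])
    ```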

  11. Punctuation and Implicit Prosody in Silent Reading: An ERP Study Investigating English Garden-Path Sentences

    PubMed Central

    Drury, John E.; Baum, Shari R.; Valeriote, Hope; Steinhauer, Karsten

    2016-01-01

    This study presents the first two ERP reading studies of comma-induced effects of covert (implicit) prosody on syntactic parsing decisions in English. The first experiment used a balanced 2 × 2 design in which the presence/absence of commas determined plausibility (e.g., John, said Mary, was the nicest boy at the party vs. John said Mary was the nicest boy at the party). The second reading experiment replicated a previous auditory study investigating the role of overt prosodic boundaries in closure ambiguities (Pauker et al., 2011). In both experiments, commas reliably elicited CPS components and generally played a dominant role in determining parsing decisions in the face of input ambiguity. The combined set of findings provides further evidence supporting the claim that mechanisms subserving speech processing play an active role during silent reading. PMID:27695428

  12. Disparities in Diabetes Care Quality by English Language Preference in Community Health Centers.

    PubMed

    Leung, Lucinda B; Vargas-Bustamante, Arturo; Martinez, Ana E; Chen, Xiao; Rodriguez, Hector P

    2018-02-01

    To conduct a parallel analysis of disparities in diabetes care quality among Latino and Asian community health center (CHC) patients by English language preference. Clinical outcomes (2011) and patient survey data (2012) for Type 2 diabetes adults from 14 CHCs (n = 1,053). We estimated separate regression models for Latino and Asian patients by English language preference for the Clinician & Group Consumer Assessment of Healthcare Providers and Systems, the Patient Assessment of Chronic Illness Care, hemoglobin A1c, and self-reported hypoglycemic events. We used the Blinder-Oaxaca decomposition method to parse out observed and unobserved differences in outcomes between English and non-English language groups. After adjusting for socioeconomic and health characteristics, disparities in patient experiences by English language preference were found only among Asian patients. Unobserved factors largely accounted for linguistic disparities for most patient experience measures. There were no significant differences in glycemic control by language for either Latino or Asian patients. Given the importance of patient retention in CHCs, our findings indicate opportunities to improve CHC patients' experiences of care and to reduce disparities in patient experience by English preference for Asian diabetes patients. © Health Research and Educational Trust.
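
    For reference, the standard two-fold Blinder-Oaxaca decomposition splits the mean outcome gap as follows; the E/N subscripts (English-preferring versus non-English-preferring groups) are this note's own notation.

    ```latex
    \bar{Y}_E - \bar{Y}_N =
      \underbrace{(\bar{X}_E - \bar{X}_N)^{\top}\hat{\beta}_N}_{\text{explained by observed characteristics}}
      \;+\;
      \underbrace{\bar{X}_E^{\top}(\hat{\beta}_E - \hat{\beta}_N)}_{\text{unexplained (unobserved factors)}}
    ```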

  13. Principle-based concept analysis: intentionality in holistic nursing theories.

    PubMed

    Aghebati, Nahid; Mohammadi, Eesa; Ahmadi, Fazlollah; Noaparast, Khosrow Bagheri

    2015-03-01

    This is a report of a principle-based concept analysis of intentionality in holistic nursing theories. A principle-based concept analysis method was used to analyze seven holistic theories. The data included eight books and 31 articles (1998-2011), which were retrieved through MEDLINE and CINAHL. Erickson, Kriger, Parse, Watson, and Zahourek define intentionality as a capacity, a focused consciousness, and a pattern of human being. Rogers and Newman do not explicitly mention intentionality; however, they do explain pattern and consciousness (epistemology). Intentionality has been operationalized as a core concept of nurse-client relationships (pragmatic). The theories are consistent on intentionality as a noun and as an attribute of the person-intentionality is different from intent and intention (linguistic). There is ambiguity concerning the boundaries between intentionality and consciousness (logic). Theoretically, intentionality is an evolutionary capacity to integrate human awareness and experience. Because intentionality is an individualized concept, we introduced it as "a matrix of continuous known changes" that emerges in two forms: as a capacity of human being and as a capacity of transpersonal caring. This study has produced a theoretical definition of intentionality and provides a foundation for future research to further investigate intentionality to better delineate its boundaries. © The Author(s) 2014.

  14. Equation-oriented specification of neural models for simulations

    PubMed Central

    Stimberg, Marcel; Goodman, Dan F. M.; Benichoux, Victor; Brette, Romain

    2013-01-01

    Simulating biological neuronal networks is a core method of research in computational neuroscience. A full specification of such a network model includes a description of the dynamics and state changes of neurons and synapses, as well as the synaptic connectivity patterns and the initial values of all parameters. A standard approach in neuronal modeling software is to build network models based on a library of pre-defined components and mechanisms; if a model component does not yet exist, it has to be defined in a special-purpose or general low-level language and potentially be compiled and linked with the simulator. Here we propose an alternative approach that allows flexible definition of models by writing textual descriptions based on mathematical notation. We demonstrate that this approach allows the definition of a wide range of models with minimal syntax. Furthermore, such explicit model descriptions allow the generation of executable code for various target languages and devices, since the description is not tied to an implementation. Finally, this approach also has advantages for readability and reproducibility, because the model description is fully explicit, and because it can be automatically parsed and transformed into formatted descriptions. The presented approach has been implemented in the Brian2 simulator. PMID:24550820
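
    A minimal example of the equation-oriented style in the Brian2 simulator mentioned in the abstract, adapted from the pattern used in Brian2's tutorials; the parameter values are arbitrary.

    ```python
    # Equation-oriented model definition in Brian2: the model is a textual
    # description in mathematical notation, not a pre-defined component.
    from brian2 import NeuronGroup, StateMonitor, run, ms

    eqs = """
    dv/dt = (I - v) / tau : 1
    I : 1
    tau : second
    """
    group = NeuronGroup(2, eqs, threshold="v > 1", reset="v = 0", method="exact")
    group.I = [2.0, 1.1]          # per-neuron drive
    group.tau = [10, 100] * ms    # per-neuron time constant

    monitor = StateMonitor(group, "v", record=True)
    run(50 * ms)
    print(monitor.v[0][:5])       # first few voltage samples of neuron 0
    ```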

  15. A pragmatic method for transforming clinical research data from the research electronic data capture "REDCap" to Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM): Development and evaluation of REDCap2SDTM.

    PubMed

    Yamamoto, Keiichi; Ota, Keiko; Akiya, Ippei; Shintani, Ayumi

    2017-06-01

    The Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM) can be used for new drug application studies as well as secondarily for creating a clinical research data warehouse to leverage clinical research study data across studies conducted within the same disease area. However, currently not all clinical research uses Clinical Data Acquisition Standards Harmonization (CDASH) beginning in the set-up phase of the study. Once already initiated, clinical studies that have not utilized CDASH are difficult to map into the SDTM format. In addition, most electronic data capture (EDC) systems are not equipped to export data in SDTM format; therefore, in many cases, statistical software is used to generate SDTM datasets from accumulated clinical data. In order to facilitate efficient secondary use of accumulated clinical research data using SDTM, it is necessary to develop a new tool to enable mapping of information for SDTM, even during or after the clinical research. REDCap is an EDC system developed by Vanderbilt University and used globally by over 2100 institutions across 108 countries. In this study, we developed a simulated clinical trial to evaluate a tool called REDCap2SDTM, which maps information in the Field Annotation of REDCap to SDTM and dynamically executes data conversion, including pivoting data to accommodate the SDTM format, by parsing the mapping information with R. We confirmed that generating SDTM data and the define.xml file from REDCap using REDCap2SDTM was possible. Conventionally, generation of SDTM data and the define.xml file from EDC systems requires the creation of individual programs for each clinical study. However, our proposed method can be used to generate this data and file dynamically without programming, because it only involves entering the mapping information into the Field Annotation and additional data into specific files. Our proposed method is adaptable not only to new drug application studies but also to all types of research, including observational and public health studies. Our method is also adaptable to clinical data collected without CDASH at the beginning of a study, in non-standard format. We believe that this tool will reduce the workload of new drug application studies and will support data sharing and reuse of clinical research data in academia. Copyright © 2017 Elsevier Inc. All rights reserved.
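
    A hypothetical Python sketch of the mapping idea: SDTM targets embedded in field annotations are parsed and used to route collected values into domain datasets. The SDTM=DOMAIN.VARIABLE annotation syntax shown here is invented for illustration and is not REDCap2SDTM's actual format (the tool itself is implemented around R).

    ```python
    # Hypothetical annotation-driven routing of field values into SDTM
    # domains. The "SDTM=DOMAIN.VARIABLE" syntax is invented for this sketch.
    import re
    from collections import defaultdict

    fields = {
        "age":     {"annotation": "SDTM=DM.AGE",     "value": "63"},
        "sex":     {"annotation": "SDTM=DM.SEX",     "value": "F"},
        "glucose": {"annotation": "SDTM=LB.LBORRES", "value": "5.4"},
    }

    domains = defaultdict(dict)
    for name, meta in fields.items():
        m = re.match(r"SDTM=(\w+)\.(\w+)", meta["annotation"])
        if m:
            domain, variable = m.groups()
            domains[domain][variable] = meta["value"]

    print(dict(domains))  # {'DM': {'AGE': '63', 'SEX': 'F'}, 'LB': {'LBORRES': '5.4'}}
    ```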

  16. The effect of recognizability on figure-ground processing: does it affect parsing or only figure selection?

    PubMed

    Navon, David

    2011-03-01

    Though figure-ground assignment appears to be affected by recognizability, it seems sensible that object recognition must follow at least the earlier process of figure-ground segregation. To examine whether rudimentary object recognition could, counterintuitively, start even before the completion of the stage of parsing in which figure-ground segregation is done, participants were asked to respond, in a go/no-go fashion, whenever any of 16 alternative connected patterns (that constituted familiar stimuli in the upright orientation) appeared. The white figure of the to-be-attended stimulus (target or foil) could be segregated from the white ambient ground only by means of a frame surrounding it. Such a frame was absent until the onset of target display. Then, to manipulate organizational quality, the greyness of the frame was either gradually increased from zero (in Experiment 1) or changed abruptly to a stationary level whose greyness was varied between trials (in Experiments 2 and 3). Stimulus recognizability was manipulated by orientation angle. In all three experiments the effect of recognizability was found to be considerably larger when organizational quality was minimal due to an extremely faint frame. This result is argued to be incompatible with any version of a serial thesis suggesting that processing aimed at object recognition starts only with a good enough level of organizational quality. The experiments rather provide some support for the claim, termed here the "early interaction hypothesis", positing interaction between early recognition processing and preassignment parsing processes.

  17. Representing sentence information

    NASA Astrophysics Data System (ADS)

    Perkins, Walton A., III

    1991-03-01

    This paper describes a computer-oriented representation for sentence information. Whereas many Artificial Intelligence (AI) natural language systems start with a syntactic parse of a sentence into the linguist's components: noun, verb, adjective, preposition, etc., we argue that it is better to parse the input sentence into 'meaning' components: attribute, attribute value, object class, object instance, and relation. AI systems need a representation that will allow rapid storage and retrieval of information and convenient reasoning with that information. The attribute-of-object representation has proven useful for handling information in relational databases (which are well known for their efficiency in storage and retrieval) and for reasoning in knowledge-based systems. On the other hand, the linguist's syntactic representation of the words in sentences has not been shown to be useful for information handling and reasoning. We think it is an unnecessary and misleading intermediate form. Our sentence representation is semantic based, in terms of attribute, attribute value, object class, object instance, and relation. Every sentence is segmented into one or more components with the form: 'attribute' of 'object' 'relation' 'attribute value'. Using only one format for all information gives the system simplicity and good performance, as a RISC architecture does for hardware. The attribute-of-object representation is not new; it is used extensively in relational databases and knowledge-based systems. However, we will show that it can be used as a meaning representation for natural language sentences with minor extensions. In this paper we describe how a computer system can parse English sentences into this representation and generate English sentences from this representation. Much of this has been tested with a computer implementation.
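
    A small sketch of how the paper's single segment format could be encoded as a data structure; the class name and the example parse are this sketch's own.

    ```python
    # Encoding the paper's segment format, "attribute of object relation
    # value", as a Python structure. Names and the example are illustrative.
    from dataclasses import dataclass

    @dataclass
    class Segment:
        attribute: str
        obj: str        # object class or instance
        relation: str
        value: str

    # "The car is red" -> color of car1 equals red
    sentence = [Segment(attribute="color", obj="car1", relation="equals", value="red")]
    print(sentence)
    ```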

  18. The EPRDATA Format: A Dialogue

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hughes, III, Henry Grady

    2015-08-18

    Recently the Los Alamos Nuclear Data Team has communicated certain issues of concern in relation to the new electron/photon/relaxation ACE data format as released in the eprdata12 library. In this document those issues are parsed, analyzed, and answered.

  19. Deterministic Parsing and Linguistic Explanation. Revision,

    DTIC Science & Technology

    1985-06-01

    [Abstract not available; only a scanned fragment survives, discussing the possible interpretations of phrases like "near the town" and citing Zubizarreta (1982) and Stowell (1981); Department of Linguistics and Philosophy.]

  20. Sleep Disrupts High-Level Speech Parsing Despite Significant Basic Auditory Processing.

    PubMed

    Makov, Shiri; Sharon, Omer; Ding, Nai; Ben-Shachar, Michal; Nir, Yuval; Zion Golumbic, Elana

    2017-08-09

    The extent to which the sleeping brain processes sensory information remains unclear. This is particularly true for continuous and complex stimuli such as speech, in which information is organized into hierarchically embedded structures. Recently, novel metrics for assessing the neural representation of continuous speech have been developed using noninvasive brain recordings that have thus far only been tested during wakefulness. Here we investigated, for the first time, the sleeping brain's capacity to process continuous speech at different hierarchical levels using a newly developed Concurrent Hierarchical Tracking (CHT) approach that allows monitoring the neural representation and processing-depth of continuous speech online. Speech sequences were compiled with syllables, words, phrases, and sentences occurring at fixed time intervals such that different linguistic levels correspond to distinct frequencies. This enabled us to distinguish their neural signatures in brain activity. We compared the neural tracking of intelligible versus unintelligible (scrambled and foreign) speech across states of wakefulness and sleep using high-density EEG in humans. We found that neural tracking of stimulus acoustics was comparable across wakefulness and sleep and similar across all conditions regardless of speech intelligibility. In contrast, neural tracking of higher-order linguistic constructs (words, phrases, and sentences) was only observed for intelligible speech during wakefulness and could not be detected at all during nonrapid eye movement or rapid eye movement sleep. These results suggest that, whereas low-level auditory processing is relatively preserved during sleep, higher-level hierarchical linguistic parsing is severely disrupted, thereby revealing the capacity and limits of language processing during sleep. SIGNIFICANCE STATEMENT Despite the persistence of some sensory processing during sleep, it is unclear whether high-level cognitive processes such as speech parsing are also preserved. We used a novel approach for studying the depth of speech processing across wakefulness and sleep while tracking neuronal activity with EEG. We found that responses to the auditory sound stream remained intact; however, the sleeping brain did not show signs of hierarchical parsing of the continuous stream of syllables into words, phrases, and sentences. The results suggest that sleep imposes a functional barrier between basic sensory processing and high-level cognitive processing. This paradigm also holds promise for studying residual cognitive abilities in a wide array of unresponsive states. Copyright © 2017 the authors.

  1. Multi-brain fusion and applications to intelligence analysis

    NASA Astrophysics Data System (ADS)

    Stoica, A.; Matran-Fernandez, A.; Andreou, D.; Poli, R.; Cinel, C.; Iwashita, Y.; Padgett, C.

    2013-05-01

    In a rapid serial visual presentation (RSVP) images are shown at an extremely rapid pace. Yet, the images can still be parsed by the visual system to some extent. In fact, the detection of specific targets in a stream of pictures triggers a characteristic electroencephalography (EEG) response that can be recognized by a brain-computer interface (BCI) and exploited for automatic target detection. Research funded by DARPA's Neurotechnology for Intelligence Analysts program has achieved speed-ups in sifting through satellite images when adopting this approach. This paper extends the use of BCI technology from individual analysts to collaborative BCIs. We show that the integration of information in EEGs collected from multiple operators results in performance improvements compared to the single-operator case.

  2. ProForma: A Standard Proteoform Notation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    LeDuc, Richard D.; Schwämmle, Veit; Shortreed, Michael R.

    The Consortium for Top-Down Proteomics (CTDP) proposes a standardized notation, ProForma, for writing the sequence of fully characterized proteoforms. ProForma provides a means to communicate any proteoform by writing the amino acid sequence using standard one-letter notation and specifying modifications or unidentified mass shifts within brackets following certain amino acids. The notation is unambiguous, human readable, and can easily be parsed and written by bioinformatic tools. This system uses seven rules and supports a wide range of possible use cases, ensuring compatibility and reproducibility of proteoform annotations. Standardizing proteoform sequences will simplify storage, comparison, and reanalysis of proteomic studies, and the Consortium welcomes input and contributions from the research community on the continued design and maintenance of this standard.
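
    A toy parser for a simplified subset of the bracket rule (one-letter residues, each optionally followed by a bracketed modification or mass shift); it does not implement all seven ProForma rules.

    ```python
    # Simplified ProForma-style tokenizer: residue letters, each optionally
    # followed by a bracketed modification. Covers only the basic bracket
    # rule, not the full specification.
    import re

    TOKEN = re.compile(r"([A-Z])(?:\[([^\]]+)\])?")

    def parse_proforma(seq):
        """Return a list of (residue, modification-or-None) pairs."""
        pos = 0
        residues = []
        for m in TOKEN.finditer(seq):
            if m.start() != pos:
                raise ValueError(f"unparseable text at offset {pos}")
            residues.append((m.group(1), m.group(2)))
            pos = m.end()
        if pos != len(seq):
            raise ValueError(f"unparseable text at offset {pos}")
        return residues

    print(parse_proforma("EMEVEES[Phospho]PEK"))
    print(parse_proforma("SEQUEN[+79.966]CE"))
    ```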

  3. What can mice tell us about how vision works?

    PubMed Central

    Huberman, Andrew D.; Niell, Cristopher M.

    2012-01-01

    Understanding the neural basis of visual perception is a longstanding fundamental goal of neuroscience. Historically, most vision studies were carried out on humans, macaque monkeys and cats. Over the last five years, however, a growing number of researchers have begun using mice to parse the mechanisms underlying visual processing; the rationale is that despite having relatively poor acuity, mice are unmatched in terms of the variety and sophistication of tools available to label, monitor and manipulate specific cell types and circuits. In this review, we discuss recent advances in understanding the mouse visual system at the anatomical, receptive field and perceptual level, focusing on the opportunities and constraints those features provide toward the goal of understanding how vision works. PMID:21840069

  4. Quantifying care coordination using natural language processing and domain-specific ontology

    PubMed Central

    Popejoy, Lori L; Khalilia, Mohammed A; Popescu, Mihail; Galambos, Colleen; Lyons, Vanessa; Rantz, Marilyn; Hicks, Lanis; Stetzer, Frank

    2015-01-01

    Objective This research identifies specific care coordination activities used by Aging in Place (AIP) nurse care coordinators and home healthcare (HHC) nurses when coordinating care for older community-dwelling adults and suggests a method to quantify care coordination. Methods A care coordination ontology was built based on activities extracted from 11 038 notes labeled with the Omaha Case management category. From the parsed narrative notes of every patient, we mapped the extracted activities to the ontology, from which we computed problem profiles and quantified care coordination for all patients. Results We compared two groups of patients: AIP who received enhanced care coordination (n=217) and HHC who received traditional care (n=691) using 128 135 narratives notes. Patients were tracked from the time they were admitted to AIP or HHC until they were discharged. We found that patients in AIP received a higher dose of care coordination than HHC in most Omaha problems, with larger doses being given in AIP than in HHC in all four Omaha categories. Conclusions ‘Communicate’ and ‘manage’ activities are widely used in care coordination. This confirmed the expert hypothesis that nurse care coordinators spent most of their time communicating about their patients and managing problems. Overall, nurses performed care coordination in both AIP and HHC, but the aggregated dose across Omaha problems and categories is larger in AIP. PMID:25324557

  5. A natural language query system for Hubble Space Telescope proposal selection

    NASA Technical Reports Server (NTRS)

    Hornick, Thomas; Cohen, William; Miller, Glenn

    1987-01-01

    The proposal selection process for the Hubble Space Telescope is assisted by a robust and easy to use query program (TACOS). The system parses an English subset language sentence regardless of the order of the keyword phrases, allowing the user greater flexibility than a standard command query language. Capabilities for macro and procedure definition are also integrated. The system was designed for flexibility in both use and maintenance. In addition, TACOS can be applied to any knowledge domain that can be expressed in terms of a single relation. The system was implemented mostly in Common LISP. The TACOS design is described in detail, with particular attention given to the implementation methods of sentence processing.

  6. A new method of cardiographic image segmentation based on grammar

    NASA Astrophysics Data System (ADS)

    Hamdi, Salah; Ben Abdallah, Asma; Bedoui, Mohamed H.; Alimi, Adel M.

    2011-10-01

    The measurement of the most common ultrasound parameters, such as aortic area, mitral area and left ventricle (LV) volume, requires the delineation of the organ in order to estimate the area. In terms of medical image processing this translates into the need to segment the image and define the contours as accurately as possible. The aim of this work is to segment an image and make an automated area estimation based on grammar. The entity "language" will be projected onto the entity "image" to perform structural analysis and parsing of the image. We will show how the idea of segmentation and grammar-based area estimation is applied to real problems of cardiographic image processing.

  7. Socializing the coast: Engaging the social science of tropical coastal research

    NASA Astrophysics Data System (ADS)

    Spalding, Ana K.; Biedenweg, Kelly

    2017-03-01

    The broad scale and rapid rate of change in the global environment is causing some of the world's most challenging problems, such as habitat degradation, loss of biodiversity, and food insecurity. These problems are especially pressing in coastal environments in the tropics, resulting in significant impacts on human wellbeing and ecological systems across the globe. The underlying causes of marine and coastal environmental change are both anthropogenic and natural; and, while it is difficult to parse out causal linkages as either exclusively human or naturally occurring, feedbacks between drivers only exacerbate the issues. Increasingly, scholars are turning to integrated research efforts, whereby multiple disciplines are used to answer pressing questions about and find solutions for the sustainability of human life and natural ecosystems across the coastal tropics. This article leverages the recent wave of interdisciplinary research to explore the various ways in which the social sciences have successfully contributed to a more complete understanding of coastal systems across the tropics. It also identifies opportunities for research that move beyond single disciplinary approaches to coastal science. The concluding discussion suggests social science knowledge areas that are underutilized in coastal research and provides suggestions for increasing the incorporation of social science in coastal research programs.

  8. Earth Sciences data user community feedbacks to PARSE.Insight

    NASA Astrophysics Data System (ADS)

    Giaretta, David; Guidetti, Veronica

    2010-05-01

    This presentation reports on the long-term availability of environmental data as perceived by the Earth Science data user community. In the context of the European strategy for preserving Earth Observation (EO) data, and as partner of the EU FP7 PARSE.Insight project (http://www.parse-insight.eu/), the European Space Agency (ESA) issued a public on-line consultation targeting its EO data user base. The timely and active participation confirmed the high interest in the addressed topic. The primary aim of this action is to provide ESA teams dedicated to environmental data access, archiving and re-processing with a first insight from the Earth Science community on the preservation of space data in the long term. As a significant example, ESA's Climate Change Initiative requires activities like long-term preservation, recalibration and re-processing of data records. The time-span of EO data archives extends from a few years to decades, and their value as scientific time-series increases considerably with respect to the topic of global change. Future research in the field of Earth Sciences is of invaluable importance: to carry it on, researchers worldwide must be enabled to find and access data of interest quickly. At present several thousand scientists, principal investigators and operators access EO missions' metadata, data and derived information daily. Main objectives may be to study global climate change, to check the status of an instrument, or to assess the quality of EO data. There is a huge worldwide scientific community calling for EO data to be kept accessible without time constraints, easily and quickly. The scientific community's standpoint is given on the stewardship of environmental data and on whether current EO data access systems enable digital preservation and offer HPC capabilities. This insight into the Earth Sciences community provides a comprehensive illustration of the users' responses on topics such as experiences with historical EO data, preferences in terms of historical data availability, and proposals to better access and use them. The main achievement of this initiative is certainly reinforcing the link with the worldwide Earth Science community on the topic of environmental and space data preservation. Moreover, it confirmed the following aspects: - The scientific community needs and wants to access historical environmental data and historical time series of Earth observations, for the most disparate applications across Earth Science; - The community wants to build on its experience with historical data exploitation, aiming at a more active involvement in the process, e.g., by reporting examples and suggestions to foster data availability and accessibility; - Users are aware and informed about current infrastructures' limitations in enabling data availability and accessibility, and ask for timely and effective solutions. This presentation is a good opportunity to receive further feedback on the preservation of EO data from the user community attending the conference.

  9. Logs Perl Module

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Owen, R. K.

    2007-04-04

    A Perl module designed to read and parse the voluminous set of event or accounting log files produced by a Portable Batch System (PBS) server. This module can filter on date-time and/or record type. The data can be returned in a variety of formats.
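
    The module itself is Perl, but the parsing task translates directly; the sketch below follows the common PBS accounting-record layout (date time;type;id;key=value ...) with an invented sample line.

    ```python
    # Parsing a PBS accounting-log record and filtering on record type.
    # The sample line is invented; layout follows the common PBS format.
    from datetime import datetime

    LINE = "04/04/2007 10:15:00;E;1234.server;user=alice resources_used.walltime=00:02:10"

    def parse_pbs_record(line):
        stamp, rectype, jobid, rest = line.split(";", 3)
        attrs = dict(kv.split("=", 1) for kv in rest.split())
        return {
            "when": datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S"),
            "type": rectype,        # e.g. E = job ended
            "jobid": jobid,
            "attrs": attrs,
        }

    record = parse_pbs_record(LINE)
    if record["type"] == "E":       # filter on record type, as the module does
        print(record["jobid"], record["attrs"]["resources_used.walltime"])
    ```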

  10. NEFTool: System Design

    DTIC Science & Technology

    2007-11-01

    [Abstract not available; surviving fragments describe an Unstructured Information Management Architecture (UIMA)-based framework in which a NEFServer instance sends, receives, and parses information, with references to Apache UIMA (http://incubator.apache.org/uima/) and to Grishman & Sundheim, "Message Understanding Conference - 6: A Brief History".]

  11. A linguistic rule-based approach to extract drug-drug interactions from pharmacological documents

    PubMed Central

    2011-01-01

    Background A drug-drug interaction (DDI) occurs when one drug influences the level or activity of another drug. The increasing volume of the scientific literature overwhelms health care professionals trying to keep up-to-date with all published studies on DDI. Methods This paper describes a hybrid linguistic approach to DDI extraction that combines shallow parsing and syntactic simplification with pattern matching. Appositions and coordinate structures are interpreted based on shallow syntactic parsing provided by the UMLS MetaMap tool (MMTx). Subsequently, complex and compound sentences are broken down into clauses from which simple sentences are generated by a set of simplification rules. A pharmacist defined a set of domain-specific lexical patterns to capture the most common expressions of DDI in texts. These lexical patterns are matched with the generated sentences in order to extract DDIs. Results We have performed different experiments to analyze the performance of the different processes. The lexical patterns achieve a reasonable precision (67.30%), but very low recall (14.07%). The inclusion of appositions and coordinate structures helps to improve the recall (25.70%); however, precision is lower (48.69%). The detection of clauses does not improve the performance. Conclusions Information Extraction (IE) techniques can provide an interesting way of reducing the time spent by health care professionals on reviewing the literature. Nevertheless, no integral approach had previously been developed to extract DDIs from texts. To the best of our knowledge, this work proposes the first integral solution for the automatic extraction of DDIs from biomedical texts. PMID:21489220
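
    A toy version of the pattern-matching stage: two illustrative lexical patterns (not the pharmacist-defined set from the paper) applied to an already-simplified sentence.

    ```python
    # Lexical pattern matching for DDI extraction over simplified sentences.
    # The two patterns are illustrative, not the paper's actual set.
    import re

    PATTERNS = [
        re.compile(r"(?P<d1>\w+) (?:increases|decreases) the (?:levels?|effects?) of (?P<d2>\w+)"),
        re.compile(r"(?P<d1>\w+) should not be (?:used|co-administered) with (?P<d2>\w+)"),
    ]

    def extract_ddi(sentence):
        hits = []
        for pat in PATTERNS:
            for m in pat.finditer(sentence.lower()):
                hits.append((m.group("d1"), m.group("d2")))
        return hits

    print(extract_ddi("Ketoconazole increases the levels of simvastatin."))
    ```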

  12. A system for endobronchial video analysis

    NASA Astrophysics Data System (ADS)

    Byrnes, Patrick D.; Higgins, William E.

    2017-03-01

    Image-guided bronchoscopy is a critical component in the treatment of lung cancer and other pulmonary disorders. During bronchoscopy, a high-resolution endobronchial video stream facilitates guidance through the lungs and allows for visual inspection of a patient's airway mucosal surfaces. Despite the detailed information it contains, little effort has been made to incorporate recorded video into the clinical workflow. Follow-up procedures often required in cancer assessment or asthma treatment could significantly benefit from effectively parsed and summarized video. Tracking diagnostic regions of interest (ROIs) could potentially better equip physicians to detect early airway-wall cancer or improve asthma treatments, such as bronchial thermoplasty. To address this need, we have developed a system for the postoperative analysis of recorded endobronchial video. The system first parses an input video stream into endoscopic shots, derives motion information, and selects salient representative key frames. Next, a semi-automatic method for CT-video registration creates data linkages between a CT-derived airway-tree model and the input video. These data linkages then enable the construction of a CT-video chest model comprised of a bronchoscopy path history (BPH), defining all airway locations visited during a procedure, and texture-mapping information for rendering registered video frames onto the airway-tree model. A suite of analysis tools is included to visualize and manipulate the extracted data. Video browsing and retrieval is facilitated through a video table of contents (TOC) and a search query interface. The system provides a variety of operational modes and additional functionality, including the ability to define regions of interest. We demonstrate the potential of our system using two human case study examples.

  13. The "Globularization Hypothesis" of the Language-ready Brain as a Developmental Frame for Prosodic Bootstrapping Theories of Language Acquisition.

    PubMed

    Irurtzun, Aritz

    2015-01-01

    In recent research, Boeckx and Benítez-Burraco (2014a,b) have advanced the hypothesis that our species-specific language-ready brain should be understood as the outcome of developmental changes that occurred in our species after the split from Neanderthals-Denisovans, which resulted in a more globular braincase configuration in comparison to our closest relatives, who had elongated endocasts. According to these authors, the development of a globular brain is an essential ingredient for the language faculty and, in particular, it is the centrality occupied by the thalamus in a globular brain that allows its modulatory or regulatory role, essential for syntactico-semantic computations. Their hypothesis is that the syntactico-semantic capacities arise in humans as a consequence of a process of globularization, which significantly takes place postnatally (cf. Neubauer et al., 2010). In this paper, I show that Boeckx and Benítez-Burraco's hypothesis makes an interesting developmental prediction regarding the path of language acquisition: it teases apart the onset of phonological acquisition and the onset of syntactic acquisition (the latter starting significantly later, after globularization). I argue that this hypothesis provides a developmental rationale for the prosodic bootstrapping hypothesis of language acquisition (cf. i.a. Gleitman and Wanner, 1982; Mehler et al., 1988, et seq.; Gervain and Werker, 2013), which claims that prosodic cues are employed for syntactic parsing. The literature converges in the observation that a large number of such prosodic cues (in particular, rhythmic cues) are already acquired before the completion of the globularization phase, which paves the way for the premises of the prosodic bootstrapping hypothesis, allowing babies to have a rich knowledge of the prosody of their target language before they can start parsing the primary linguistic data syntactically.

  14. Neural Correlates of Three Promising Endophenotypes of Depression: Evidence from the EMBARC Study

    PubMed Central

    Webb, Christian A; Dillon, Daniel G; Pechtel, Pia; Goer, Franziska K; Murray, Laura; Huys, Quentin JM; Fava, Maurizio; McGrath, Patrick J; Weissman, Myrna; Parsey, Ramin; Kurian, Benji T; Adams, Phillip; Weyandt, Sarah; Trombello, Joseph M; Grannemann, Bruce; Cooper, Crystal M; Deldin, Patricia; Tenke, Craig; Trivedi, Madhukar; Bruder, Gerard; Pizzagalli, Diego A

    2016-01-01

    Major depressive disorder (MDD) is clinically, and likely pathophysiologically, heterogeneous. A potentially fruitful approach to parsing this heterogeneity is to focus on promising endophenotypes. Guided by the NIMH Research Domain Criteria initiative, we used source localization of scalp-recorded EEG resting data to examine the neural correlates of three emerging endophenotypes of depression: neuroticism, blunted reward learning, and cognitive control deficits. Data were drawn from the ongoing multi-site EMBARC study. We estimated intracranial current density for standard EEG frequency bands in 82 unmedicated adults with MDD, using Low-Resolution Brain Electromagnetic Tomography. Region-of-interest and whole-brain analyses tested associations between resting state EEG current density and endophenotypes of interest. Neuroticism was associated with increased resting gamma (36.5–44 Hz) current density in the ventral (subgenual) anterior cingulate cortex (ACC) and orbitofrontal cortex (OFC). In contrast, reduced cognitive control correlated with decreased gamma activity in the left dorsolateral prefrontal cortex (dlPFC), decreased theta (6.5–8 Hz) and alpha2 (10.5–12 Hz) activity in the dorsal ACC, and increased alpha2 activity in the right dlPFC. Finally, blunted reward learning correlated with lower OFC and left dlPFC gamma activity. Computational modeling of trial-by-trial reinforcement learning further indicated that lower OFC gamma activity was linked to reduced reward sensitivity. Three putative endophenotypes of depression were found to have partially dissociable resting intracranial EEG correlates, reflecting different underlying neural dysfunctions. Overall, these findings highlight the need to parse the heterogeneity of MDD by focusing on promising endophenotypes linked to specific pathophysiological abnormalities. PMID:26068725

  15. Psychophysical and Neural Correlates of Auditory Attraction and Aversion

    NASA Astrophysics Data System (ADS)

    Patten, Kristopher Jakob

    This study explores the psychophysical and neural processes associated with the perception of sounds as either pleasant or aversive. The underlying psychophysical theory is based on auditory scene analysis, the process through which listeners parse auditory signals into individual acoustic sources. The first experiment tests and confirms that a self-rated pleasantness continuum reliably exists for 20 various stimuli (r = .48). In addition, the pleasantness continuum correlated with the physical acoustic characteristics of consonance/dissonance (r = .78), which can facilitate auditory parsing processes. The second experiment uses an fMRI block design to test blood oxygen level dependent (BOLD) changes elicited by a subset of 5 exemplar stimuli chosen from Experiment 1 that are evenly distributed over the pleasantness continuum. Specifically, it tests and confirms that the pleasantness continuum produces systematic changes in brain activity for unpleasant acoustic stimuli beyond what occurs with pleasant auditory stimuli. Results revealed that the combination of two positively and two negatively valenced experimental sounds compared to one neutral baseline control elicited BOLD increases in the primary auditory cortex, specifically the bilateral superior temporal gyrus, and left dorsomedial prefrontal cortex; the latter being consistent with a frontal decision-making process common in identification tasks. The negatively-valenced stimuli yielded additional BOLD increases in the left insula, which typically indicates processing of visceral emotions. The positively-valenced stimuli did not yield any significant BOLD activation, consistent with consonant, harmonic stimuli being the prototypical acoustic pattern of auditory objects that is optimal for auditory scene analysis. Both the psychophysical findings of Experiment 1 and the neural processing findings of Experiment 2 support that consonance is an important dimension of sound that is processed in a manner that aids auditory parsing and functional representation of acoustic objects and was found to be a principal feature of pleasing auditory stimuli.

  16. Machine learning to parse breast pathology reports in Chinese.

    PubMed

    Tang, Rong; Ouyang, Lizhi; Li, Clara; He, Yue; Griffin, Molly; Taghian, Alphonse; Smith, Barbara; Yala, Adam; Barzilay, Regina; Hughes, Kevin

    2018-06-01

    Large structured databases of pathology findings are valuable in deriving new clinical insights. However, they are labor intensive to create and generally require manual annotation. There has been some work in the bioinformatics community to support automating this work via machine learning in English. Our contribution is to provide an automated approach to construct such structured databases in Chinese, and to set the stage for extraction from other languages. We collected 2104 de-identified Chinese benign and malignant breast pathology reports from Hunan Cancer Hospital. Physicians with native Chinese proficiency reviewed the reports and annotated a variety of binary and numerical pathologic entities. After excluding 78 cases with a bilateral lesion in the same report, 1216 cases were used as a training set for the algorithm, which was then refined by 405 development cases. The Natural language processing algorithm was tested by using the remaining 405 cases to evaluate the machine learning outcome. The model was used to extract 13 binary entities and 8 numerical entities. When compared to physicians with native Chinese proficiency, the model showed a per-entity accuracy from 91 to 100% for all common diagnoses on the test set. The overall accuracy of binary entities was 98% and of numerical entities was 95%. In a per-report evaluation for binary entities with more than 100 training cases, 85% of all the testing reports were completely correct and 11% had an error in 1 out of 22 entities. We have demonstrated that Chinese breast pathology reports can be automatically parsed into structured data using standard machine learning approaches. The results of our study demonstrate that techniques effective in parsing English reports can be scaled to other languages.

  17. Parsing Social Network Survey Data from Hidden Populations Using Stochastic Context-Free Grammars

    PubMed Central

    Poon, Art F. Y.; Brouwer, Kimberly C.; Strathdee, Steffanie A.; Firestone-Cruz, Michelle; Lozada, Remedios M.; Kosakovsky Pond, Sergei L.; Heckathorn, Douglas D.; Frost, Simon D. W.

    2009-01-01

    Background Human populations are structured by social networks, in which individuals tend to form relationships based on shared attributes. Certain attributes that are ambiguous, stigmatized or illegal can create a 'hidden' population, so-called because its members are difficult to identify. Many hidden populations are also at an elevated risk of exposure to infectious diseases. Consequently, public health agencies are presently adopting modern survey techniques that traverse social networks in hidden populations by soliciting individuals to recruit their peers, e.g., respondent-driven sampling (RDS). The concomitant accumulation of network-based epidemiological data, however, is rapidly outpacing the development of computational methods for analysis. Moreover, current analytical models rely on unrealistic assumptions, e.g., that the traversal of social networks can be modeled by a Markov chain rather than a branching process. Methodology/Principal Findings Here, we develop a new methodology based on stochastic context-free grammars (SCFGs), which are well-suited to modeling the tree-like structure of the RDS recruitment process. We apply this methodology to an RDS case study of injection drug users (IDUs) in Tijuana, México, a hidden population at high risk of blood-borne and sexually-transmitted infections (i.e., HIV, hepatitis C virus, syphilis). Survey data were encoded as text strings that were parsed using our custom implementation of the inside-outside algorithm in a publicly-available software package (HyPhy), which uses either expectation maximization or direct optimization methods and permits constraints on model parameters for hypothesis testing. We identified significant latent variability in the recruitment process that violates assumptions of Markov chain-based methods for RDS analysis: firstly, IDUs tended to emulate the recruitment behavior of their own recruiter; and secondly, the recruitment of like peers (homophily) was dependent on the number of recruits. Conclusions SCFGs provide a rich probabilistic language that can articulate complex latent structure in survey data derived from the traversal of social networks. Such structure that has no representation in Markov chain-based models can interfere with the estimation of the composition of hidden populations if left unaccounted for, raising critical implications for the prevention and control of infectious disease epidemics. PMID:19738904
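
    To make the grammar machinery concrete, the sketch below computes inside probabilities for a toy SCFG in Chomsky normal form, loosely modeling a recruiter who either spawns two recruitment subtrees or stops; the paper itself uses HyPhy's inside-outside implementation, and this grammar is illustrative only.

    ```python
    # Inside-probability computation (CKY-style) for a toy SCFG in Chomsky
    # normal form. The grammar is illustrative, not the paper's model.
    from collections import defaultdict

    # Rules: (lhs, rhs) -> probability; rhs is a (B, C) pair or a terminal.
    RULES = {
        ("S", ("S", "S")): 0.3,   # a recruiter's subtree spawns two subtrees
        ("S", "r"): 0.7,          # ...or terminates in a single recruit
    }

    def inside(tokens):
        n = len(tokens)
        beta = defaultdict(float)          # beta[(i, j, A)] = P(A => tokens[i:j])
        for i, tok in enumerate(tokens):   # width-1 spans: lexical rules
            for (lhs, rhs), p in RULES.items():
                if rhs == tok:
                    beta[(i, i + 1, lhs)] += p
        for width in range(2, n + 1):      # wider spans: binary rules
            for i in range(n - width + 1):
                j = i + width
                for k in range(i + 1, j):
                    for (lhs, rhs), p in RULES.items():
                        if isinstance(rhs, tuple):
                            b, c = rhs
                            beta[(i, j, lhs)] += p * beta[(i, k, b)] * beta[(k, j, c)]
        return beta[(0, n, "S")]

    print(inside(["r", "r", "r"]))  # probability the grammar generates 3 recruits
    ```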

  18. Blurring the Inputs: A Natural Language Approach to Sensitivity Analysis

    NASA Technical Reports Server (NTRS)

    Kleb, William L.; Thompson, Richard A.; Johnston, Christopher O.

    2007-01-01

    To document model parameter uncertainties and to automate sensitivity analyses for numerical simulation codes, a natural-language-based method to specify tolerances has been developed. With this new method, uncertainties are expressed in a natural manner, i.e., as one would on an engineering drawing, namely, 5.25 +/- 0.01. This approach is robust and readily adapted to various application domains because it does not rely on parsing the particular structure of input file formats. Instead, tolerances of a standard format are added to existing fields within an input file. As a demonstration of the power of this simple, natural language approach, a Monte Carlo sensitivity analysis is performed for three disparate simulation codes: fluid dynamics (LAURA), radiation (HARA), and ablation (FIAT). Effort required to harness each code for sensitivity analysis was recorded to demonstrate the generality and flexibility of this new approach.
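
    A minimal sketch of the idea: scan an input file for value +/- tolerance tokens and emit perturbed copies for a Monte Carlo sweep. The input text is invented, and the uniform draw is this sketch's choice of distribution.

    ```python
    # Find "value +/- tol" tokens and draw perturbed copies of the input
    # for a Monte Carlo sweep. Input text and parameter names are invented.
    import random
    import re

    TOL = re.compile(r"(-?\d+\.?\d*)\s*\+/-\s*(\d+\.?\d*)")
    INPUT = "wall_temperature = 5.25 +/- 0.01\nemissivity = 0.90 +/- 0.05\n"

    def perturb(text, rng):
        """Replace each toleranced value with a uniform draw inside its bounds."""
        def draw(m):
            nominal, tol = float(m.group(1)), float(m.group(2))
            return f"{rng.uniform(nominal - tol, nominal + tol):.6g}"
        return TOL.sub(draw, text)

    rng = random.Random(42)
    for sample in range(3):      # three Monte Carlo realizations
        print(perturb(INPUT, rng))
    ```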

  19. A Seer of Trump's Coming Parses Repeal and Replace.

    PubMed

    Kirkner, Richard Mark

    2017-03-01

    Diana Furchtgott-Roth, a senior fellow at the Manhattan Institute, a free-market think tank, confidently predicted back in October what few people saw coming: Donald Trump's electoral victory. Now she gives her take on the dismantling of the ACA and what might come after.

  20. Design Report for the Synchronized Position, Velocity, and Time Code Generator

    DTIC Science & Technology

    2015-08-01

    Excerpt from the report's table of contents: …Stream Specification; 2.3 Data Packet Format Specification; 2.3.1 Individual Message Definition; 3. MATLAB Parsing Software; 4. Conclusions and… Listed tables include the packet format structure, the PPS time message definition, and the position message definition.

  1. A Python package for parsing, validating, mapping and formatting sequence variants using HGVS nomenclature.

    PubMed

    Hart, Reece K; Rico, Rudolph; Hare, Emily; Garcia, John; Westbrook, Jody; Fusaro, Vincent A

    2015-01-15

    Biological sequence variants are commonly represented in scientific literature, clinical reports and databases of variation using the mutation nomenclature guidelines endorsed by the Human Genome Variation Society (HGVS). Despite the widespread use of the standard, no freely available, comprehensive programming libraries exist. Here we report an open-source and easy-to-use Python library that facilitates the parsing, manipulation, formatting and validation of variants according to the HGVS specification. The current implementation focuses on the subset of the HGVS recommendations that precisely describe sequence-level variation relevant to the application of high-throughput sequencing to clinical diagnostics. The package is released under the Apache 2.0 open-source license. Source code, documentation and issue tracking are available at http://bitbucket.org/hgvs/hgvs/. Python packages are available at PyPI (https://pypi.python.org/pypi/hgvs). Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
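
    A brief usage sketch of the library described above; the variant string is an arbitrary example, and the validation and transcript-projection steps, which require a network-backed data provider, are omitted here.

        # Parse an HGVS variant string into a structured object.
        import hgvs.parser

        hp = hgvs.parser.Parser()
        var = hp.parse_hgvs_variant("NM_001637.3:c.1582G>A")  # example string
        print(var.ac)       # reference sequence accession
        print(var.type)     # coordinate type ('c' for coding)
        print(var.posedit)  # position and edit, e.g. 1582G>A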

  2. Parsing recursive sentences with a connectionist model including a neural stack and synaptic gating.

    PubMed

    Fedor, Anna; Ittzés, Péter; Szathmáry, Eörs

    2011-02-21

    It is supposed that humans are genetically predisposed to be able to recognize sequences of context-free grammars with centre-embedded recursion while other primates are restricted to the recognition of finite state grammars with tail-recursion. Our aim was to construct a minimalist neural network that is able to parse artificial sentences of both grammars in an efficient way without using the biologically unrealistic backpropagation algorithm. The core of this network is a neural stack-like memory where the push and pop operations are regulated by synaptic gating on the connections between the layers of the stack. The network correctly categorizes novel sentences of both grammars after training. We suggest that the introduction of the neural stack memory will turn out to be substantial for any biological 'hierarchical processor' and the minimalist design of the model suggests a quest for similar, realistic neural architectures. Copyright © 2010 Elsevier Ltd. All rights reserved.

  3. Mathematical formula recognition using graph grammar

    NASA Astrophysics Data System (ADS)

    Lavirotte, Stephane; Pottier, Loic

    1998-04-01

    This paper describes current results of Ofr, a system for extracting and understanding mathematical expressions in documents. Such a tool would be especially useful for reusing knowledge from scientific books that are not available in electronic form. We are also studying the use of this system for direct input of formulas with a graphical tablet into computer algebra software. Existing solutions for mathematical recognition have difficulty analyzing 2D expressions such as vectors and matrices, because they typically analyze formulas with classical grammars extended relative to the baseline; many mathematical notations do not respect the rules such parsing requires, which is why these extensions of text-parsing techniques fail. We investigate graph grammars and graph rewriting as a solution for recognizing 2D mathematical notations. Graph grammars provide a powerful formalism for describing structural manipulations of multi-dimensional data. The two main problems to solve are ambiguities between grammar rules and the construction of the graph.

  4. Form and function: Optional complementizers reduce causal inferences

    PubMed Central

    Rohde, Hannah; Tyler, Joseph; Carlson, Katy

    2017-01-01

    Many factors are known to influence the inference of the discourse coherence relationship between two sentences. Here, we examine the relationship between two conjoined embedded clauses in sentences like The professor noted that the student teacher did not look confident and (that) the students were poorly behaved. In two studies, we find that the presence of that before the second embedded clause in such sentences reduces the possibility of a forward causal relationship between the clauses, i.e., the inference that the student teacher’s confidence was what affected student behavior. Three further studies tested the possibility of a backward causal relationship between clauses in the same structure, and found that the complementizer’s presence aids that relationship, especially in a forced-choice paradigm. The empirical finding that a complementizer, a linguistic element associated primarily with structure rather than event-level semantics, can affect discourse coherence is novel and illustrates an interdependence between syntactic parsing and discourse parsing. PMID:28804781

  5. A Python package for parsing, validating, mapping and formatting sequence variants using HGVS nomenclature

    PubMed Central

    Hart, Reece K.; Rico, Rudolph; Hare, Emily; Garcia, John; Westbrook, Jody; Fusaro, Vincent A.

    2015-01-01

    Summary: Biological sequence variants are commonly represented in scientific literature, clinical reports and databases of variation using the mutation nomenclature guidelines endorsed by the Human Genome Variation Society (HGVS). Despite the widespread use of the standard, no freely available, comprehensive programming libraries exist. Here we report an open-source and easy-to-use Python library that facilitates the parsing, manipulation, formatting and validation of variants according to the HGVS specification. The current implementation focuses on the subset of the HGVS recommendations that precisely describe sequence-level variation relevant to the application of high-throughput sequencing to clinical diagnostics. Availability and implementation: The package is released under the Apache 2.0 open-source license. Source code, documentation and issue tracking are available at http://bitbucket.org/hgvs/hgvs/. Python packages are available at PyPI (https://pypi.python.org/pypi/hgvs). Contact: reecehart@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25273102

  6. Parsing learning in networks using brain-machine interfaces.

    PubMed

    Orsborn, Amy L; Pesaran, Bijan

    2017-10-01

    Brain-machine interfaces (BMIs) define new ways to interact with our environment and hold great promise for clinical therapies. Motor BMIs, for instance, re-route neural activity to control movements of a new effector and could restore movement to people with paralysis. Increasing experience shows that interfacing with the brain inevitably changes the brain. BMIs engage and depend on a wide array of innate learning mechanisms to produce meaningful behavior. BMIs precisely define the information streams into and out of the brain, but engage widespread learning. We take a network perspective and review existing observations of learning in motor BMIs to show that BMIs engage multiple learning mechanisms distributed across neural networks. Recent studies demonstrate the advantages of BMI for parsing this learning and its underlying neural mechanisms. BMIs therefore provide a powerful tool for studying the neural mechanisms of learning that highlights the critical role of learning in engineered neural therapies. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Analyzing JAVAD TR-G2 GPS Receiver's Sensitivities to SLS Trajectory

    NASA Technical Reports Server (NTRS)

    Schuler, Tristan

    2017-01-01

    Automated guidance and navigation systems are an integral part of successful space missions. Previous researchers created Python tools to receive and parse data from a JAVAD TR-G2 space-capable GPS receiver. I improved the tools by customizing the output for plotting and comparing several simulations. I analyzed position errors, data loss, and signal loss by comparing simulated receiver data from an IFEN GPS simulator to ‘truth data’ from a proposed trajectory. By adjusting the trajectory simulation’s gain, attitude, and start time, NASA can assess the best time to launch the SLS, where to position the antennas on the Block 1-B, and which filter to use. Some additional testing has begun with the Novatel SpaceQuestGPS receiver as well as a GNSS SDR receiver.

  8. Bridging the Gap: Towards a Cell-Type Specific Understanding of Neural Circuits Underlying Fear Behaviors

    PubMed Central

    McCullough, KM; Morrison, FG; Ressler, KJ

    2016-01-01

    Fear and anxiety-related disorders are remarkably common and debilitating, and are often characterized by dysregulated fear responses. Rodent models of fear learning and memory have taken great strides towards elucidating the specific neuronal circuitries underlying the learning of fear responses. The present review addresses recent research utilizing optogenetic approaches to parse circuitries underlying fear behaviors. It also highlights the powerful advances made when optogenetic techniques are utilized in a genetically defined, cell-type specific, manner. The application of next-generation genetic and sequencing approaches in a cell-type specific context will be essential for a mechanistic understanding of the neural circuitry underlying fear behavior and for the rational design of targeted, circuit specific, pharmacologic interventions for the treatment and prevention of fear-related disorders. PMID:27470092

  9. CLOUDCLOUD : general-purpose instrument monitoring and data managing software

    NASA Astrophysics Data System (ADS)

    Dias, António; Amorim, António; Tomé, António

    2016-04-01

    An effective experiment depends on the ability to store and deliver data and information to all participant parties regardless of their degree of involvement in the specific parts that make the experiment a whole. Having fast, efficient and ubiquitous access to data will increase visibility and discussion, such that the outcome will have already been reviewed several times, strengthening the conclusions. The CLOUD project aims to provide users with a general purpose data acquisition, management and instrument monitoring platform that is fast, easy to use, lightweight and accessible to all participants of an experiment. This work is now implemented in the CLOUD experiment at CERN and will be fully integrated with the experiment as of 2016. Despite being used in an experiment of the scale of CLOUD, this software can also be used in experiments or monitoring stations of any size, from single computers to large networks of computers, to monitor any sort of instrument output without influencing the individual instrument's DAQ. Instrument data and metadata are stored and accessed via a specially designed database architecture, and any type of instrument output is accepted using our continuously growing parsing application. Multiple databases can be used to separate different data-taking periods, or a single database can be used if, for instance, an experiment is continuous. A simple web-based application gives the user total control over the monitored instruments and their data, allowing data visualization and download, upload of processed data and the ability to edit existing instruments or add new instruments to the experiment. When in a network, new computers are immediately recognized and added to the system and are able to monitor instruments connected to them. Automatic computer integration is achieved by a locally running Python-based parsing agent that communicates with a main server application, guaranteeing that all instruments assigned to that computer are monitored with parsing intervals as fast as milliseconds. This software (server+agents+interface+database) comes in easy and ready-to-use packages that can be installed on any operating system, including Android and iOS systems. This software is ideal for use in modular experiments or monitoring stations with large variability in instruments and measuring methods, or in large collaborations, where data require homogenization in order to be effectively transmitted to all involved parties. This work presents the software and provides a performance comparison with previously used monitoring systems in the CLOUD experiment at CERN.

  10. Exploration of picture grammars, grammar learning, and inductive logic programming for image understanding

    NASA Astrophysics Data System (ADS)

    Ducksbury, P. G.; Kennedy, C.; Lock, Z.

    2003-09-01

    Grammars have been used for the formal specification of programming languages, and there are a number of commercial products which now use grammars. However, these have tended to be focused mainly on flow control type applications. In this paper, we consider the potential use of picture grammars and inductive logic programming in generic image understanding applications, such as object recognition. A number of issues are considered, such as what type of grammar needs to be used, how to construct the grammar with its associated attributes, difficulties encountered with parsing grammars followed by issues of automatically learning grammars using a genetic algorithm. The concept of inductive logic programming is then introduced as a method that can overcome some of the earlier difficulties.

  11. Biological network extraction from scientific literature: state of the art and challenges.

    PubMed

    Li, Chen; Liakata, Maria; Rebholz-Schuhmann, Dietrich

    2014-09-01

    Networks of molecular interactions explain complex biological processes, and all known information on molecular events is contained in a number of public repositories including the scientific literature. Metabolic and signalling pathways are often viewed separately, even though both types are composed of interactions involving proteins and other chemical entities. It is necessary to be able to combine data from all available resources to judge the functionality, complexity and completeness of any given network overall, but especially the full integration of relevant information from the scientific literature is still an ongoing and complex task. Currently, the text-mining research community is steadily moving towards processing the full body of the scientific literature by making use of rich linguistic features such as full text parsing, to extract biological interactions. The next step will be to combine these with information from scientific databases to support hypothesis generation for the discovery of new knowledge and the extension of biological networks. The generation of comprehensive networks requires technologies such as entity grounding, coordination resolution and co-reference resolution, which are not fully solved and are required to further improve the quality of results. Here, we analyse the state of the art for the extraction of network information from the scientific literature and the evaluation of extraction methods against reference corpora, discuss challenges involved and identify directions for future research. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  12. Many-core computing for space-based stereoscopic imaging

    NASA Astrophysics Data System (ADS)

    McCall, Paul; Torres, Gildo; LeGrand, Keith; Adjouadi, Malek; Liu, Chen; Darling, Jacob; Pernicka, Henry

    The potential benefits of using parallel computing in real-time visual-based satellite proximity operations missions are investigated. Improvements in performance and relative navigation solutions over single thread systems can be achieved through multi- and many-core computing. Stochastic relative orbit determination methods benefit from the higher measurement frequencies, allowing them to more accurately determine the associated statistical properties of the relative orbital elements. More accurate orbit determination can lead to reduced fuel consumption and extended mission capabilities and duration. Inherent to the process of stereoscopic image processing is the difficulty of loading, managing, parsing, and evaluating large amounts of data efficiently, which may result in delays or highly time consuming processes for single (or few) processor systems or platforms. In this research we utilize the Single-Chip Cloud Computer (SCC), a fully programmable 48-core experimental processor, created by Intel Labs as a platform for many-core software research, provided with a high-speed on-chip network for sharing information along with advanced power management technologies and support for message-passing. The results from utilizing the SCC platform for the stereoscopic image processing application are presented in the form of Performance, Power, Energy, and Energy-Delay-Product (EDP) metrics. Also, a comparison between the SCC results and those obtained from executing the same application on a commercial PC are presented, showing the potential benefits of utilizing the SCC in particular, and any many-core platforms in general for real-time processing of visual-based satellite proximity operations missions.

  13. Semantic Support and Parallel Parsing in Chinese

    ERIC Educational Resources Information Center

    Hsieh, Yufen; Boland, Julie E.

    2015-01-01

    Two eye-tracking experiments were conducted using written Chinese sentences that contained a multi-word ambiguous region. The goal was to determine whether readers maintained multiple interpretations throughout the ambiguous region or selected a single interpretation at the point of ambiguity. Within the ambiguous region, we manipulated the…

  14. The Visual Syntax of Algebra.

    ERIC Educational Resources Information Center

    Kirshner, David

    1989-01-01

    A structured system of visual features is seen to parallel the propositional hierarchy of operations usually associated with the parsing of algebraic expressions. Women more than men were found to depend on these visual cues. Possible causes and consequences are discussed. Subjects were secondary and college students. (Author/DC)

  15. Instructional Implications of Inquiry in Reading Comprehension.

    ERIC Educational Resources Information Center

    Snow, David

    A contract deliverable on the NIE Communication Skills Project, this report consists of three separate documents describing the instructional implications of the analytic and empirical work carried out for the "Classroom Instruction in Reading Comprehension" part of the project: (1) Guidelines for Phrasal Segmentation; (2) Parsing Tasks…

  16. Memory Retrieval in Parsing and Interpretation

    ERIC Educational Resources Information Center

    Schlueter, Ananda Lila Zoe

    2017-01-01

    This dissertation explores the relationship between the parser and the grammar in error-driven retrieval by examining the mechanism underlying the illusory licensing of subject-verb agreement violations ("agreement attraction"). Previous work motivates a two-stage model of agreement attraction in which the parser predicts the verb's…

  17. Null Element Restoration

    ERIC Educational Resources Information Center

    Gabbard, Ryan

    2010-01-01

    Understanding the syntactic structure of a sentence is a necessary preliminary to understanding its semantics and therefore for many practical applications. The field of natural language processing has achieved a high degree of accuracy in parsing, at least in English. However, the syntactic structures produced by the most commonly used parsers…

  18. Lessons Learned in Part-of-Speech Tagging of Conversational Speech

    DTIC Science & Technology

    2010-10-01

    Excerpt from the references: "…for conversational speech recognition," in Plenary Meeting and Symposium on Prosody and Speech Processing; Slav Petrov and Dan Klein, 2007, "Improved inference for unlexicalized parsing," in HLT-NAACL; Slav Petrov, 2010, "Products of random latent variable grammars," in HLT-NAACL; Brian Roark, Yang Liu…

  19. Decoupling Object Detection and Categorization

    ERIC Educational Resources Information Center

    Mack, Michael L.; Palmeri, Thomas J.

    2010-01-01

    We investigated whether there exists a behavioral dependency between object detection and categorization. Previous work (Grill-Spector & Kanwisher, 2005) suggests that object detection and basic-level categorization may be the very same perceptual mechanism: As objects are parsed from the background they are categorized at the basic level. In…

  20. Conversational Simulation in Computer-Assisted Language Learning: Potential and Reality.

    ERIC Educational Resources Information Center

    Coleman, D. Wells

    1988-01-01

    Addresses the potential of conversational simulations for computer-assisted language learning (CALL) and reasons why this potential is largely untapped. Topics discussed include artificial intelligence; microworlds; parsing; realism versus reality in computer software; intelligent tutoring systems; and criteria to clarify what kinds of CALL…

  1. Using a Cultural and RDoC Framework to Conceptualize Anxiety in Asian Americans

    PubMed Central

    Liu, Huiting; Lieberman, Lynne; Stevens, Elizabeth; Auerbach, Randy P.; Shankman, Stewart A.

    2016-01-01

    Asian Americans are one of the fastest growing minority groups in the United States; however, mental health within this population segment, particularly anxiety disorders, remains significantly understudied. Both the heterogeneity within the Asian American population and the multidimensional nature of anxiety contribute to difficulties in understanding anxiety in this population. The present paper will review two sources of heterogeneity within anxiety in Asian Americans: (1) cultural variables and (2) mechanisms or components of anxiety. Specifically, we will examine four cultural variables most commonly found in research related to anxiety in Asian Americans: acculturation, affect valuation, loss of face, and individualism-collectivism. We will also discuss ways to parse anxiety through a Research Domain Criteria (RDoC) framework, specifically focusing on sensitivity to acute and potential threat, constructs within the Negative Valence System. We also present previously unpublished preliminary data to illustrate one way of examining ethnic differences in anxiety using an RDoC framework. Finally, this paper offers recommendations for future work in this area. PMID:27659553

  2. Using a cultural and RDoC framework to conceptualize anxiety in Asian Americans.

    PubMed

    Liu, Huiting; Lieberman, Lynne; Stevens, Elizabeth S; Auerbach, Randy P; Shankman, Stewart A

    2017-05-01

    Asian Americans are one of the fastest growing minority groups in the United States; however, mental health within this population segment, particularly anxiety disorders, remains significantly understudied. Both the heterogeneity within the Asian American population and the multidimensional nature of anxiety contribute to difficulties in understanding anxiety in this population. The present paper reviewed two sources of heterogeneity within anxiety in Asian Americans: (1) cultural variables and (2) mechanisms or components of anxiety. Specifically, we examined four cultural variables most commonly found in research related to anxiety in Asian Americans: acculturation, loss of face, affect valuation, and individualism-collectivism. We also discussed ways to parse anxiety through a Research Domain Criteria (RDoC) framework, specifically focusing on sensitivity to acute and potential threat, constructs within the Negative Valence System. Previously unpublished preliminary data were presented to illustrate one way of examining ethnic differences in anxiety using an RDoC framework. Finally, this paper offered recommendations for future work in this area. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. Domains of Social Support That Predict Bereavement Distress Following Homicide Loss.

    PubMed

    Bottomley, Jamison S; Burke, Laurie A; Neimeyer, Robert A

    2017-05-01

    Psychological adaptation following homicide loss can prove more challenging for grievers than other types of losses. Although social support can be beneficial in bereavement, research is mixed in terms of identifying whether it serves as a buffer to distress following traumatic loss. In particular, studies have not parsed out specific domains of social support that best predict positive bereavement outcomes. Recruiting a sample of 47 African Americans bereaved by homicide, we examined six types of social support along with the griever's perceived need for or satisfaction with each and analyzed them in relation to depression, anxiety, complicated grief, and posttraumatic stress disorder outcomes. Results of multivariate analyses revealed that the griever's level of satisfaction with physical assistance at the initial assessment best predicted lower levels of depression, anxiety, and posttraumatic stress disorder levels 6 months later, while less need for physical assistance predicted lower complicated grief at follow-up. Clinical implications and suggestions for future research are discussed.

  4. Using domain knowledge and domain-inspired discourse model for coreference resolution for clinical narratives

    PubMed Central

    Roth, Dan

    2013-01-01

    Objective This paper presents a coreference resolution system for clinical narratives. Coreference resolution aims to cluster all mentions in a single document into coherent entities. Materials and methods A knowledge-intensive approach for coreference resolution is employed. The domain knowledge used includes several domain-specific lists, knowledge-intensive mention parsing, and a task-informed discourse model. Mention parsing allows us to abstract over the surface form of the mention and represent each mention using a higher-level representation, which we call the mention's semantic representation (SR). SR reduces the mention to a standard form and hence provides better support for comparing and matching. Existing coreference resolution systems tend to ignore discourse aspects and rely heavily on lexical and structural cues in the text. The authors break from this tradition and present a discourse model for “person” type mentions in clinical narratives, which greatly simplifies the coreference resolution. Results This system was evaluated on four different datasets which were made available in the 2011 i2b2/VA coreference challenge. The unweighted average of F1 scores (over B-cubed, MUC and CEAF) varied from 84.2% to 88.1%. These experiments show that domain knowledge is effective for different mention types for all the datasets. Discussion Error analysis shows that most of the recall errors made by the system can be handled by further addition of domain knowledge. The precision errors, on the other hand, are more subtle and indicate the need to understand the relations in which mentions participate for building a robust coreference system. Conclusion This paper presents an approach that makes extensive use of domain knowledge to significantly improve coreference resolution. The authors state that their system and the knowledge sources developed will be made publicly available. PMID:22781192

  5. Mind wandering at the fingertips: automatic parsing of subjective states based on response time variability

    PubMed Central

    Bastian, Mikaël; Sackur, Jérôme

    2013-01-01

    Research from the last decade has successfully used two kinds of thought reports in order to assess whether the mind is wandering: random thought-probes and spontaneous reports. However, neither of these methods allows any assessment of the subjective state of the participant between two reports. In this paper, we present a step-by-step elaboration and testing of a continuous index, based on response time variability within Sustained Attention to Response Tasks (N = 106, for a total of 10 conditions). We first show that increased response time variability predicts mind wandering. We then compute a continuous index of response time variability throughout full experiments and show that the temporal position of a probe relative to the nearest local peak of the continuous index is predictive of mind wandering. This suggests that our index carries information about the subjective state of the subject even when he or she is not probed, and opens the way for on-line tracking of mind wandering. Finally, we proceed a step further and infer the internal attentional states on the basis of the variability of response times. To this end, we use the Hidden Markov Model framework, which allows us to estimate the durations of on-task and off-task episodes. PMID:24046753
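
    The final inference step can be sketched with the third-party hmmlearn package: a two-state Gaussian hidden Markov model is fit to a response-time series, from which on-task and off-task episodes are decoded. The synthetic data and parameter choices below are assumptions for illustration, not the study's materials.

        import numpy as np
        from hmmlearn.hmm import GaussianHMM

        rng = np.random.default_rng(0)
        # Synthetic reaction times: a stable block, then a variable block.
        rts = np.concatenate([rng.normal(0.40, 0.02, 200),
                              rng.normal(0.45, 0.10, 200)])[:, None]

        model = GaussianHMM(n_components=2, covariance_type="diag", n_iter=100)
        model.fit(rts)
        states = model.predict(rts)  # most likely latent state per trial
        print(np.bincount(states))   # time spent in each inferred state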

  6. Effects of research complexity and competition on the incidence and growth of coauthorship in biomedicine

    PubMed Central

    Wang, Xiaoyan; Laubenbacher, Reinhard C.

    2017-01-01

    Background Investigations into the factors behind coauthorship growth in biomedical research have mostly focused on specific disciplines or journals, and have rarely controlled for factors in combination or considered changes in their effects over time. Observers often attribute the growth to the increasing complexity or competition (or both) of research practices, but few attempts have been made to parse the contributions of these two likely causes. Objectives We aimed to assess the effects of complexity and competition on the incidence and growth of coauthorship, using a sample of the biomedical literature spanning multiple journals and disciplines. Methods Article-level bibliographic data from PubMed were combined with publicly available bibliometric data from Web of Science and SCImago over the years 1999–2007. Four predictors of coauthorship were selected: two (study type, topical scope of the study) associated with complexity and two (financial support for the project, popularity of the publishing journal) associated with competition. A negative binomial regression model was used to estimate the effects of each predictor on coauthorship incidence and growth. A second, mixed-effect model included the journal as a random effect. Results Coauthorship increased at about one author per article per decade. Clinical trials, supported research, and research of broader scope produced articles with more authors, while review articles credited fewer; and more popular journals published higher-authorship articles. Incidence and growth rates varied widely across journals and were themselves uncorrelated. Most effects remained statistically discernible after controlling for the publishing journal. The effects of complexity-associated factors held constant or diminished over time, while competition-related effects strengthened. These trends were similar in size but not discernible from subject-specific subdata. Conclusions Coauthorship incidence rates are multifactorial and vary with factors associated with both complexity and competition. Coauthorship growth is likewise multifactorial and increasingly associated with research competition. PMID:28329003
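
    The regression step can be sketched with statsmodels; the synthetic predictors, coefficients, and sample below are assumptions standing in for the study's bibliographic data.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(1)
        n = 500
        X = sm.add_constant(np.column_stack([
            rng.integers(0, 2, n),   # clinical trial (complexity proxy)
            rng.integers(0, 2, n),   # financial support (competition proxy)
        ]))
        authors = rng.poisson(np.exp(X @ np.array([1.2, 0.3, 0.2])))

        # Negative binomial regression of author counts on the predictors.
        model = sm.GLM(authors, X, family=sm.families.NegativeBinomial())
        print(model.fit().summary())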

  7. Automated real-time search and analysis algorithms for a non-contact 3D profiling system

    NASA Astrophysics Data System (ADS)

    Haynes, Mark; Wu, Chih-Hang John; Beck, B. Terry; Peterman, Robert J.

    2013-04-01

    The purpose of this research is to develop a new means of identifying and extracting geometrical feature statistics from a non-contact precision-measurement 3D profilometer. Autonomous algorithms have been developed to search through large-scale Cartesian point clouds to identify and extract geometrical features. These algorithms are developed with the intent of providing real-time production quality control of cold-rolled steel wires. The steel wires in question are prestressing steel reinforcement wires for concrete members. The geometry of the wire is critical in the performance of the overall concrete structure. For this research, a custom 3D non-contact profilometry system has been developed that utilizes laser displacement sensors for submicron resolution surface profiling. Optimizations in the control and sensory system allow for data points to be collected at up to approximately 400,000 points per second. In order to achieve geometrical feature extraction and tolerancing with this large volume of data, the algorithms employed are optimized for parsing large data quantities. The methods used provide a unique means of maintaining high-resolution data of the surface profiles while keeping algorithm running times within practical bounds for industrial application. By a combination of regional sampling, iterative search, spatial filtering, frequency filtering, spatial clustering, and template matching, a robust feature identification method has been developed. These algorithms provide an autonomous means of verifying tolerances in geometrical features. The key method of identifying the features is through a combination of downhill simplex and geometrical feature templates. By performing downhill simplex through several procedural programming layers of different search and filtering techniques, very specific geometrical features can be identified within the point cloud and analyzed for proper tolerancing. Being able to perform this quality control in real time provides significant cost-saving opportunities in both equipment protection and waste minimization.
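
    The key identification step, downhill simplex against a geometrical feature template, can be sketched as follows. The template (a Gaussian-shaped surface feature) and the profile data are invented stand-ins for the production system's wire-geometry templates.

        import numpy as np
        from scipy.optimize import minimize

        x = np.linspace(-1.0, 1.0, 400)
        profile = 0.05 * np.exp(-(x - 0.1) ** 2 / 0.02)  # synthetic feature

        def template(params):
            center, depth, width = params
            return depth * np.exp(-(x - center) ** 2 / width)

        def misfit(params):
            # Sum of squared residuals between data and template.
            return np.sum((profile - template(params)) ** 2)

        result = minimize(misfit, x0=[0.0, 0.03, 0.05], method="Nelder-Mead")
        print(result.x)  # recovered center, depth, width of the feature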

  8. Haystack, a web-based tool for metabolomics research

    PubMed Central

    2014-01-01

    Background Liquid chromatography coupled to mass spectrometry (LCMS) has become a widely used technique in metabolomics research for differential profiling, the broad screening of biomolecular constituents across multiple samples to diagnose phenotypic differences and elucidate relevant features. However, a significant limitation in LCMS-based metabolomics is the high-throughput data processing required for robust statistical analysis and data modeling for large numbers of samples with hundreds of unique chemical species. Results To address this problem, we developed Haystack, a web-based tool designed to visualize, parse, filter, and extract significant features from LCMS datasets rapidly and efficiently. Haystack runs in a browser environment with an intuitive graphical user interface that provides both display and data processing options. Total ion chromatograms (TICs) and base peak chromatograms (BPCs) are automatically displayed, along with time-resolved mass spectra and extracted ion chromatograms (EICs) over any mass range. Output files in the common .csv format can be saved for further statistical analysis or customized graphing. Haystack's core function is a flexible binning procedure that converts the mass dimension of the chromatogram into a set of interval variables that can uniquely identify a sample. Binned mass data can be analyzed by exploratory methods such as principal component analysis (PCA) to model class assignment and identify discriminatory features. The validity of this approach is demonstrated by comparison of a dataset from plants grown at two light conditions with manual and automated peak detection methods. Haystack successfully predicted class assignment based on PCA and cluster analysis, and identified discriminatory features based on analysis of EICs of significant bins. Conclusion Haystack, a new online tool for rapid processing and analysis of LCMS-based metabolomics data is described. It offers users a range of data visualization options and supports non-biased differential profiling studies through a unique and flexible binning function that provides an alternative to conventional peak deconvolution analysis methods. PMID:25350247

  9. Haystack, a web-based tool for metabolomics research.

    PubMed

    Grace, Stephen C; Embry, Stephen; Luo, Heng

    2014-01-01

    Liquid chromatography coupled to mass spectrometry (LCMS) has become a widely used technique in metabolomics research for differential profiling, the broad screening of biomolecular constituents across multiple samples to diagnose phenotypic differences and elucidate relevant features. However, a significant limitation in LCMS-based metabolomics is the high-throughput data processing required for robust statistical analysis and data modeling for large numbers of samples with hundreds of unique chemical species. To address this problem, we developed Haystack, a web-based tool designed to visualize, parse, filter, and extract significant features from LCMS datasets rapidly and efficiently. Haystack runs in a browser environment with an intuitive graphical user interface that provides both display and data processing options. Total ion chromatograms (TICs) and base peak chromatograms (BPCs) are automatically displayed, along with time-resolved mass spectra and extracted ion chromatograms (EICs) over any mass range. Output files in the common .csv format can be saved for further statistical analysis or customized graphing. Haystack's core function is a flexible binning procedure that converts the mass dimension of the chromatogram into a set of interval variables that can uniquely identify a sample. Binned mass data can be analyzed by exploratory methods such as principal component analysis (PCA) to model class assignment and identify discriminatory features. The validity of this approach is demonstrated by comparison of a dataset from plants grown at two light conditions with manual and automated peak detection methods. Haystack successfully predicted class assignment based on PCA and cluster analysis, and identified discriminatory features based on analysis of EICs of significant bins. Haystack, a new online tool for rapid processing and analysis of LCMS-based metabolomics data is described. It offers users a range of data visualization options and supports non-biased differential profiling studies through a unique and flexible binning function that provides an alternative to conventional peak deconvolution analysis methods.
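
    Haystack's core binning idea, described in the two records above, can be illustrated offline: collapse each chromatogram's mass dimension into fixed-width bins of summed intensity, then explore the samples with PCA. The synthetic spectra and arbitrary 10 Da bin width below are assumptions, not the web tool's defaults.

        import numpy as np
        from sklearn.decomposition import PCA

        rng = np.random.default_rng(2)
        mz = rng.uniform(100, 1000, size=(6, 5000))   # six samples' ion m/z
        intensity = rng.exponential(1.0, size=mz.shape)

        edges = np.arange(100, 1001, 10)  # 10 Da mass bins
        binned = np.stack([np.histogram(m, bins=edges, weights=w)[0]
                           for m, w in zip(mz, intensity)])

        scores = PCA(n_components=2).fit_transform(binned)
        print(scores)  # coordinates used to inspect class structure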

  10. Computational Linguistics in the Netherlands 1996. Papers from the CLIN Meeting (7th, Eindhoven, Netherlands, November 15, 1996).

    ERIC Educational Resources Information Center

    Landsbergen, Jan, Ed.; Odijk, Jan, Ed.; van Deemter, Kees, Ed.; van Zanten, Gert Veldhuijzen, Ed.

    Papers from the meeting on computational linguistics include: "Conversational Games, Belief Revision and Bayesian Networks" (Stephen G. Pulman); "Valence Alternation without Lexical Rules" (Gosse Bouma); "Filtering Left Dislocation Chains in Parsing Categorical Grammar" (Crit Cremers, Maarten Hijzelendoorn);…

  11. Developing a Large Lexical Database for Information Retrieval, Parsing, and Text Generation Systems.

    ERIC Educational Resources Information Center

    Conlon, Sumali Pin-Ngern; And Others

    1993-01-01

    Important characteristics of lexical databases and their applications in information retrieval and natural language processing are explained. An ongoing project using various machine-readable sources to build a lexical database is described, and detailed designs of individual entries with examples are included. (Contains 66 references.) (EAM)

  12. Interference Effects from Grammatically Unavailable Constituents during Sentence Processing

    ERIC Educational Resources Information Center

    Van Dyke, Julie A.

    2007-01-01

    Evidence from 3 experiments reveals interference effects from structural relationships that are inconsistent with any grammatical parse of the perceived input. Processing disruption was observed when items occurring between a head and a dependent overlapped with either (or both) syntactic or semantic features of the dependent. Effects of syntactic…

  13. From Internationalisation to Education for Global Citizenship: A Multi-Layered History

    ERIC Educational Resources Information Center

    Haigh, Martin

    2014-01-01

    The evolving narrative on internationalisation in higher education is complex and multi-layered. This overview explores the evolution of thinking about internationalisation among different stakeholder groups in universities. It parses out eight coexisting layers that progress from concerns based largely upon institutional survival and competition…

  14. The Temporal Organization of Syllabic Structure

    ERIC Educational Resources Information Center

    Shaw, Jason A.

    2010-01-01

    This dissertation develops analytical tools which enable rigorous evaluation of competing syllabic parses on the basis of temporal patterns in speech production data. The data come from the articulographic tracking of fleshpoints on target speech organs, e.g., tongue, lips, jaw, in experiments with native speakers of American English and Moroccan…

  15. A Bootstrapped Approach to Multilingual Text Stream Parsing

    ERIC Educational Resources Information Center

    Londhe, Nikhil

    2017-01-01

    The ubiquitous hashtag has disruptively transformed how news stories are reported and shared across social media networks. Often, such text streams are massively multilingual with 50 different languages on an average and contain a combination of subjective user opinion, objective evolving information about the story and unrelated spam. This is in…

  16. Acquisition by Processing Theory: A Theory of Everything?

    ERIC Educational Resources Information Center

    Carroll, Susanne E.

    2004-01-01

    Truscott and Sharwood Smith (henceforth T&SS) propose a novel theory of language acquisition, "Acquisition by Processing Theory" (APT), designed to account for both first and second language acquisition, monolingual and bilingual speech perception and parsing, and speech production. This is a tall order. Like any theoretically ambitious…

  17. E-Learning for Depth in the Semantic Web

    ERIC Educational Resources Information Center

    Shafrir, Uri; Etkind, Masha

    2006-01-01

    In this paper, we describe concept parsing algorithms, a novel semantic analysis methodology at the core of a new pedagogy that focuses learners attention on deep comprehension of the conceptual content of learned material. Two new e-learning tools are described in some detail: interactive concept discovery learning and meaning equivalence…

  18. Neural Encoding of Relative Position

    ERIC Educational Resources Information Center

    Hayworth, Kenneth J.; Lescroart, Mark D.; Biederman, Irving

    2011-01-01

    Late ventral visual areas generally consist of cells having a significant degree of translation invariance. Such a "bag of features" representation is useful for the recognition of individual objects; however, it seems unable to explain our ability to parse a scene into multiple objects and to understand their spatial relationships. We…

  19. Income Sustainability through Educational Attainment

    ERIC Educational Resources Information Center

    Carlson, Ronald H.; McChesney, Christopher S.

    2015-01-01

    The authors examined the sustainability of income, as it relates to educational attainment, from the two recent decades, which includes three significant economic downturns. The data was analyzed to determine trends in the wealth gap, parsed by educational attainment and gender. Utilizing the data from 1991 through 2010, predictions in changes in…

  20. TOC as a regional sediment condition indicator: Parsing effects of grain size and organic content

    EPA Science Inventory

    TOC content of sediments is often used as an indicator of benthic condition. Percent TOC is generally positively correlated with sediment percent fines. While sediment grain size may have impacts on benthic organisms independent of organic content, it is often not explicitly co...

  1. TOC as a regional sediment condition indicator: Parsing effects of grain size and organic content

    EPA Science Inventory

    TOC content of sediments is often used as an indicator of benthic condition. Percent TOC is generally positively correlated with sediment percent fines. While sediment grain size may have impacts on benthic organisms independent of organic content, it is often not explicitly co...

  2. The Metamorphosis of the Statistical Segmentation Output: Lexicalization during Artificial Language Learning

    ERIC Educational Resources Information Center

    Fernandes, Tania; Kolinsky, Regine; Ventura, Paulo

    2009-01-01

    This study combined artificial language learning (ALL) with conventional experimental techniques to test whether statistical speech segmentation outputs are integrated into adult listeners' mental lexicon. Lexicalization was assessed through inhibitory effects of novel neighbors (created by the parsing process) on auditory lexical decisions to…

  3. Computer-Assisted Analysis of Written Language: Assessing the Written Language of Deaf Children, II.

    ERIC Educational Resources Information Center

    Parkhurst, Barbara G.; MacEachron, Marion P.

    1980-01-01

    Two pilot studies investigated the accuracy of a computer parsing system for analyzing written language of deaf children. Results of the studies showed good agreement between human and machine raters. Journal availability: Elsevier North Holland, Inc., 52 Vanderbilt Avenue, New York, NY 10017. (Author)

  4. Comorbid Social Anxiety Disorder in Adults with Autism Spectrum Disorder

    ERIC Educational Resources Information Center

    Maddox, Brenna B.; White, Susan W.

    2015-01-01

    Social anxiety symptoms are common among cognitively unimpaired youth with autism spectrum disorder (ASD). Few studies have investigated the co-occurrence of social anxiety disorder (SAD) in adults with ASD, although identification may aid access to effective treatments and inform our scientific efforts to parse heterogeneity. In this preliminary…

  5. Parsing the Relations of Race and Socioeconomic Status in Special Education Disproportionality

    ERIC Educational Resources Information Center

    Kincaid, Aleksis P.; Sullivan, Amanda L.

    2017-01-01

    This study investigated how student and school-level socioeconomic status (SES) measures predict students' odds of being identified for special education, particularly high-incidence disabilities. Using the Early Childhood Longitudinal Study--Kindergarten cohort, hierarchical models were used to determine the relations of student and school SES to…

  6. Netmeld v. 1.0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    BERG, MICHAEL; RILEY, MARSHALL

    System assessments typically yield large quantities of data from disparate sources for an analyst to scrutinize for issues. Netmeld is used to parse input from different file formats, store the data in a common format, allow users to easily query it, and enable analysts to tie different analysis tools together using a common back-end.

  7. A Multiple-Channel Model of Task-Dependent Ambiguity Resolution in Sentence Comprehension

    ERIC Educational Resources Information Center

    Logacev, Pavel; Vasishth, Shravan

    2016-01-01

    Traxler, Pickering, and Clifton (1998) found that ambiguous sentences are read faster than their unambiguous counterparts. This so-called "ambiguity advantage" has presented a major challenge to classical theories of human sentence comprehension (parsing) because its most prominent explanation, in the form of the unrestricted race model…

  8. Conversational Coherency. Technical Report No. 95.

    ERIC Educational Resources Information Center

    Reichman, Rachel

    To analyze the process involved in maintaining conversational coherency, the study described in this paper used a construct called a "context space" that grouped utterances referring to a single issue or episode. The paper defines the types of context spaces, parses individual conversations to identify the underlying model or structure,…

  9. A Flexible, Extensible Online Testing System for Mathematics

    ERIC Educational Resources Information Center

    Passmore, Tim; Brookshaw, Leigh; Butler, Harry

    2011-01-01

    An online testing system developed for entry-skills testing of first-year university students in algebra and calculus is described. The system combines the open-source computer algebra system "Maxima" with computer scripts to parse student answers, which are entered using standard mathematical notation and conventions. The answers can…
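
    In the same spirit, a hedged sketch using SymPy in place of Maxima: the student's answer is parsed from standard mathematical notation and tested for symbolic equivalence against the model answer. The expressions are invented examples.

        import sympy
        from sympy.parsing.sympy_parser import parse_expr

        model_answer = parse_expr("2*x**2 + 4*x + 2")
        student_answer = parse_expr("2*(x + 1)**2")

        # Equivalent if and only if the difference simplifies to zero.
        print(sympy.simplify(model_answer - student_answer) == 0)  # True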

  10. Modelling Parsing Constraints with High-Dimensional Context Space.

    ERIC Educational Resources Information Center

    Burgess, Curt; Lund, Kevin

    1997-01-01

    Presents a model of high-dimensional context space, the Hyperspace Analogue to Language (HAL), with a series of simulations modelling human empirical results. Proposes that HAL's context space can be used to provide a basic categorization of semantic and grammatical concepts; model certain aspects of morphological ambiguity in verbs; and provide…

  11. Event Segmentation Improves Event Memory up to One Month Later

    ERIC Educational Resources Information Center

    Flores, Shaney; Bailey, Heather R.; Eisenberg, Michelle L.; Zacks, Jeffrey M.

    2017-01-01

    When people observe everyday activity, they spontaneously parse it into discrete meaningful events. Individuals who segment activity in a more normative fashion show better subsequent memory for the events. If segmenting events effectively leads to better memory, does asking people to attend to segmentation improve subsequent memory? To answer…

  12. The Neural Basis of Speech Parsing in Children and Adults

    ERIC Educational Resources Information Center

    McNealy, Kristin; Mazziotta, John C.; Dapretto, Mirella

    2010-01-01

    Word segmentation, detecting word boundaries in continuous speech, is a fundamental aspect of language learning that can occur solely by the computation of statistical and speech cues. Fifty-four children underwent functional magnetic resonance imaging (fMRI) while listening to three streams of concatenated syllables that contained either high…

  13. E-Learning Systems Requirements Elicitation: Perspectives and Considerations

    ERIC Educational Resources Information Center

    AlKhuder, Shaikha B.; AlAli, Fatma H.

    2017-01-01

    Training and education have evolved far beyond black boards and chalk boxes. The environment of knowledge exchange requires more than simple materials and assessments. This article is an attempt of parsing through the different aspects of e-learning, understanding the real needs, and conducting the right requirements to build the appropriate…

  14. Perceiving Goals and Actions in Individuals with Autism Spectrum Disorders

    ERIC Educational Resources Information Center

    Zalla, Tiziana; Labruyère, Nelly; Georgieff, Nicolas

    2013-01-01

    In the present study, we investigated the ability to parse familiar sequences of action into meaningful events in young individuals with autism spectrum disorders (ASDs), as compared to young individuals with typical development (TD) and young individuals with moderate mental retardation or learning disabilities (MLDs). While viewing two…

  15. SUBTLE: Situation Understanding Bot through Language and Environment

    DTIC Science & Technology

    2016-01-06

    Excerpt: …a 4-day “hackathon” by Stuart Young’s small robots group, which successfully ported the SUBTLE MURI NLP robot interface to the Packbot platform they… Null element restoration, a step typically ignored in NLP systems, allows for correct parsing of imperatives and questions, critical structures…

  16. Video segmentation using keywords

    NASA Astrophysics Data System (ADS)

    Ton-That, Vinh; Vong, Chi-Tai; Nguyen-Dao, Xuan-Truong; Tran, Minh-Triet

    2018-04-01

    At the DAVIS-2016 Challenge, many state-of-the-art video segmentation methods achieved promising results, but they still depend heavily on annotated frames to distinguish between background and foreground, and creating these frames exactly takes a great deal of time and effort. In this paper, we introduce a method to segment objects from video based on keywords given by the user. First, we use a real-time object detection system, YOLOv2, to identify regions containing objects whose labels match the given keywords in the first frame. Then, for each region identified in the previous step, we use the Pyramid Scene Parsing Network to assign each pixel as foreground or background. These frames can be used as input frames for the Object Flow algorithm to perform segmentation on the entire video. We conduct experiments on a subset of the DAVIS-2016 dataset at half its original size, which shows that our method can handle many popular classes in the PASCAL VOC 2012 dataset with acceptable accuracy, about 75.03%. We suggest broader testing, combining other methods, to improve this result in the future.

  17. A user-oriented web crawler for selectively acquiring online content in e-health research.

    PubMed

    Xu, Songhua; Yoon, Hong-Jun; Tourassi, Georgia

    2014-01-01

    Life stories of diseased and healthy individuals are abundantly available on the Internet. Collecting and mining such online content can offer many valuable insights into patients' physical and emotional states throughout the pre-diagnosis, diagnosis, treatment and post-treatment stages of the disease compared with those of healthy subjects. However, such content is widely dispersed across the web. Using traditional query-based search engines to manually collect relevant materials is rather labor intensive and often incomplete due to resource constraints in terms of human query composition and result parsing efforts. The alternative option, blindly crawling the whole web, has proven inefficient and unaffordable for e-health researchers. We propose a user-oriented web crawler that adaptively acquires user-desired content on the Internet to meet the specific online data source acquisition needs of e-health researchers. Experimental results on two cancer-related case studies show that the new crawler can substantially accelerate the acquisition of highly relevant online content compared with the existing state-of-the-art adaptive web crawling technology. For the breast cancer case study using the full training set, the new method achieves a cumulative precision between 74.7 and 79.4% after 5 h of execution till the end of the 20-h long crawling session as compared with the cumulative precision between 32.8 and 37.0% using the peer method for the same time period. For the lung cancer case study using the full training set, the new method achieves a cumulative precision between 56.7 and 61.2% after 5 h of execution till the end of the 20-h long crawling session as compared with the cumulative precision between 29.3 and 32.4% using the peer method. Using the reduced training set in the breast cancer case study, the cumulative precision of our method is between 44.6 and 54.9%, whereas the cumulative precision of the peer method is between 24.3 and 26.3%; for the lung cancer case study using the reduced training set, the cumulative precisions of our method and the peer method are, respectively, between 35.7 and 46.7% versus between 24.1 and 29.6%. These numbers clearly show a consistently superior accuracy of our method in discovering and acquiring user-desired online content for e-health research. The implementation of our user-oriented web crawler is freely available to non-commercial users via the following Web site: http://bsec.ornl.gov/AdaptiveCrawler.shtml. The Web site provides a step-by-step guide on how to execute the web crawler implementation. In addition, the Web site provides the two study datasets including manually labeled ground truth, initial seeds and the crawling results reported in this article.
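
    The general shape of such a crawler can be sketched as a best-first search over the web graph, with a priority queue ordered by page relevance. This generic sketch is not the authors' implementation: the seed URL, keyword scorer, and page limit are all assumptions, and the third-party requests package is used for fetching.

        import heapq
        import re
        import requests

        KEYWORDS = ("breast cancer", "diagnosis", "treatment")

        def relevance(text):
            text = text.lower()
            return sum(text.count(k) for k in KEYWORDS)

        def crawl(seeds, max_pages=20):
            frontier = [(0, url) for url in seeds]  # (negated score, url)
            seen, results = set(seeds), []
            while frontier and len(results) < max_pages:
                _, url = heapq.heappop(frontier)
                try:
                    html = requests.get(url, timeout=10).text
                except requests.RequestException:
                    continue
                score = relevance(html)
                results.append((score, url))
                for link in re.findall(r'href="(https?://[^"]+)"', html):
                    if link not in seen:
                        seen.add(link)  # queue unseen links best-first
                        heapq.heappush(frontier, (-score, link))
            return sorted(results, reverse=True)

        print(crawl(["https://www.cancer.gov/"]))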

  18. A fast and efficient python library for interfacing with the Biological Magnetic Resonance Data Bank.

    PubMed

    Smelter, Andrey; Astra, Morgan; Moseley, Hunter N B

    2017-03-17

    The Biological Magnetic Resonance Data Bank (BMRB) is a public repository of Nuclear Magnetic Resonance (NMR) spectroscopic data of biological macromolecules. It is an important resource for many researchers using NMR to study structural, biophysical, and biochemical properties of biological macromolecules. It is primarily maintained and accessed in a flat file ASCII format known as NMR-STAR. While the format is human readable, the size of most BMRB entries makes computer readability and explicit representation a practical requirement for almost any rigorous systematic analysis. To aid in the use of this public resource, we have developed a package called nmrstarlib in the popular open-source programming language Python. The nmrstarlib implementation is efficient in both design and execution. The library has facilities for reading and writing both NMR-STAR version 2.1 and 3.1 formatted files, parsing them into usable Python dictionary- and list-based data structures, making access and manipulation of the experimental data very natural within Python programs (i.e. "saveframe" and "loop" records represented as individual Python dictionary data structures). Another major advantage of this design is that data stored in the original NMR-STAR format can be easily converted into an equivalent JavaScript Object Notation (JSON) format, a lightweight data interchange format, facilitating data access and manipulation using Python and any other programming language that implements a JSON parser/generator (i.e., all popular programming languages). We have also developed tools to visualize assigned chemical shift values and to convert between NMR-STAR and JSONized NMR-STAR formatted files. Full API Reference Documentation, User Guide and Tutorial with code examples are also available. We have tested this new library on all current BMRB entries: 100% of all entries are parsed without any errors for both NMR-STAR version 2.1 and version 3.1 formatted files. We also compared our software to three currently available Python libraries for parsing NMR-STAR formatted files: PyStarLib, NMRPyStar, and PyNMRSTAR. The nmrstarlib package is a simple, fast, and efficient library for accessing data from the BMRB. The library provides an intuitive dictionary-based interface with which Python programs can read, edit, and write NMR-STAR formatted files and their equivalent JSONized NMR-STAR files. The nmrstarlib package can be used as a library for accessing and manipulating data stored in NMR-STAR files and as a command-line tool to convert from NMR-STAR file format into its equivalent JSON file format and vice versa, and to visualize chemical shift values. Furthermore, the nmrstarlib implementation provides a guide for effectively JSONizing other older scientific formats, improving the FAIRness of data in these formats.
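
    The dictionary-based design maps naturally onto JSON. The snippet below illustrates the idea with Python's standard json module; the nested structure shown is a hypothetical, simplified stand-in for a parsed NMR-STAR entry, not nmrstarlib's actual API.

    ```python
    import json

    # Hypothetical, simplified stand-in for one parsed NMR-STAR saveframe:
    # saveframes become dictionaries, loops become lists of row dictionaries.
    entry = {
        "save_entry_information": {
            "Entry.ID": "15000",
            "loop_Entry_author": [
                {"Ordinal": "1", "Family_name": "Doe"},
                {"Ordinal": "2", "Family_name": "Smith"},
            ],
        }
    }

    # Because the parsed form is plain dictionaries and lists, "JSONizing" is
    # a single call, and any language with a JSON parser can consume the result.
    print(json.dumps(entry, indent=2))
    ```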

  19. Processing voiceless vowels in Japanese: Effects of language-specific phonological knowledge

    NASA Astrophysics Data System (ADS)

    Ogasawara, Naomi

    2005-04-01

    There has been little research on processing allophonic variation in the field of psycholinguistics. This study focuses on processing the voiced/voiceless allophonic alternation of high vowels in Japanese. Three perception experiments were conducted to explore how listeners parse out vowels with the voicing alternation from other segments in the speech stream and how the different voicing statuses of the vowel affect listeners' word recognition process. The results from the three experiments show that listeners use phonological knowledge of their native language for phoneme processing and for word recognition. However, interactions of the phonological and acoustic effects are observed to be different in each process. The facilitatory phonological effect and the inhibitory acoustic effect cancel out one another in phoneme processing; while in word recognition, the facilitatory phonological effect overrides the inhibitory acoustic effect.

  20. An Experiment in Scientific Program Understanding

    NASA Technical Reports Server (NTRS)

    Stewart, Mark E. M.; Owen, Karl (Technical Monitor)

    2000-01-01

    This paper concerns a procedure that analyzes aspects of the meaning or semantics of scientific and engineering code. This procedure involves taking a user's existing code, adding semantic declarations for some primitive variables, and parsing this annotated code using multiple, independent expert parsers. These semantic parsers encode domain knowledge and recognize formulae in different disciplines including physics, numerical methods, mathematics, and geometry. The parsers will automatically recognize and document some static, semantic concepts and help locate some program semantic errors. Results are shown for three intensively studied codes and seven blind test cases; all test cases are state of the art scientific codes. These techniques may apply to a wider range of scientific codes. If so, the techniques could reduce the time, risk, and effort required to develop and modify scientific codes.
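
    One way to picture the "semantic declarations plus expert parsers" approach is a toy dimensional-analysis check: units are declared for primitive variables, and a recognized formula is verified to combine them consistently (a hypothetical illustration, not the paper's parser).

    ```python
    # Toy dimensional analysis: a unit is a dict of exponents over base dimensions.
    def mul(u, v):
        return {d: u.get(d, 0) + v.get(d, 0) for d in set(u) | set(v)}

    # Semantic declarations for primitive variables (kg, m, s base dimensions).
    UNITS = {
        "rho": {"kg": 1, "m": -3},            # density
        "u":   {"m": 1, "s": -1},             # velocity
        "p":   {"kg": 1, "m": -1, "s": -2},   # pressure
    }

    # A recognized formula: dynamic pressure q = 0.5 * rho * u * u.
    q = mul(UNITS["rho"], mul(UNITS["u"], UNITS["u"]))
    if q != UNITS["p"]:
        print("semantic error: q does not have units of pressure")
    else:
        print("q is dimensionally consistent with pressure:", q)
    ```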

  1. The effect of sample size and disease prevalence on supervised machine learning of narrative data.

    PubMed Central

    McKnight, Lawrence K.; Wilcox, Adam; Hripcsak, George

    2002-01-01

    This paper examines the independent effects of outcome prevalence and training sample size on inductive learning performance. We trained 3 inductive learning algorithms (MC4, IB, and Naïve-Bayes) on 60 simulated datasets of parsed radiology text reports labeled with 6 disease states. Data sets were constructed to define positive outcome states at five prevalence rates (1, 5, 10, 25, and 50%) in training set sizes of 200 and 2,000 cases. We found that the effect of outcome prevalence is significant when outcome classes drop below 10% of cases. The effect appeared independent of sample size, induction algorithm used, or class label. Work is needed to identify methods of improving classifier performance when output classes are rare. PMID:12463878
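
    The prevalence effect is easy to reproduce in miniature by training one classifier at several positive-class rates and watching minority-class recall degrade (a hedged sketch on scikit-learn's synthetic data, not the paper's parsed radiology reports).

    ```python
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.metrics import recall_score

    for prev in (0.01, 0.05, 0.10, 0.25, 0.50):
        # Synthetic two-class data with the positive class at rate `prev`.
        X, y = make_classification(n_samples=2000, weights=[1 - prev, prev],
                                   random_state=0)
        Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)
        clf = GaussianNB().fit(Xtr, ytr)
        print(f"prevalence {prev:.0%}: positive-class recall "
              f"{recall_score(yte, clf.predict(Xte)):.2f}")
    ```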

  2. cyvcf2: fast, flexible variant analysis with Python.

    PubMed

    Pedersen, Brent S; Quinlan, Aaron R

    2017-06-15

    Variant call format (VCF) files document the genetic variation observed after DNA sequencing, alignment and variant calling of a sample cohort. Given the complexity of the VCF format as well as the diverse variant annotations and genotype metadata, there is a need for fast, flexible methods enabling intuitive analysis of the variant data within VCF and BCF files. We introduce cyvcf2, a Python library and software package for fast parsing and querying of VCF and BCF files, and illustrate its speed, simplicity and utility. Contact: bpederse@gmail.com or aaronquinlan@gmail.com. cyvcf2 is available from https://github.com/brentp/cyvcf2 under the MIT license and from common Python package managers. Detailed documentation is available at http://brentp.github.io/cyvcf2/. © The Author 2017. Published by Oxford University Press.
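
    A minimal usage sketch (the file name is hypothetical; field access follows the library's documented attribute style):

    ```python
    from cyvcf2 import VCF

    # Iterate over variants in a (possibly bgzipped) VCF; "cohort.vcf.gz" is
    # a placeholder path. Core fields are attributes; INFO behaves like a mapping.
    for variant in VCF("cohort.vcf.gz"):
        depth = variant.INFO.get("DP")
        print(variant.CHROM, variant.POS, variant.REF, variant.ALT, depth)
    ```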

  3. The pediatric sepsis biomarker risk model: potential implications for sepsis therapy and biology.

    PubMed

    Alder, Matthew N; Lindsell, Christopher J; Wong, Hector R

    2014-07-01

    Sepsis remains a major cause of morbidity and mortality in adult and pediatric intensive care units. Heterogeneity of demographics, comorbidities, biological mechanisms, and severity of illness leads to difficulty in determining which patients are at highest risk of mortality. Determining mortality risk is important for weighing the potential benefits of more aggressive interventions and for deciding whom to enroll in clinical trials. Biomarkers can be used to parse patients into different risk categories and can outperform current methods of patient risk stratification based on physiologic parameters. Here we review the Pediatric Sepsis Biomarker Risk Model that has also been modified and applied to estimate mortality risk in adult patients. We compare the two models and speculate on the biological implications of the biomarkers in patients with sepsis.

  4. An Activation-Based Model of Sentence Processing as Skilled Memory Retrieval

    ERIC Educational Resources Information Center

    Lewis, Richard L.; Vasishth, Shravan

    2005-01-01

    We present a detailed process theory of the moment-by-moment working-memory retrievals and associated control structure that subserve sentence comprehension. The theory is derived from the application of independently motivated principles of memory and cognitive skill to the specialized task of sentence parsing. The resulting theory construes…

  5. Is Desegregation Dead? Parsing the Relationship between Achievement and Demographics

    ERIC Educational Resources Information Center

    Eaton, Susan; Rivkin, Steven

    2010-01-01

    The Supreme Court declared in 1954 that "separate educational facilities are inherently unequal." Into the 1970s, urban education reform focused predominantly on making sure that African American students had the opportunity to attend school with their white peers. Now, however, most reformers take as a given that the typical low-income minority…

  6. Reanalysis of Clause Boundaries in Japanese as a Constraint-Driven Process.

    ERIC Educational Resources Information Center

    Miyamoto, Edson T.

    2003-01-01

    Reports on two experiments that focus on clause boundaries in Japanese that suggest that minimal change restriction is unnecessary to characterize reanalysis. Proposes that the data and previous observations are more naturally explained by a constraint-driven model in which revisions are performed only when required by parsing constraints.…

  7. Deeper than Shallow: Evidence for Structure-Based Parsing Biases in Second-Language Sentence Processing

    ERIC Educational Resources Information Center

    Witzel, Jeffrey; Witzel, Naoko; Nicol, Janet

    2012-01-01

    This study examines the reading patterns of native speakers (NSs) and high-level (Chinese) nonnative speakers (NNSs) on three English sentence types involving temporarily ambiguous structural configurations. The reading patterns on each sentence type indicate that both NSs and NNSs were biased toward specific structural interpretations. These…

  8. Large Constituent Families Help Children Parse Compounds

    ERIC Educational Resources Information Center

    Krott, Andrea; Nicoladis, Elena

    2005-01-01

    The family size of the constituents of compound words, or the number of compounds sharing the constituents, has been shown to affect adults' access to compound words in the mental lexicon. The present study was designed to see if family size would affect children's segmentation of compounds. Twenty-five English-speaking children between 3;7 and…

  9. Neural Responses to the Production and Comprehension of Syntax in Identical Utterances

    ERIC Educational Resources Information Center

    Indefrey, Peter; Hellwig, Frauke; Herzog, Hans; Seitz, Rudiger J.; Hagoort, Peter

    2004-01-01

    Following up on an earlier positron emission tomography (PET) experiment (Indefrey et al., 2001), we used a scene description paradigm to investigate whether a posterior inferior frontal region subserving syntactic encoding for speaking is also involved in syntactic parsing during listening. In the language production part of the experiment,…

  10. Adaptations for English Language Learners: Differentiating between Linguistic and Instructional Accommodations

    ERIC Educational Resources Information Center

    Pappamihiel, N. Eleni; Lynn, C. Allen

    2016-01-01

    While many teachers and teacher educators in the United States K-12 system acknowledge that the English language learners (ELLs) in our schools need modifications and accommodations to help them succeed in school, few attempt to parse out how different types of accommodations may affect learning in the mainstream classroom, specifically linguistic…

  11. Maturation of Rapid Auditory Temporal Processing and Subsequent Nonword Repetition Performance in Children

    ERIC Educational Resources Information Center

    Fox, Allison M.; Reid, Corinne L.; Anderson, Mike; Richardson, Cassandra; Bishop, Dorothy V. M.

    2012-01-01

    According to the rapid auditory processing theory, the ability to parse incoming auditory information underpins learning of oral and written language. There is wide variation in this low-level perceptual ability, which appears to follow a protracted developmental course. We studied the development of rapid auditory processing using event-related…

  12. Graphemic Cohesion Effect in Reading and Writing Complex Graphemes

    ERIC Educational Resources Information Center

    Spinelli, Elsa; Kandel, Sonia; Guerassimovitch, Helena; Ferrand, Ludovic

    2012-01-01

    "AU" /o/ and "AN" /a/ in French are both complex graphemes, but they vary in their strength of association to their respective sounds. The letter sequence "AU" is systematically associated to the phoneme /o/, and as such is always parsed as a complex grapheme. However, "AN" can be associated with either one…

  13. Disfluencies along the Garden Path: Brain Electrophysiological Evidence of Disrupted Sentence Processing

    ERIC Educational Resources Information Center

    Maxfield, Nathan D.; Lyon, Justine M.; Silliman, Elaine R.

    2009-01-01

    Bailey and Ferreira (2003) hypothesized and reported behavioral evidence that disfluencies (filled and silent pauses) undesirably affect sentence processing when they appear before disambiguating verbs in Garden Path (GP) sentences. Disfluencies here cause the parser to "linger" on, and apparently accept as correct, an erroneous parse. Critically,…

  14. HyperText MARCup: A Conceptualization for Encoding, De-Constructing, Searching, Retrieving, and Using Traditional Knowledge Tools.

    ERIC Educational Resources Information Center

    Wall, C. Edward; And Others

    1995-01-01

    Discusses the integration of Standard Generalized Markup Language, Hypertext Markup Language, and MARC format to parse classified analytical bibliographies. Use of the resulting electronic knowledge constructs in local library systems as maps of a specified subset of resources is discussed, and an example is included. (LRW)

  15. Allocation of Limited Cognitive Resources during Text Comprehension in a Second Language

    ERIC Educational Resources Information Center

    Morishima, Yasunori

    2013-01-01

    For native (L1) comprehenders, lower-level language processes such as lexical access and parsing are considered to consume few cognitive resources. In contrast, these processes pose considerable demands for second-language (L2) comprehenders. Two reading-time experiments employing inconsistency detection found that English learners did not detect…

  16. Using Artificial Intelligence To Teach English to Deaf People. Final Report.

    ERIC Educational Resources Information Center

    Loritz, Donald; Zambrano, Robert

    This report describes a project to develop an English grammar-checking word processor intended for use by college students with hearing impairments. The project succeeded in its first objective, achievement of 92 percent parsing accuracy across the freely written compositions of college-bound deaf students. The second objective, ability to use the…

  17. On the Early Left-Anterior Negativity (ELAN) in Syntax Studies

    ERIC Educational Resources Information Center

    Steinhauer, Karsten; Drury, John E.

    2012-01-01

    Within the framework of Friederici's (2002) neurocognitive model of sentence processing, the early left anterior negativity (ELAN) in event-related potentials (ERPs) has been claimed to be a brain marker of syntactic first-pass parsing. As ELAN components seem to be exclusively elicited by word category violations (phrase structure violations),…

  18. Change of Academic Major: The Influence of Broad and Narrow Personality Traits

    ERIC Educational Resources Information Center

    Foster, N. A.

    2017-01-01

    The relationship between academic major change and ten personality traits (the Big Five and five narrow traits) was investigated in a sample of 437 college undergraduates. Contrary to expectations, Career Decidedness and Optimism were positively related to academic major change, regardless of class ranking. When parsing data by college year,…

  19. Immediate use of prosody and context in predicting a syntactic structure.

    PubMed

    Nakamura, Chie; Arai, Manabu; Mazuka, Reiko

    2012-11-01

    Numerous studies have reported an effect of prosodic information on parsing but whether prosody can impact even the initial parsing decision is still not evident. In a visual world eye-tracking experiment, we investigated the influence of contrastive intonation and visual context on processing temporarily ambiguous relative clause sentences in Japanese. Our results showed that listeners used the prosodic cue to make a structural prediction before hearing disambiguating information. Importantly, the effect was limited to cases where the visual scene provided an appropriate context for the prosodic cue, thus eliminating the explanation that listeners have simply associated marked prosodic information with a less frequent structure. Furthermore, the influence of the prosodic information was also evident following disambiguating information, in a way that reflected the initial analysis. The current study demonstrates that prosody, when provided with an appropriate context, influences the initial syntactic analysis and also the subsequent cost at disambiguating information. The results also provide first evidence for pre-head structural prediction driven by prosodic and contextual information with a head-final construction. Copyright © 2012 Elsevier B.V. All rights reserved.

  20. Conceptual plural information is used to guide early parsing decisions: Evidence from garden-path sentences with reciprocal verbs.

    PubMed

    Patson, Nikole D; Ferreira, Fernanda

    2009-05-01

    In three eyetracking studies, we investigated the role of conceptual plurality in initial parsing decisions in temporarily ambiguous sentences with reciprocal verbs (e.g., While the lovers kissed the baby played alone). We varied the subject of the first clause using three types of plural noun phrases: conjoined noun phrases (the bride and the groom), plural definite descriptions (the lovers), and numerically quantified noun phrases (the two lovers). We found no evidence for garden-path effects when the subject was conjoined (Ferreira & McClure, 1997), but traditional garden-path effects were found with the other plural noun phrases. In addition, we tested plural anaphors that had a plural antecedent present in the discourse. We found that when the antecedent was conjoined, garden-path effects were absent compared to cases in which the antecedent was a plural definite description. Our results indicate that the parser is sensitive to the conceptual representation of a plural constituent. In particular, it appears that a Complex Reference Object (Moxey et al., 2004) automatically activates a reciprocal reading of a reciprocal verb.

  1. Language experience changes subsequent learning

    PubMed Central

    Onnis, Luca; Thiessen, Erik

    2013-01-01

    What are the effects of experience on subsequent learning? We explored the effects of language-specific word order knowledge on the acquisition of sequential conditional information. Korean and English adults were engaged in a sequence learning task involving three different sets of stimuli: auditory linguistic (nonsense syllables), visual non-linguistic (nonsense shapes), and auditory non-linguistic (pure tones). The forward and backward probabilities between adjacent elements generated two equally probable and orthogonal perceptual parses of the elements, such that any significant preference at test must be due to either general cognitive biases, or prior language-induced biases. We found that language modulated parsing preferences with the linguistic stimuli only. Intriguingly, these preferences are congruent with the dominant word order patterns of each language, as corroborated by corpus analyses, and are driven by probabilistic preferences. Furthermore, although the Korean individuals had received extensive formal explicit training in English and lived in an English-speaking environment, they exhibited statistical learning biases congruent with their native language. Our findings suggest that mechanisms of statistical sequential learning are implicated in language across the lifespan, and experience with language may affect cognitive processes and later learning. PMID:23200510

  2. Two models of minimalist, incremental syntactic analysis.

    PubMed

    Stabler, Edward P

    2013-07-01

    Minimalist grammars (MGs) and multiple context-free grammars (MCFGs) are weakly equivalent in the sense that they define the same languages, a large mildly context-sensitive class that properly includes context-free languages. But in addition, for each MG, there is an MCFG which is strongly equivalent in the sense that it defines the same language with isomorphic derivations. However, the structure-building rules of MGs but not MCFGs are defined in a way that generalizes across categories. Consequently, MGs can be exponentially more succinct than their MCFG equivalents, and this difference shows in parsing models too. An incremental, top-down beam parser for MGs is defined here, sound and complete for all MGs, and hence also capable of parsing all MCFG languages. But since the parser represents its grammar transparently, the relative succinctness of MGs is again evident. Although the determinants of MG structure are narrowly and discretely defined, probabilistic influences from a much broader domain can influence even the earliest analytic steps, allowing frequency and context effects to come early and from almost anywhere, as expected in incremental models. Copyright © 2013 Cognitive Science Society, Inc.
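
    For intuition only, the sketch below is a toy incremental top-down beam parser over a probabilistic context-free grammar — a drastic simplification of the MG parser described above, with an invented grammar (left-recursive rules would make this naive version loop forever).

    ```python
    from heapq import nlargest

    # Toy PCFG: each nonterminal maps to (probability, right-hand side) rules.
    GRAMMAR = {
        "S":  [(1.0, ("NP", "VP"))],
        "NP": [(0.7, ("the", "N")), (0.3, ("N",))],
        "VP": [(1.0, ("V", "NP"))],
        "N":  [(0.5, ("dog",)), (0.5, ("cat",))],
        "V":  [(1.0, ("sees",))],
    }

    def beam_parse(words, beam=5):
        """Return the probability of the best derivation found, else None.
        Each state is (probability, stack of symbols to expand, input index)."""
        states = [(1.0, ("S",), 0)]
        while states:
            next_states = []
            for prob, stack, i in states:
                if not stack:
                    if i == len(words):
                        return prob        # complete derivation, all words consumed
                    continue               # dead end: empty stack, unread input
                top, rest = stack[0], stack[1:]
                if top in GRAMMAR:         # nonterminal: predict every expansion
                    for p, rhs in GRAMMAR[top]:
                        next_states.append((prob * p, rhs + rest, i))
                elif i < len(words) and words[i] == top:
                    next_states.append((prob, rest, i + 1))   # terminal: scan
            # Prune: keep only the `beam` most probable partial derivations.
            states = nlargest(beam, next_states, key=lambda s: s[0])
        return None

    print(beam_parse("the dog sees the cat".split()))   # ≈ 0.1225 (0.7·0.5·0.7·0.5)
    ```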

  3. Text data extraction for a prospective, research-focused data mart: implementation and validation

    PubMed Central

    2012-01-01

    Background Translational research typically requires data abstracted from medical records as well as data collected specifically for research. Unfortunately, many data within electronic health records are represented as text that is not amenable to aggregation for analyses. We present a scalable open source SQL Server Integration Services package, called Regextractor, for including regular expression parsers into a classic extract, transform, and load workflow. We have used Regextractor to abstract discrete data from textual reports from a number of ‘machine generated’ sources. To validate this package, we created a pulmonary function test data mart and analyzed the quality of the data mart versus manual chart review. Methods Eleven variables from pulmonary function tests performed closest to the initial clinical evaluation date were studied for 100 randomly selected subjects with scleroderma. One research assistant manually reviewed, abstracted, and entered relevant data into a database. Correlation with data obtained from the automated pulmonary function test data mart within the Northwestern Medical Enterprise Data Warehouse was determined. Results There was a near perfect (99.5%) agreement between results generated from the Regextractor package and those obtained via manual chart abstraction. The pulmonary function test data mart has been used subsequently to monitor disease progression of patients in the Northwestern Scleroderma Registry. In addition to the pulmonary function test example presented in this manuscript, the Regextractor package has been used to create cardiac catheterization and echocardiography data marts. The Regextractor package was released as open source software in October 2009 and has been downloaded 552 times as of 6/1/2012. Conclusions Collaboration between clinical researchers and biomedical informatics experts enabled the development and validation of a tool (Regextractor) to parse, abstract and assemble structured data from text data contained in the electronic health record. Regextractor has been successfully used to create additional data marts in other medical domains and is available to the public. PMID:22970696
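
    The regular-expression extraction step can be pictured as follows; the pattern and report text are hypothetical, and Regextractor itself packages such parsers into a SQL Server Integration Services workflow rather than standalone Python.

    ```python
    import re

    # Hypothetical pattern for one pulmonary function value: FEV1 in liters,
    # e.g. "FEV1: 2.31 L (78% predicted)". A real data mart needs one validated
    # parser per variable, checked against manual chart review.
    FEV1 = re.compile(r"FEV1[:\s]+(?P<liters>\d+\.\d+)\s*L.*?\((?P<pct>\d+)%\s*predicted\)")

    report = "Spirometry today. FEV1: 2.31 L (78% predicted). FVC normal."
    m = FEV1.search(report)
    if m:
        print(float(m.group("liters")), int(m.group("pct")))   # 2.31 78
    ```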

  4. A resource-saving collective approach to biomedical semantic role labeling

    PubMed Central

    2014-01-01

    Background Biomedical semantic role labeling (BioSRL) is a natural language processing technique that identifies the semantic roles of the words or phrases in sentences describing biological processes and expresses them as predicate-argument structures (PAS’s). Currently, a major problem of BioSRL is that most systems label every node in a full parse tree independently; however, some nodes always exhibit dependency. In general SRL, collective approaches based on the Markov logic network (MLN) model have been successful in dealing with this problem. However, in BioSRL such an approach has not been attempted because it would require more training data to recognize the more specialized and diverse terms found in biomedical literature, increasing training time and computational complexity. Results We first constructed a collective BioSRL system based on MLN. This system, called collective BIOSMILE (CBIOSMILE), is trained on the BioProp corpus. To reduce the resources used in BioSRL training, we employ a tree-pruning filter to remove unlikely nodes from the parse tree and four argument candidate identifiers to retain candidate nodes in the tree. Nodes not recognized by any candidate identifier are discarded. The pruned annotated parse trees are used to train a resource-saving MLN-based system, which is referred to as resource-saving collective BIOSMILE (RCBIOSMILE). Our experimental results show that our proposed CBIOSMILE system outperforms BIOSMILE, which is the top BioSRL system. Furthermore, our proposed RCBIOSMILE maintains the same level of accuracy as CBIOSMILE using 92% less memory and 57% less training time. Conclusions This greatly improved efficiency makes RCBIOSMILE potentially suitable for training on much larger BioSRL corpora over more biomedical domains. Compared to real-world biomedical corpora, BioProp is relatively small, containing only 445 MEDLINE abstracts and 30 event triggers. It is not large enough for practical applications, such as pathway construction. We consider it of primary importance to pursue SRL training on large corpora in the future. PMID:24884358
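
    The tree-pruning filter combined with candidate identifiers can be sketched generically: a subtree survives only if some identifier accepts its root (a schematic illustration with invented labels, not the CBIOSMILE implementation).

    ```python
    class Node:
        def __init__(self, label, children=()):
            self.label, self.children = label, list(children)

    def prune(node, identifiers):
        """Keep a child subtree only if some candidate identifier accepts its root."""
        kept = [prune(c, identifiers) for c in node.children
                if any(test(c) for test in identifiers)]
        return Node(node.label, kept)

    # Invented identifiers: keep noun/verb phrases and sentence-level nodes.
    identifiers = [lambda n: n.label in {"NP", "VP"},
                   lambda n: n.label.startswith("S")]

    tree = Node("S", [Node("NP", [Node("DT"), Node("NN")]),
                      Node("ADVP"),
                      Node("VP", [Node("VB"), Node("NP")])])
    print([c.label for c in prune(tree, identifiers).children])   # ['NP', 'VP']
    ```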

  5. Towards a unified theory of neocortex: laminar cortical circuits for vision and cognition.

    PubMed

    Grossberg, Stephen

    2007-01-01

    A key goal of computational neuroscience is to link brain mechanisms to behavioral functions. The present article describes recent progress towards explaining how laminar neocortical circuits give rise to biological intelligence. These circuits embody two new and revolutionary computational paradigms: Complementary Computing and Laminar Computing. Circuit properties include a novel synthesis of feedforward and feedback processing, of digital and analog processing, and of preattentive and attentive processing. This synthesis clarifies the appeal of Bayesian approaches but has a far greater predictive range that naturally extends to self-organizing processes. Examples from vision and cognition are summarized. A LAMINART architecture unifies properties of visual development, learning, perceptual grouping, attention, and 3D vision. A key modeling theme is that the mechanisms which enable development and learning to occur in a stable way imply properties of adult behavior. It is noted how higher-order attentional constraints can influence multiple cortical regions, and how spatial and object attention work together to learn view-invariant object categories. In particular, a form-fitting spatial attentional shroud can allow an emerging view-invariant object category to remain active while multiple view categories are associated with it during sequences of saccadic eye movements. Finally, the chapter summarizes recent work on the LIST PARSE model of cognitive information processing by the laminar circuits of prefrontal cortex. LIST PARSE models the short-term storage of event sequences in working memory, their unitization through learning into sequence, or list, chunks, and their read-out in planned sequential performance that is under volitional control. LIST PARSE provides a laminar embodiment of Item and Order working memories, also called Competitive Queuing models, that have been supported by both psychophysical and neurobiological data. These examples show how variations of a common laminar cortical design can embody properties of visual and cognitive intelligence that seem, at least on the surface, to be mechanistically unrelated.

  6. “One code to find them all”: a perl tool to conveniently parse RepeatMasker output files

    PubMed Central

    2014-01-01

    Background Of the different bioinformatic methods used to recover transposable elements (TEs) in genome sequences, one of the most commonly used procedures is the homology-based method proposed by the RepeatMasker program. RepeatMasker generates several output files, including the .out file, which provides annotations for all detected repeats in a query sequence. However, a remaining challenge consists of identifying the different copies of TEs that correspond to the identified hits. This step is essential for any evolutionary/comparative analysis of the different copies within a family. Different possibilities can lead to multiple hits corresponding to a unique copy of an element, such as the presence of large deletions/insertions or undetermined bases, and distinct consensus corresponding to a single full-length sequence (like for long terminal repeat (LTR)-retrotransposons). These possibilities must be taken into account to determine the exact number of TE copies. Results We have developed a perl tool that parses the RepeatMasker .out file to better determine the number and positions of TE copies in the query sequence, in addition to computing quantitative information for the different families. To determine the accuracy of the program, we tested it on several RepeatMasker .out files corresponding to two organisms (Drosophila melanogaster and Homo sapiens) for which the TE content has already been largely described and which present great differences in genome size, TE content, and TE families. Conclusions Our tool provides access to detailed information concerning the TE content in a genome at the family level from the .out file of RepeatMasker. This information includes the exact position and orientation of each copy, its proportion in the query sequence, and its quality compared to the reference element. In addition, our tool allows a user to directly retrieve the sequence of each copy and obtain the same detailed information at the family level when a local library with incomplete TE class/subclass information was used with RepeatMasker. We hope that this tool will be helpful for people working on the distribution and evolution of TEs within genomes.
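
    The hit-merging step can be sketched as follows; the tuple layout and the 100-bp join threshold are simplifying assumptions, and the published tool handles many more cases (undetermined bases, LTR consensus splits, per-family statistics).

    ```python
    def merge_hits(hits, max_gap=100):
        """Merge RepeatMasker hits into copies: hits on the same query, from the
        same repeat family, and at most max_gap bases apart are treated as
        fragments of one copy. Each hit: (query, start, end, family)."""
        copies = []
        for query, start, end, family in sorted(hits):
            if copies:
                q, s, e, fam = copies[-1]
                if query == q and family == fam and start - e <= max_gap:
                    copies[-1] = (q, s, max(e, end), fam)
                    continue
            copies.append((query, start, end, family))
        return copies

    hits = [("chr1", 100, 400, "L1"), ("chr1", 450, 900, "L1"),
            ("chr1", 5000, 5300, "Alu")]
    print(merge_hits(hits))   # [('chr1', 100, 900, 'L1'), ('chr1', 5000, 5300, 'Alu')]
    ```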

  7. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jurrus, Elizabeth R.; Hodas, Nathan O.; Baker, Nathan A.

    Forensic analysis of nanoparticles is often conducted through the collection and identification of electron microscopy images to determine the origin of suspected nuclear material. Each image is carefully studied by experts for classification of materials based on texture, shape, and size. Manually inspecting large image datasets takes enormous amounts of time. However, automatic classification of large image datasets is a challenging problem due to the complexity involved in choosing image features, the lack of training data available for effective machine learning methods, and the availability of user interfaces to parse through images. Therefore, a significant need exists for automated and semi-automated methods to help analysts perform accurate image classification in large image datasets. We present INStINCt, our Intelligent Signature Canvas, as a framework for quickly organizing image data in a web-based canvas framework. Images are partitioned using small sets of example images, chosen by users, and presented in an optimal layout based on features derived from convolutional neural networks.

  8. The Lucy Calkins Project: Parsing a Self-Proclaimed Literacy Guru

    ERIC Educational Resources Information Center

    Feinberg, Barbara

    2007-01-01

    This article discusses the work of Lucy McCormick Calkins, an educator and the visionary founding director of Teachers College Reading and Writing Project. Begun in 1981, the think tank and teacher training institute has since trained hundreds of thousands of educators across the country. Calkins is one of the original architects of the…

  9. Working Memory Effects in the L2 Processing of Ambiguous Relative Clauses

    ERIC Educational Resources Information Center

    Hopp, Holger

    2014-01-01

    This article investigates whether and how L2 sentence processing is affected by memory constraints that force serial parsing. Monitoring eye movements, we test effects of working memory on L2 relative-clause attachment preferences in a sample of 75 late-adult German learners of English and 25 native English controls. Mixed linear regression…

  10. Efficient Inference for Trees and Alignments: Modeling Monolingual and Bilingual Syntax with Hard and Soft Constraints and Latent Variables

    ERIC Educational Resources Information Center

    Smith, David Arthur

    2010-01-01

    Much recent work in natural language processing treats linguistic analysis as an inference problem over graphs. This development opens up useful connections between machine learning, graph theory, and linguistics. The first part of this dissertation formulates syntactic dependency parsing as a dynamic Markov random field with the novel…

  11. The Effects of Syntactically Parsed Text Formats on Intensive Reading in EFL

    ERIC Educational Resources Information Center

    Herbert, John C.

    2014-01-01

    Separating text into meaningful language chunks, as with visual-syntactic text formatting (VSTF), helps readers to process text more easily and language learners to recognize grammar and syntax patterns more quickly. Evidence of this exists in studies on native and non-native English speakers. However, recent studies question the role of VSTF in certain…

  12. Parsing the Passive: Comparing Children with Specific Language Impairment to Sequential Bilingual Children

    ERIC Educational Resources Information Center

    Marinis, Theodoros; Saddy, Douglas

    2013-01-01

    Twenty-five monolingual (L1) children with specific language impairment (SLI), 32 sequential bilingual (L2) children, and 29 L1 controls completed the Test of Active & Passive Sentences-Revised (van der Lely 1996) and the Self-Paced Listening Task with Picture Verification for actives and passives (Marinis 2007). These revealed important…

  13. A New Framework for Textual Information Mining over Parse Trees. CRESST Report 805

    ERIC Educational Resources Information Center

    Mousavi, Hamid; Kerr, Deirdre; Iseli, Markus R.

    2011-01-01

    Textual information mining is a challenging problem that has resulted in the creation of many different rule-based linguistic query languages. However, these languages generally are not optimized for the purpose of text mining. In other words, they usually consider queries as individuals and only return raw results for each query. Moreover they…

  14. Simplex and Multiplex Stratification in ASD and ADHD Families: A Promising Approach for Identifying Overlapping and Unique Underpinnings of ASD and ADHD?

    ERIC Educational Resources Information Center

    Oerlemans, Anoek M.; Hartman, Catharina A.; De Bruijn, Yvette G. E.; Van Steijn, Daphne J.; Franke, Barbara; Buitelaar, Jan K.; Rommelse, Nanda N. J.

    2015-01-01

    Autism spectrum disorders (ASD) and attention-deficit/hyperactivity disorder (ADHD) are highly heterogeneous neuropsychiatric disorders, that frequently co-occur. This study examined whether stratification into single-incidence (SPX) and multi-incidence (MPX) is helpful in (a) parsing heterogeneity and (b) detecting overlapping and unique…

  15. Morphological Decomposition in the Recognition of Prefixed and Suffixed Words: Evidence from Korean

    ERIC Educational Resources Information Center

    Kim, Say Young; Wang, Min; Taft, Marcus

    2015-01-01

    Korean has visually salient syllable units that are often mapped onto either prefixes or suffixes in derived words. In addition, prefixed and suffixed words may be processed differently given a left-to-right parsing procedure and the need to resolve morphemic ambiguity in prefixes in Korean. To test this hypothesis, four experiments using the…

  16. Phrase Length Matters: The Interplay between Implicit Prosody and Syntax in Korean "Garden Path" Sentences

    ERIC Educational Resources Information Center

    Hwang, Hyekyung; Steinhauer, Karsten

    2011-01-01

    In spoken language comprehension, syntactic parsing decisions interact with prosodic phrasing, which is directly affected by phrase length. Here we used ERPs to examine whether a similar effect holds for the on-line processing of written sentences during silent reading, as suggested by theories of "implicit prosody." Ambiguous Korean sentence…

  17. Periscopic Spine Surgery

    DTIC Science & Technology

    2007-01-01

    ... radiation therapy. In radiation therapy, the overarching goal is to deliver a lethal dose to the cancerous tissue while minimizing collateral damage to the ... Computer is shown in Figure 6. The exercise protocol is first parsed into a control mode based on the desired activation of configuration space variables ... ABSTRACT: In this paper, we present the design and implementation of a ...

  18. Computational Linguistic Assessment of Genre Differences Focusing on Text Cohesive Devices of Student Writing: Implications for Library Instruction

    ERIC Educational Resources Information Center

    Wang, Xin; Cho, Kwangsu

    2010-01-01

    This study examined two major academic genres of writing: argumentative and technical writing. Three hundred eighty-four undergraduate student-produced texts were parsed and analyzed through a computational tool called Coh-Metrix. The results inform the instructional librarians that students used genre-dependent cohesive devices in a limited way…

  19. Myelin Biogenesis And Oligodendrocyte Development: Parsing Out The Roles Of Glycosphingolipids

    PubMed Central

    Jackman, Nicole; Ishii, Akihiro; Bansal, Rashmi

    2010-01-01

    The myelin sheath is a lipid-enriched extension of the oligodendrocyte (OL) plasma membrane that ensheathes the axons of the central and peripheral nervous systems. Here we review the involvement of glycosphingolipids in myelin/OL functions, including the regulation of OL differentiation, lipid raft-mediated trafficking and signaling, and neuron-glia interactions. PMID:19815855

  20. FASTQ quality control dashboard

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2016-07-25

    FQCDB builds on the existing open-source software FastQC, implementing a modern web interface over parsed FastQC output. In addition, FQCDB is extensible as a web service, accommodating additional plots of type line, boxplot, or heatmap across data formatted according to its guidelines. The interface is also configurable via a more readable JSON format, enabling customization by non-web programmers.

  1. Temporal Clustering and Sequencing in Short-Term Memory and Episodic Memory

    ERIC Educational Resources Information Center

    Farrell, Simon

    2012-01-01

    A model of short-term memory and episodic memory is presented, with the core assumptions that (a) people parse their continuous experience into episodic clusters and (b) items are clustered together in memory as episodes by binding information within an episode to a common temporal context. Along with the additional assumption that information…

  2. Morphological Parsing and the Use of Segmentation Cues in Reading Finnish Compounds

    ERIC Educational Resources Information Center

    Bertram, Raymond; Pollatsek, Alexander; Hyona, Jukka

    2004-01-01

    This eye movement study investigated the use of two types of segmentation cues in processing long Finnish compounds. The cues were related to the vowel quality properties of the constituents and properties of the consonant starting the second constituent. In Finnish, front vowels never appear with back vowels in a lexeme, but different quality…

  3. Living Human Dignity: A Nightingale Legacy.

    PubMed

    Hegge, Margaret; Bunkers, Sandra Schmidt

    2017-10-01

    The authors in this article present the humanbecoming ethical tenets of human dignity: reverence, awe, betrayal, and shame. These four ethical tenets of human dignity are examined from a historical perspective, exploring how Rosemarie Rizzo Parse has conceptualized these ethical tenets with added descriptions from other scholars, and how Florence Nightingale lived human dignity as the founder of modern nursing.

  4. User Evaluation of Automatically Generated Semantic Hypertext Links in a Heavily Used Procedural Manual.

    ERIC Educational Resources Information Center

    Tebbutt, John

    1999-01-01

    Discusses efforts at National Institute of Standards and Technology (NIST) to construct an information discovery tool through the fusion of hypertext and information retrieval that works by parsing a contiguous document base into smaller documents and inserting semantic links between them. Also presents a case study that evaluated user reactions.…

  5. The Sentence Fairy: A Natural-Language Generation System to Support Children's Essay Writing

    ERIC Educational Resources Information Center

    Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine

    2008-01-01

    We built an NLP system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary texts produced by pupils…

  6. Text Cohesion and Comprehension: A Comparison of Prose Analysis Systems.

    ERIC Educational Resources Information Center

    Varnhagen, Connie K.; Goldman, Susan R.

    To test three specific hypotheses about recall as a function of four categories of logical relations, a study was done to determine whether logical relations systems of prose analysis can be used to predict recall. Two descriptive passages of naturally occurring expository prose were used. Each text was parsed into 45 statements, consisting of…

  7. Direct Object Predictability: Effects on Young Children's Imitation of Sentences

    ERIC Educational Resources Information Center

    Valian, Virginia; Prasada, Sandeep; Scarpa, Jodi

    2006-01-01

    We hypothesize that the conceptual relation between a verb and its direct object can make a sentence easier ("the cat is eating some food") or harder ("the cat is eating a sock") to parse and understand. If children's limited performance systems contribute to the ungrammatical brevity of their speech, they should perform better on sentences that…

  8. Interconnection of electronic medical record with clinical data management system by CDISC ODM.

    PubMed

    Matsumura, Yasushi; Hattori, Atsushi; Manabe, Shiro; Takeda, Toshihiro; Takahashi, Daiyo; Yamamoto, Yuichiro; Murata, Taizo; Mihara, Naoki

    2014-01-01

    Electronic data capture (EDC) systems have been widely used in clinical research. However, current EDC systems do not connect with the electronic medical record (EMR) system, so medical staff must transcribe data from the EMR into the EDC system manually. This redundant process causes not only inefficiency but also human error. We developed an EDC system that cooperates with the EMR, in which the data required for a case report form (CRF) are transcribed automatically from the EMR to an electronic CRF (eCRF) and sent via the network. We call this system the "eCRF reporter". Its interface module retrieves data from the EMR database, including patient biographical data, laboratory test data, prescription data, and data entered via templates in progress notes. The eCRF reporter also enables users to enter data directly into the eCRF. It generates a CDISC ODM file together with a PDF rendering of the clinical data in the ODM. After the eCRF is stored in the EMR, it is transferred via VPN to a clinical data management system (CDMS) that receives the eCRF files and parses the ODM. We have started several clinical research studies using this system, which is expected to make clinical research more efficient and rigorous.
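
    For orientation, a skeletal ODM payload can be built as below; the nesting follows the standard's ClinicalData > SubjectData > StudyEventData > FormData > ItemGroupData > ItemData chain, while all OIDs and the value are hypothetical, and real ODM files carry many more required attributes.

    ```python
    import xml.etree.ElementTree as ET

    # Skeletal CDISC ODM nesting; all OIDs and the value are hypothetical.
    odm = ET.Element("ODM")
    clin = ET.SubElement(odm, "ClinicalData", StudyOID="ST-001")
    subj = ET.SubElement(clin, "SubjectData", SubjectKey="P-0042")
    event = ET.SubElement(subj, "StudyEventData", StudyEventOID="SE-VISIT1")
    form = ET.SubElement(event, "FormData", FormOID="F-LABS")
    group = ET.SubElement(form, "ItemGroupData", ItemGroupOID="IG-CBC")
    ET.SubElement(group, "ItemData", ItemOID="IT-HGB", Value="13.2")

    print(ET.tostring(odm, encoding="unicode"))
    ```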

  9. [Nursing knowledge: the evolution of scientific philosophies and paradigm trends].

    PubMed

    Hung, Hsuan-Man; Wang, Hui-Ling; Chang, Yun-Hsuan; Chen, Chung-Hey

    2010-02-01

    Different aspects of philosophy are derived from different paradigms that contain various main points, some of which are repeated or overlap. Belief and practice are two components of a paradigm that provide perspective and framework and lead to nursing research. Changes in healthcare have popularized empirical and evidence-based research in the field of nursing. However, the evidence-based study approach has given rise to a certain level of debate. Until now, no standard paradigm has been established for the nursing field, as different professionals use different paradigms in their studies; this situation carries certain limitations as well as advantages. The quantitative aspects of a nursing paradigm were developed by Peplau and Henderson (1950) and Orem (1980), and these remained the standard until 1990, when Guba and Parse proposed qualitative viewpoints in contextual features. The nursing paradigm has thus made great contributions to the development of knowledge in nursing care, although debate continues because of the incomplete knowledge inherent in individually developed paradigms. It is better to apply multiple paradigms to different research questions, and it is suggested that better communication amongst experts regarding their individual points of view would help nursing members to integrate findings within the global pool of knowledge and allow replication over multiple studies.

  10. Processing sequence annotation data using the Lua programming language.

    PubMed

    Ueno, Yutaka; Arita, Masanori; Kumagai, Toshitaka; Asai, Kiyoshi

    2003-01-01

    The data processing language in a graphical software tool that manages sequence annotation data from genome databases should provide flexible functions for the tasks in molecular biology research. Among currently available languages we adopted the Lua programming language. It fulfills our requirements to perform computational tasks for sequence map layouts, i.e. the handling of data containers, symbolic reference to data, and a simple programming syntax. Upon importing a foreign file, the original data are first decomposed in the Lua language while maintaining the original data schema. The converted data are parsed by the Lua interpreter and the contents are stored in our data warehouse. Then, portions of annotations are selected and arranged into our catalog format to be depicted on the sequence map. Our sequence visualization program was successfully implemented, embedding the Lua language for processing of annotation data and layout script. The program is available at http://staff.aist.go.jp/yutaka.ueno/guppy/.

  11. T.I.M.S: TaqMan Information Management System, tools to organize data flow in a genotyping laboratory

    PubMed Central

    Monnier, Stéphanie; Cox, David G; Albion, Tim; Canzian, Federico

    2005-01-01

    Background Single Nucleotide Polymorphism (SNP) genotyping is a major activity in biomedical research. The Taqman technology is one of the most commonly used approaches. It produces large amounts of data that are difficult to process by hand. Laboratories not equipped with a Laboratory Information Management System (LIMS) need tools to organize the data flow. Results We propose a package of Visual Basic programs focused on sample management and on the parsing of input and output TaqMan files. The code is written in Visual Basic, embedded in the Microsoft Office package, and it allows anyone to have access to those tools, without any programming skills and with basic computer requirements. Conclusion We have created useful tools focused on management of TaqMan genotyping data, a critical issue in genotyping laboratories whithout a more sophisticated and expensive system, such as a LIMS. PMID:16221298

  12. Characterizing tradeoffs between water and food under different climate regimes across the United States

    NASA Astrophysics Data System (ADS)

    Troy, T.; Zhu, X.; Kipgen, C.; Li, X.; Pal, I.

    2015-12-01

    As water demand approaches or exceeds the available water supply in many regions of the globe, water stress will become increasingly prevalent with potentially necessary tradeoffs required between water prioritization amongst sectors. Agriculture is the largest consumptive water user in the US, and irrigation plays a vital role in ensuring a stable food supply by buffering against climate extremes. However, it also plays a negative role in inducing water stress in many regions. Much research has focused on reducing agricultural water use, but this needs to be complemented by better quantifying the benefit of irrigation on crop yields under a range of climate conditions. Regions are identified with significant irrigation benefits with and without water stress to parse apart the role of climate, crop choice, and water usage to then evaluate tradeoffs with food production in a climate-water-food nexus.

  13. Design of real-time voice over internet protocol system under bandwidth network

    NASA Astrophysics Data System (ADS)

    Zhang, Li; Gong, Lina

    2017-04-01

    With network bandwidth increasing and network convergence accelerating, VoIP communication across the network is becoming increasingly popular. Real-time identification and analysis of VoIP flows over the backbone network has therefore become an urgent need and a research hotspot in network operations management. On this basis, the paper proposes a VoIP business management system for the backbone network. The system first filters the VoIP data stream on the backbone network and then resolves the call signaling information and voice media. It also allows operators to design appropriate rules for the real-time reconstruction and presentation of specific categories of calls. Experimental results show that the system can parse and process backbone VoIP calls in real time and present the results accurately in the management interface, providing the necessary technical support for VoIP-based network traffic management and maintenance.
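
    As a hedged illustration of the signaling-filtering step, a heuristic might flag packet payloads that begin with SIP request methods and group them by Call-ID (the method list and port-agnostic matching are simplifying assumptions; production systems must also track the RTP media streams negotiated via SDP).

    ```python
    SIP_METHODS = (b"INVITE", b"ACK", b"BYE", b"REGISTER", b"OPTIONS", b"CANCEL")

    def looks_like_sip(payload: bytes) -> bool:
        """Heuristic: SIP requests start with a method name, responses with SIP/2.0."""
        return payload.startswith(SIP_METHODS) or payload.startswith(b"SIP/2.0")

    def extract_call_id(payload: bytes):
        """Pull the Call-ID header so packets can be grouped into calls."""
        for line in payload.split(b"\r\n"):
            if line.lower().startswith(b"call-id:"):
                return line.split(b":", 1)[1].strip()
        return None
    ```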

  14. The role of suppression in figurative language comprehension

    PubMed Central

    Gernsbacher, Morton Ann; Robertson, Rachel R. W.

    2014-01-01

    In this paper, we describe the crucial role that suppression plays in many aspects of language comprehension. We define suppression as a general, cognitive mechanism, the purpose of which is to attenuate the interference caused by the activation of extraneous, unnecessary, or inappropriate information. We illustrate the crucial role that suppression plays in general comprehension by reviewing numerous experiments. These experiments demonstrate that suppression attenuates interference during lexical access (how word meanings are ‘accessed’), anaphoric reference (how referents for anaphors, like pronouns, are computed), cataphoric reference (how concepts that are marked by devices, such as spoken stress, gain a privileged status), syntactic parsing (how grammatical forms of sentences are decoded), and individual differences in (adult) language comprehension skill. We also review research that suggests that suppression plays a crucial role in the understanding of figurative language, in particular, metaphors, idioms, and proverbs. PMID:25520540

  15. The influence of the microbiota on the immune response to transplantation

    PubMed Central

    Bartman, Caroline; Chong, Anita S.; Alegre, Maria-Luisa

    2015-01-01

    Purpose of review In the past decade, appreciation of the important effects of commensal microbes on immunity has grown exponentially. The effect of the microbiota on transplantation has only recently begun to be explored; however, our understanding of the mechanistic details of host-microbe interactions is still lacking. Recent findings It has become clear that transplantation is associated with changes in the microbiota in many different settings although what clinical events and therapeutic interventions contribute to these changes remains to be parsed out. Research groups have begun to identify associations between specific communities of organisms and transplant outcomes but it remains to be established whether microbial changes precede or follow transplant rejection episodes. Finally, results from continuing exploration of basic mechanisms by which microbial communities affect innate and adaptive immunity in various animal models of disease continues to inform research on the microbiota’s effects on immune responses against transplanted organs. Summary Commensal microbes may alter immune responses to organ transplantation, but direct experiments are only beginning in the field to identify species and immune pathways responsible for these putative effects. PMID:25563985

  16. Automated realtime data import for the i2b2 clinical data warehouse: introducing the HL7 ETL cell.

    PubMed

    Majeed, Raphael W; Röhrig, Rainer

    2012-01-01

    Clinical data warehouses are used to consolidate all available clinical data from one or multiple organizations. They represent an important source for clinical research, quality management and controlling. Since its introduction, the data warehouse i2b2 has gathered a large user base in the research community. Yet, little work has been done on the process of importing clinical data into data warehouses using existing standards. In this article, we present a novel approach that uses the clinical integration server, commonly available in most hospitals, as the data source. As information passes through the integration server, each standardized HL7 message is immediately parsed and inserted into the data warehouse. Evaluation of import speeds suggests the feasibility of the provided solution for real-time processing of HL7 messages. By using the presented approach of standardized data import, i2b2 can be used as a plug-and-play data warehouse, without the hurdle of customized imports for every clinical information system or electronic medical record. The provided solution is available for download at http://sourceforge.net/projects/histream/.
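
    The core of such an ETL cell is mechanical: HL7 v2 messages are segment-per-line and pipe-delimited, so each OBX observation row can be mapped onto a warehouse fact (a generic sketch with an invented message fragment; the actual cell targets i2b2's fact table and handles far more message types).

    ```python
    def parse_hl7(message: str):
        """Split an HL7 v2 message into {segment id: [list of field lists]}."""
        segments = {}
        for line in filter(None, message.split("\r")):
            fields = line.split("|")
            segments.setdefault(fields[0], []).append(fields)
        return segments

    # Invented ORU^R01 fragment: one patient, one numeric lab observation.
    msg = ("MSH|^~\\&|LIS|HOSP|I2B2|HOSP|202401011200||ORU^R01|42|P|2.3\r"
           "PID|1||12345^^^HOSP||Doe^Jane\r"
           "OBX|1|NM|718-7^Hemoglobin^LN||13.2|g/dL|12-16|N|||F")

    for obx in parse_hl7(msg).get("OBX", []):
        code, value, unit = obx[3], obx[5], obx[6]   # OBX-3, OBX-5, OBX-6
        print(code, value, unit)                     # 718-7^Hemoglobin^LN 13.2 g/dL
    ```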

  17. Addressing socioeconomic and political challenges posed by climate change

    NASA Astrophysics Data System (ADS)

    Fernando, Harindra Joseph; Klaic, Zvjezdana Bencetic

    2011-08-01

    NATO Advanced Research Workshop: Climate Change, Human Health and National Security; Dubrovnik, Croatia, 28-30 April 2011; Climate change has been identified as one of the most serious threats to humanity. It not only causes sea level rise, drought, crop failure, vector-borne diseases, extreme events, degradation of water and air quality, heat waves, and other phenomena, but it is also a threat multiplier wherein concatenation of multiple events may lead to frequent human catastrophes and intranational and international conflicts. In particular, urban areas may bear the brunt of climate change because of the amplification of climate effects that cascade down from global to urban scales, but current modeling and downscaling capabilities are unable to predict these effects with confidence. These were the main conclusions of a NATO Advanced Research Workshop (ARW) sponsored by the NATO Science for Peace and Security program. Thirty-two invitees from 17 countries, including leading modelers; natural, political, and social scientists; engineers; politicians; military experts; urban planners; industry analysts; epidemiologists; and health care professionals, parsed the topic on a common platform.

  18. Practical Application of Research in Science Education (PARSE) -- A New Collaboration for K-12 Science Teacher Professional Development

    NASA Astrophysics Data System (ADS)

    Zwicker, Andrew; Lopez, Jose; Clayton, James

    2008-11-01

    A new collaboration between PPPL, St. Peter's College, the Liberty Science Center, and the Jersey City Public School District was formed in order to create a unique K-12 teacher professional development program. St. Peter's College, located in Jersey City, NJ, is a liberal arts college in an urban setting. The Liberty Science Center (LSC) is the largest education resource in the New Jersey-New York City region. The Jersey City School District has 28,000 students of which approximately 90% are from populations traditionally under-represented in science. The new program is centered upon topics surrounding energy and the environment. In the first year, beginning in 2009, 15-20 teachers will participate in a pilot course that includes hands-on research at PPPL and St. Peter's, the creation of new curricular materials, and pedagogical techniques. Scientists, master teachers, and education professors will teach the course. In subsequent years, the number of participants will be significantly expanded and the curricular material disseminated to other school districts. In addition, an outside evaluator will measure the educational outcome throughout the project.

  19. Automatic Extraction of Drug Adverse Effects from Product Characteristics (SPCs): A Text Versus Table Comparison.

    PubMed

    Lamy, Jean-Baptiste; Ugon, Adrien; Berthelot, Hélène

    2016-01-01

    Potential adverse effects (AEs) of drugs are described in their summary of product characteristics (SPCs), a textual document. Automatic extraction of AEs from SPCs is useful for detecting AEs and for building drug databases. However, this task is difficult because each AE is associated with a frequency that must be extracted and the presentation of AEs in SPCs is heterogeneous, consisting of plain text and tables in many different formats. We propose a taxonomy for the presentation of AEs in SPCs. We set up natural language processing (NLP) and table parsing methods for extracting AEs from texts and tables of any format, and evaluate them on 10 SPCs. Automatic extraction performed better on tables than on texts. Tables should be recommended for the presentation of the AEs section of the SPCs.

  20. Gro2mat: a package to efficiently read gromacs output in MATLAB.

    PubMed

    Dien, Hung; Deane, Charlotte M; Knapp, Bernhard

    2014-07-30

    Molecular dynamics (MD) simulations are a state-of-the-art computational method used to investigate molecular interactions at atomic scale. Interaction processes out of experimental reach can be monitored using MD software, such as Gromacs. Here, we present the gro2mat package that allows fast and easy access to Gromacs output files from Matlab. Gro2mat enables direct parsing of the most common Gromacs output formats, including the binary xtc-format, for which no openly available Matlab parser currently exists. The xtc reader is orders of magnitude faster than other available pdb/ascii workarounds. Gro2mat is especially useful for scientists with an interest in quick prototyping of new mathematical and statistical approaches for Gromacs trajectory analyses. Copyright © 2014 Wiley Periodicals, Inc.

  1. X-tile: a new bio-informatics tool for biomarker assessment and outcome-based cut-point optimization.

    PubMed

    Camp, Robert L; Dolled-Filhart, Marisa; Rimm, David L

    2004-11-01

    The ability to parse tumors into subsets based on biomarker expression has many clinical applications; however, there is no global way to visualize the best cut-points for creating such divisions. We have developed a graphical method, the X-tile plot, that illustrates the presence of substantial tumor subpopulations and shows the robustness of the relationship between a biomarker and outcome by constructing a two-dimensional projection of every possible subpopulation. We validate X-tile plots by examining the expression of several established prognostic markers (human epidermal growth factor receptor-2, estrogen receptor, p53 expression, patient age, tumor size, and node number) in cohorts of breast cancer patients and show how X-tile plots of each marker predict population subsets rooted in the known biology of their expression.
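
    The underlying idea lends itself to a brute-force sketch: scan every (low, high) cut-point pair on a biomarker, split the cohort into low/mid/high subsets, and score how strongly the subsets separate an outcome. The Python sketch below uses synthetic data and a simple chi-square-like statistic; the published X-tile tool works with survival statistics and corrects for multiple cut-point testing, so this is purely illustrative.

        # Toy brute-force version of the X-tile idea: score every (low, high)
        # cut-point pair by how well the resulting subsets separate a binary
        # outcome. Synthetic data; not the published X-tile implementation.
        import itertools
        import random

        random.seed(0)
        marker = [random.random() for _ in range(300)]
        # Synthetic outcome loosely correlated with the marker, for demo only.
        event = [1 if m + random.gauss(0, 0.3) > 0.7 else 0 for m in marker]

        def score(low, high):
            groups = [[], [], []]
            for m, e in zip(marker, event):
                groups[0 if m < low else (1 if m < high else 2)].append(e)
            if any(len(g) < 10 for g in groups):   # skip tiny subpopulations
                return 0.0
            overall = sum(event) / len(event)
            # Chi-square-like statistic on event rates across the 3 subsets.
            return sum(len(g) * (sum(g) / len(g) - overall) ** 2 for g in groups)

        cuts = [i / 20 for i in range(1, 20)]
        best = max(itertools.combinations(cuts, 2), key=lambda c: score(*c))
        print("best cut-points:", best, "score:", round(score(*best), 2))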

  2. Multiscale characterization and analysis of shapes

    DOEpatents

    Prasad, Lakshman; Rao, Ramana

    2002-01-01

    An adaptive multiscale method approximates shapes with continuous or uniformly and densely sampled contours, with the purpose of sparsely and nonuniformly discretizing the boundaries of shapes at any prescribed resolution, while at the same time retaining the salient shape features at that resolution. In another aspect, a fundamental geometric filtering scheme using the Constrained Delaunay Triangulation (CDT) of polygonized shapes creates an efficient parsing of shapes into components that have semantic significance dependent only on the shapes' structure and not on their representations per se. A shape skeletonization process generalizes to sparsely discretized shapes, with the additional benefit of prunability to filter out irrelevant and morphologically insignificant features. The skeletal representation of characters of varying thickness and the elimination of insignificant and noisy spurs and branches from the skeleton greatly increases the robustness, reliability and recognition rates of character recognition algorithms.

  3. Instrument Remote Control Application Framework

    NASA Technical Reports Server (NTRS)

    Ames, Troy; Hostetter, Carl F.

    2006-01-01

    The Instrument Remote Control (IRC) architecture is a flexible, platform-independent application framework that is well suited for the control and monitoring of remote devices and sensors. IRC enables significant savings in development costs by utilizing extensible Markup Language (XML) descriptions to configure the framework for a specific application. The Instrument Markup Language (IML) is used to describe the commands used by an instrument, the data streams produced, the rules for formatting commands and parsing the data, and the method of communication. Often no custom code is needed to communicate with a new instrument or device. An IRC instance can advertise and publish a description about a device or subscribe to another device's description on a network. This simple capability of dynamically publishing and subscribing to interfaces enables a very flexible, self-adapting architecture for monitoring and control of complex instruments in diverse environments.

  4. MPEG-4 AVC saliency map computation

    NASA Astrophysics Data System (ADS)

    Ammar, M.; Mitrea, M.; Hasnaoui, M.

    2014-02-01

    A saliency map provides information about the regions inside some visual content (image, video, ...) at which a human observer will spontaneously look. For saliency map computation, current research studies consider the uncompressed (pixel) representation of the visual content and extract various types of information (intensity, color, orientation, motion energy) which are then fused. This paper goes one step further and computes the saliency map directly from the MPEG-4 AVC stream syntax elements with minimal decoding operations. In this respect, an a priori in-depth study of the MPEG-4 AVC syntax elements is first carried out so as to identify the entities that attract visual attention. Secondly, the MPEG-4 AVC reference software is extended with software tools allowing these elements to be parsed and subsequently used in objective benchmarking experiments. This way, it is demonstrated that an MPEG-4 saliency map can be given by a combination of static saliency and motion maps. This saliency map is experimentally validated under a robust watermarking framework. When included in an m-QIM (multiple symbols Quantization Index Modulation) insertion method, PSNR average gains of 2.43 dB, 2.15 dB, and 2.37 dB are obtained for data payloads of 10, 20 and 30 watermarked blocks per I frame, i.e. about 30, 60, and 90 bits/second, respectively. These quantitative results are obtained from processing 2 hours of heterogeneous video content.

  5. One algorithm to rule them all? An evaluation and discussion of ten eye movement event-detection algorithms.

    PubMed

    Andersson, Richard; Larsson, Linnea; Holmqvist, Kenneth; Stridh, Martin; Nyström, Marcus

    2017-04-01

    Almost all eye-movement researchers use algorithms to parse raw data and detect distinct types of eye movement events, such as fixations, saccades, and pursuit, and then base their results on these. Surprisingly, these algorithms are rarely evaluated. We evaluated the classifications of ten eye-movement event detection algorithms, on data from an SMI HiSpeed 1250 system, and compared them to manual ratings of two human experts. The evaluation focused on fixations, saccades, and post-saccadic oscillations. The evaluation used both event duration parameters and sample-by-sample comparisons to rank the algorithms. The resulting event durations varied substantially as a function of which algorithm was used. This evaluation differed from previous evaluations by considering a relatively large set of algorithms, multiple events, and data from both static and dynamic stimuli. The main conclusion is that current detectors of only fixations and saccades work reasonably well for static stimuli, but barely better than chance for dynamic stimuli. Differing results across evaluation methods make it difficult to select one winner for fixation detection. For saccade detection, however, the algorithm by Larsson, Nyström and Stridh (IEEE Transactions on Biomedical Engineering, 60(9):2484-2493, 2013) outperforms all algorithms in data from both static and dynamic stimuli. The data also show how improperly selected algorithms applied to dynamic data misestimate fixation and saccade properties.
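
    As a point of reference for what such detectors do, the sketch below implements the simplest member of the family, a velocity-threshold (I-VT) classifier: samples whose point-to-point velocity exceeds a fixed threshold are labeled saccade samples, the rest fixation samples. The sampling rate and threshold are illustrative assumptions; this is not the Larsson, Nyström and Stridh algorithm praised in the article.

        # Minimal velocity-threshold (I-VT) event detector of the kind the
        # article evaluates. Threshold and sampling rate are illustrative.
        import math

        def ivt(xs, ys, fs=1250.0, threshold=30.0):
            """Label each gaze sample 'fix' or 'sac' (degrees and Hz assumed)."""
            labels = ["fix"]
            for i in range(1, len(xs)):
                velocity = math.hypot(xs[i] - xs[i-1], ys[i] - ys[i-1]) * fs  # deg/s
                labels.append("sac" if velocity > threshold else "fix")
            return labels

        # Toy trace: steady fixation, a fast jump, then fixation again.
        x = [0.0] * 5 + [1.0, 2.0] + [2.0] * 5
        y = [0.0] * 12
        print(ivt(x, y))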

  6. Timescale bias in measuring river migration rate

    NASA Astrophysics Data System (ADS)

    Donovan, M.; Belmont, P.; Notebaert, B.

    2016-12-01

    River channel migration plays an important role in sediment routing, water quality, riverine ecology, and infrastructure risk assessment. Migration rates may change in time and space due to systematic changes in hydrology, sediment supply, vegetation, and/or human land and water management actions. The ability to make detailed measurements of lateral migration over a wide range of temporal and spatial scales has been enhanced by the increased availability of historical landscape-scale aerial photography and high-resolution topography (HRT). Despite a surge in the use of historical and contemporary aerial photograph sequences in conjunction with evolving methods to analyze such data for channel change, we found no research considering the biases that may be introduced as a function of the temporal scales of measurement. Unsteady processes (e.g., sedimentation, channel migration, width changes) exhibit extreme discontinuities over time and space, resulting in distortion when measurements are averaged over longer temporal scales, referred to as `Sadler effects' (Sadler, 1981; Gardner et al., 1987). Using 12 sets of aerial photographs for the Root River (Minnesota), we measure lateral migration over space (110 km) and time (1937-2013) to assess whether bias arises from different measurement scales and whether rates shift systematically with increased discharge over time. Results indicate that measurement-scale biases indeed arise from the time elapsed between measurements. We parsed the study reach into three distinct reaches and examined whether and how recent increases in river discharge translate into changes in migration rate.

  7. A user-oriented web crawler for selectively acquiring online content in e-health research

    PubMed Central

    Xu, Songhua; Yoon, Hong-Jun; Tourassi, Georgia

    2014-01-01

    Motivation: Life stories of diseased and healthy individuals are abundantly available on the Internet. Collecting and mining such online content can offer many valuable insights into patients’ physical and emotional states throughout the pre-diagnosis, diagnosis, treatment and post-treatment stages of the disease compared with those of healthy subjects. However, such content is widely dispersed across the web. Using traditional query-based search engines to manually collect relevant materials is rather labor intensive and often incomplete due to resource constraints in terms of human query composition and result parsing efforts. The alternative option, blindly crawling the whole web, has proven inefficient and unaffordable for e-health researchers. Results: We propose a user-oriented web crawler that adaptively acquires user-desired content on the Internet to meet the specific online data source acquisition needs of e-health researchers. Experimental results on two cancer-related case studies show that the new crawler can substantially accelerate the acquisition of highly relevant online content compared with the existing state-of-the-art adaptive web crawling technology. For the breast cancer case study using the full training set, the new method achieves a cumulative precision between 74.7 and 79.4% after 5 h of execution till the end of the 20-h long crawling session as compared with the cumulative precision between 32.8 and 37.0% using the peer method for the same time period. For the lung cancer case study using the full training set, the new method achieves a cumulative precision between 56.7 and 61.2% after 5 h of execution till the end of the 20-h long crawling session as compared with the cumulative precision between 29.3 and 32.4% using the peer method. Using the reduced training set in the breast cancer case study, the cumulative precision of our method is between 44.6 and 54.9%, whereas the cumulative precision of the peer method is between 24.3 and 26.3%; for the lung cancer case study using the reduced training set, the cumulative precisions of our method and the peer method are, respectively, between 35.7 and 46.7% versus between 24.1 and 29.6%. These numbers clearly show a consistently superior accuracy of our method in discovering and acquiring user-desired online content for e-health research. Availability and implementation: The implementation of our user-oriented web crawler is freely available to non-commercial users via the following Web site: http://bsec.ornl.gov/AdaptiveCrawler.shtml. The Web site provides a step-by-step guide on how to execute the web crawler implementation. In addition, the Web site provides the two study datasets including manually labeled ground truth, initial seeds and the crawling results reported in this article. Contact: xus1@ornl.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24078710

  8. FQC Dashboard: integrates FastQC results into a web-based, interactive, and extensible FASTQ quality control tool

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, Joseph; Pirrung, Meg; McCue, Lee Ann

    FQC is software that facilitates large-scale quality control of FASTQ files by carrying out a QC protocol, parsing results, and aggregating quality metrics within and across experiments into an interactive dashboard. The dashboard utilizes human-readable configuration files to manipulate the pages and tabs, and is extensible with CSV data.
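
    The aggregation step can be pictured concretely: FastQC's fastqc_data.txt report delimits each module with a ">>Module Name<tab>status" line and ">>END_MODULE", so collecting per-sample pass/warn/fail statuses reduces to a small parser. The Python sketch below is a simplified illustration of that step, not the FQC codebase, which also parses the full module contents.

        # Sketch of the aggregation step: pull each module's pass/warn/fail
        # status out of FastQC's fastqc_data.txt and emit one CSV row per
        # sample. Simplified for illustration; not the FQC implementation.
        import csv
        import sys

        def module_statuses(path):
            """Return {module_name: status} from a fastqc_data.txt file."""
            statuses = {}
            with open(path) as fh:
                for line in fh:
                    if line.startswith(">>") and not line.startswith(">>END_MODULE"):
                        name, _, status = line[2:].rstrip("\n").partition("\t")
                        statuses[name] = status
            return statuses

        if __name__ == "__main__":
            # Usage: python fqc_aggregate.py sample1/fastqc_data.txt sample2/...
            writer = csv.writer(sys.stdout)
            for path in sys.argv[1:]:
                stats = module_statuses(path)
                writer.writerow([path] + [f"{k}={v}" for k, v in sorted(stats.items())])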

  9. Neoliberalism in Historical Light: How Business Models Displaced Science Education Goals in Two Eras

    ERIC Educational Resources Information Center

    Hayes, Kathryn N.

    2016-01-01

    Although a growing body of work addresses the current role of neoliberalism in displacing democratic equality as a goal of public education, attempts to parse such impacts rarely draw from historical accounts. At least one tenet of neoliberalism--the application of business models to public institutions--was also pervasive at the turn of the 20th…

  10. Moving Target Techniques: Leveraging Uncertainty for Cyber Defense

    DTIC Science & Technology

    2015-08-24

    vulnerability (a flaw or bug that an attacker can exploit to penetrate or disrupt a system) to successfully compromise systems. Defenders, however...device drivers, numerous software applications, and hardware components. Within the cyberspace, this imbalance between a simple, one-bug attack...parsing code itself could have security-relevant software bugs. Dynamic Network: Techniques in the dynamic network domain change the properties

  11. Does the Advanced Proficiency Evaluated in Oral-Like Written Text Support Syntactic Parsing in a Written Academic Text among L2 Japanese Learners?

    ERIC Educational Resources Information Center

    Kitajima, Ryu

    2016-01-01

    Corpus linguistics identifies the qualitative difference in the characteristics of spoken discourse vs. written academic discourse. Whereas spoken discourse makes greater use of finite dependent clauses functioning as constituents in other clauses, written academic discourse incorporates noun phrase constituents and complex phrases. This claim can…

  12. Integrating Syntax, Semantics, and Discourse DARPA Natural Language Understanding Program. Volume 1. Technical Report

    DTIC Science & Technology

    1989-09-30

    9.1 Overview of SPQR ... 9.2 ... domain. The ISR is the input to the selection component SPQR, whose function is to block semantically anomalous parses before they are sent to the...frequently occurring pairs of words, which is useful for identifying fixed multi-word expressions. 9. SELECTION: The SPQR module (Selectional Pattern

  13. Multiple Motives, Conflicting Conceptions: Parsing the Contexts of Differentiated Access to Scientific Information in the Federal Government

    ERIC Educational Resources Information Center

    Oltmann, Shannon M.

    2012-01-01

    Scientific information, used by the U.S. federal government to formulate public policy in many arenas, is frequently contested and sometimes altered, blocked from publication, deleted from reports, or restricted in some way. This dissertation examines how and why restricted access to science policy (RASP) occurs through a comparative case study.…

  14. Sorry Dave, I’m Afraid I Can’t Do That: Explaining Unachievable Robot Tasks using Natural Language

    DTIC Science & Technology

    2013-06-24

    processing components used by Brooks et al. [6]: the Bikel parser [3] combined with the null element (understood subject) restoration of Gabbard et al...Intelligent Robots and Systems (IROS), pages 1988-1993, 2010. [12] Ryan Gabbard, Mitch Marcus, and Seth Kulick. Fully parsing the Penn Treebank. In Human

  15. Syllabic Parsing in Children: A Developmental Study Using Visual Word-Spotting in Spanish

    ERIC Educational Resources Information Center

    Álvarez, Carlos J.; Garcia-Saavedra, Guacimara; Luque, Juan L.; Taft, Marcus

    2017-01-01

    Some inconsistency is observed in the results from studies of reading development regarding the role of the syllable in visual word recognition, perhaps due to a disparity between the tasks used. We adopted a word-spotting paradigm, with Spanish children of second grade (mean age: 7 years) and sixth grade (mean age: 11 years). The children were…

  16. Infants' Attention to Patterned Stimuli: Developmental Change from 3 to 12 Months of Age

    ERIC Educational Resources Information Center

    Courage, Mary L.; Reynolds, Greg D.; Richards, John E.

    2006-01-01

    To examine the development of look duration as a function of age and stimulus type, 14- to 52-week-old infants were shown static and dynamic versions of faces, Sesame Street material, and achromatic patterns for 20 s of accumulated looking. Heart rate was recorded during looking and parsed into stimulus orienting, sustained attention, and…

  17. Effects of Cognitive Load on Trust

    DTIC Science & Technology

    2013-10-01

    that may be affected by load; build a parsing tool to extract relevant features; statistical analysis of results (by load components). Achieved...for a business application. Participants assessed potential job candidates and reviewed the applicants’ virtual resumes, which included standard...substantially different from each other that would make any confounding problems or other issues. Some statistics of the Australian data collection are

  18. Processing of Tense Morphology and Filler-Gap Dependencies by Chinese Second Language Speakers of English

    ERIC Educational Resources Information Center

    Dong, Zhiyin Renee

    2014-01-01

    There is an ongoing debate in the field of Second Language Acquisition concerning whether a fundamental difference exists between the native language (L1) and adult second language (L2) online processing of syntax and morpho-syntax. The Shallow Structure Hypothesis (SSH) (Clahsen and Felser, 2006a, b) states that L2 online parsing is qualitatively…

  19. Parsing Heuristic and Forward Search in First-Graders' Game-Play Behavior

    ERIC Educational Resources Information Center

    Paz, Luciano; Goldin, Andrea P.; Diuk, Carlos; Sigman, Mariano

    2015-01-01

    Seventy-three children between 6 and 7 years of age were presented with a problem having ambiguous subgoal ordering. Performance in this task showed reliable fingerprints: (a) a non-monotonic dependence of performance as a function of the distance between the beginning and the end-states of the problem, (b) very high levels of performance when the…

  20. Preschool Children's Exposure to Story Grammar Elements during Parent-Child Book Reading

    ERIC Educational Resources Information Center

    Breit-Smith, Allison; van Kleeck, Anne; Prendeville, Jo-Anne; Pan, Wei

    2017-01-01

    Twenty-three preschool-age children, 3;6 (years; months) to 4;1, were videotaped separately with their mothers and fathers while each mother and father read a different unfamiliar storybook to them. The text from the unfamiliar storybooks was parsed and coded into story grammar elements and all parental extratextual utterances were transcribed and…

  1. The Universal Parser and Interlanguage: Domain-Specific Mental Organization in the Comprehension of "Combien" Interrogatives in English-French Interlanguage.

    ERIC Educational Resources Information Center

    Dekydtspotter, Laurent

    2001-01-01

    From the perspective of Fodor's (1983) theory of mental organization and Chomsky's (1995) Minimalist theory of grammar, considers constraints on the interpretation of French-type and English-type cardinality interrogatives in the task of sentence comprehension, as a function of a universal parsing algorithm and hypotheses embodied in a French-type…

  2. Laminar Cortical Dynamics of Cognitive and Motor Working Memory, Sequence Learning and Performance: Toward a Unified Theory of How the Cerebral Cortex Works

    ERIC Educational Resources Information Center

    Grossberg, Stephen; Pearson, Lance R.

    2008-01-01

    How does the brain carry out working memory storage, categorization, and voluntary performance of event sequences? The LIST PARSE neural model proposes an answer that unifies the explanation of cognitive, neurophysiological, and anatomical data. It quantitatively simulates human cognitive data about immediate serial recall and free recall, and…

  3. Deciphering the Combinatorial Roles of Geometric, Mechanical, and Adhesion Cues in Regulation of Cell Spreading

    PubMed Central

    Harris, Greg M.; Shazly, Tarek; Jabbarzadeh, Ehsan

    2013-01-01

    Significant effort has gone towards parsing out the effects of the surrounding microenvironment on the macroscopic behavior of stem cells. Many of the microenvironmental cues, however, are intertwined, and thus further studies are warranted to identify the intricate interplay among the conflicting downstream signaling pathways that ultimately guide a cell response. In this contribution, by patterning adhesive PEG (polyethylene glycol) hydrogels using Dip Pen Nanolithography (DPN), we demonstrate that substrate elasticity, subcellular elasticity, ligand density, and topography ultimately define mesenchymal stem cell (MSC) spreading and shape. Physical characteristics are parsed individually, with 7 kilopascal (kPa) hydrogel islands leading to smaller, spindle-shaped cells and 105 kPa hydrogel islands leading to larger, polygonal cell shapes. In a parallel effort, a finite element model was constructed to characterize and confirm experimental findings and aid as a predictive tool in modeling cell microenvironments. Signaling pathway inhibition studies suggested that RhoA is a key regulator of the cell response to the cooperative effect of the tunable substrate variables. These results are significant for the engineering of cell-extracellular matrix interfaces and ultimately decoupling matrix-bound cues presented to cells in a tissue microenvironment for regenerative medicine. PMID:24282570

  4. Impaired P600 in neuroleptic naive patients with first-episode schizophrenia.

    PubMed

    Papageorgiou, C; Kontaxakis, V P; Havaki-Kontaxaki, B J; Stamouli, S; Vasios, C; Asvestas, P; Matsopoulos, G K; Kontopantelis, E; Rabavilas, A; Uzunoglu, N; Christodoulou, G N

    2001-09-17

    Deficits of working memory (WM) are recognized as an important pathological feature in schizophrenia. Since the P600 component of event-related potentials has been hypothesized to represent aspects of second-pass parsing processes in information processing, and is related to WM, the present study focuses on the P600 elicited during a WM test in drug-naive first-episode schizophrenic patients (FES) compared to healthy controls. We examined 16 drug-naive first-episode schizophrenic patients and 23 healthy controls matched for age and sex. Compared with controls, schizophrenic patients showed reduced P600 amplitude over the left temporoparietal region and increased P600 amplitude over the left occipital region. With regard to latency, the patients exhibited significant prolongation over the right temporoparietal region. The obtained pattern of differences classified 89.20% of patients correctly. Memory performance of patients was also significantly impaired relative to controls. Our results suggest that the second-pass parsing process of information processing, as indexed by the P600 elicited during a WM test, is impaired in FES. Moreover, these findings lend support to the view that auditory WM in schizophrenia involves or affects a circuit including temporoparietal and occipital brain areas.

  5. Do 11-month-old French infants process articles?

    PubMed

    Hallé, Pierre A; Durand, Catherine; de Boysson-Bardies, Bénédicte

    2008-01-01

    The first part of this study examined (Parisian) French-learning 11-month-old infants' recognition of the six definite and indefinite French articles: le, la, les, un, une, and des. The six articles were compared with pseudoarticles in the context of disyllabic or monosyllabic nouns, using the Head-turn Preference Procedure. The pseudoarticles were similar to real articles in terms of phonetic composition and phonotactic probability, and real and pseudo noun phrases were alike in terms of overall prosodic contour. In three experiments, 11-month-old infants showed a preference for real over pseudo articles, suggesting they have the articles' word-forms stored in long-term memory. The second part of the study evaluates several hypotheses about the role of articles in 11-month-old infants' word recognition. Evidence from three experiments supports the view that articles help infants to recognize the following words. We propose that 11-month-olds have the capacity to parse noun phrases into their constituents, which is consistent with the more general view that function words define a syntactic skeleton that serves as a basis for parsing spoken utterances. This proposition is compared to a competing account, which argues that 11-month-olds recognize noun phrases as whole words.

  6. Language experience changes subsequent learning.

    PubMed

    Onnis, Luca; Thiessen, Erik

    2013-02-01

    What are the effects of experience on subsequent learning? We explored the effects of language-specific word order knowledge on the acquisition of sequential conditional information. Korean and English adults were engaged in a sequence learning task involving three different sets of stimuli: auditory linguistic (nonsense syllables), visual non-linguistic (nonsense shapes), and auditory non-linguistic (pure tones). The forward and backward probabilities between adjacent elements generated two equally probable and orthogonal perceptual parses of the elements, such that any significant preference at test must be due to either general cognitive biases, or prior language-induced biases. We found that language modulated parsing preferences with the linguistic stimuli only. Intriguingly, these preferences are congruent with the dominant word order patterns of each language, as corroborated by corpus analyses, and are driven by probabilistic preferences. Furthermore, although the Korean individuals had received extensive formal explicit training in English and lived in an English-speaking environment, they exhibited statistical learning biases congruent with their native language. Our findings suggest that mechanisms of statistical sequential learning are implicated in language across the lifespan, and experience with language may affect cognitive processes and later learning. Copyright © 2012 Elsevier B.V. All rights reserved.

  7. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Janik, Gregory

    Renders, saves, and analyzes pressure from several sensors in a prosthesis's socket. The program receives pressure data from 64 manometers and parses the pressure for each individual sensor. The program can then display those pressures as numbers in a table. The program also interpolates pressures between manometers to create a larger set of data. This larger set of data is displayed as a simple contour plot. That same contour plot can also be placed on a three-dimensional surface in the shape of a prosthesis. This program allows for easy identification of high-pressure areas in a prosthesis to reduce the user's discomfort. The program parses the sensor pressures into a human-readable numeric format. The data may also be used to actively adjust bladders within the prosthesis to spread out pressure in real time, according to changing demands placed on the prosthesis. Interpolation of the pressures to create a larger data set makes it even easier for a human to identify particular areas of the prosthesis that are under high pressure. After identifying pressure points, a prosthetist can then redesign the prosthesis and/or command the bladders in the prosthesis to attempt to maintain constant pressures.

  8. A controlled trial of automated classification of negation from clinical notes

    PubMed Central

    Elkin, Peter L; Brown, Steven H; Bauer, Brent A; Husser, Casey S; Carruth, William; Bergstrom, Larry R; Wahner-Roedler, Dietlind L

    2005-01-01

    Background Identification of negation in electronic health records is essential if we are to understand the computable meaning of the records. Our objective is to compare the accuracy of an automated mechanism for assignment of negation to clinical concepts within a compositional expression with human-assigned negation, and to perform a failure analysis to identify the causes of poorly identified negation (i.e. missed conceptual representation, inaccurate conceptual representation, missed negation, inaccurate identification of negation). Methods 41 clinical documents (medical evaluations; sometimes outside of Mayo these are referred to as history and physical examinations) were parsed using the Mayo Vocabulary Server Parsing Engine. SNOMED-CT™ was used to provide concept coverage for the clinical concepts in the record. These records resulted in identification of concepts and textual clues to negation. These records were reviewed by an independent medical terminologist, and the results were tallied in a spreadsheet. Where questions arose on review, Internal Medicine faculty were employed to make a final determination. Results SNOMED-CT was used to provide concept coverage of the 14,792 concepts in 41 health records from Johns Hopkins University. Of these, 1,823 concepts were identified as negative by human review. The sensitivity (recall) of the assignment of negation was 97.2% (p < 0.001, Pearson chi-square test, when compared to a coin flip). The specificity of assignment of negation was 98.8%. The positive likelihood ratio of the negation was 81. The positive predictive value (precision) was 91.2%. Conclusion Automated assignment of negation to concepts identified in health records based on review of the text is feasible and practical. Lexical assignment of negation is a good test of true negativity, as judged by the high sensitivity, specificity and positive likelihood ratio of the test. SNOMED-CT had overall coverage of 88.7% of the concepts being negated. PMID:15876352
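
    A widely used lexical approach to this task, in the spirit of NegEx rather than the Mayo Vocabulary Server engine evaluated here, marks a concept as negated when a trigger term occurs within a few tokens before the mention. The Python sketch below illustrates the pattern; the trigger list and window size are illustrative assumptions.

        # NegEx-style sketch of lexical negation assignment: a concept mention
        # is negated if a trigger word occurs within WINDOW tokens before it.
        # Triggers and window are illustrative; not the Mayo parsing engine.
        import re

        TRIGGERS = {"no", "not", "denies", "without"}
        WINDOW = 5  # tokens before the concept considered as negation scope

        def is_negated(text, concept):
            tokens = re.findall(r"[a-z]+", text.lower())
            ctoks = concept.lower().split()
            for i in range(len(tokens) - len(ctoks) + 1):
                if tokens[i:i + len(ctoks)] == ctoks:
                    if TRIGGERS & set(tokens[max(0, i - WINDOW):i]):
                        return True
            return False

        print(is_negated("Patient denies chest pain on exertion.", "chest pain"))  # True
        print(is_negated("Patient reports chest pain.", "chest pain"))             # False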

  9. Incremental Parsing with Reference Interaction

    DTIC Science & Technology

    2004-07-01

    Evidence from eye movements in spoken language comprehension. Conference Abstract. Architectures and Mechanisms for Language Processing. R. M

  10. Document Image Parsing and Understanding using Neuromorphic Architecture

    DTIC Science & Technology

    2015-03-01

    developed to reduce the processing speed at different layers. In the pattern matching layer, the computing power of multicore processors is explored to reduce the processing... cortex where the complex data is reduced to abstract representations. The abstract representation is compared to stored patterns in massively parallel

  11. Neural Detection of Malicious Network Activities Using a New Direct Parsing and Feature Extraction Technique

    DTIC Science & Technology

    2015-09-01

    intrusion detection systems, neural networks... detection system (IDS) software, which learns to detect and classify network attacks and intrusions through prior training data. With the added criteria of...BACKGROUND: The growing threat of malicious network activities and intrusion attempts makes intrusion detection systems (IDS) a

  12. Learning with leaders.

    PubMed

    Bunkers, Sandra S

    2009-01-01

    This column focuses on ideas concerning leaders and leadership. The author proposes that leadership is about showing up and participating with others in doing something. "Mandela: His 8 Lessons of Leadership" by Richard Stengel is explored in light of selected philosophical writings, literature on nursing leadership, and nurse theorist Rosemarie Rizzo Parse's humanbecoming leading-following model. Teaching-learning questions are then posed to stimulate further reflection on the lessons of leadership.

  13. AGILE: Autonomous Global Integrated Language Exploitation

    DTIC Science & Technology

    2009-12-01

    combination, including METEOR-based alignment (with stemming and WordNet synonym matching) and GIZA++-based alignment. So far, we have not seen any...parse trees and a detailed analysis of how function words operate in translation. This program lets us fix alignment errors that systems like GIZA...correlates better with Pyramid than with Responsiveness scoring (i.e., it is a more precise, careful measure) • BE generally outperforms ROUGE

  14. Moving Target Techniques: Leveraging Uncertainty for CyberDefense

    DTIC Science & Technology

    2015-12-15

    cyberattacks is a continual struggle for system managers. Attackers often need only find one vulnerability (a flaw or bug that an attacker can exploit...additional parsing code itself could have security-relevant software bugs. Dynamic Network: Techniques in the dynamic network domain change the...evaluation of MT techniques can benefit from a variety of evaluation approaches, including abstract analysis, modeling and simulation, test bed

  15. Analysis of Cloud-Based Database Systems

    DTIC Science & Technology

    2015-06-01

    (EU) citizens under the Patriot Act [3]. Unforeseen virtualization bugs have caused wide-reaching outages [4], leaving customers helpless to assist...collected from SQL Server Profiler traces. We analyze the trace results captured from our test bed both before and after increasing system resources...cloud test-bed. A. DATA COLLECTION, PARSING, AND ORGANIZATION: Once we finished collecting the trace data, we knew we needed to have as close a

  16. Neurobiological Bases of Reading Comprehension: Insights from Neuroimaging Studies of Word-Level and Text-Level Processing in Skilled and Impaired Readers

    ERIC Educational Resources Information Center

    Landi, Nicole; Frost, Stephen J.; Mencl, W. Einar; Sandak, Rebecca; Pugh, Kenneth R.

    2013-01-01

    For accurate reading comprehension, readers must first learn to map letters to their corresponding speech sounds and meaning, and then they must string the meanings of many words together to form a representation of the text. Furthermore, readers must master the complexities involved in parsing the relevant syntactic and pragmatic information…

  17. Rochester Connectionist Papers. 1979-1985

    DTIC Science & Technology

    1985-12-01

    updated and improved version of the thesis account of recent neurolinguistic data. Fanty, M., "Context-free parsing in connectionist networks." TR 174...April 1982. Our first large program in the connectionist paradigm. It simulates a multi-layer network for recognizing line drawings of Origami figures...The program successfully deals with noise and simple occlusion and the thesis incorporates many key ideas on designing and running large models. Small

  18. PCF File Format.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thoreson, Gregory G

    PCF files are binary files designed to contain gamma spectra and neutron count rates from radiation sensors. It is the native format for the GAmma Detector Response and Analysis Software (GADRAS) package [1]. It can contain multiple spectra and information about each spectrum such as energy calibration. This document outlines the format of the file that would allow one to write a computer program to parse and write such files.
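
    As a sketch of what a parser for such a file looks like, the Python skeleton below reads a fixed header and a block of float32 channel counts. The field layout here is purely hypothetical, chosen only to illustrate the pattern; real offsets, types, and record sizes must be taken from the PCF specification itself.

        # Generic skeleton for parsing a binary spectrum file of the kind the
        # PCF document specifies. The layout below (64-byte title, uint32
        # channel count, two float32 calibration terms, float32 counts) is
        # HYPOTHETICAL, for illustration only; consult the PCF spec.
        import struct

        def read_spectrum(path):
            with open(path, "rb") as fh:
                header = fh.read(64 + 4 + 8)
                title = header[:64].rstrip(b"\x00").decode("ascii", "replace")
                n_channels, cal0, cal1 = struct.unpack("<Iff", header[64:])
                counts = struct.unpack(f"<{n_channels}f", fh.read(4 * n_channels))
            return title, (cal0, cal1), counts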

  19. Syllabic Strategy as Opposed to Coda Optimization in the Segmentation of Spanish Letter-Strings Using Word Spotting

    ERIC Educational Resources Information Center

    Álvarez, Carlos J.; Taft, Marcus; Hernández-Cabrera, Juan A.

    2017-01-01

    A word-spotting task is used in Spanish to test the way in which polysyllabic letter-strings are parsed in this language. Monosyllabic words (e.g., "bar") embedded at the beginning of a pseudoword were immediately followed by either a coda-forming consonant (e.g., "barto") or a vowel (e.g., "baros"). In the former…

  20. Intelligent Semantic Query of Notices to Airmen (NOTAMs)

    DTIC Science & Technology

    2006-07-01

    definition of the airspace is constantly changing, new vocabulary is added and old words retired on a monthly basis, and the information specifying this is...NOTAMs are notices containing information on the conditions, or changes to, aeronautical facilities, services, procedures, or hazards, which are...develop a new parsing system, employing and extending ideas developed by the information-extraction community, rather than on classical computational

  1. Language and the Law: A Case for Linguistic Pragmatics. Sociolinguistic Working Paper Number 94.

    ERIC Educational Resources Information Center

    Prince, Ellen F.

    The emergence of a subfield of linguistics, linguistic pragmatics, whose goal is to discover the principles by which a hearer or reader understands a text or can construct a model based on the text, given the sentence-level competence to parse the text's sentences and assign logical forms to them, is discussed in the context of a court case in…

  2. Does the Shallow Structures Proposal Account for Qualitative Differences in First and Second Language Processing?

    ERIC Educational Resources Information Center

    Sabourin, Laura

    2006-01-01

    In their Keynote Article, Clahsen and Felser (CF) provide a detailed summary and comparison of grammatical processing in adult first language (L1) speakers, child L1 speakers, and second language (L2) speakers. CF conclude that child and adult L1 processing makes use of a continuous parsing mechanism, and that any differences found in processing…

  3. Age-Related Differences in Speech Rate Perception Do Not Necessarily Entail Age-Related Differences in Speech Rate Use

    ERIC Educational Resources Information Center

    Heffner, Christopher C.; Newman, Rochelle S.; Dilley, Laura C.; Idsardi, William J.

    2015-01-01

    Purpose: A new literature has suggested that speech rate can influence the parsing of words quite strongly in speech. The purpose of this study was to investigate differences between younger adults and older adults in the use of context speech rate in word segmentation, given that older adults perceive timing information differently from younger…

  4. Children Do Not Overcome Lexical Biases Where Adults Do: The Role of the Referential Scene in Garden-Path Recovery

    ERIC Educational Resources Information Center

    Kidd, Evan; Stewart, Andrew J.; Serratrice, Ludovica

    2011-01-01

    In this paper we report on a visual world eye-tracking experiment that investigated the differing abilities of adults and children to use referential scene information during reanalysis to overcome lexical biases during sentence processing. The results showed that adults incorporated aspects of the referential scene into their parse as soon as it…

  5. Computing Accurate Grammatical Feedback in a Virtual Writing Conference for German-Speaking Elementary-School Children: An Approach Based on Natural Language Generation

    ERIC Educational Resources Information Center

    Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine

    2009-01-01

    We built a natural language processing (NLP) system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary…

  6. Critiquing Systems for Decision Support

    DTIC Science & Technology

    2006-02-01

    errors and deficiencies. An example of a comparative critic is the ATTENDING system (anaesthesiology), which first parses the user’s solution into a...design tools at the times when those tools are useful. 9. Experiential critics provide reminders of past experiences with similar designs or design...technique for hypertension rather than the broader field of anaesthesiology; and (2) critiquing systems are most appropriate for tasks that require

  7. A cognitive model for multidigit number reading: Inferences from individuals with selective impairments.

    PubMed

    Dotan, Dror; Friedmann, Naama

    2018-04-01

    We propose a detailed cognitive model of multi-digit number reading. The model postulates separate processes for visual analysis of the digit string and for oral production of the verbal number. Within visual analysis, separate sub-processes encode the digit identities and the digit order, and additional sub-processes encode the number's decimal structure: its length, the positions of 0, and the way it is parsed into triplets (e.g., 314987 → 314,987). Verbal production consists of a process that generates the verbal structure of the number, and another process that retrieves the phonological forms of each number word. The verbal number structure is first encoded in a tree-like structure, similarly to syntactic trees of sentences, and then linearized to a sequence of number-word specifiers. This model is based on an investigation of the number processing abilities of seven individuals with different selective deficits in number reading. We report participants with impairment in specific sub-processes of the visual analysis of digit strings - in encoding the digit order, in encoding the number length, or in parsing the digit string to triplets. Other participants were impaired in verbal production, making errors in the number structure (shifts of digits to another decimal position, e.g., 3,040 → 30,004). Their selective deficits yielded several dissociations: first, we found a double dissociation between visual analysis deficits and verbal production deficits. Second, several dissociations were found within visual analysis: a double dissociation between errors in digit order and errors in the number length; a dissociation between order/length errors and errors in parsing the digit string into triplets; and a dissociation between the processing of different digits - impaired order encoding of the digits 2-9, without errors in the 0 position. Third, within verbal production, a dissociation was found between digit shifts and substitutions of number words. A selective deficit in any of the processes described by the model would cause difficulties in number reading, which we propose to term "dysnumeria". Copyright © 2017 Elsevier Ltd. All rights reserved.
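
    One of the visual-analysis sub-processes the model postulates, parsing the digit string into triplets, is easy to make concrete: the toy function below groups digits from the right (314987 → 314,987), which also exposes the number's length structure. It illustrates the operation only, and makes no claim about the model's cognitive mechanism.

        # Toy illustration of the triplet-parsing sub-process: group a digit
        # string into triplets from the right. Illustrative only.
        def parse_triplets(digits):
            groups = []
            while digits:
                groups.append(digits[-3:])
                digits = digits[:-3]
            return ",".join(reversed(groups))

        assert parse_triplets("314987") == "314,987"
        assert parse_triplets("3040") == "3,040"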

  8. Telemetry and Science Data Software System

    NASA Technical Reports Server (NTRS)

    Bates, Lakesha; Hong, Liang

    2011-01-01

    The Telemetry and Science Data Software System (TSDSS) was designed to validate the operational health of a spacecraft, ease test verification, assist in debugging system anomalies, and provide trending data and advanced science analysis. In doing so, the system parses, processes, and organizes raw data from the Aquarius instrument both on the ground and while in space. In addition, it provides a user-friendly telemetry viewer and an instant push-button test report generator. Existing ground data systems can parse and provide simple data processing, but have limitations in advanced science analysis and instant report generation. The TSDSS functions as an offline data analysis system during the I&T (integration and test) and mission operations phases. After raw data are downloaded from an instrument, TSDSS ingests the data files, parses them, converts telemetry to engineering units, and applies advanced algorithms to produce science level 0, 1, and 2 data products. Meanwhile, it automatically schedules upload of the raw data to a remote server and archives all intermediate and final values in a MySQL database in time order. All data saved in the system can be straightforwardly retrieved, exported, and migrated. Using TSDSS's interactive data visualization tool, a user can conveniently choose any combination and mathematical computation of interesting telemetry points from a large range of time periods (the life cycle of mission ground data and mission operations testing) and display a graphical and statistical view of the data. With this graphical user interface (GUI), the queried data and graphs can be exported and saved in multiple formats. This GUI is especially useful in trending data analysis, debugging anomalies, and advanced data analysis. At the request of the user, mission-specific instrument performance assessment reports can be generated with a simple click of a button on the GUI. From instrument level to observatory level, the TSDSS has been operating in support of functional and performance tests and the refinement of system calibration algorithms and coefficients, in sync with the Aquarius/SAC-D spacecraft. At the time of this reporting, it was prepared and set up to perform anomaly investigation for mission operations preceding the Aquarius/SAC-D spacecraft launch on June 10, 2011.
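
    The core parse-convert-store pattern described here can be sketched generically: unpack raw counts from a packet, apply a calibration to obtain engineering units, and keep rows in time order. The packet layout and calibration coefficients below are hypothetical placeholders, not Aquarius values.

        # Generic sketch of a telemetry pipeline: parse raw counts, convert to
        # engineering units with a calibration, return a time-ordered row.
        # Packet layout and coefficients are HYPOTHETICAL placeholders.
        import struct

        CALIBRATION = {"temp": (-50.0, 0.05)}  # hypothetical offset/gain

        def to_engineering(name, raw_counts):
            offset, gain = CALIBRATION[name]
            return offset + gain * raw_counts

        def parse_packet(packet):
            """Hypothetical packet: uint32 timestamp then one uint16 count."""
            timestamp, raw = struct.unpack(">IH", packet[:6])
            return {"t": timestamp, "temp": to_engineering("temp", raw)}

        row = parse_packet(struct.pack(">IH", 1700000000, 1500))
        print(row)  # {'t': 1700000000, 'temp': 25.0}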

  9. User-defined functions in the Arden Syntax: An extension proposal.

    PubMed

    Karadimas, Harry; Ebrahiminia, Vahid; Lepage, Eric

    2015-12-11

    The Arden Syntax is a knowledge-encoding standard, started in 1989 and now in its 10th revision, maintained by the Health Level Seven (HL7) organization. It has constructs borrowed from several language concepts that were available at that time (mainly the HELP hospital information system and the Regenstrief medical record system (RMRS), but also the Pascal language, functional languages, and the data structure of frames used in artificial intelligence). The syntax has a rationale for its constructs and has restrictions that follow this rationale. The main goal of the standard is to promote knowledge sharing, by avoiding the complexity of traditional programs, so that a medical logic module (MLM) written in the Arden Syntax can remain shareable and understandable across institutions. One of the restrictions of the syntax is that you cannot define your own functions and subroutines inside an MLM. An MLM can, however, call another MLM, where this MLM will serve as a function. This adds an additional dependency between MLMs, a known criticism of the Arden Syntax knowledge model. This article explains why we believe the Arden Syntax would benefit from a construct for user-defined functions, and discusses the need, the benefits and the limitations of such a construct. We used the recent grammar of the Arden Syntax v.2.10, with both the Arden Syntax standard document and the Arden Syntax Rationale article as guidelines. We gradually introduced production rules to the grammar and used the CUP parsing tool to verify that no ambiguities were detected. A new grammar that supports user-defined functions was produced, with 22 production rules added, and a parser was built using the CUP parsing tool. A few examples are given to illustrate the concepts; all examples were parsed correctly. It is possible to add user-defined functions to the Arden Syntax in a way that remains coherent with the standard. We believe that this enhances the readability and the robustness of MLMs. A detailed proposal will be submitted by the end of the year to the HL7 workgroup on Arden Syntax. Copyright © 2015 Elsevier B.V. All rights reserved.

  10. Detection and categorization of bacteria habitats using shallow linguistic analysis

    PubMed Central

    2015-01-01

    Background Information regarding bacteria biotopes is important for several research areas including health sciences, microbiology, and food processing and preservation. One of the challenges for scientists in these domains is the huge amount of information buried in the text of electronic resources. Developing methods to automatically extract bacteria habitat relations from the text of these electronic resources is crucial for facilitating research in these areas. Methods We introduce a linguistically motivated rule-based approach for recognizing and normalizing names of bacteria habitats in biomedical text by using an ontology. Our approach is based on the shallow syntactic analysis of the text that include sentence segmentation, part-of-speech (POS) tagging, partial parsing, and lemmatization. In addition, we propose two methods for identifying bacteria habitat localization relations. The underlying assumption for the first method is that discourse changes with a new paragraph. Therefore, it operates on a paragraph-basis. The second method performs a more fine-grained analysis of the text and operates on a sentence-basis. We also develop a novel anaphora resolution method for bacteria coreferences and incorporate it with the sentence-based relation extraction approach. Results We participated in the Bacteria Biotope (BB) Task of the BioNLP Shared Task 2013. Our system (Boun) achieved the second best performance with 68% Slot Error Rate (SER) in Sub-task 1 (Entity Detection and Categorization), and ranked third with an F-score of 27% in Sub-task 2 (Localization Event Extraction). This paper reports the system that is implemented for the shared task, including the novel methods developed and the improvements obtained after the official evaluation. The extensions include the expansion of the OntoBiotope ontology using the training set for Sub-task 1, and the novel sentence-based relation extraction method incorporated with anaphora resolution for Sub-task 2. These extensions resulted in promising results for Sub-task 1 with a SER of 68%, and state-of-the-art performance for Sub-task 2 with an F-score of 53%. Conclusions Our results show that a linguistically-oriented approach based on the shallow syntactic analysis of the text is as effective as machine learning approaches for the detection and ontology-based normalization of habitat entities. Furthermore, the newly developed sentence-based relation extraction system with the anaphora resolution module significantly outperforms the paragraph-based one, as well as the other systems that participated in the BB Shared Task 2013. PMID:26201262

  11. Pybel: a Python wrapper for the OpenBabel cheminformatics toolkit

    PubMed Central

    O'Boyle, Noel M; Morley, Chris; Hutchison, Geoffrey R

    2008-01-01

    Background Scripting languages such as Python are ideally suited to common programming tasks in cheminformatics such as data analysis and parsing information from files. However, for reasons of efficiency, cheminformatics toolkits such as the OpenBabel toolkit are often implemented in compiled languages such as C++. We describe Pybel, a Python module that provides access to the OpenBabel toolkit. Results Pybel wraps the direct toolkit bindings to simplify common tasks such as reading and writing molecular files and calculating fingerprints. Extensive use is made of Python iterators to simplify loops such as that over all the molecules in a file. A Pybel Molecule can be easily interconverted to an OpenBabel OBMol to access those methods or attributes not wrapped by Pybel. Conclusion Pybel allows cheminformaticians to rapidly develop Python scripts that manipulate chemical information. It is open source, available cross-platform, and offers the power of the OpenBabel toolkit to Python programmers. PMID:18328109
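
    A short usage example based on the API as the article describes it (file iterators, fingerprints, SMILES output) is shown below. Note that under Open Babel 3.x the module is imported as "from openbabel import pybel", while older releases use "import pybel"; the input filename is a placeholder.

        # Usage sketch of the Pybel API: iterate over molecules in an SDF
        # file, compute a fingerprint, and print the title and SMILES.
        # "molecules.sdf" is a placeholder filename.
        from openbabel import pybel  # or: import pybel (Open Babel 2.x)

        for mol in pybel.readfile("sdf", "molecules.sdf"):  # iterator over the file
            fp = mol.calcfp()                               # default fingerprint
            print(mol.title, mol.write("smi").strip())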

  12. Adaptive real time selection for quantum key distribution in lossy and turbulent free-space channels

    NASA Astrophysics Data System (ADS)

    Vallone, Giuseppe; Marangon, Davide G.; Canale, Matteo; Savorgnan, Ilaria; Bacco, Davide; Barbieri, Mauro; Calimani, Simon; Barbieri, Cesare; Laurenti, Nicola; Villoresi, Paolo

    2015-04-01

    The unconditional security in the creation of cryptographic keys obtained by quantum key distribution (QKD) protocols will induce a quantum leap in free-space communication privacy, in the same way that we are beginning to realize secure optical fiber connections. However, free-space channels, in particular those with long links and the presence of atmospheric turbulence, are affected by losses, fluctuating transmissivity, and background light that impair the conditions for secure QKD. Here we introduce a method to counteract atmospheric turbulence in QKD experiments. Our adaptive real time selection (ARTS) technique at the receiver is based on the selection of the intervals with higher channel transmissivity. We demonstrate, using data from the Canary Island 143-km free-space link, that conditions with an unacceptable average quantum bit error rate, which would prevent the generation of a secure key, can be used once parsed according to the instantaneous scintillation using the ARTS technique.
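
    The selection step can be sketched simply: estimate the instantaneous transmissivity of each time bin and compute the quantum bit error rate (QBER) only over bins above a threshold. The numbers below are synthetic placeholders rather than Canary Island link parameters, but they show how discarding low-transmissivity intervals can pull the average QBER back to a usable level.

        # Sketch of the ARTS selection idea: keep only time bins whose
        # transmissivity exceeds a threshold, then compute QBER over the
        # kept bins. Data and threshold are synthetic placeholders.
        def arts_select(bins, threshold):
            """bins: list of (transmissivity, errors, sifted_bits) per interval."""
            kept = [(e, n) for t, e, n in bins if t >= threshold]
            errors = sum(e for e, n in kept)
            sifted = sum(n for e, n in kept)
            return errors / sifted if sifted else None

        bins = [(0.02, 90, 1000), (0.10, 25, 1000), (0.15, 20, 1000)]
        print("QBER without selection:", (90 + 25 + 20) / 3000)   # 0.045
        print("QBER with ARTS (t >= 0.08):", arts_select(bins, 0.08))  # 0.0225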

  13. Pybel: a Python wrapper for the OpenBabel cheminformatics toolkit.

    PubMed

    O'Boyle, Noel M; Morley, Chris; Hutchison, Geoffrey R

    2008-03-09

    Scripting languages such as Python are ideally suited to common programming tasks in cheminformatics such as data analysis and parsing information from files. However, for reasons of efficiency, cheminformatics toolkits such as the OpenBabel toolkit are often implemented in compiled languages such as C++. We describe Pybel, a Python module that provides access to the OpenBabel toolkit. Pybel wraps the direct toolkit bindings to simplify common tasks such as reading and writing molecular files and calculating fingerprints. Extensive use is made of Python iterators to simplify loops such as that over all the molecules in a file. A Pybel Molecule can be easily interconverted to an OpenBabel OBMol to access those methods or attributes not wrapped by Pybel. Pybel allows cheminformaticians to rapidly develop Python scripts that manipulate chemical information. It is open source, available cross-platform, and offers the power of the OpenBabel toolkit to Python programmers.

  14. Realization of Real-Time Clinical Data Integration Using Advanced Database Technology

    PubMed Central

    Yoo, Sooyoung; Kim, Boyoung; Park, Heekyong; Choi, Jinwook; Chun, Jonghoon

    2003-01-01

    As information and communication technologies have advanced, interest in mobile health care systems has grown. In order to obtain information seamlessly from distributed and fragmented clinical data held by heterogeneous institutions, we need solutions that integrate data. In this article, we introduce a method for information integration based on real-time message communication using triggers and advanced database technologies. Messages were devised to conform to HL7, a standard for electronic data exchange in healthcare environments. The HL7-based system provides an integrated environment in which we are able to manage the complexities of medical data. We developed this message communication interface to generate and parse HL7 messages automatically from the database point of view. We discuss how easily real-time data exchange is performed in the clinical information system while imposing minimal load on the database system. PMID:14728271
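
    To make the message-parsing side concrete, the sketch below splits a minimal HL7 v2 message on the standard segment (carriage return) and field (|) separators; the message content is a hypothetical toy, far simpler than real clinical traffic.

        # Parse a toy HL7 v2 message into segments and fields.
        message = (
            "MSH|^~\\&|LAB|HOSP|EMR|HOSP|200301011200||ORU^R01|MSG00001|P|2.3\r"
            "PID|||123456||DOE^JOHN\r"
            "OBX|1|NM|GLU^Glucose||95|mg/dL\r"
        )

        segments = [seg.split("|") for seg in message.strip("\r").split("\r")]
        for fields in segments:
            print(fields[0], fields[1:4])   # segment name and first few fields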

  15. Search for Minimal and Semi-Minimal Rule Sets in Incremental Learning of Context-Free and Definite Clause Grammars

    NASA Astrophysics Data System (ADS)

    Imada, Keita; Nakamura, Katsuhiko

    This paper describes recent improvements to the Synapse system for incremental learning of general context-free grammars (CFGs) and definite clause grammars (DCGs) from positive and negative sample strings. An important feature of our approach is incremental learning, realized by a rule generation mechanism called “bridging,” which is based on bottom-up parsing of positive samples combined with a search over rule sets. The sizes of the rule sets and the computation time depend on the search strategy. In addition to global search, which synthesizes minimal rule sets, and serial search, another method that synthesizes semi-optimum rule sets, we incorporate beam search into the system to synthesize semi-minimal rule sets. The paper presents several experimental results on learning CFGs and DCGs, and analyzes the sizes of the rule sets and the computation time.
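
    The bottom-up parsing primitive on which such rule-set search rests can be illustrated with a minimal CYK membership test for a CFG in Chomsky normal form; the toy grammar below (for a^n b^n) is hypothetical, not one of the paper's learned rule sets.

        # Binary rules keyed by right-hand side, plus terminal rules.
        binary = {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}
        terminal = {"a": {"A"}, "b": {"B"}}

        def cyk_accepts(s: str, start: str = "S") -> bool:
            n = len(s)
            if n == 0:
                return False
            table = [[set() for _ in range(n)] for _ in range(n)]
            for i, ch in enumerate(s):
                table[i][i] = set(terminal.get(ch, ()))
            for span in range(2, n + 1):              # substring length
                for i in range(n - span + 1):
                    j = i + span - 1
                    for k in range(i, j):             # split point
                        for b in table[i][k]:
                            for c in table[k + 1][j]:
                                table[i][j] |= binary.get((b, c), set())
            return start in table[0][n - 1]

        print(cyk_accepts("aabb"), cyk_accepts("abab"))   # True False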

  16. Efficient Exact Inference With Loss Augmented Objective in Structured Learning.

    PubMed

    Bauer, Alexander; Nakajima, Shinichi; Muller, Klaus-Robert

    2016-08-19

    Structural support vector machine (SVM) is an elegant approach for building complex and accurate models with structured outputs. However, its applicability relies on the availability of efficient inference algorithms: the state-of-the-art training algorithms repeatedly perform inference to compute a subgradient or to find the most violating configuration. In this paper, we propose an exact inference algorithm for maximizing nondecomposable objectives arising from a special type of high-order potential with a decomposable internal structure. As an important application, our method covers loss augmented inference, which enables the slack and margin scaling formulations of structural SVM with a variety of dissimilarity measures, e.g., Hamming loss, precision and recall, Fβ-loss, intersection over union, and many other functions that can be efficiently computed from the contingency table. We demonstrate the advantages of our approach in natural language parsing and sequence segmentation applications.
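
    As an example of a dissimilarity measure computable from the contingency table, the hypothetical helper below evaluates the Fβ-loss from true-positive, false-positive, and false-negative counts.

        def f_beta_loss(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
            """1 - F_beta, computed directly from contingency-table counts."""
            b2 = beta * beta
            denom = (1 + b2) * tp + b2 * fn + fp
            return 1.0 if denom == 0 else 1.0 - (1 + b2) * tp / denom

        print(f_beta_loss(tp=8, fp=2, fn=4))   # F1 = 16/22, so loss ~ 0.273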

  17. Social motivation in schizophrenia: How research on basic reward processes informs and limits our understanding.

    PubMed

    Fulford, Daniel; Campellone, Tim; Gard, David E

    2018-05-28

    Limited quantity and quality of interpersonal exchanges and relationships predict worse symptomatic and hospitalization outcomes and limit functional recovery in people with schizophrenia. While deficits in social skills and social cognition contribute to much of the impairment in social functioning in schizophrenia, our focus in the current review is social motivation: the drive to connect with others and form meaningful, lasting relationships. We pay particular attention to how recent research on reward informs, and limits, our understanding of the construct. Recent findings that parse out key components of human motivation, especially the temporal nature of reward and effort, are informative for understanding some aspects of social motivation. This approach, however, fails to fully integrate the critical influence of uncertainty and punishment (e.g., avoidance, threat) on social motivation. In the current review, we argue for the importance of experimental paradigms and real-time measurement to capture the interaction between social approach and avoidance in characterizing social affiliation in schizophrenia. We end with suggestions for how researchers might move the field forward by emphasizing the ecological validity of social motivation paradigms, including dynamic, momentary assessment of social reward and punishment using mobile technology and other innovative tools.

  18. Machine aided indexing from natural language text

    NASA Technical Reports Server (NTRS)

    Silvester, June P.; Genuardi, Michael T.; Klingbiel, Paul H.

    1993-01-01

    The NASA Lexical Dictionary (NLD) Machine Aided Indexing (MAI) system was designed to (1) reuse the indexing of the Defense Technical Information Center (DTIC); (2) reuse the indexing of the Department of Energy (DOE); and (3) reduce the time required for original indexing. This was done by automatically generating appropriate NASA thesaurus terms from either the other agency's index terms, or, for original indexing, from document titles and abstracts. The NASA STI Program staff devised two different ways to generate thesaurus terms from text. The first group of programs identified noun phrases by a parsing method that allowed for conjunctions and certain prepositions, on the assumption that indexable concepts are found in such phrases. Results were not always satisfactory, and it was noted that indexable concepts often occurred outside of noun phrases. The first method also proved to be too slow for the ultimate goal of interactive (online) MAI. The second group of programs used the knowledge base (KB), word proximity, and frequency of word and phrase occurrence to identify indexable concepts. Both methods are described and illustrated. Online MAI has been achieved, as well as several spinoff benefits, which are also described.
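
    A schematic sketch of the second, knowledge-base-driven approach: scan the title and abstract for phrases held in a knowledge base and rank the matched thesaurus terms by frequency of occurrence. The knowledge base and text below are hypothetical toys, not the NASA Lexical Dictionary.

        import re
        from collections import Counter

        kb = {"machine aided indexing": "INDEXES (DOCUMENTATION)",
              "thesaurus": "THESAURI",
              "natural language": "NATURAL LANGUAGE PROCESSING"}

        text = ("Machine aided indexing generates thesaurus terms from "
                "natural language titles and abstracts.").lower()

        # Count how often each knowledge-base phrase occurs in the text.
        counts = Counter()
        for phrase, term in kb.items():
            counts[term] += len(re.findall(re.escape(phrase), text))

        for term, n in counts.most_common():
            if n:
                print(term, n)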

  19. Preserving Data for Renewable Energy

    NASA Astrophysics Data System (ADS)

    Macduff, M.; Sivaraman, C.

    2017-12-01

    The EERE Atmosphere to Electrons (A2e) program established the Data Archive and Portal (DAP) to ensure long-term preservation of and access to A2e research data. The DAP has been operated by PNNL for two years, holding data from more than a dozen projects, with 1 PB of data and hundreds of datasets expected to be stored this year. The data are a diverse mix of model runs, observational data, and derived products. While most of the data are public, the DAP also securely stores many proprietary datasets provided by energy producers that are critical to the research goals of the A2e program. The DAP uses Amazon Web Services (AWS) and PNNL resources to provide long-term archival of and access to the data with appropriate access controls. As a key element of the DAP, metadata are collected for each dataset to assist with data discovery and to improve the usefulness of the data. Further, the DAP has begun standardizing observational data into NetCDF, which allows users to focus on the data instead of parsing many formats. Creating a central repository attuned to the unique needs of the A2e research community is helping active tasks today as well as making many future research efforts possible. In this presentation, we provide an overview of the DAP's capabilities and its benefits to the renewable energy community.
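
    The payoff of standardizing on NetCDF can be sketched with xarray: any archived dataset opens the same way, carrying variable names, units, and coordinates as self-describing metadata. The file and variable names below are hypothetical.

        import xarray as xr

        ds = xr.open_dataset("a2e_lidar_example.nc")   # hypothetical archived dataset
        print(ds.data_vars)                            # discover variables, no custom parser
        wind = ds["wind_speed"]                        # units and coordinates travel with the data
        print(wind.attrs.get("units"), wind.dims)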

  20. A method for high-throughput production of sequence-verified DNA libraries and strain collections.

    PubMed

    Smith, Justin D; Schlecht, Ulrich; Xu, Weihong; Suresh, Sundari; Horecka, Joe; Proctor, Michael J; Aiyar, Raeka S; Bennett, Richard A O; Chu, Angela; Li, Yong Fuga; Roy, Kevin; Davis, Ronald W; Steinmetz, Lars M; Hyman, Richard W; Levy, Sasha F; St Onge, Robert P

    2017-02-13

    The low costs of array-synthesized oligonucleotide libraries are empowering rapid advances in quantitative and synthetic biology. However, high synthesis error rates, uneven representation, and lack of access to individual oligonucleotides limit the true potential of these libraries. We have developed a cost-effective method called Recombinase Directed Indexing (REDI), which involves integration of a complex library into yeast, site-specific recombination to index library DNA, and next-generation sequencing to identify desired clones. We used REDI to generate a library of ~3,300 DNA probes that exhibited > 96% purity and remarkable uniformity (> 95% of probes within twofold of the median abundance). Additionally, we created a collection of ~9,000 individually accessible CRISPR interference yeast strains for > 99% of genes required for either fermentative or respiratory growth, demonstrating the utility of REDI for rapid and cost-effective creation of strain collections from oligonucleotide pools. Our approach is adaptable to any complex DNA library, and fundamentally changes how these libraries can be parsed, maintained, propagated, and characterized.
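
    As a loose sketch of the sequencing-based identification step (heavily simplified and not the REDI pipeline), one can tally barcode reads per clone and accept clones whose dominant barcode is sufficiently pure; all sequences and thresholds below are hypothetical.

        from collections import Counter

        reads = ["ACGT", "ACGT", "ACGA", "TTGC", "ACGT"]   # toy barcode-region reads for one clone
        counts = Counter(reads)
        barcode, n = counts.most_common(1)[0]
        purity = n / sum(counts.values())
        print(barcode, f"purity={purity:.2f}", "PASS" if purity >= 0.5 else "FAIL")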
