Um, Ki Sung; Kwak, Yun Sik; Cho, Hune; Kim, Il Kon
2005-11-01
A basic assumption of the Health Level Seven (HL7) protocol is 'no limitation on message length'. However, most existing commercial HL7 interface engines do limit message length because they use the string-array method, which runs the HL7 message parsing process entirely in main memory. Specifically, messages carrying image and multimedia data create a very long string array and can cause critical, fatal errors in the computer system. Consequently, such HL7 messages cannot carry the image and multimedia data necessary in modern medical records. This study aims to solve the problem with a 'streaming algorithm' method. This new method for HL7 message parsing applies a character-stream object that processes data character by character between main memory and the hard disk, so that the processing load on main memory is alleviated. The main functions of the new engine are generating, parsing, validating, browsing, sending, and receiving HL7 messages. The engine can also parse and generate XML-formatted HL7 messages. This new HL7 engine successfully exchanged HL7 messages containing 10-megabyte images and discharge summary information between two university hospitals.
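A minimal Python sketch of the character-stream idea described above, assuming a simplified HL7 v2 layout (segments terminated by <CR>, fields separated by '|'); the function name iter_segments is illustrative and not part of the engine the abstract describes.

```python
import io

def iter_segments(stream, chunk_size=4096):
    """Yield HL7 segments one at a time by reading the stream in small
    chunks, so the whole message never has to sit in main memory."""
    buf = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        for ch in chunk:
            if ch == "\r":           # HL7 v2 terminates segments with <CR>
                yield "".join(buf)
                buf = []
            else:
                buf.append(ch)
    if buf:                          # trailing segment without <CR>
        yield "".join(buf)

# Usage: even a multi-megabyte message is processed segment by segment.
message = "MSH|^~\\&|HIS|HOSP|LIS|LAB|200511011200||ADT^A01|42|P|2.3\rPID|1||12345\r"
for seg in iter_segments(io.StringIO(message)):
    fields = seg.split("|")          # the HL7 v2 field separator is '|'
    print(fields[0], len(fields))
```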
Chen, Mingyang; Stott, Amanda C; Li, Shenggang; Dixon, David A
2012-04-01
A robust metadata database called the Collaborative Chemistry Database Tool (CCDBT) for massive amounts of computational chemistry raw data has been designed and implemented. It performs data synchronization and simultaneously extracts the metadata. Computational chemistry data in various formats from different computing sources, software packages, and users can be parsed into uniform metadata for storage in a MySQL database. Parsing is performed by a parsing pyramid: parsers written for different levels of data types and data sets, created by the parser loader after it loads the parser engines and configurations. Copyright © 2011 Elsevier Inc. All rights reserved.
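A hypothetical parser registry in the spirit of the parsing-pyramid idea: each engine turns one raw output format into the same metadata schema ready for storage. The function names and the "energy_au" field are invented for this sketch.

```python
# Each engine turns one raw output format into the same metadata schema.
def parse_gaussian(text):
    # toy extraction; a real engine would walk the full log structure
    return {"package": "Gaussian", "energy_au": float(text.split()[-1])}

def parse_nwchem(text):
    return {"package": "NWChem", "energy_au": float(text.split()[-1])}

PARSERS = {"gaussian": parse_gaussian, "nwchem": parse_nwchem}

def extract_metadata(fmt, text):
    """Dispatch to the right engine; the result is a uniform dict that
    maps directly onto rows of a MySQL metadata table."""
    return PARSERS[fmt](text)

print(extract_metadata("gaussian", "SCF Done: E(RHF) = -76.0267"))
```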
A Semantic Analysis Method for Scientific and Engineering Code
NASA Technical Reports Server (NTRS)
Stewart, Mark E. M.
1998-01-01
This paper develops a procedure to statically analyze aspects of the meaning or semantics of scientific and engineering code. The analysis involves adding semantic declarations to a user's code and parsing this semantic knowledge with the original code using multiple expert parsers. These semantic parsers are designed to recognize formulae in different disciplines including physical and mathematical formulae and geometrical position in a numerical scheme. In practice, a user would submit code with semantic declarations of primitive variables to the analysis procedure, and its semantic parsers would automatically recognize and document some static, semantic concepts and locate some program semantic errors. A prototype implementation of this analysis procedure is demonstrated. Further, the relationship between the fundamental algebraic manipulations of equations and the parsing of expressions is explained. This ability to locate some semantic errors and document semantic concepts in scientific and engineering code should reduce the time, risk, and effort of developing and using these codes.
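One narrow piece of this idea can be sketched in Python: declare a physical dimension for each primitive variable and check expressions for dimensional consistency. This is an invented illustration (textual comparison of dimension strings, a deliberate simplification), not the paper's expert parsers.

```python
DECLS = {"rho": "kg/m^3", "u": "m/s"}   # hypothetical semantic declarations

def dim_of(expr):
    """Return the dimension string of an expression tree; flag additions
    whose operands disagree (a simple class of semantic errors)."""
    if isinstance(expr, str):
        return DECLS[expr]
    op, lhs, rhs = expr
    dl, dr = dim_of(lhs), dim_of(rhs)
    if op in "+-":
        if dl != dr:
            raise TypeError(f"cannot add {dl} to {dr}")
        return dl
    return f"({dl})*({dr})" if op == "*" else f"({dl})/({dr})"

print(dim_of(("*", "rho", ("*", "u", "u"))))   # momentum-flux-like dimension
try:
    dim_of(("+", "rho", "u"))                  # semantic error: rho + u
except TypeError as e:
    print("semantic error:", e)
```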
A Template Engine for Parsing Objects from Textual Representations
NASA Astrophysics Data System (ADS)
Rajković, Milan; Stanković, Milena; Marković, Ivica
2011-09-01
Template engines are widely used for separation of business and presentation logic. They are commonly used in web applications for clean rendering of HTML pages. Another area of usage is message formatting in distributed applications, where they transform objects into appropriate representations. This paper explores the possibility of using templates for the reverse process: creating objects starting from their representations. We present the prototype engine we have developed and describe the benefits and drawbacks of this approach.
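A toy illustration of the reverse-templating idea, assuming a placeholder syntax of the form {name}: the same template that could render an object is compiled into a regular expression that parses one back out of text. All names here are invented.

```python
import re

def compile_template(template):
    """Turn 'Order {order_id} shipped to {city}' into a parsing regex
    with one named group per placeholder."""
    escaped = re.escape(template).replace(r"\{", "{").replace(r"\}", "}")
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>.+?)", escaped)
    return re.compile("^" + pattern + "$")

tmpl = "Order {order_id} shipped to {city} on {date}"
rx = compile_template(tmpl)
m = rx.match("Order 1042 shipped to Nis on 2011-09-01")
print(m.groupdict())  # {'order_id': '1042', 'city': 'Nis', 'date': '2011-09-01'}
```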
NASA Technical Reports Server (NTRS)
2002-01-01
A system that retrieves problem reports from a NASA database is described. The database is queried with natural-language questions. Part-of-speech tags are first assigned to each word in the question using a rule-based tagger. A partial parse of the question is then produced with independent sets of deterministic finite-state automata. Using the partial-parse information, a lookup strategy searches the database for problem reports relevant to the question. A bigram stemmer and irregular verb conjugates have been incorporated into the system to improve accuracy. The system is evaluated on a set of fifty-five questions posed by NASA engineers. A discussion of future research is also presented.
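A toy finite-state chunker in the spirit of the partial parser described above: the "automaton" is a regular expression over part-of-speech tags. The tagset and the example sentence are invented for illustration.

```python
import re

tagged = [("list", "VB"), ("failed", "JJ"), ("valve", "NN"), ("reports", "NNS")]
tags = " ".join(t for _, t in tagged)        # "VB JJ NN NNS"

NP = re.compile(r"((JJ|NN) )*(NNS|NN)\b")    # simple noun-phrase pattern
for m in NP.finditer(tags):
    start = tags[:m.start()].count(" ")      # token index of match start
    size = m.group().count(" ") + 1
    print([w for w, _ in tagged[start:start + size]])
# -> ['failed', 'valve', 'reports']  (one NP chunk)
```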
Accuracy and Tuning of Flow Parsing for Visual Perception of Object Motion During Self-Motion
Niehorster, Diederick C.
2017-01-01
How do we perceive object motion during self-motion using visual information alone? Previous studies have reported that the visual system can use optic flow to identify and globally subtract the retinal motion component resulting from self-motion to recover scene-relative object motion, a process called flow parsing. In this article, we developed a retinal motion nulling method to directly measure and quantify the magnitude of flow parsing (i.e., flow parsing gain) in various scenarios to examine the accuracy and tuning of flow parsing for the visual perception of object motion during self-motion. We found that flow parsing gains were below unity for all displays in all experiments, and that increasing self-motion and object motion speed did not alter flow parsing gain. We conclude that visual information alone is not sufficient for the accurate perception of scene-relative motion during self-motion. Although flow parsing performs global subtraction, its accuracy also depends on local motion information in the retinal vicinity of the moving object. Furthermore, the flow parsing gain was constant across common self-motion or object motion speeds. These results can be used to inform and validate computational models of flow parsing. PMID:28567272
voevent-parse: Parse, manipulate, and generate VOEvent XML packets
NASA Astrophysics Data System (ADS)
Staley, Tim D.
2014-11-01
voevent-parse, written in Python, parses, manipulates, and generates VOEvent XML packets; it is built atop lxml.objectify. Details of transients detected by many projects, including Fermi, Swift, and the Catalina Sky Survey, are currently made available as VOEvents, and the format has also been adopted as the standard alert format by future facilities such as LSST and SKA. However, working with XML and adhering to the sometimes lengthy VOEvent schema can be a tricky process. voevent-parse provides convenience routines for common tasks, while allowing the user to utilise the full power of the lxml library when required. An earlier version of voevent-parse was part of the pysovo (ascl:1411.002) library.
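A minimal usage sketch based on the package's documented load and prettystr helpers; the file name alert.xml is hypothetical.

```python
import voeventparse

# Usage sketch; "alert.xml" is a hypothetical packet file.
with open("alert.xml", "rb") as f:
    v = voeventparse.load(f)          # returns an lxml.objectify tree

print(v.attrib["ivorn"])              # packet identifier
print(voeventparse.prettystr(v.Who))  # e.g. the Who block as pretty XML
```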
A reusability and efficiency oriented software design method for mobile land inspection
NASA Astrophysics Data System (ADS)
Cai, Wenwen; He, Jun; Wang, Qing
2008-10-01
To meet the requirements of the real-time land inspection domain, a land inspection handset system was presented in this paper. In order to increase the reusability of the system, a design-pattern-based framework was presented. Encapsulation of command-like actions with the COMMAND pattern was proposed to handle complex UI interactions. Integration of several GPS-log parsing engines into a general parsing framework was achieved by introducing the STRATEGY pattern. A network transmission module based on network middleware was constructed, and, to mitigate the high coupling of complex network communication programs, the FACTORY pattern was applied to facilitate decoupling. Moreover, in order to manipulate huge GIS datasets efficiently, a multi-scale representation method based on the VISITOR pattern and a quad-tree was presented. Practical use showed that these design patterns reduced the coupling between the subsystems and improved extensibility.
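A hypothetical Python sketch of the STRATEGY-pattern arrangement described above: each GPS-log dialect gets its own parsing engine behind a common interface, so the framework can swap engines freely. The field layouts are invented.

```python
from abc import ABC, abstractmethod

class GpsLogParser(ABC):
    """Common interface: every engine yields (lat, lon) per log line."""
    @abstractmethod
    def parse_line(self, line: str) -> tuple:
        ...

class NmeaParser(GpsLogParser):
    def parse_line(self, line):
        f = line.split(",")
        return float(f[2]), float(f[4])   # toy field positions, not full NMEA

class CsvParser(GpsLogParser):
    def parse_line(self, line):
        lat, lon = line.split(";")
        return float(lat), float(lon)

def load_track(lines, parser: GpsLogParser):
    # the framework never needs to know which concrete engine it holds
    return [parser.parse_line(ln) for ln in lines]

print(load_track(["4512.3;1830.9"], CsvParser()))
```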
1986-06-30
Machine Studies ... 14. Minton, S. N., Hayes, P. J., and Fain, J. E. Controlling Search in Flexible Parsing. Proc. Ninth Int. Jt. Conf. on Artificial... "interaction through the COUSIN command interface", International Journal of Man-Machine Studies, Vol. 19, No. 3, September 1983, pp. 285-305. 8... "in a gracefully interacting user interface," "Dynamic strategy selection in flexible parsing," and "Parsing spoken language: a semantic case frame"
eMontage: An Architecture for Rapid Integration of Situational Awareness Data at the Edge
2013-05-01
Request Response Interaction: Android Client, Dispatcher Servlet, Spring MVC Controller, Camel Producer Template, Camel Route, Remote Data Service (REST)... Publish Subscribe Interaction: Android Client, Dispatcher Servlet, Spring MVC Controller, Remote Data Service... Parse XML into a single Java XML Document object.
Parsing learning in networks using brain-machine interfaces.
Orsborn, Amy L; Pesaran, Bijan
2017-10-01
Brain-machine interfaces (BMIs) define new ways to interact with our environment and hold great promise for clinical therapies. Motor BMIs, for instance, re-route neural activity to control movements of a new effector and could restore movement to people with paralysis. Increasing experience shows that interfacing with the brain inevitably changes the brain. BMIs engage and depend on a wide array of innate learning mechanisms to produce meaningful behavior. BMIs precisely define the information streams into and out of the brain, but engage widespread learning. We take a network perspective and review existing observations of learning in motor BMIs to show that BMIs engage multiple learning mechanisms distributed across neural networks. Recent studies demonstrate the advantages of BMIs for parsing this learning and its underlying neural mechanisms. BMIs therefore provide a powerful tool for studying the neural mechanisms of learning, one that highlights the critical role of learning in engineered neural therapies. Copyright © 2017 Elsevier Ltd. All rights reserved.
Lassahn, Gordon D.; Lancaster, Gregory D.; Apel, William A.; Thompson, Vicki S.
2013-01-08
Image portion identification methods, image parsing methods, image parsing systems, and articles of manufacture are described. According to one embodiment, an image portion identification method includes accessing data regarding an image depicting a plurality of biological substrates corresponding to at least one biological sample and indicating presence of at least one biological indicator within the biological sample and, using processing circuitry, automatically identifying a portion of the image depicting one of the biological substrates but not others of the biological substrates.
MEG Evidence for Incremental Sentence Composition in the Anterior Temporal Lobe
ERIC Educational Resources Information Center
Brennan, Jonathan R.; Pylkkänen, Liina
2017-01-01
Research investigating the brain basis of language comprehension has associated the left anterior temporal lobe (ATL) with sentence-level combinatorics. Using magnetoencephalography (MEG), we test the parsing strategy implemented in this brain region. The number of incremental parse steps from a predictive left-corner parsing strategy that is…
A Noisy-Channel Approach to Question Answering
2003-01-01
question "When did Elvis Presley die?" To do this, we build a noisy channel model that makes explicit how answer sentence parse trees are mapped into... in Figure 1, the algorithm above generates the following training example: Q: When did Elvis Presley die? SA: Presley died PP PP in A_DATE, and... engine as a potential candidate for finding the answer to the question "When did Elvis Presley die?" In this case, we don't know what the answer is
Research on polarization imaging information parsing method
NASA Astrophysics Data System (ADS)
Yuan, Hongwu; Zhou, Pucheng; Wang, Xiaolong
2016-11-01
Polarization information parsing plays an important role in polarization imaging detection. This paper focuses on polarization information parsing methods. First, the general process of polarization information parsing is given, mainly comprising polarization image preprocessing, calculation of multiple polarization parameters, polarization image fusion, and polarization image tracking. Then the research achievements of polarization information parsing are presented. For polarization image preprocessing, a polarization image registration method based on maximum mutual information is designed; experiments show that this method improves registration precision and satisfies the needs of polarization information parsing. For the calculation of multiple polarization parameters, an omnidirectional polarization inversion model is built, from which a variety of polarization parameter images are obtained with noticeably improved inversion precision. For polarization image fusion, an adaptive optimal fusion method for multiple polarization parameters is given, using fuzzy integrals and sparse representation, and target detection in complex scenes is accomplished with a clustering image segmentation algorithm based on fractal characteristics. For polarization image tracking, a fusion tracking algorithm combining mean-shift (average displacement) polarization image characteristics with auxiliary particle filtering is put forward to achieve smooth tracking of moving targets. Finally, the polarization information parsing method is applied to polarization imaging detection of typical targets such as camouflaged targets, fog, and latent fingerprints.
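As one concrete example of the multiple-polarization-parameter step, a sketch using the standard Stokes-parameter formulation from four polarizer angles; this is textbook material, not the paper's specific omnidirectional inversion model.

```python
import numpy as np

def polarization_params(i0, i45, i90, i135):
    """Degree and angle of linear polarization from intensity images
    taken at polarizer angles 0/45/90/135 degrees."""
    s0 = i0 + i90                        # total intensity
    s1 = i0 - i90
    s2 = i45 - i135
    dolp = np.sqrt(s1**2 + s2**2) / np.maximum(s0, 1e-9)  # degree of linear pol.
    aop = 0.5 * np.arctan2(s2, s1)       # angle of polarization (radians)
    return dolp, aop

imgs = [np.random.rand(4, 4) for _ in range(4)]  # stand-in intensity images
dolp, aop = polarization_params(*imgs)
```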
Automated Program Recognition by Graph Parsing
1992-07-01
structures (cliches) in a program can help an experienced programmer understand the program. Based on the known relationships between the cliches, a... Graph Parsing. Linda Mary Wills. Abstract: The recognition of standard computational structures (cliches) in a program can help an experienced programmer... 3.4.1 Structure-Sharing; 3.4.2 Aggregation; 3.5 Chart Parsing Flow
Tags Extraction from Spatial Documents in Search Engines
NASA Astrophysics Data System (ADS)
Borhaninejad, S.; Hakimpour, F.; Hamzei, E.
2015-12-01
Nowadays, selective access to information on the Web is provided by search engines, but when the data includes spatial information the search task becomes more complex and search engines require special capabilities. The purpose of this study is to extract the information that lies in spatial documents. To that end, we implement and evaluate information extraction from GML documents together with a retrieval method in an integrated approach. Our proposed system consists of three components: crawler, database, and user interface. In the crawler component, GML documents are discovered and their text is parsed for information extraction and storage. The database component is responsible for indexing the information collected by the crawlers. Finally, the user interface component provides the interaction between the system and the user. We have implemented this system as a pilot on an application server as a simulation of the Web. As a spatial search engine, our system provides search capability throughout GML documents, an important step toward improving the efficiency of search engines.
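A minimal sketch of the parsing step inside the crawler component, assuming Python's standard-library XML parser and an invented GML fragment.

```python
import xml.etree.ElementTree as ET

gml = """<gml:Point xmlns:gml="http://www.opengis.net/gml">
           <gml:name>station</gml:name>
           <gml:pos>35.7 51.4</gml:pos>
         </gml:Point>"""

root = ET.fromstring(gml)
ns = {"gml": "http://www.opengis.net/gml"}
name = root.findtext("gml:name", namespaces=ns)
lat, lon = map(float, root.findtext("gml:pos", namespaces=ns).split())
print(name, lat, lon)   # -> station 35.7 51.4  (ready for spatial indexing)
```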
Parsing English. Course Notes for a Tutorial on Computational Semantics, March 17-22, 1975.
ERIC Educational Resources Information Center
Wilks, Yorick
The course in parsing English is essentially a survey and comparison of several of the principal systems used for understanding natural language. The basic procedure of parsing is described. The discussion of the principal systems is based on the idea that "meaning is procedures," that is, that the procedures of application give a parsed…
Integrated Japanese Dependency Analysis Using a Dialog Context
NASA Astrophysics Data System (ADS)
Ikegaya, Yuki; Noguchi, Yasuhiro; Kogure, Satoru; Itoh, Toshihiko; Konishi, Tatsuhiro; Kondo, Makoto; Asoh, Hideki; Takagi, Akira; Itoh, Yukihiro
This paper describes how to perform syntactic parsing and semantic analysis in a dialog system. The paper especially deals with how to disambiguate potentially ambiguous sentences using the contextual information. Although syntactic parsing and semantic analysis are often studied independently of each other, correct parsing of a sentence often requires the semantic information on the input and/or the contextual information prior to the input. Accordingly, we merge syntactic parsing with semantic analysis, which enables syntactic parsing taking advantage of the semantic content of an input and its context. One of the biggest problems of semantic analysis is how to interpret dependency structures. We employ a framework for semantic representations that circumvents the problem. Within the framework, the meaning of any predicate is converted into a semantic representation which only permits a single type of predicate: an identifying predicate "aru". The semantic representations are expressed as sets of "attribute-value" pairs, and those semantic representations are stored in the context information. Our system disambiguates syntactic/semantic ambiguities of inputs referring to the attribute-value pairs in the context information. We have experimentally confirmed the effectiveness of our approach; specifically, the experiment confirmed high accuracy of parsing and correctness of generated semantic representations.
High-frequency neural activity predicts word parsing in ambiguous speech streams.
Kösem, Anne; Basirat, Anahita; Azizi, Leila; van Wassenhove, Virginie
2016-12-01
During speech listening, the brain parses a continuous acoustic stream of information into computational units (e.g., syllables or words) necessary for speech comprehension. Recent neuroscientific hypotheses have proposed that neural oscillations contribute to speech parsing, but whether they do so on the basis of acoustic cues (bottom-up acoustic parsing) or as a function of available linguistic representations (top-down linguistic parsing) is unknown. In this magnetoencephalography study, we contrasted acoustic and linguistic parsing using bistable speech sequences. While listening to the speech sequences, participants were asked to maintain one of the two possible speech percepts through volitional control. We predicted that the tracking of speech dynamics by neural oscillations would not only follow the acoustic properties but also shift in time according to the participant's conscious speech percept. Our results show that the latency of high-frequency activity (specifically, beta and gamma bands) varied as a function of the perceptual report. In contrast, the phase of low-frequency oscillations was not strongly affected by top-down control. Whereas changes in low-frequency neural oscillations were compatible with the encoding of prelexical segmentation cues, high-frequency activity specifically informed on an individual's conscious speech percept. Copyright © 2016 the American Physiological Society.
Xu, Hua; AbdelRahman, Samir; Lu, Yanxin; Denny, Joshua C.; Doan, Son
2011-01-01
Semantic-based sublanguage grammars have been shown to be an efficient method for medical language processing. However, given the complexity of the medical domain, parsers using such grammars inevitably encounter ambiguous sentences, which could be interpreted by different groups of production rules and consequently result in two or more parse trees. One possible solution, which has not been extensively explored previously, is to augment productions in medical sublanguage grammars with probabilities to resolve the ambiguity. In this study, we associated probabilities with production rules in a semantic-based grammar for medication findings and evaluated its performance on reducing parsing ambiguity. Using the existing data set from 2009 i2b2 NLP (Natural Language Processing) challenge for medication extraction, we developed a semantic-based CFG (Context Free Grammar) for parsing medication sentences and manually created a Treebank of 4,564 medication sentences from discharge summaries. Using the Treebank, we derived a semantic-based PCFG (probabilistic Context Free Grammar) for parsing medication sentences. Our evaluation using a 10-fold cross validation showed that the PCFG parser dramatically improved parsing performance when compared to the CFG parser. PMID:21856440
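An illustrative miniature of a semantic PCFG for medication phrases, using NLTK; the rules and probabilities are invented for this sketch and are far simpler than the grammar derived from the i2b2 Treebank.

```python
from nltk import PCFG
from nltk.parse import ViterbiParser

# Toy grammar: probabilities per left-hand side must sum to 1.
grammar = PCFG.fromstring("""
    S    -> DRUG DOSE [0.7] | DRUG [0.3]
    DRUG -> 'lisinopril' [0.6] | 'aspirin' [0.4]
    DOSE -> NUM UNIT [1.0]
    NUM  -> '10' [0.5] | '81' [0.5]
    UNIT -> 'mg' [1.0]
""")

parser = ViterbiParser(grammar)
for tree in parser.parse("lisinopril 10 mg".split()):
    print(tree)   # the most probable parse, with its probability
```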
An All-Fragments Grammar for Simple and Accurate Parsing
2012-03-21
Tsujii. Probabilistic CFG with latent annotations. In Proceedings of ACL, 2005. Slav Petrov and Dan Klein. Improved Inference for Unlexicalized Parsing. In Proceedings of NAACL-HLT, 2007. Slav Petrov and Dan Klein. Sparse Multi-Scale Grammars for Discriminative Latent Variable Parsing. In Proceedings of EMNLP, 2008. Slav Petrov, Leon Barrett, Romain Thibaux, and Dan Klein. Learning Accurate, Compact, and Interpretable Tree Annotation. In Proceedings
Solving LR Conflicts Through Context Aware Scanning
NASA Astrophysics Data System (ADS)
Leon, C. Rodriguez; Forte, L. Garcia
2011-09-01
This paper presents a new algorithm to compute the exact list of tokens expected by any LR syntax analyzer at any point of the scanning process. The lexer can, at any time, compute the exact list of valid tokens and return only tokens in this set. In the case that more than one matching token is in the valid set, the lexer can resort to a nested LR parser to disambiguate. Allowing nested LR parsing requires some slight modifications when building the LR parsing tables. We also show how LR parsers can parse conflictive and inherently ambiguous languages using a combination of nested parsing and context-aware scanning. These expanded lexical analyzers can be generated from high-level specifications.
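A sketch of the scanning side of the idea in Python: the lexer receives the set of tokens currently valid for the LR state and considers only those candidates. Token names and patterns are invented.

```python
import re

TOKEN_SPECS = [("NUM", r"\d+"), ("ID", r"[a-z]+"), ("KW_IF", r"if")]

def next_token(src, pos, valid):
    """Longest match among only the token kinds the parser can accept."""
    best = None
    for kind, pat in TOKEN_SPECS:
        if kind not in valid:
            continue
        m = re.match(pat, src[pos:])
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (kind, m.group())
    return best

# "if" is lexed as a keyword only where the parser expects one:
print(next_token("if x", 0, {"KW_IF", "NUM"}))   # ('KW_IF', 'if')
print(next_token("if x", 0, {"ID", "NUM"}))      # ('ID', 'if')
```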
An Experiment in Scientific Code Semantic Analysis
NASA Technical Reports Server (NTRS)
Stewart, Mark E. M.
1998-01-01
This paper concerns a procedure that analyzes aspects of the meaning or semantics of scientific and engineering code. This procedure involves taking a user's existing code, adding semantic declarations for some primitive variables, and parsing this annotated code using multiple, distributed expert parsers. These semantic parsers are designed to recognize formulae in different disciplines including physical and mathematical formulae and geometrical position in a numerical scheme. The parsers will automatically recognize and document some static, semantic concepts and locate some program semantic errors. Results are shown for a subroutine test case and a collection of combustion code routines. This ability to locate some semantic errors and document semantic concepts in scientific and engineering code should reduce the time, risk, and effort of developing and using these codes.
Chen, Hung-Ming; Liou, Yong-Zan
2014-10-01
In a mobile health management system, mobile devices act as the application hosting devices for personal health records (PHRs), and healthcare servers are constructed to exchange and analyze PHRs. One of the most popular PHR standards is the continuity of care record (CCR), which is expressed in XML. However, parsing is an expensive operation that can degrade XML processing performance. Hence, the objective of this study was to identify the operational and performance characteristics of several CCR parsing models: the XML DOM parser, the SAX parser, the PULL parser, and the JSON parser operating on JSON data converted from XML-based CCRs. Thus, developers can make sensible choices about how to parse CCRs for their target PHR applications when using mobile devices or servers with different system resources. Furthermore, simulation experiments comprising four case studies are conducted to compare parsing performance on Android mobile devices and on a server with large quantities of CCR data.
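A compact Python illustration of the DOM-versus-SAX trade-off the study measures: DOM materializes the whole tree for convenient random access, while SAX streams events in constant memory, which suits resource-limited devices. The CCR fragment is invented.

```python
import xml.dom.minidom as dom
import xml.sax

ccr = "<CCR><Body><Result>5.4</Result></Body></CCR>"

# DOM: whole tree in memory, convenient random access.
value = dom.parseString(ccr).getElementsByTagName("Result")[0].firstChild.data

# SAX: event callbacks, constant memory.
class ResultHandler(xml.sax.ContentHandler):
    def __init__(self):
        self.in_result, self.value = False, ""
    def startElement(self, name, attrs):
        self.in_result = (name == "Result")
    def characters(self, content):
        if self.in_result:
            self.value += content
    def endElement(self, name):
        self.in_result = False

h = ResultHandler()
xml.sax.parseString(ccr.encode(), h)
print(value, h.value)   # both print 5.4
```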
Parsing Flowcharts and Series-Parallel Graphs
1978-11-01
descriptions of the graph. This possible multiplicity is undesirable in most practical applications, a fact that makes particularly useful reduction... to parse TT networks, some of the features that make this parsing method useful in other cases are more naturally introduced in the context of this... as Figure 4.5 shows. This multiplicity is due to the associativity of consecutive Two Terminal Series and Two Terminal Parallel compositions. In spite
Context-free parsing with connectionist networks
NASA Astrophysics Data System (ADS)
Fanty, M. A.
1986-08-01
This paper presents a simple algorithm which converts any context-free grammar into a connectionist network which parses strings (of arbitrary but fixed maximum length) in the language defined by that grammar. The network is fast, O(n), and deterministic. It consists of binary units which compute a simple function of their input. When the grammar is put in Chomsky normal form, O(n^3) units are needed to parse inputs of length up to n.
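For reference, the sequential CYK table that such a network computes in parallel, written as a short Python recognizer for a Chomsky-normal-form grammar; the toy grammar is invented.

```python
from itertools import product

RULES = {("NP", "VP"): "S", ("Det", "N"): "NP"}   # A -> B C
LEX = {"the": "Det", "cat": "N", "sleeps": "VP"}  # A -> 'terminal'

def cyk(words):
    """Classic O(n^3) CYK recognition over spans [i, k)."""
    n = len(words)
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][i + 1].add(LEX[w])
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            k = i + span
            for j in range(i + 1, k):
                for b, c in product(table[i][j], table[j][k]):
                    if (b, c) in RULES:
                        table[i][k].add(RULES[(b, c)])
    return "S" in table[0][n]

print(cyk("the cat sleeps".split()))   # True
```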
Using Parse's humanbecoming theory in Japan.
Tanaka, Junko; Katsuno, Towako; Takahashi, Teruko
2012-01-01
In this paper the authors discuss the use of Parse's humanbecoming theory in Japan. Elements of the theory are used in the nursing approach to an 88-year-old Japanese man who had complications following surgery. Process recordings of the dialogues between the patient, the patient's wife, and the nurse were made and considered in light of the three methodologies of Parse's theory: illuminating meaning, synchronizing rhythms, and mobilizing transcendence. The theory is seen as useful in Japan.
Modeling Syntax for Parsing and Translation
2003-12-15
Figure 2.1: Part of a dictionary. ...along with their training algorithms: a monolingual generative model of sentence structure, and a model of the relationship between the structure of a... tasks of monolingual parsing and word-level bilingual corpus alignment, they are demonstrated in two additional applications. First, a new statistical
Tapp, Diane; Lavoie, Mireille
2017-04-01
Discussions about the real knowledge contained in grand theories and models remain an active pursuit in the academic sphere. Among the most fervent of their defenders is Rosemarie Parse with her Humanbecoming School of Thought (1981, 1998). This article first highlights the similarities between Parse's theory and Blumer's symbolic interactionism (1969). This comparison acts as a counterargument to Parse's assertion that her theory is original 'nursing' material. Drawing on contemporary philosophy of science, the very possibility of discovering specifically nursing knowledge is questioned. Second, Parse's scientific assumptions are thoroughly addressed and contrasted with Blumer's more moderate view of knowledge. This leads to the recognition that valuing the social nature of existence and reality does not necessarily entail requirements and methods such as those proposed by Parse; from Blumer's point of view, her perspective may not even be desirable. Recommendations are raised about the necessity of a distanced relationship to knowledge, this being the key to the pursuit of its improvement, not its circular contemplation. © 2016 John Wiley & Sons Ltd.
Harris, Greg M.; Shazly, Tarek; Jabbarzadeh, Ehsan
2013-01-01
Significant effort has gone towards parsing out the effects of surrounding microenvironment on macroscopic behavior of stem cells. Many of the microenvironmental cues, however, are intertwined, and thus, further studies are warranted to identify the intricate interplay among the conflicting downstream signaling pathways that ultimately guide a cell response. In this contribution, by patterning adhesive PEG (polyethylene glycol) hydrogels using Dip Pen Nanolithography (DPN), we demonstrate that substrate elasticity, subcellular elasticity, ligand density, and topography ultimately define mesenchymal stem cells (MSCs) spreading and shape. Physical characteristics are parsed individually with 7 kilopascal (kPa) hydrogel islands leading to smaller, spindle shaped cells and 105 kPa hydrogel islands leading to larger, polygonal cell shapes. In a parallel effort, a finite element model was constructed to characterize and confirm experimental findings and aid as a predictive tool in modeling cell microenvironments. Signaling pathway inhibition studies suggested that RhoA is a key regulator of cell response to the cooperative effect of the tunable substrate variables. These results are significant for the engineering of cell-extra cellular matrix interfaces and ultimately decoupling matrix bound cues presented to cells in a tissue microenvironment for regenerative medicine. PMID:24282570
RadSearch: a RIS/PACS integrated query tool
NASA Astrophysics Data System (ADS)
Tsao, Sinchai; Documet, Jorge; Moin, Paymann; Wang, Kevin; Liu, Brent J.
2008-03-01
Radiology Information Systems (RIS) contain a wealth of information that can be used for research, education, and practice management. However, the sheer amount of information available makes querying specific data difficult and time consuming. Previous work has shown that a clinical RIS database and its RIS text reports can be extracted, duplicated and indexed for searches while complying with HIPAA and IRB requirements. This project's intent is to provide a software tool, the RadSearch Toolkit, to allow intelligent indexing and parsing of RIS reports for easy yet powerful searches. In addition, the project aims to seamlessly query and retrieve associated images from the Picture Archiving and Communication System (PACS) in situations where an integrated RIS/PACS is in place - even subselecting individual series, such as in an MRI study. RadSearch's application of simple text parsing techniques to index text-based radiology reports will allow the search engine to quickly return relevant results. This powerful combination will be useful in both private practice and academic settings; administrators can easily obtain complex practice management information such as referral patterns; researchers can conduct retrospective studies with specific, multiple criteria; teaching institutions can quickly and effectively create thorough teaching files.
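A sketch of the simple text-indexing idea in Python: an inverted index over report tokens, so that multi-term queries reduce to fast set intersections. Report texts and IDs are invented.

```python
from collections import defaultdict
import re

index = defaultdict(set)   # token -> set of report IDs

def add_report(report_id, text):
    for tok in re.findall(r"[a-z]+", text.lower()):
        index[tok].add(report_id)

def search(*terms):
    sets = [index[t.lower()] for t in terms]
    return set.intersection(*sets) if sets else set()

add_report("R1", "MRI brain: small acute infarct.")
add_report("R2", "CT chest without contrast; no acute findings.")
print(search("acute", "infarct"))   # {'R1'}
```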
Integrating high dimensional bi-directional parsing models for gene mention tagging.
Hsu, Chun-Nan; Chang, Yu-Ming; Kuo, Cheng-Ju; Lin, Yu-Shi; Huang, Han-Shen; Chung, I-Fang
2008-07-01
Tagging gene and gene product mentions in scientific text is an important initial step of literature mining. In this article, we describe in detail our gene mention tagger that participated in the BioCreative 2 challenge and analyze what contributes to its good performance. Our tagger is based on the conditional random fields (CRF) model, the most prevalent method for the gene mention tagging task in BioCreative 2. Our tagger is interesting because it accomplished the highest F-scores among CRF-based methods and second overall. Moreover, we obtained our results mostly by applying open-source packages, making it easy to duplicate our results. We first describe in detail how we developed our CRF-based tagger. We designed a very high dimensional feature set that includes most of the information that may be relevant. We trained bi-directional CRF models with the same set of features, one applying forward parsing and the other backward, and integrated the two models based on the output scores and dictionary filtering. One of the most prominent factors contributing to the good performance of our tagger is the integration of the additional backward parsing model. However, from the definition of CRF, it appears that a CRF model is symmetric and that bi-directional parsing models should produce the same results. We show that, due to different feature settings, a CRF model can be asymmetric, and that the feature setting for our tagger in BioCreative 2 not only produces different results but also gives backward parsing models a slight but consistent advantage over the forward parsing model. To fully explore the potential of integrating bi-directional parsing models, we applied different asymmetric feature settings to generate many bi-directional parsing models and integrated them based on the output scores. Experimental results show that this integrated model can achieve an even higher F-score based solely on the training corpus for gene mention tagging. Data sets, programs, and an on-line service of our gene mention tagger can be accessed at http://aiia.iis.sinica.edu.tw/biocreative2.htm.
Object-oriented parsing of biological databases with Python.
Ramu, C; Gemünd, C; Gibson, T J
2000-07-01
While database activities in the biological area are increasing rapidly, rather little is done in the area of parsing them in a simple and object-oriented way. We present here an elegant, simple yet powerful way of parsing biological flat-file databases. We have taken EMBL, SWISS-PROT and GENBANK as examples. EMBL and SWISS-PROT do not differ much in format structure; GENBANK has a very different format structure from EMBL and SWISS-PROT. Extracting the desired fields in an entry (for example a sub-sequence with an associated feature) for later analysis is a constant need in the biological sequence-analysis community: this is illustrated with tools to make new splice-site databases. The interface to the parser is abstract in the sense that access to all the databases is independent of their different formats, since the parsing instructions are hidden.
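A stripped-down Python illustration of line-code-driven flat-file parsing in the EMBL/SWISS-PROT style; real entries carry many more line types, and the entry here is invented.

```python
entry_text = """ID   TEST_ENTRY
DE   Example protein.
SQ   SEQUENCE
     MKTAYIAKQR
//"""

def parse_flatfile(text):
    """Yield one dict per entry, dispatching on the two-letter line code."""
    entry = {"seq": ""}
    for line in text.splitlines():
        code, _, rest = line.partition("   ")
        if code == "ID":
            entry["id"] = rest.split()[0]
        elif code == "DE":
            entry["desc"] = rest
        elif code == "":                 # continuation lines hold sequence
            entry["seq"] += rest.strip().replace(" ", "")
        elif line.startswith("//"):      # entry terminator
            yield entry
            entry = {"seq": ""}

print(next(parse_flatfile(entry_text)))
```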
Comparison of Classical and Lazy Approach in SCG Compiler
NASA Astrophysics Data System (ADS)
Jirák, Ota; Kolář, Dušan
2011-09-01
Existing parsing methods for scattered context grammars usually expand nonterminals deep in the pushdown. This expansion is implemented using either a linked list or some kind of auxiliary pushdown. This paper describes a parsing algorithm for an LL(1) scattered context grammar. The algorithm merges two principles. The first is the table-driven parsing method commonly used for parsing context-free grammars. The second is delayed execution as used in functional programming. The main part of this paper is a proof of equivalence between the common principle (the whole rule is applied at once) and our approach (execution of the rules is delayed), which therefore works with the pushdown top only. In most cases, the second approach is faster than the first. Finally, future work is discussed.
Automatic Parsing of Parental Verbal Input
Sagae, Kenji; MacWhinney, Brian; Lavie, Alon
2006-01-01
To evaluate theoretical proposals regarding the course of child language acquisition, researchers often need to rely on the processing of large numbers of syntactically parsed utterances, both from children and their parents. Because it is so difficult to do this by hand, there are currently no parsed corpora of child language input data. To automate this process, we developed a system that combined the MOR tagger, a rule-based parser, and statistical disambiguation techniques. The resultant system obtained nearly 80% correct parses for the sentences spoken to children. To achieve this level, we had to construct a particular processing sequence that minimizes problems caused by the coverage/ambiguity trade-off in parser design. These procedures are particularly appropriate for use with the CHILDES database, an international corpus of transcripts. The data and programs are now freely available over the Internet. PMID:15190707
GBParsy: a GenBank flatfile parser library with high speed.
Lee, Tae-Ho; Kim, Yeon-Ki; Nahm, Baek Hie
2008-07-25
GenBank flatfile (GBF) format is one of the most popular sequence file formats because of its detailed sequence features and ease of readability. To use the data in the file by a computer, a parsing process is required, performed according to a given grammar for the sequence and the description in a GBF. Currently, several parser libraries for the GBF have been developed. However, with the accumulation of DNA sequence information from eukaryotic chromosomes, parsing a eukaryotic genome sequence with these libraries inevitably takes a long time, due to the large GBF file and its correspondingly large genomic nucleotide sequence and related feature information. Thus, there is significant need for a parsing program with high speed and efficient use of system memory. We developed a library, GBParsy, which is written in C and parses GBF files. Parsing speed was maximized by using content-specific functions in place of regular expressions, which are flexible but slow. In addition, we optimized an algorithm related to memory usage so that it also increased parsing performance and memory efficiency. GBParsy is at least 5-100x faster than current parsers in benchmark tests and is estimated to extract annotated information from almost 100 Mb of a GenBank flatfile of chromosomal sequence information within a second. Thus, it should be useful for a variety of applications such as on-the-fly visualization of a genome at a web site.
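For comparison, the same kind of extraction via Biopython's GenBank parser, a widely used Python alternative to a dedicated C parser such as GBParsy; the file name is hypothetical.

```python
from Bio import SeqIO

# Usage sketch; "chromosome.gb" is a hypothetical input file.
for record in SeqIO.parse("chromosome.gb", "genbank"):
    print(record.id, len(record.seq))
    for feature in record.features:
        if feature.type == "gene":
            # qualifiers is a dict of lists, e.g. {"gene": ["abcA"]}
            print(feature.qualifiers.get("gene"), feature.location)
```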
FRED: a program development tool
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shilling, J.
1985-09-01
The structured, screen-based editor FRED is introduced. FRED provides incremental parsing and semantic analysis. The parsing is based on an LL(1) top-down algorithm which has been modified to provide follow-the-cursor parsing and soft templates. The languages accepted by the editor are LL(1) languages with the addition of the Unknown and preferred production non-terminal classes. The semantic analysis is based on the incremental update of attribute grammar equations. We briefly describe the interface between FRED and an automated reference librarian system that is under development.
Recognition of Equations Using a Two-Dimensional Stochastic Context-Free Grammar
NASA Astrophysics Data System (ADS)
Chou, Philip A.
1989-11-01
We propose using two-dimensional stochastic context-free grammars for image recognition, in a manner analogous to using hidden Markov models for speech recognition. The value of the approach is demonstrated in a system that recognizes printed, noisy equations. The system uses a two-dimensional probabilistic version of the Cocke-Younger-Kasami parsing algorithm to find the most likely parse of the observed image, and then traverses the corresponding parse tree in accordance with translation formats associated with each production rule, to produce eqn | troff commands for the imaged equation. In addition, it uses two-dimensional versions of the Inside/Outside and Baum re-estimation algorithms for learning the parameters of the grammar from a training set of examples. Parsing the image of a simple noisy equation currently takes about one second of CPU time on an Alliant FX/80.
Prosody and parsing in coordination structures.
Schepman, A; Rodway, P
2000-05-01
The effect of prosodic boundary cues on the off-line disambiguation and on-line parsing of coordination structures was examined. It was found that relative clauses were attached to coordinated object noun phrases in preference to second conjuncts in sentences like: The lawyer greeted the powerful barrister and the wise judge who was/were walking to the courtroom. Naive speakers signalled the syntactic contrast between the two structures by a prosodic break between the conjuncts when the relative clause was attached to the second conjunct. Listeners were able to use this prosodic information in both off-line syntactic disambiguation and on-line syntactic parsing. The findings are compatible with a model in which prosody has a strong immediate effect on parsing. It is argued that the current experimental design has avoided confounds present in earlier studies on the on-line integration of prosodic and syntactic information.
HDM/PASCAL Verification System User's Manual
NASA Technical Reports Server (NTRS)
Hare, D.
1983-01-01
The HDM/Pascal verification system is a tool for proving the correctness of programs written in PASCAL and specified in the Hierarchical Development Methodology (HDM). This document assumes an understanding of PASCAL, HDM, program verification, and the STP system. The steps toward verification which this tool provides are parsing programs and specifications, checking the static semantics, and generating verification conditions. Some support functions are provided such as maintaining a data base, status management, and editing. The system runs under the TOPS-20 and TENEX operating systems and is written in INTERLISP. However, no knowledge is assumed of these operating systems or of INTERLISP. The system requires three executable files, HDMVCG, PARSE, and STP. Optionally, the editor EMACS should be on the system in order for the editor to work. The file HDMVCG is invoked to run the system. The files PARSE and STP are used as lower forks to perform the functions of parsing and proving.
A structural SVM approach for reference parsing.
Zhang, Xiaoli; Zou, Jie; Le, Daniel X; Thoma, George R
2011-06-09
Automated extraction of bibliographic data, such as article titles, author names, abstracts, and references, is essential to the affordable creation of large citation databases. References, typically appearing at the end of journal articles, can also provide valuable information for extracting other bibliographic data. Therefore, parsing individual references to extract author, title, journal, year, etc. is sometimes a necessary preprocessing step in building citation-indexing systems. The regular structure of references enables us to treat reference parsing as a sequence learning problem and to study the structural Support Vector Machine (structural SVM), a newly developed structured learning algorithm, on parsing references. In this study, we implemented structural SVM and used two types of contextual features to compare structural SVM with conventional SVM. Both methods achieve above 98% token classification accuracy and above 95% overall chunk-level accuracy for reference parsing. We also compared SVM and structural SVM to the Conditional Random Field (CRF). The experimental results show that structural SVM and CRF achieve similar accuracies at the token and chunk levels. When only basic observation features are used for each token, structural SVM achieves higher performance than SVM, since it utilizes the contextual label features. However, when the contextual observation features from neighboring tokens are combined, SVM performance improves greatly and is close to that of structural SVM after adding the second-order contextual observation features. The comparison of these two methods with CRF using the same set of binary features shows that both structural SVM and CRF perform better than SVM, indicating their stronger sequence learning ability in reference parsing.
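A Python sketch of the two feature families the comparison rests on: basic observation features computed from one token, and contextual observation features copied from neighboring tokens. The feature set and example reference string are invented.

```python
import re

def basic_features(tok):
    """Observation features for a single reference token."""
    return {
        "lower": tok.lower(),
        "is_year": bool(re.fullmatch(r"(19|20)\d{2}[.,;]?", tok)),
        "is_cap": tok[:1].isupper(),
        "has_digit": any(c.isdigit() for c in tok),
    }

def features(tokens, i, window=1):
    """Basic features plus contextual observations from neighbors."""
    feats = {f"0:{k}": v for k, v in basic_features(tokens[i]).items()}
    for d in range(1, window + 1):
        for off in (-d, d):
            if 0 <= i + off < len(tokens):
                for k, v in basic_features(tokens[i + off]).items():
                    feats[f"{off}:{k}"] = v
    return feats

toks = "Zhang X. A structural SVM approach. 2011.".split()
print(features(toks, 5))   # features for "approach." with its neighbors
```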
Warren, Paul A; Rushton, Simon K
2009-05-01
We have recently suggested that the brain uses its sensitivity to optic flow in order to parse retinal motion into components arising due to self and object movement (e.g. Rushton, S. K., & Warren, P. A. (2005). Moving observers, 3D relative motion and the detection of object movement. Current Biology, 15, R542-R543). Here, we explore whether stereo disparity is necessary for flow parsing or whether other sources of depth information, which could theoretically constrain flow-field interpretation, are sufficient. Stationary observers viewed large field of view stimuli containing textured cubes, moving in a manner that was consistent with a complex observer movement through a stationary scene. Observers made speeded responses to report the perceived direction of movement of a probe object presented at different depths in the scene. Across conditions we varied the presence or absence of different binocular and monocular cues to depth order. In line with previous studies, results consistent with flow parsing (in terms of both perceived direction and response time) were found in the condition in which motion parallax and stereoscopic disparity were present. Observers were poorer at judging object movement when depth order was specified by parallax alone. However, as more monocular depth cues were added to the stimulus the results approached those found when the scene contained stereoscopic cues. We conclude that both monocular and binocular static depth information contribute to flow parsing. These findings are discussed in the context of potential architectures for a model of the flow parsing mechanism.
A Semantic Constraint on Syntactic Parsing.
ERIC Educational Resources Information Center
Crain, Stephen; Coker, Pamela L.
This research examines how semantic information influences syntactic parsing decisions during sentence processing. In the first experiment, subjects were presented lexical strings having syntactically identical surface structures but with two possible underlying structures: "The children taught by the Berlitz method," and "The…
Hardware independence checkout software
NASA Technical Reports Server (NTRS)
Cameron, Barry W.; Helbig, H. R.
1990-01-01
ACSI has developed a program utilizing CLIPS to assess compliance with various programming standards. Essentially, the program parses C code to extract the names of all function calls. These are asserted as CLIPS facts, which also include information about line numbers, source file names, and called functions. Rules have been devised to identify called functions that are not defined anywhere in the parsed source. These are compared against lists of standards (represented as facts) using rules that check intersections and/or unions of these lists. By piping the output into other processes, the source is appropriately commented through the generation and execution of parsed scripts.
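A rough Python illustration of the extraction step only (the described tool itself works against CLIPS): pull candidate function-call names out of C source with a regular expression. Definitions such as main also match in this crude sketch; the real tool carries file and line information along with each fact.

```python
import re

C_SRC = """
int main(void) {
    init_board();
    if (check_status(3) != 0) log_error("bad status");
    return 0;
}
"""

CALL = re.compile(r"\b([A-Za-z_]\w*)\s*\(")   # identifier followed by '('
KEYWORDS = {"if", "while", "for", "switch", "return", "sizeof"}

calls = {m.group(1) for m in CALL.finditer(C_SRC)} - KEYWORDS
print(sorted(calls))   # ['check_status', 'init_board', 'log_error', 'main']
```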
ParseCNV integrative copy number variation association software with quality tracking
Glessner, Joseph T.; Li, Jin; Hakonarson, Hakon
2013-01-01
A number of copy number variation (CNV) calling algorithms exist; however, comprehensive software tools for CNV association studies are lacking. We describe ParseCNV, unique software that takes CNV calls and creates probe-based statistics for CNV occurrence in both case–control design and in family based studies addressing both de novo and inheritance events, which are then summarized based on CNV regions (CNVRs). CNVRs are defined in a dynamic manner to allow for a complex CNV overlap while maintaining precise association region. Using this approach, we avoid failure to converge and non-monotonic curve fitting weaknesses of programs, such as CNVtools and CNVassoc, and although Plink is easy to use, it only provides combined CNV state probe-based statistics, not state-specific CNVRs. Existing CNV association methods do not provide any quality tracking information to filter confident associations, a key issue which is fully addressed by ParseCNV. In addition, uncertainty in CNV calls underlying CNV associations is evaluated to verify significant results, including CNV overlap profiles, genomic context, number of probes supporting the CNV and single-probe intensities. When optimal quality control parameters are followed using ParseCNV, 90% of CNVs validate by polymerase chain reaction, an often problematic stage because of inadequate significant association review. ParseCNV is freely available at http://parsecnv.sourceforge.net. PMID:23293001
Fan, Jung-Wei; Friedman, Carol
2011-01-01
Biomedical natural language processing (BioNLP) is a useful technique that unlocks valuable information stored in textual data for practice and/or research. Syntactic parsing is a critical component of BioNLP applications that rely on correctly determining the sentence and phrase structure of free text. In addition to dealing with the vast amount of domain-specific terms, a robust biomedical parser needs to model the semantic grammar to obtain viable syntactic structures. With either a rule-based or corpus-based approach, the grammar engineering process requires substantial time and knowledge from experts, and does not always yield a semantically transferable grammar. To reduce the human effort and to promote semantic transferability, we propose an automated method for deriving a probabilistic grammar based on a training corpus consisting of concept strings and semantic classes from the Unified Medical Language System (UMLS), a comprehensive terminology resource widely used by the community. The grammar is designed to specify noun phrases only due to the nominal nature of the majority of biomedical terminological concepts. Evaluated on manually parsed clinical notes, the derived grammar achieved a recall of 0.644, precision of 0.737, and average cross-bracketing of 0.61, which demonstrated better performance than a control grammar with the semantic information removed. Error analysis revealed shortcomings that could be addressed to improve performance. The results indicated the feasibility of an approach which automatically incorporates terminology semantics in the building of an operational grammar. Although the current performance of the unsupervised solution does not adequately replace manual engineering, we believe once the performance issues are addressed, it could serve as an aide in a semi-supervised solution. PMID:21549857
Yes! An object-oriented compiler compiler (YOOCC)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Avotins, J.; Mingins, C.; Schmidt, H.
1995-12-31
Grammar-based processor generation is one of the most widely studied areas in language processor construction. However, there have been very few approaches to date that reconcile object-oriented principles, processor generation, and an object-oriented language. Pertinent here also is that developing a processor using the Eiffel Parse libraries currently requires far too much time to be expended on tasks that can be automated. For these reasons, we have developed YOOCC (Yes! an Object-Oriented Compiler Compiler), which produces a processor framework from a grammar using an enhanced version of the Eiffel Parse libraries, incorporating the ideas hypothesized by Meyer, and Grape and Walden, as well as many others. Various essential changes have been made to the Eiffel Parse libraries. Examples are presented to illustrate the development of a processor using YOOCC, and it is concluded that the Eiffel Parse libraries are now not only an intelligent, but also a productive option for processor construction.
Marginal Space Deep Learning: Efficient Architecture for Volumetric Image Parsing.
Ghesu, Florin C; Krubasik, Edward; Georgescu, Bogdan; Singh, Vivek; Yefeng Zheng; Hornegger, Joachim; Comaniciu, Dorin
2016-05-01
Robust and fast solutions for anatomical object detection and segmentation support the entire clinical workflow from diagnosis, patient stratification, therapy planning, and intervention to follow-up. Current state-of-the-art techniques for parsing volumetric medical image data are typically based on machine learning methods that exploit large annotated image databases. Two main challenges need to be addressed: efficiency in scanning high-dimensional parametric spaces, and the need for representative image features, which otherwise require significant manual engineering. We propose a pipeline for object detection and segmentation in the context of volumetric image parsing, solving a two-step learning problem: anatomical pose estimation and boundary delineation. For this task we introduce Marginal Space Deep Learning (MSDL), a novel framework exploiting both the strengths of efficient object parametrization in hierarchical marginal spaces and the automated feature design of Deep Learning (DL) network architectures. In the 3D context, the application of deep learning systems is limited by the very high complexity of the parametrization. More specifically, nine parameters are necessary to describe a restricted affine transformation in 3D, resulting in a prohibitive number of billions of scanning hypotheses. The mechanism of marginal space learning provides excellent run-time performance by learning classifiers in clustered, high-probability regions of spaces of gradually increasing dimensionality. To further increase computational efficiency and robustness, our system learns sparse adaptive data sampling patterns that automatically capture the structure of the input. Given the object localization, we propose a DL-based active shape model to estimate the non-rigid object boundary. Experimental results are presented on the aortic valve in ultrasound using an extensive dataset of 2891 volumes from 869 patients, showing significant improvements of up to 45.2% over the state of the art. To our knowledge, this is the first successful demonstration of the potential of DL for detection and segmentation in full 3D data with parametrized representations.
Pen-based Interfaces for Engineering and Education
NASA Astrophysics Data System (ADS)
Stahovich, Thomas F.
Sketches are an important problem-solving tool in many fields. This is particularly true of engineering design, where sketches facilitate creativity by providing an efficient medium for expressing ideas. However, despite the importance of sketches in engineering practice, current engineering software still relies on traditional mouse and keyboard interfaces, with little or no capabilities to handle free-form sketch input. With recent advances in machine-interpretation techniques, it is now becoming possible to create practical interpretation-based interfaces for such software. In this chapter, we report on our efforts to create interpretation techniques to enable pen-based engineering applications. We describe work on two fundamental sketch understanding problems. The first is sketch parsing, the task of clustering pen strokes or geometric primitives into individual symbols. The second is symbol recognition, the task of classifying symbols once they have been located by a parser. We have used the techniques that we have developed to construct several pen-based engineering analysis tools. These are used here as examples to illustrate our methods. We have also begun to use our techniques to create pen-based tutoring systems that scaffold students in solving problems in the same way they would ordinarily solve them with paper and pencil. The chapter concludes with a brief discussion of these systems.
Speed up of XML parsers with PHP language implementation
NASA Astrophysics Data System (ADS)
Georgiev, Bozhidar; Georgieva, Adriana
2012-11-01
In this paper, the authors introduce PHP5's XML implementation and show how to read, parse, and write a short, uncomplicated XML file using SimpleXML in a PHP environment. The possibilities for combining the PHP5 language with the XML standard are described, and the details of the parsing process with SimpleXML are explained. A practical PHP-XML-MySQL project demonstrates the advantages of implementing XML in PHP modules. This approach allows a comparatively simple search of hierarchical XML data by means of PHP software tools. The proposed project includes a database, which can be extended with new data and new XML parsing functions.
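For illustration only: the paper's examples use PHP's SimpleXML, but the same read-search-extend cycle can be sketched with Python's standard ElementTree (the 'library' document is invented):

import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<library><book id='1'><title>XML Basics</title></book></library>")

for book in doc.findall("book"):               # simple hierarchical search
    print(book.get("id"), book.findtext("title"))

new = ET.SubElement(doc, "book", id="2")       # extend with new data
ET.SubElement(new, "title").text = "PHP and XML"
ET.ElementTree(doc).write("library.xml", encoding="utf-8")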
Mention Detection: Heuristics for the OntoNotes Annotations
2011-01-01
Kummerfeld, Jonathan K.; Bansal, Mohit; Burkett, David; Klein, Dan
…considered the provided parses and parses produced by the Berkeley parser (Petrov et al., 2006) trained on the provided training data. We added a…
Simultaneous Translation: Idiom Interpretation and Parsing Heuristics.
ERIC Educational Resources Information Center
McDonald, Janet L.; Carpenter, Patricia A.
1981-01-01
Presents a model of interpretation, parsing, and error recovery in simultaneous translation, based on two expert and two amateur German-English bilingual translators orally translating from English to German. Argues that the translator first comprehends the text in English and divides it into meaningful units before translating. Study also…
Progress in The Semantic Analysis of Scientific Code
NASA Technical Reports Server (NTRS)
Stewart, Mark
2000-01-01
This paper concerns a procedure that analyzes aspects of the meaning or semantics of scientific and engineering code. This procedure involves taking a user's existing code, adding semantic declarations for some primitive variables, and parsing this annotated code using multiple, independent expert parsers. These semantic parsers encode domain knowledge and recognize formulae in different disciplines including physics, numerical methods, mathematics, and geometry. The parsers will automatically recognize and document some static, semantic concepts and help locate some program semantic errors. These techniques may apply to a wider range of scientific codes. If so, the techniques could reduce the time, risk, and effort required to develop and modify scientific codes.
Memory mechanisms supporting syntactic comprehension.
Caplan, David; Waters, Gloria
2013-04-01
Efforts to characterize the memory system that supports sentence comprehension have historically drawn extensively on short-term memory as a source of mechanisms that might apply to sentences. The focus of these efforts has changed significantly in the past decade. As a result of changes in models of short-term working memory (ST-WM) and developments in models of sentence comprehension, the effort to relate entire components of an ST-WM system, such as those in the model developed by Baddeley (Nature Reviews Neuroscience 4: 829-839, 2003), to sentence comprehension has largely been replaced by an effort to relate more specific mechanisms found in modern models of ST-WM to memory processes that support one aspect of sentence comprehension--the assignment of syntactic structure (parsing) and its use in determining sentence meaning (interpretation) during sentence comprehension. In this article, we present the historical background to recent studies of the memory mechanisms that support parsing and interpretation and review recent research into this relation. We argue that the results of this research do not converge on a set of mechanisms derived from ST-WM that apply to parsing and interpretation. We argue that the memory mechanisms supporting parsing and interpretation have features that characterize another memory system that has been postulated to account for skilled performance--long-term working memory. We propose a model of the relation of different aspects of parsing and interpretation to ST-WM and long-term working memory.
Parsing GML data based on integrative GML syntactic and semantic schemas database
NASA Astrophysics Data System (ADS)
Miao, Lizhi; Zhang, Shuliang; Lu, Guonian; Gao, Xiaoli; Jiao, Donglai; Gan, Jiayan
2007-06-01
This paper proposes a new method to parse the various application schemas of Geography Markup Language (GML), capturing the syntax and semantics of their elements and types so that diverse users can interpret the same GML instance data uniformly. The proposed method generates an Integrative GML Syntactic and Semantic Schemas Database (IGSSSDB) from the GML 3.1 core schemas and the corresponding application schema. GML data are then parsed against the IGSSSDB, which holds the syntactic and semantic information, nesting information, and mapping rules of the GML core and application schemas. Three kinds of relational tables are designed to store schema information when constructing the IGSSSDB: info tables for the schemas included and the namespaces imported by application schemas, tables for information related to the schemas themselves, and catalog tables for the core schemas. Within these tables, we propose using homologous regular expressions to describe the models of elements and complex types in the schemas, which keeps the models complete and readable. Based on the IGSSSDB, we design and develop APIs that implement GML data parsing and can process the syntactic and semantic information of GML data from diverse fields and users. The latter part of the paper presents a test study showing that the proposed method is feasible and appropriate for parsing GML data; it also lays a good foundation for future GML data studies such as storage, indexing, and querying.
Effects of Tasks on BOLD Signal Responses to Sentence Contrasts: Review and Commentary
ERIC Educational Resources Information Center
Caplan, David; Gow, David
2012-01-01
Functional neuroimaging studies of syntactic processing have been interpreted as identifying the neural locations of parsing and interpretive operations. However, current behavioral studies of sentence processing indicate that many operations occur simultaneously with parsing and interpretation. In this review, we point to issues that arise in…
Applications of Parsing Theory to Computer-Assisted Instruction.
ERIC Educational Resources Information Center
Markosian, Lawrence Z.; Ager, Tryg A.
1983-01-01
Applications of an LR-1 parsing algorithm to intelligent programs for computer assisted instruction in symbolic logic and foreign languages are discussed. The system has been adequately used for diverse instructional applications, including analysis of student input, generation of pattern drills, and modeling the student's understanding of the…
Parsing the Practice of Teaching
ERIC Educational Resources Information Center
Kennedy, Mary
2016-01-01
Teacher education programs typically teach novices about one part of teaching at a time. We might offer courses on different topics--cultural foundations, learning theory, or classroom management--or we may parse teaching practice itself into a set of discrete techniques, such as core teaching practices, that can be taught individually. Missing…
Learning for Semantic Parsing Using Statistical Syntactic Parsing Techniques
2010-05-01
Recursive Optimization of Digital Circuits
1990-12-14
[Front-matter excerpt: table of contents for the BORIS Recursive Optimization System software appendices, listing the DESIGN.S, PARSE.S, TABULAR.S, MDS.S, and COST.S source files.]
Time-Driven Effects on Parsing during Reading
ERIC Educational Resources Information Center
Roll, Mikael; Lindgren, Magnus; Alter, Kai; Horne, Merle
2012-01-01
The phonological trace of perceived words starts fading away in short-term memory after a few seconds. Spoken utterances are usually 2-3 s long, possibly to allow the listener to parse the words into coherent prosodic phrases while they still have a clear representation. Results from this brain potential study suggest that even during silent…
The Effect of Exposure on Syntactic Parsing in Spanish-English Bilinguals
ERIC Educational Resources Information Center
Dussias, Paola E.; Sagarra, Nuria
2007-01-01
An eye tracking experiment examined how exposure to a second language (L2) influences sentence parsing in the first language. Forty-four monolingual Spanish speakers, 24 proficient Spanish-English bilinguals with limited immersion experience in the L2 environment and 20 proficient Spanish-English bilinguals with extensive L2 immersion experience…
A Semantic Parsing Method for Mapping Clinical Questions to Logical Forms
Roberts, Kirk; Patra, Braja Gopal
2017-01-01
This paper presents a method for converting natural language questions about structured data in the electronic health record (EHR) into logical forms. The logical forms can then subsequently be converted to EHR-dependent structured queries. The natural language processing task, known as semantic parsing, has the potential to convert questions to logical forms with extremely high precision, resulting in a system that is usable and trusted by clinicians for real-time use in clinical settings. We propose a hybrid semantic parsing method, combining rule-based methods with a machine learning-based classifier. The overall semantic parsing precision on a set of 212 questions is 95.6%. The parser’s rules furthermore allow it to “know what it does not know”, enabling the system to indicate when unknown terms prevent it from understanding the question’s full logical structure. When combined with a module for converting a logical form into an EHR-dependent query, this high-precision approach allows for a question answering system to provide a user with a single, verifiably correct answer. PMID:29854217
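A toy sketch of the hybrid idea (the question patterns and the logical-form notation are invented for illustration; this is not the authors' actual rule set): rules map recognized patterns to logical forms, and anything outside them is flagged rather than guessed, so the parser "knows what it does not know".

import re

RULES = [
    (re.compile(r"what is the latest (?P<lab>\w+) (level|value)", re.I),
     lambda m: f"latest(lab_event(type={m.group('lab').lower()}))"),
    (re.compile(r"is the patient on (?P<drug>\w+)", re.I),
     lambda m: f"exists(medication(name={m.group('drug').lower()}))"),
]

def parse_question(question):
    for pattern, build in RULES:
        m = pattern.search(question)
        if m:
            return build(m)
    return None  # unknown structure: defer to the user instead of guessing

print(parse_question("What is the latest creatinine level?"))
print(parse_question("Summarize the family history"))  # -> None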
A python tool for the implementation of domain-specific languages
NASA Astrophysics Data System (ADS)
Dejanović, Igor; Vaderna, Renata; Milosavljević, Gordana; Simić, Miloš; Vuković, Željko
2017-07-01
In this paper we describe textX, a meta-language and tool for building Domain-Specific Languages. It is implemented in Python using the Arpeggio PEG (Parsing Expression Grammar) parser library. From a single language description (grammar), textX builds both a parser and a meta-model (a.k.a. abstract syntax) of the language. The parser is used to parse textual representations of models conforming to the meta-model. As a result of parsing, a Python object graph is automatically created, whose structure conforms to the meta-model defined by the grammar. This approach frees the developer from the need to manually analyse a parse tree and transform it into some other suitable representation. The textX library is independent of any integrated development environment and can easily be integrated into any Python project. The textX tool works as a grammar interpreter: the parser is configured at run-time from the grammar. textX is a free and open-source project available at GitHub.
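A minimal usage sketch (the toy shapes grammar is ours, not from the paper; metamodel_from_str and model_from_str are textX's documented entry points):

from textx import metamodel_from_str

grammar = """
Model: shapes+=Shape;
Shape: 'circle' name=ID 'r' '=' radius=INT;
"""

mm = metamodel_from_str(grammar)          # the grammar is interpreted at run-time
model = mm.model_from_str("circle c1 r = 5 circle c2 r = 7")

for s in model.shapes:                    # the object graph mirrors the meta-model
    print(s.name, s.radius)               # -> c1 5, c2 7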
Experimental Evaluation of Processing Time for the Synchronization of XML-Based Business Objects
NASA Astrophysics Data System (ADS)
Ameling, Michael; Wolf, Bernhard; Springer, Thomas; Schill, Alexander
Business objects (BOs) are data containers for complex data structures used in business applications such as Supply Chain Management and Customer Relationship Management. Because application logic is replicated, multiple copies of BOs are created which have to be synchronized and updated. This is a complex and time-consuming task because BOs vary considerably in their structure with respect to the distribution, number, and size of their elements. Since BOs are internally represented as XML documents, XML parsing is one major cost factor to consider when minimizing processing time during synchronization. Predicting the parsing time of a BO is therefore a significant input for selecting an efficient synchronization mechanism. In this paper, we present a method to evaluate the influence of BO structure on parsing time. The results of our experimental evaluation, incorporating four different XML parsers, examine the dependencies between the distribution of elements and the parsing time. Finally, a general cost model is validated and simplified according to the results of the experimental setup.
Perception of object trajectory: parsing retinal motion into self and object movement components.
Warren, Paul A; Rushton, Simon K
2007-08-16
A moving observer needs to be able to estimate the trajectory of other objects moving in the scene. Without the ability to do so, it would be difficult to avoid obstacles or catch a ball. We hypothesized that neural mechanisms sensitive to the patterns of motion generated on the retina during self-movement (optic flow) play a key role in this process, "parsing" motion due to self-movement from that due to object movement. We investigated this "flow parsing" hypothesis by measuring the perceived trajectory of a moving probe placed within a flow field that was consistent with movement of the observer. In the first experiment, the flow field was consistent with an eye rotation; in the second experiment, it was consistent with a lateral translation of the eyes. We manipulated the distance of the probe in both experiments and assessed the consequences. As predicted by the flow parsing hypothesis, manipulating the distance of the probe had differing effects on the perceived trajectory of the probe in the two experiments. The results were consistent with the scene geometry and the type of simulated self-movement. In a third experiment, we explored the contribution of local and global motion processing to the results of the first two experiments. The data suggest that the parsing process involves global motion processing, not just local motion contrast. The findings of this study support a role for optic flow processing in the perception of object movement during self-movement.
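The flow-parsing hypothesis has a simple computational reading: subtract the flow expected from self-movement from the retinal motion measured at the probe's location, and what remains is the object's scene-relative movement. A toy numerical illustration (values invented):

import numpy as np

self_flow_at_probe = np.array([1.5, 0.0])  # deg/s flow expected from self-movement
retinal_motion = np.array([1.5, 2.0])      # motion actually measured on the retina

object_motion = retinal_motion - self_flow_at_probe
print(object_motion)                       # -> [0. 2.]: the probe itself moves upward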
compomics-utilities: an open-source Java library for computational proteomics.
Barsnes, Harald; Vaudel, Marc; Colaert, Niklaas; Helsens, Kenny; Sickmann, Albert; Berven, Frode S; Martens, Lennart
2011-03-08
The growing interest in the field of proteomics has increased the demand for software tools and applications that process and analyze the resulting data. And even though the purpose of these tools can vary significantly, they usually share a basic set of features, including the handling of protein and peptide sequences, the visualization of (and interaction with) spectra and chromatograms, and the parsing of results from various proteomics search engines. Developers typically spend considerable time and effort implementing these support structures, which detracts from working on the novel aspects of their tool. In order to simplify the development of proteomics tools, we have implemented an open-source support library for computational proteomics, called compomics-utilities. The library contains a broad set of features required for reading, parsing, and analyzing proteomics data. compomics-utilities is already used by a long list of existing software, ensuring library stability and continued support and development. As a user-friendly, well-documented and open-source library, compomics-utilities greatly simplifies the implementation of the basic features needed in most proteomics tools. Implemented in 100% Java, compomics-utilities is fully portable across platforms and architectures. Our library thus allows the developers to focus on the novel aspects of their tools, rather than on the basic functions, which can contribute substantially to faster development, and better tools for proteomics.
Error-Correcting Parsing for Syntactic Pattern Recognition
1977-08-01
…1971. 55. Siromoney, G., Siromoney, R., and Krithivasan, K., "Abstract Families of Matrices and Picture Languages," Computer Graphics and Image Processing… TIME USED FOR LINKING A TREE: .186 SEC … INPUT CHARACTER IS A; DISTANCE FROM NORMAL A IS 3; TIME USED FOR PARSING: 5.161 SEC
Perceiving Event Dynamics and Parsing Hollywood Films
ERIC Educational Resources Information Center
Cutting, James E.; Brunick, Kaitlin L.; Candan, Ayse
2012-01-01
We selected 24 Hollywood movies released from 1940 through 2010 to serve as a film corpus. Eight viewers, three per film, parsed them into events, which are best termed subscenes. While watching a film a second time, viewers scrolled through frames and recorded the frame number where each event began. Viewers agreed about 90% of the time. We then…
The Storage and Processing of Morphologically Complex Words in L2 Spanish
ERIC Educational Resources Information Center
Foote, Rebecca
2017-01-01
Research with native speakers indicates that, during word recognition, regularly inflected words undergo parsing that segments them into stems and affixes. In contrast, studies with learners suggest that this parsing may not take place in L2. This study's research questions are: Do L2 Spanish learners store and process regularly inflected,…
Parsing Protocols Using Problem Solving Grammars. AI Memo 385.
ERIC Educational Resources Information Center
Miller, Mark L.; Goldstein, Ira P.
A theory of the planning and debugging of computer programs is formalized as a context free grammar, which is used to reveal the constituent structure of problem solving episodes by parsing protocols in which programs are written, tested, and debugged. This is illustrated by the detailed analysis of an actual session with a beginning student…
ERIC Educational Resources Information Center
Tabor, Whitney; And Others
1997-01-01
Proposes a dynamical systems approach to parsing in which syntactic hypotheses are associated with attractors in a metric space. The experiments discussed documented various contingent frequency effects that cut across traditional linguistic grains, each of which was predicted by the dynamical systems model. (47 references) (Author/CK)
Effects of Prosodic and Lexical Constraints on Parsing in Young Children (and Adults)
ERIC Educational Resources Information Center
Snedeker, Jesse; Yuan, Sylvia
2008-01-01
Prior studies of ambiguity resolution in young children have found that children rely heavily on lexical information but persistently fail to use referential constraints in online parsing [Trueswell, J.C., Sekerina, I., Hill, N.M., & Logrip, M.L, (1999). The kindergarten-path effect: Studying on-line sentence processing in young children.…
Semantics Boosts Syntax in Artificial Grammar Learning Tasks with Recursion
ERIC Educational Resources Information Center
Fedor, Anna; Varga, Mate; Szathmary, Eors
2012-01-01
Center-embedded recursion (CER) in natural language is exemplified by sentences such as "The malt that the rat ate lay in the house." Parsing center-embedded structures is in the focus of attention because this could be one of the cognitive capacities that make humans distinct from all other animals. The ability to parse CER is usually…
Writing filter processes for the SAGA editor, appendix G
NASA Technical Reports Server (NTRS)
Kirslis, Peter A.
1985-01-01
The SAGA editor provides a mechanism by which separate processes can be invoked during an editing session to traverse portions of the parse tree being edited. These processes, termed filter processes, read, analyze, and possibly transform the parse tree, returning the result to the editor. By defining new commands that invoke filter processes with the editor's user-defined command facility, authors of filters can provide complex operations as simple commands. A tree plotter, a pretty printer, and a Pascal tree transformation program have already been written using this facility. This appendix introduces filter processes, describes the parse tree structure, and presents the library interface made available to the programmer. It also discusses how to compile and run filter processes. Examples are presented to illustrate aspects of each of these areas.
Is human sentence parsing serial or parallel? Evidence from event-related brain potentials.
Hopf, Jens-Max; Bader, Markus; Meng, Michael; Bayer, Josef
2003-01-01
In this ERP study we investigate the processes that occur in syntactically ambiguous German sentences at the point of disambiguation. Whereas most psycholinguistic theories agree on the view that processing difficulties arise when parsing preferences are disconfirmed (so-called garden-path effects), important differences exist with respect to theoretical assumptions about the parser's recovery from a misparse. A key distinction can be made between parsers that compute all alternative syntactic structures in parallel (parallel parsers) and parsers that compute only a single preferred analysis (serial parsers). To distinguish empirically between parallel and serial parsing models, we compare ERP responses to garden-path sentences with ERP responses to truly ungrammatical sentences. Garden-path sentences contain a temporary and ultimately curable ungrammaticality, whereas truly ungrammatical sentences remain so permanently--a difference which gives rise to different predictions in the two classes of parsing architectures. At the disambiguating word, ERPs in both sentence types show negative shifts of similar onset latency, amplitude, and scalp distribution in an initial time window between 300 and 500 ms. In a following time window (500-700 ms), the negative shift to garden-path sentences disappears at right central parietal sites, while it continues in permanently ungrammatical sentences. These data are taken as evidence for a strictly serial parser. The absence of a difference in the early time window indicates that temporary and permanent ungrammaticalities trigger the same kind of parsing responses. Later differences can be related to successful reanalysis in garden-path but not in ungrammatical sentences. Copyright 2003 Elsevier Science B.V.
Pippi — Painless parsing, post-processing and plotting of posterior and likelihood samples
NASA Astrophysics Data System (ADS)
Scott, Pat
2012-11-01
Interpreting samples from likelihood or posterior probability density functions is rarely as straightforward as it seems it should be. Producing publication-quality graphics of these distributions is often similarly painful. In this short note I describe pippi, a simple, publicly available package for parsing and post-processing such samples, as well as generating high-quality PDF graphics of the results. Pippi is easily and extensively configurable and customisable, both in its options for parsing and post-processing samples, and in the visual aspects of the figures it produces. I illustrate some of these using an existing supersymmetric global fit, performed in the context of a gamma-ray search for dark matter. Pippi can be downloaded and followed at http://github.com/patscott/pippi.
Cameron, Sharon; Chong-White, Nicky; Mealings, Kiri; Beechey, Tim; Dillon, Harvey; Young, Taegan
2018-02-01
Intensity peaks and valleys in the acoustic signal are salient cues to syllable structure, which is accepted to be a crucial early step in phonological processing. As such, the ability to detect low-rate (envelope) modulations in signal amplitude is essential for parsing an incoming speech signal into smaller phonological units. The Parsing Syllable Envelopes (ParSE) test was developed to quantify the ability of children to recognize syllable boundaries using an amplitude-modulation detection paradigm. The envelope of a 750-msec steady-state /a/ vowel is modulated into two or three pseudo-syllables using notches with modulation depths varying between 0% and 100% along an 11-step continuum. In an adaptive three-alternative forced-choice procedure, the participant identified whether one, two, or three pseudo-syllables were heard. This article describes the development of the ParSE stimuli and test protocols, and the collection of normative and test-retest reliability data. Participants were eleven adults (aged 23 yr 10 mo to 50 yr 9 mo, mean 32 yr 10 mo) and 134 typically developing, primary-school children (aged 6 yr 0 mo to 12 yr 4 mo, mean 9 yr 3 mo); there were 73 males and 72 females. Data were collected using a touchscreen computer. Psychometric functions (PFs) were automatically fit to individual data by the ParSE software. Performance was related to the modulation depth at which syllables can be detected with 88% accuracy (referred to as the upper boundary of the uncertainty region [UBUR]). A shallower PF slope reflected a greater level of uncertainty. Age effects were determined based on raw scores, and z scores were calculated to account for the effect of age on performance. Outliers, and individual data for which the confidence interval of the UBUR exceeded a maximum allowable value, were removed. Nonparametric tests were used as the data were skewed toward negative performance. Across participants, the performance criterion (UBUR) was met at a median modulation depth of 42%. The effect of age on the UBUR was significant (p < 0.00001): the UBUR ranged from 50% modulation depth for 6-yr-olds to 25% for adults, and children aged 6-10 had significantly higher uncertainty region boundaries than adults. A skewed distribution toward negative performance occurred (p = 0.00007). There was no significant difference in performance on the ParSE between males and females (p = 0.60). Test-retest z scores were strongly correlated (r = 0.68, p < 0.0000001). The ParSE normative data show that the ability to identify syllable boundaries based on changes in amplitude modulation improves with age, and that some children in the general population perform much worse than their age peers. The test is suitable for use in planned studies in a clinical population. American Academy of Audiology.
TopFed: TCGA tailored federated query processing and linking to LOD.
Saleem, Muhammad; Padmanabhuni, Shanmukha S; Ngomo, Axel-Cyrille Ngonga; Iqbal, Aftab; Almeida, Jonas S; Decker, Stefan; Deus, Helena F
2014-01-01
The Cancer Genome Atlas (TCGA) is a multidisciplinary, multi-institutional effort to catalogue genetic mutations responsible for cancer using genome analysis techniques. One of the aims of this project is to create a comprehensive and open repository of cancer-related molecular analyses, to be exploited by bioinformaticians to advance cancer knowledge. However, devising bioinformatics applications to analyse such a large dataset is still challenging, as it often requires downloading large archives and parsing the relevant text files, which makes it difficult to enable the virtual data integration needed to collect the critical co-variates necessary for analysis. We address these issues by transforming the TCGA data into the Semantic Web standard Resource Description Format (RDF), linking it to relevant datasets in the Linked Open Data (LOD) cloud, and proposing an efficient data distribution strategy to host the resulting 20.4 billion triples via several SPARQL endpoints. With the TCGA data distributed across multiple SPARQL endpoints, we enable biomedical scientists to query and retrieve information from them by proposing a TCGA-tailored federated SPARQL query processing engine named TopFed. We compare TopFed with a well-established federation engine, FedX, in terms of source selection and query execution time, using 10 different federated SPARQL queries with varying requirements. Our evaluation results show that TopFed selects on average less than half of the sources (with 100% recall), with query execution time equal to one third that of FedX. With TopFed, we aim to offer biomedical scientists a single point of access through which the distributed TCGA data can be accessed in unison. We believe the proposed system can greatly help researchers in the biomedical domain to carry out their research effectively with TCGA, as the amount and diversity of the data exceed the ability of local resources to handle their retrieval and parsing.
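Conceptually, a client sees only one endpoint. A hedged sketch using the SPARQLWrapper Python package (the endpoint URL and predicate names here are hypothetical, not TopFed's actual deployment):

from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("http://example.org/topfed/sparql")  # hypothetical URL
endpoint.setQuery("""
PREFIX tcga: <http://example.org/tcga/>
SELECT ?patient ?result WHERE {
  ?patient tcga:bcr_patient_barcode ?barcode .
  ?patient tcga:result ?result .
} LIMIT 10
""")
endpoint.setReturnFormat(JSON)
for row in endpoint.query().convert()["results"]["bindings"]:
    print(row["patient"]["value"], row["result"]["value"])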
ERIC Educational Resources Information Center
Mayer, John; Kieras, David E.
Using a system based on a standard augmented transition network (ATN) parsing approach, this report describes a technique for the rapid development of natural language parsing called the High-Level Grammar Specification Language (HGSL). The first part of the report describes the syntax and semantics of HGSL and the network implementation of each of its…
ERIC Educational Resources Information Center
Patson, Nikole D.; Ferreira, Fernanda
2009-01-01
In three eyetracking studies, we investigated the role of conceptual plurality in initial parsing decisions in temporarily ambiguous sentences with reciprocal verbs (e.g., "While the lovers kissed the baby played alone"). We varied the subject of the first clause using three types of plural noun phrases: conjoined noun phrases ("the bride and the…
"gnparser": a powerful parser for scientific names based on Parsing Expression Grammar.
Mozzherin, Dmitry Y; Myltsev, Alexander A; Patterson, David J
2017-05-26
Scientific names in biology act as universal links: they allow us to cross-reference information about organisms globally. However, variations in the spelling of scientific names greatly diminish their ability to interconnect data. Such variations may include abbreviations, annotations, misspellings, etc. Authorship is part of a scientific name and may also differ significantly. To match all possible variations of a name we need to divide names into their elements and classify each element according to its role; we refer to this as 'parsing' the name. Parsing categorizes a name's elements into those that are stable and those that are prone to change. Names are matched first by combining them according to their stable elements; matches are then refined by examining the varying elements. This two-stage process dramatically improves the number and quality of matches, and is especially useful for automatic data exchange in the context of "Big Data" in biology. We introduce the Global Names Parser (gnparser), a tool for the Java Virtual Machine written in Scala, for parsing scientific names. It is based on a Parsing Expression Grammar. The parser can be applied to scientific names of any complexity: it assigns a semantic meaning (such as genus name, species epithet, rank, year of publication, authorship, annotations, etc.) to all elements of a name, and it is able to work with nested structures as in the names of hybrids. gnparser performs with ≈99% accuracy and processes 30 million name-strings/hour per CPU thread. The gnparser library is compatible with Scala, Java, R, Jython, and JRuby, and the parser can be used as a command-line application, a socket server, a web app, or a RESTful HTTP service. It is released under an open-source MIT license. gnparser is a fast, high-precision tool for biodiversity informaticians and biologists working with large numbers of scientific names. It can replace expensive and error-prone manual parsing and standardization of scientific names in many situations, and can quickly enhance the interoperability of distributed biological information.
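gnparser itself is a JVM tool; purely to illustrate the stable-versus-varying split described above, here is a toy Python regex (a real scientific-name grammar is far richer than this pattern):

import re

NAME = re.compile(
    r"^(?P<genus>[A-Z][a-z]+)\s+"
    r"(?P<epithet>[a-z]+)"
    r"(?:\s+\((?P<author>[^,)]+),?\s*(?P<year>\d{4})?\))?"
)

m = NAME.match("Homo sapiens (Linnaeus, 1758)")
if m:
    stable = (m.group("genus"), m.group("epithet"))   # match on these first
    varying = (m.group("author"), m.group("year"))    # then refine on these
    print(stable, varying)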
Telemetry and Science Data Software System
NASA Technical Reports Server (NTRS)
Bates, Lakesha; Hong, Liang
2011-01-01
The Telemetry and Science Data Software System (TSDSS) was designed to validate the operational health of a spacecraft, ease test verification, assist in debugging system anomalies, and provide trending data and advanced science analysis. In doing so, the system parses, processes, and organizes raw data from the Aquarius instrument both on the ground and in space. In addition, it provides a user-friendly telemetry viewer and an instant push-button test report generator. Existing ground data systems can parse data and provide simple processing, but they are limited in advanced science analysis and instant report generation. The TSDSS functions as an offline data analysis system during the integration and test (I&T) and mission operations phases. After raw data are downloaded from an instrument, TSDSS ingests the data files, parses them, converts telemetry to engineering units, and applies advanced algorithms to produce science level 0, 1, and 2 data products. Meanwhile, it automatically schedules upload of the raw data to a remote server and archives all intermediate and final values in a MySQL database in time order. All data saved in the system can be straightforwardly retrieved, exported, and migrated. Using TSDSS's interactive data visualization tool, a user can conveniently choose any combination and mathematical computation of telemetry points of interest over a large range of time periods (the life cycle of mission ground data and mission operations testing) and display a graphical and statistical view of the data. With this graphical user interface (GUI), the queried data and graphs can be exported and saved in multiple formats. The GUI is especially useful for trending data analysis, debugging anomalies, and advanced data analysis. At the request of the user, mission-specific instrument performance assessment reports can be generated with a single click of a button on the GUI. From the instrument level to the observatory level, the TSDSS has operated in support of functional and performance tests and in refining system calibration algorithms and coefficients, in sync with the Aquarius/SAC-D spacecraft. At the time of this reporting, it was prepared and set up to perform anomaly investigation for mission operations preceding the Aquarius/SAC-D launch on June 10, 2011.
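A schematic sketch of the ingest path described above (the packet layout, calibration coefficients, and the SQLite stand-in for the MySQL archive are all our own assumptions, not TSDSS internals):

import sqlite3, struct

CAL = {7: ("temp", 0.05, -40.0)}            # channel id -> (name, gain, offset)

def parse_packet(raw):
    t, chan_id, counts = struct.unpack(">IHH", raw)   # invented packet layout
    name, gain, offset = CAL[chan_id]
    return t, name, gain * counts + offset            # counts -> engineering units

db = sqlite3.connect(":memory:")            # stand-in for the MySQL archive
db.execute("CREATE TABLE telemetry (t INTEGER, chan TEXT, value REAL)")
db.execute("INSERT INTO telemetry VALUES (?, ?, ?)",
           parse_packet(struct.pack(">IHH", 1000, 7, 1200)))
print(db.execute("SELECT * FROM telemetry ORDER BY t").fetchall())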
Texture Mixing via Universal Simulation
2005-08-01
classes and universal simulation. Based on the well-known Lempel and Ziv (LZ) universal compression scheme, the universal type class of a one…length that produce the same tree (dictionary) under the Lempel-Ziv (LZ) incremental parsing defined in the well-known LZ78 universal compression…the well-known Lempel-Ziv parsing algorithm. The goal is not just to synthesize mixed textures, but to understand what texture is. We are currently…
ERIC Educational Resources Information Center
Qiao, Hong Liang; Sussex, Roland
1996-01-01
Presents methods for using the Longman Mini-Concordancer on tagged and parsed corpora rather than plain text corpora. The article discusses several aspects with models to be applied in the classroom as an aid to grammar learning. This paper suggests exercises suitable for teaching English to both native and nonnative speakers. (13 references)…
Parsing and Tagging of Bilingual Dictionary
2003-09-01
LAMP-TR-106, CAR-TR-991, CS-TR-4529, UMIACS-TR-2003-97. Parsing and Tagging of Bilingual Dictionary. Ma, Huanfeng; Karagol-Ayan, Burcu; David… Bilingual dictionaries hold great potential as a source of lexical resources for training and testing automated systems for optical character recognition, machine translation, and cross-language information retrieval. In this paper, we describe a system for extracting term lexicons from printed bilingual dictionaries…
Annotating Socio-Cultural Structures in Text
2012-10-31
…parts of speech (POS) within text, using the Stanford Part of Speech Tagger (Stanford Log-Linear, 2011). The ERDC-CERL taxonomy is then used to…annotated NP/VP Pane: shows the sentence parsed using the parts-of-speech tagger. Document View Pane: specifies the document (being annotated) in three…first parsed using the Stanford parts-of-speech tagger and converted to an XML document, both of which are done through the Import function
Text-to-phonemic transcription and parsing into mono-syllables of English text
NASA Astrophysics Data System (ADS)
Jusgir Mullick, Yugal; Agrawal, S. S.; Tayal, Smita; Goswami, Manisha
2004-05-01
The present paper describes a program that converts English text (entered through the normal computer keyboard) into its phonemic representation and then parses it into mono-syllables. For every letter, a set of context-based rules is defined in lexical order; a default rule is also defined separately for each letter. Beginning from the first letter of the word, the rules are checked and the most appropriate rule is applied to the letter to find its phonemic representation. If no matching rule is found, the default rule is applied. The current rule sets the next position to be analyzed. Proceeding in the same manner, the phonemic representation of each word can be found. For example, "reading" is represented as "rEdiNX" by applying the following rules: r --> r (move 1 position ahead); ead --> Ed (move 3 positions ahead); i --> i (move 1 position ahead); ng --> NX (move 2 positions ahead, i.e., end of word). The phonemic representations obtained from this procedure are parsed into mono-syllabic representations for various combinations such as CVC, CVCC, CV, CVCVC, etc. For example, the above phonemic representation is parsed as rEdiNX --> /rE/ /diNX/. This study is part of developing a TTS system for Indian English.
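A compact sketch of the described rule-application loop, using the paper's own example rules for "reading" (the fallback behaviour is our own illustrative filler for the per-letter default rules):

RULES = {"r": [("r", "r")],            # letter -> [(context, phonemes)]
         "e": [("ead", "Ed")],
         "i": [("i", "i")],
         "n": [("ng", "NX")]}

def to_phonemes(word):
    out, i = "", 0
    while i < len(word):
        for ctx, ph in RULES.get(word[i], []):
            if word.startswith(ctx, i):
                out += ph
                i += len(ctx)          # the matched rule sets the next position
                break
        else:                          # no context rule matched:
            out += word[i]             # stand-in for the letter's default rule
            i += 1
    return out

print(to_phonemes("reading"))          # -> rEdiNX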
Goloborodko, Anton A; Levitsky, Lev I; Ivanov, Mark V; Gorshkov, Mikhail V
2013-02-01
Pyteomics is a cross-platform, open-source Python library providing a rich set of tools for MS-based proteomics. It provides modules for reading LC-MS/MS data, search engine output, and protein sequence databases, and for theoretical prediction of retention times, electrochemical properties of polypeptides, mass and m/z calculations, and sequence parsing. Pyteomics is available under the Apache license; release versions are available at the Python Package Index (http://pypi.python.org/pyteomics), the source code repository at http://hg.theorchromo.ru/pyteomics, and documentation at http://packages.python.org/pyteomics. Pyteomics.biolccc documentation is available at http://packages.python.org/pyteomics.biolccc/. Questions on installation and usage can be addressed to the pyteomics mailing list: pyteomics@googlegroups.com.
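A few example calls from the library's documented modules (a sketch; exact outputs depend on the installed pyteomics version):

from pyteomics import mass, parser

print(mass.calculate_mass(sequence="PEPTIDE"))                  # monoisotopic mass
print(parser.parse("PEPTIDE", show_unmodified_termini=True))    # split into residues
print(parser.cleave("AKRGKR", parser.expasy_rules["trypsin"]))  # tryptic peptides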
Cognitive science as an interface between rational and mechanistic explanation.
Chater, Nick
2014-04-01
Cognitive science views thought as computation; and computation, by its very nature, can be understood in both rational and mechanistic terms. In rational terms, a computation solves some information-processing problem (e.g., mapping sensory information into a description of the external world; parsing a sentence; selecting among a set of possible actions). In mechanistic terms, a computation corresponds to a causal chain of events in a physical device (in an engineering context, a silicon chip; in a biological context, the nervous system). The discipline is thus at the interface between two very different styles of explanation--as the papers in the current special issue well illustrate, it explores the interplay of rational and mechanistic forces. Copyright © 2014 Cognitive Science Society, Inc.
Environmental and Molecular Science Laboratory Arrow
DOE Office of Scientific and Technical Information (OSTI.GOV)
2016-06-24
Arrows is a software package that combines NWChem, SQL and NoSQL databases, email, and social networks (e.g., Twitter, Tumblr) to simplify molecular and materials modeling and make these modeling capabilities accessible to all scientists and engineers. EMSL Arrows is very simple to use: the user just emails chemical reactions to arrows@emsl.pnnl.gov, and an email is sent back with thermodynamic, reaction pathway (kinetic), spectroscopy, and other results. EMSL Arrows parses the email and then searches the database for the compounds in the reactions. If a compound isn't there, an NWChem calculation is set up and submitted to compute it. Once the calculation is finished, the results are entered into the database and emailed back.
Ground Operations Aerospace Language (GOAL). Volume 2: Compiler
NASA Technical Reports Server (NTRS)
1973-01-01
The principal elements and functions of the Ground Operations Aerospace Language (GOAL) compiler are presented. The technique used to transcribe the syntax diagrams into machine processable format for use by the parsing routines is described. An explanation of the parsing technique used to process GOAL source statements is included. The compiler diagnostics and the output reports generated during a GOAL compilation are explained. A description of the GOAL program package is provided.
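Purely to illustrate the idea of driving parsing routines from a machine-processable transcription of syntax diagrams (the statement syntax and token classes below are invented for this sketch, not GOAL's):

DIAGRAMS = {"VERIFY": ["VERIFY", "NAME", "IS", "VALUE"]}   # invented statement syntax

def parse_statement(tokens, diagram):
    # Walk the token stream against the expected token classes, emitting a
    # diagnostic on the first mismatch, as a diagram-driven parser would.
    if len(tokens) != len(diagram):
        return "diagnostic: statement length mismatch"
    for expected, (kind, text) in zip(diagram, tokens):
        if kind != expected:
            return f"diagnostic: expected {expected}, found {kind} ({text!r})"
    return "statement accepted"

stmt = [("VERIFY", "VERIFY"), ("NAME", "PRESSURE"), ("IS", "IS"), ("VALUE", "750")]
print(parse_statement(stmt, DIAGRAMS["VERIFY"]))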
Learning for Semantic Parsing with Kernels under Various Forms of Supervision
2007-08-01
…natural language sentences to their formal executable meaning representations. This is a challenging problem and is critical for developing computing…sentences are semantically tractable. This indicates that Geoquery is a more challenging domain for semantic parsing than ATIS. In the past, there have been a…Combining parsers. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Very Large Corpora (EMNLP/VLC-99), pp. 187–194
Imitation as behaviour parsing.
Byrne, R W
2003-01-01
Non-human great apes appear to be able to acquire elaborate skills partly by imitation, raising the possibility of the transfer of skill by imitation in animals that have only rudimentary mentalizing capacities: in contrast to the frequent assumption that imitation depends on prior understanding of others' intentions. Attempts to understand the apes' behaviour have led to the development of a purely mechanistic model of imitation, the 'behaviour parsing' model, in which the statistical regularities that are inevitable in planned behaviour are used to decipher the organization of another agent's behaviour, and thence to imitate parts of it. Behaviour can thereby be understood statistically in terms of its correlations (circumstances of use, effects on the environment) without understanding of intentions or the everyday physics of cause-and-effect. Thus, imitation of complex, novel behaviour may not require mentalizing, but conversely behaviour parsing may be a necessary preliminary to attributing intention and cause. PMID:12689378
Perceived visual speed constrained by image segmentation
NASA Technical Reports Server (NTRS)
Verghese, P.; Stone, L. S.
1996-01-01
Little is known about how or where the visual system parses the visual scene into objects or surfaces. However, it is generally assumed that the segmentation and grouping of pieces of the image into discrete entities is due to 'later' processing stages, after the 'early' processing of the visual image by local mechanisms selective for attributes such as colour, orientation, depth, and motion. Speed perception is also thought to be mediated by early mechanisms tuned for speed. Here we show that manipulating the way in which an image is parsed changes the way in which local speed information is processed. Manipulations that cause multiple stimuli to appear as parts of a single patch degrade speed discrimination, whereas manipulations that perceptually divide a single large stimulus into parts improve discrimination. These results indicate that processes as early as speed perception may be constrained by the parsing of the visual image into discrete entities.
Revise and resubmit: How real-time parsing limitations influence grammar acquisition
Pozzan, Lucia; Trueswell, John C.
2015-01-01
We present the results from a three-day artificial language learning study on adults. The study examined whether sentence-parsing limitations, in particular, difficulties revising initial syntactic/semantic commitments during comprehension, shape learners’ ability to acquire a language. Findings show that both comprehension and production of morphology pertaining to sentence argument structure are delayed when this morphology consistently appears at the end, rather than at the beginning, of sentences in otherwise identical grammatical systems. This suggests that real-time processing constraints impact acquisition; morphological cues that tend to guide linguistic analyses are easier to learn than cues that revise these analyses. Parallel performance in production and comprehension indicates that parsing constraints affect grammatical acquisition, not just real-time commitments. Properties of the linguistic system (e.g., ordering of cues within a sentence) interact with the properties of the cognitive system (cognitive control and conflict-resolution abilities) and together affect language acquisition. PMID:26026607
Effects of Tasks on BOLD Signal Responses to Sentence Contrasts: Review and Commentary
Caplan, David; Gow, David
2010-01-01
Functional neuroimaging studies of syntactic processing have been interpreted as identifying the neural locations of parsing and interpretive operations. However, current behavioral studies of sentence processing indicate that many operations occur simultaneously with parsing and interpretation. In this review, we point to issues that arise in discriminating the effects of these concurrent processes from those of the parser/interpreter in neural measures and to approaches that may help resolve them. PMID:20932562
Generic Detection of Register Realignment
NASA Astrophysics Data System (ADS)
Ďurfina, Lukáš; Kolář, Dušan
2011-09-01
Register realignment is a binary obfuscation method used by malware writers. The paper introduces a method by which register realignment can be recognized through analysis based on scattered context grammars. Such an analysis includes exploring the bytes affected by realignment, finding new valid values for them, building the scattered context grammar, and parsing the obfuscated code with this grammar. The created grammar has the LL property, which makes parsing with this type of grammar tractable.
2007-08-01
In this domain, queries typically show a deeply nested structure, which makes the semantic parsing task rather challenging, e.g.: What states border…only 80% of the GEOQUERY queries are semantically tractable, which shows that GEOQUERY is indeed a more challenging domain than ATIS. Note that none…a particularly challenging task, because of the inherent ambiguity of natural languages on both sides. It has inspired a large body of research. In
2006-09-01
…is that it is universally applicable. That is, it can be used to parse an instance of any Chomsky Normal Form context-free grammar. This relative…Chomsky-Normal-Form grammar corresponding to the vehicle-specific data format, use of the Cocke-Younger-Kasami algorithm to generate a parse tree…05). The productions of a Chomsky Normal Form context-free grammar have three significant characteristics: • There are no useless symbols (i.e…
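For reference, a minimal CYK recognizer over an invented CNF grammar for a^n b^n (not the vehicle-data grammar discussed in the report), illustrating why the algorithm needs nothing from the grammar beyond its CNF productions:

from itertools import product

TERMINALS = {("A", "a"), ("B", "b")}                               # A -> a, B -> b
BINARIES = {("S", "A", "B"), ("S", "A", "S2"), ("S2", "S", "B")}   # S -> AB | A S2; S2 -> S B

def cyk(word):
    n = len(word)
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):                # length-1 spans
        table[i][1] = {lhs for lhs, t in TERMINALS if t == ch}
    for span, i in product(range(2, n + 1), range(n)):
        if i + span > n:
            continue
        for split in range(1, span):             # try every split point
            for lhs, left, right in BINARIES:
                if left in table[i][split] and right in table[i + split][span - split]:
                    table[i][span].add(lhs)
    return "S" in table[0][n]

print(cyk("ab"), cyk("aabb"), cyk("ba"))         # True True False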
A Database Design for the Brazilian Air Force Flying Unit Operational Control System.
1984-12-14
Company, 1980. 23. Pereira Filho, Jorge da Cunha, "Banco de Dados Hoje," Dados e Ideias (Brazilian magazine), 99: 55-63 (February 1979). 24. Rich…"QUAIS SAO AS CONDICOES DE EMPREGO OPERACIONAL DO MIRAGE 2120?" ["What are the operational deployment conditions of the Mirage 2120?"], LIKE "WHAT IS THE FORCESTATUS OF MIRAGE 2120?" % Parsed! % [Production added to system]…4 - QUAIS SAO AS CONDICOES DE EMPREGO OPERACIONAL DO MIRAGE 2120? % Parsed! % (S DEPLOC) = sbbr % (S-TIME) = 1000 % Endurance = 0200 % (SITCODE
Incremental Learning of Context Free Grammars by Parsing-Based Rule Generation and Rule Set Search
NASA Astrophysics Data System (ADS)
Nakamura, Katsuhiko; Hoshina, Akemi
This paper discusses recent improvements and extensions to the Synapse system for inductive inference of context-free grammars (CFGs) from sample strings. Synapse uses incremental learning, rule generation based on bottom-up parsing, and search over rule sets. The form of production rules in the previous system is extended from Revised Chomsky Normal Form, A→βγ, to Extended Chomsky Normal Form, which also includes A→B, where each of β and γ is either a terminal or a nonterminal symbol. From the result of bottom-up parsing, a rule generation mechanism synthesizes the minimum production rules required for parsing the positive samples. Instead of the inductive CYK algorithm of the previous version of Synapse, the improved version uses a novel rule generation method, called "bridging," which bridges the missing part of the derivation tree for a positive string. The improved version also employs a novel search strategy, called serial search, in addition to minimum rule set search. Synthesis of grammars by serial search is faster than by minimum rule set search in most cases; on the other hand, the CFGs generated by serial search are generally larger, and for some CFLs the system finds no appropriate grammar by serial search. The paper shows experimental results on incremental learning of several fundamental CFGs and compares the rule generation methods and search strategies.
'Visual’ parsing can be taught quickly without visual experience during critical periods
Reich, Lior; Amedi, Amir
2015-01-01
Cases of invasive sight restoration in congenitally blind adults have demonstrated that acquiring visual abilities is extremely challenging, presumably because visual experience during critical periods is crucial for learning visual-unique concepts (e.g., size constancy). Visual rehabilitation can also be achieved using sensory substitution devices (SSDs), which convey visual information non-invasively through sounds. We tested whether one critical concept - visual parsing, which is highly impaired in sight-restored patients - can be learned using an SSD. To this end, congenitally blind adults participated in a unique, relatively short (~70 hours) SSD-'vision' training. Following this, participants successfully parsed 2D and 3D visual objects. Control individuals naïve to SSDs demonstrated that while some aspects of parsing with an SSD are intuitive, the blind participants' success could not be attributed to auditory processing alone. Furthermore, we had a unique opportunity to compare the SSD users' abilities to those reported for sight-restored patients who performed similar tasks visually and who had months of eyesight. Intriguingly, the SSD users outperformed the patients on most criteria tested. These results suggest that with adequate training and technologies, key high-order visual features can be quickly acquired in adulthood, and that a lack of visual experience during critical periods can be partly compensated for. Practically, these findings highlight the potential of SSDs as standalone aids or combined with invasive restoration approaches. PMID:26482105
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vega, Sebastián L.; Liu, Er; Arvind, Varun
Stem and progenitor cells that exhibit significant regenerative potential and critical roles in cancer initiation and progression remain difficult to characterize. Cell fates are determined by reciprocal signaling between the cell microenvironment and the nucleus; hence parameters derived from nuclear remodeling are ideal candidates for stem/progenitor cell characterization. Here we applied high-content, single cell analysis of nuclear shape and organization to examine stem and progenitor cells destined to distinct differentiation endpoints, yet undistinguishable by conventional methods. Nuclear descriptors defined through image informatics classified mesenchymal stem cells poised to either adipogenic or osteogenic differentiation, and oligodendrocyte precursors isolated from different regions of the brain and destined to distinct astrocyte subtypes. Nuclear descriptors also revealed early changes in stem cells after chemical oncogenesis, allowing the identification of a class of cancer-mitigating biomaterials. To capture the metrology of nuclear changes, we developed a simple and quantitative "imaging-derived" parsing index, which reflects the dynamic evolution of the high-dimensional space of nuclear organizational features. A comparative analysis of parsing outcomes via either nuclear shape or textural metrics of the nuclear structural protein NuMA indicates the nuclear shape alone is a weak phenotypic predictor. In contrast, variations in the NuMA organization parsed emergent cell phenotypes and discerned emergent stages of stem cell transformation, supporting a prognosticating role for this protein in the outcomes of nuclear functions. - Highlights: • High-content analysis of nuclear shape and organization classify stem and progenitor cells poised for distinct lineages. • Early oncogenic changes in mesenchymal stem cells (MSCs) are also detected with nuclear descriptors. • A new class of cancer-mitigating biomaterials was identified based on image informatics. • Textural metrics of the nuclear structural protein NuMA are sufficient to parse emergent cell phenotypes.
An Experiment in Scientific Program Understanding
NASA Technical Reports Server (NTRS)
Stewart, Mark E. M.; Owen, Karl (Technical Monitor)
2000-01-01
This paper concerns a procedure that analyzes aspects of the meaning or semantics of scientific and engineering code. This procedure involves taking a user's existing code, adding semantic declarations for some primitive variables, and parsing this annotated code using multiple, independent expert parsers. These semantic parsers encode domain knowledge and recognize formulae in different disciplines including physics, numerical methods, mathematics, and geometry. The parsers will automatically recognize and document some static, semantic concepts and help locate some program semantic errors. Results are shown for three intensively studied codes and seven blind test cases; all test cases are state of the art scientific codes. These techniques may apply to a wider range of scientific codes. If so, the techniques could reduce the time, risk, and effort required to develop and modify scientific codes.
1989-09-30
…parses, in a second experiment. This procedure used PUNDIT's Selection Pattern Query and Response (SPQR) component [Lang 1988]. We first used SPQR in…messages pattern. SPQR continues the analysis of the ISR from each domain, and the resulting output is…and the parsing of the sentence is allowed to…UNISYS, P.O. Box 517, Paoli, PA 19301. ABSTRACT: One obvious benefit of acquiring domain…knowledge. This paper presents SPQR (Selectional Pattern Queries…
Exploiting graph kernels for high performance biomedical relation extraction.
Panyam, Nagesh C; Verspoor, Karin; Cohn, Trevor; Ramamohanarao, Kotagiri
2018-01-30
Relation extraction from biomedical publications is an important task in the area of semantic mining of text. Kernel methods for supervised relation extraction are often preferred over manual feature engineering methods when classifying highly ordered structures such as trees and graphs obtained from syntactic parsing of a sentence. Tree kernels such as the Subset Tree Kernel and Partial Tree Kernel have been shown to be effective for classifying constituency parse trees and basic dependency parse graphs of a sentence. Graph kernels such as the All Path Graph (APG) kernel and Approximate Subgraph Matching (ASM) kernel have been shown to be suitable for classifying general graphs with cycles, such as the enhanced dependency parse graph of a sentence. In this work, we present a high-performance Chemical-Induced Disease (CID) relation extraction system. We present a comparative study of kernel methods for the CID task and also extend our study to the Protein-Protein Interaction (PPI) extraction task, another important biomedical relation extraction task. We discuss novel modifications to the ASM kernel to boost its performance and a method to apply graph kernels for extracting relations expressed in multiple sentences. Our system for CID relation extraction attains an F-score of 60%, without using external knowledge sources or task-specific heuristics or rules. In comparison, the state-of-the-art Chemical-Disease Relation Extraction system achieves an F-score of 56% using an ensemble of multiple machine learning methods, which is then boosted to 61% with a rule-based system employing task-specific post-processing rules. For the CID task, graph kernels outperform tree kernels substantially, and the best performance is obtained with the APG kernel, which attains an F-score of 60%, followed by the ASM kernel at 57%. The performance difference between the ASM and APG kernels for CID sentence-level relation extraction is not significant. In our evaluation of ASM for the PPI task, ASM performed better than the APG kernel for the BioInfer dataset in the Area Under Curve (AUC) measure (74% vs. 69%). However, for all the other PPI datasets, namely AIMed, HPRD50, IEPA and LLL, ASM is substantially outperformed by the APG kernel in F-score and AUC measures. We demonstrate high-performance Chemical-Induced Disease relation extraction without employing external knowledge sources or task-specific heuristics. Our work shows that graph kernels are effective in extracting relations that are expressed in multiple sentences, and that the graph kernels, namely the ASM and APG kernels, substantially outperform the tree kernels. Among the graph kernels, we showed the ASM kernel to be effective for biomedical relation extraction, with performance comparable to the APG kernel on datasets such as CID sentence-level relation extraction and BioInfer in PPI. Overall, the APG kernel is shown to be significantly more accurate than the ASM kernel, achieving better performance on most datasets.
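To illustrate the idea behind path-based graph kernels such as APG (though not the published implementations), the following Python sketch counts shared label sequences along entity-to-entity paths in dependency parse graphs; the graphs, labels, and entity positions are invented for the example.

```python
# Illustrative sketch (not the APG/ASM implementations): a simple
# path-based similarity between two dependency parse graphs, in the
# spirit of all-path graph kernels.
import networkx as nx

def labeled_paths(g, source, target, cutoff=4):
    """Collect label sequences along simple paths between two entity nodes."""
    paths = nx.all_simple_paths(g, source, target, cutoff=cutoff)
    return {tuple(g.nodes[n]["label"] for n in p) for p in paths}

def path_kernel(g1, e1, g2, e2):
    """Count label sequences shared by the entity-to-entity paths of two graphs."""
    return len(labeled_paths(g1, *e1) & labeled_paths(g2, *e2))

g = nx.Graph()
g.add_nodes_from([(0, {"label": "CHEM"}), (1, {"label": "induce"}),
                  (2, {"label": "DISEASE"})])
g.add_edges_from([(0, 1), (1, 2)])
print(path_kernel(g, (0, 2), g, (0, 2)))  # 1: the graph matches itself
```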
owlcpp: a C++ library for working with OWL ontologies.
Levin, Mikhail K; Cowell, Lindsay G
2015-01-01
The increasing use of ontologies highlights the need for a library for working with ontologies that is efficient, accessible from various programming languages, and compatible with common computational platforms. We developed owlcpp, a library for storing and searching RDF triples, parsing RDF/XML documents, converting triples into OWL axioms, and reasoning. The library is written in ISO-compliant C++ to facilitate efficiency, portability, and accessibility from other programming languages. Internally, owlcpp uses the Raptor RDF Syntax library for parsing RDF/XML and the FaCT++ library for reasoning. The current version of owlcpp is supported under Linux, OSX, and Windows platforms and provides an API for Python. The results of our evaluation show that, compared to other commonly used libraries, owlcpp is significantly more efficient in terms of memory usage and searching RDF triple stores. owlcpp performs strict parsing and detects errors ignored by other libraries, thus reducing the possibility of incorrect semantic interpretation of ontologies. owlcpp is available at http://owl-cpp.sf.net/ under the Boost Software License, Version 1.0.
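owlcpp itself is a C++ library, but the parsing step it performs can be illustrated with an analogous Python sketch using rdflib (not owlcpp's API): an RDF/XML document is parsed strictly into a triple store that can then be iterated.

```python
# A minimal Python analogue of the parsing step owlcpp performs
# (owlcpp itself is C++); uses rdflib, not owlcpp's API.
from rdflib import Graph

rdf_xml = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/onto#Cell"/>
</rdf:RDF>"""

g = Graph()
g.parse(data=rdf_xml, format="xml")   # strict RDF/XML parsing
for s, p, o in g:                     # iterate the stored triples
    print(s, p, o)
```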
FastaValidator: an open-source Java library to parse and validate FASTA formatted sequences.
Waldmann, Jost; Gerken, Jan; Hankeln, Wolfgang; Schweer, Timmy; Glöckner, Frank Oliver
2014-06-14
Advances in sequencing technologies challenge the efficient importing and validation of FASTA formatted sequence data, which is still a prerequisite for most bioinformatic tools and pipelines. Comparative analysis of commonly used Bio*-frameworks (BioPerl, BioJava and Biopython) shows that their scalability and accuracy are hampered. FastaValidator represents a platform-independent, standardized, light-weight software library written in the Java programming language. It targets computer scientists and bioinformaticians writing software that needs to parse large amounts of sequence data quickly and accurately. For end-users, FastaValidator includes an interactive out-of-the-box validation of FASTA formatted files, as well as a non-interactive mode designed for high-throughput validation in software pipelines. The accuracy and performance of the FastaValidator library qualify it for large data sets such as those commonly produced by massively parallel (NGS) technologies. It offers scientists a fast, accurate and standardized method for parsing and validating FASTA formatted sequence data.
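A minimal sketch of the kind of checks such a validator performs (FastaValidator itself is a Java library; the alphabet and error conventions here are simplifying assumptions):

```python
# Minimal FASTA validation sketch: header and alphabet checks only.
def validate_fasta(lines, alphabet=set("ACGTNacgtn")):
    errors, seen_header = [], False
    for i, line in enumerate(lines, 1):
        line = line.rstrip("\n")
        if line.startswith(">"):
            seen_header = True
            if len(line) == 1:
                errors.append((i, "empty header"))
        elif line:
            if not seen_header:
                errors.append((i, "sequence before first header"))
            bad = set(line) - alphabet
            if bad:
                errors.append((i, "invalid characters: " + "".join(sorted(bad))))
    return errors

print(validate_fasta([">seq1", "ACGT", "ACXT"]))  # [(3, 'invalid characters: X')]
```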
Pragmatic precision oncology: the secondary uses of clinical tumor molecular profiling
Thota, Ramya; Staggs, David B; Johnson, Douglas B; Warner, Jeremy L
2016-01-01
Background Precision oncology increasingly utilizes molecular profiling of tumors to determine treatment decisions with targeted therapeutics. The molecular profiling data is valuable in the treatment of individual patients as well as for multiple secondary uses. Objective To automatically parse, categorize, and aggregate clinical molecular profile data generated during cancer care, and to use this data to address multiple secondary use cases. Methods A system to parse, categorize, and aggregate molecular profile data was created. A naïve Bayesian classifier categorized results according to clinical groups. The accuracy of these systems was validated against a published expertly-curated subset of molecular profiling data. Results Following one year of operation, 819 samples have been accurately parsed and categorized to generate a data repository of 10,620 genetic variants. The database has been used for operational, clinical trial, and discovery science research. Conclusions A real-time database of molecular profiling data is a pragmatic solution to several knowledge management problems in the practice and science of precision oncology. PMID:27026612
Vega, Sebastián L; Liu, Er; Arvind, Varun; Bushman, Jared; Sung, Hak-Joon; Becker, Matthew L; Lelièvre, Sophie; Kohn, Joachim; Vidi, Pierre-Alexandre; Moghe, Prabhas V
2017-02-01
Stem and progenitor cells that exhibit significant regenerative potential and critical roles in cancer initiation and progression remain difficult to characterize. Cell fates are determined by reciprocal signaling between the cell microenvironment and the nucleus; hence parameters derived from nuclear remodeling are ideal candidates for stem/progenitor cell characterization. Here we applied high-content, single cell analysis of nuclear shape and organization to examine stem and progenitor cells destined to distinct differentiation endpoints, yet indistinguishable by conventional methods. Nuclear descriptors defined through image informatics classified mesenchymal stem cells poised to either adipogenic or osteogenic differentiation, and oligodendrocyte precursors isolated from different regions of the brain and destined to distinct astrocyte subtypes. Nuclear descriptors also revealed early changes in stem cells after chemical oncogenesis, allowing the identification of a class of cancer-mitigating biomaterials. To capture the metrology of nuclear changes, we developed a simple and quantitative "imaging-derived" parsing index, which reflects the dynamic evolution of the high-dimensional space of nuclear organizational features. A comparative analysis of parsing outcomes via either nuclear shape or textural metrics of the nuclear structural protein NuMA indicates that nuclear shape alone is a weak phenotypic predictor. In contrast, variations in NuMA organization parsed emergent cell phenotypes and discerned emergent stages of stem cell transformation, supporting a prognosticating role for this protein in the outcomes of nuclear functions. Copyright © 2017 Elsevier Inc. All rights reserved.
Automated extraction of Biomarker information from pathology reports.
Lee, Jeongeun; Song, Hyun-Je; Yoon, Eunsil; Park, Seong-Bae; Park, Sung-Hye; Seo, Jeong-Wook; Park, Peom; Choi, Jinwook
2018-05-21
Pathology reports are written in free-text form, which precludes efficient data gathering. We aimed to overcome this limitation and design an automated system for extracting biomarker profiles from accumulated pathology reports. We designed a new data model for representing biomarker knowledge. The automated system parses immunohistochemistry reports based on a "slide paragraph" unit defined as a set of immunohistochemistry findings obtained for the same tissue slide. Pathology reports are parsed using context-free grammar for immunohistochemistry, and using a tree-like structure for surgical pathology. The performance of the approach was validated on manually annotated pathology reports of 100 randomly selected patients managed at Seoul National University Hospital. High F-scores were obtained for parsing biomarker name and corresponding test results (0.999 and 0.998, respectively) from the immunohistochemistry reports, compared to relatively poor performance for parsing surgical pathology findings. However, applying the proposed approach to our single-center dataset revealed information on 221 unique biomarkers, which represents a richer result than biomarker profiles obtained based on the published literature. Owing to the data representation model, the proposed approach can associate biomarker profiles extracted from an immunohistochemistry report with corresponding pathology findings listed in one or more surgical pathology reports. Term variations are resolved by normalization to corresponding preferred terms determined by expanded dictionary look-up and text similarity-based search. Our proposed approach for biomarker data extraction addresses key limitations regarding data representation and can handle reports prepared in the clinical setting, which often contain incomplete sentences, typographical errors, and inconsistent formatting.
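The term-normalization step described above (expanded dictionary look-up with a text-similarity fallback) might look like the following sketch; the dictionary entries and similarity cutoff are hypothetical:

```python
# Sketch of biomarker term normalization: exact dictionary look-up
# first, then a text-similarity fallback. Entries are toy examples.
import difflib

PREFERRED = {"her2": "HER2", "c-erbb2": "HER2", "ki67": "Ki-67", "ki-67": "Ki-67"}

def normalize(term):
    key = term.strip().lower()
    if key in PREFERRED:                       # exact dictionary hit
        return PREFERRED[key]
    close = difflib.get_close_matches(key, PREFERRED, n=1, cutoff=0.8)
    return PREFERRED[close[0]] if close else None  # similarity fallback

print(normalize("Ki67"), normalize("c-erbB2"), normalize("unknown"))
```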
Natural-Language Parser for PBEM
NASA Technical Reports Server (NTRS)
James, Mark
2010-01-01
A computer program called "Hunter" accepts, as input, a colloquial-English description of a set of policy-based-management rules, and parses that description into a form useable by policy-based enterprise management (PBEM) software. PBEM is a rules-based approach suitable for automating some management tasks. PBEM simplifies the management of a given enterprise through establishment of policies addressing situations that are likely to occur. Hunter was developed to have a unique capability to extract the intended meaning instead of focusing on parsing the exact ways in which individual words are used.
Translation lexicon acquisition from bilingual dictionaries
NASA Astrophysics Data System (ADS)
Doermann, David S.; Ma, Huanfeng; Karagol-Ayan, Burcu; Oard, Douglas W.
2001-12-01
Bilingual dictionaries hold great potential as a source of lexical resources for training automated systems for optical character recognition, machine translation and cross-language information retrieval. In this work we describe a system for extracting term lexicons from printed copies of bilingual dictionaries. We describe our approach to page and definition segmentation and entry parsing. We have used the approach to parse a number of dictionaries, and we demonstrate retrieval results using a French-English dictionary to generate a translation lexicon, with a corpus of English queries applied to French documents to evaluate cross-language IR.
PyParse: a semiautomated system for scoring spoken recall data.
Solway, Alec; Geller, Aaron S; Sederberg, Per B; Kahana, Michael J
2010-02-01
Studies of human memory often generate data on the sequence and timing of recalled items, but scoring such data using conventional methods is difficult or impossible. We describe a Python-based semiautomated system that greatly simplifies this task. This software, called PyParse, can easily be used in conjunction with many common experiment authoring systems. Scored data is output in a simple ASCII format and can be accessed with the programming language of choice, allowing for the identification of features such as correct responses, prior-list intrusions, extra-list intrusions, and repetitions.
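Downstream use of such ASCII output might look like the sketch below; the field layout (item, onset in ms, list number) is an assumption for illustration, not PyParse's documented format:

```python
# Hypothetical sketch of consuming PyParse-style scored recall data and
# classifying each response; the line format is assumed.
def score_recalls(lines, presented):
    events = []
    for line in lines:
        item, onset, lst = line.split()
        lst = int(lst)
        if item in presented.get(lst, set()):
            kind = "correct"
        elif any(item in items for l, items in presented.items() if l < lst):
            kind = "prior-list intrusion"
        else:
            kind = "extra-list intrusion"
        events.append((item, int(onset), kind))
    return events

presented = {1: {"cat", "dog"}, 2: {"tree", "rock"}}
print(score_recalls(["tree 812 2", "dog 1544 2", "zebra 2903 2"], presented))
```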
Sawja: Static Analysis Workshop for Java
NASA Astrophysics Data System (ADS)
Hubert, Laurent; Barré, Nicolas; Besson, Frédéric; Demange, Delphine; Jensen, Thomas; Monfort, Vincent; Pichardie, David; Turpin, Tiphaine
Static analysis is a powerful technique for automatic verification of programs but raises major engineering challenges when developing a full-fledged analyzer for a realistic language such as Java. Efficiency and precision of such a tool rely partly on low level components which only depend on the syntactic structure of the language and therefore should not be redesigned for each implementation of a new static analysis. This paper describes the Sawja library: a static analysis workshop fully compliant with Java 6 which provides OCaml modules for efficiently manipulating Java bytecode programs. We present the main features of the library, including i) efficient functional data-structures for representing a program with implicit sharing and lazy parsing, ii) an intermediate stack-less representation, and iii) fast computation and manipulation of complete programs. We provide experimental evaluations of the different features with respect to time, memory and precision.
Parsing and Quantification of Raw Orbitrap Mass Spectrometer Data Using RawQuant.
Kovalchik, Kevin A; Moggridge, Sophie; Chen, David D Y; Morin, Gregg B; Hughes, Christopher S
2018-06-01
Effective analysis of protein samples by mass spectrometry (MS) requires careful selection and optimization of a range of experimental parameters. As the output from the primary detection device, the "raw" MS data file can be used to gauge the success of a given sample analysis. However, the closed-source nature of the standard raw MS file can complicate effective parsing of the data contained within. To ease and increase the range of analyses possible, the RawQuant tool was developed to enable parsing of raw MS files derived from Thermo Orbitrap instruments to yield meta and scan data in an openly readable text format. RawQuant can be commanded to export user-friendly files containing MS1, MS2, and MS3 metadata as well as matrices of quantification values based on isobaric tagging approaches. In this study, the utility of RawQuant is demonstrated in several scenarios: (1) reanalysis of shotgun proteomics data for the identification of the human proteome, (2) reanalysis of experiments utilizing isobaric tagging for whole-proteome quantification, and (3) analysis of a novel bacterial proteome and synthetic peptide mixture for assessing quantification accuracy when using isobaric tags. Together, these analyses successfully demonstrate RawQuant for the efficient parsing and quantification of data from raw Thermo Orbitrap MS files acquired in a range of common proteomics experiments. In addition, the individual analyses using RawQuant highlights parametric considerations in the different experimental sets and suggests targetable areas to improve depth of coverage in identification-focused studies and quantification accuracy when using isobaric tags.
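Downstream analysis of RawQuant's text exports could proceed as in this hedged sketch; the file name and column naming are assumptions for illustration, not RawQuant's actual schema:

```python
# Sketch of downstream use of a RawQuant-style tab-delimited export;
# "sample_MS2_quant.txt" and the "tmt" column prefix are hypothetical.
import pandas as pd

quant = pd.read_csv("sample_MS2_quant.txt", sep="\t")
channels = [c for c in quant.columns if c.startswith("tmt")]
print(quant[channels].sum())    # total reporter-ion signal per channel
print(quant[channels].corr())   # channel-to-channel correlation
```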
Parser Combinators: a Practical Application for Generating Parsers for NMR Data
Fenwick, Matthew; Weatherby, Gerard; Ellis, Heidi JC; Gryk, Michael R.
2013-01-01
Nuclear Magnetic Resonance (NMR) spectroscopy is a technique for acquiring protein data at atomic resolution and determining the three-dimensional structure of large protein molecules. A typical structure determination process results in the deposition of large data sets to the BMRB (Bio-Magnetic Resonance Data Bank). This data is stored and shared in a file format called NMR-Star. This format is syntactically and semantically complex, making it challenging to parse. Nevertheless, parsing these files is crucial to applying the vast amounts of biological information stored in NMR-Star files, allowing researchers to harness the results of previous studies to direct and validate future work. One powerful approach for parsing files is to apply a Backus-Naur Form (BNF) grammar, which is a high-level model of a file format. Translation of the grammatical model to an executable parser may then be accomplished automatically. This paper shows how we applied a model BNF grammar of the NMR-Star format to create a free, open-source parser, using a method that originated in the functional programming world known as “parser combinators”. This paper demonstrates the effectiveness of a principled approach to file specification and parsing. This paper also builds upon our previous work [1], in that 1) it applies concepts from Functional Programming (which is relevant even though the implementation language, Java, is more mainstream than Functional Programming), and 2) all work and accomplishments from this project will be made available under standard open source licenses to provide the community with the opportunity to learn from our techniques and methods. PMID:24352525
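The parser-combinator style itself is easy to convey in a short sketch (here in Python rather than the Java of the cited work): each primitive parser maps a (text, position) pair to a (value, new position) pair or a failure, and combinators build larger parsers from smaller ones.

```python
# Minimal parser-combinator sketch: parsers return (value, new_pos) or None.
def char(c):
    def p(s, i):
        return (c, i + 1) if i < len(s) and s[i] == c else None
    return p

def many1(p):                       # one or more repetitions of p
    def q(s, i):
        vals, r = [], p(s, i)
        while r:
            v, i = r
            vals.append(v)
            r = p(s, i)
        return (vals, i) if vals else None
    return q

def alt(*ps):                       # first alternative that succeeds
    def q(s, i):
        for p in ps:
            r = p(s, i)
            if r:
                return r
        return None
    return q

digit = alt(*[char(d) for d in "0123456789"])
number = many1(digit)
print(number("123abc", 0))          # (['1', '2', '3'], 3)
```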
Vosse, Theo; Kempen, Gerard
2009-12-01
We introduce a novel computer implementation of the Unification-Space parser (Vosse and Kempen in Cognition 75:105-143, 2000) in the form of a localist neural network whose dynamics is based on interactive activation and inhibition. The wiring of the network is determined by Performance Grammar (Kempen and Harbusch in Verb constructions in German and Dutch. Benjamins, Amsterdam, 2003), a lexicalist formalism with feature unification as binding operation. While the network is processing input word strings incrementally, the evolving shape of parse trees is represented in the form of changing patterns of activation in nodes that code for syntactic properties of words and phrases, and for the grammatical functions they fulfill. The system is capable, at least qualitatively and rudimentarily, of simulating several important dynamic aspects of human syntactic parsing, including garden-path phenomena and reanalysis, effects of complexity (various types of clause embeddings), fault-tolerance in case of unification failures and unknown words, and predictive parsing (expectation-based analysis, surprisal effects). English is the target language of the parser described.
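The interactive activation and inhibition dynamics can be sketched with a toy update rule; the weights, decay, and node meanings below are illustrative assumptions, not the published model:

```python
# Toy interactive-activation update for a localist network: two nodes
# coding competing syntactic analyses inhibit each other while external
# evidence excites them. All parameters are illustrative.
import numpy as np

def step(a, W, external, decay=0.1, lo=0.0, hi=1.0):
    net = W @ a + external                          # net input per node
    grow = np.where(net > 0, (hi - a) * net, (a - lo) * net)
    return np.clip(a + grow - decay * (a - lo), lo, hi)

W = np.array([[0.0, -0.5], [-0.5, 0.0]])            # mutual inhibition
a = np.array([0.5, 0.5])
for _ in range(20):
    a = step(a, W, external=np.array([0.2, 0.1]))
print(a)   # the better-supported analysis wins the competition
```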
Representations of the language recognition problem for a theorem prover
NASA Technical Reports Server (NTRS)
Minker, J.; Vanderbrug, G. J.
1972-01-01
Two representations of the language recognition problem for a theorem prover in first order logic are presented and contrasted. One of the representations is based on the familiar method of generating sentential forms of the language, and the other is based on the Cocke parsing algorithm. An augmented theorem prover is described which permits recognition of recursive languages. The state-transformation method developed by Cordell Green to construct problem solutions in resolution-based systems can be used to obtain the parse tree. In particular, the end-order traversal of the parse tree is derived in one of the representations. An inference system, termed the cycle inference system, is defined which makes it possible for the theorem prover to model the method on which the representation is based. The general applicability of the cycle inference system to state space problems is discussed. Given an unsatisfiable set S, where each clause has at most one positive literal, it is shown that there exists an input proof. The clauses for the two representations satisfy these conditions, as do many state space problems.
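The Cocke (CYK) recognition step on which the second representation is based can be sketched as follows for a grammar in Chomsky normal form; the toy grammar is an illustration:

```python
# CYK recognition for a grammar in Chomsky normal form: table[i][j]
# holds the nonterminals deriving words[i:j].
def cyk(words, lexicon, rules):
    n = len(words)
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][i + 1] = {A for A, t in lexicon if t == w}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for k in range(i + 1, i + span):        # split point
                for A, B, C in rules:               # A -> B C
                    if B in table[i][k] and C in table[k][i + span]:
                        table[i][i + span].add(A)
    return "S" in table[0][n]

lexicon = [("NP", "she"), ("V", "parses"), ("NP", "sentences")]
rules = [("S", "NP", "VP"), ("VP", "V", "NP")]
print(cyk("she parses sentences".split(), lexicon, rules))  # True
```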
NASA Astrophysics Data System (ADS)
Lee, Kyu J.; Kunii, T. L.; Noma, T.
1993-01-01
In this paper, we propose a syntactic pattern recognition method for non-schematic drawings, based on a new attributed graph grammar with flexible embedding. In our graph grammar, the embedding rule permits the nodes of a guest graph to be arbitrarily connected with the nodes of a host graph. The ambiguity caused by this flexible embedding is controlled through the evaluation of synthesized attributes and a check of context sensitivity. To integrate parsing with the synthesized-attribute evaluation and the context-sensitivity check, we also develop a bottom-up parsing algorithm.
KEGGParser: parsing and editing KEGG pathway maps in Matlab.
Arakelyan, Arsen; Nersisyan, Lilit
2013-02-15
KEGG pathway database is a collection of manually drawn pathway maps accompanied by KGML-format files intended for use in automatic analysis. KGML files, however, do not contain the information required for a complete reproduction of all the events indicated in the static image of a pathway map. Several parsers and editors of KEGG pathways exist for processing KGML files. We introduce KEGGParser, a MATLAB-based tool for KEGG pathway parsing, semiautomatic fixing, editing, visualization and analysis within the MATLAB environment. It also works with Scilab. The source code is available at http://www.mathworks.com/matlabcentral/fileexchange/37561.
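The KGML parsing that KEGGParser performs in MATLAB can be sketched in Python with the standard XML library; the element and attribute names follow the public KGML schema, but the snippet is illustrative, not KEGGParser itself:

```python
# Sketch of KGML parsing: pull gene entries and the relations between
# them from a (truncated, made-up) KGML fragment.
import xml.etree.ElementTree as ET

kgml = """<pathway name="path:hsa04010" title="MAPK signaling pathway">
  <entry id="1" name="hsa:5594" type="gene"/>
  <entry id="2" name="hsa:5595" type="gene"/>
  <relation entry1="1" entry2="2" type="PPrel">
    <subtype name="activation" value="--&gt;"/>
  </relation>
</pathway>"""

root = ET.fromstring(kgml)
genes = {e.get("id"): e.get("name") for e in root.findall("entry")}
for rel in root.findall("relation"):
    sub = rel.find("subtype")
    print(genes[rel.get("entry1")], sub.get("name"), genes[rel.get("entry2")])
```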
A novel argument for the Universality of Parsing principles.
Grillo, Nino; Costa, João
2014-10-01
Previous work on Relative Clause attachment has overlooked a crucial grammatical distinction across both the languages and structures tested: the selective availability of Pseudo Relatives. We reconsider the literature in light of this observation and argue that, all else being equal, local attachment is found with genuine Relative Clauses and that non-local attachment emerges when their surface identical imposters, Pseudo Relatives, are available. Hence, apparent cross-linguistic variation in parsing preferences is reducible to grammatical factors. The results from two novel experiments in Italian are presented in support of these conclusions. Copyright © 2014 Elsevier B.V. All rights reserved.
Locating and parsing bibliographic references in HTML medical articles
Zou, Jie; Le, Daniel; Thoma, George R.
2010-01-01
The set of references that typically appear toward the end of journal articles is sometimes, though not always, a field in bibliographic (citation) databases. But even if references do not constitute such a field, they can be useful as a preprocessing step in the automated extraction of other bibliographic data from articles, as well as in computer-assisted indexing of articles. Automation in data extraction and indexing to minimize human labor is key to the affordable creation and maintenance of large bibliographic databases. Extracting the components of references, such as author names, article title, journal name, publication date and other entities, is therefore a valuable and sometimes necessary task. This paper describes a two-step process using statistical machine learning algorithms, to first locate the references in HTML medical articles and then to parse them. Reference locating identifies the reference section in an article and then decomposes it into individual references. We formulate this step as a two-class classification problem based on text and geometric features. An evaluation conducted on 500 articles drawn from 100 medical journals achieves near-perfect precision and recall rates for locating references. Reference parsing identifies the components of each reference. For this second step, we implement and compare two algorithms. One relies on sequence statistics and trains a Conditional Random Field. The other focuses on local feature statistics and trains a Support Vector Machine to classify each individual word, followed by a search algorithm that systematically corrects low confidence labels if the label sequence violates a set of predefined rules. The overall performance of these two reference-parsing algorithms is about the same: above 99% accuracy at the word level, and over 97% accuracy at the chunk level. PMID:20640222
Locating and parsing bibliographic references in HTML medical articles.
Zou, Jie; Le, Daniel; Thoma, George R
2010-06-01
The set of references that typically appear toward the end of journal articles is sometimes, though not always, a field in bibliographic (citation) databases. But even if references do not constitute such a field, they can be useful as a preprocessing step in the automated extraction of other bibliographic data from articles, as well as in computer-assisted indexing of articles. Automation in data extraction and indexing to minimize human labor is key to the affordable creation and maintenance of large bibliographic databases. Extracting the components of references, such as author names, article title, journal name, publication date and other entities, is therefore a valuable and sometimes necessary task. This paper describes a two-step process using statistical machine learning algorithms, to first locate the references in HTML medical articles and then to parse them. Reference locating identifies the reference section in an article and then decomposes it into individual references. We formulate this step as a two-class classification problem based on text and geometric features. An evaluation conducted on 500 articles drawn from 100 medical journals achieves near-perfect precision and recall rates for locating references. Reference parsing identifies the components of each reference. For this second step, we implement and compare two algorithms. One relies on sequence statistics and trains a Conditional Random Field. The other focuses on local feature statistics and trains a Support Vector Machine to classify each individual word, followed by a search algorithm that systematically corrects low confidence labels if the label sequence violates a set of predefined rules. The overall performance of these two reference-parsing algorithms is about the same: above 99% accuracy at the word level, and over 97% accuracy at the chunk level.
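A sketch of the CRF half of such a reference parser, using the third-party sklearn-crfsuite package; the features and label set are simplified assumptions:

```python
# Illustrative CRF-based reference parsing: each token gets a feature
# dict, and the CRF labels the whole sequence. Training data is a toy.
import sklearn_crfsuite

def word_features(tokens, i):
    w = tokens[i]
    return {"lower": w.lower(), "is_digit": w.isdigit(),
            "is_title": w.istitle(), "position": i / len(tokens)}

train_tokens = [["Smith", "J", "Parsing", "references", "2010"]]
train_labels = [["AUTHOR", "AUTHOR", "TITLE", "TITLE", "YEAR"]]
X = [[word_features(t, i) for i in range(len(t))] for t in train_tokens]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, train_labels)
print(crf.predict(X))   # predicted label sequence for each reference
```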
A hierarchical methodology for urban facade parsing from TLS point clouds
NASA Astrophysics Data System (ADS)
Li, Zhuqiang; Zhang, Liqiang; Mathiopoulos, P. Takis; Liu, Fangyu; Zhang, Liang; Li, Shuaipeng; Liu, Hao
2017-01-01
The effective and automated parsing of building facades from terrestrial laser scanning (TLS) point clouds of urban environments is an important research topic in the GIS and remote sensing fields. It is also challenging because of the complexity and great variety of the available 3D building facade layouts as well as the noise and missing data in the input TLS point clouds. In this paper, we introduce a novel methodology for the accurate and computationally efficient parsing of urban building facades from TLS point clouds. The main novelty of the proposed methodology is that it is a systematic and hierarchical approach that considers, in an adaptive way, the semantic and underlying structures of the urban facades for segmentation and subsequent accurate modeling. Firstly, the available input point cloud is decomposed into depth planes based on a data-driven method; this layer decomposition enables similarity detection in each depth plane layer. Secondly, the labeling of the facade elements is performed using the SVM classifier in combination with our proposed BieS-ScSPM algorithm. The labeling outcome is then augmented with weak architectural knowledge. Thirdly, least-squares fitted normalized gray accumulative curves are applied to detect regular structures, and a binarization dilation extraction algorithm is used to partition facade elements. A dynamic line-by-line division is further applied to extract the boundaries of the elements. The 3D geometrical facade models are then reconstructed by optimizing facade elements across depth plane layers. We have evaluated the performance of the proposed method using several TLS facade datasets. Qualitative and quantitative performance comparisons with several other state-of-the-art methods dealing with the same facade parsing problem have demonstrated its superiority in performance and its effectiveness in improving segmentation accuracy.
Visual Turing test for computer vision systems
Geman, Donald; Geman, Stuart; Hallonquist, Neil; Younes, Laurent
2015-01-01
Today, computer vision systems are tested by their accuracy in detecting and localizing instances of objects. As an alternative, and motivated by the ability of humans to provide far richer descriptions and even tell a story about an image, we construct a “visual Turing test”: an operator-assisted device that produces a stochastic sequence of binary questions from a given test image. The query engine proposes a question; the operator either provides the correct answer or rejects the question as ambiguous; the engine proposes the next question (“just-in-time truthing”). The test is then administered to the computer-vision system, one question at a time. After the system’s answer is recorded, the system is provided the correct answer and the next question. Parsing is trivial and deterministic; the system being tested requires no natural language processing. The query engine employs statistical constraints, learned from a training set, to produce questions with essentially unpredictable answers—the answer to a question, given the history of questions and their correct answers, is nearly equally likely to be positive or negative. In this sense, the test is only about vision. The system is designed to produce streams of questions that follow natural story lines, from the instantiation of a unique object, through an exploration of its properties, and on to its relationships with other uniquely instantiated objects. PMID:25755262
Extracting Loop Bounds for WCET Analysis Using the Instrumentation Point Graph
NASA Astrophysics Data System (ADS)
Betts, A.; Bernat, G.
2009-05-01
Every calculation engine proposed in the literature of Worst-Case Execution Time (WCET) analysis requires upper bounds on loop iterations. Existing mechanisms to procure this information are either error prone, because they are gathered from the end-user, or limited in scope, because automatic analyses target very specific loop structures. In this paper, we present a technique that obtains bounds completely automatically for arbitrary loop structures. In particular, we show how to employ the Instrumentation Point Graph (IPG) to parse traces of execution (generated by an instrumented program) in order to extract bounds relative to any loop-nesting level. With this technique, therefore, non-rectangular dependencies between loops can be captured, allowing more accurate WCET estimates to be calculated. We demonstrate the improvement in accuracy by comparing WCET estimates computed through our HMB framework against those computed with state-of-the-art techniques.
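Extracting loop bounds from traces reduces to counting iterations between loop-entry and loop-exit events; in the sketch below, the trace event format is a hypothetical instrumentation scheme, not the IPG itself:

```python
# Sketch of loop-bound extraction from an execution trace: count
# HEADER events per loop and keep the worst case observed at each EXIT.
from collections import defaultdict

def loop_bounds(trace):
    counts, bounds = defaultdict(int), defaultdict(int)
    for loop, event in trace:
        if event == "HEADER":                 # one more iteration observed
            counts[loop] += 1
        elif event == "EXIT":                 # record worst case, reset
            bounds[loop] = max(bounds[loop], counts[loop])
            counts[loop] = 0
    return dict(bounds)

trace = [("L1", "HEADER"), ("L1", "HEADER"), ("L1", "EXIT"),
         ("L1", "HEADER"), ("L1", "EXIT")]
print(loop_bounds(trace))   # {'L1': 2}
```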
Blurring the Inputs: A Natural Language Approach to Sensitivity Analysis
NASA Technical Reports Server (NTRS)
Kleb, William L.; Thompson, Richard A.; Johnston, Christopher O.
2007-01-01
To document model parameter uncertainties and to automate sensitivity analyses for numerical simulation codes, a natural-language-based method to specify tolerances has been developed. With this new method, uncertainties are expressed in a natural manner, i.e., as one would on an engineering drawing, namely, 5.25 +/- 0.01. This approach is robust and readily adapted to various application domains because it does not rely on parsing the particular structure of input file formats. Instead, tolerances of a standard format are added to existing fields within an input file. As a demonstration of the power of this simple, natural language approach, a Monte Carlo sensitivity analysis is performed for three disparate simulation codes: fluid dynamics (LAURA), radiation (HARA), and ablation (FIAT). Effort required to harness each code for sensitivity analysis was recorded to demonstrate the generality and flexibility of this new approach.
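A sketch of the core idea: find value +/- tolerance fields in an existing input line and sample them for one Monte Carlo realization (the input line is a made-up example, not LAURA/HARA/FIAT syntax):

```python
# Sketch of natural-language tolerance parsing: replace every
# "value +/- tol" field with a uniform sample from its interval.
import random, re

TOL = re.compile(r"(-?\d+\.?\d*)\s*\+/-\s*(\d+\.?\d*)")

def sample_line(line, rng):
    def repl(m):
        value, tol = float(m.group(1)), float(m.group(2))
        return str(rng.uniform(value - tol, value + tol))
    return TOL.sub(repl, line)

rng = random.Random(42)
print(sample_line("wall_temperature = 5.25 +/- 0.01", rng))
```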
GEOmetadb: powerful alternative search engine for the Gene Expression Omnibus
Zhu, Yuelin; Davis, Sean; Stephens, Robert; Meltzer, Paul S.; Chen, Yidong
2008-01-01
The NCBI Gene Expression Omnibus (GEO) represents the largest public repository of microarray data. However, finding data in GEO can be challenging. We have developed GEOmetadb in an attempt to make querying the GEO metadata both easier and more powerful. All GEO metadata records as well as the relationships between them are parsed and stored in a local MySQL database. A powerful, flexible web search interface with several convenient utilities provides query capabilities not available via NCBI tools. In addition, a Bioconductor package, GEOmetadb, that utilizes a SQLite export of the entire GEOmetadb database is also available, rendering the entire GEO database accessible with the full power of SQL-based queries from within R. Availability: The web interface and SQLite databases are available at http://gbnci.abcc.ncifcrf.gov/geo/. The Bioconductor package is available via the Bioconductor project. The corresponding MATLAB implementation is also available at the same website. Contact: yidong@mail.nih.gov PMID:18842599
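Querying a local GEOmetadb export might look like the following sketch; the table and column names (gse, title) follow the GEOmetadb documentation but should be treated as assumptions here:

```python
# Sketch: SQL query against a local GEOmetadb SQLite export.
import sqlite3

con = sqlite3.connect("GEOmetadb.sqlite")
rows = con.execute(
    "SELECT gse, title FROM gse WHERE title LIKE ? LIMIT 5",
    ("%breast cancer%",))
for gse, title in rows:
    print(gse, title)
con.close()
```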
Processing of ICARTT Data Files Using Fuzzy Matching and Parser Combinators
NASA Technical Reports Server (NTRS)
Rutherford, Matthew T.; Typanski, Nathan D.; Wang, Dali; Chen, Gao
2014-01-01
In this paper, the task of parsing and matching inconsistent, poorly formed text data through the use of parser combinators and fuzzy matching is discussed. An object-oriented implementation of the parser combinator technique is used to allow for a relatively simple interface for adapting base parsers. For matching tasks, a fuzzy matching algorithm with Levenshtein distance calculations is implemented to match string pairs that are otherwise difficult to match due to the aforementioned irregularities and errors in one or both pair members. Used in concert, the two techniques allow parsing and matching operations to be performed that had previously only been done manually.
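The Levenshtein distance at the heart of the fuzzy matching is a short dynamic program; the acceptance threshold for a match would be application-specific:

```python
# Two-row dynamic program for the Levenshtein (edit) distance.
def levenshtein(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

print(levenshtein("ICARTT", "ICART"))   # 1
```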
Colaert, Niklaas; Barsnes, Harald; Vaudel, Marc; Helsens, Kenny; Timmerman, Evy; Sickmann, Albert; Gevaert, Kris; Martens, Lennart
2011-08-05
The Thermo Proteome Discoverer program integrates both peptide identification and quantification into a single workflow for peptide-centric proteomics. Furthermore, its close integration with Thermo mass spectrometers has made it increasingly popular in the field. Here, we present a Java library to parse the msf files that constitute the output of Proteome Discoverer. The parser is also implemented as a graphical user interface allowing convenient access to the information found in the msf files, and in Rover, a program to analyze and validate quantitative proteomics information. All code, binaries, and documentation are freely available at http://thermo-msf-parser.googlecode.com.
ANTLR Tree Grammar Generator and Extensions
NASA Technical Reports Server (NTRS)
Craymer, Loring
2005-01-01
A computer program implements two extensions of ANTLR (Another Tool for Language Recognition), which is a set of software tools for translating source codes between different computing languages. ANTLR supports predicated- LL(k) lexer and parser grammars, a notation for annotating parser grammars to direct tree construction, and predicated tree grammars. [ LL(k) signifies left-right, leftmost derivation with k tokens of look-ahead, referring to certain characteristics of a grammar.] One of the extensions is a syntax for tree transformations. The other extension is the generation of tree grammars from annotated parser or input tree grammars. These extensions can simplify the process of generating source-to-source language translators and they make possible an approach, called "polyphase parsing," to translation between computing languages. The typical approach to translator development is to identify high-level semantic constructs such as "expressions," "declarations," and "definitions" as fundamental building blocks in the grammar specification used for language recognition. The polyphase approach is to lump ambiguous syntactic constructs during parsing and then disambiguate the alternatives in subsequent tree transformation passes. Polyphase parsing is believed to be useful for generating efficient recognizers for C++ and other languages that, like C++, have significant ambiguities.
Pragmatic precision oncology: the secondary uses of clinical tumor molecular profiling.
Rioth, Matthew J; Thota, Ramya; Staggs, David B; Johnson, Douglas B; Warner, Jeremy L
2016-07-01
Precision oncology increasingly utilizes molecular profiling of tumors to determine treatment decisions with targeted therapeutics. The molecular profiling data is valuable in the treatment of individual patients as well as for multiple secondary uses. To automatically parse, categorize, and aggregate clinical molecular profile data generated during cancer care, and to use this data to address multiple secondary use cases. A system to parse, categorize, and aggregate molecular profile data was created. A naïve Bayesian classifier categorized results according to clinical groups. The accuracy of these systems was validated against a published expertly-curated subset of molecular profiling data. Following one year of operation, 819 samples have been accurately parsed and categorized to generate a data repository of 10,620 genetic variants. The database has been used for operational, clinical trial, and discovery science research. A real-time database of molecular profiling data is a pragmatic solution to several knowledge management problems in the practice and science of precision oncology. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Acoustic facilitation of object movement detection during self-motion
Calabro, F. J.; Soto-Faraco, S.; Vaina, L. M.
2011-01-01
In humans, as well as most animal species, perception of object motion is critical to successful interaction with the surrounding environment. Yet, as the observer also moves, the retinal projections of the various motion components add to each other and extracting accurate object motion becomes computationally challenging. Recent psychophysical studies have demonstrated that observers use a flow-parsing mechanism to estimate and subtract self-motion from the optic flow field. We investigated whether concurrent acoustic cues for motion can facilitate visual flow parsing, thereby enhancing the detection of moving objects during simulated self-motion. Participants identified an object (the target) that moved either forward or backward within a visual scene containing nine identical textured objects simulating forward observer translation. We found that spatially co-localized, directionally congruent, moving auditory stimuli enhanced object motion detection. Interestingly, subjects who performed poorly on the visual-only task benefited more from the addition of moving auditory stimuli. When auditory stimuli were not co-localized to the visual target, improvements in detection rates were weak. Taken together, these results suggest that parsing object motion from self-motion-induced optic flow can operate on multisensory object representations. PMID:21307050
Attribute And-Or Grammar for Joint Parsing of Human Pose, Parts and Attributes.
Park, Seyoung; Nie, Xiaohan; Zhu, Song-Chun
2017-07-25
This paper presents an attribute and-or grammar (A-AOG) model for jointly inferring human body pose and human attributes in a parse graph, with attributes augmented to nodes in the hierarchical representation. In contrast to other popular methods in the current literature that train separate classifiers for poses and individual attributes, our method explicitly represents the decomposition and articulation of body parts and accounts for the correlations between poses and attributes. The A-AOG model is an amalgamation of three traditional grammar formulations: (i) phrase structure grammar, representing the hierarchical decomposition of the human body from whole to parts; (ii) dependency grammar, modeling the geometric articulation by a kinematic graph of the body pose; and (iii) attribute grammar, accounting for the compatibility relations between different parts in the hierarchy so that their appearances follow a consistent style. The parse graph outputs human detection, pose estimation, and attribute prediction simultaneously, which are intuitive and interpretable. We conduct experiments on two tasks using two datasets, and the experimental results demonstrate the advantage of joint modeling in comparison with computing poses and attributes independently. Furthermore, our model obtains better performance over existing methods for both pose estimation and attribute prediction tasks.
Hierarchical parsing and semantic navigation of full body CT data
NASA Astrophysics Data System (ADS)
Seifert, Sascha; Barbu, Adrian; Zhou, S. Kevin; Liu, David; Feulner, Johannes; Huber, Martin; Suehling, Michael; Cavallaro, Alexander; Comaniciu, Dorin
2009-02-01
Whole body CT scanning is a common diagnosis technique for discovering early signs of metastasis or for differential diagnosis. Automatic parsing and segmentation of multiple organs and semantic navigation inside the body can help the clinician in efficiently obtaining accurate diagnosis. However, dealing with the large amount of data of a full body scan is challenging and techniques are needed for the fast detection and segmentation of organs, e.g., heart, liver, kidneys, bladder, prostate, and spleen, and body landmarks, e.g., bronchial bifurcation, coccyx tip, sternum, lung tips. Solving the problem becomes even more challenging if partial body scans are used, where not all organs are present. We propose a new approach to this problem, in which a network of 1D and 3D landmarks is trained to quickly parse the 3D CT data and estimate which organs and landmarks are present as well as their most probable locations and boundaries. Using this approach, the segmentation of seven organs and detection of 19 body landmarks can be obtained in about 20 seconds with state-of-the-art accuracy and has been validated on 80 CT full or partial body scans.
Critical evaluation of reverse engineering tool Imagix 4D!
Yadav, Rashmi; Patel, Ravindra; Kothari, Abhay
2016-01-01
Legacy code is difficult to comprehend. Various commercial reengineering tools are available; each has a unique working style and comes with its own inherent capabilities and shortcomings. The focus of the available tools is on visualizing static behavior, not dynamic behavior, which makes the work of people engaged in software product maintenance, code understanding, and reengineering/reverse engineering difficult. Consequently, the need for a comprehensive reengineering/reverse engineering tool arises. We found Imagix 4D useful, as it generates the most pictorial representations, in the form of flow charts, flow graphs, class diagrams, metrics and, to a partial extent, dynamic visualizations. We evaluated Imagix 4D with the help of a case study involving a few samples of source code. The behavior of the tool was analyzed on multiple small codes and on a large code, the gcc C parser. The large-code evaluation was performed to uncover dead code, unstructured code, and the effect of not including required files at the preprocessing level. The utility of Imagix 4D in preparing decision density and complexity metrics for a large code was found helpful in gauging how much reengineering is required. At the outset, Imagix 4D showed limitations in dynamic visualization, flow chart separation (for large code), and parsing loops. The outcome of this evaluation should help in upgrading Imagix 4D, and it highlights the need for full-featured tools in the area of software reengineering/reverse engineering. It will also help the research community, especially those interested in building software reengineering tools.
The lived experience of doing the right thing: a parse method study.
Smith, Sandra Maxwell
2012-01-01
The purposes of this research were to discover the structure of the experience of doing the right thing and to contribute to nursing knowledge. The Parse research method was used in this study to answer the research question: What is the structure of the lived experience of doing the right thing? Participants were 10 individuals living in the community. The central finding of this study was the following structure: The lived experience of doing the right thing is steadfast uprightness amid adversity, as honorableness with significant affiliations emerges with contentment. New knowledge extended the theory of humanbecoming and enhanced understanding of the experience of doing the right thing.
The value of parsing as feature generation for gene mention recognition
Smith, Larry H; Wilbur, W John
2009-01-01
We measured the extent to which information surrounding a base noun phrase reflects the presence of a gene name, and evaluated seven different parsers in their ability to provide information for that purpose. Using the GENETAG corpus as a gold standard, we performed machine learning to recognize from its context when a base noun phrase contained a gene name. Starting with the best lexical features, we assessed the gain of adding dependency or dependency-like relations from a full sentence parse. Features derived from parsers improved performance in this partial gene mention recognition task by a small but statistically significant amount. There were virtually no differences between parsers in these experiments. PMID:19345281
Parsing Citations in Biomedical Articles Using Conditional Random Fields
Zhang, Qing; Cao, Yong-Gang; Yu, Hong
2011-01-01
Citations are used ubiquitously in biomedical full-text articles and play an important role for representing both the rhetorical structure and the semantic content of the articles. As a result, text mining systems will significantly benefit from a tool that automatically extracts the content of a citation. In this study, we applied the supervised machine-learning algorithms Conditional Random Fields (CRFs) to automatically parse a citation into its fields (e.g., Author, Title, Journal, and Year). With a subset of html format open-access PubMed Central articles, we report an overall 97.95% F1-score. The citation parser can be accessed at: http://www.cs.uwm.edu/~qing/projects/cithit/index.html. PMID:21419403
The lived experience of serenity: using Parse's research method.
Kruse, B G
1999-04-01
Parse's research method was used to investigate the meaning of serenity for survivors of a life-threatening illness or traumatic event. Ten survivors of cancer told their stories of the meaning of serenity as they had lived it in their lives. Descriptions were aided by photographs chosen by each participant to represent the meaning of serenity for them. The structure of serenity was generated through the extraction-synthesis process. Four main concepts--steering-yielding with the flow, savoring remembered visions of engaging surroundings, abiding with aloneness-togetherness, and attesting to a loving presence--emerged and led to a theoretical structure of serenity from the human becoming perspective. Findings confirm serenity as a multidimensional process.
Development of 3D browsing and interactive web system
NASA Astrophysics Data System (ADS)
Shi, Xiaonan; Fu, Jian; Jin, Chaolin
2017-09-01
Currently, users must download dedicated software or plug-ins to browse 3D models; such browsing can be unstable, and interaction with the model is not supported. To solve this problem, this paper presents a solution in which the model is parsed on the server side for interactive browsing: the user only needs to enter the system URL and upload a 3D model file to browse it. The server parses the 3D model in real time with fast interactive response. This follows a minimalist philosophy for the user and removes the obstacles that currently hinder 3D content development in the market.
Bradley, Anthony R; Rose, Alexander S; Pavelka, Antonín; Valasatava, Yana; Duarte, Jose M; Prlić, Andreas; Rose, Peter W
2017-06-01
Recent advances in experimental techniques have led to a rapid growth in complexity, size, and number of macromolecular structures that are made available through the Protein Data Bank. This creates a challenge for macromolecular visualization and analysis. Macromolecular structure files, such as PDB or PDBx/mmCIF files, can be slow to transfer and parse, and hard to incorporate into third-party software tools. Here, we present a new binary and compressed data representation, the MacroMolecular Transmission Format (MMTF), as well as software implementations in several languages that have been developed around it, which address these issues. We describe the new format and its APIs and demonstrate that it is several times faster to parse, and about a quarter of the file size of, the current standard format, PDBx/mmCIF. As a consequence of the new data representation, it is now possible to visualize structures with millions of atoms in a web browser, and to keep the whole PDB archive in memory or parse it within a few minutes on average computers, which opens up a new way of thinking about how to design and implement efficient algorithms in structural bioinformatics. The PDB archive is available in the MMTF file format through web services, with data updated on a weekly basis.
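For example, with the mmtf-python package a structure can be fetched and decoded directly (fetch and the attribute names are recalled from that package's documentation and should be treated as assumptions):

```python
# Sketch using the mmtf-python package; fetch() downloads and decodes
# one MMTF entry. Attribute names are assumptions from its docs.
from mmtf import fetch

structure = fetch("4HHB")                      # hemoglobin, as an example
print(structure.num_atoms, structure.num_models)
print(structure.x_coord_list[:5])              # flat coordinate arrays
```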
Deriving pathway maps from automated text analysis using a grammar-based approach.
Olsson, Björn; Gawronska, Barbara; Erlendsson, Björn
2006-04-01
We demonstrate how automated text analysis can be used to support the large-scale analysis of metabolic and regulatory pathways by deriving pathway maps from textual descriptions found in the scientific literature. The main assumption is that correct syntactic analysis combined with domain-specific heuristics provides a good basis for relation extraction. Our method uses an algorithm that searches through the syntactic trees produced by a parser based on a Referent Grammar formalism, identifies relations mentioned in the sentence, and classifies them with respect to their semantic class and epistemic status (facts, counterfactuals, hypotheses). The semantic categories used in the classification are based on the relation set used in KEGG (Kyoto Encyclopedia of Genes and Genomes), so that pathway maps using KEGG notation can be automatically generated. We present the current version of the relation extraction algorithm and an evaluation based on a corpus of abstracts obtained from PubMed. The results indicate that the method is able to combine a reasonable coverage with high accuracy. We found that 61% of all sentences were parsed, and 97% of the parse trees were judged to be correct. The extraction algorithm was tested on a sample of 300 parse trees and was found to produce correct extractions in 90.5% of the cases.
Automated vocabulary discovery for geo-parsing online epidemic intelligence.
Keller, Mikaela; Freifeld, Clark C; Brownstein, John S
2009-11-24
Automated surveillance of the Internet provides a timely and sensitive method for alerting on global emerging infectious disease threats. HealthMap is part of a new generation of online systems designed to monitor and visualize, on a real-time basis, disease outbreak alerts as reported by online news media and public health sources. HealthMap is of specific interest for national and international public health organizations and international travelers. A particular task that makes such surveillance useful is the automated discovery of the geographic references contained in the retrieved outbreak alerts; this task is sometimes referred to as "geo-parsing". A typical approach to geo-parsing would demand an expensive training corpus of alerts manually tagged by a human. Given that human readers perform this kind of task by using both their lexical and contextual knowledge, we developed an approach that relies on a relatively small expert-built gazetteer, thus limiting the need for human input, but focuses on learning the context in which geographic references appear. We show in a set of experiments that this approach exhibits a substantial capacity to discover geographic locations outside of its initial lexicon. The results of this analysis provide a framework for future automated global surveillance efforts that reduce manual input and improve the timeliness of reporting.
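A minimal sketch of gazetteer-plus-context geo-parsing; the seed gazetteer and trigger words are toy assumptions:

```python
# Sketch: a small seed gazetteer plus trigger words that suggest a
# capitalized token outside the lexicon is a place name.
GAZETTEER = {"Boston", "Nairobi", "Jakarta"}
TRIGGERS = {"in", "near", "outside", "from"}

def geo_parse(tokens):
    hits = []
    for i, tok in enumerate(tokens):
        if tok in GAZETTEER:
            hits.append((tok, "gazetteer"))
        elif tok.istitle() and i > 0 and tokens[i - 1].lower() in TRIGGERS:
            hits.append((tok, "context"))   # candidate outside the lexicon
    return hits

print(geo_parse("Outbreak reported in Bandung , far from Jakarta".split()))
```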
Tsuji, Shintarou; Nishimoto, Naoki; Ogasawara, Katsuhiko
2008-07-20
Although large medical texts are stored in electronic format, they are seldom reused because of the difficulty of processing narrative texts by computer. Morphological analysis is a key technology for extracting medical terms correctly and automatically; this process parses a sentence into its smallest units, morphemes. Phrases consisting of two or more technical terms, however, cause morphological analysis software to fail in parsing the sentence and to output unprocessed terms as "unknown words." The purpose of this study was to reduce the number of unknown words in medical narrative text processing. We compared the number of unknown words produced when parsing the text of the national examination for radiologists with and without additional dictionaries. The ratio of unknown words was reduced from 1.0% to 0.36% by adding terminologies of radiological technology, MeSH, and ICD-10 labels. The terminology of radiological technology was the most effective resource, accounting for a 0.62% reduction. This result clearly showed the necessity of careful dictionary selection and revealed trends in unknown words. The potential of this investigation is to make available a large body of clinical information that would otherwise be inaccessible for applications other than manual health care review by personnel.
Pavelka, Antonín; Valasatava, Yana; Prlić, Andreas
2017-01-01
Recent advances in experimental techniques have led to a rapid growth in complexity, size, and number of macromolecular structures that are made available through the Protein Data Bank. This creates a challenge for macromolecular visualization and analysis. Macromolecular structure files, such as PDB or PDBx/mmCIF files, can be slow to transfer and parse, and hard to incorporate into third-party software tools. Here, we present a new binary and compressed data representation, the MacroMolecular Transmission Format (MMTF), as well as software implementations in several languages that have been developed around it, which address these issues. We describe the new format and its APIs and demonstrate that it is several times faster to parse, and about a quarter of the file size of, the current standard format, PDBx/mmCIF. As a consequence of the new data representation, it is now possible to visualize structures with millions of atoms in a web browser, and to keep the whole PDB archive in memory or parse it within a few minutes on average computers, which opens up a new way of thinking about how to design and implement efficient algorithms in structural bioinformatics. The PDB archive is available in the MMTF file format through web services, with data updated on a weekly basis. PMID:28574982
SU-E-T-473: A Patient-Specific QC Paradigm Based On Trajectory Log Files and DICOM Plan Files
DOE Office of Scientific and Technical Information (OSTI.GOV)
DeMarco, J; McCloskey, S; Low, D
Purpose: To evaluate a remote QC tool for monitoring treatment machine parameters and treatment workflow. Methods: The Varian TrueBeam™ linear accelerator is a digital machine that records machine axis parameters and MLC leaf positions as a function of delivered monitor unit or control point. This information is saved to a binary trajectory log file for every treatment or imaging field in the patient treatment session. A MATLAB analysis routine was developed to parse the trajectory log files for a given patient, compare the expected versus actual machine and MLC positions, and perform a cross-comparison with the DICOM-RT plan file exported from the treatment planning system. The parsing routine sorts the trajectory log files based on the time and date stamp and generates a sequential report file listing treatment parameters and providing a match relative to the DICOM-RT plan file. Results: The trajectory log parsing routine was compared against a standard record-and-verify listing for patients undergoing initial IMRT dosimetry verification and weekly and final chart QC. The complete treatment course was independently verified for 10 patients of varying treatment site, and a total of 1267 treatment fields were evaluated, including pre-treatment imaging fields where applicable. In the context of IMRT plan verification, eight prostate SBRT plans with 4 arcs per plan were evaluated based on expected versus actual machine axis parameters. The average value for the maximum RMS MLC error was 0.067±0.001 mm and 0.066±0.002 mm for leaf banks A and B, respectively. Conclusion: A real-time QC analysis program was tested using trajectory log files and DICOM-RT plan files. The parsing routine is efficient and able to evaluate all relevant machine axis parameters during a patient treatment course, including MLC leaf positions and table positions at the time of image acquisition and during treatment.
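The expected-versus-actual comparison reduces to simple array arithmetic once the log is parsed; in this sketch the MLC position arrays are synthetic stand-ins for parser output:

```python
# Sketch: RMS MLC leaf error across control points, computed from
# expected/actual position arrays (control points x leaves, in mm).
# The arrays here are synthetic stand-ins for trajectory-log output.
import numpy as np

expected = np.random.default_rng(0).uniform(-50, 50, size=(120, 60))
actual = expected + np.random.default_rng(1).normal(0, 0.05, expected.shape)

rms_per_leaf = np.sqrt(np.mean((actual - expected) ** 2, axis=0))
print("max RMS leaf error: %.3f mm" % rms_per_leaf.max())
```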
Domain Adaption of Parsing for Operative Notes
Wang, Yan; Pakhomov, Serguei; Ryan, James O.; Melton, Genevieve B.
2016-01-01
Background: Full syntactic parsing of clinical text as a part of clinical natural language processing (NLP) is critical for a wide range of applications, such as identification of adverse drug reactions, patient cohort identification, and gene interaction extraction. Several robust syntactic parsers are publicly available to produce linguistic representations for sentences. However, these existing parsers are mostly trained on general English text and often require adaptation for optimal performance on clinical text. Our objective was to adapt an existing general English parser for the clinical text of operative reports via lexicon augmentation, statistics adjustment, and grammar rule modification based on a set of biomedical texts. Method: The lexicon of the Stanford unlexicalized probabilistic context-free grammar (PCFG) parser was expanded with the SPECIALIST lexicon, along with statistics collected from a limited set of operative notes tagged with two POS taggers (GENIA tagger and MedPost). The most frequently occurring verb entries of the SPECIALIST lexicon were adjusted based on manual review of verb usage in operative notes. Stanford parser grammar production rules were also modified based on linguistic features of operative reports. An analogous approach was then applied to the GENIA corpus to test the generalizability of this approach to biomedical text. Results: The new unlexicalized PCFG parser, extended with the extra lexicon from SPECIALIST along with accurate statistics collected from an operative note corpus tagged with the GENIA POS tagger, improved parser performance by 2.26%, from 87.64% to 89.90%. There was a progressive improvement with the addition of multiple approaches. Most of the improvement occurred with lexicon augmentation combined with statistics from the operative notes corpus. Application of this approach to the GENIA corpus showed that parsing performance was boosted by 3.81% with a simple new grammar and the addition of the GENIA corpus lexicon. Conclusion: Using statistics collected from clinical text tagged with POS taggers, along with proper modification of the grammars and lexicons of an unlexicalized PCFG parser, can improve parsing performance. PMID:25661593
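For readers unfamiliar with unlexicalized PCFG parsing, a toy NLTK example follows; the study modified the Stanford parser, and this grammar and its probabilities are invented purely for illustration.

```python
# A toy PCFG and Viterbi (most-probable-parse) decoder using NLTK.
import nltk

grammar = nltk.PCFG.fromstring("""
    S  -> NP VP      [1.0]
    NP -> 'incision' [0.5] | 'wound' [0.5]
    VP -> V NP       [1.0]
    V  -> 'closed'   [1.0]
""")
parser = nltk.ViterbiParser(grammar)
for tree in parser.parse(['incision', 'closed', 'wound']):
    print(tree)  # prints the highest-probability parse tree
```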
Condon, Barbara Backer
2013-01-01
Trying something new is a universal living experience of health. Although trying something new frequently occurs in healthcare, its meaning has never explicitly been studied. Parse's humanbecoming school of thought is the theoretical perspective for this study. The research question for this study is: What is the structure of the living experience of trying something new? The purpose of this study was to advance nursing science. Parse's qualitative phenomenological-hermeneutic research method was used to guide this study. Participants were 8 men and 2 women, ages 29 to 65, who utilized an outpatient mental health facility in the Midwest. Data were collected with dialogical engagement. The major finding of the study is the structure: trying something new is engaging in capricious exploitations with vacillating sentiments, as wistful contemplation surfaces with disparate affiliations.
Motion based parsing for video from observational psychology
NASA Astrophysics Data System (ADS)
Kokaram, Anil; Doyle, Erika; Lennon, Daire; Joyeux, Laurent; Fuller, Ray
2006-01-01
In psychology it is common to conduct studies involving the observation of humans undertaking some task. The sessions are typically recorded on video and used for subjective visual analysis. The subjective analysis is tedious and time consuming, not only because much useless video material is recorded but also because subjective measures of human behaviour are not necessarily repeatable. This paper presents tools using content-based video analysis that allow automated parsing of video from one such study involving dyslexia. The tools rely on implicit measures of human motion that can be generalised to other applications in the domain of human observation. Results comparing quantitative assessment of human motion with subjective assessment are also presented, illustrating that the system is a useful scientific tool.
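One common implicit motion measure is dense optical-flow magnitude. The OpenCV sketch below scores per-frame activity as a cue for segmenting the video; the filename and the idea of thresholding the score are placeholders, not the authors' pipeline.

```python
# Per-frame mean optical-flow magnitude as a simple motion-activity signal.
import cv2
import numpy as np

cap = cv2.VideoCapture("session.avi")
ok, frame = cap.read()
prev = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
activity = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(prev, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    activity.append(np.linalg.norm(flow, axis=2).mean())  # mean motion per frame
    prev = gray
# frames where activity exceeds a chosen threshold mark candidate events
```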
Introduction of statistical information in a syntactic analyzer for document image recognition
NASA Astrophysics Data System (ADS)
Maroneze, André O.; Coüasnon, Bertrand; Lemaitre, Aurélie
2011-01-01
This paper presents an improvement to document layout analysis systems, offering a possible solution to Sayre's paradox (which states that an element "must be recognized before it can be segmented; and it must be segmented before it can be recognized"). This improvement, based on stochastic parsing, allows integration of statistical information, obtained from recognizers, during syntactic layout analysis. We present how this fusion of numeric and symbolic information in a feedback loop can be applied to syntactic methods to improve document description expressiveness. To limit combinatorial explosion during exploration of solutions, we devised an operator that allows optional activation of the stochastic parsing mechanism. Our evaluation on 1250 handwritten business letters shows this method allows the improvement of global recognition scores.
GFFview: A Web Server for Parsing and Visualizing Annotation Information of Eukaryotic Genome.
Deng, Feilong; Chen, Shi-Yi; Wu, Zhou-Lin; Hu, Yongsong; Jia, Xianbo; Lai, Song-Jia
2017-10-01
Owing to the wide application of RNA sequencing (RNA-seq) technology, more and more eukaryotic genomes have been extensively annotated with features such as gene structure, alternative splicing, and noncoding loci. Genome annotation information is prevalently stored as plain text in General Feature Format (GFF), which can be hundreds or thousands of megabytes in size. Manipulating GFF files is therefore a challenge for biologists with no bioinformatics skills. In this study, we provide a web server (GFFview) for parsing the annotation information of eukaryotic genomes and generating a statistical description of six indices for visualization. GFFview is very useful for investigating the quality of, and differences between, de novo assembled transcriptomes in RNA-seq studies.
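Internally, a server like GFFview must tokenize the nine tab-separated GFF3 columns. A minimal Python sketch of that step follows (GFFview itself may be implemented differently; the filename is a placeholder).

```python
from collections import Counter

def parse_gff3(path):
    """Yield (seqid, type, start, end, strand, attributes) per feature line."""
    with open(path) as fh:
        for line in fh:
            if line.startswith("#") or not line.strip():
                continue
            seqid, source, ftype, start, end, score, strand, phase, attrs = \
                line.rstrip("\n").split("\t")
            attributes = dict(kv.split("=", 1)
                              for kv in attrs.split(";") if "=" in kv)
            yield seqid, ftype, int(start), int(end), strand, attributes

# e.g. one of the simplest summary indices: feature counts by type
print(Counter(f[1] for f in parse_gff3("genome.gff3")))
```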
A controlled trial of automated classification of negation from clinical notes
Elkin, Peter L; Brown, Steven H; Bauer, Brent A; Husser, Casey S; Carruth, William; Bergstrom, Larry R; Wahner-Roedler, Dietlind L
2005-01-01
Background: Identification of negation in electronic health records is essential if we are to understand the computable meaning of the records. Our objective was to compare the accuracy of an automated mechanism for assignment of negation to clinical concepts within a compositional expression against human-assigned negation, and to perform a failure analysis to identify the causes of poorly identified negation (i.e., missed conceptual representation, inaccurate conceptual representation, missed negation, inaccurate identification of negation). Methods: 41 clinical documents (medical evaluations; outside of Mayo these are sometimes referred to as history and physical examinations) were parsed using the Mayo Vocabulary Server Parsing Engine. SNOMED-CT™ was used to provide concept coverage for the clinical concepts in the record. These records resulted in identification of concepts and textual clues to negation. The records were reviewed by an independent medical terminologist, and the results were tallied in a spreadsheet. Where questions arose on review, Internal Medicine faculty were employed to make a final determination. Results: SNOMED-CT was used to provide concept coverage of the 14,792 concepts in 41 health records from Johns Hopkins University. Of these, 1,823 concepts were identified as negative by human review. The sensitivity (recall) of the assignment of negation was 97.2% (p < 0.001, Pearson chi-square test, compared to a coin flip). The specificity of assignment of negation was 98.8%. The positive likelihood ratio of the negation was 81. The positive predictive value (precision) was 91.2%. Conclusion: Automated assignment of negation to concepts identified in health records based on review of the text is feasible and practical. Lexical assignment of negation is a good test of true negativity, as judged by the high sensitivity, specificity, and positive likelihood ratio of the test. SNOMED-CT had overall coverage of 88.7% of the concepts being negated. PMID:15876352
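The study's engine assigns negation lexically from textual clues. A toy, regex-based approximation in the spirit of NegEx is sketched below; the trigger list and the two-word window are invented for illustration and are not the Mayo engine's actual rules (real systems also use scope terminators such as "but").

```python
import re

NEG_TRIGGERS = r"\b(no|denies|without|negative for|absence of)\b"

def negated_concepts(sentence, concepts):
    """Mark a concept negated if a trigger occurs within two words before it."""
    result = {}
    for c in concepts:
        m = re.search(NEG_TRIGGERS + r"(?:\s+\w+){0,2}?\s+" + re.escape(c),
                      sentence, re.IGNORECASE)
        result[c] = bool(m)
    return result

print(negated_concepts("Patient denies chest pain but reports dyspnea.",
                       ["chest pain", "dyspnea"]))
# {'chest pain': True, 'dyspnea': False}
```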
An Object-Relational Ifc Storage Model Based on Oracle Database
NASA Astrophysics Data System (ADS)
Li, Hang; Liu, Hua; Liu, Yong; Wang, Yuan
2016-06-01
As building models become increasingly complicated, the level of collaboration across professionals attracts more attention in the architecture, engineering and construction (AEC) industry. To adapt to this change, buildingSMART developed the Industry Foundation Classes (IFC) to facilitate interoperability between software platforms. However, IFC data are currently shared in the form of text files, which has significant drawbacks. In this paper, considering the object-based inheritance hierarchy of IFC and the storage features of different database management systems (DBMS), we propose a novel object-relational storage model that uses an Oracle database to store IFC data. First, we establish the mapping rules between data types in the IFC specification and the Oracle database. Second, we design the IFC database according to the relationships among IFC entities. Third, we parse the IFC file and extract the IFC data. Lastly, we store the IFC data into the corresponding tables of the IFC database. In our experiments, three different building models were selected to demonstrate the effectiveness of the storage model, and the comparison of experimental statistics shows that IFC data are lossless during data exchange.
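A simplified sketch of the parse-and-store step follows, with SQLite standing in for Oracle; real IFC (STEP physical file) parsing must also handle multi-line records and typed attributes, and the filenames here are placeholders.

```python
# Extract "#id = IFCTYPE(...);" records and store them in a relational table.
import re
import sqlite3

line_re = re.compile(r"#(\d+)\s*=\s*(IFC\w+)\s*\((.*)\);")

db = sqlite3.connect("ifc.db")
db.execute("CREATE TABLE IF NOT EXISTS entity (id INTEGER, type TEXT, attrs TEXT)")
with open("building.ifc") as f:
    for line in f:
        m = line_re.match(line.strip())
        if m:
            db.execute("INSERT INTO entity VALUES (?, ?, ?)",
                       (int(m.group(1)), m.group(2), m.group(3)))
db.commit()
```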
Metacoder: An R package for visualization and manipulation of community taxonomic diversity data.
Foster, Zachary S L; Sharpton, Thomas J; Grünwald, Niklaus J
2017-02-01
Community-level data, the type generated by an increasing number of metabarcoding studies, is often graphed as stacked bar charts or pie graphs that use color to represent taxa. These graph types do not convey the hierarchical structure of taxonomic classifications and are limited by the use of color for categories. As an alternative, we developed metacoder, an R package for easily parsing, manipulating, and graphing publication-ready plots of hierarchical data. Metacoder includes a dynamic and flexible function that can parse most text-based formats that contain taxonomic classifications, taxon names, taxon identifiers, or sequence identifiers. Metacoder can then subset, sample, and order this parsed data using a set of intuitive functions that take into account the hierarchical nature of the data. Finally, an extremely flexible plotting function enables quantitative representation of up to 4 arbitrary statistics simultaneously in a tree format by mapping statistics to the color and size of tree nodes and edges. Metacoder also allows exploration of barcode primer bias by integrating functions to run digital PCR. Although it has been designed for data from metabarcoding research, metacoder can easily be applied to any data that has a hierarchical component such as gene ontology or geographic location data. Our package complements currently available tools for community analysis and is provided open source with an extensive online user manual. PMID:28222096
Parsing partial molar volumes of small molecules: a molecular dynamics study.
Patel, Nisha; Dubins, David N; Pomès, Régis; Chalikian, Tigran V
2011-04-28
We used molecular dynamics (MD) simulations in conjunction with the Kirkwood-Buff theory to compute the partial molar volumes for a number of small solutes of various chemical natures. We repeated our computations using modified pair potentials, first, in the absence of the Coulombic term and, second, in the absence of the Coulombic and the attractive Lennard-Jones terms. Comparison of our results with experimental data and the volumetric results of Monte Carlo simulation with hard sphere potentials and scaled particle theory-based computations led us to conclude that, for small solutes, the partial molar volume computed with the Lennard-Jones potential in the absence of the Coulombic term nearly coincides with the cavity volume. On the other hand, MD simulations carried out with the pair interaction potentials containing only the repulsive Lennard-Jones term produce unrealistically large partial molar volumes of solutes that are close to their excluded volumes. Our simulation results are in good agreement with the reported schemes for parsing partial molar volume data on small solutes. In particular, our determined interaction volumes and the thickness of the thermal volume for individual compounds are in good agreement with empirical estimates. This work is the first computational study that supports and lends credence to the practical algorithms of parsing partial molar volume data that are currently in use for molecular interpretations of volumetric data.
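For context, the Kirkwood-Buff route from simulation output to the partial molar volume of a solute s at infinite dilution in a solvent w is commonly written as follows (a standard form; notation and sign conventions vary across authors):

```latex
\bar{V}_s^{\infty} = k_B T \, \kappa_T^{0} - G_{sw}^{\infty},
\qquad
G_{sw} = \int_0^{\infty} \left[ g_{sw}(r) - 1 \right] 4\pi r^{2} \, dr
```

Here \kappa_T^{0} is the isothermal compressibility of the pure solvent and g_{sw}(r) is the solute-solvent radial distribution function; dropping the Coulombic or attractive terms from the pair potential changes g_{sw}(r) and hence the computed volume, which is the manipulation the study exploits.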
Doelling, Keith; Arnal, Luc; Ghitza, Oded; Poeppel, David
2013-01-01
A growing body of research suggests that intrinsic neuronal slow (< 10 Hz) oscillations in auditory cortex appear to track incoming speech and other spectro-temporally complex auditory signals. Within this framework, several recent studies have identified critical-band temporal envelopes as the specific acoustic feature being reflected by the phase of these oscillations. However, how this alignment between speech acoustics and neural oscillations might underpin intelligibility is unclear. Here we test the hypothesis that the ‘sharpness’ of temporal fluctuations in the critical band envelope acts as a temporal cue to speech syllabic rate, driving delta-theta rhythms to track the stimulus and facilitate intelligibility. We interpret our findings as evidence that sharp events in the stimulus cause cortical rhythms to re-align and parse the stimulus into syllable-sized chunks for further decoding. Using magnetoencephalographic recordings, we show that by removing temporal fluctuations that occur at the syllabic rate, envelope-tracking activity is reduced. By artificially reinstating these temporal fluctuations, envelope-tracking activity is regained. These changes in tracking correlate with intelligibility of the stimulus. Together, the results suggest that the sharpness of fluctuations in the stimulus, as reflected in the cochlear output, drive oscillatory activity to track and entrain to the stimulus, at its syllabic rate. This process likely facilitates parsing of the stimulus into meaningful chunks appropriate for subsequent decoding, enhancing perception and intelligibility. PMID:23791839
DOE Office of Scientific and Technical Information (OSTI.GOV)
Goodall, John; Iannacone, Mike; Athalye, Anish
2013-08-01
Morph is a framework and domain-specific language (DSL) that helps parse and transform structured documents. It currently supports several file formats including XML, JSON, and CSV, and custom formats are usable as well.
Toward a theory of distributed word expert natural language parsing
NASA Technical Reports Server (NTRS)
Rieger, C.; Small, S.
1981-01-01
An approach to natural language meaning-based parsing in which the unit of linguistic knowledge is the word rather than the rewrite rule is described. In the word expert parser, knowledge about language is distributed across a population of procedural experts, each representing a word of the language, and each an expert at diagnosing that word's intended usage in context. The parser is structured around a coroutine control environment in which the generator-like word experts ask questions and exchange information in coming to collective agreement on sentence meaning. The word expert theory is advanced as a better cognitive model of human language expertise than the traditional rule-based approach. The technical discussion is organized around examples taken from the prototype LISP system which implements parts of the theory.
Parsing clinical text: how good are the state-of-the-art parsers?
Jiang, Min; Huang, Yang; Fan, Jung-wei; Tang, Buzhou; Denny, Josh; Xu, Hua
2015-01-01
Parsing, which generates the syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain, including medicine. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there are no formal evaluations and comparisons of their performance in the medical domain. In this study, we investigated the performance of three state-of-the-art parsers: the Stanford parser, the Bikel parser, and the Charniak parser, using the following two datasets: (1) a Treebank containing 1,100 sentences that were randomly selected from progress notes used in the 2010 i2b2 NLP challenge and manually annotated according to a Penn Treebank based guideline; and (2) the MiPACQ Treebank, developed from pathology notes and clinical notes and containing 13,091 sentences. We conducted three experiments on both datasets. First, we measured the performance of the three state-of-the-art parsers on the clinical Treebanks with their default settings. Then we re-trained the parsers using the clinical Treebanks and evaluated their performance using 10-fold cross validation. Finally, we re-trained the parsers by combining the clinical Treebanks with the Penn Treebank. Our results showed that the original parsers achieved lower performance on clinical text (bracketing F-measure in the range of 66.6%-70.3%) compared to general English text. After retraining on the clinical Treebanks, all parsers achieved better performance, with the best performance from the Stanford parser, which reached the highest bracketing F-measure of 73.68% on progress notes and 83.72% on the MiPACQ corpus using 10-fold cross validation. When the combined clinical Treebanks and Penn Treebank were used, the Charniak parser achieved the highest bracketing F-measure of 73.53% on progress notes and the Stanford parser reached the highest F-measure of 84.15% on the MiPACQ corpus. Our study demonstrates that re-training using clinical Treebanks is critical for improving general English parsers' performance on clinical text, and that combining clinical and open domain corpora might achieve optimal performance for parsing clinical text. PMID:26045009
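The bracketing F-measure reported here is the PARSEVAL-style harmonic mean of bracketing precision and recall over constituent spans. A simplified, set-based Python sketch follows (real scorers such as evalb use multisets and additional normalizations); the example spans are invented.

```python
def bracketing_f(gold, pred):
    """F-measure over labeled constituent spans (label, start, end)."""
    gold, pred = set(gold), set(pred)
    if not gold or not pred:
        return 0.0
    match = len(gold & pred)
    p = match / len(pred)   # bracketing precision
    r = match / len(gold)   # bracketing recall
    return 2 * p * r / (p + r) if p + r else 0.0

gold = {("NP", 0, 2), ("VP", 2, 5), ("S", 0, 5)}
pred = {("NP", 0, 2), ("VP", 3, 5), ("S", 0, 5)}
print(round(bracketing_f(gold, pred), 3))  # 0.667
```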
NASA Astrophysics Data System (ADS)
Sardina, V.
2012-12-01
The US Tsunami Warning Centers (TWCs) have traditionally generated their tsunami message products primarily as blocks of text tagged with headers that identify them on each particular communications (comms) circuit. Each warning center has a primary area of responsibility (AOR) within which it has an authoritative role regarding parameters such as earthquake location and magnitude. This means that when a major tsunamigenic event occurs, the other warning centers need to quickly access the earthquake parameters issued by the authoritative warning center before issuing their own message products for customers in their own AOR. Thus, within the operational context of the TWCs, the scientists on duty need to access the information contained in the message products issued by other warning centers as quickly as possible. As a solution to this operational problem, we designed and implemented a C++ software package for scanning and parsing the entire suite of tsunami message products issued by the Pacific Tsunami Warning Center (PTWC), the West Coast and Alaska Tsunami Warning Center (WCATWC), and the Japan Meteorological Agency (JMA). The scanning and parsing classes composing the resulting C++ software package allow parsing both the non-official message products (observatory messages) routinely issued by the TWCs and all official tsunami message products such as tsunami advisories, watches, and warnings. This software package currently allows scientists on duty at the PTWC to automatically retrieve the parameters contained in tsunami messages issued by WCATWC, JMA, or PTWC itself. Extending the capabilities of the classes composing the package would make it possible to generate XML- and CAP-compliant versions of the TWCs' message products until new messaging software natively adds these capabilities. Customers who receive the TWCs' tsunami message products could also use the package to automatically retrieve information from messages sent via any text-based communications circuit currently used by the TWCs to disseminate their tsunami message products.
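The original implementation is C++; as a hedged illustration in Python, a scanner might pull parameters out of a bulletin with regular expressions. The block layout below is hypothetical, not the actual PTWC/WCATWC/JMA product format.

```python
import re

msg = """TSUNAMI BULLETIN NUMBER 001
ORIGIN TIME - 0146Z 11 MAR 2011
COORDINATES - 38.3 NORTH 142.4 EAST
MAGNITUDE - 8.9"""

params = {
    "origin_time": re.search(r"ORIGIN TIME\s*-\s*(.+)", msg).group(1),
    "lat": float(re.search(r"COORDINATES\s*-\s*([\d.]+)", msg).group(1)),
    "magnitude": float(re.search(r"MAGNITUDE\s*-\s*([\d.]+)", msg).group(1)),
}
print(params)  # parameters another center would need before issuing its own product
```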
Seabed mapping and characterization of sediment variability using the usSEABED data base
Goff, J.A.; Jenkins, C.J.; Williams, S. Jeffress
2008-01-01
We present a methodology for statistical analysis of randomly located marine sediment point data, and apply it to the US continental shelf portions of usSEABED mean grain size records. The usSEABED database, like many modern, large environmental datasets, is heterogeneous and interdisciplinary. We statistically test the database as a source of mean grain size data, and from it provide a first examination of regional seafloor sediment variability across the entire US continental shelf. Data derived from laboratory analyses ("extracted") and from word-based descriptions ("parsed") are treated separately, and they are compared statistically and deterministically. Data records are selected for spatial analysis by their location within sample regions: polygonal areas defined in ArcGIS chosen by geography, water depth, and data sufficiency. We derive isotropic, binned semivariograms from the data, and invert these for estimates of noise variance, field variance, and decorrelation distance. The highly erratic nature of the semivariograms is a result both of the random locations of the data and of the high level of data uncertainty (noise). This decorrelates the data covariance matrix for the inversion, and largely prevents robust estimation of the fractal dimension. Our comparison of the extracted and parsed mean grain size data demonstrates important differences between the two. In particular, extracted measurements generally produce finer mean grain sizes, lower noise variance, and lower field variance than parsed values. Such relationships can be used to derive a regionally dependent conversion factor between the two. Our analysis of sample regions on the US continental shelf revealed considerable geographic variability in the estimated statistical parameters of field variance and decorrelation distance. Some regional relationships are evident, and overall there is a tendency for field variance to be higher where the average mean grain size is finer grained. Surprisingly, parsed and extracted noise magnitudes correlate with each other, which may indicate that some portion of the data variability that we identify as "noise" is caused by real grain size variability at very short scales. Our analyses demonstrate that by applying a bias-correction proxy, usSEABED data can be used to generate reliable interpolated maps of regional mean grain size and sediment character.
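The binned semivariograms inverted in such an analysis take the standard empirical form (the paper's exact parameterization may differ):

```latex
\hat{\gamma}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ z(x_i + h) - z(x_i) \right]^{2}
```

where N(h) is the number of data pairs separated by lag h. In a fitted model, the nugget corresponds to the noise variance, the sill in excess of the nugget to the field variance, and the range to the decorrelation distance, matching the three parameters estimated above.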
DOE Office of Scientific and Technical Information (OSTI.GOV)
Englehardt, Robert; Steele, Andrew
Arcus, developed by Sandia National Laboratories, is a library for calculating, parsing, formatting, converting and comparing both IPv4 and IPv6 addresses and subnets. It accounts for 128-bit numbers on 32-bit platforms.
Integration of Dakota into the NEAMS Workbench
DOE Office of Scientific and Technical Information (OSTI.GOV)
Swiler, Laura Painton; Lefebvre, Robert A.; Langley, Brandon R.
2017-07-01
This report summarizes a NEAMS (Nuclear Energy Advanced Modeling and Simulation) project focused on integrating Dakota into the NEAMS Workbench. The NEAMS Workbench, developed at Oak Ridge National Laboratory, is a new software framework that provides a graphical user interface, input file creation, parsing, validation, job execution, workflow management, and output processing for a variety of nuclear codes. Dakota is a tool developed at Sandia National Laboratories that provides a suite of uncertainty quantification and optimization algorithms. Providing Dakota within the NEAMS Workbench allows users of nuclear simulation codes to perform uncertainty and optimization studies on their nuclear codes from within a common, integrated environment. Details of the integration and parsing are provided, along with an example of Dakota running a sampling study on the fuels performance code, BISON, from within the NEAMS Workbench.
Drury, John E; Baum, Shari R; Valeriote, Hope; Steinhauer, Karsten
2016-01-01
This study presents the first two ERP reading studies of comma-induced effects of covert (implicit) prosody on syntactic parsing decisions in English. The first experiment used a balanced 2 × 2 design in which the presence/absence of commas determined plausibility (e.g., John, said Mary, was the nicest boy at the party vs. John said Mary was the nicest boy at the party). The second reading experiment replicated a previous auditory study investigating the role of overt prosodic boundaries in closure ambiguities (Pauker et al., 2011). In both experiments, commas reliably elicited CPS components and generally played a dominant role in determining parsing decisions in the face of input ambiguity. The combined set of findings provides further evidence supporting the claim that mechanisms subserving speech processing play an active role during silent reading. PMID:27695428
Modeling the Arden Syntax for medical decisions in XML.
Kim, Sukil; Haug, Peter J; Rocha, Roberto A; Choi, Inyoung
2008-10-01
A new model expressing the Arden Syntax in the eXtensible Markup Language (XML) was developed to increase its portability. Every example was manually parsed and reviewed until the schema and the style sheet were considered optimized. When the first schema was finished, several MLMs in the Arden Syntax Markup Language (ArdenML) were validated against the schema. They were then transformed to HTML with the style sheet, during which they were compared to the original text versions of their own MLMs. When faults were found in a transformed MLM, the schema and/or style sheet was fixed. This cycle continued until all the examples were encoded as XML documents. The original MLMs were encoded in XML according to the proposed XML schema, and reverse-parsed MLMs in ArdenML were checked using a public-domain Arden Syntax checker. Two hundred seventy-seven example MLMs were successfully transformed into XML documents using the model, and the reverse parse yielded the original text versions of the MLMs. Two hundred sixty-five of the 277 MLMs showed the same error patterns before and after transformation, and all 11 errors related to statement structure were resolved in the XML version. The model uses two syntax-checking mechanisms: first, an XML validation process, and second, a syntax check using an XSL style sheet. Now that we have a schema for ArdenML, we can also begin the development of style sheets for transforming ArdenML into other languages.
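The two checking mechanisms can be sketched with lxml; the paper does not name its tooling, and the schema and stylesheet filenames below are placeholders.

```python
from lxml import etree

schema = etree.XMLSchema(etree.parse("ardenml.xsd"))
mlm = etree.parse("example_mlm.xml")
schema.assertValid(mlm)           # mechanism 1: validation against the XSD

to_text = etree.XSLT(etree.parse("ardenml_to_text.xsl"))
print(str(to_text(mlm)))          # mechanism 2: reverse-parse to Arden text,
                                  # which an Arden Syntax checker can verify
```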
Incremental Refinement of Façade Models with Attribute Grammar from 3D Point Clouds
NASA Astrophysics Data System (ADS)
Dehbi, Y.; Staat, C.; Mandtler, L.; Plümer, L.
2016-06-01
Data acquisition using unmanned aerial vehicles (UAVs) has received increasing attention over the last years. Especially in the field of building reconstruction, the incremental interpretation of such data is a demanding task. In this context, formal grammars play an important role for the top-down identification and reconstruction of building objects. Up to now, the available approaches have expected offline data in order to parse an a priori known grammar. For mapping on demand, an on-the-fly reconstruction based on UAV data is required, so an incremental interpretation of the data stream is inevitable. This paper presents an incremental parser of grammar rules for automatic 3D building reconstruction. The parser enables model refinement based on new observations with respect to a weighted attribute context-free grammar (WACFG). The falsification or rejection of hypotheses is supported as well. The parser can deal with and adapt available parse trees acquired from previous interpretations or predictions. Parse trees derived so far are updated in an iterative way using transformation rules. A diagnostic step searches for mismatches between current and new nodes. Prior knowledge on façades is incorporated; it is given by probability densities as well as architectural patterns. Since we cannot always assume normal distributions, the derivation of location and shape parameters of building objects is based on kernel density estimation (KDE). While the level of detail is continuously improved, geometric, semantic, and topological consistency is ensured.
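Where the paper relies on KDE for non-Gaussian parameter densities, the idea can be sketched with SciPy; the parameter choice and sample values below are invented, and the authors' estimator and kernels may differ.

```python
# Estimate a density for a facade parameter (e.g., window width) from
# observed samples, then score new hypotheses against it.
import numpy as np
from scipy.stats import gaussian_kde

widths = np.array([0.9, 1.0, 1.0, 1.1, 1.4, 1.5, 1.5, 1.6])  # bimodal sample
kde = gaussian_kde(widths)
print(kde(np.array([1.05, 1.55, 2.5])))  # likelihoods for hypothesized widths
```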
Performance of Lempel-Ziv compressors with deferred innovation
NASA Technical Reports Server (NTRS)
Cohn, Martin
1989-01-01
The noiseless data-compression algorithms introduced by Lempel and Ziv (LZ) parse an input data string into successive substrings, each consisting of two parts: the citation, which is the longest prefix that has appeared earlier in the input, and the innovation, which is the symbol immediately following the citation. In extremal versions of the LZ algorithm the citation may have begun anywhere in the input; in incremental versions it must have begun at a previous parse position. Originally, the citation and the innovation were encoded, either individually or jointly, into an output word to be transmitted or stored. Subsequently, it was speculated that the cost of this encoding may be excessively high because the innovation contributes roughly lg(A) bits, where A is the size of the input alphabet, regardless of the compressibility of the source. To remedy this excess, it was suggested to store the parsed substring as usual, but to encode for output only the citation, leaving the innovation to be encoded as the first symbol of the next substring. Being thus included in the next substring, the innovation can participate in whatever compression that substring enjoys. This strategy is called deferred innovation. It is exemplified in the algorithm described by Welch and implemented in the C program compress, which has widely displaced adaptive Huffman coding (compact) as a UNIX system utility. The excessive expansion is explained, and an implicit warning is given against using deferred-innovation compressors on nearly incompressible data.
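For concreteness, here is a minimal incremental (LZ78-style) parse that emits the (citation index, innovation) pairs described above; under deferred innovation the encoder would output only the index and begin the next substring at the innovation symbol.

```python
def lz_parse(s):
    """Incremental LZ parse into (citation_index, innovation) pairs.
    Index 0 denotes the empty citation."""
    dictionary = {"": 0}
    pairs, prefix = [], ""
    for ch in s:
        if prefix + ch in dictionary:
            prefix += ch                      # extend the citation
        else:
            pairs.append((dictionary[prefix], ch))
            dictionary[prefix + ch] = len(dictionary)
            prefix = ""
    if prefix:                                # trailing citation, no innovation
        pairs.append((dictionary[prefix], None))
    return pairs

print(lz_parse("aababcabcd"))  # [(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd')]
```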
Statistical learning of movement.
Ongchoco, Joan Danielle Khonghun; Uddenberg, Stefan; Chun, Marvin M
2016-12-01
The environment is dynamic, but objects move in predictable and characteristic ways, whether they are a dancer in motion, or a bee buzzing around in flight. Sequences of movement are comprised of simpler motion trajectory elements chained together. But how do we know where one trajectory element ends and another begins, much like we parse words from continuous streams of speech? As a novel test of statistical learning, we explored the ability to parse continuous movement sequences into simpler element trajectories. Across four experiments, we showed that people can robustly parse such sequences from a continuous stream of trajectories under increasingly stringent tests of segmentation ability and statistical learning. Observers viewed a single dot as it moved along simple sequences of paths, and were later able to discriminate these sequences from novel and partial ones shown at test. Observers demonstrated this ability when there were potentially helpful trajectory-segmentation cues such as a common origin for all movements (Experiment 1); when the dot's motions were entirely continuous and unconstrained (Experiment 2); when sequences were tested against partial sequences as a more stringent test of statistical learning (Experiment 3); and finally, even when the element trajectories were in fact pairs of trajectories, so that abrupt directional changes in the dot's motion could no longer signal inter-trajectory boundaries (Experiment 4). These results suggest that observers can automatically extract regularities in movement - an ability that may underpin our capacity to learn more complex biological motions, as in sport or dance.
Parsing interindividual drug variability: an emerging role for systems pharmacology
Turner, Richard M; Park, B Kevin; Pirmohamed, Munir
2015-01-01
There is notable interindividual heterogeneity in drug response, affecting both drug efficacy and toxicity, resulting in patient harm and the inefficient utilization of limited healthcare resources. Pharmacogenomics is at the forefront of research to understand interindividual drug response variability, but although many genotype-drug response associations have been identified, translation of pharmacogenomic associations into clinical practice has been hampered by inconsistent findings and inadequate predictive values. These limitations are in part due to the complex interplay between drug-specific, human body and environmental factors influencing drug response and therefore pharmacogenomics, whilst intrinsically necessary, is by itself unlikely to adequately parse drug variability. The emergent, interdisciplinary and rapidly developing field of systems pharmacology, which incorporates but goes beyond pharmacogenomics, holds significant potential to further parse interindividual drug variability. Systems pharmacology broadly encompasses two distinct research efforts, pharmacologically-orientated systems biology and pharmacometrics. Pharmacologically-orientated systems biology utilizes high throughput omics technologies, including next-generation sequencing, transcriptomics and proteomics, to identify factors associated with differential drug response within the different levels of biological organization in the hierarchical human body. Increasingly complex pharmacometric models are being developed that quantitatively integrate factors associated with drug response. Although distinct, these research areas complement one another and continual development can be facilitated by iterating between dynamic experimental and computational findings. Ultimately, quantitative data-derived models of sufficient detail will be required to help realize the goal of precision medicine. WIREs Syst Biol Med 2015, 7:221–241. doi: 10.1002/wsbm.1302 PMID:25950758
Raja, Kalpana; Natarajan, Jeyakumar
2018-07-01
Extraction of protein phosphorylation information from the biomedical literature has gained much attention because of its importance in numerous biological processes. In this study, we propose a text mining methodology consisting of two phases, NLP parsing and SVM classification, to extract phosphorylation information from the literature. First, using NLP parsing, we divide the data into three base forms depending on the biomedical entities related to phosphorylation, and further classify them into ten sub-forms based on their distribution with respect to the phosphorylation keyword. Next, we extract the phosphorylation entity singles/pairs/triplets and apply an SVM to classify the extracted singles/pairs/triplets using a set of features applicable to each sub-form. The performance of our methodology was evaluated on three corpora, namely the PLC, iProLink, and hPP corpora. We obtained promising results of >85% F-score on the ten sub-forms of the training datasets under cross-validation. Our system achieved overall F-scores of 93.0% on the iProLink and 96.3% on the hPP corpus test datasets. Furthermore, our proposed system achieved the best performance on cross-corpus evaluation and outperformed the existing system with a recall of 90.1%. The performance analysis of our system on three corpora reveals that it extracts protein phosphorylation information efficiently both from non-organism-specific general datasets such as PLC and iProLink, and from a human-specific dataset such as the hPP corpus. Copyright © 2018 Elsevier B.V. All rights reserved.
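As a generic illustration of the SVM phase (not the authors' feature set), scikit-learn can classify candidate snippets with TF-IDF features; the example snippets and labels below are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

snippets = [
    "CK2 phosphorylates p53 at Ser392 in vitro",
    "Akt-mediated phosphorylation of BAD on Ser136",
    "p53 binds DNA through its core domain",       # no phosphorylation event
    "BAD interacts with Bcl-xL at the membrane",   # no phosphorylation event
]
labels = [1, 1, 0, 0]  # 1 = sentence describes a phosphorylation event

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(snippets, labels)
print(clf.predict(["GSK3 phosphorylates tau at Thr231"]))
```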
A U-Shaped Relative Clause Attachment Preference in Japanese.
ERIC Educational Resources Information Center
Miyamoto, Edson T.; Gibson, Edward; Pearlmutter, Neal J.; Aikawa, Takako; Miyagawa, Shigeru
1999-01-01
Presents results from a self-paced reading experiment in Japanese investigating attachment preferences for relative clauses to three ensuing potential nominal heads. Results are discussed in light of two types of parsing models. (Author/VWL)
Peripheral Visual Cues Contribute to the Perception of Object Movement During Self-Movement
Rogers, Cassandra; Warren, Paul A.
2017-01-01
Safe movement through the environment requires us to monitor our surroundings for moving objects or people. However, identification of moving objects in the scene is complicated by self-movement, which adds motion across the retina. To identify world-relative object movement, the brain thus has to ‘compensate for’ or ‘parse out’ the components of retinal motion that are due to self-movement. We have previously demonstrated that retinal cues arising from central vision contribute to solving this problem. Here, we investigate the contribution of peripheral vision, commonly thought to provide strong cues to self-movement. Stationary participants viewed a large field of view display, with radial flow patterns presented in the periphery, and judged the trajectory of a centrally presented probe. Across two experiments, we demonstrate and quantify the contribution of peripheral optic flow to flow parsing during forward and backward movement. PMID:29201335
RNA-Seq-Based Transcript Structure Analysis with TrBorderExt.
Wang, Yejun; Sun, Ming-An; White, Aaron P
2018-01-01
RNA-Seq has become a routine strategy for genome-wide gene expression comparisons in bacteria. Despite lower resolution in transcript border parsing compared with dRNA-Seq, TSS-EMOTE, Cappable-seq, Term-seq, and others, directional RNA-Seq still offers clear advantages: low cost, quantification, and transcript border analysis at medium resolution (±10-20 nt). To facilitate mining of directional RNA-Seq datasets, especially with respect to transcript structure analysis, we developed a tool, TrBorderExt, which can parse transcript start sites and termination sites accurately in bacteria. A detailed protocol is described in this chapter for using the software package step by step to identify bacterial transcript borders from raw RNA-Seq data. The package was developed in the Perl and R programming languages and is freely accessible through the website: http://www.szu-bioinf.org/TrBorderExt .
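TrBorderExt's actual algorithm is more involved, but the underlying idea of calling borders from directional coverage can be sketched naively; the threshold and coverage values below are invented.

```python
# Call a transcript start where per-base coverage rises above a threshold
# and an end where it drops back below it.
import numpy as np

def borders(cov, thresh=10):
    on = cov >= thresh
    edges = np.flatnonzero(np.diff(on.astype(int)))  # positions where on/off flips
    starts = edges[~on[edges]] + 1                   # off -> on transitions
    ends = edges[on[edges]]                          # on -> off transitions
    return starts, ends

cov = np.array([0, 0, 2, 15, 20, 18, 17, 3, 0, 0, 12, 14, 1])
print(borders(cov))  # (array([ 3, 10]), array([ 6, 11]))
```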
NASA Technical Reports Server (NTRS)
Gardner, Adrian
2010-01-01
National Aeronautics and Space Administration (NASA) weather and atmospheric environmental organizations are insatiable consumers of geophysical, hydrometeorological, and solar weather statistics. The expanding array of internet-worked sensors producing targeted physical measurements has generated an almost factorial explosion of near real-time inputs to topical statistical datasets. Normalizing and value-based parsing of such statistical datasets in support of time-constrained weather and environmental alerts and warnings is essential, even with dedicated high-performance computational capabilities. What are the optimal indicators for advanced decision making? How do we recognize the line between sufficient statistical sampling and excessive, mission-destructive sampling? How do we ensure that the normalization and parsing process, when interpolated through numerical models, yields accurate and actionable alerts and warnings? This presentation will address the integrated means and methods to achieve the desired outputs for NASA and consumers of its data.
Parsing with logical variables (logic-based programming systems)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Finin, T.W.; Stone Palmer, M.
1983-01-01
Logic based programming systems have enjoyed an increasing popularity in applied AI work in the last few years. One of the contributions to computational linguistics made by the logic programming paradigm has been the definite clause grammar. In comparing DCGs with previous parsing mechanisms such as ATNs, certain clear advantages are seen. The authors feel that the most important of these advantages are due to the use of logical variables with unification as the fundamental operation on them. To illustrate the power of the logical variable, they have implemented an experimental ATN system which treats ATN registers as logical variables and provides a unification operation over them. They aim to simultaneously encourage the use of the powerful mechanisms available in DCGs and demonstrate that some of these techniques can be captured without reference to a resolution theorem prover. 14 references.
Fernandes, Tânia; Kolinsky, Régine; Ventura, Paulo
2009-09-01
This study combined artificial language learning (ALL) with conventional experimental techniques to test whether the outputs of statistical speech segmentation are integrated into adult listeners' mental lexicon. Lexicalization was assessed through inhibitory effects of novel neighbors (created by the parsing process) on auditory lexical decisions to real words. Both immediately after familiarization and one week later, ALL outputs were lexicalized only when the cues available during familiarization (transitional probabilities and wordlikeness) suggested the same parsing (Experiments 1 and 3). No lexicalization effect occurred with incongruent cues (Experiments 2 and 4). Yet ALL differed from chance, suggesting a dissociation between item knowledge and lexicalization. Similarly contrasting results were found when the frequency of occurrence of the stimuli was equated during familiarization (Experiments 3 and 4). Our findings thus indicate that ALL outputs may be lexicalized insofar as the segmentation cues are congruent, and that this process cannot be accounted for by raw frequency.
UniGene Tabulator: a full parser for the UniGene format.
Lenzi, Luca; Frabetti, Flavia; Facchin, Federica; Casadei, Raffaella; Vitale, Lorenza; Canaider, Silvia; Carinci, Paolo; Zannotti, Maria; Strippoli, Pierluigi
2006-10-15
UniGene Tabulator 1.0 provides a solution for full parsing of the UniGene flat file format; it implements a structured graphical representation of each data field present in UniGene following import into a common database management system usable on a personal computer. This database includes related tables for sequence, protein similarity, sequence-tagged site (STS), and transcript map interval (TXMAP) data, plus a summary table in which each record represents a UniGene cluster. UniGene Tabulator enables full local management of UniGene data, allowing parsing, querying, indexing, retrieving, exporting, and analysis of UniGene data in relational database form, usable on Macintosh (OS X 10.3.9 or later) and Windows (2000 with service pack 4, or XP with service pack 2 or later) operating systems. The current release, including both FileMaker runtime applications, is freely available at http://apollo11.isto.unibo.it/software/
Discovery of novel biomarkers and phenotypes by semantic technologies
2013-01-01
Background: Biomarkers and target-specific phenotypes are important to targeted drug design and individualized medicine, thus constituting an important aspect of modern pharmaceutical research and development. More and more, the discovery of relevant biomarkers is aided by in silico techniques based on applying data mining and computational chemistry to large molecular databases. However, there is an even larger source of valuable information available that can potentially be tapped for such discoveries: repositories constituted by research documents. Results: This paper reports on a pilot experiment to discover potential novel biomarkers and phenotypes for diabetes and obesity by self-organized text mining of about 120,000 PubMed abstracts, public clinical trial summaries, and internal Merck research documents. These documents were directly analyzed by the InfoCodex semantic engine, without prior human manipulations such as parsing. Recall and precision against established, but different, benchmarks lie in ranges up to 30% and 50%, respectively. Retrieval of known entities missed by other traditional approaches could be demonstrated. Finally, the InfoCodex semantic engine was shown to discover new diabetes and obesity biomarkers and phenotypes. Amongst these were many interesting candidates with high potential, although noticeable noise (uninteresting or obvious terms) was generated. Conclusions: The reported approach of employing autonomous self-organizing semantic engines to aid biomarker discovery, supplemented by appropriate manual curation, shows promise: conservatively, it offers a faster alternative to vocabulary processes that depend on humans having to read and analyze all the texts; more optimistically, it could impact pharmaceutical research, for example by shortening the time-to-market of novel drugs or speeding up early recognition of dead ends and adverse reactions. PMID:23402646
Navon, David
2011-03-01
Though figure-ground assignment has been shown to be affected, most probably, by recognizability, it seems sensible that object recognition must follow at least the earlier process of figure-ground segregation. To examine whether rudimentary object recognition could, counterintuitively, start even before the completion of the parsing stage in which figure-ground segregation is done, participants were asked to respond, in a go/no-go fashion, whenever any of 16 alternative connected patterns (which constituted familiar stimuli in the upright orientation) appeared. The white figure of the to-be-attended stimulus (target or foil) could be segregated from the white ambient ground only by means of a frame surrounding it. Such a frame was absent until the onset of the target display. Then, to manipulate organizational quality, the greyness of the frame was either gradually increased from zero (in Experiment 1) or changed abruptly to a stationary level whose greyness varied between trials (in Experiments 2 and 3). Stimulus recognizability was manipulated by orientation angle. In all three experiments the effect of recognizability was found to be considerably larger when organizational quality was minimal due to an extremely faint frame. This result is argued to be incompatible with any version of a serial thesis suggesting that processing aimed at object recognition starts only with a good enough level of organizational quality. The experiments rather provide some support for the claim, termed here the "early interaction hypothesis," positing interaction between early recognition processing and pre-assignment parsing processes.
Representing sentence information
NASA Astrophysics Data System (ADS)
Perkins, Walton A., III
1991-03-01
This paper describes a computer-oriented representation for sentence information. Whereas many Artificial Intelligence (AI) natural language systems start with a syntactic parse of a sentence into the linguist's components (noun, verb, adjective, preposition, etc.), we argue that it is better to parse the input sentence into 'meaning' components: attribute, attribute value, object class, object instance, and relation. AI systems need a representation that allows rapid storage and retrieval of information and convenient reasoning with that information. The attribute-of-object representation has proven useful for handling information in relational databases (which are well known for their efficiency in storage and retrieval) and for reasoning in knowledge-based systems. On the other hand, the linguist's syntactic representation of the words in sentences has not been shown to be useful for information handling and reasoning; we think it is an unnecessary and misleading intermediate form. Our sentence representation is semantically based, in terms of attribute, attribute value, object class, object instance, and relation. Every sentence is segmented into one or more components of the form: 'attribute' of 'object' 'relation' 'attribute value'. Using only one format for all information gives the system simplicity and good performance, as a RISC architecture does for hardware. The attribute-of-object representation is not new; it is used extensively in relational databases and knowledge-based systems. However, we show that it can be used as a meaning representation for natural language sentences with minor extensions. In this paper we describe how a computer system can parse English sentences into this representation and generate English sentences from it. Much of this has been tested with a computer implementation.
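As a sketch of the single-format idea, the quadruple 'attribute' of 'object' 'relation' 'attribute value' maps naturally onto one record type; the field names and example facts below are illustrative, not the paper's.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Assertion:
    attribute: str
    obj: str        # object class or object instance
    relation: str   # e.g. '=', '>', 'member-of'
    value: str

# "The color of the car is red." / "John's height exceeds 180 cm."
facts = [
    Assertion("color", "car-1", "=", "red"),
    Assertion("height", "John", ">", "180 cm"),
]
print(facts[0])
```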
Comparing the Concept of Caring in Islamic Perspective with Watson and Parse's Nursing Theories
Sadat-Hoseini, Akram-Sadat; Khosropanah, Abdoul-Hosein
2017-01-01
Background: In the nursing profession, it is apparent that the definition of caring differs between various perspectives. This article compares the concept of caring in Islam with the Parse and Watson theories. Materials and Methods: In this study, we use Walker and Avant's method of concept analysis together with comparative research methods. The material used comprises Islamic documents. Results: According to Islamic documents, there are four major types of caring, namely: (1) God taking care of humans, (2) humans taking care of themselves, (3) other humans taking care of humans, and (4) the universe taking care of humans and vice versa. God caring for humans affects the three other types of caring. All three definitions of caring share a humanistic and holistic view. According to Watson's and Parse's definitions, the caring theory is developed from the person's experiences, which result from human interactions with, and experiences of, their environment. In the Islamic definition, although the caring process is affected by environmental experiences and interactions, humans do not develop only through the effects of the environment; rather, they develop on the basis of human nature and divine commands. God taking care of humans is specific to the Islamic perspective and is not found in the other definitions. The Islamic perspective maintains that God is the creator of humanity and is in charge of guiding humans, and that a superior form of human can always be discovered. Conclusions: Thus, nursing implementation for Muslims must be based on Islamic commands, and Islamic commands are superior to human experiences. However, Islamic commands interpreted with human wisdom and thought can support striving toward excellence. PMID:28584543
A prototype system for perinatal knowledge engineering using an artificial intelligence tool.
Sokol, R J; Chik, L
1988-01-01
Though several perinatal expert systems are extant, the use of artificial intelligence has, as yet, had minimal impact on medical computing. In this evaluation of the potential of AI techniques in the development of a computer-based "Perinatal Consultant," a "top down" approach to the development of a perinatal knowledge base was taken, using as a source for the knowledge base a 30-page manuscript of a chapter concerning high-risk pregnancy. The UNIX utility "style" was used to parse sentences and obtain key words and phrases, both as part of a natural language interface and to identify key perinatal concepts. Compared with the "gold standard" of sentences containing key facts as chosen by the experts, a semiautomated method using a nonmedical speller to identify key words and phrases in context functioned with a sensitivity of 79%; i.e., approximately 8 in 10 key sentences were detected as the basis for PROLOG rules and facts for the knowledge base. These encouraging results suggest that functional perinatal expert systems may well be expedited by using programming utilities in conjunction with AI tools and the published literature.
Addressing socioeconomic and political challenges posed by climate change
NASA Astrophysics Data System (ADS)
Fernando, Harindra Joseph; Klaic, Zvjezdana Bencetic
2011-08-01
NATO Advanced Research Workshop: Climate Change, Human Health and National Security; Dubrovnik, Croatia, 28-30 April 2011. Climate change has been identified as one of the most serious threats to humanity. It not only causes sea level rise, drought, crop failure, vector-borne diseases, extreme events, degradation of water and air quality, heat waves, and other phenomena, but it is also a threat multiplier, wherein the concatenation of multiple events may lead to frequent human catastrophes and intranational and international conflicts. In particular, urban areas may bear the brunt of climate change because of the amplification of climate effects that cascade down from global to urban scales, but current modeling and downscaling capabilities are unable to predict these effects with confidence. These were the main conclusions of a NATO Advanced Research Workshop (ARW) sponsored by the NATO Science for Peace and Security program. Thirty-two invitees from 17 countries, including leading modelers; natural, political, and social scientists; engineers; politicians; military experts; urban planners; industry analysts; epidemiologists; and health care professionals, parsed the topic on a common platform.
The EPRDATA Format: A Dialogue
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hughes, III, Henry Grady
2015-08-18
Recently the Los Alamos Nuclear Data Team has communicated certain issues of concern in relation to the new electron/photon/relaxation ACE data format as released in the eprdata12 library. In this document those issues are parsed, analyzed, and answered.
Deterministic Parsing and Linguistic Explanation. Revision,
1985-06-01
Sleep Disrupts High-Level Speech Parsing Despite Significant Basic Auditory Processing.
Makov, Shiri; Sharon, Omer; Ding, Nai; Ben-Shachar, Michal; Nir, Yuval; Zion Golumbic, Elana
2017-08-09
The extent to which the sleeping brain processes sensory information remains unclear. This is particularly true for continuous and complex stimuli such as speech, in which information is organized into hierarchically embedded structures. Recently, novel metrics for assessing the neural representation of continuous speech have been developed using noninvasive brain recordings that have thus far only been tested during wakefulness. Here we investigated, for the first time, the sleeping brain's capacity to process continuous speech at different hierarchical levels using a newly developed Concurrent Hierarchical Tracking (CHT) approach that allows monitoring the neural representation and processing-depth of continuous speech online. Speech sequences were compiled with syllables, words, phrases, and sentences occurring at fixed time intervals such that different linguistic levels correspond to distinct frequencies. This enabled us to distinguish their neural signatures in brain activity. We compared the neural tracking of intelligible versus unintelligible (scrambled and foreign) speech across states of wakefulness and sleep using high-density EEG in humans. We found that neural tracking of stimulus acoustics was comparable across wakefulness and sleep and similar across all conditions regardless of speech intelligibility. In contrast, neural tracking of higher-order linguistic constructs (words, phrases, and sentences) was only observed for intelligible speech during wakefulness and could not be detected at all during nonrapid eye movement or rapid eye movement sleep. These results suggest that, whereas low-level auditory processing is relatively preserved during sleep, higher-level hierarchical linguistic parsing is severely disrupted, thereby revealing the capacity and limits of language processing during sleep. SIGNIFICANCE STATEMENT Despite the persistence of some sensory processing during sleep, it is unclear whether high-level cognitive processes such as speech parsing are also preserved. We used a novel approach for studying the depth of speech processing across wakefulness and sleep while tracking neuronal activity with EEG. We found that responses to the auditory sound stream remained intact; however, the sleeping brain did not show signs of hierarchical parsing of the continuous stream of syllables into words, phrases, and sentences. The results suggest that sleep imposes a functional barrier between basic sensory processing and high-level cognitive processing. This paradigm also holds promise for studying residual cognitive abilities in a wide array of unresponsive states. Copyright © 2017 the authors 0270-6474/17/377772-10$15.00/0.
Parsing clinical text: how good are the state-of-the-art parsers?
2015-01-01
Background Parsing, which generates a syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain, including medicine. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there are no formal evaluations and comparisons of their performance in the medical domain. Methods In this study, we investigated the performance of three state-of-the-art parsers: the Stanford parser, the Bikel parser, and the Charniak parser, using the following two datasets: (1) a Treebank containing 1,100 sentences that were randomly selected from progress notes used in the 2010 i2b2 NLP challenge and manually annotated according to a Penn Treebank based guideline; and (2) the MiPACQ Treebank, developed from pathology notes and clinical notes and containing 13,091 sentences. We conducted three experiments on both datasets. First, we measured the performance of the three state-of-the-art parsers on the clinical Treebanks with their default settings. Then we re-trained the parsers using the clinical Treebanks and evaluated their performance using the 10-fold cross validation method. Finally, we re-trained the parsers by combining the clinical Treebanks with the Penn Treebank. Results Our results showed that the original parsers achieved lower performance on clinical text (Bracketing F-measure in the range of 66.6%-70.3%) compared to general English text. After retraining on the clinical Treebanks, all parsers achieved better performance, with the best performance from the Stanford parser, which reached the highest Bracketing F-measure of 73.68% on progress notes and 83.72% on the MiPACQ corpus using 10-fold cross validation. When the combined clinical Treebanks and Penn Treebank were used, the Charniak parser achieved the highest Bracketing F-measure of 73.53% on progress notes and the Stanford parser reached the highest F-measure of 84.15% on the MiPACQ corpus. Conclusions Our study demonstrates that re-training on clinical Treebanks is critical for improving general English parsers' performance on clinical text, and that combining clinical and open domain corpora might achieve optimal performance for parsing clinical text. PMID:26045009
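For readers unfamiliar with the Bracketing F-measure reported throughout these results, here is a simplified PARSEVAL-style computation over labelled spans; real evalb scoring adds exclusions (e.g. punctuation and the root bracket) that this sketch omits:

```python
def brackets(tree, start=0):
    """Collect (label, start, end) spans from a tree given as nested tuples,
    e.g. ('S', [('NP', ['the', 'dog']), ('VP', ['barks'])])."""
    label, children = tree
    spans, pos = [], start
    for child in children:
        if isinstance(child, tuple):        # nonterminal: recurse
            child_spans, pos = brackets(child, pos)
            spans.extend(child_spans)
        else:                               # terminal word
            pos += 1
    spans.append((label, start, pos))
    return spans, pos

def bracketing_f1(gold_tree, test_tree):
    """F-measure over labelled spans shared by gold and test parses."""
    gold, _ = brackets(gold_tree)
    test, _ = brackets(test_tree)
    matched = len(set(gold) & set(test))
    precision, recall = matched / len(test), matched / len(gold)
    return 2 * precision * recall / (precision + recall)

gold = ('S', [('NP', ['the', 'dog']), ('VP', ['barks'])])
test = ('S', [('NP', ['the']), ('VP', ['dog', 'barks'])])
print(round(bracketing_f1(gold, test), 2))  # 0.33: only the S span matches
```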
DOE Office of Scientific and Technical Information (OSTI.GOV)
Owen, R. K.
2007-04-04
A Perl module designed to read and parse the voluminous set of event or accounting log files produced by a Portable Batch System (PBS) server. The module can filter on date-time and/or record type, and the data can be returned in a variety of formats.
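The record layout assumed below (a semicolon-delimited timestamp, record type, and job id, followed by space-separated key=value pairs) matches common PBS accounting logs; this Python sketch mirrors the filter-by-date-and-record-type behaviour described, not the module's own implementation:

```python
import datetime

def parse_pbs_record(line):
    """Parse one PBS accounting-log line of the assumed form
    'MM/DD/YYYY HH:MM:SS;TYPE;ID;key=value key=value ...'."""
    stamp, rectype, ident, message = line.rstrip("\n").split(";", 3)
    when = datetime.datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S")
    fields = {}
    for token in message.split():
        if "=" in token:
            key, _, value = token.partition("=")
            fields[key] = value
    return {"time": when, "type": rectype, "id": ident, "fields": fields}

def filter_records(lines, rectypes=None, since=None, until=None):
    """Yield parsed records, filtered on record type and/or a time window."""
    for line in lines:
        rec = parse_pbs_record(line)
        if rectypes and rec["type"] not in rectypes:
            continue
        if since and rec["time"] < since:
            continue
        if until and rec["time"] > until:
            continue
        yield rec

# Hypothetical usage on a job-end ('E') record:
sample = "04/03/2007 14:21:13;E;1234.server;user=rko queue=batch resources_used.walltime=01:02:03"
for rec in filter_records([sample], rectypes={"E"}):
    print(rec["id"], rec["fields"]["resources_used.walltime"])
```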
2007-11-01
Unstructured Information Management Architecture (UIMA) [3] based framework. All that would be required of such a NEFServer instance would be to send, receive, and parse the received information... [Fragment; cited works include the Apache UIMA project, http://incubator.apache.org/uima/, and Grishman, R. & Sundheim, B., "Message Understanding Conference – 6: A Brief History."]
Price, Rebecca B; Lane, Stephanie; Gates, Kathleen; Kraynak, Thomas E; Horner, Michelle S; Thase, Michael E; Siegle, Greg J
2017-02-15
There is well-known heterogeneity in affective mechanisms in depression that may extend to positive affect. We used data-driven parsing of neural connectivity to reveal subgroups present across depressed and healthy individuals during positive processing, informing targets for mechanistic intervention. Ninety-two individuals (68 depressed patients, 24 never-depressed control subjects) completed a sustained positive mood induction during functional magnetic resonance imaging. Directed functional connectivity paths within a depression-relevant network were characterized using Group Iterative Multiple Model Estimation (GIMME), a method shown to accurately recover the direction and presence of connectivity paths in individual participants. During model selection, individuals were clustered using community detection on neural connectivity estimates. Subgroups were externally tested across multiple levels of analysis. Two connectivity-based subgroups emerged: subgroup A, characterized by weaker connectivity overall, and subgroup B, exhibiting hyperconnectivity (relative to subgroup A), particularly among ventral affective regions. Subgroup membership predicted diagnostic status (subgroup B contained 81% of patients and 50% of control subjects; χ2 = 8.6, p = .003) and default mode network connectivity during a separate resting-state task. Among patients, subgroup B members had higher self-reported symptoms, lower sustained positive mood during the induction, and higher negative bias on a reaction-time task. Symptom-based depression subgroups did not predict these external variables. Neural connectivity-based categorization travels with diagnostic category and is clinically predictive, but not clinically deterministic. Both patients and control subjects showed heterogeneous, and overlapping, profiles. The larger and more severely affected patient subgroup was characterized by ventrally driven hyperconnectivity during positive processing. Data-driven parsing suggests heterogeneous substrates of depression and possible resilience in control subjects in spite of biological overlap. Copyright © 2016 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
Psychophysical and Neural Correlates of Auditory Attraction and Aversion
NASA Astrophysics Data System (ADS)
Patten, Kristopher Jakob
This study explores the psychophysical and neural processes associated with the perception of sounds as either pleasant or aversive. The underlying psychophysical theory is based on auditory scene analysis, the process through which listeners parse auditory signals into individual acoustic sources. The first experiment tests and confirms that a self-rated pleasantness continuum reliably exists for 20 various stimuli (r = .48). In addition, the pleasantness continuum correlated with the physical acoustic characteristics of consonance/dissonance (r = .78), which can facilitate auditory parsing processes. The second experiment uses an fMRI block design to test blood oxygen level dependent (BOLD) changes elicited by a subset of 5 exemplar stimuli chosen from Experiment 1 that are evenly distributed over the pleasantness continuum. Specifically, it tests and confirms that the pleasantness continuum produces systematic changes in brain activity for unpleasant acoustic stimuli beyond what occurs with pleasant auditory stimuli. Results revealed that the combination of two positively and two negatively valenced experimental sounds compared to one neutral baseline control elicited BOLD increases in the primary auditory cortex, specifically the bilateral superior temporal gyrus, and left dorsomedial prefrontal cortex; the latter being consistent with a frontal decision-making process common in identification tasks. The negatively-valenced stimuli yielded additional BOLD increases in the left insula, which typically indicates processing of visceral emotions. The positively-valenced stimuli did not yield any significant BOLD activation, consistent with consonant, harmonic stimuli being the prototypical acoustic pattern of auditory objects that is optimal for auditory scene analysis. Both the psychophysical findings of Experiment 1 and the neural processing findings of Experiment 2 support that consonance is an important dimension of sound that is processed in a manner that aids auditory parsing and functional representation of acoustic objects and was found to be a principal feature of pleasing auditory stimuli.
Machine learning to parse breast pathology reports in Chinese.
Tang, Rong; Ouyang, Lizhi; Li, Clara; He, Yue; Griffin, Molly; Taghian, Alphonse; Smith, Barbara; Yala, Adam; Barzilay, Regina; Hughes, Kevin
2018-06-01
Large structured databases of pathology findings are valuable in deriving new clinical insights. However, they are labor intensive to create and generally require manual annotation. There has been some work in the bioinformatics community to support automating this work via machine learning in English. Our contribution is to provide an automated approach to construct such structured databases in Chinese, and to set the stage for extraction from other languages. We collected 2104 de-identified Chinese benign and malignant breast pathology reports from Hunan Cancer Hospital. Physicians with native Chinese proficiency reviewed the reports and annotated a variety of binary and numerical pathologic entities. After excluding 78 cases with a bilateral lesion in the same report, 1216 cases were used as a training set for the algorithm, which was then refined on 405 development cases. The natural language processing algorithm was tested on the remaining 405 cases to evaluate the machine learning outcome. The model was used to extract 13 binary entities and 8 numerical entities. When compared to physicians with native Chinese proficiency, the model showed per-entity accuracy from 91% to 100% for all common diagnoses on the test set. The overall accuracy was 98% for binary entities and 95% for numerical entities. In a per-report evaluation for binary entities with more than 100 training cases, 85% of all the testing reports were completely correct and 11% had an error in 1 out of 22 entities. We have demonstrated that Chinese breast pathology reports can be automatically parsed into structured data using standard machine learning approaches. The results of our study demonstrate that techniques effective in parsing English reports can be scaled to other languages.
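A minimal sketch of one per-entity binary classifier using character n-grams, a common tactic for Chinese clinical text because it sidesteps word segmentation. The reports and labels below are hypothetical placeholders, and the paper's actual model and features may differ:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training reports and a binary label for one pathologic
# entity (lymphovascular invasion present).
train_reports = ["浸润性导管癌，可见脉管侵犯", "纤维腺瘤，未见恶性证据",
                 "导管原位癌，未见脉管侵犯", "浸润性小叶癌，脉管侵犯阳性"]
has_lvi = [1, 0, 0, 1]

model = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 3)),  # character n-grams
    LogisticRegression(),
)
model.fit(train_reports, has_lvi)
print(model.predict(["浸润性导管癌，脉管侵犯阳性"]))  # predicted presence of the entity
```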
A Seer of Trump's Coming Parses Repeal and Replace.
Kirkner, Richard Mark
2017-03-01
Diana Furchtgott-Roth, a senior fellow at the Manhattan Institute, a free-market think tank, confidently predicted back in October what few people saw coming: Donald Trump's electoral victory. Now she gives her take on the dismantling of the ACA and what might come after.
Design Report for the Synchronized Position, Velocity, and Time Code Generator
2015-08-01
[Table-of-contents fragment: sections cover the data stream specification, data packet format specification, individual message definitions, MATLAB parsing software, and conclusions; tables define the data packet format structure and the PPS time and position message definitions.]
This SOP describes the method used to automatically parse analytical data generated from gas chromatography/mass spectrometry (GC/MS) analyses into CTEPP summary spreadsheets and to electronically import the summary spreadsheets into the CTEPP study database.
Hart, Reece K; Rico, Rudolph; Hare, Emily; Garcia, John; Westbrook, Jody; Fusaro, Vincent A
2015-01-15
Biological sequence variants are commonly represented in scientific literature, clinical reports and databases of variation using the mutation nomenclature guidelines endorsed by the Human Genome Variation Society (HGVS). Despite the widespread use of the standard, no freely available and comprehensive programming libraries are available. Here we report an open-source and easy-to-use Python library that facilitates the parsing, manipulation, formatting and validation of variants according to the HGVS specification. The current implementation focuses on the subset of the HGVS recommendations that precisely describe sequence-level variation relevant to the application of high-throughput sequencing to clinical diagnostics. The package is released under the Apache 2.0 open-source license. Source code, documentation and issue tracking are available at http://bitbucket.org/hgvs/hgvs/. Python packages are available at PyPI (https://pypi.python.org/pypi/hgvs). Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
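The library's parsing entry point is compact. A usage sketch, assuming the package API as described around the time of publication (the variant string is an arbitrary, syntactically valid example):

```python
# Requires: pip install hgvs. Parsing alone needs no network access;
# validation against reference sequences does.
import hgvs.parser

hp = hgvs.parser.Parser()
var = hp.parse_hgvs_variant("NM_001197320.1:c.281C>T")

print(var.ac)        # accession: NM_001197320.1
print(var.type)      # coordinate type: 'c' (coding DNA)
print(var.posedit)   # position plus edit: 281C>T
print(str(var))      # formats back to the HGVS string
```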
Parsing recursive sentences with a connectionist model including a neural stack and synaptic gating.
Fedor, Anna; Ittzés, Péter; Szathmáry, Eörs
2011-02-21
It is supposed that humans are genetically predisposed to be able to recognize sequences of context-free grammars with centre-embedded recursion while other primates are restricted to the recognition of finite state grammars with tail-recursion. Our aim was to construct a minimalist neural network that is able to parse artificial sentences of both grammars in an efficient way without using the biologically unrealistic backpropagation algorithm. The core of this network is a neural stack-like memory where the push and pop operations are regulated by synaptic gating on the connections between the layers of the stack. The network correctly categorizes novel sentences of both grammars after training. We suggest that the introduction of the neural stack memory will turn out to be substantial for any biological 'hierarchical processor' and the minimalist design of the model suggests a quest for similar, realistic neural architectures. Copyright © 2010 Elsevier Ltd. All rights reserved.
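The computational point (centre-embedded recursion needs a stack; tail recursion does not) can be made with two symbolic recognizers over the classic a^n b^n versus (ab)^n contrast. The grammars below stand in for the paper's artificial sentences; the neural stack itself is not modelled here:

```python
def accepts_centre_embedded(s):
    """Recognize a^n b^n (centre-embedded recursion): requires a stack."""
    stack, seen_b = [], False
    for ch in s:
        if ch == "a":
            if seen_b:          # an 'a' after any 'b' breaks the pattern
                return False
            stack.append("a")   # push on each opening symbol
        elif ch == "b":
            seen_b = True
            if not stack:       # each 'b' must pop a pending 'a'
                return False
            stack.pop()
        else:
            return False
    return not stack            # every 'a' matched by a 'b'

def accepts_tail_recursive(s):
    """Recognize (ab)^n (tail recursion): two states suffice, no stack."""
    state = 0                   # 0: expect 'a', 1: expect 'b'
    for ch in s:
        if state == 0 and ch == "a":
            state = 1
        elif state == 1 and ch == "b":
            state = 0
        else:
            return False
    return state == 0

print(accepts_centre_embedded("aaabbb"), accepts_tail_recursive("ababab"))  # True True
print(accepts_centre_embedded("aabbb"), accepts_tail_recursive("abba"))     # False False
```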
Mathematical formula recognition using graph grammar
NASA Astrophysics Data System (ADS)
Lavirotte, Stephane; Pottier, Loic
1998-04-01
This paper describes current results of Ofr, a system for extracting and understanding mathematical expressions in documents. Such a tool could be very useful for reusing the knowledge in scientific books that are not available in electronic form. We are also studying the use of this system for direct input of formulas with a graphical tablet for computer algebra software. Existing solutions for mathematical recognition have problems analyzing 2D expressions such as vectors and matrices, because they often try to use extended classical grammars that analyze formulas relative to a baseline. Many mathematical notations do not respect the rules such parsing requires, which is why these extensions of text-parsing techniques fail. We investigate graph grammars and graph rewriting as a solution for recognizing 2D mathematical notations. Graph grammars provide a powerful formalism for describing structural manipulations of multi-dimensional data. The two main problems to solve are ambiguities between grammar rules and the construction of the graph.
Jiao, Dazhi; Wild, David J
2009-02-01
This paper proposes a system that automatically extracts CYP protein and chemical interactions from journal article abstracts, using natural language processing (NLP) and text mining methods. In our system, we employ a maximum entropy based learning method, using results from syntactic, semantic, and lexical analysis of texts. We first present our system architecture and then discuss the data set for training our machine learning based models and the methods in building components in our system, such as part of speech (POS) tagging, Named Entity Recognition (NER), dependency parsing, and relation extraction. An evaluation of the system is conducted at the end, yielding very promising results: The POS, dependency parsing, and NER components in our system have achieved a very high level of accuracy as measured by precision, ranging from 85.9% to 98.5%, and the precision and the recall of the interaction extraction component are 76.0% and 82.6%, and for the overall system are 68.4% and 72.2%, respectively.
Parse, simulation, and prediction of NOx emission across the Midwestern United States
NASA Astrophysics Data System (ADS)
Fang, H.; Michalski, G. M.; Spak, S.
2017-12-01
Accurately constraining N emissions in space and time has been a challenge for atmospheric scientists. It has been suggested that 15N isotopes may be a way of tracking N emission sources across various spatial and temporal scales. However, the complexity of multiple N sources that can quickly change in intensity has made this a difficult problem. We used the SMOKE emission model to parse NOx emissions across the Midwestern United States for a one-year simulation. An isotope mass balance method was used to assign δ15N values to road, non-road, point, and area sources. The SMOKE emissions and isotope mass balance were then combined to predict the δ15N of NOx emissions (Figure 1). This δ15N of NOx emissions model was then incorporated into CMAQ to assess how transport and chemistry impact the δ15N value of NOx through mixing and removal processes. The predicted δ15N values of NOx were compared to recent measurements of NOx and atmospheric nitrate.
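The isotope mass balance in question is an emission-weighted average, δ15N_total = Σ_i f_i δ15N_i, with f_i the fraction of NOx from source i. A sketch with hypothetical source strengths and per-mil signatures (the numbers are illustrative, not the study's):

```python
def delta15N_mix(emissions, signatures):
    """Emission-weighted isotope mass balance:
    delta15N_total = sum_i f_i * delta15N_i, with f_i = E_i / sum(E)."""
    total = sum(emissions.values())
    return sum(emissions[src] / total * signatures[src] for src in emissions)

# Hypothetical NOx source strengths (arbitrary units) and per-mil
# signatures; the sector names mirror the SMOKE categories above.
emissions  = {"road": 55.0, "non-road": 15.0, "point": 20.0, "area": 10.0}
signatures = {"road": -4.0, "non-road": -9.0, "point": 12.0, "area": -1.0}
print(round(delta15N_mix(emissions, signatures), 2))  # -1.25 per mil for this mix
```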
Stochastic Time Models of Syllable Structure
Shaw, Jason A.; Gafos, Adamantios I.
2015-01-01
Drawing on phonology research within the generative linguistics tradition, stochastic methods, and notions from complex systems, we develop a modelling paradigm linking phonological structure, expressed in terms of syllables, to speech movement data acquired with 3D electromagnetic articulography and X-ray microbeam methods. The essential variable in the models is syllable structure. When mapped to discrete coordination topologies, syllabic organization imposes systematic patterns of variability on the temporal dynamics of speech articulation. We simulated these dynamics under different syllabic parses and evaluated simulations against experimental data from Arabic and English, two languages claimed to parse similar strings of segments into different syllabic structures. Model simulations replicated several key experimental results, including the fallibility of past phonetic heuristics for syllable structure, and exposed the range of conditions under which such heuristics remain valid. More importantly, the modelling approach consistently diagnosed syllable structure proving resilient to multiple sources of variability in experimental data including measurement variability, speaker variability, and contextual variability. Prospects for extensions of our modelling paradigm to acoustic data are also discussed. PMID:25996153
Form and function: Optional complementizers reduce causal inferences
Rohde, Hannah; Tyler, Joseph; Carlson, Katy
2017-01-01
Many factors are known to influence the inference of the discourse coherence relationship between two sentences. Here, we examine the relationship between two conjoined embedded clauses in sentences like The professor noted that the student teacher did not look confident and (that) the students were poorly behaved. In two studies, we find that the presence of that before the second embedded clause in such sentences reduces the possibility of a forward causal relationship between the clauses, i.e., the inference that the student teacher’s confidence was what affected student behavior. Three further studies tested the possibility of a backward causal relationship between clauses in the same structure, and found that the complementizer’s presence aids that relationship, especially in a forced-choice paradigm. The empirical finding that a complementizer, a linguistic element associated primarily with structure rather than event-level semantics, can affect discourse coherence is novel and illustrates an interdependence between syntactic parsing and discourse parsing. PMID:28804781
Software Development Of XML Parser Based On Algebraic Tools
NASA Astrophysics Data System (ADS)
Georgiev, Bozhidar; Georgieva, Adriana
2011-12-01
This paper presents a software development and implementation of an algebraic method for XML data processing that accelerates the XML parsing process. The nontraditional approach proposed here for fast XML navigation with algebraic tools contributes to ongoing efforts to build an easier, user-friendly API for XML transformations. The proposed software for XML document processing (a parser) is easy to use and can manage files with a strictly defined data structure. The purpose of the presented algorithm is to offer a new approach for searching and restructuring hierarchical XML data. This approach permits fast XML document processing, using an algebraic model developed in detail in previous works by the same authors. The proposed parsing mechanism is easily accessible to the web consumer, who can control XML file processing, search for different elements (tags), and delete or add new XML content. Various tests show higher speed and lower resource consumption in comparison with some existing commercial parsers.
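The paper's algebraic model is its own contribution; for comparison, a conventional streaming search over XML in Python's standard library looks like this (a baseline of the kind such parsers are measured against, not the authors' method):

```python
import io
import xml.etree.ElementTree as ET

doc = io.BytesIO(b"""<library>
  <book id="1"><title>Algebraic Methods</title></book>
  <book id="2"><title>XML Processing</title></book>
</library>""")

# Stream over the document, visiting elements as their end tags arrive,
# so a simple search never needs the whole tree in memory at once.
for event, elem in ET.iterparse(doc, events=("end",)):
    if elem.tag == "book":
        print(elem.get("id"), elem.findtext("title"))
        elem.clear()  # release already-processed children to bound memory
```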
Video content parsing based on combined audio and visual information
NASA Astrophysics Data System (ADS)
Zhang, Tong; Kuo, C.-C. Jay
1999-08-01
While previous research on audiovisual data segmentation and indexing primarily focuses on the pictorial part, significant clues contained in the accompanying audio flow are often ignored. A fully functional system for video content parsing can be achieved more successfully through a proper combination of audio and visual information. By investigating the data structure of different video types, we present tools for both audio and visual content analysis and a scheme for video segmentation and annotation in this research. In the proposed system, video data are segmented into audio scenes and visual shots by detecting abrupt changes in audio and visual features, respectively. Then, each audio scene is categorized and indexed as one of the basic audio types, while a visual shot is represented by keyframes and associated image features. An index table is then generated automatically for each video clip based on the integration of outputs from audio and visual analysis. It is shown that the proposed system provides satisfactory video indexing results.
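On the visual side, the abrupt-change detection such systems rely on can be approximated by thresholding histogram differences between consecutive frames. A toy sketch under that assumption (the bin count and threshold are illustrative; the paper's actual features differ):

```python
import numpy as np

def shot_boundaries(frames, bins=16, threshold=0.4):
    """Flag shot changes where the normalized grey-level histograms of
    consecutive frames differ sharply (L1 distance above a threshold).
    `frames` is an iterable of 2-D uint8 arrays; the threshold is empirical."""
    boundaries, prev_hist = [], None
    for i, frame in enumerate(frames):
        hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
        hist = hist / hist.sum()
        if prev_hist is not None and np.abs(hist - prev_hist).sum() > threshold:
            boundaries.append(i)
        prev_hist = hist
    return boundaries

# Toy demo: 5 dark frames then 5 bright frames -> one boundary at frame 5.
dark = [np.full((32, 32), 20, np.uint8)] * 5
bright = [np.full((32, 32), 220, np.uint8)] * 5
print(shot_boundaries(dark + bright))  # [5]
```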
Parsing fear: A reassessment of the evidence for fear deficits in psychopathy.
Hoppenbrouwers, Sylco S; Bulten, Berend H; Brazil, Inti A
2016-06-01
Psychopathy is a personality disorder characterized by interpersonal manipulation and callousness, and reckless and impulsive antisocial behavior. It is often seen as a disorder in which profound emotional disturbances lead to antisocial behavior. A lack of fear in particular has been proposed as an etiologically salient factor. In this review, we employ a conceptual model in which fear is parsed into separate subcomponents. Important historical conceptualizations of psychopathy, the neuroscientific and empirical evidence for fear deficits in psychopathy are compared against this model. The empirical evidence is also subjected to a meta-analysis. We conclude that most studies have used the term "fear" generically, amassing different methods and levels of measurement under the umbrella term "fear." Unlike earlier claims that psychopathy is related to general fearlessness, we show there is evidence that psychopathic individuals have deficits in threat detection and responsivity, but that the evidence for reduced subjective experience of fear in psychopathy is far less compelling. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Semantic Support and Parallel Parsing in Chinese
ERIC Educational Resources Information Center
Hsieh, Yufen; Boland, Julie E.
2015-01-01
Two eye-tracking experiments were conducted using written Chinese sentences that contained a multi-word ambiguous region. The goal was to determine whether readers maintained multiple interpretations throughout the ambiguous region or selected a single interpretation at the point of ambiguity. Within the ambiguous region, we manipulated the…
ERIC Educational Resources Information Center
Kirshner, David
1989-01-01
A structured system of visual features is seen to parallel the propositional hierarchy of operations usually associated with the parsing of algebraic expressions. Women more than men were found to depend on these visual cues. Possible causes and consequences are discussed. Subjects were secondary and college students. (Author/DC)
Instructional Implications of Inquiry in Reading Comprehension.
ERIC Educational Resources Information Center
Snow, David
A contract deliverable on the NIE Communication Skills Project, this report consists of three separate documents describing the instructional implications of the analytic and empirical work carried out for the "Classroom Instruction in Reading Comprehension" part of the project: (1) Guidelines for Phrasal Segmentation; (2) Parsing Tasks…
Memory Retrieval in Parsing and Interpretation
ERIC Educational Resources Information Center
Schlueter, Ananda Lila Zoe
2017-01-01
This dissertation explores the relationship between the parser and the grammar in error-driven retrieval by examining the mechanism underlying the illusory licensing of subject-verb agreement violations ("agreement attraction"). Previous work motivates a two-stage model of agreement attraction in which the parser predicts the verb's…
ERIC Educational Resources Information Center
Gabbard, Ryan
2010-01-01
Understanding the syntactic structure of a sentence is a necessary preliminary to understanding its semantics and therefore for many practical applications. The field of natural language processing has achieved a high degree of accuracy in parsing, at least in English. However, the syntactic structures produced by the most commonly used parsers…
Revisiting Executive Function Measurement: Implications for Lifespan Development
ERIC Educational Resources Information Center
Wiebe, Sandra A.; McFall, G. Peggy
2014-01-01
Since Miyake and his colleagues (2000) published their seminal paper on the use of confirmatory factor analysis (CFA) to parse executive function (EF), CFA methods have become ubiquitous in EF research. In their interesting and thoughtful Focus article, "Executive Function: Formative Versus Reflective Measurement," Willoughby and…
Lessons Learned in Part-of-Speech Tagging of Conversational Speech
2010-10-01
Decoupling Object Detection and Categorization
ERIC Educational Resources Information Center
Mack, Michael L.; Palmeri, Thomas J.
2010-01-01
We investigated whether there exists a behavioral dependency between object detection and categorization. Previous work (Grill-Spector & Kanwisher, 2005) suggests that object detection and basic-level categorization may be the very same perceptual mechanism: As objects are parsed from the background they are categorized at the basic level. In…
Conversational Simulation in Computer-Assisted Language Learning: Potential and Reality.
ERIC Educational Resources Information Center
Coleman, D. Wells
1988-01-01
Addresses the potential of conversational simulations for computer-assisted language learning (CALL) and reasons why this potential is largely untapped. Topics discussed include artificial intelligence; microworlds; parsing; realism versus reality in computer software; intelligent tutoring systems; and criteria to clarify what kinds of CALL…
2012-01-01
Background White mold, caused by Sclerotinia sclerotiorum, is one of the most important diseases of pea (Pisum sativum L.), however, little is known about the genetics and biochemistry of this interaction. Identification of genes underlying resistance in the host or pathogenicity and virulence factors in the pathogen will increase our knowledge of the pea-S. sclerotiorum interaction and facilitate the introgression of new resistance genes into commercial pea varieties. Although the S. sclerotiorum genome sequence is available, no pea genome is available, due in part to its large genome size (~3500 Mb) and extensive repeated motifs. Here we present an EST data set specific to the interaction between S. sclerotiorum and pea, and a method to distinguish pathogen and host sequences without a species-specific reference genome. Results 10,158 contigs were obtained by de novo assembly of 128,720 high-quality reads generated by 454 pyrosequencing of the pea-S. sclerotiorum interactome. A method based on the tBLASTx program was modified to distinguish pea and S. sclerotiorum ESTs. To test this strategy, a mixture of known ESTs (18,490 pea and 17,198 S. sclerotiorum ESTs) from public databases were pooled and parsed; the tBLASTx method successfully separated 90.1% of the artificial EST mix with 99.9% accuracy. The tBLASTx method successfully parsed 89.4% of the 454-derived EST contigs, as validated by PCR, into pea (6,299 contigs) and S. sclerotiorum (2,780 contigs) categories. Two thousand eight hundred and forty pea ESTs and 996 S. sclerotiorum ESTs were predicted to be expressed specifically during the pea-S. sclerotiorum interaction as determined by homology search against 81,449 pea ESTs (from flowers, leaves, cotyledons, epi- and hypocotyl, and etiolated and light treated etiolated seedlings) and 57,751 S. sclerotiorum ESTs (from mycelia at neutral pH, developing apothecia and developing sclerotia). Among those ESTs specifically expressed, 277 (9.8%) pea ESTs were predicted to be involved in plant defense and response to biotic or abiotic stress, and 93 (9.3%) S. sclerotiorum ESTs were predicted to be involved in pathogenicity/virulence. Additionally, 142 S. sclerotiorum ESTs were identified as secretory/signal peptides of which only 21 were previously reported. Conclusions We present and characterize an EST resource specific to the pea-S. sclerotiorum interaction. Additionally, the tBLASTx method used to parse S. sclerotiorum and pea ESTs was demonstrated to be a reliable and accurate method to distinguish ESTs without a reference genome. PMID:23181755
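The species-assignment step can be pictured as a best-hit comparison against the two reference EST sets. A simplified sketch using tabular search results; the three-column file layout and the best-bit-score criterion are assumptions for illustration, and the study's tBLASTx-based parser is more involved:

```python
import csv
from collections import defaultdict

def best_bitscores(path):
    """Best bit score per query contig, from a tab-separated file with the
    assumed columns: qseqid, sseqid, bitscore."""
    best = defaultdict(float)
    with open(path) as fh:
        for qseqid, _sseqid, bitscore in csv.reader(fh, delimiter="\t"):
            best[qseqid] = max(best[qseqid], float(bitscore))
    return dict(best)

def assign_contigs(pea_hits, fungus_hits):
    """Label each contig by whichever reference EST set scores higher."""
    labels = {}
    for contig in set(pea_hits) | set(fungus_hits):
        p, f = pea_hits.get(contig, 0.0), fungus_hits.get(contig, 0.0)
        labels[contig] = "pea" if p >= f else "S. sclerotiorum"
    return labels

# Hypothetical usage with two search-result files:
# labels = assign_contigs(best_bitscores("vs_pea.tsv"),
#                         best_bitscores("vs_ssclero.tsv"))
```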
ERIC Educational Resources Information Center
Landsbergen, Jan, Ed.; Odijk, Jan, Ed.; van Deemter, Kees, Ed.; van Zanten, Gert Veldhuijzen, Ed.
Papers from the meeting on computational linguistics include: "Conversational Games, Belief Revision and Bayesian Networks" (Stephen G. Pulman); "Valence Alternation without Lexical Rules" (Gosse Bouma); "Filtering Left Dislocation Chains in Parsing Categorical Grammar" (Crit Cremers, Maarten Hijzelendoorn);…
Developing a Large Lexical Database for Information Retrieval, Parsing, and Text Generation Systems.
ERIC Educational Resources Information Center
Conlon, Sumali Pin-Ngern; And Others
1993-01-01
Important characteristics of lexical databases and their applications in information retrieval and natural language processing are explained. An ongoing project using various machine-readable sources to build a lexical database is described, and detailed designs of individual entries with examples are included. (Contains 66 references.) (EAM)
Interference Effects from Grammatically Unavailable Constituents during Sentence Processing
ERIC Educational Resources Information Center
Van Dyke, Julie A.
2007-01-01
Evidence from 3 experiments reveals interference effects from structural relationships that are inconsistent with any grammatical parse of the perceived input. Processing disruption was observed when items occurring between a head and a dependent overlapped with either (or both) syntactic or semantic features of the dependent. Effects of syntactic…
From Internationalisation to Education for Global Citizenship: A Multi-Layered History
ERIC Educational Resources Information Center
Haigh, Martin
2014-01-01
The evolving narrative on internationalisation in higher education is complex and multi-layered. This overview explores the evolution of thinking about internationalisation among different stakeholder groups in universities. It parses out eight coexisting layers that progress from concerns based largely upon institutional survival and competition…
The Temporal Organization of Syllabic Structure
ERIC Educational Resources Information Center
Shaw, Jason A.
2010-01-01
This dissertation develops analytical tools which enable rigorous evaluation of competing syllabic parses on the basis of temporal patterns in speech production data. The data come from the articulographic tracking of fleshpoints on target speech organs, e.g., tongue, lips, jaw, in experiments with native speakers of American English and Moroccan…
A Bootstrapped Approach to Multilingual Text Stream Parsing
ERIC Educational Resources Information Center
Londhe, Nikhil
2017-01-01
The ubiquitous hashtag has disruptively transformed how news stories are reported and shared across social media networks. Often, such text streams are massively multilingual with 50 different languages on an average and contain a combination of subjective user opinion, objective evolving information about the story and unrelated spam. This is in…
Acquisition by Processing Theory: A Theory of Everything?
ERIC Educational Resources Information Center
Carroll, Susanne E.
2004-01-01
Truscott and Sharwood Smith (henceforth T&SS) propose a novel theory of language acquisition, "Acquisition by Processing Theory" (APT), designed to account for both first and second language acquisition, monolingual and bilingual speech perception and parsing, and speech production. This is a tall order. Like any theoretically ambitious…
E-Learning for Depth in the Semantic Web
ERIC Educational Resources Information Center
Shafrir, Uri; Etkind, Masha
2006-01-01
In this paper, we describe concept parsing algorithms, a novel semantic analysis methodology at the core of a new pedagogy that focuses learners' attention on deep comprehension of the conceptual content of learned material. Two new e-learning tools are described in some detail: interactive concept discovery learning and meaning equivalence…
Neural Encoding of Relative Position
ERIC Educational Resources Information Center
Hayworth, Kenneth J.; Lescroart, Mark D.; Biederman, Irving
2011-01-01
Late ventral visual areas generally consist of cells having a significant degree of translation invariance. Such a "bag of features" representation is useful for the recognition of individual objects; however, it seems unable to explain our ability to parse a scene into multiple objects and to understand their spatial relationships. We…
Income Sustainability through Educational Attainment
ERIC Educational Resources Information Center
Carlson, Ronald H.; McChesney, Christopher S.
2015-01-01
The authors examined the sustainability of income, as it relates to educational attainment, from the two recent decades, which includes three significant economic downturns. The data was analyzed to determine trends in the wealth gap, parsed by educational attainment and gender. Utilizing the data from 1991 through 2010, predictions in changes in…
TOC as a regional sediment condition indicator: Parsing effects of grain size and organic content
TOC content of sediments is often used as an indicator of benthic condition. Percent TOC is generally positively correlated with sediment percent fines. While sediment grain size may have impacts on benthic organisms independent of organic content, it is often not explicitly co...
ERIC Educational Resources Information Center
Fernandes, Tania; Kolinsky, Regine; Ventura, Paulo
2009-01-01
This study combined artificial language learning (ALL) with conventional experimental techniques to test whether statistical speech segmentation outputs are integrated into adult listeners' mental lexicon. Lexicalization was assessed through inhibitory effects of novel neighbors (created by the parsing process) on auditory lexical decisions to…
Computer-Assisted Analysis of Written Language: Assessing the Written Language of Deaf Children, II.
ERIC Educational Resources Information Center
Parkhurst, Barbara G.; MacEachron, Marion P.
1980-01-01
Two pilot studies investigated the accuracy of a computer parsing system for analyzing written language of deaf children. Results of the studies showed good agreement between human and machine raters. Journal availability: Elsevier North Holland, Inc., 52 Vanderbilt Avenue, New York, NY 10017. (Author)
Comorbid Social Anxiety Disorder in Adults with Autism Spectrum Disorder
ERIC Educational Resources Information Center
Maddox, Brenna B.; White, Susan W.
2015-01-01
Social anxiety symptoms are common among cognitively unimpaired youth with autism spectrum disorder (ASD). Few studies have investigated the co-occurrence of social anxiety disorder (SAD) in adults with ASD, although identification may aid access to effective treatments and inform our scientific efforts to parse heterogeneity. In this preliminary…
Parsing the Relations of Race and Socioeconomic Status in Special Education Disproportionality
ERIC Educational Resources Information Center
Kincaid, Aleksis P.; Sullivan, Amanda L.
2017-01-01
This study investigated how student and school-level socioeconomic status (SES) measures predict students' odds of being identified for special education, particularly high-incidence disabilities. Using the Early Childhood Longitudinal Study--Kindergarten cohort, hierarchical models were used to determine the relations of student and school SES to…
DOE Office of Scientific and Technical Information (OSTI.GOV)
BERG, MICHAEL; RILEY, MARSHALL
System assessments typically yield large quantities of data from disparate sources for an analyst to scrutinize for issues. Netmeld is used to parse input from different file formats, store the data in a common format, allow users to easily query it, and enable analysts to tie different analysis tools together using a common back-end.
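The parse-store-query pattern Netmeld implements can be illustrated in miniature: parse a tool's text output into a common SQLite store, then query it. This sketch invents a trivial "host:port service" report format purely for illustration; Netmeld's own parsers and schema differ:

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE open_ports (host TEXT, port INTEGER, service TEXT)")

def parse_scan_output(text):
    """Parse lines of the assumed form 'host:port service' from a report."""
    for m in re.finditer(r"^(\S+):(\d+)\s+(\S+)$", text, re.MULTILINE):
        yield m.group(1), int(m.group(2)), m.group(3)

report = """10.0.0.5:22 ssh
10.0.0.5:80 http
10.0.0.9:23 telnet"""

conn.executemany("INSERT INTO open_ports VALUES (?, ?, ?)",
                 parse_scan_output(report))

# Analysts can now query the common store regardless of which tool
# produced the data:
for row in conn.execute("SELECT host, port FROM open_ports WHERE service = 'telnet'"):
    print(row)  # ('10.0.0.9', 23)
```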
A Multiple-Channel Model of Task-Dependent Ambiguity Resolution in Sentence Comprehension
ERIC Educational Resources Information Center
Logacev, Pavel; Vasishth, Shravan
2016-01-01
Traxler, Pickering, and Clifton (1998) found that ambiguous sentences are read faster than their unambiguous counterparts. This so-called "ambiguity advantage" has presented a major challenge to classical theories of human sentence comprehension (parsing) because its most prominent explanation, in the form of the unrestricted race model…
Conversational Coherency. Technical Report No. 95.
ERIC Educational Resources Information Center
Reichman, Rachel
To analyze the process involved in maintaining conversational coherency, the study described in this paper used a construct called a "context space" that grouped utterances referring to a single issue or episode. The paper defines the types of context spaces, parses individual conversations to identify the underlying model or structure,…
A Flexible, Extensible Online Testing System for Mathematics
ERIC Educational Resources Information Center
Passmore, Tim; Brookshaw, Leigh; Butler, Harry
2011-01-01
An online testing system developed for entry-skills testing of first-year university students in algebra and calculus is described. The system combines the open-source computer algebra system "Maxima" with computer scripts to parse student answers, which are entered using standard mathematical notation and conventions. The answers can…
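The parse-and-compare step can be reproduced with an open-source computer algebra system; this sketch uses SymPy rather than Maxima, so it illustrates the idea rather than the system's implementation. It accepts any student answer symbolically equivalent to the reference:

```python
from sympy import simplify, symbols
from sympy.parsing.sympy_parser import (parse_expr, standard_transformations,
                                        implicit_multiplication_application)

TRANSFORMS = standard_transformations + (implicit_multiplication_application,)

def answers_match(student, reference):
    """Accept a student answer if it is symbolically equivalent to the
    reference, so '2x + x' matches '3*x' regardless of surface form."""
    x = symbols("x")  # extend with whatever variables the question allows
    s = parse_expr(student, transformations=TRANSFORMS, local_dict={"x": x})
    r = parse_expr(reference, transformations=TRANSFORMS, local_dict={"x": x})
    return simplify(s - r) == 0

print(answers_match("2x + x", "3*x"))            # True
print(answers_match("x**2 - 1", "(x-1)*(x+1)"))  # True
print(answers_match("x + 1", "x - 1"))           # False
```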
Modelling Parsing Constraints with High-Dimensional Context Space.
ERIC Educational Resources Information Center
Burgess, Curt; Lund, Kevin
1997-01-01
Presents a model of high-dimensional context space, the Hyperspace Analogue to Language (HAL), with a series of simulations modelling human empirical results. Proposes that HAL's context space can be used to provide a basic categorization of semantic and grammatical concepts; model certain aspects of morphological ambiguity in verbs; and provide…
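HAL builds word vectors from ramped co-occurrence counts in a sliding window, with nearer neighbours weighted more heavily. A minimal sketch of that construction (the window size and corpus are toy choices, not the article's parameters):

```python
from collections import defaultdict

def hal_vectors(tokens, window=5):
    """Build HAL-style co-occurrence vectors: each word's vector records
    ramped counts of the words preceding it within the window, with
    weight = window - distance + 1 so nearer neighbours count more."""
    vectors = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        for d in range(1, window + 1):
            j = i - d
            if j < 0:
                break
            vectors[word][tokens[j]] += window - d + 1
    return vectors

text = "the quick brown fox jumps over the lazy dog".split()
v = hal_vectors(text, window=3)
print(dict(v["fox"]))  # {'brown': 3, 'quick': 2, 'the': 1}
```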
Event Segmentation Improves Event Memory up to One Month Later
ERIC Educational Resources Information Center
Flores, Shaney; Bailey, Heather R.; Eisenberg, Michelle L.; Zacks, Jeffrey M.
2017-01-01
When people observe everyday activity, they spontaneously parse it into discrete meaningful events. Individuals who segment activity in a more normative fashion show better subsequent memory for the events. If segmenting events effectively leads to better memory, does asking people to attend to segmentation improve subsequent memory? To answer…
The Neural Basis of Speech Parsing in Children and Adults
ERIC Educational Resources Information Center
McNealy, Kristin; Mazziotta, John C.; Dapretto, Mirella
2010-01-01
Word segmentation, detecting word boundaries in continuous speech, is a fundamental aspect of language learning that can occur solely by the computation of statistical and speech cues. Fifty-four children underwent functional magnetic resonance imaging (fMRI) while listening to three streams of concatenated syllables that contained either high…
E-Learning Systems Requirements Elicitation: Perspectives and Considerations
ERIC Educational Resources Information Center
AlKhuder, Shaikha B.; AlAli, Fatma H.
2017-01-01
Training and education have evolved far beyond black boards and chalk boxes. The environment of knowledge exchange requires more than simple materials and assessments. This article is an attempt of parsing through the different aspects of e-learning, understanding the real needs, and conducting the right requirements to build the appropriate…
Perceiving Goals and Actions in Individuals with Autism Spectrum Disorders
ERIC Educational Resources Information Center
Zalla, Tiziana; Labruyère, Nelly; Georgieff, Nicolas
2013-01-01
In the present study, we investigated the ability to parse familiar sequences of action into meaningful events in young individuals with autism spectrum disorders (ASDs), as compared to young individuals with typical development (TD) and young individuals with moderate mental retardation or learning disabilities (MLDs). While viewing two…
SUBTLE: Situation Understanding Bot through Language and Environment
2016-01-06
a 4-day "hackathon" by Stuart Young's small robots group which successfully ported the SUBTLE MURI NLP robot interface to the Packbot platform they...null element restoration, a step typically ignored in NLP systems, allows for correct parsing of imperatives and questions, critical structures
Perez-Riverol, Yasset; Wang, Rui; Hermjakob, Henning; Müller, Markus; Vesada, Vladimir; Vizcaíno, Juan Antonio
2014-01-01
Data processing, management and visualization are central and critical components of a state of the art high-throughput mass spectrometry (MS)-based proteomics experiment, and are often some of the most time-consuming steps, especially for labs without much bioinformatics support. The growing interest in the field of proteomics has triggered an increase in the development of new software libraries, including freely available and open-source software. From database search analysis to post-processing of the identification results, even though the objectives of these libraries and packages can vary significantly, they usually share a number of features. Common use cases include the handling of protein and peptide sequences, the parsing of results from various proteomics search engines output files, and the visualization of MS-related information (including mass spectra and chromatograms). In this review, we provide an overview of the existing software libraries, open-source frameworks and also, we give information on some of the freely available applications which make use of them. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan. PMID:23467006
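As a concrete example of the kind of open-source library the review surveys, reading spectra from an mzML file with the Python package pyteomics takes only a few lines. The file name below is a placeholder, and pyteomics is offered as one such library rather than the review's only subject:

```python
# Requires: pip install pyteomics
from pyteomics import mzml

with mzml.read("experiment.mzML") as reader:
    for spectrum in reader:
        if spectrum.get("ms level") == 2:       # MS/MS scans only
            mz = spectrum["m/z array"]          # numpy array of m/z values
            intensity = spectrum["intensity array"]
            print(spectrum["id"], len(mz), intensity.max())
            break
```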
[Design of visualized medical images network and web platform based on MeVisLab].
Xiang, Jun; Ye, Qing; Yuan, Xun
2017-04-01
With the development trend of "Internet +", further requirements for the mobility of medical images have arisen in the medical field. In view of this demand, this paper presents a web-based visual medical imaging platform. First, the feasibility of and technical points for web-based medical imaging are analyzed. CT (computed tomography) or MRI (magnetic resonance imaging) images are reconstructed in three dimensions by MeVisLab and packaged as X3D (Extensible 3D Graphics) files, as shown in the present paper. Then, a B/S (browser/server) system specially designed for 3D images is built using HTML5 and a WebGL rendering engine library, and the X3D image files are parsed and rendered by the system. The results of this study show that the platform suits multiple operating systems, realizing cross-platform access and mobilization of medical image data. The paper also discusses the further development of medical imaging platforms, noting that web application technology will not only promote the sharing of medical image data but also facilitate image-based remote medical consultations and distance learning.
Omen: identifying potential spear-phishing targets before the email is sent.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wendt, Jeremy Daniel.
2013-07-01
We present the results of a two-year project focused on a common social engineering attack method called "spear phishing." In a spear phishing attack, the user receives an email with information specifically focused on the user. This email contains either a malware-laced attachment or a link to download malware that has been disguised as a useful program. Spear phishing attacks have been one of the most effective avenues for attackers to gain initial entry into a target network. This project focused on a proactive approach to spear phishing. To create an effective, user-specific spear phishing email, the attacker must research the intended recipient. We believe that much of the information used by the attacker is provided by the target organization's own external website. Thus, when researching potential targets, the attacker leaves signs of his research in the webserver's logs. We created tools and visualizations to improve cybersecurity analysts' abilities to quickly understand a visitor's visit patterns and interests. Given these suspicious visitors and log-parsing tools, analysts can more quickly identify truly suspicious visitors, search for potential spear-phishing targets, and improve security around those users before the spear phishing email is sent.
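The log analysis starts from parsing raw webserver entries and grouping requests by visitor. A sketch over Apache Combined Log Format lines; the regular expression covers the common layout, while the project's actual tools are more elaborate:

```python
import re
from collections import defaultdict

# Combined Log Format:
# host ident user [time] "request" status bytes "referer" "agent"
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

def visits_by_host(log_lines):
    """Group requested paths by visiting host: the raw material for spotting
    a visitor who is systematically enumerating staff or project pages."""
    visits = defaultdict(list)
    for line in log_lines:
        m = LOG_RE.match(line)
        if m:
            visits[m.group("host")].append((m.group("time"), m.group("path")))
    return visits

sample = ['203.0.113.7 - - [10/Jul/2013:15:32:01 -0600] '
          '"GET /staff/jdoe.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"']
for host, pages in visits_by_host(sample).items():
    print(host, len(pages), pages[0][1])  # 203.0.113.7 1 /staff/jdoe.html
```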
MASPECTRAS: a platform for management and analysis of proteomics LC-MS/MS data
Hartler, Jürgen; Thallinger, Gerhard G; Stocker, Gernot; Sturn, Alexander; Burkard, Thomas R; Körner, Erik; Rader, Robert; Schmidt, Andreas; Mechtler, Karl; Trajanoski, Zlatko
2007-01-01
Background The advancements of proteomics technologies have led to a rapid increase in the number, size and rate at which datasets are generated. Managing and extracting valuable information from such datasets requires the use of data management platforms and computational approaches. Results We have developed the MAss SPECTRometry Analysis System (MASPECTRAS), a platform for management and analysis of proteomics LC-MS/MS data. MASPECTRAS is based on the Proteome Experimental Data Repository (PEDRo) relational database schema and follows the guidelines of the Proteomics Standards Initiative (PSI). Analysis modules include: 1) import and parsing of the results from the search engines SEQUEST, Mascot, Spectrum Mill, X! Tandem, and OMSSA; 2) peptide validation; 3) clustering of proteins based on Markov Clustering and multiple alignments; and 4) quantification using the Automated Statistical Analysis of Protein Abundance Ratios algorithm (ASAPRatio). The system provides customizable data retrieval and visualization tools, as well as export to the PRoteomics IDEntifications (PRIDE) public repository. MASPECTRAS is freely available online. Conclusion Given the unique features and the flexibility afforded by the use of standard software technology, our platform represents a significant advance and could be of great interest to the proteomics community. PMID:17567892
Bioinspired engineering of exploration systems for NASA and DoD
NASA Technical Reports Server (NTRS)
Thakoor, Sarita; Chahl, Javaan; Srinivasan, M. V.; Young, L.; Werblin, Frank; Hine, Butler; Zornetzer, Steven
2002-01-01
A new approach called bioinspired engineering of exploration systems (BEES) and its value for solving pressing NASA and DoD needs are described. Insects (for example honeybees and dragonflies) cope remarkably well with their world, despite possessing a brain containing less than 0.01% as many neurons as the human brain. Although most insects have immobile eyes with fixed focus optics and lack stereo vision, they use a number of ingenious, computationally simple strategies for perceiving their world in three dimensions and navigating successfully within it. We are distilling selected insect-inspired strategies to obtain novel solutions for navigation, hazard avoidance, altitude hold, stable flight, terrain following, and gentle deployment of payload. Such functionality provides potential solutions for future autonomous robotic space and planetary explorers. A BEES approach to developing lightweight low-power autonomous flight systems should be useful for flight control of such biomorphic flyers for both NASA and DoD needs. Recent biological studies of mammalian retinas confirm that representations of multiple features of the visual world are systematically parsed and processed in parallel. Features are mapped to a stack of cellular strata within the retina. Each of these representations can be efficiently modeled in semiconductor cellular nonlinear network (CNN) chips. We describe recent breakthroughs in exploring the feasibility of the unique blending of insect strategies of navigation with mammalian visual search, pattern recognition, and image understanding into hybrid biomorphic flyers for future planetary and terrestrial applications. We describe a few future mission scenarios for Mars exploration, uniquely enabled by these newly developed biomorphic flyers.
Event Structure and Cognitive Control
ERIC Educational Resources Information Center
Reimer, Jason F.; Radvansky, Gabriel A.; Lorsbach, Thomas C.; Armendarez, Joseph J.
2015-01-01
Recently, a great deal of research has demonstrated that although everyday experience is continuous in nature, it is parsed into separate events. The aim of the present study was to examine whether event structure can influence the effectiveness of cognitive control. Across 5 experiments we varied the structure of events within the AX-CPT by…
An Activation-Based Model of Sentence Processing as Skilled Memory Retrieval
ERIC Educational Resources Information Center
Lewis, Richard L.; Vasishth, Shravan
2005-01-01
We present a detailed process theory of the moment-by-moment working-memory retrievals and associated control structure that subserve sentence comprehension. The theory is derived from the application of independently motivated principles of memory and cognitive skill to the specialized task of sentence parsing. The resulting theory construes…
Toward a Dynamic, Multidimensional Research Framework for Strategic Processing
ERIC Educational Resources Information Center
Dinsmore, Daniel L.
2017-01-01
While the empirical literature on strategic processing is vast, understanding how and why certain strategies work for certain learners is far from clear. The purpose of this review is to systematically examine the theoretical and empirical literature on strategic process to parse out current conceptual and methodological progress to inform new…
Is Desegregation Dead? Parsing the Relationship between Achievement and Demographics
ERIC Educational Resources Information Center
Eaton, Susan; Rivkin, Steven
2010-01-01
The Supreme Court declared in 1954 that "separate educational facilities are inherently unequal." Into the 1970s, urban education reform focused predominantly on making sure that African American students had the opportunity to attend school with their white peers. Now, however, most reformers take as a given that the typical low-income minority…
Context Modulates Attention to Social Scenes in Toddlers with Autism
ERIC Educational Resources Information Center
Chawarska, Katarzyna; Macari, Suzanne; Shic, Frederick
2012-01-01
Background: In typical development, the unfolding of social and communicative skills hinges upon the ability to allocate and sustain attention toward people, a skill present moments after birth. Deficits in social attention have been well documented in autism, though the underlying mechanisms are poorly understood. Methods: In order to parse the…
Reanalysis of Clause Boundaries in Japanese as a Constraint-Driven Process.
ERIC Educational Resources Information Center
Miyamoto, Edson T.
2003-01-01
Reports on two experiments focusing on clause boundaries in Japanese which suggest that a minimal change restriction is unnecessary to characterize reanalysis. Proposes that the data and previous observations are more naturally explained by a constraint-driven model in which revisions are performed only when required by parsing constraints.…
ERIC Educational Resources Information Center
Coello-Coutino, Gerardo; Ainsworth, Shirley; Escalante-Gonzalbo, Ana Marie
2002-01-01
Describes Hermes, a research tool that uses specially designed acquisition, parsing and presentation methods to integrate information resources on the Internet, from searching in disparate bibliographic databases, to accessing full text articles online, and developing a web of information associated with each reference via one common interface.…
ERIC Educational Resources Information Center
Witzel, Jeffrey; Witzel, Naoko; Nicol, Janet
2012-01-01
This study examines the reading patterns of native speakers (NSs) and high-level (Chinese) nonnative speakers (NNSs) on three English sentence types involving temporarily ambiguous structural configurations. The reading patterns on each sentence type indicate that both NSs and NNSs were biased toward specific structural interpretations. These…
Large Constituent Families Help Children Parse Compounds
ERIC Educational Resources Information Center
Krott, Andrea; Nicoladis, Elena
2005-01-01
The family size of the constituents of compound words, or the number of compounds sharing the constituents, has been shown to affect adults' access to compound words in the mental lexicon. The present study was designed to see if family size would affect children's segmentation of compounds. Twenty-five English-speaking children between 3;7 and…
Neural Responses to the Production and Comprehension of Syntax in Identical Utterances
ERIC Educational Resources Information Center
Indefrey, Peter; Hellwig, Frauke; Herzog, Hans; Seitz, Rudiger J.; Hagoort, Peter
2004-01-01
Following up on an earlier positron emission tomography (PET) experiment (Indefrey et al., 2001), we used a scene description paradigm to investigate whether a posterior inferior frontal region subserving syntactic encoding for speaking is also involved in syntactic parsing during listening. In the language production part of the experiment,…
ERIC Educational Resources Information Center
Pappamihiel, N. Eleni; Lynn, C. Allen
2016-01-01
While many teachers and teacher educators in the United States K-12 system acknowledge that the English language learners (ELLs) in our schools need modifications and accommodations to help them succeed in school, few attempt to parse out how different types of accommodations may affect learning in the mainstream classroom, specifically linguistic…
ERIC Educational Resources Information Center
Fox, Allison M.; Reid, Corinne L.; Anderson, Mike; Richardson, Cassandra; Bishop, Dorothy V. M.
2012-01-01
According to the rapid auditory processing theory, the ability to parse incoming auditory information underpins learning of oral and written language. There is wide variation in this low-level perceptual ability, which appears to follow a protracted developmental course. We studied the development of rapid auditory processing using event-related…
How Teachers Teach: Mapping the Terrain of Practice
ERIC Educational Resources Information Center
Sykes, Gary; Wilson, Suzanne
2015-01-01
This paper--conceived as a framework for competencies in teaching--represents an interpretive synthesis by the authors of main and contemporary currents in the research on teaching and learning. The framework resulting from this review parses teaching into two main domains--instruction and role responsibilities--within each of which a set of broad…
Graphemic Cohesion Effect in Reading and Writing Complex Graphemes
ERIC Educational Resources Information Center
Spinelli, Elsa; Kandel, Sonia; Guerassimovitch, Helena; Ferrand, Ludovic
2012-01-01
"AU" /o/ and "AN" /a/ in French are both complex graphemes, but they vary in their strength of association to their respective sounds. The letter sequence "AU" is systematically associated to the phoneme /o/, and as such is always parsed as a complex grapheme. However, "AN" can be associated with either one…
ERIC Educational Resources Information Center
Maxfield, Nathan D.; Lyon, Justine M.; Silliman, Elaine R.
2009-01-01
Bailey and Ferreira (2003) hypothesized and reported behavioral evidence that disfluencies (filled and silent pauses) undesirably affect sentence processing when they appear before disambiguating verbs in Garden Path (GP) sentences. Disfluencies here cause the parser to "linger" on, and apparently accept as correct, an erroneous parse. Critically,…
ERIC Educational Resources Information Center
Wall, C. Edward; And Others
1995-01-01
Discusses the integration of Standard Generalized Markup Language (SGML), Hypertext Markup Language (HTML), and the MARC format to parse classified analytical bibliographies. Use of the resulting electronic knowledge constructs in local library systems as maps of a specified subset of resources is discussed, and an example is included. (LRW)
Graduation Rates: Real Kids, Real Numbers
ERIC Educational Resources Information Center
Swanson, Christopher B.
2004-01-01
Controversies over graduation rates and No Child Left Behind have raged in research, media and political circles for almost a year. All too often, though, when complex issues of social and economic importance collide with policy and politics, heat is generated but little light. As a result, it may be difficult for local educators to parse the…
Allocation of Limited Cognitive Resources during Text Comprehension in a Second Language
ERIC Educational Resources Information Center
Morishima, Yasunori
2013-01-01
For native (L1) comprehenders, lower-level language processes such as lexical access and parsing are considered to consume few cognitive resources. In contrast, these processes pose considerable demands for second-language (L2) comprehenders. Two reading-time experiments employing inconsistency detection found that English learners did not detect…
Using Artificial Intelligence To Teach English to Deaf People. Final Report.
ERIC Educational Resources Information Center
Loritz, Donald; Zambrano, Robert
This report describes a project to develop an English grammar-checking word processor intended for use by college students with hearing impairments. The project succeeded in its first objective, achievement of 92 percent parsing accuracy across the freely written compositions of college-bound deaf students. The second objective, ability to use the…
On the Early Left-Anterior Negativity (ELAN) in Syntax Studies
ERIC Educational Resources Information Center
Steinhauer, Karsten; Drury, John E.
2012-01-01
Within the framework of Friederici's (2002) neurocognitive model of sentence processing, the early left anterior negativity (ELAN) in event-related potentials (ERPs) has been claimed to be a brain marker of syntactic first-pass parsing. As ELAN components seem to be exclusively elicited by word category violations (phrase structure violations),…
Change of Academic Major: The Influence of Broad and Narrow Personality Traits
ERIC Educational Resources Information Center
Foster, N. A.
2017-01-01
The relationship between academic major change and ten personality traits (the Big Five and five narrow traits), was investigated in a sample of 437 college undergraduates. Contrary to expectations, Career Decidedness and Optimism were positively related to academic major change, regardless of class ranking. When parsing data by college year,…
Immediate use of prosody and context in predicting a syntactic structure.
Nakamura, Chie; Arai, Manabu; Mazuka, Reiko
2012-11-01
Numerous studies have reported an effect of prosodic information on parsing, but whether prosody can impact even the initial parsing decision is still not evident. In a visual-world eye-tracking experiment, we investigated the influence of contrastive intonation and visual context on processing temporarily ambiguous relative clause sentences in Japanese. Our results showed that listeners used the prosodic cue to make a structural prediction before hearing disambiguating information. Importantly, the effect was limited to cases where the visual scene provided an appropriate context for the prosodic cue, thus eliminating the explanation that listeners had simply associated marked prosodic information with a less frequent structure. Furthermore, the influence of the prosodic information was also evident following disambiguating information, in a way that reflected the initial analysis. The current study demonstrates that prosody, when provided with an appropriate context, influences the initial syntactic analysis and also the subsequent cost at disambiguating information. The results also provide the first evidence for pre-head structural prediction driven by prosodic and contextual information with a head-final construction. Copyright © 2012 Elsevier B.V. All rights reserved.
What is it that lingers? Garden-path (mis)interpretations in younger and older adults.
Malyutina, Svetlana; den Ouden, Dirk-Bart
2016-01-01
Previous research has shown that comprehenders do not always conduct a full (re)analysis of temporarily ambiguous "garden-path" sentences. The present study used a sentence-picture matching task to investigate what kind of representations are formed when full reanalysis is not performed: Do comprehenders "blend" two incompatible representations as a result of shallow syntactic processing or do they erroneously maintain the initial incorrect parsing without incorporating new information, and does this vary with age? Twenty-five younger and 15 older adults performed a multiple-choice sentence-picture matching task with stimuli including early-closure garden-path sentences. The results suggest that the type of erroneous representation is affected by linguistic variables, such as sentence structure, verb type, and semantic plausibility, as well as by age. Older adults' response patterns indicate an increased reliance on inferencing based on lexical and semantic cues, with a lower bar for accepting an initial parse and with a weaker drive to reanalyse a syntactic representation. Among younger adults, there was a tendency to blend two representations into a single interpretation, even if this was not licensed by the syntax.
Patson, Nikole D; Ferreira, Fernanda
2009-05-01
In three eyetracking studies, we investigated the role of conceptual plurality in initial parsing decisions in temporarily ambiguous sentences with reciprocal verbs (e.g., While the lovers kissed the baby played alone). We varied the subject of the first clause using three types of plural noun phrases: conjoined noun phrases (the bride and the groom), plural definite descriptions (the lovers), and numerically quantified noun phrases (the two lovers). We found no evidence for garden-path effects when the subject was conjoined (Ferreira & McClure, 1997), but traditional garden-path effects were found with the other plural noun phrases. In addition, we tested plural anaphors that had a plural antecedent present in the discourse. We found that when the antecedent was conjoined, garden-path effects were absent compared to cases in which the antecedent was a plural definite description. Our results indicate that the parser is sensitive to the conceptual representation of a plural constituent. In particular, it appears that a Complex Reference Object (Moxey et al., 2004) automatically activates a reciprocal reading of a reciprocal verb.
A fusion network for semantic segmentation using RGB-D data
NASA Astrophysics Data System (ADS)
Yuan, Jiahui; Zhang, Kun; Xia, Yifan; Qi, Lin; Dong, Junyu
2018-04-01
Semantic scene parsing matters in many intelligent systems, including perceptual robotics. For the past few years, pixel-wise prediction tasks such as semantic segmentation of RGB images have been extensively studied and have reached remarkable parsing accuracy, thanks to convolutional neural networks (CNNs) and large scene datasets. With the development of stereo cameras and RGB-D sensors, it is expected that additional depth information will help improve accuracy. In this paper, we propose a semantic segmentation framework incorporating RGB and complementary depth information. Motivated by the success of fully convolutional networks (FCN) in semantic segmentation, we design a fully convolutional network consisting of two branches that extract features from both RGB and depth data simultaneously and fuse them as the network goes deeper. Instead of aggregating multiple models, our goal is to utilize RGB data and depth data more effectively in a single model. We evaluate our approach on the NYU-Depth V2 dataset, which consists of 1449 cluttered indoor scenes, and achieve results competitive with state-of-the-art methods.
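To make the two-branch fusion idea concrete, here is a minimal PyTorch-style sketch. The layer widths, number of stages, and single concatenation point are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal sketch of a two-branch RGB-D fusion network (illustrative only;
# layer widths and the fusion point are assumptions, not the paper's design).
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions followed by 2x downsampling.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

class FusionFCN(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.rgb1, self.rgb2 = conv_block(3, 64), conv_block(64, 128)
        self.d1, self.d2 = conv_block(1, 64), conv_block(64, 128)
        self.fused = conv_block(256, 256)      # 128 RGB + 128 depth channels
        self.classifier = nn.Conv2d(256, num_classes, 1)
        # Three pooling stages shrink the map 8x; upsample back to input size.
        self.up = nn.Upsample(scale_factor=8, mode='bilinear', align_corners=False)

    def forward(self, rgb, depth):
        r = self.rgb2(self.rgb1(rgb))              # RGB branch features
        d = self.d2(self.d1(depth))                # depth branch features
        x = self.fused(torch.cat([r, d], dim=1))   # fuse as the net goes deeper
        return self.up(self.classifier(x))         # per-pixel class scores

scores = FusionFCN(num_classes=40)(torch.randn(1, 3, 240, 320),
                                   torch.randn(1, 1, 240, 320))
print(scores.shape)  # torch.Size([1, 40, 240, 320])
```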
Language experience changes subsequent learning
Onnis, Luca; Thiessen, Erik
2013-01-01
What are the effects of experience on subsequent learning? We explored the effects of language-specific word order knowledge on the acquisition of sequential conditional information. Korean and English adults were engaged in a sequence learning task involving three different sets of stimuli: auditory linguistic (nonsense syllables), visual non-linguistic (nonsense shapes), and auditory non-linguistic (pure tones). The forward and backward probabilities between adjacent elements generated two equally probable and orthogonal perceptual parses of the elements, such that any significant preference at test must be due to either general cognitive biases, or prior language-induced biases. We found that language modulated parsing preferences with the linguistic stimuli only. Intriguingly, these preferences are congruent with the dominant word order patterns of each language, as corroborated by corpus analyses, and are driven by probabilistic preferences. Furthermore, although the Korean individuals had received extensive formal explicit training in English and lived in an English-speaking environment, they exhibited statistical learning biases congruent with their native language. Our findings suggest that mechanisms of statistical sequential learning are implicated in language across the lifespan, and experience with language may affect cognitive processes and later learning. PMID:23200510
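The forward and backward statistics referred to above have standard definitions: for adjacent elements A then B, the forward transitional probability is P(B|A) and the backward transitional probability is P(A|B). A small Python sketch, over an invented syllable stream, shows how such pair statistics are computed:

```python
# Forward and backward transitional probabilities over adjacent pairs.
# The syllable stream here is invented for illustration.
from collections import Counter

stream = ["pa", "bi", "ku", "pa", "bi", "go", "la", "bi", "ku"]
pairs = list(zip(stream, stream[1:]))
pair_counts = Counter(pairs)
first_counts = Counter(a for a, _ in pairs)
second_counts = Counter(b for _, b in pairs)

def forward_tp(a, b):
    # P(b | a): how predictive the first element is of the second.
    return pair_counts[(a, b)] / first_counts[a]

def backward_tp(a, b):
    # P(a | b): how predictive the second element is of the first.
    return pair_counts[(a, b)] / second_counts[b]

print(forward_tp("pa", "bi"), backward_tp("pa", "bi"))  # 1.0 0.666...
```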
Two models of minimalist, incremental syntactic analysis.
Stabler, Edward P
2013-07-01
Minimalist grammars (MGs) and multiple context-free grammars (MCFGs) are weakly equivalent in the sense that they define the same languages, a large mildly context-sensitive class that properly includes context-free languages. But in addition, for each MG, there is an MCFG which is strongly equivalent in the sense that it defines the same language with isomorphic derivations. However, the structure-building rules of MGs but not MCFGs are defined in a way that generalizes across categories. Consequently, MGs can be exponentially more succinct than their MCFG equivalents, and this difference shows in parsing models too. An incremental, top-down beam parser for MGs is defined here, sound and complete for all MGs, and hence also capable of parsing all MCFG languages. But since the parser represents its grammar transparently, the relative succinctness of MGs is again evident. Although the determinants of MG structure are narrowly and discretely defined, probabilistic influences from a much broader domain can influence even the earliest analytic steps, allowing frequency and context effects to come early and from almost anywhere, as expected in incremental models. Copyright © 2013 Cognitive Science Society, Inc.
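As a rough illustration of incremental, top-down beam parsing, here is a toy Python sketch over a small probabilistic context-free grammar. It is a simplification for intuition only: Stabler's parser operates on minimalist grammars, not PCFGs, and this toy simply returns the probability of the first complete parse it finds within the beam.

```python
# Toy top-down beam parser for a small PCFG (not Stabler's MG parser).
import math, heapq

# lhs -> list of (rhs, probability); lowercase strings are terminals.
RULES = {
    "S":  [(["NP", "VP"], 1.0)],
    "NP": [(["the", "N"], 0.8), (["the", "N", "PP"], 0.2)],
    "VP": [(["V", "NP"], 0.9), (["V"], 0.1)],
    "PP": [(["P", "NP"], 1.0)],
    "N":  [(["dog"], 0.5), (["park"], 0.5)],
    "V":  [(["saw"], 1.0)],
    "P":  [(["in"], 1.0)],
}

def beam_parse(words, beam_width=10):
    # Each state: (neg log prob, input position, stack of predictions).
    beam = [(0.0, 0, ("S",))]
    while beam:
        expanded = []
        for cost, i, stack in beam:
            if not stack:                       # predictions exhausted
                if i == len(words):
                    return math.exp(-cost)      # probability of this parse
                continue
            top, rest = stack[0], stack[1:]
            if top in RULES:                    # expand leftmost nonterminal
                for rhs, p in RULES[top]:
                    expanded.append((cost - math.log(p), i, tuple(rhs) + rest))
            elif i < len(words) and top == words[i]:
                expanded.append((cost, i + 1, rest))   # scan a terminal
        beam = heapq.nsmallest(beam_width, expanded)   # prune to the beam
    return None

print(beam_parse("the dog saw the park".split()))  # ~0.144
```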
A resource-saving collective approach to biomedical semantic role labeling
2014-01-01
Background: Biomedical semantic role labeling (BioSRL) is a natural language processing technique that identifies the semantic roles of the words or phrases in sentences describing biological processes and expresses them as predicate-argument structures (PASs). Currently, a major problem of BioSRL is that most systems label every node in a full parse tree independently; however, some nodes always exhibit dependency. In general SRL, collective approaches based on the Markov logic network (MLN) model have been successful in dealing with this problem. However, in BioSRL such an approach has not been attempted because it would require more training data to recognize the more specialized and diverse terms found in biomedical literature, increasing training time and computational complexity. Results: We first constructed a collective BioSRL system based on MLN. This system, called collective BIOSMILE (CBIOSMILE), is trained on the BioProp corpus. To reduce the resources used in BioSRL training, we employ a tree-pruning filter to remove unlikely nodes from the parse tree and four argument candidate identifiers to retain candidate nodes in the tree. Nodes not recognized by any candidate identifier are discarded. The pruned annotated parse trees are used to train a resource-saving MLN-based system, which is referred to as resource-saving collective BIOSMILE (RCBIOSMILE). Our experimental results show that our proposed CBIOSMILE system outperforms BIOSMILE, which is the top BioSRL system. Furthermore, our proposed RCBIOSMILE maintains the same level of accuracy as CBIOSMILE using 92% less memory and 57% less training time. Conclusions: This greatly improved efficiency makes RCBIOSMILE potentially suitable for training on much larger BioSRL corpora over more biomedical domains. Compared to real-world biomedical corpora, BioProp is relatively small, containing only 445 MEDLINE abstracts and 30 event triggers. It is not large enough for practical applications, such as pathway construction. We consider it of primary importance to pursue SRL training on large corpora in the future. PMID:24884358
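The pruning idea, dropping parse-tree nodes that no candidate identifier retains, can be sketched generically. The Node class and the two toy identifier rules below are invented for illustration and are not the CBIOSMILE implementation.

```python
# Sketch of candidate-driven parse-tree pruning (illustrative; the Node
# class and identifier rules are invented, not CBIOSMILE's code).
class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []

def prune(node, identifiers):
    # Keep a node if any candidate identifier retains it, or if any of
    # its descendants survives pruning; otherwise discard the subtree.
    kept = [c for c in (prune(c, identifiers) for c in node.children) if c]
    if kept or any(ident(node) for ident in identifiers):
        return Node(node.label, kept)
    return None  # no identifier recognized this subtree

# Two toy candidate identifiers: retain noun phrases and verb nodes.
identifiers = [lambda n: n.label == "NP", lambda n: n.label.startswith("VB")]
tree = Node("S", [Node("NP", [Node("DT"), Node("NN")]),
                  Node("VP", [Node("VBZ"), Node("ADVP", [Node("RB")])])])
pruned = prune(tree, identifiers)  # keeps S -> NP, VP -> VBZ; drops the rest
```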
Towards a unified theory of neocortex: laminar cortical circuits for vision and cognition.
Grossberg, Stephen
2007-01-01
A key goal of computational neuroscience is to link brain mechanisms to behavioral functions. The present article describes recent progress towards explaining how laminar neocortical circuits give rise to biological intelligence. These circuits embody two new and revolutionary computational paradigms: Complementary Computing and Laminar Computing. Circuit properties include a novel synthesis of feedforward and feedback processing, of digital and analog processing, and of preattentive and attentive processing. This synthesis clarifies the appeal of Bayesian approaches but has a far greater predictive range that naturally extends to self-organizing processes. Examples from vision and cognition are summarized. A LAMINART architecture unifies properties of visual development, learning, perceptual grouping, attention, and 3D vision. A key modeling theme is that the mechanisms which enable development and learning to occur in a stable way imply properties of adult behavior. It is noted how higher-order attentional constraints can influence multiple cortical regions, and how spatial and object attention work together to learn view-invariant object categories. In particular, a form-fitting spatial attentional shroud can allow an emerging view-invariant object category to remain active while multiple view categories are associated with it during sequences of saccadic eye movements. Finally, the chapter summarizes recent work on the LIST PARSE model of cognitive information processing by the laminar circuits of prefrontal cortex. LIST PARSE models the short-term storage of event sequences in working memory, their unitization through learning into sequence, or list, chunks, and their read-out in planned sequential performance that is under volitional control. LIST PARSE provides a laminar embodiment of Item and Order working memories, also called Competitive Queuing models, that have been supported by both psychophysical and neurobiological data. These examples show how variations of a common laminar cortical design can embody properties of visual and cognitive intelligence that seem, at least on the surface, to be mechanistically unrelated.
The Lucy Calkins Project: Parsing a Self-Proclaimed Literacy Guru
ERIC Educational Resources Information Center
Feinberg, Barbara
2007-01-01
This article discusses the work of Lucy McCormick Calkins, an educator and the visionary founding director of Teachers College Reading and Writing Project. Begun in 1981, the think tank and teacher training institute has since trained hundreds of thousands of educators across the country. Calkins is one of the original architects of the…
Working Memory Effects in the L2 Processing of Ambiguous Relative Clauses
ERIC Educational Resources Information Center
Hopp, Holger
2014-01-01
This article investigates whether and how L2 sentence processing is affected by memory constraints that force serial parsing. Monitoring eye movements, we test effects of working memory on L2 relative-clause attachment preferences in a sample of 75 late-adult German learners of English and 25 native English controls. Mixed linear regression…
ERIC Educational Resources Information Center
Smith, David Arthur
2010-01-01
Much recent work in natural language processing treats linguistic analysis as an inference problem over graphs. This development opens up useful connections between machine learning, graph theory, and linguistics. The first part of this dissertation formulates syntactic dependency parsing as a dynamic Markov random field with the novel…
Using topography to meet wildlife and fuels treatment objectives in fire-suppressed landscapes
Emma C. Underwood; Joshua H. Viers; James F. Quinn; Malcolm North
2010-01-01
Past forest management practices, fire suppression, and climate change are increasing the need to actively manage California Sierra Nevada forests for multiple environmental amenities. Here we present a relatively low-cost, repeatable method for spatially parsing the landscape to help the U.S. Forest Service manage for different forest and fuel conditions to meet...
Robust Deep Semantics for Language Understanding
focus on five areas: deep learning, textual inferential relations, relation and event extraction by distant supervision, semantic parsing and...ontology expansion, and coreference resolution. As time went by, the program focus converged towards emphasizing technologies for knowledge base...natural logic methods for text understanding, improved mention coreference algorithms, and the further development of multilingual tools in CoreNLP.
The Effects of Syntactically Parsed Text Formats on Intensive Reading in EFL
ERIC Educational Resources Information Center
Herbert, John C.
2014-01-01
Separating text into meaningful language chunks, as with visual-syntactic text formatting (VSTF), helps readers to process text more easily and language learners to recognize grammar and syntax patterns more quickly. Evidence of this exists in studies on native and non-native English speakers. However, recent studies question the role of VSTF in certain…
ERIC Educational Resources Information Center
Rice, Katherine; Moriuchi, Jennifer M.; Jones, Warren; Klin, Ami
2012-01-01
Objective: To examine patterns of variability in social visual engagement and their relationship to standardized measures of social disability in a heterogeneous sample of school-aged children with autism spectrum disorders (ASD). Method: Eye-tracking measures of visual fixation during free-viewing of dynamic social scenes were obtained for 109…
ERIC Educational Resources Information Center
Marinis, Theodoros; Saddy, Douglas
2013-01-01
Twenty-five monolingual (L1) children with specific language impairment (SLI), 32 sequential bilingual (L2) children, and 29 L1 controls completed the Test of Active & Passive Sentences-Revised (van der Lely 1996) and the Self-Paced Listening Task with Picture Verification for actives and passives (Marinis 2007). These revealed important…
A New Framework for Textual Information Mining over Parse Trees. CRESST Report 805
ERIC Educational Resources Information Center
Mousavi, Hamid; Kerr, Deirdre; Iseli, Markus R.
2011-01-01
Textual information mining is a challenging problem that has resulted in the creation of many different rule-based linguistic query languages. However, these languages generally are not optimized for the purpose of text mining. In other words, they usually consider queries as individuals and only return raw results for each query. Moreover they…
ERIC Educational Resources Information Center
Jarman, Jay
2011-01-01
This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms,…
ERIC Educational Resources Information Center
Oerlemans, Anoek M.; Hartman, Catharina A.; De Bruijn, Yvette G. E.; Van Steijn, Daphne J.; Franke, Barbara; Buitelaar, Jan K.; Rommelse, Nanda N. J.
2015-01-01
Autism spectrum disorders (ASD) and attention-deficit/hyperactivity disorder (ADHD) are highly heterogeneous neuropsychiatric disorders, that frequently co-occur. This study examined whether stratification into single-incidence (SPX) and multi-incidence (MPX) is helpful in (a) parsing heterogeneity and (b) detecting overlapping and unique…
Morphological Decomposition in the Recognition of Prefixed and Suffixed Words: Evidence from Korean
ERIC Educational Resources Information Center
Kim, Say Young; Wang, Min; Taft, Marcus
2015-01-01
Korean has visually salient syllable units that are often mapped onto either prefixes or suffixes in derived words. In addition, prefixed and suffixed words may be processed differently given a left-to-right parsing procedure and the need to resolve morphemic ambiguity in prefixes in Korean. To test this hypothesis, four experiments using the…
ERIC Educational Resources Information Center
Hwang, Hyekyung; Steinhauer, Karsten
2011-01-01
In spoken language comprehension, syntactic parsing decisions interact with prosodic phrasing, which is directly affected by phrase length. Here we used ERPs to examine whether a similar effect holds for the on-line processing of written sentences during silent reading, as suggested by theories of "implicit prosody." Ambiguous Korean sentence…
2007-01-01
In radiation therapy, the overarching goal is to deliver a lethal dose to the cancerous tissue while minimizing collateral damage to the...Computer is shown in Figure 6. The exercise protocol is first parsed into a control mode based on the desired activation of configuration space variables...In this paper, we present the design and implementation of a
ERIC Educational Resources Information Center
Wang, Xin; Cho, Kwangsu
2010-01-01
This study examined two major academic genres of writing: argumentative and technical writing. Three hundred eighty-four undergraduate student-produced texts were parsed and analyzed through a computational tool called Coh-Metrix. The results inform the instructional librarians that students used genre-dependent cohesive devices in a limited way…
Myelin Biogenesis And Oligodendrocyte Development: Parsing Out The Roles Of Glycosphingolipids
Jackman, Nicole; Ishii, Akihiro; Bansal, Rashmi
2010-01-01
The myelin sheath is an extension of the oligodendrocyte (OL) plasma membrane, enriched in lipids, which ensheathes the axons of the central and peripheral nervous systems. Here we review the involvement of glycosphingolipids in myelin/OL functions, including the regulation of OL differentiation, lipid raft-mediated trafficking and signaling, and neuron-glia interactions. PMID:19815855
FASTQ quality control dashboard
DOE Office of Scientific and Technical Information (OSTI.GOV)
2016-07-25
FQCDB builds upon existing open-source software, FastQC, implementing a modern web interface across the parsed output of FastQC. In addition, FQCDB is extensible as a web service to include additional plots of type line, boxplot, or heatmap across data formatted according to guidelines. The interface is also configurable via a readable JSON format, enabling customization by non-web programmers.
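As a hypothetical example of the kind of JSON plot configuration described, here is a short Python sketch; the field names are assumptions rather than FQCDB's actual schema.

```python
# Hypothetical plot configuration of the kind the abstract describes
# (field names are assumptions, not FQCDB's actual schema).
import json

config = {
    "page": "Per-base quality",
    "plots": [
        {"type": "line",    "x": "position", "y": "mean_quality"},
        {"type": "boxplot", "x": "position", "y": "quality"},
        {"type": "heatmap", "x": "position", "y": "sample", "value": "gc_content"},
    ],
}
print(json.dumps(config, indent=2))
```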
Temporal Clustering and Sequencing in Short-Term Memory and Episodic Memory
ERIC Educational Resources Information Center
Farrell, Simon
2012-01-01
A model of short-term memory and episodic memory is presented, with the core assumptions that (a) people parse their continuous experience into episodic clusters and (b) items are clustered together in memory as episodes by binding information within an episode to a common temporal context. Along with the additional assumption that information…
Morphological Parsing and the Use of Segmentation Cues in Reading Finnish Compounds
ERIC Educational Resources Information Center
Bertram, Raymond; Pollatsek, Alexander; Hyona, Jukka
2004-01-01
This eye movement study investigated the use of two types of segmentation cues in processing long Finnish compounds. The cues were related to the vowel quality properties of the constituents and properties of the consonant starting the second constituent. In Finnish, front vowels never appear with back vowels in a lexeme, but different quality…
Statistical Clustering and the Contents of the Infant Vocabulary
ERIC Educational Resources Information Center
Swingley, Daniel
2005-01-01
Infants parse speech into word-sized units according to biases that develop in the first year. One bias, present before the age of 7 months, is to cluster syllables that tend to co-occur. The present computational research demonstrates that this statistical clustering bias could lead to the extraction of speech sequences that are actual words,…
Living Human Dignity: A Nightingale Legacy.
Hegge, Margaret; Bunkers, Sandra Schmidt
2017-10-01
The authors in this article present the humanbecoming ethical tenets of human dignity: reverence, awe, betrayal, and shame. These four ethical tenets of human dignity are examined from a historical perspective, exploring how Rosemarie Rizzo Parse has conceptualized these ethical tenets with added descriptions from other scholars, and how Florence Nightingale lived human dignity as the founder of modern nursing.
ERIC Educational Resources Information Center
Tebbutt, John
1999-01-01
Discusses efforts at National Institute of Standards and Technology (NIST) to construct an information discovery tool through the fusion of hypertext and information retrieval that works by parsing a contiguous document base into smaller documents and inserting semantic links between them. Also presents a case study that evaluated user reactions.…
The Sentence Fairy: A Natural-Language Generation System to Support Children's Essay Writing
ERIC Educational Resources Information Center
Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine
2008-01-01
We built an NLP system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary texts produced by pupils…
Text Cohesion and Comprehension: A Comparison of Prose Analysis Systems.
ERIC Educational Resources Information Center
Varnhagen, Connie K.; Goldman, Susan R.
To test three specific hypotheses about recall as a function of four categories of logical relations, a study was done to determine whether logical relations systems of prose analysis can be used to predict recall. Two descriptive passages of naturally occurring expository prose were used. Each text was parsed into 45 statements, consisting of…
Direct Object Predictability: Effects on Young Children's Imitation of Sentences
ERIC Educational Resources Information Center
Valian, Virginia; Prasada, Sandeep; Scarpa, Jodi
2006-01-01
We hypothesize that the conceptual relation between a verb and its direct object can make a sentence easier ("the cat is eating some food") or harder ("the cat is eating a sock") to parse and understand. If children's limited performance systems contribute to the ungrammatical brevity of their speech, they should perform better on sentences that…
BuddySuite: Command-Line Toolkits for Manipulating Sequences, Alignments, and Phylogenetic Trees.
Bond, Stephen R; Keat, Karl E; Barreira, Sofia N; Baxevanis, Andreas D
2017-06-01
The ability to manipulate sequence, alignment, and phylogenetic tree files has become an increasingly important skill in the life sciences, whether to generate summary information or to prepare data for further downstream analysis. The command line can be an extremely powerful environment for interacting with these resources, but only if the user has the appropriate general-purpose tools on hand. BuddySuite is a collection of four independent yet interrelated command-line toolkits that facilitate each step in the workflow of sequence discovery, curation, alignment, and phylogenetic reconstruction. Most common sequence, alignment, and tree file formats are automatically detected and parsed, and over 100 tools have been implemented for manipulating these data. The project has been engineered to easily accommodate the addition of new tools, is written in the popular programming language Python, and is hosted on the Python Package Index and GitHub to maximize accessibility. Documentation for each BuddySuite tool, including usage examples, is available at http://tiny.cc/buddysuite_wiki. All software is open source and freely available through http://research.nhgri.nih.gov/software/BuddySuite. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution 2017. This work is written by US Government employees and is in the public domain in the US.
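Automatic format detection of the sort described is often done by sniffing file content. The toy Python function below illustrates the idea; it is not BuddySuite's detection logic, and real detection must handle many more formats and edge cases.

```python
# Toy sketch of format detection by content sniffing (illustrative only;
# this is not BuddySuite's detection logic).
def sniff_format(text):
    stripped = text.lstrip()
    if stripped.startswith(">"):
        return "fasta"
    if stripped.startswith("@"):
        return "fastq"
    if stripped.startswith("("):
        return "newick"
    if stripped.lower().startswith("#nexus"):
        return "nexus"
    return "unknown"

print(sniff_format(">seq1\nACGT\n"))  # fasta
print(sniff_format("((A,B),C);"))     # newick
```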
Functional neuroanatomy of intuitive physical inference.
Fischer, Jason; Mikhael, John G; Tenenbaum, Joshua B; Kanwisher, Nancy
2016-08-23
To engage with the world, to understand the scene in front of us, plan actions, and predict what will happen next, we must have an intuitive grasp of the world's physical structure and dynamics. How do the objects in front of us rest on and support each other, how much force would be required to move them, and how will they behave when they fall, roll, or collide? Despite the centrality of physical inferences in daily life, little is known about the brain mechanisms recruited to interpret the physical structure of a scene and predict how physical events will unfold. Here, in a series of fMRI experiments, we identified a set of cortical regions that are selectively engaged when people watch and predict the unfolding of physical events: a "physics engine" in the brain. These brain regions are selective to physical inferences relative to nonphysical but otherwise highly similar scenes and tasks. However, these regions are not exclusively engaged in physical inferences per se or, indeed, even in scene understanding; they overlap with the domain-general "multiple demand" system, especially the parts of that system involved in action planning and tool use, pointing to a close relationship between the cognitive and neural mechanisms involved in parsing the physical content of a scene and preparing an appropriate action.
Deaner, Matthew; Holzman, Allison; Alper, Hal S
2018-04-16
Metabolic engineering typically utilizes a suboptimal step-wise gene target optimization approach to parse a highly connected and regulated cellular metabolism. While the endonuclease-null CRISPR/Cas system has enabled gene expression perturbations without genetic modification, it has been mostly limited to small sets of gene targets in eukaryotes due to inefficient methods to assemble and express large sgRNA operons. In this work, we develop a TEF1p-tRNA expression system and demonstrate that the use of tRNAs as splicing elements flanking sgRNAs provides higher efficiency than both Pol III and ribozyme-based expression across a variety of single sgRNA and multiplexed contexts. Next, we devise and validate a scheme to allow modular construction of tRNA-sgRNA (TST) operons using an iterative Type IIs digestion/ligation extension approach, termed CRISPR-Ligation Extension of sgRNA Operons (LEGO). This approach enables facile construction of large TST operons. We demonstrate this utility by constructing a metabolic rewiring prototype for 2,3-butanediol production in 2 distinct yeast strain backgrounds. These results demonstrate that our approach can act as a surrogate for traditional genetic modification on a much shorter design-cycle timescale. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
ERIC Educational Resources Information Center
Gutierrez-Rexach, Javier, Ed.; Martinez-Gil, Fernando, Ed.
Papers from the 1998 Hispanic Linguistics Symposium include: "Patterns of Gender Agreement in the Speech of Second Language Learners"; "'Nomas' in Mexican American Dialect"; "Parsing Spanish 'solo'"; "On Levels of Processing and Levels of Comprehension"; "The Role of Attention in Second/Foreign Language Classroom Research: Methodological Issues";…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, Joseph; Pirrung, Meg; McCue, Lee Ann
FQC is software that facilitates large-scale quality control of FASTQ files by carrying out a QC protocol, parsing results, and aggregating quality metrics within and across experiments into an interactive dashboard. The dashboard utilizes human-readable configuration files to manipulate the pages and tabs, and is extensible with CSV data.
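Aggregating parsed QC metrics across samples into a single table might look like the following pandas sketch; the CSV column names are hypothetical, and FQC's actual layout may differ.

```python
# Sketch of aggregating parsed QC metrics across samples into one table
# (column names are hypothetical; FQC's actual CSV layout may differ).
import pandas as pd
from io import StringIO

csv_data = StringIO("""sample,metric,value
s1,percent_gc,48.2
s1,total_reads,1200000
s2,percent_gc,51.7
s2,total_reads,980000
""")
df = pd.read_csv(csv_data)
summary = df.pivot(index="sample", columns="metric", values="value")
print(summary)  # one row per sample, one column per quality metric
```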
Freedom: A Promise of Possibility.
Bunkers, Sandra Schmidt
2015-10-01
The idea of freedom as a promise of possibility is explored in this column. The core concepts from a research study on considering tomorrow (Bunkers, 1998) coupled with humanbecoming community change processes (Parse, 2003) are used to illuminate this notion. The importance of intentionality in human freedom is discussed from both a human science and a natural science perspective. © The Author(s) 2015.
Neoliberalism in Historical Light: How Business Models Displaced Science Education Goals in Two Eras
ERIC Educational Resources Information Center
Hayes, Kathryn N.
2016-01-01
Although a growing body of work addresses the current role of neoliberalism in displacing democratic equality as a goal of public education, attempts to parse such impacts rarely draw from historical accounts. At least one tenet of neoliberalism--the application of business models to public institutions--was also pervasive at the turn of the 20th…
Phoneme Restoration Methods Reveal Prosodic Influences on Syntactic Parsing: Data from Bulgarian
ERIC Educational Resources Information Center
Stoyneshka-Raleva, Iglika
2013-01-01
This dissertation introduces and evaluates a new methodology for studying aspects of human language processing and the factors to which it is sensitive. It makes use of the phoneme restoration illusion (Warren, 1970). A small portion of a spoken sentence is replaced by a burst of noise. Listeners typically mentally restore the missing phoneme(s),…
Moving Target Techniques: Leveraging Uncertainty for Cyber Defense
2015-08-24
vulnerability (a flaw or bug that an attacker can exploit to penetrate or disrupt a system) to successfully compromise systems. Defenders, however...device drivers, numerous software applications, and hardware components. Within cyberspace, this imbalance between a simple, one-bug attack...parsing code itself could have security-relevant software bugs. Techniques in the dynamic network domain change the properties
ERIC Educational Resources Information Center
Kitajima, Ryu
2016-01-01
Corpus linguistics identifies the qualitative difference in the characteristics of spoken discourse vs. written academic discourse. Whereas spoken discourse makes greater use of finite dependent clauses functioning as constituents in other clauses, written academic discourse incorporates noun phrase constituents and complex phrases. This claim can…
1989-09-30
Overview of SPQR: The ISR is the input to the selection component SPQR, whose function is to block semantically anomalous parses before they are sent to the...frequently occurring pairs of words, which is useful for identifying fixed multi-word expressions. The SPQR module (Selectional Pattern
The Living Experience of Feeling Surprised.
Bunkers, Sandra Schmidt
2017-01-01
The purpose of this article is to report the finding of a Parse research method study on the universal living experience of feeling surprised. In dialogical engagement with the researcher, eight participants described the experience. The structure of the living experience of feeling surprised was found to be: Feeling surprised is stunning amazement arising with shifting fortunes, as delight amid despair surfaces with diverse involvements.
"It Could Have Been so Much Better": The Aesthetic and Social Work of Theatre
ERIC Educational Resources Information Center
Gallagher, Kathleen; Freeman, Barry; Wessells, Anne
2010-01-01
In this paper, the authors consider early results from their ethnographic research in urban drama classrooms by parsing the aesthetic and social imperatives at play in the classroom. Moved by the observation that teachers and students alike seem to be pursuing elusive aesthetic and social ideals, the authors draw on Judith Butler's notion of…
ERIC Educational Resources Information Center
Oltmann, Shannon M.
2012-01-01
Scientific information, used by the U.S. federal government to formulate public policy in many arenas, is frequently contested and sometimes altered, blocked from publication, deleted from reports, or restricted in some way. This dissertation examines how and why restricted access to science policy (RASP) occurs through a comparative case study.…
Sorry Dave, I’m Afraid I Can’t Do That: Explaining Unachievable Robot Tasks using Natural Language
2013-06-24
processing components used by Brooks et al. [6]: the Bikel parser [3] combined with the null element (understood subject) restoration of Gabbard et al...Intelligent Robots and Systems (IROS), pages 1988-1993, 2010. [12] Ryan Gabbard, Mitch Marcus, and Seth Kulick. Fully parsing the Penn Treebank. In Human
Design of a Low-Cost Adaptive Question Answering System for Closed Domain Factoid Queries
ERIC Educational Resources Information Center
Toh, Huey Ling
2010-01-01
Closed domain question answering (QA) systems achieve precision and recall at the cost of complex language processing techniques to parse the answer corpus. We propose a "query-based" model for indexing answers in a closed domain factoid QA system. Further, we use a phrase term inference method for improving the ranking order of related questions.…
Syllabic Parsing in Children: A Developmental Study Using Visual Word-Spotting in Spanish
ERIC Educational Resources Information Center
Álvarez, Carlos J.; Garcia-Saavedra, Guacimara; Luque, Juan L.; Taft, Marcus
2017-01-01
Some inconsistency is observed in the results from studies of reading development regarding the role of the syllable in visual word recognition, perhaps due to a disparity between the tasks used. We adopted a word-spotting paradigm, with Spanish children of second grade (mean age: 7 years) and sixth grade (mean age: 11 years). The children were…
ERIC Educational Resources Information Center
Perkins, Kathleen M.
2016-01-01
Theatre is a multi-dimensional discipline encompassing aspects of several domains in the arts and humanities. Therefore, an array of scholarly practices, pedagogies, and methods might be available to a SoTL researcher from the close reading of texts in script analysis to portfolio critiques in set, costume, and lighting design--approaches shared…
Infants' Attention to Patterned Stimuli: Developmental Change from 3 to 12 Months of Age
ERIC Educational Resources Information Center
Courage, Mary L.; Reynolds, Greg D.; Richards, John E.
2006-01-01
To examine the development of look duration as a function of age and stimulus type, 14- to 52-week-old infants were shown static and dynamic versions of faces, Sesame Street material, and achromatic patterns for 20 s of accumulated looking. Heart rate was recorded during looking and parsed into stimulus orienting, sustained attention, and…
Effects of Cognitive Load on Trust
2013-10-01
that may be affected by load; build a parsing tool to extract relevant features; statistical analysis of results (by load components); achieved...for a business application. Participants assessed potential job candidates and reviewed the applicants' virtual resumes, which included standard...substantially different from each other that would make any confounding problems or other issues. Some statistics of the Australian data collection are
Research in Knowledge Representation for Natural Language Understanding
1980-11-01
artificial intelligence, natural language understanding, parsing, syntax, semantics, speaker meaning, knowledge representation, semantic networks...Report No. 4513: Annual Report, 1 September 1979 to 31...understanding, knowledge representation, and knowledge-based inference. The work that we have been doing falls into three classes, successively motivated by
ERIC Educational Resources Information Center
Dong, Zhiyin Renee
2014-01-01
There is an ongoing debate in the field of Second Language Acquisition concerning whether a fundamental difference exists between the native language (L1) and adult second language (L2) online processing of syntax and morpho-syntax. The Shallow Structure Hypothesis (SSH) (Clahsen and Felser, 2006a, b) states that L2 online parsing is qualitatively…
Parsing Heuristic and Forward Search in First-Graders' Game-Play Behavior
ERIC Educational Resources Information Center
Paz, Luciano; Goldin, Andrea P.; Diuk, Carlos; Sigman, Mariano
2015-01-01
Seventy-three children between 6 and 7 years of age were presented with a problem having ambiguous subgoal ordering. Performance in this task showed reliable fingerprints: (a) a non-monotonic dependence of performance as a function of the distance between the beginning and the end-states of the problem, (b) very high levels of performance when the…
Some Educational Implications from Research on Story Grammar and Story Comprehension.
ERIC Educational Resources Information Center
Freedman, Jonathan M.; Owings, Richard A.
Folk tales were read to 32 kindergarten children of varying levels of language ability, as measured by the language scale of the Metropolitan Readiness Test. Recall protocols were parsed into the categories described by N. L. Stein and C. G. Glenn. Low ability children were found to be less likely to recall details of "internal plan" and…
Preschool Children's Exposure to Story Grammar Elements during Parent-Child Book Reading
ERIC Educational Resources Information Center
Breit-Smith, Allison; van Kleeck, Anne; Prendeville, Jo-Anne; Pan, Wei
2017-01-01
Twenty-three preschool-age children, 3;6 (years; months) to 4;1, were videotaped separately with their mothers and fathers while each mother and father read a different unfamiliar storybook to them. The text from the unfamiliar storybooks was parsed and coded into story grammar elements and all parental extratextual utterances were transcribed and…
ERIC Educational Resources Information Center
Dekydtspotter, Laurent
2001-01-01
From the perspective of Fodor's (1983) theory of mental organization and Chomsky's (1995) Minimalist theory of grammar, considers constraints on the interpretation of French-type and English-type cardinality interrogatives in the task of sentence comprehension, as a function of a universal parsing algorithm and hypotheses embodied in a French-type…
ERIC Educational Resources Information Center
Grossberg, Stephen; Pearson, Lance R.
2008-01-01
How does the brain carry out working memory storage, categorization, and voluntary performance of event sequences? The LIST PARSE neural model proposes an answer that unifies the explanation of cognitive, neurophysiological, and anatomical data. It quantitatively simulates human cognitive data about immediate serial recall and free recall, and…
Impaired P600 in neuroleptic naive patients with first-episode schizophrenia.
Papageorgiou, C; Kontaxakis, V P; Havaki-Kontaxaki, B J; Stamouli, S; Vasios, C; Asvestas, P; Matsopoulos, G K; Kontopantelis, E; Rabavilas, A; Uzunoglu, N; Christodoulou, G N
2001-09-17
Deficits of working memory (WM) are recognized as an important pathological feature in schizophrenia. Since the P600 component of event-related potentials has been hypothesized to represent aspects of second-pass parsing processes of information processing and is related to WM, the present study focuses on the P600 elicited during a WM test in drug-naive first-episode schizophrenic (FES) patients compared to healthy controls. We examined 16 drug-naive first-episode schizophrenic patients and 23 healthy controls matched for age and sex. Compared with controls, schizophrenic patients showed reduced P600 amplitude over the left temporoparietal region and increased P600 amplitude over the left occipital region. With regard to latency, the patients exhibited significant prolongation over the right temporoparietal region. The obtained pattern of differences correctly classified 89.20% of patients. Memory performance of patients was also significantly impaired relative to controls. Our results suggest that the second-pass parsing process of information processing, as indexed by the P600, elicited during a WM test, is impaired in FES. Moreover, these findings lend support to the view that auditory WM in schizophrenia involves or affects a circuitry including temporoparietal and occipital brain areas.
Do 11-month-old French infants process articles?
Hallé, Pierre A; Durand, Catherine; de Boysson-Bardies, Bénédicte
2008-01-01
The first part of this study examined (Parisian) French-learning 11-month-old infants' recognition of the six definite and indefinite French articles: le, la, les, un, une, and des. The six articles were compared with pseudoarticles in the context of disyllabic or monosyllabic nouns, using the Head-turn Preference Procedure. The pseudoarticles were similar to real articles in terms of phonetic composition and phonotactic probability, and real and pseudo noun phrases were alike in terms of overall prosodic contour. In three experiments, 11-month-old infants showed a preference for real over pseudo articles, suggesting they have the articles' word-forms stored in long-term memory. The second part of the study evaluates several hypotheses about the role of articles in 11-month-old infants' word recognition. Evidence from three experiments supports the view that articles help infants to recognize the following words. We propose that 11-month-olds have the capacity to parse noun phrases into their constituents, which is consistent with the more general view that function words define a syntactic skeleton that serves as a basis for parsing spoken utterances. This proposition is compared to a competing account, which argues that 11-month-olds recognize noun phrases as whole words.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Janik, Gregory
Renders, saves, and analyzes pressure from several sensors in a prosthesis socket. The program receives pressure data from 64 manometers and parses the pressure for each individual sensor. The program can then display those pressures as numbers in a table. The program also interpolates pressures between manometers to create a larger set of data. This larger set of data is displayed as a simple contour plot. That same contour plot can also be placed on a three-dimensional surface in the shape of a prosthesis. This program allows for easy identification of high-pressure areas in a prosthesis to reduce the user's discomfort. The program parses the sensor pressures into a human-readable numeric format. The data may also be used to actively adjust bladders within the prosthesis to spread out pressure in real time, according to changing demands placed on the prosthesis. Interpolation of the pressures to create a larger data set makes it even easier for a human to identify particular areas of the prosthesis that are under high pressure. After identifying pressure points, a prosthetician can then redesign the prosthesis and/or command the bladders in the prosthesis to attempt to maintain constant pressures.
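The interpolation-and-contour pipeline is simple enough to sketch in Python. The 8x8 sensor layout and kPa units below are assumptions for illustration; the program's actual grid geometry and data formats are not specified above.

```python
# Sketch of the pipeline described above: parse 64 sensor readings into a
# grid, interpolate to a denser field, and draw a contour plot.
# The 8x8 layout and kPa units are assumptions for illustration.
import numpy as np
from scipy.ndimage import zoom
import matplotlib.pyplot as plt

raw = np.random.uniform(0, 80, size=64)   # stand-in for 64 manometer readings (kPa)
grid = raw.reshape(8, 8)                  # parse per-sensor pressures into a grid
dense = zoom(grid, 4, order=3)            # cubic interpolation to a 32x32 field

plt.contourf(dense)                       # contour plot highlights pressure hot spots
plt.colorbar(label="pressure (kPa)")
plt.title("Interpolated socket pressure")
plt.show()
```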
Novel Advancements in Internet-Based Real Time Data Technologies
NASA Technical Reports Server (NTRS)
Myers, Gerry; Welch, Clara L. (Technical Monitor)
2002-01-01
AZ Technology has been working with the MSFC Ground Systems Department to find ways to make it easier for remote experimenters (RPIs) to monitor their International Space Station (ISS) payloads in real time from anywhere using standard, familiar devices. AZ Technology was awarded an SBIR Phase I grant to research the technologies behind and advancements of distributing live ISS data across the Internet. That research resulted in a product called "EZStream", which is in use on several ISS-related projects. Although the initial implementation is geared toward ISS, the architecture and lessons learned are applicable to other space-related programs. This paper presents the high-level architecture and components that make up EZStream. A combination of commercial-off-the-shelf (COTS) and custom components was used, and their interaction will be discussed. The server is powered by Apache's Jakarta-Tomcat web server/servlet engine. User accounts are maintained in a MySQL database. Both Tomcat and MySQL are Open Source products. When used for ISS, EZStream pulls the live data directly from NASA's Telescience Resource Kit (TReK) API. TReK parses the ISS data stream into individual measurement parameters and performs on-the-fly engineering unit conversion and range checking before passing the data to EZStream for distribution. TReK is provided by NASA at no charge to ISS experimenters. By using a combination of well-established Open Source, NASA-supplied, and AZ Technology-developed components, operations using EZStream are robust and economical. Security over the Internet is a major concern on most space programs. This paper describes how EZStream provides for secure connection to and transmission of space-related data over the public Internet. Display pages that show sensitive data can be placed under access control by EZStream. Users are required to log in before being allowed to pull up those web pages. To enhance security, the EZStream client/server data transmissions can be encrypted to preclude interception. EZStream was developed to make use of a host of standard platforms and protocols, each discussed in detail in this paper. The EZStream server is written as Java Servlets. This allows different platforms (i.e., Windows, Unix, Linux, Mac) to host the server portion. The EZStream client component is written in two different flavors: JavaBean and ActiveX. The JavaBean component is used to develop Java Applet displays. The ActiveX component is used for developing ActiveX-based displays. Remote user devices will be covered, including web browsers on PCs and scaled-down displays for PDAs and smart cell phones. As mentioned, the interaction between EZStream (web/data server) and TReK (data source) will be covered as related to ISS. EZStream is being enhanced to receive and parse binary data streams directly. This makes EZStream beneficial to both the ISS International Partners and non-NASA applications (i.e., factory floor monitoring). The options for developing client-side display web pages will be addressed, along with the development of tools to allow creation of display web pages by non-programmers.
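The binary-stream parsing step that EZStream is being enhanced to perform can be illustrated with Python's struct module. The fixed packet layout below (big-endian: u16 measurement id, u32 timestamp, f32 value) is purely hypothetical; the actual ISS downlink formats are not described here.

```python
# Sketch of parsing a binary data stream into individual measurement
# parameters, the TReK-like step described above. The packet layout
# (big-endian: u16 id, u32 timestamp, f32 value) is hypothetical.
import struct

RECORD = struct.Struct(">HIf")  # id, timestamp, value: 10 bytes per record

def parse_stream(buf):
    for offset in range(0, len(buf) - RECORD.size + 1, RECORD.size):
        meas_id, timestamp, value = RECORD.unpack_from(buf, offset)
        yield {"id": meas_id, "t": timestamp, "value": value}

# Pack two sample records and parse them back out.
packet = RECORD.pack(101, 1633036800, 23.5) + RECORD.pack(102, 1633036801, 0.81)
for m in parse_stream(packet):
    print(m)
```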
Incremental Parsing with Reference Interaction
2004-07-01
Document Image Parsing and Understanding using Neuromorphic Architecture
2015-03-01
developed to reduce the processing speed at different layers. In the pattern matching layer, the computing power of multicore processors is explored to reduce the processing... cortex where the complex data is reduced to abstract representations. The abstract representation is compared to stored patterns in massively parallel
ERIC Educational Resources Information Center
Jung, Sehoon
2017-01-01
One of the central questions in recent second language processing research is whether the types of parsing heuristics and linguistic resources adult L2 learners compute during online processing are qualitatively similar or different from those used by native speakers of the target language. While the current L2 processing literature provides…
ERIC Educational Resources Information Center
Needham, Amy; Cantlon, Jessica F.; Ormsbee Holley, Susan M.
2006-01-01
The current research investigates infants' perception of a novel object from a category that is familiar to young infants: key rings. We ask whether experiences obtained outside the lab would allow young infants to parse the visible portions of a partly occluded key ring display into one single unit, presumably as a result of having categorized it…
2015-09-01
Keywords: intrusion detection systems, neural networks. ...intrusion detection system (IDS) software, which learns to detect and classify network attacks and intrusions through prior training data. With the added criteria of... The growing threat of malicious network activities and intrusion attempts makes intrusion detection systems (IDS) a
Bunkers, Sandra S
2009-01-01
This column focuses on ideas concerning leaders and leadership. The author proposes that leadership is about showing up and participating with others in doing something. "Mandela: His 8 Lessons of Leadership" by Richard Stengel is explored in light of selected philosophical writings, literature on nursing leadership, and nurse theorist Rosemarie Rizzo Parse's humanbecoming leading-following model. Teaching-learning questions are then posed to stimulate further reflection on the lessons of leadership.
AGILE: Autonomous Global Integrated Language Exploitation
2009-12-01
combination, including METEOR-based alignment (with stemming and WordNet synonym matching) and GIZA++-based alignment. So far, we have not seen any... parse trees and a detailed analysis of how function words operate in translation. This program lets us fix alignment errors that systems like GIZA... correlates better with Pyramid than with Responsiveness scoring (i.e., it is a more precise, careful measure). • BE generally outperforms ROUGE
Moving Target Techniques: Leveraging Uncertainty for CyberDefense
2015-12-15
cyberattacks is a continual struggle for system managers. Attackers often need only find one vulnerability (a flaw or bug that an attacker can exploit... additional parsing code itself could have security-relevant software bugs. Dynamic Network Techniques: techniques in the dynamic network domain change the... evaluation of MT techniques can benefit from a variety of evaluation approaches, including abstract analysis, modeling and simulation, test bed
Analysis of Cloud-Based Database Systems
2015-06-01
EU) citizens under the Patriot Act [3]. Unforeseen virtualization bugs have caused wide-reaching outages [4], leaving customers helpless to assist... collected from SQL Server Profiler traces. We analyze the trace results captured from our test bed both before and after increasing system resources... cloud test bed. A. DATA COLLECTION, PARSING, AND ORGANIZATION. Once we finished collecting the trace data, we knew we needed to have as close a
ERIC Educational Resources Information Center
Landi, Nicole; Frost, Stephen J.; Mencl, W. Einar; Sandak, Rebecca; Pugh, Kenneth R.
2013-01-01
For accurate reading comprehension, readers must first learn to map letters to their corresponding speech sounds and meaning, and then they must string the meanings of many words together to form a representation of the text. Furthermore, readers must master the complexities involved in parsing the relevant syntactic and pragmatic information…
The Experience of Feeling Disrespected: A Humanbecoming Perspective.
Hawkins, Kim
2017-04-01
The concept of feeling disrespected was explored using the Parse research method. Ten women living with embodied largeness were asked, "What is the experience of feeling disrespected?" The structure of the living experience was: feeling disrespected is mortifying disheartenment arising with disquieting irreverence, as distancing affiliations surface while enduring hardship. The findings provided new knowledge of living quality, advanced nursing practice, and suggested future directions for research.
Rochester Connectionist Papers. 1979-1985
1985-12-01
updated and improved version of the thesis account of recent neurolinguistic data. Fanty, M., "Context-free parsing in connectionist networks." TR 174... April 1982. Our first large program in the connectionist paradigm. It simulates a multi-layer network for recognizing line drawings of Origami figures... The program successfully deals with noise and simple occlusion, and the thesis incorporates many key ideas on designing and running large models. Small
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thoreson, Gregory G
PCF files are binary files designed to contain gamma spectra and neutron count rates from radiation sensors. It is the native format for the GAmma Detector Response and Analysis Software (GADRAS) package [1]. It can contain multiple spectra and information about each spectrum such as energy calibration. This document outlines the format of the file that would allow one to write a computer program to parse and write such files.
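Since the actual record layout is specified in the GADRAS documentation, the following sketch only illustrates the general shape of such a binary-file parser; every field name, size, and offset in it is hypothetical:

```python
import struct

# Purely hypothetical layout, for illustration only: each record holds two
# floats of calibration metadata followed by a fixed-length channel array.
def read_spectra(path: str, n_channels: int = 1024):
    spectra = []
    with open(path, "rb") as f:
        while True:
            header = f.read(8)                    # hypothetical: 2 floats
            if len(header) < 8:
                break
            live_time, gain = struct.unpack("<ff", header)   # calibration info
            counts = struct.unpack(f"<{n_channels}f", f.read(4 * n_channels))
            spectra.append({"live_time": live_time, "gain": gain,
                            "counts": counts})
    return spectra
```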
Integrated Processing in Planning and Understanding.
1986-12-01
to language analysis seemed necessary. The second observation was the rather commonsense one that it is easier to understand a foreign language... syntactic analysis. Probably the most widely employed method for natural language analysis is augmented transition network parsing, or ATNs (Thorne, Bratley... accomplished. It is for this reason that the programming language Prolog, which implements that general method, has proven so well-suited to writing ATN
They Said So on the News: Parsing Media Reports About Birth
Romano, Amy M.; Lythgoe, Andrea; Goer, Henci
2010-01-01
In this column, the authors reprise recent selections from the Lamaze International research blog, Science & Sensibility. Each selection discusses shortcomings in the news media coverage of childbirth issues. The authors demonstrate how to identify misleading claims in the media and highlight how childbirth educators can apply a common-sense approach and careful fact checking to help women understand the whole story. PMID:20174490
ERIC Educational Resources Information Center
Álvarez, Carlos J.; Taft, Marcus; Hernández-Cabrera, Juan A.
2017-01-01
A word-spotting task is used in Spanish to test the way in which polysyllabic letter-strings are parsed in this language. Monosyllabic words (e.g., "bar") embedded at the beginning of a pseudoword were immediately followed by either a coda-forming consonant (e.g., "barto") or a vowel (e.g., "baros"). In the former…
Intelligent Semantic Query of Notices to Airmen (NOTAMs)
2006-07-01
definition of the airspace is constantly changing, new vocabulary is added and old words retired on a monthly basis, and the information specifying this is... NOTAMs are notices containing information on the conditions, or changes to, aeronautical facilities, services, procedures, or hazards, which are... develop a new parsing system, employing and extending ideas developed by the information-extraction community, rather than on classical computational
Language and the Law: A Case for Linguistic Pragmatics. Sociolinguistic Working Paper Number 94.
ERIC Educational Resources Information Center
Prince, Ellen F.
The emergence of a subfield of linguistics, linguistic pragmatics, whose goal is to discover the principles by which a hearer or reader understands a text or can construct a model based on the text, given the sentence-level competence to parse the text's sentences and assign logical forms to them, is discussed in the context of a court case in…
ERIC Educational Resources Information Center
Sabourin, Laura
2006-01-01
In their Keynote Article, Clahsen and Felser (CF) provide a detailed summary and comparison of grammatical processing in adult first language (L1) speakers, child L1 speakers, and second language (L2) speakers. CF conclude that child and adult L1 processing makes use of a continuous parsing mechanism, and that any differences found in processing…
ERIC Educational Resources Information Center
Heffner, Christopher C.; Newman, Rochelle S.; Dilley, Laura C.; Idsardi, William J.
2015-01-01
Purpose: A new literature has suggested that speech rate can influence the parsing of words quite strongly in speech. The purpose of this study was to investigate differences between younger adults and older adults in the use of context speech rate in word segmentation, given that older adults perceive timing information differently from younger…
ERIC Educational Resources Information Center
Kidd, Evan; Stewart, Andrew J.; Serratrice, Ludovica
2011-01-01
In this paper we report on a visual world eye-tracking experiment that investigated the differing abilities of adults and children to use referential scene information during reanalysis to overcome lexical biases during sentence processing. The results showed that adults incorporated aspects of the referential scene into their parse as soon as it…
ERIC Educational Resources Information Center
Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine
2009-01-01
We built a natural language processing (NLP) system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary…
Critiquing Systems for Decision Support
2006-02-01
errors and deficiencies. An example of a comparative critic is the ATTENDING system (anaesthesiology), which first parses the user's solution into a... design tools at the times when those tools are useful. Experiential critics provide reminders of past experiences with similar designs or design... technique for hypertension rather than the broader field of anaesthesiology; and (2) critiquing systems are most appropriate for tasks that require
Dotan, Dror; Friedmann, Naama
2018-04-01
We propose a detailed cognitive model of multi-digit number reading. The model postulates separate processes for visual analysis of the digit string and for oral production of the verbal number. Within visual analysis, separate sub-processes encode the digit identities and the digit order, and additional sub-processes encode the number's decimal structure: its length, the positions of 0, and the way it is parsed into triplets (e.g., 314987 → 314,987). Verbal production consists of a process that generates the verbal structure of the number, and another process that retrieves the phonological forms of each number word. The verbal number structure is first encoded in a tree-like structure, similarly to syntactic trees of sentences, and then linearized to a sequence of number-word specifiers. This model is based on an investigation of the number processing abilities of seven individuals with different selective deficits in number reading. We report participants with impairment in specific sub-processes of the visual analysis of digit strings - in encoding the digit order, in encoding the number length, or in parsing the digit string to triplets. Other participants were impaired in verbal production, making errors in the number structure (shifts of digits to another decimal position, e.g., 3,040 → 30,004). Their selective deficits yielded several dissociations: first, we found a double dissociation between visual analysis deficits and verbal production deficits. Second, several dissociations were found within visual analysis: a double dissociation between errors in digit order and errors in the number length; a dissociation between order/length errors and errors in parsing the digit string into triplets; and a dissociation between the processing of different digits - impaired order encoding of the digits 2-9, without errors in the 0 position. Third, within verbal production, a dissociation was found between digit shifts and substitutions of number words. A selective deficit in any of the processes described by the model would cause difficulties in number reading, which we propose to term "dysnumeria". Copyright © 2017 Elsevier Ltd. All rights reserved.
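The triplet-parsing sub-process described above can be illustrated in a few lines of code (an illustration of the model's own example, not part of the study):

```python
# Group a digit string into triplets from the right, as in the model's
# example 314987 -> "314,987".
def parse_triplets(digits: str) -> str:
    groups = []
    while digits:
        groups.append(digits[-3:])   # take up to three digits from the right
        digits = digits[:-3]
    return ",".join(reversed(groups))

assert parse_triplets("314987") == "314,987"
assert parse_triplets("3040") == "3,040"
```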
User-defined functions in the Arden Syntax: An extension proposal.
Karadimas, Harry; Ebrahiminia, Vahid; Lepage, Eric
2015-12-11
The Arden Syntax is a knowledge-encoding standard, started in 1989, and now in its 10th revision, maintained by the Health Level Seven (HL7) organization. It has constructs borrowed from several language concepts that were available at that time (mainly the HELP hospital information system and the Regenstrief medical record system (RMRS), but also the Pascal language, functional languages and the data structure of frames, used in artificial intelligence). The syntax has a rationale for its constructs, and has restrictions that follow this rationale. The main goal of the Standard is to promote knowledge sharing, by avoiding the complexity of traditional programs, so that a medical logic module (MLM) written in the Arden Syntax can remain shareable and understandable across institutions. One of the restrictions of the syntax is that you cannot define your own functions and subroutines inside an MLM. An MLM can, however, call another MLM, where this MLM will serve as a function. This will add an additional dependency between MLMs, a known criticism of the Arden Syntax knowledge model. This article explains why we believe the Arden Syntax would benefit from a construct for user-defined functions, discusses the need, the benefits and the limitations of such a construct. We used the recent grammar of the Arden Syntax v.2.10, and both the Arden Syntax standard document and the Arden Syntax Rationale article as guidelines. We gradually introduced production rules to the grammar. We used the CUP parsing tool to verify that no ambiguities were detected. A new grammar was produced that supports user-defined functions. Twenty-two production rules were added to the grammar. A parser was built using the CUP parsing tool. A few examples are given to illustrate the concepts. All examples were parsed correctly. It is possible to add user-defined functions to the Arden Syntax in a way that remains coherent with the standard. We believe that this enhances the readability and the robustness of MLMs. A detailed proposal will be submitted by the end of the year to the HL7 workgroup on Arden Syntax. Copyright © 2015 Elsevier B.V. All rights reserved.
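The actual v2.10 grammar and the 22 added production rules are not reproduced here; as a loose illustration of extending a grammar with a function-definition production, a toy grammar (hypothetical syntax, built with the Python lark library rather than CUP) might look like this:

```python
from lark import Lark

# Not the Arden Syntax grammar: a toy expression grammar extended with a
# "funcdef" rule, analogous in spirit to adding user-defined functions.
GRAMMAR = r"""
    start: statement+
    statement: funcdef | expr ";"
    funcdef: "function" NAME "(" [params] ")" "return" expr ";"
    params: NAME ("," NAME)*
    ?expr: expr "+" atom   -> add
         | atom
    ?atom: NAME "(" [args] ")" -> call
         | NAME
         | NUMBER
    args: expr ("," expr)*
    NAME: /(?!function\b|return\b)[a-zA-Z_]\w*/
    NUMBER: /\d+(\.\d+)?/
    %ignore /\s+/
"""

parser = Lark(GRAMMAR)   # default Earley parser
tree = parser.parse("function bsa(height, weight) return height + weight; bsa(170, 65);")
print(tree.pretty())
```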
An automatic method to generate domain-specific investigator networks using PubMed abstracts.
Yu, Wei; Yesupriya, Ajay; Wulf, Anja; Qu, Junfeng; Gwinn, Marta; Khoury, Muin J
2007-06-20
Collaboration among investigators has become critical to scientific research. This includes ad hoc collaboration established through personal contacts as well as formal consortia established by funding agencies. Continued growth in online resources for scientific research and communication has promoted the development of highly networked research communities. Extending these networks globally requires identifying additional investigators in a given domain, profiling their research interests, and collecting current contact information. We present a novel strategy for building investigator networks dynamically and producing detailed investigator profiles using data available in PubMed abstracts. We developed a novel strategy to obtain detailed investigator information by automatically parsing the affiliation string in PubMed records. We illustrated the results by using a published literature database in human genome epidemiology (HuGE Pub Lit) as a test case. Our parsing strategy extracted country information from 92.1% of the affiliation strings in a random sample of PubMed records and in 97.0% of HuGE records, with accuracies of 94.0% and 91.0%, respectively. Institution information was parsed from 91.3% of the general PubMed records (accuracy 86.8%) and from 94.2% of HuGE PubMed records (accuracy 87.0%). We demonstrated the application of our approach to dynamic creation of investigator networks by creating a prototype information system containing a large database of PubMed abstracts relevant to human genome epidemiology (HuGE Pub Lit), indexed using PubMed medical subject headings converted to Unified Medical Language System concepts. Our method was able to identify 70-90% of the investigators/collaborators in three different human genetics fields; it also successfully identified 9 of 10 genetics investigators within the PREBIC network, an existing preterm birth research network. We successfully created a web-based prototype capable of creating domain-specific investigator networks based on an application that accurately generates detailed investigator profiles from PubMed abstracts combined with robust standard vocabularies. This approach could be used for other biomedical fields to efficiently establish domain-specific investigator networks.
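A highly simplified sketch of the affiliation-parsing idea (a heuristic illustration, not the authors' implementation):

```python
import re

# Heuristic: drop a trailing e-mail address, split the affiliation string
# on commas, and treat the first segment as the institution/department and
# the last as the country.
def parse_affiliation(affiliation: str) -> dict:
    affiliation = re.sub(r"\S+@\S+\.?$", "", affiliation).rstrip(". ")
    parts = [p.strip() for p in affiliation.split(",") if p.strip()]
    return {
        "institution": parts[0] if parts else None,
        "country": parts[-1] if len(parts) > 1 else None,
    }

record = ("Department of Epidemiology, University of Michigan, "
          "Ann Arbor, MI, USA. someone@umich.edu")
print(parse_affiliation(record))
# {'institution': 'Department of Epidemiology', 'country': 'USA'}
```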
Hope in "Rita Hayworth and Shawshank Redemption": a human becoming hermeneutic study.
Parse, Rosemarie Rizzo
2007-04-01
This article is the report of the human becoming hermeneutic method study on "Rita Hayworth and Shawshank Redemption" (the short story, the screenplay, and the film). The study unfolded during the Parse-King dialogue that answered the research question what is hope as humanly lived? Emergent meanings were discovered that enhanced knowledge and understanding of hope in general and expanded the human becoming school of thought.
The parser generator as a general purpose tool
NASA Technical Reports Server (NTRS)
Noonan, R. E.; Collins, W. R.
1985-01-01
The parser generator has proven to be an extremely useful, general-purpose tool. It can be used effectively by programmers having only a knowledge of grammars and no training at all in the theory of formal parsing. Some of the application areas for which a table-driven parser can be used include interactive query languages, menu systems, translators, and programming support tools. Each of these is illustrated by an example grammar.
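As a toy illustration of the table-driven idea (the grammar and query language here are invented, not the paper's examples), a tiny LL(1)-style parse table can drive a parser with no hand-written parsing logic:

```python
# Parse table for the toy grammar:
#   Q -> "show" FIELD REST
#   REST -> "where" COND | (empty)
# keyed by (nonterminal, lookahead terminal).
TABLE = {
    ("Q", "show"): ["show", "FIELD", "REST"],
    ("REST", "where"): ["where", "COND"],
    ("REST", None): [],              # epsilon production at end of input
}
TERMINALS = {"show", "where"}

def parse(tokens):
    tokens = tokens + [None]         # None marks end of input
    stack, pos = ["Q"], 0
    while stack:
        top = stack.pop(0)
        if top in TERMINALS:                       # match a keyword
            if tokens[pos] != top:
                return False
            pos += 1
        elif top in ("FIELD", "COND"):             # lexical category: any word
            if tokens[pos] in TERMINALS or tokens[pos] is None:
                return False
            pos += 1
        else:                                      # expand via the table
            key = (top, tokens[pos] if tokens[pos] in TERMINALS else None)
            if key not in TABLE:
                return False
            stack = TABLE[key] + stack
    return tokens[pos] is None

print(parse("show temperature where high".split()))  # True
print(parse("show temperature".split()))             # True
```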
Software Library for Bruker TopSpin NMR Data Files
DOE Office of Scientific and Technical Information (OSTI.GOV)
A software library for parsing and manipulating frequency-domain data files that have been processed using the Bruker TopSpin NMR software package. In the context of NMR, the term "processed" indicates that the end-user of the Bruker TopSpin NMR software package has (a) Fourier transformed the raw, time-domain data (the Free Induction Decay) into the frequency-domain and (b) has extracted the list of NMR peaks.
Schmidt Bunkers, Sandra
2009-04-01
In this column questions concerning wisdom are addressed, such as, what is wisdom? Can wisdom be taught in the academy? Several perspectives on wisdom from philosophy, education, business, and psychology are presented. Wisdom with creativity-creativity with wisdom is then explored through discussion of Parse's humanbecoming teaching-learning model and Laird Hamilton's life lessons learned from surfing, which he termed wisdom of the wave. The column concludes with consideration of the wise person.
Natural Language Semantics using Probabilistic Logic
2014-10-01
standard scores. Here are some examples: • "A man is playing a guitar." "A woman is playing the guitar.", score: 2.75 • "A woman is cutting broccoli." "A woman is slicing broccoli.", score: 5.00 • "A car is parking." "A cat is playing.", score: 0.00
The Analysis of Nominal Compounds,
1985-12-01
"Phenomenologically plausible parsing," in Proceedings of the 1984 American Association for Artificial Intelligence Conference, pp. 335-339. Wilensky, R... December, 1985 - CPTM #8. This series of internal memos describes research in artificial intelligence conducted under... representational techniques for natural language that have evolved in linguistics and artificial intelligence, it is difficult to find much uniformity in the
A Formal Model of Ambiguity and its Applications in Machine Translation
2010-01-01
structure indicates linguistically implausible segmentation that might be generated using dictionary-driven approaches... derivation. As was done in the monolingual case, the functions LHS, RHSi, RHSo and υ can be extended to a derivation δ. D(q) where q ∈ V denotes the... monolingual parses. My algorithm runs more efficiently than O(n^6) with many grammars (including those that required using heuristic search with other parsers
Xu, Duo; Jaber, Yousef; Pavlidis, Pavlos; Gokcumen, Omer
2017-09-26
Constructing alignments and phylogenies for a given locus from large genome sequencing studies with relevant outgroups allows novel evolutionary and anthropological insights. However, no user-friendly tool has been developed to integrate thousands of recently available and anthropologically relevant genome sequences to construct complete sequence alignments and phylogenies. Here, we provide VCFtoTree, a user-friendly tool with a graphical user interface that directly accesses online databases to download, parse and analyze genome variation data for regions of interest. Our pipeline combines popular sequence datasets and tree building algorithms with custom data parsing to generate accurate alignments and phylogenies using all the individuals from the 1000 Genomes Project, Neanderthal and Denisovan genomes, as well as reference genomes of Chimpanzee and Rhesus Macaque. It can also be applied to other phased human genomes, as well as genomes from other species. The output of our pipeline includes an alignment in FASTA format and a tree file in Newick format. VCFtoTree fulfills the increasing demand for constructing alignments and phylogenies for given loci from thousands of available genomes. Our software provides a user-friendly interface for a wider audience without prerequisite knowledge in programming. VCFtoTree can be accessed from https://github.com/duoduoo/VCFtoTree_3.0.0 .
Gault, Lora V.; Shultz, Mary; Davies, Kathy J.
2002-01-01
Objectives: This study compared the mapping of natural language patron terms to the Medical Subject Headings (MeSH) across six MeSH interfaces for the MEDLINE database. Methods: Test data were obtained from search requests submitted by patrons to the Library of the Health Sciences, University of Illinois at Chicago, over a nine-month period. Search request statements were parsed into separate terms or phrases. Using print sources from the National Library of Medicine, each parsed patron term was assigned corresponding MeSH terms. Each patron term was entered into each of the selected interfaces to determine how effectively they mapped to MeSH. Data were collected for mapping success, accessibility of MeSH term within mapped list, and total number of MeSH choices within each list. Results: The selected MEDLINE interfaces do not map the same patron term in the same way, nor do they consistently lead to what is considered the appropriate MeSH term. Conclusions: If searchers utilize the MEDLINE database to its fullest potential by mapping to MeSH, the results of the mapping will vary between interfaces. This variance may ultimately impact the search results. These differences should be considered when choosing a MEDLINE interface and when instructing end users. PMID:11999175
Harmony Search Algorithm for Word Sense Disambiguation.
Abed, Saad Adnan; Tiun, Sabrina; Omar, Nazlia
2015-01-01
Word Sense Disambiguation (WSD) is the task of determining which sense of an ambiguous word (word with multiple meanings) is chosen in a particular use of that word, by considering its context. A sentence is considered ambiguous if it contains ambiguous word(s). Practically, any sentence that has been classified as ambiguous usually has multiple interpretations, but just one of them presents the correct interpretation. We propose an unsupervised method that exploits knowledge-based approaches for word sense disambiguation using Harmony Search Algorithm (HSA) based on a Stanford dependencies generator (HSDG). The role of the dependency generator is to parse sentences to obtain their dependency relations, whereas the goal of using the HSA is to maximize the overall semantic similarity of the set of parsed words. HSA invokes a combination of semantic similarity and relatedness measurements, i.e., Jiang and Conrath (jcn) and an adapted Lesk algorithm, to perform the HSA fitness function. Our proposed method was evaluated on benchmark datasets, which yielded results comparable to the state-of-the-art WSD methods. In order to evaluate the effectiveness of the dependency generator, we perform the same methodology without the parser, but with a window of words. The empirical results demonstrate that the proposed method is able to produce effective solutions for most instances of the datasets used.
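A sketch of a fitness function in the spirit described above (not the authors' code; assumes the NLTK wordnet and wordnet_ic corpora are downloaded):

```python
from itertools import combinations
from nltk.corpus import wordnet as wn, wordnet_ic

# Score a candidate sense assignment by summed pairwise Jiang-Conrath
# similarity; a harmony search would call this on each candidate vector.
brown_ic = wordnet_ic.ic("ic-brown.dat")

def fitness(senses):
    score = 0.0
    for s1, s2 in combinations(senses, 2):
        try:
            score += s1.jcn_similarity(s2, brown_ic)
        except Exception:      # jcn is undefined across parts of speech
            pass
    return score

candidate = [wn.synset("dog.n.01"), wn.synset("cat.n.01")]
print(fitness(candidate))
```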
NASA Astrophysics Data System (ADS)
Peng, B.; Guan, K.; Chen, M.
2016-12-01
Future agricultural production faces a grand challenge of higher temperature under climate change. There are multiple physiological or metabolic processes of how high temperature affects crop yield. Specifically, we consider the following major processes: (1) direct temperature effects on photosynthesis and respiration; (2) speed-up growth rate and the shortening of growing season; (3) heat stress during reproductive stage (flowering and grain-filling); (4) high-temperature induced increase of atmospheric water demands. In this work, we use a newly developed modeling framework (CLM-APSIM) to simulate the corn and soybean growth and explicitly parse the above four processes. By combining the strength of CLM in modeling surface biophysical (e.g., hydrology and energy balance) and biogeochemical (e.g., photosynthesis and carbon-nitrogen interactions), as well as that of APSIM in modeling crop phenology and reproductive stress, the newly developed CLM-APSIM modeling framework enables us to diagnose the impacts of high temperature stress through different processes at various crop phenology stages. Ground measurements from the advanced SoyFACE facility at University of Illinois is used here to calibrate, validate, and improve the CLM-APSIM modeling framework at the site level. We finally use the CLM-APSIM modeling framework to project crop yield for the whole US Corn Belt under different climate scenarios.
MacAlpine, D M; Perlman, P S; Butow, R A
2000-02-15
Mitochondrial DNA (mtDNA) is inherited as a protein-DNA complex (the nucleoid). We show that activation of the general amino acid response pathway in rho(+) and rho(-) petite cells results in an increased number of nucleoids without an increase in mtDNA copy number. In rho(-) cells, activation of the general amino acid response pathway results in increased intramolecular recombination between tandemly repeated sequences of rho(-) mtDNA to produce small, circular oligomers that are packaged into individual nucleoids, resulting in an approximately 10-fold increase in nucleoid number. The parsing of mtDNA into nucleoids due to general amino acid control requires Ilv5p, a mitochondrial protein that also functions in branched chain amino acid biosynthesis, and one or more factors required for mtDNA recombination. Two additional proteins known to function in mtDNA recombination, Abf2p and Mgt1p, are also required for parsing mtDNA into a larger number of nucleoids, although expression of these proteins is not under general amino acid control. Increased nucleoid number leads to increased mtDNA transmission, suggesting a mechanism to enhance mtDNA inheritance under amino acid starvation conditions.
PDB explorer -- a web based algorithm for protein annotation viewer and 3D visualization.
Nayarisseri, Anuraj; Shardiwal, Rakesh Kumar; Yadav, Mukesh; Kanungo, Neha; Singh, Pooja; Shah, Pratik; Ahmed, Sheaza
2014-12-01
The PDB file format is a text format characterizing the three-dimensional structures of macromolecules available in the Protein Data Bank (PDB). Determined protein structures are often found in association with other molecules or ions, such as nucleic acids, water, ions, and drug molecules, which can therefore also be described in the PDB format and have been deposited in the PDB database. A PDB file is machine generated and not in a human-readable format; to read it, a computational tool is needed. The objective of our present study is to develop free online software for retrieval, visualization, and reading of the annotation of a protein 3D structure available in the PDB database. The main aim is to render a PDB file in a human-readable format, i.e., the information in the PDB file is converted into readable sentences. It displays all available information from a PDB file, including the 3D structure. Programming and scripting languages including Perl, CSS, JavaScript, Ajax, and HTML have been used for the development of PDB Explorer. PDB Explorer directly parses the PDB file, calling methods for each parsed element: secondary structure elements, atoms, coordinates, etc. PDB Explorer is freely available at http://www.pdbexplorer.eminentbio.com/home with no requirement of log-in.
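A minimal sketch of the parsing step (not PDB Explorer itself, which is written in Perl): the PDB format's fixed columns make atom records straightforward to extract:

```python
# Extract atom records from a PDB file using the format's fixed columns.
def parse_atoms(path: str):
    atoms = []
    with open(path) as f:
        for line in f:
            if line.startswith(("ATOM", "HETATM")):
                atoms.append({
                    "name": line[12:16].strip(),      # atom name
                    "residue": line[17:20].strip(),   # residue name
                    "chain": line[21],                # chain identifier
                    "x": float(line[30:38]),          # orthogonal coordinates
                    "y": float(line[38:46]),
                    "z": float(line[46:54]),
                })
    return atoms
```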
The Ugly Truth About Ourselves and Our Robot Creations: The Problem of Bias and Social Inequity.
Howard, Ayanna; Borenstein, Jason
2017-09-21
Recently, there has been an upsurge of attention focused on bias and its impact on specialized artificial intelligence (AI) applications. Allegations of racism and sexism have permeated the conversation as stories surface about search engines delivering job postings for well-paying technical jobs to men and not women, or providing arrest mugshots when keywords such as "black teenagers" are entered. Learning algorithms are evolving; they are often created from parsing through large datasets of online information while having truth labels bestowed on them by crowd-sourced masses. These specialized AI algorithms have been liberated from the minds of researchers and startups, and released onto the public. Yet intelligent though they may be, these algorithms maintain some of the same biases that permeate society. They find patterns within datasets that reflect implicit biases and, in so doing, emphasize and reinforce these biases as global truth. This paper describes specific examples of how bias has infused itself into current AI and robotic systems, and how it may affect the future design of such systems. More specifically, we draw attention to how bias may affect the functioning of (1) a robot peacekeeper, (2) a self-driving car, and (3) a medical robot. We conclude with an overview of measures that could be taken to mitigate or halt bias from permeating robotic technology.
Conversion of Radiology Reporting Templates to the MRRT Standard.
Kahn, Charles E; Genereaux, Brad; Langlotz, Curtis P
2015-10-01
In 2013, the Integrating the Healthcare Enterprise (IHE) Radiology workgroup developed the Management of Radiology Report Templates (MRRT) profile, which defines both the format of radiology reporting templates using an extension of Hypertext Markup Language version 5 (HTML5), and the transportation mechanism to query, retrieve, and store these templates. Of 200 English-language report templates published by the Radiological Society of North America (RSNA), initially encoded as text and in an XML schema language, 168 have been converted successfully into MRRT using a combination of automated processes and manual editing; conversion of the remaining 32 templates is in progress. The automated conversion process applied Extensible Stylesheet Language Transformation (XSLT) scripts, an XML parsing engine, and a Java servlet. The templates were validated for proper HTML5 and MRRT syntax using web-based services. The MRRT templates allow radiologists to share best-practice templates across organizations and have been uploaded to the template library to supersede the prior XML-format templates. By using MRRT transactions and MRRT-format templates, radiologists will be able to directly import and apply templates from the RSNA Report Template Library in their own MRRT-compatible vendor systems. The availability of MRRT-format reporting templates will stimulate adoption of the MRRT standard and is expected to advance the sharing and use of templates to improve the quality of radiology reports.
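The automated conversion step can be sketched generically with lxml (the filenames and stylesheet below are placeholders; the RSNA conversion scripts themselves are not reproduced here):

```python
from lxml import etree

# Apply an XSLT stylesheet to an XML-format report template to produce an
# MRRT/HTML5 draft, which would then be validated and manually edited.
transform = etree.XSLT(etree.parse("xml_to_mrrt.xsl"))      # placeholder name
source = etree.parse("ct_abdomen_template.xml")             # placeholder name
mrrt_draft = transform(source)
print(etree.tostring(mrrt_draft, pretty_print=True).decode())
```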
NASA Astrophysics Data System (ADS)
Zhang, Min; Pavlicek, William; Panda, Anshuman; Langer, Steve G.; Morin, Richard; Fetterly, Kenneth A.; Paden, Robert; Hanson, James; Wu, Lin-Wei; Wu, Teresa
2015-03-01
DICOM Index Tracker (DIT) is an integrated platform that harvests the rich information available from Digital Imaging and Communications in Medicine (DICOM) data to improve quality assurance in radiology practices. It is designed to capture and maintain longitudinal patient-specific exam indices of interest for all diagnostic and procedural uses of imaging modalities. Thus, it effectively serves as a quality assurance and patient safety monitoring tool. The foundation of DIT is an intelligent database system that stores the information accepted and parsed via a DICOM receiver and parser; the database system enables basic dosimetry analysis. The success of the DIT implementation at Mayo Clinic Arizona calls for DIT deployment at the enterprise level, which requires significant improvements. First, for a geographically distributed multi-site implementation, one bottleneck is the communication (network) delay and another is the scalability of the DICOM parser in handling the large volume of exams from different sites; to address this, the DICOM receiver and parser are separated and decentralized by site. Second, a notable challenge for enterprise-wide quality assurance (QA) is the great diversity of manufacturers, modalities, and software versions; as a solution, DIT Enterprise provides standardization tools for device naming, protocol naming, and physician naming across sites. Third, advanced analytic engines are implemented online to support proactive QA in DIT Enterprise.
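A sketch of the kind of header harvesting such a DICOM parser performs (illustrative only; DIT's own receiver and parser are not shown here), using pydicom:

```python
import pydicom

# Read a stored DICOM object and pull a few dose- and QA-relevant header
# fields; which fields are present varies by manufacturer and modality.
def extract_indices(path: str) -> dict:
    ds = pydicom.dcmread(path, stop_before_pixels=True)
    return {
        "modality": ds.get("Modality"),
        "manufacturer": ds.get("Manufacturer"),   # device naming varies by site
        "station": ds.get("StationName"),
        "protocol": ds.get("ProtocolName"),       # protocol naming varies by site
        "ctdi_vol": ds.get("CTDIvol"),            # present on many CT objects
    }
```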
Andersson, Richard; Larsson, Linnea; Holmqvist, Kenneth; Stridh, Martin; Nyström, Marcus
2017-04-01
Almost all eye-movement researchers use algorithms to parse raw data and detect distinct types of eye movement events, such as fixations, saccades, and pursuit, and then base their results on these. Surprisingly, these algorithms are rarely evaluated. We evaluated the classifications of ten eye-movement event detection algorithms on data from an SMI HiSpeed 1250 system, and compared them to manual ratings of two human experts. The evaluation focused on fixations, saccades, and post-saccadic oscillations. The evaluation used both event duration parameters and sample-by-sample comparisons to rank the algorithms. The resulting event durations varied substantially as a function of what algorithm was used. This evaluation differed from previous evaluations by considering a relatively large set of algorithms, multiple events, and data from both static and dynamic stimuli. The main conclusion is that current detectors of only fixations and saccades work reasonably well for static stimuli, but barely better than chance for dynamic stimuli. Differing results across evaluation methods make it difficult to select one winner for fixation detection. For saccade detection, however, the algorithm by Larsson, Nyström and Stridh (IEEE Transactions on Biomedical Engineering, 60(9):2484-2493, 2013) outperforms all algorithms in data from both static and dynamic stimuli. The data also show how improperly selected algorithms applied to dynamic data misestimate fixation and saccade properties.
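None of the ten evaluated detectors is reproduced here; as a generic illustration of what such algorithms do, a bare-bones velocity-threshold (I-VT-style) classifier might look like this:

```python
import numpy as np

# Classify each inter-sample interval as saccade or fixation by angular
# velocity. Assumes gaze positions in degrees and a fixed sampling rate;
# the threshold value is a common rule of thumb, not a recommendation.
def classify_ivt(x, y, sampling_rate_hz=1250.0, threshold_deg_s=30.0):
    vx = np.diff(x) * sampling_rate_hz
    vy = np.diff(y) * sampling_rate_hz
    speed = np.hypot(vx, vy)                 # angular velocity, deg/s
    return np.where(speed > threshold_deg_s, "saccade", "fixation")

x = np.array([0.0, 0.01, 0.02, 1.5, 3.0, 3.01])
y = np.zeros_like(x)
print(classify_ivt(x, y))
```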
MASC: Multiprocessor Architecture for Symbolic Processing
1989-08-01
accomplish the ultimate goal of reducing the time to develop the correct parse. Effect of Sentence Length: Figure 5-6 shows the relationship... later; this effectively restricts the breadth of study rather severely. Objective: The tools described here have been developed in response to...
MSR 2.0: Language Definition and Programming Environment
2011-11-01
this version, was also used in foundational studies for crypto-protocols [14, 19]. An implementation of MSR 2.0 which adheres to the definition... describing non-standard ways to parse occurrences of this symbol. A unary constant f can be declared either prefixed or postfixed by means of the... Information Technology — MFCSIT'00, pages 1–43, Cork, Ireland, 2000. Elsevier ENTCS 40. [6] Iliano Cervesato. A Specification Language for Crypto
Parsing Chinese-Russian Military Exercises
2015-04-01
the rubric of their bilateral friendship treaty, than Peace Mission 2007, which involved combat troops from other SCO members. Peace Mission 2009... 3rd Air Force and Air Defense Command. Russia also contributed 60 armored vehicles (including 40 BMP-2 infantry combat vehicles and 13 T-72 main... large U.S.-Philippines amphibious drill and followed a series of U.S.-South Korean military exercises that some Chinese and Russian commentators had
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, Joseph; Pirrung, Meg; McCue, Lee Ann
FQC is software that facilitates quality control of FASTQ files by carrying out a QC protocol using FastQC, parsing results, and aggregating quality metrics into an interactive dashboard designed to richly summarize individual sequencing runs. The dashboard groups samples in dropdowns for navigation among the data sets, utilizes human-readable configuration files to manipulate the pages and tabs, and is extensible with CSV data.
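A small sketch of the parsing step (not FQC itself): FastQC writes a tab-separated summary.txt with one PASS/WARN/FAIL line per analysis module, which is straightforward to aggregate:

```python
import csv
from collections import Counter

# Each summary.txt row is: status <TAB> module name <TAB> filename.
def parse_fastqc_summary(path: str):
    with open(path) as f:
        return [(status, module, filename)
                for status, module, filename in csv.reader(f, delimiter="\t")]

rows = parse_fastqc_summary("summary.txt")
print(Counter(status for status, _, _ in rows))   # e.g. PASS/WARN/FAIL totals
```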
Supporting NATO C2-Simulation Experimentation with Scripted Web Services
2011-06-01
SBMLServices services must parse the input scripts. • Semaphores are created to ensure serial access to the remaining global resources: − Since there can only be one connection to the JC3IEDM RI, that connection now must be shared among all instances; this requires a semaphore to control access... Initialization of SBMLServer is also now protected by a semaphore. • Setting and using object identifiers (OIDs) for pushing to the RI requires
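The semaphore pattern described above can be illustrated in a few lines (a generic Python sketch; the actual services are Java, and the connection here is a stand-in):

```python
import threading

# A semaphore serializing access to a single shared connection among
# concurrent request handlers, as with the one JC3IEDM RI connection.
connection_lock = threading.Semaphore(1)

def handle_request(push):
    with connection_lock:      # acquire before touching the shared resource
        push()                 # e.g., push objects over the single connection

threads = [threading.Thread(target=handle_request, args=(lambda: None,))
           for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
```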
An Improved Tarpit for Network Deception
2016-03-25
...arrow in a part-whole relationship. In the diagram, GreaseMonkey contains the three packet handler classes. The numbers next to the PriorityQueue and... arrow from Greasy to the config_parser module represents a usage relationship, where Greasy uses functions from config_parser to parse the configuration
An expert system for natural language processing
NASA Technical Reports Server (NTRS)
Hennessy, John F.
1988-01-01
A solution to the natural language processing problem that uses a rule-based system, written in OPS5, to replace the traditional parsing method is proposed. The advantages of using a rule-based system are explored. Specifically, the extensibility of a rule-based solution is discussed, as well as the value of maintaining rules that function independently. Finally, the power of using semantics to supplement the syntactic analysis of a sentence is considered.
Criteria for Evaluating the Performance of Compilers
1974-10-01
cannot be made to fit, then an auxiliary mechanism outside the parser might be used. Finally, changing the choice of parsing technique to a... was not useful in providing a basis for compiler evaluation. The study of the first question established criteria and methods for assigning four... program. The study of the second question established criteria for defining a "compiler Gibson mix", and established methods for using this "mix" to
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sorokine, Alexandre
2011-10-01
Simple Ontology Format (SOFT) library and file format specification provides a set of simple tools for developing and maintaining ontologies. The library, implemented as a Perl module, supports parsing and verification of files in SOFT format, operations on ontologies (adding, removing, or filtering entities), and conversion of ontologies into other formats. SOFT allows users to quickly create ontologies using only a basic text editor, verify them, and portray them in a graph layout system using customized styles.
Can Attention be Divided Between Perceptual Groups?
NASA Technical Reports Server (NTRS)
McCann, Robert S.; Foyle, David C.; Johnston, James C.; Hart, Sandra G. (Technical Monitor)
1994-01-01
Previous work using Head-Up Displays (HUDs) suggests that the visual system parses the HUD and the outside world into distinct perceptual groups, with attention deployed sequentially to first one group and then the other. New experiments show that both groups can be processed in parallel in a divided attention search task, even though subjects have just processed a stimulus in one perceptual group or the other. Implications for models of visual attention will be discussed.
The Effect of Input Device on User Performance With a Menu-Based Natural Language Interface
1988-01-01
Texas. The experiment was conducted and the data were analyzed by Virginia Polytechnic Institute and State University human factors engineering personnel... comments. Thanks to Dr. William Fisher for his help in the parsing of the grammar used in the MBNL interface prototype, and to Mr. Ken Stevenson for... natural language instructions to accomplish particular tasks (Bobrow & Collins, 1975; Brown, Burton, & Bell, 1975; Ford, 1981; Green, Wolf, Chomsky
ERIC Educational Resources Information Center
Queiroz, Fernanda Cristina Barbosa Pereira; Samohyl, Robert Wayne; Queiroz, Jamerson Viegas; Lima, Nilton Cesar; de Souza, Gustavo Henrique Silva
2014-01-01
This paper aims to develop and implement a method to identify the causes of course choice and the reasons for student dropout in higher education. In this way, we sought to identify the factors that influence students' choice of the higher education institution analyzed, as well as the factors influencing dropout from it. The methodology employed was…
CLUSTER: An Approach to Contextual Language Understanding
1986-04-01
...purely syntactic investigation of an utterance, such as that resulting in a syntactic parse tree. The latter process is traditionally referred to as... only hurts when I laugh! and verbatim texts, e.g., 99 and 44/100 percent pure. Both of the above expressions can be understood in a productive
2006-03-01
converters from GIL and many other formats. Other highlights: command line argument parsing, a simple set of routines for developing X Windows graphical... Ramakrishna Nemani, James E. Vogelmann, V. Ruth Hobson, Benjamin Tuttle, Jeff Safran, Ingrid Nelson. (2001). "Development Sprawl Impacts on the... Sale Prices as a Basis for Farm Land Appraisal," Technical Bulletin, University of Minnesota. Hosmer, D.W., and S. Lemeshow. (1989). Applied
Inducing Multilingual Text Analysis Tools via Robust Projection across Aligned Corpora
2001-01-01
monolingual dictionary-derived list of canonical roots would resolve ambiguity regarding which is the appropriate target. • Many of the errors are... system and set of algorithms for automatically inducing stand-alone monolingual part-of-speech taggers, base noun-phrase bracketers, named-entity... corpora has tended to focus on their use in translation model training for MT rather than on monolingual applications. One exception is bilingual parsing
Haegens, Saskia; Barczak, Annamaria; Musacchia, Gabriella; Lipton, Michael L; Mehta, Ashesh D; Lakatos, Peter; Schroeder, Charles E
2015-10-21
The functional significance of the α rhythm is widely debated. It has been proposed that α reflects sensory inhibition and/or a temporal sampling or "parsing" mechanism. There is also continuing disagreement over the more fundamental questions of which cortical layers generate α rhythms and whether the generation of α is equivalent across sensory systems. To address these latter questions, we analyzed laminar profiles of local field potentials (LFPs) and concomitant multiunit activity (MUA) from macaque V1, S1, and A1 during both spontaneous activity and sensory stimulation. Current source density (CSD) analysis of laminar LFP profiles revealed α current generators in the supragranular, granular, and infragranular layers. MUA phase-locked to local current source/sink configurations confirmed that α rhythms index local neuronal excitability fluctuations. CSD-defined α generators were strongest in the supragranular layers, whereas LFP α power was greatest in the infragranular layers, consistent with some of the previous reports. The discrepancy between LFP and CSD findings appears to be attributable to contamination of the infragranular LFP signal by activity that is volume-conducted from the stronger supragranular α generators. The presence of α generators across cortical depth in V1, S1, and A1 suggests the involvement of α in feedforward as well as feedback processes and is consistent with the view that α rhythms, perhaps in addition to a role in sensory inhibition, may parse sensory input streams in a way that facilitates communication across cortical areas. The α rhythm is thought to reflect sensory inhibition and/or a temporal parsing mechanism. Here, we address two outstanding issues: (1) whether α is a general mechanism across sensory systems and (2) which cortical layers generate α oscillations. Using intracranial recordings from macaque V1, S1, and A1, we show α band activity with a similar spectral and laminar profile in each of these sensory areas. Furthermore, α generators were present in each of the cortical layers, with a strong source in superficial layers. We argue that previous findings, locating α generators exclusively in the deeper layers, were biased because of use of less locally specific local field potential measurements. The laminar distribution of α band activity appears more complex than generally assumed. Copyright © 2015 the authors 0270-6474/15/3514341-12$15.00/0.
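The CSD analysis mentioned above estimates current sources and sinks from the second spatial derivative of the LFP across the laminar contacts; a minimal numerical sketch (conductivity factor omitted, contact spacing assumed):

```python
import numpy as np

# One-dimensional CSD estimate: proportional to the negative second
# spatial derivative of the LFP across equally spaced laminar contacts.
def csd_profile(lfp: np.ndarray, spacing_mm: float = 0.1) -> np.ndarray:
    # lfp has shape (n_channels, n_samples); differentiate across channels
    return -(lfp[:-2] - 2 * lfp[1:-1] + lfp[2:]) / spacing_mm**2

lfp = np.random.randn(23, 1000)          # e.g., a 23-contact laminar probe
print(csd_profile(lfp).shape)            # (21, 1000): interior contacts only
```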
Research Activities of the Northwest Laboratory for Integrated Systems
1987-04-06
table, and composite table (to assist evaluation of objects) are each built. The parse tree is also checked to make sure there are no meaningless... Stanford) as well as the Apollo DN series. All of these implementations require eight bit planes for effective use of color. Also supported are AED... time of intersection had not yet passed, the queuing of the segment was delayed until that time. This algorithm had the effect of preserving the slope of
Toward Deriving Software Architectures from Quality Attributes
1994-08-01
environments rely on the notion of a "tool bus" or an explicit shared repository [Wasserman 89] to allow easy integration of tools. 4.7 Unit... attributed parse tree and symbol table that the compiler creates and annotates during its various phases. This results in a very different software
Parse Completion: A Study of an Inductive Domain
1987-07-01
for Right Linear and Chomsky Normal Form grammars in detail. These two grammar classes were chosen as they can capture the classes of Regular and...Linear and Chomsky Normal Form grammars the allowed RHS formats could be divided into those which introduced new non-terminals and those which reused... Chomsky Normal Form grammars can both be shown to define a partial order over the set of grammars consistent with the examples. (Note that this is a
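In Chomsky Normal Form every production is A -> B C or A -> a, which is what makes CYK-style table parsing possible; a small recognizer sketch (an illustration of the grammar class, not code from the report):

```python
# CYK recognizer for a Chomsky Normal Form grammar. table[i][j] holds the
# nonterminals that can derive words[i:j].
def cyk(words, binary_rules, lexical_rules, start="S"):
    n = len(words)
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][i + 1] = {a for a, b in lexical_rules if b == w}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for k in range(i + 1, i + span):          # split point
                for a, (b, c) in binary_rules:
                    if b in table[i][k] and c in table[k][i + span]:
                        table[i][i + span].add(a)
    return start in table[0][n]

binary = [("S", ("NP", "VP"))]
lexical = [("NP", "parsers"), ("VP", "generalize")]
print(cyk(["parsers", "generalize"], binary, lexical))  # True
```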
A Tandem Semantic Interpreter for Incremental Parse Selection
1990-09-28
syntactic role to semantic role. An example from Fillmore [10] is the sentence I copied that letter, which can be uttered when pointing either to... person. We want the word fiddle to have the sort predicate violin as its lexical interpretation, however, not thing. Thus, for Ross went for his fiddle... to receive an interpretation, a sort hierarchy is needed to establish that all violins are things. A well-structured sort hierarchy allows newly added
Evaluation of Image Segmentation and Object Recognition Algorithms for Image Parsing
2013-09-01
generation of the features from the key points. OpenCV uses Euclidean distance to match the key points and has the option to use Manhattan distance... feature vector includes polarity and intensity information. The final step is matching the key points. In OpenCV, Euclidean distance or Manhattan... the code below is one way, and OpenCV offers the function radiusMatch (a pair must have a distance less than a given maximum distance). OpenCV's
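The radiusMatch step mentioned above can be sketched as follows (an illustrative snippet, not the report's code; ORB features with Hamming distance are used here, and the filenames and threshold are placeholders):

```python
import cv2

# Detect ORB key points in two images and match descriptors with
# radiusMatch, keeping only pairs within a maximum distance.
img1 = cv2.imread("scene_a.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("scene_b.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING)          # Hamming distance for ORB
matches = matcher.radiusMatch(des1, des2, maxDistance=40)
good = [m[0] for m in matches if m]                # closest match within radius
print(len(good), "matches within radius")
```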
Aligning HST Images to Gaia: A Faster Mosaicking Workflow
NASA Astrophysics Data System (ADS)
Bajaj, V.
2017-11-01
We present a fully programmatic workflow for aligning HST images using the high-quality astrometry provided by Gaia Data Release 1. Code provided in a Jupyter Notebook works through this procedure, including parsing the data to determine the query area parameters, querying Gaia for the coordinate catalog, and using the catalog with TweakReg as reference catalog. This workflow greatly simplifies the normally time-consuming process of aligning HST images, especially those taken as part of mosaics.
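A condensed sketch of the query-and-align steps (not the notebook itself; the ADQL cone parameters and file names are placeholders for values parsed from the image headers):

```python
from astroquery.gaia import Gaia

# Query Gaia DR1 for sources within a cone covering the mosaic footprint.
job = Gaia.launch_job("""
    SELECT ra, dec
    FROM gaiadr1.gaia_source
    WHERE 1 = CONTAINS(POINT('ICRS', ra, dec),
                       CIRCLE('ICRS', 150.0, 2.2, 0.1))
""")
refcat = job.get_results()
refcat["ra", "dec"].write("gaia_refcat.txt", format="ascii.commented_header")

# The catalog can then be passed to TweakReg as the reference catalog, e.g.:
# from drizzlepac import tweakreg
# tweakreg.TweakReg("*_flc.fits", refcat="gaia_refcat.txt", updatehdr=True)
```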
Prediction and constancy of cognitive-motivational structures in mothers and their adolescents.
Malerstein, A J; Ahern, M M; Pulos, S; Arasteh, J D
1995-01-01
Three clinically-derived, cognitive-motivational structures were predicted in 68 adolescents from their caregiving situations as revealed in their mothers' interviews, elicited six years earlier. Basic to each structure is a motivational concern and its related social cognitive style, a style which corresponds to a Piagetian cognitive stage: concrete operational, intuitive or symbolic. Because these structure types parse a non-clinical population, current views of health and accordingly goals of treatment may need modification.
Cook, Tessa S; Zimmerman, Stefan L; Steingall, Scott R; Maidment, Andrew D A; Kim, Woojin; Boonn, William W
2011-01-01
There is growing interest in the ability to monitor, track, and report exposure to radiation from medical imaging. Historically, however, dose information has been stored on an image-based dose sheet, an arrangement that precludes widespread indexing. Although scanner manufacturers are beginning to include dose-related parameters in the Digital Imaging and Communications in Medicine (DICOM) headers of imaging studies, there remains a vast repository of retrospective computed tomographic (CT) data with image-based dose sheets. Consequently, it is difficult for imaging centers to monitor their dose estimates or participate in the American College of Radiology (ACR) Dose Index Registry. An automated extraction software pipeline known as Radiation Dose Intelligent Analytics for CT Examinations (RADIANCE) has been designed that quickly and accurately parses CT dose sheets to extract and archive dose-related parameters. Optical character recognition of information in the dose sheet leads to creation of a text file, which along with the DICOM study header is parsed to extract dose-related data. The data are then stored in a relational database that can be queried for dose monitoring and report creation. RADIANCE allows efficient dose analysis of CT examinations and more effective education of technologists, radiologists, and referring physicians regarding patient exposure to radiation at CT. RADIANCE also allows compliance with the ACR's dose reporting guidelines and greater awareness of patient radiation dose, ultimately resulting in improved patient care and treatment.
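A minimal sketch of the OCR-then-parse stage described above (not RADIANCE itself) is shown below; dose-sheet field labels vary by scanner vendor, so the file name and the regular expression are illustrative assumptions.

```python
# Sketch of OCR plus parsing for a CT dose sheet image. Assumes the
# Tesseract engine via pytesseract; the "Total DLP" label is illustrative.
import re
import pytesseract
from PIL import Image

def extract_total_dlp(dose_sheet_png):
    # OCR the dose sheet image into plain text.
    text = pytesseract.image_to_string(Image.open(dose_sheet_png))
    # Look for lines such as "Total DLP (mGy-cm): 123.45".
    match = re.search(r'Total\s+DLP.*?([\d.]+)', text, re.IGNORECASE)
    return float(match.group(1)) if match else None

print(extract_total_dlp('ct_dose_sheet.png'))
```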
Phrase Lengths and the Perceived Informativeness of Prosodic Cues in Turkish.
Dinçtopal Deniz, Nazik; Fodor, Janet Dean
2017-12-01
It is known from previous studies that in many cases (though not all) the prosodic properties of a spoken utterance reflect aspects of its syntactic structure, and also that in many cases (though not all) listeners can benefit from these prosodic cues. A novel contribution to this literature is the Rational Speaker Hypothesis (RSH), proposed by Clifton, Carlson and Frazier. The RSH maintains that listeners are sensitive to possible reasons for why a speaker might introduce a prosodic break: "listeners treat a prosodic boundary as more informative about the syntax when it flanks short constituents than when it flanks longer constituents," because in the latter case the speaker might have been motivated solely by consideration of optimal phrase lengths. This would effectively reduce the cue value of an appropriately placed prosodic boundary. We present additional evidence for the RSH from Turkish, a language typologically different from English. In addition, our study shows for the first time that the RSH also applies to a prosodic break which conflicts with the syntactic structure, reducing its perceived cue strength if it might have been motivated by length considerations. In this case, the RSH effect is beneficial. Finally, the Turkish data show that prosody-based explanations for parsing preferences such as the RSH do not take the place of traditional syntax-sensitive parsing strategies such as Late Closure. The two sources of guidance co-exist; both are used when available.
Optimizing ROOT’s Performance Using C++ Modules
NASA Astrophysics Data System (ADS)
Vassilev, Vassil
2017-10-01
ROOT comes with a C++-compliant interpreter, cling. Cling needs to understand the content of the libraries in order to interact with them. Exposing the full shared library descriptors to the interpreter at runtime translates into an increased memory footprint. ROOT's exploratory programming concepts allow implicit and explicit runtime shared library loading, which requires the interpreter to load the library descriptor. Re-parsing of the descriptors' content has a noticeable effect on runtime performance. The present state-of-the-art lazy parsing technique brings runtime performance to reasonable levels but proves to be fragile and can introduce correctness issues. An elegant solution is to load information from the descriptor lazily and in a non-recursive way. The LLVM community advances its C++ Modules technology, providing an I/O-efficient, on-disk representation capable of reducing build times and peak memory usage. The feature is standardized as a C++ technical specification. C++ Modules are a flexible concept, which can be employed to match CMS and other experiments' requirements for ROOT: to optimize both runtime memory usage and performance. Cling technically "inherits" the feature; however, tweaking it to ROOT scale and beyond is a complex endeavor. The paper discusses the status of C++ Modules in the context of ROOT, supported by a few preliminary performance results. It shows a step-by-step migration plan and describes potential challenges which could appear.
NASA Taxonomies for Searching Problem Reports and FMEAs
NASA Technical Reports Server (NTRS)
Malin, Jane T.; Throop, David R.
2006-01-01
Many types of hazard and risk analyses are used during the life cycle of complex systems, including Failure Modes and Effects Analysis (FMEA), Hazard Analysis, Fault Tree and Event Tree Analysis, Probabilistic Risk Assessment, Reliability Analysis and analysis of Problem Reporting and Corrective Action (PRACA) databases. The success of these methods depends on the availability of input data and the analyst's knowledge. Standard nomenclature can increase the reusability of hazard, risk and problem data. When nomenclature in the source texts is not standard, taxonomies with mapping words (sets of rough synonyms) can be combined with semantic search to identify items and tag them with metadata based on a rich standard nomenclature. Semantic search uses word meanings in the context of parsed phrases to find matches. The NASA taxonomies provide the word meanings. Spacecraft taxonomies and ontologies (generalization hierarchies with attributes and relationships, based on the terms' meanings) are being developed for types of subsystems, functions, entities, hazards and failures. The ontologies are broad and general, covering hardware, software and human systems. Semantic search of Space Station texts was used to validate and extend the taxonomies. The taxonomies have also been used to extract system connectivity (interaction) models and functions from requirements text. Now the Reconciler semantic search tool and the taxonomies are being applied to improve search in the Space Shuttle PRACA database, to discover recurring patterns of failure. Usual methods of string search and keyword search fall short because the entries are terse and have numerous shortcuts (irregular abbreviations, nonstandard acronyms, cryptic codes), and modifier words cannot be used in sentence context to refine the search. The limited and fixed FMEA categories associated with the entries do not make the fine distinctions needed in the search. The approach assigns PRACA report titles to problem classes in the taxonomy. Each ontology class includes mapping words - near-synonyms naming different manifestations of that problem class. The mapping words for Problems, Entities and Functions are converted to a canonical form plus any of a small set of modifier words (e.g., non-uniformity becomes NOT + UNIFORM). The report titles are parsed as sentences if possible, or treated as a flat sequence of word tokens if parsing fails. When canonical forms in the title match mapping words, the PRACA entry is associated with the corresponding Problem, Entity or Function in the ontology. The user can search for types of failures associated with types of equipment, clustering by type of problem (e.g., all bearings found with problems of being uneven: rough, irregular, gritty). The results could also be used for tagging PRACA report entries with rich metadata. This approach could also be applied to searching and tagging failure modes, failure effects and mitigations in FMEAs. In the pilot work, parsing 52K+ truncated titles (the test cases that were available) resulted in identification of both a type of equipment and a type of problem in about 75% of the cases. The results are displayed in a manner analogous to Google search results. The effort has also led to the enrichment of the taxonomy, adding some new categories and many new mapping words. Further work would make enhancements that have been identified for improving the clustering and further reducing the false alarm rate.
(In searching for recurring problems, good clustering is more important than reducing false alarms). Searching complete PRACA reports should lead to immediate improvement.
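A small Python sketch of the canonical-form matching idea described above follows; the problem class, mapping words, and sample title are hypothetical stand-ins for the actual NASA taxonomy content.

```python
# Illustrative mapping-word matching against report titles. The taxonomy
# entry and title below are hypothetical, not the NASA ontology's content.
PROBLEM_CLASSES = {
    'NON-UNIFORMITY': {'rough', 'irregular', 'gritty', 'uneven',
                       'not uniform'},
}

def canonicalize(token):
    # Lowercase, strip punctuation, and expand a "non-" prefix into the
    # modifier form ("not X"), mirroring the NOT + UNIFORM example above.
    token = token.lower().strip('.,;:()')
    return 'not ' + token[4:] if token.startswith('non-') else token

def classify_title(title):
    # Fall back to a flat sequence of word tokens, as when parsing fails.
    tokens = [canonicalize(t) for t in title.split()]
    return {cls for cls, words in PROBLEM_CLASSES.items()
            if any(t in words for t in tokens)}

print(classify_title('BEARING RACE GRITTY, NON-UNIFORM WEAR'))
# {'NON-UNIFORMITY'}
```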
Gradient Boosting for Conditional Random Fields
2014-09-23
length around 208. We use exactly the same features and data split step as [22]. The resulting data set contains 5600 sequences as training set, 256...1097-1104. [16] F. Sha and F. Pereira. Shallow parsing with conditional random fields. In Proceedings of the 2003 Conference of the North American
Obscuration Code with Space Station Applications (Manual)
1985-12-01
used to perform this DCL style command parsing, readers are referred to the VMS documentation concerning the Command Definition Utility or CDU....FOR007.DAT; Input echo file: USERI:[RJM.NASJAN5S1.LIS;3 The above examples show the operation of the SET OUTPUT command. Note that the printer file is...be opened using the SET OUTPUT command. The output files can be opened and closed using the SET OUTPUT /ECHOING, /PRINTABLE, /PLOTTABLE commands
Feeling Peaceful: A Universal Living Experience.
Doucet, Thomas J
2018-01-01
The purpose of this study was to investigate the living experience of feeling peaceful. Parse's research method was used to answer the question: What is the structure of the living experience of feeling peaceful? Twelve participants living in a community consented to partake in the study. The central finding of the study is the structure: feeling peaceful is contentedness amid tribulation, as unburdening surfaces with devout involvements. The findings are discussed in relation to the humanbecoming school of thought and extant literature.
2014-06-01
central location. Each of the SQLite databases is converted and stored in one MySQL database and the pcap files are parsed to extract call information...from the specific communications applications used during the experiment. This extracted data is then stored in the same MySQL database. With all...rhythm of the event. Figure 3 demonstrates the application usage over the course of the experiment for the EXDIR. As seen, the EXDIR spent the majority
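A hedged Python sketch of the consolidation step this snippet describes, copying rows from per-machine SQLite databases into one central MySQL database; the table and column names are hypothetical, and mysql-connector-python is assumed.

```python
# Consolidate per-client SQLite usage logs into a central MySQL database.
# Table/column names and connection details are hypothetical placeholders.
import sqlite3
import mysql.connector

central = mysql.connector.connect(user='exp', password='...',
                                  host='central', database='experiment')
cursor = central.cursor()

for path in ['client1.db', 'client2.db']:
    local = sqlite3.connect(path)
    for app, start, end in local.execute(
            'SELECT app_name, start_time, end_time FROM usage_log'):
        cursor.execute(
            'INSERT INTO usage_log (app_name, start_time, end_time) '
            'VALUES (%s, %s, %s)', (app, start, end))
    local.close()

central.commit()
central.close()
```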
The TextLearner System: Reading Learning Comprehension
2006-06-01
edibles, we would generate strings such as "beef," "tomato," and "salmon." 2. Find all strings which Cyc recognizes as satisfying the head syntactic...Comprehension Final Report, Cycorp, June 2006. For example, "beef stew," "tomato soup," "salmon tofu." 4. Use the rule to parse these combinations into the...10245136186770428 (RelationalNCRuleFn 10.1 1219512195121951 (PresentTenseVersionFn transporter) 2 Transport-TheWord) [Anthrax-Vaccination-NCR
1988-08-01
heavily on the original SPQR component, and uses the same context-free grammar to analyze the ISR. The main difference is that, where before SPQR...ISR is semantically coherent. This has been tested thoroughly on the CASREPS domain, and selects the same parses that SPQR did, in less time. There...were a few SPQR patterns that reflected semantic information that could only be provided by time analysis, such as the fact that [pressure during
The Rise of the Pasdaran. Assessing the Domestic Roles of Iran’s Islamic Revolutionary Guards Corps
2009-01-01
Ministry of Oil News Agency, "Emza-e Gharardad-e Shirinsaziy-e Gas projey-e Iran LNG" (The contract of gas sweetening of Iran's LNG project was signed...established in 1999 to import sugar, construction materials, and pharmaceuticals. It is also said to maintain an office near a suspected nuclear...Gharardad-e Shirinsaziye gase faze 12 parse jonubi emza shod" (The Agreement on the Sweetening of Gas from South Pars Phase 12 Has Been Signed
BBN: Description of the PLUM System as Used for MUC-4
1992-01-01
in the MUC-4 corpus. Here are the 8 parse fragments generated by FPP for the first sentence of TST2-MUC4-0048: ("SALVADORAN PRESIDENT-ELECT ALFREDO...extensive patterns for fragment combination. Figure 2 shows a graphical version of the semantics generated for the first fragment of S1 in TST2-MUC4...trigger. Following is the discourse event structure for the first event in TST2-MUC4-0048: Event MURDER Trigger fragments: "SALVADORAN PRESIDENT
Preliminary Analysis of a Breadth-First Parsing Algorithm: Theoretical and Experimental Results.
1981-06-01
present discussion we will assume that phrases have one or two daughters, or more formally, that the grammar is in Chomsky Normal Form [1].) This...grammar point of view, these pairs contrast Chomsky Normal Form [1] with Categorial Grammars [2], and from a representational point of view, these pairs...chart(i, j) = chart(i, k) * chart(k, j) bottom-up (Chomsky Normal Form) (9); chart(i, j) = chart(i, k) * chart(k, j) top-down (Categorial Grammars); Earley's Algorithm [8
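The bottom-up combination rule above is the core of CKY recognition for a Chomsky Normal Form grammar; a compact Python sketch with a toy grammar (not from the report) follows.

```python
# CKY recognition for a toy Chomsky Normal Form grammar.
UNARY = {'the': {'Det'}, 'dog': {'N'}, 'barks': {'V'}}
BINARY = {('Det', 'N'): {'NP'}, ('NP', 'V'): {'S'}}

def cky_recognize(words, start='S'):
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] |= UNARY.get(w, set())
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                # Bottom-up CNF rule: add A to chart(i, j) whenever
                # A -> B C with B in chart(i, k) and C in chart(k, j).
                for b in chart[i][k]:
                    for c in chart[k][j]:
                        chart[i][j] |= BINARY.get((b, c), set())
    return start in chart[0][n]

print(cky_recognize('the dog barks'.split()))  # True
```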
2002-01-01
of the U.S. Army includes a strong and continuous presence in regions with high human immunodeficiency virus (HIV) and STI prevalence makes disease...changes on their pap smears. As we now understand, the human papilloma virus is sexually transmitted. We had to send them out to Germany to get... Human Immunodeficiency Virus (HIV) Education and HIV Risk Behavior: A Survey of Rapid Deployment Troops. Military Medicine, 163, 672-675. Parse, R
Ramu, Chenna
2003-07-01
SIRW (http://sirw.embl.de/) is a World Wide Web interface to the Simple Indexing and Retrieval System (SIR) that is capable of parsing and indexing various flat file databases. In addition it provides a framework for doing sequence analysis (e.g. motif pattern searches) for selected biological sequences through keyword search. SIRW is an ideal tool for the bioinformatics community for searching as well as analyzing biological sequences of interest.
Parser for Sabin-to-Mahoney Transition Model of Quasispecies Replication
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ecale Zhou, Carol
2016-01-03
This code is a data parser for preparing output from the Qspp agent-based stochastic simulation model for plotting in Excel. This code is specific to a set of simulations that were run for the purpose of preparing data for a publication. It is necessary to make this code open-source in order to publish the model code (Qspp), which has already been released. There is a necessity of assuring that results from using Qspp for a publication
1989-09-30
Please choose a list of switches, or type "ok." -- [3,5,7]. Changed the switch: parse-tree ---------------------------------> ON Changed the switch...argument of the verb, especially in the passive (The car was found parked on Elm Street). Other verbs are clearer: They reported the car stolen doesn't...object slot in the passive object passobj, as in the tree above. Strings, LXiRs and Disjunctive Rules In general, there are three basic types of rules in
1982-01-01
the best known being ELIZA - a simulated Rogerian psychotherapist (Weizenbaum 1966), and PARRY - a simulated paranoid patient (Colby 1968). These...derived from the syntactic aspects of the input, that is, the word classes (noun, verb, etc.) rather than the word meanings. The concept of parsing is...captures the "full" meaning of a word or concept; consequently few researchers actually seek "absolute" definitions of words. The definition of a word, as
Learning and Parsing Video Events with Goal and Intent Prediction
2012-03-19
including office, lab, hallway, corridor and near vending machines. Figure 14 shows some screenshots of the videos. The training video total lasts...most of the ambiguities can be removed by the event context in the top-down bottom-up inference; we will show this in the experiment section. Figure 5...events, and remove the ambiguities in the detection of atomic actions by the event context. The energy of PG is E(PG | I_∧) = p(K) Σ_{k=1}^{K} ε(pg_k | I
An Overview of the Production Quality Compiler-Compiler Project
1979-02-01
process. A parse tree is assumed, and there is a set of primitives for extracting information from it and for "walking" it: using its structure to...not adequate for, and even preclude, techniques that involve multiple phases, or non-trivial auxiliary data structures. In recent years there have...VALUE field of node 23 would indicate that the type of the value field was integer. As with "union mode" or "variant record" features in many
Evaluating the Reliability of Emergency Response Systems for Large-Scale Incident Operations
Jackson, Brian A.; Faith, Kay Sullivan; Willis, Henry H.
2012-01-01
Abstract The ability to measure emergency preparedness—to predict the likely performance of emergency response systems in future events—is critical for policy analysis in homeland security. Yet it remains difficult to know how prepared a response system is to deal with large-scale incidents, whether it be a natural disaster, terrorist attack, or industrial or transportation accident. This research draws on the fields of systems analysis and engineering to apply the concept of system reliability to the evaluation of emergency response systems. The authors describe a method for modeling an emergency response system; identifying how individual parts of the system might fail; and assessing the likelihood of each failure and the severity of its effects on the overall response effort. The authors walk the reader through two applications of this method: a simplified example in which responders must deliver medical treatment to a certain number of people in a specified time window, and a more complex scenario involving the release of chlorine gas. The authors also describe an exploratory analysis in which they parsed a set of after-action reports describing real-world incidents, to demonstrate how this method can be used to quantitatively analyze data on past response performance. The authors conclude with a discussion of how this method of measuring emergency response system reliability could inform policy discussion of emergency preparedness, how system reliability might be improved, and the costs of doing so. PMID:28083267
Research issues of geometry-based visual languages and some solutions
NASA Astrophysics Data System (ADS)
Green, Thorn G.
This dissertation addresses the problem of how to design visual language systems that are based upon Geometric Algebra and provide a visual coupling of algebraic expressions and geometric depictions. This coupling of algebraic expressions and geometric depictions provides a new means for expressing both mathematical and geometric relationships present in mathematics, physics, and Computer-Aided Geometric Design (CAGD). Another significant feature of such a system is that the result of changing a parameter (by dragging the mouse) can be seen immediately in the depiction(s) of all expressions that use that parameter. This greatly aids the cognition of the relationships between variables. Systems for representing such a coupling of algebra and geometry have characteristics of both visual language systems and systems for scientific visualization. Instead of using a parsing or dataflow paradigm for the visual language representation, these systems represent equations as manipulable constrained diagrams for their visualization. The design of such a system requires (but is not limited to) a means for parsing equations entered by the user; a scheme for producing a visual representation of these equations; techniques for maintaining the coupling between the expressions entered and the diagrams displayed; algorithms for maintaining the consistency of the diagrams; and indexing capabilities that are efficient enough to allow diagrams to be created and manipulated in a short enough period of time. The author proposes solutions for how such a design can be realized.
Serious Games that Improve Performance
NASA Technical Reports Server (NTRS)
McGowan, Clement, III; Pecheux, Benjamin
2010-01-01
Serious games can help people function more effectively in complex settings, facilitate their role as team members, and provide insight into their team's mission. In such games, coordination and cooperation among team members are foundational to the mission's success and provide a preview of what individuals and the team as a whole could choose to do in a real scenario. Serious games often model events requiring life-or-death choices, such as civilian rescue during chemical warfare. How the players communicate and what actions they take can determine the number of lives lost or saved. However, merely playing a game is not enough to realize its most practical value, which is in learning what actions and communication methods are closest to what the mission requires. Teams often play serious games in isolation, so when the game is complete, an analytical stage is needed to extract the strategies used and examine each strategy's success relative to the others chosen. Recognizing the importance of this next stage, Noblis has been developing Game Analysis, software that parses individual game play into meaningful units and generates a strategic analysis. Trainers create a custom game-specific grammar that reflects the objects and range of actions allowable in a particular game, which Game Analysis then uses to parse the data and generate a practical analysis. Trainers then have enough information to represent strategies in tools such as Gantt and heat map charts. First-responder trainees in North Carolina have already partnered Hot-Zone and Game Analysis with great success.
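A toy Python sketch of the general idea of parsing raw game play into meaningful units with a game-specific grammar is given below; the actions and productions are hypothetical, not Game Analysis itself.

```python
# Greedily match grammar productions (strategy units) against an event
# stream of atomic actions. Unit names and patterns are hypothetical.
GRAMMAR = {
    'rescue_civilian': ['locate', 'approach', 'extract'],
    'decontaminate':   ['don_suit', 'spray', 'verify'],
}

def parse_log(events):
    units, i = [], 0
    while i < len(events):
        for unit, pattern in GRAMMAR.items():
            if events[i:i + len(pattern)] == pattern:
                units.append(unit)       # matched a meaningful unit
                i += len(pattern)
                break
        else:
            units.append(events[i])      # unmatched atomic action
            i += 1
    return units

log = ['locate', 'approach', 'extract', 'don_suit', 'spray', 'verify']
print(parse_log(log))  # ['rescue_civilian', 'decontaminate']
```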
HEFNER, KATHRYN R.; VERONA, EDELYN; CURTIN, JOHN. J.
2017-01-01
Improved understanding of fear inhibition processes can inform the etiology and treatment of anxiety disorders. Safety signals can reduce fear to threat, but precise mechanisms remain unclear. Safety signals may acquire attentional salience and affective properties (e.g., relief) independent of the threat; alternatively, safety signals may only hold affective value in the presence of simultaneous threat. To clarify such mechanisms, an experimental paradigm assessed independent processing of threat and safety cues. Participants viewed a series of red and green words from two semantic categories. Shocks were administered following red words (cue+). No shocks followed green words (cue−). Words from one category were defined as safety signals (SS); no shocks were administered on cue+ trials. Words from the other (control) category did not provide information regarding shock administration. Threat (cue+ vs. cue−) and safety (SS+ vs. SS−) were fully crossed. Startle response and ERPs were recorded. Startle response was increased during cue+ versus cue−. Safety signals reduced startle response during cue+, but had no effect on startle response during cue−. ERP analyses (PD130 and P3) suggested that participants parsed threat and safety signal information in parallel. Motivated attention was not associated with safety signals in the absence of threat. Overall, these results confirm that fear can be reduced by safety signals. Furthermore, safety signals do not appear to hold inherent hedonic salience independent of their effect during threat. Instead, safety signals appear to enable participants to engage in effective top-down emotion regulatory processes. PMID:27088643
A linguistic rule-based approach to extract drug-drug interactions from pharmacological documents.
Segura-Bedmar, Isabel; Martínez, Paloma; de Pablo-Sánchez, César
2011-03-29
A drug-drug interaction (DDI) occurs when one drug influences the level or activity of another drug. The increasing volume of the scientific literature overwhelms health care professionals trying to be kept up-to-date with all published studies on DDI. This paper describes a hybrid linguistic approach to DDI extraction that combines shallow parsing and syntactic simplification with pattern matching. Appositions and coordinate structures are interpreted based on shallow syntactic parsing provided by the UMLS MetaMap tool (MMTx). Subsequently, complex and compound sentences are broken down into clauses from which simple sentences are generated by a set of simplification rules. A pharmacist defined a set of domain-specific lexical patterns to capture the most common expressions of DDI in texts. These lexical patterns are matched with the generated sentences in order to extract DDIs. We have performed different experiments to analyze the performance of the different processes. The lexical patterns achieve a reasonable precision (67.30%), but very low recall (14.07%). The inclusion of appositions and coordinate structures helps to improve the recall (25.70%), however, precision is lower (48.69%). The detection of clauses does not improve the performance. Information Extraction (IE) techniques can provide an interesting way of reducing the time spent by health care professionals on reviewing the literature. Nevertheless, no approach has been carried out to extract DDI from texts. To the best of our knowledge, this work proposes the first integral solution for the automatic extraction of DDI from biomedical texts.
Coleman, Kristy K L; Coleman, Brenda L; MacKinley, Julia D; Pasternak, Stephen H; Finger, Elizabeth C
2016-01-01
The Montreal Cognitive Assessment (MoCA) is a cognitive screening tool used by practitioners worldwide. The efficacy of the MoCA for screening frontotemporal dementia (FTD) and related disorders is unknown. The objectives were: (1) to determine whether the MoCA detects cognitive impairment (CI) in FTD subjects; (2) to determine whether Alzheimer disease (AD) and FTD subtypes and related disorders can be parsed using the MoCA; and (3) describe longitudinal MoCA performance by subtype. We extracted demographic and testing data from a database of patients referred to a cognitive neurology clinic who met criteria for probable AD or FTD (N=192). Logistic regression was used to determine whether dementia subtypes were associated with overall scores, subscores, or combinations of subscores on the MoCA. Initial MoCA results demonstrated CI in the majority of FTD subjects (87%). FTD subjects (N=94) performed better than AD subjects (N=98) on the MoCA (mean scores: 18.1 vs. 16.3; P=0.02). Subscores parsed many, but not all subtypes. FTD subjects had a larger decline on the MoCA within 13 to 36 months than AD subjects (P=0.02). The results indicate that the MoCA is a useful tool to identify and track progression of CI in FTD. Further, the data informs future research on scoring models for the MoCA to enhance cognitive screening and detection of FTD patients.
Progress in recognizing typeset mathematics
NASA Astrophysics Data System (ADS)
Fateman, Richard J.; Tokuyasu, Taku A.
1996-03-01
Printed mathematics has a number of features which distinguish it from conventional text. These include structure in two dimensions (fractions, exponents, limits), frequent font changes, symbols with variable shape (quotient bars), and substantially differing notational conventions from source to source. When compounded with more generic problems such as noise and merged or broken characters, printed mathematics offers a challenging arena for recognition. Our project was initially driven by the goal of scanning and parsing some 5,000 pages of elaborate mathematics (tables of definite integrals). While our prototype system demonstrates success at translating noise-free typeset equations into Lisp expressions appropriate for further processing, a more semantic top-down approach appears necessary for higher levels of performance. Such an approach may also facilitate the incorporation of these programs into a more general document-processing framework. We intend to release to the public our somewhat refined prototypes as utility programs in the hope that they will be of general use in the construction of custom OCR packages. These utilities are quite fast even as originally prototyped in Lisp, where they may be of particular interest to those working on 'intelligent' optical processing. Some routines have been re-written in C++ as well. Additional programs providing formula recognition and parsing also form a part of this system. It is important, however, to realize that distinct conflicting grammars are needed to cover variations in contemporary and historical typesetting, and thus a single simple solution is not possible.
Online Object Tracking, Learning and Parsing with And-Or Graphs.
Wu, Tianfu; Lu, Yang; Zhu, Song-Chun
2017-12-01
This paper presents a method, called AOGTracker, for simultaneously tracking, learning and parsing (TLP) of unknown objects in video sequences with a hierarchical and compositional And-Or graph (AOG) representation. The TLP method is formulated in the Bayesian framework with spatial and temporal dynamic programming (DP) algorithms inferring object bounding boxes on-the-fly. During online learning, the AOG is discriminatively learned using latent SVM [1] to account for appearance (e.g., lighting and partial occlusion) and structural (e.g., different poses and viewpoints) variations of a tracked object, as well as distractors (e.g., similar objects) in the background. Three key issues in online inference and learning are addressed: (i) maintaining purity of positive and negative examples collected online, (ii) controlling model complexity in latent structure learning, and (iii) identifying critical moments to re-learn the structure of the AOG based on its intrackability. The intrackability measures the uncertainty of an AOG based on its score maps in a frame. In experiments, our AOGTracker is tested on two popular tracking benchmarks with the same parameter setting: the TB-100/50/CVPR2013 benchmarks [3], and the VOT benchmarks [4] - VOT2013, 2014, 2015 and TIR2015 (thermal imagery tracking). In the former, our AOGTracker outperforms state-of-the-art tracking algorithms including two trackers based on deep convolutional networks [5], [6]. In the latter, our AOGTracker outperforms all other trackers in VOT2013 and is comparable to the state-of-the-art methods in VOT2014, 2015 and TIR2015.
Parsing affective dynamics to identify risk for mood and anxiety disorders.
Heller, Aaron S; Fox, Andrew S; Davidson, Richard J
2018-06-04
Emotional dysregulation is thought to underlie risk for both anxiety and depressive disorders. However, despite high rates of comorbidity, anxiety and depression are phenotypically different. Apart from nosological differences (e.g., worry for anxiety, low mood for depression), it remains unclear how the emotional dysregulation inherent in individual differences in trait anxiety and depression severity presents on a day-to-day basis. One approach that may facilitate addressing these questions is to utilize Ecological Momentary Assessment (EMA) using mobile phones to parse the temporal dynamics of affective experiences into specific parameters. An emerging literature in affective science suggests that risk for anxiety and depressive disorders may be associated with variation in the mean and instability/variability of emotion. Here we examine the extent to which distinct temporal dynamic parameters uniquely predict risk for anxiety versus depression. Over 10 days, 105 individuals rated their current positive and negative affective state several times each day. Using two distinct approaches to statistically assess mean and instability of positive and negative affect, we found that individual differences in trait anxiety were generally associated with increased instability of positive and negative affect, whereas mean levels of positive and negative affect were generally associated with individual differences in depression. These data provide evidence that the emotional dysregulation underlying risk for mood versus anxiety disorders unfolds in distinct ways and highlight the utility of examining affective dynamics to understand psychopathology. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Probabilistic grammatical model for helix‐helix contact site classification
2013-01-01
Background Hidden Markov Models power many state‐of‐the‐art tools in the field of protein bioinformatics. While excelling in their tasks, these methods of protein analysis do not convey directly information on medium‐ and long‐range residue‐residue interactions. This requires an expressive power of at least context‐free grammars. However, application of more powerful grammar formalisms to protein analysis has been surprisingly limited. Results In this work, we present a probabilistic grammatical framework for problem‐specific protein languages and apply it to classification of transmembrane helix‐helix pairs configurations. The core of the model consists of a probabilistic context‐free grammar, automatically inferred by a genetic algorithm from only a generic set of expert‐based rules and positive training samples. The model was applied to produce sequence based descriptors of four classes of transmembrane helix‐helix contact site configurations. The highest performance of the classifiers reached AUCROC of 0.70. The analysis of grammar parse trees revealed the ability of representing structural features of helix‐helix contact sites. Conclusions We demonstrated that our probabilistic context‐free framework for analysis of protein sequences outperforms the state of the art in the task of helix‐helix contact site classification. However, this is achieved without necessarily requiring modeling long range dependencies between interacting residues. A significant feature of our approach is that grammar rules and parse trees are human‐readable. Thus they could provide biologically meaningful information for molecular biologists. PMID:24350601
Auditory scene analysis in school-aged children with developmental language disorders
Sussman, E.; Steinschneider, M.; Lee, W.; Lawson, K.
2014-01-01
Natural sound environments are dynamic, with overlapping acoustic input originating from simultaneously active sources. A key function of the auditory system is to integrate sensory inputs that belong together and segregate those that come from different sources. We hypothesized that this skill is impaired in individuals with phonological processing difficulties. There is considerable disagreement about whether phonological impairments observed in children with developmental language disorders can be attributed to specific linguistic deficits or to more general acoustic processing deficits. However, most tests of general auditory abilities have been conducted with a single set of sounds. We assessed the ability of school-aged children (7–15 years) to parse complex auditory non-speech input, and determined whether the presence of phonological processing impairments was associated with stream perception performance. A key finding was that children with language impairments did not show the same developmental trajectory for stream perception as typically developing children. In addition, children with language impairments required larger frequency separations between sounds to hear distinct streams compared to age-matched peers. Furthermore, phonological processing ability was a significant predictor of stream perception measures, but only in the older age groups. No such association was found in the youngest children. These results indicate that children with language impairments have difficulty parsing speech streams, or identifying individual sound events when there are competing sound sources. We conclude that language group differences may in part reflect fundamental maturational disparities in the analysis of complex auditory scenes. PMID:24548430
Perceptual learning improves visual performance in juvenile amblyopia.
Li, Roger W; Young, Karen G; Hoenig, Pia; Levi, Dennis M
2005-09-01
To determine whether practicing a position-discrimination task improves visual performance in children with amblyopia and to determine the mechanism(s) of improvement. Five children (age range, 7-10 years) with amblyopia practiced a positional acuity task in which they had to judge which of three pairs of lines was misaligned. Positional noise was produced by distributing the individual patches of each line segment according to a Gaussian probability function. Observers were trained at three noise levels (including 0), with each observer performing between 3000 and 4000 responses in 7 to 10 sessions. Trial-by-trial feedback was provided. Four of the five observers showed significant improvement in positional acuity. In those four observers, on average, positional acuity with no noise improved by approximately 32% and with high noise by approximately 26%. A position-averaging model was used to parse the improvement into an increase in efficiency or a decrease in equivalent input noise. Two observers showed increased efficiency (51% and 117% improvements) with no significant change in equivalent input noise across sessions. The other two observers showed both a decrease in equivalent input noise (18% and 29%) and an increase in efficiency (17% and 71%). All five observers showed substantial improvement in Snellen acuity (approximately 26%) after practice. Perceptual learning can improve visual performance in amblyopic children. The improvement can be parsed into two important factors: decreased equivalent input noise and increased efficiency. Perceptual learning techniques may add an effective new method to the armamentarium of amblyopia treatments.
CLOUDCLOUD : general-purpose instrument monitoring and data managing software
NASA Astrophysics Data System (ADS)
Dias, António; Amorim, António; Tomé, António
2016-04-01
An effective experiment is dependent on the ability to store and deliver data and information to all participant parties regardless of their degree of involvement in the specific parts that make the experiment a whole. Having fast, efficient and ubiquitous access to data will increase visibility and discussion, such that the outcome will have already been reviewed several times, strengthening the conclusions. The CLOUD project aims at providing users with a general purpose data acquisition, management and instrument monitoring platform that is fast, easy to use, lightweight and accessible to all participants of an experiment. This work is now implemented in the CLOUD experiment at CERN and will be fully integrated with the experiment as of 2016. Despite being used in an experiment of the scale of CLOUD, this software can also be used in any size of experiment or monitoring station, from single computers to large networks of computers, to monitor any sort of instrument output without influencing the individual instrument's DAQ. Instrument data and metadata are stored and accessed via a specially designed database architecture and any type of instrument output is accepted using our continuously growing parsing application. Multiple databases can be used to separate different data taking periods, or a single database can be used if, for instance, an experiment is continuous. A simple web-based application gives the user total control over the monitored instruments and their data, allowing data visualization and download, upload of processed data and the ability to edit existing instruments or add new instruments to the experiment. When in a network, new computers are immediately recognized and added to the system and are able to monitor instruments connected to them. Automatic computer integration is achieved by a locally running python-based parsing agent that communicates with a main server application, guaranteeing that all instruments assigned to that computer are monitored with parsing intervals as fast as milliseconds. This software (server+agents+interface+database) comes in easy and ready-to-use packages that can be installed in any operating system, including Android and iOS systems. This software is ideal for use in modular experiments or monitoring stations with large variability in instruments and measuring methods or in large collaborations, where data requires homogenization in order to be effectively transmitted to all involved parties. This work presents the software and provides performance comparison with previously used monitoring systems in the CLOUD experiment at CERN.
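As a rough illustration of the agent architecture described, the following Python sketch tails an instrument's output file and forwards parsed readings to a central server; the record format, endpoint URL, and polling interval are assumptions, not the CLOUD implementation.

```python
# Minimal parsing-agent loop: read new instrument records as they appear
# and forward them to the central server. Format and endpoint are assumed.
import time
import json
import urllib.request

INSTRUMENT_FILE = 'instrument.log'
SERVER = 'http://cloud-server.local/api/readings'

def parse_line(line):
    # Assume "timestamp,value" records; real parsers differ per instrument.
    ts, value = line.strip().split(',')
    return {'timestamp': ts, 'value': float(value)}

with open(INSTRUMENT_FILE) as f:
    while True:
        line = f.readline()
        if line:
            data = json.dumps(parse_line(line)).encode()
            req = urllib.request.Request(
                SERVER, data=data,
                headers={'Content-Type': 'application/json'})
            urllib.request.urlopen(req)
        else:
            time.sleep(0.001)  # poll at millisecond granularity
```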
1983-09-01
fluctuating velocity components. The present experimental data will also be compared with the numerical results and the experimental...structed for the present purpose and a small amount of milk was introduced into the water flow. Finely dispersed milk droplets serve as the scattering particles...correlation coefficient for the radial...In the photomultiplier, i.e., f' ≈ 0.01. The mean mass transport is 0.3 < Rvf < 0.4 though the momentum concentration
(abstract) Modeling Protein Families and Human Genes: Hidden Markov Models and a Little Beyond
NASA Technical Reports Server (NTRS)
Baldi, Pierre
1994-01-01
We will first give a brief overview of Hidden Markov Models (HMMs) and their use in Computational Molecular Biology. In particular, we will describe a detailed application of HMMs to the G-Protein-Coupled-Receptor Superfamily. We will also describe a number of analytical results on HMMs that can be used in discrimination tests and database mining. We will then discuss the limitations of HMMs and some new directions of research. We will conclude with some recent results on the application of HMMs to human gene modeling and parsing.
Storyline Visualizations of Eye Tracking of Movie Viewing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Balint, John T.; Arendt, Dustin L.; Blaha, Leslie M.
Storyline visualizations offer an approach that promises to capture the spatio-temporal characteristics of individual observers and simultaneously illustrate emerging group behaviors. We develop a visual analytics approach to parsing, aligning, and clustering fixation sequences from eye tracking data. Visualization of the results captures the similarities and differences across a group of observers performing a common task. We apply our storyline approach to visualize gaze patterns of people watching dynamic movie clips. Storylines mitigate some of the shortcomings of existent spatio-temporal visualization techniques and, importantly, continue to highlight individual observer behavioral dynamics.
Unsupervised chunking based on graph propagation from bilingual corpus.
Zhu, Ling; Wong, Derek F; Chao, Lidia S
2014-01-01
This paper presents a novel approach for an unsupervised shallow parsing model trained on the unannotated Chinese text of a parallel Chinese-English corpus. In this approach, no annotated information on the Chinese side is used. The exploitation of graph-based label propagation for bilingual knowledge transfer, along with the use of the projected labels as features in the unsupervised model, contributes to better performance. Experimental comparisons with state-of-the-art algorithms show that the proposed approach achieves impressively higher accuracy in terms of F-score.
NASA Astrophysics Data System (ADS)
Arns, James A.
2016-08-01
The Subaru Prime Focus Spectrograph[1] (PFS) requires a suite of volume phase holographic (VPH) gratings that parse the observational spectrum into three sub-spectral regions. In addition, the red region has a second, higher resolution arm that includes a VPH grating that will eventually be incorporated into a grism. This paper describes the specifications of the four grating types, gives the theoretical performances of diffraction efficiency for the production designs and presents the measured performances on the gratings produced to date.
Automated extraction of radiation dose information from CT dose report images.
Li, Xinhua; Zhang, Da; Liu, Bob
2011-06-01
The purpose of this article is to describe the development of an automated tool for retrieving texts from CT dose report images. Optical character recognition was adopted to perform text recognitions of CT dose report images. The developed tool is able to automate the process of analyzing multiple CT examinations, including text recognition, parsing, error correction, and exporting data to spreadsheets. The results were precise for total dose-length product (DLP) and were about 95% accurate for CT dose index and DLP of scanned series.
1987-05-14
such as and, but, or the conjunction comma, as in apples, oranges and bananas. Once such a word was recognized, normal parsing was suspended; a portion...Antecedents interpreted with respect to the "reaching the Stadium" event, as happening sometime after that. A new node would be created in e/s structure ordered sometime after the "reaching the Stadium" event...I picked up a banana. Up close, I noticed the banana was too green to... On the other
Connections and lingering presence as cocreated art.
Dempsey, Leona F
2008-10-01
Parse described nursing practice as a performing art where the nurse is like a dancer. Just as in any dance performance, unplanned events may occur. When a nurse is artistically living, unique and meaningful performances might emerge from unplanned events. In this practice column, the author describes how shifting experiences surfaced with unforeseen connections and lingering presence during her study of feeling confined. In her study she was in true presence with men living in prison, who were diagnosed with severe mental illness. The humanbecoming school of thought was the nursing perspective guiding the research study.
ESTEST: A Framework for the Verification and Validation of Electronic Structure Codes
NASA Astrophysics Data System (ADS)
Yuan, Gary; Gygi, Francois
2011-03-01
ESTEST is a verification and validation (V&V) framework for electronic structure codes that supports Qbox, Quantum Espresso, ABINIT, and the Exciting Code, with planned support for many more. We discuss various approaches to the electronic structure V&V problem implemented in ESTEST, related to parsing, formats, data management, search, comparison and analyses. Additionally, an early experiment in the distribution of V&V ESTEST servers among the electronic structure community will be presented. Supported by NSF-OCI 0749217 and DOE FC02-06ER25777.
Research on complex 3D tree modeling based on L-system
NASA Astrophysics Data System (ADS)
Gang, Chen; Bin, Chen; Yuming, Liu; Hui, Li
2018-03-01
The L-system, as a fractal iterative system, can simulate complex geometric patterns. Based on field observation data of trees and the knowledge of forestry experts, this paper extracted modeling constraint rules and obtained an L-system rule set. Using self-developed L-system modeling software, the L-system rule set was parsed to generate complex 3D tree models. The results showed that this geometric modeling method based on the L-system can describe the morphological structure of complex trees and generate 3D tree models.
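A minimal Python sketch of the L-system rewriting that such software parses is shown below; the axiom and production are a classic textbook example, not the paper's forestry-derived rule set.

```python
# L-system string rewriting: repeatedly replace symbols by productions.
RULES = {'F': 'FF+[+F-F-F]-[-F+F+F]'}  # classic bushy-tree production

def expand(axiom, iterations):
    s = axiom
    for _ in range(iterations):
        s = ''.join(RULES.get(ch, ch) for ch in s)
    return s

# Each symbol is later interpreted geometrically (F = grow a segment,
# +/- = rotate, [ ] = push/pop branch state) to build the 3D tree model.
print(expand('F', 2)[:40])
```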
Reflections on the prairie as a creative teaching-learning place.
Bunkers, Sandra Schmidt
2006-01-01
In this column, the author reflects on characteristics of the prairie land of South Dakota and how it contributes to a creative teaching-learning place. Attributes of the prairie that are linked with creative teaching-learning include prairie as a space of aloneness and solitude, prairie as a boundless seeing what may be, prairie as contradiction and paradox, and prairie as possibility. These attributes of the prairie are explored through the author's personal experience, theoretical literature on creativity and teaching-learning, and literature from Parse's theory of human becoming.
NASA Astrophysics Data System (ADS)
Fan, Hong; Zhu, Anfeng; Zhang, Weixia
2015-12-01
To support rapid positioning of 12315 consumer complaints, and aiming at the natural-language expression of telephone complaints, a semantic retrieval framework is proposed based on natural language parsing and geographical-name ontology reasoning. Within this framework, a search-result ranking and recommendation algorithm is proposed that considers both geo-name conceptual similarity and spatial geometric-relation similarity. Experiments show that this method can help operators quickly locate 12315 complaints, increasing customer satisfaction with the industry and commerce administration.
Multidimensional incremental parsing for universal source coding.
Bae, Soo Hyun; Juang, Biing-Hwang
2008-10-01
A multidimensional incremental parsing algorithm (MDIP) for multidimensional discrete sources, as a generalization of the Lempel-Ziv coding algorithm, is investigated. It consists of three essential component schemes: maximum decimation matching, a hierarchical structure of multidimensional source coding, and dictionary augmentation. As a counterpart of the longest-match search in the Lempel-Ziv algorithm, two classes of maximum decimation matching are studied. Also, the underlying behavior of the dictionary augmentation scheme for estimating the source statistics is examined. For an m-dimensional source, m augmentative patches are appended into the dictionary at each coding epoch, thus requiring the transmission of a substantial amount of information to the decoder. The hierarchical structure of the source coding algorithm resolves this issue by successively incorporating lower dimensional coding procedures in the scheme. In regard to universal lossy source coders, we propose two distortion functions: the local average distortion and the local minimax distortion with a set of threshold levels for each source symbol. For performance evaluation, we implemented three image compression algorithms based upon the MDIP; one is lossless and the others are lossy. The lossless image compression algorithm does not perform better than Lempel-Ziv-Welch coding, but experimentally shows efficiency in capturing the source structure. The two lossy image compression algorithms are implemented using the two distortion functions, respectively. The algorithm based on the local average distortion is efficient at minimizing the signal distortion, but the images produced by the one with the local minimax distortion have good perceptual fidelity compared with other compression algorithms. Our insights inspire future research on feature extraction of multidimensional discrete sources.
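For reference, the one-dimensional incremental (LZ78-style) parse that the MDIP generalizes can be sketched in a few lines of Python; the dictionary-augmentation step below mirrors the patch appending described above.

```python
# One-dimensional LZ78-style incremental parsing (the baseline that the
# multidimensional algorithm generalizes).
def lz78_parse(sequence):
    dictionary = {'': 0}
    phrases, current = [], ''
    for symbol in sequence:
        if current + symbol in dictionary:
            current += symbol          # extend the longest match
        else:
            # Emit (index of longest match, innovation symbol), then
            # augment the dictionary with the new phrase.
            phrases.append((dictionary[current], symbol))
            dictionary[current + symbol] = len(dictionary)
            current = ''
    if current:
        phrases.append((dictionary[current], ''))
    return phrases

print(lz78_parse('abababb'))
# [(0, 'a'), (0, 'b'), (1, 'b'), (3, 'b')]
```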
CytometryML binary data standards
NASA Astrophysics Data System (ADS)
Leif, Robert C.
2005-03-01
CytometryML is a proposed new Analytical Cytology (Cytomics) data standard, which is based on a common set of XML schemas for encoding flow cytometry and digital microscopy text-based data types (metadata). CytometryML schemas reference both DICOM (Digital Imaging and Communications in Medicine) codes and FCS keywords. Flow Cytometry Standard (FCS) list-mode has been mapped to the DICOM Waveform Information Object. The separation of the large binary data objects (list-mode and image data) from the XML description of the metadata permits the metadata to be directly displayed, analyzed, and reported with standard commercial software packages; the direct use of XML languages; and direct interfacing with clinical information systems. The separation of the binary data into its own files simplifies parsing because all extraneous header data has been eliminated. The storage of images as two-dimensional arrays without any extraneous data, such as in the Adobe Photoshop RAW format, facilitates the development by scientists of their own analysis and visualization software. Adobe Photoshop provided the display infrastructure and the translation facility to interconvert between the image data from commercial formats and the RAW format. Similarly, the storage and parsing of a list-mode binary data type with a group of parameters that are specified at compilation time is straightforward. However, when the user is permitted at run-time to select a subset of the parameters and/or specify the results of mathematical manipulations, the development of special software was required. The use of CytometryML will permit investigators to create their own interoperable data analysis software and to employ commercially available software to disseminate their data.
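A short Python example of the point about raw two-dimensional arrays: with the metadata held separately in XML, reading the binary image requires no header parsing at all. The dimensions and dtype here are assumed to come from the CytometryML metadata.

```python
# Read a headerless raw image as a plain 2-D array; the shape and data
# type would be taken from the separately stored XML metadata.
import numpy as np

width, height = 512, 512  # assumed values from the XML description
image = np.fromfile('image.raw', dtype=np.uint16).reshape(height, width)
print(image.shape, image.dtype)
```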
Automated extraction of radiation dose information for CT examinations.
Cook, Tessa S; Zimmerman, Stefan; Maidment, Andrew D A; Kim, Woojin; Boonn, William W
2010-11-01
Exposure to radiation as a result of medical imaging is currently in the spotlight, receiving attention from Congress as well as the lay press. Although scanner manufacturers are moving toward including effective dose information in the Digital Imaging and Communications in Medicine headers of imaging studies, there is a vast repository of retrospective CT data at every imaging center that stores dose information in an image-based dose sheet. As such, it is difficult for imaging centers to participate in the ACR's Dose Index Registry. The authors have designed an automated extraction system to query their PACS archive and parse CT examinations to extract the dose information stored in each dose sheet. First, an open-source optical character recognition program processes each dose sheet and converts the information to American Standard Code for Information Interchange (ASCII) text. Each text file is parsed, and radiation dose information is extracted and stored in a database which can be queried using an existing pathology and radiology enterprise search tool. Using this automated extraction pipeline, it is possible to perform dose analysis on the >800,000 CT examinations in the PACS archive and generate dose reports for all of these patients. It is also possible to more effectively educate technologists, radiologists, and referring physicians about exposure to radiation from CT by generating report cards for interpreted and performed studies. The automated extraction pipeline enables compliance with the ACR's reporting guidelines and greater awareness of radiation dose to patients, thus resulting in improved patient care and management. Copyright © 2010 American College of Radiology. Published by Elsevier Inc. All rights reserved.
Perl Modules for Constructing Iterators
NASA Technical Reports Server (NTRS)
Tilmes, Curt
2009-01-01
The Iterator Perl Module provides a general-purpose framework for constructing iterator objects within Perl, and a standard API for interacting with those objects. Iterators are an object-oriented design pattern where a description of a series of values is used in a constructor. Subsequent queries can request values in that series. These Perl modules build on the standard Iterator framework and provide iterators for some other types of values. Iterator::DateTime constructs iterators from DateTime objects or Date::Parse descriptions and iCal/RFC 2445 style recurrence descriptions. It supports a variety of input parameters, including a start to the sequence, an end to the sequence, an iCal/RFC 2445 recurrence describing the frequency of the values in the series, and a format description that can refine the presentation of the DateTime. Iterator::String constructs iterators from string representations. This module is useful in contexts where the API consists of supplying a string and getting back an iterator where the specific iteration desired is opaque to the caller. It is of particular value to the Iterator::Hash module, which provides nested iterations. Iterator::Hash constructs iterators from Perl hashes that can include multiple iterators. The constructed iterators will return all the permutations of the iterations of the hash by nested iteration of embedded iterators. A hash simply includes a set of keys mapped to values. It is a very common data structure used throughout Perl programming. The Iterator::Hash module allows a hash to include strings defining iterators (parsed and dispatched with Iterator::String) that are used to construct an overall series of hash values.
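The modules above are Perl; as a language-neutral illustration of the same pattern, here is a rough Python analogue (the function name and API are invented for illustration) in which a constructor takes a description of a series and the caller simply consumes values.

```python
from datetime import date, timedelta

# Analogue of the Iterator::DateTime idea (illustration only; the actual
# Perl API differs): a start, an end, and a recurrence yield a lazy series.
def date_iterator(start, end, step_days=1):
    current = start
    while current <= end:
        yield current
        current += timedelta(days=step_days)

# The caller just consumes values; the construction details stay opaque.
for d in date_iterator(date(2009, 1, 1), date(2009, 1, 5)):
    print(d.isoformat())
```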
A system for endobronchial video analysis
NASA Astrophysics Data System (ADS)
Byrnes, Patrick D.; Higgins, William E.
2017-03-01
Image-guided bronchoscopy is a critical component in the treatment of lung cancer and other pulmonary disorders. During bronchoscopy, a high-resolution endobronchial video stream facilitates guidance through the lungs and allows for visual inspection of a patient's airway mucosal surfaces. Despite the detailed information it contains, little effort has been made to incorporate recorded video into the clinical workflow. Follow-up procedures often required in cancer assessment or asthma treatment could significantly benefit from effectively parsed and summarized video. Tracking diagnostic regions of interest (ROIs) could potentially better equip physicians to detect early airway-wall cancer or improve asthma treatments, such as bronchial thermoplasty. To address this need, we have developed a system for the postoperative analysis of recorded endobronchial video. The system first parses an input video stream into endoscopic shots, derives motion information, and selects salient representative key frames. Next, a semi-automatic method for CT-video registration creates data linkages between a CT-derived airway-tree model and the input video. These data linkages then enable the construction of a CT-video chest model comprising a bronchoscopy path history (BPH) - defining all airway locations visited during a procedure - and texture-mapping information for rendering registered video frames onto the airway-tree model. A suite of analysis tools is included to visualize and manipulate the extracted data. Video browsing and retrieval are facilitated through a video table of contents (TOC) and a search query interface. The system provides a variety of operational modes and additional functionality, including the ability to define regions of interest. We demonstrate the potential of our system using two human case study examples.
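A toy sketch of the first step, parsing the video stream into shots by thresholding inter-frame differences; the threshold and the random stand-in frames are illustrative, and the actual system also derives motion information and key frames.

```python
import numpy as np

# Toy shot-boundary detection via inter-frame differences, the kind of step
# used when parsing an endobronchial video stream into endoscopic shots.
rng = np.random.default_rng(0)
frames = [rng.random((64, 64)) for _ in range(10)]
frames[5:] = [f + 2.0 for f in frames[5:]]  # simulate an abrupt scene change

threshold = 0.5
boundaries = [i for i in range(1, len(frames))
              if np.abs(frames[i] - frames[i - 1]).mean() > threshold]
print(boundaries)  # [5] -> a new shot likely begins at frame 5
```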
IDEA: Interactive Display for Evolutionary Analyses.
Egan, Amy; Mahurkar, Anup; Crabtree, Jonathan; Badger, Jonathan H; Carlton, Jane M; Silva, Joana C
2008-12-08
The availability of complete genomic sequences for hundreds of organisms promises to make obtaining genome-wide estimates of substitution rates, selective constraints and other molecular evolution variables of interest an increasingly important approach to addressing broad evolutionary questions. Two of the programs most widely used for this purpose are codeml and baseml, parts of the PAML (Phylogenetic Analysis by Maximum Likelihood) suite. A significant drawback of these programs is their lack of a graphical user interface, which can limit their user base and considerably reduce their efficiency. We have developed IDEA (Interactive Display for Evolutionary Analyses), an intuitive graphical input and output interface which interacts with PHYLIP for phylogeny reconstruction and with codeml and baseml for molecular evolution analyses. IDEA's graphical input and visualization interfaces eliminate the need to edit and parse text input and output files, reducing the likelihood of errors and improving processing time. Further, its interactive output display gives the user immediate access to results. Finally, IDEA can process data in parallel on a local machine or computing grid, allowing genome-wide analyses to be completed quickly. IDEA provides a graphical user interface that allows the user to follow a codeml or baseml analysis from parameter input through to the exploration of results. Novel options streamline the analysis process, and post-analysis visualization of phylogenies, evolutionary rates and selective constraint along protein sequences simplifies the interpretation of results. The integration of these functions into a single tool eliminates the need for lengthy data handling and parsing, significantly expediting access to global patterns in the data.
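For orientation, this is roughly what IDEA automates behind its interface: writing a codeml control file and launching the run. Only a few common options are shown, the file names are illustrative, and the exact option set depends on the analysis.

```python
import subprocess
from pathlib import Path

# Sketch of a minimal codeml control file; codeml reads codeml.ctl from the
# working directory by default, and '*' begins a comment in this format.
ctl = """
seqfile = alignment.phy   * codon sequence alignment (illustrative path)
treefile = species.tre    * phylogenetic tree (illustrative path)
outfile = results.mlc     * main result file
seqtype = 1               * 1: codon sequences
model = 0                 * one omega ratio across branches
NSsites = 0               * no variation among sites
"""
Path("codeml.ctl").write_text(ctl)

# Launch the analysis; IDEA wraps steps like this behind its GUI.
subprocess.run(["codeml"], check=True)
```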
IDEA: Interactive Display for Evolutionary Analyses
Egan, Amy; Mahurkar, Anup; Crabtree, Jonathan; Badger, Jonathan H; Carlton, Jane M; Silva, Joana C
2008-01-01
Background The availability of complete genomic sequences for hundreds of organisms promises to make obtaining genome-wide estimates of substitution rates, selective constraints and other molecular evolution variables of interest an increasingly important approach to addressing broad evolutionary questions. Two of the programs most widely used for this purpose are codeml and baseml, parts of the PAML (Phylogenetic Analysis by Maximum Likelihood) suite. A significant drawback of these programs is their lack of a graphical user interface, which can limit their user base and considerably reduce their efficiency. Results We have developed IDEA (Interactive Display for Evolutionary Analyses), an intuitive graphical input and output interface which interacts with PHYLIP for phylogeny reconstruction and with codeml and baseml for molecular evolution analyses. IDEA's graphical input and visualization interfaces eliminate the need to edit and parse text input and output files, reducing the likelihood of errors and improving processing time. Further, its interactive output display gives the user immediate access to results. Finally, IDEA can process data in parallel on a local machine or computing grid, allowing genome-wide analyses to be completed quickly. Conclusion IDEA provides a graphical user interface that allows the user to follow a codeml or baseml analysis from parameter input through to the exploration of results. Novel options streamline the analysis process, and post-analysis visualization of phylogenies, evolutionary rates and selective constraint along protein sequences simplifies the interpretation of results. The integration of these functions into a single tool eliminates the need for lengthy data handling and parsing, significantly expediting access to global patterns in the data. PMID:19061522
Detecting modification of biomedical events using a deep parsing approach.
Mackinlay, Andrew; Martinez, David; Baldwin, Timothy
2012-04-30
This work describes a system for identifying event mentions in bio-molecular research abstracts that are either speculative (e.g. analysis of IkappaBalpha phosphorylation, where it is not specified whether phosphorylation did or did not occur) or negated (e.g. inhibition of IkappaBalpha phosphorylation, where phosphorylation did not occur). The data comes from a standard dataset created for the BioNLP 2009 Shared Task. The system uses a machine-learning approach, where the features used for classification are a combination of shallow features derived from the words of the sentences and more complex features based on the semantic outputs produced by a deep parser. To detect event modification, we use a Maximum Entropy learner with features extracted from the data relative to the trigger words of the events. The shallow features are bag-of-words features based on a small sliding context window of 3-4 tokens on either side of the trigger word. The deep parser features are derived from parses produced by the English Resource Grammar and the RASP parser. The outputs of these parsers are converted into the Minimal Recursion Semantics formalism, and from this, we extract features motivated by linguistics and the data itself. All of these features are combined to create training or test data for the machine learning algorithm. Over the test data, our methods produce approximately a 4% absolute increase in F-score for detection of event modification compared to a baseline based only on the shallow bag-of-words features. Our results indicate that grammar-based techniques can enhance the accuracy of methods for detecting event modification.
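A minimal sketch of the shallow feature extraction described above: bag-of-words features from a small sliding window around the event trigger. The window size and feature naming are illustrative, not the authors' exact configuration.

```python
# Bag-of-words features from a small context window around a trigger word.
def window_features(tokens, trigger_index, window=3):
    lo = max(0, trigger_index - window)
    hi = min(len(tokens), trigger_index + window + 1)
    feats = {}
    for i in range(lo, hi):
        if i != trigger_index:          # the trigger itself is excluded
            feats["bow=" + tokens[i].lower()] = 1
    return feats

tokens = "analysis of IkappaBalpha phosphorylation in vitro".split()
print(window_features(tokens, tokens.index("phosphorylation")))
```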
SEQ-REVIEW: A tool for reviewing and checking spacecraft sequences
NASA Astrophysics Data System (ADS)
Maldague, Pierre F.; El-Boushi, Mekki; Starbird, Thomas J.; Zawacki, Steven J.
1994-11-01
A key component of JPL's strategy to make space missions faster, better and cheaper is the Advanced Multi-Mission Operations System (AMMOS), a software-intensive ground system currently in use and in further development. AMMOS intends to eliminate the cost of re-engineering a ground system for each new JPL mission. This paper discusses SEQ-REVIEW, a component of AMMOS that was designed to facilitate and automate the task of reviewing and checking spacecraft sequences. SEQ-REVIEW is a smart browser for inspecting files created by other sequence generation tools in the AMMOS system. It can parse sequence-related files according to a computer-readable version of a 'Software Interface Specification' (SIS), which is a standard document for defining file formats. It lets users display one or several linked files and check simple constraints using a Basic-like 'Little Language'. SEQ-REVIEW represents the first application of the Quality Function Deployment (QFD) method to sequence software development at JPL. The paper will show how the requirements for SEQ-REVIEW were defined and converted into a design based on object-oriented principles. The process starts with interviews of potential users, a small but diverse group that spans multiple disciplines and 'cultures'. It continues with the development of QFD matrices that related product functions and characteristics to user-demanded qualities. These matrices are then turned into a formal Software Requirements Document (SRD). The process concludes with the design phase, in which the CRC (Class, Responsibility, Collaboration) approach was used to convert requirements into a blueprint for the final product.
SEQ-REVIEW: A tool for reviewing and checking spacecraft sequences
NASA Technical Reports Server (NTRS)
Maldague, Pierre F.; El-Boushi, Mekki; Starbird, Thomas J.; Zawacki, Steven J.
1994-01-01
A key component of JPL's strategy to make space missions faster, better and cheaper is the Advanced Multi-Mission Operations System (AMMOS), a software-intensive ground system currently in use and in further development. AMMOS intends to eliminate the cost of re-engineering a ground system for each new JPL mission. This paper discusses SEQ-REVIEW, a component of AMMOS that was designed to facilitate and automate the task of reviewing and checking spacecraft sequences. SEQ-REVIEW is a smart browser for inspecting files created by other sequence generation tools in the AMMOS system. It can parse sequence-related files according to a computer-readable version of a 'Software Interface Specification' (SIS), which is a standard document for defining file formats. It lets users display one or several linked files and check simple constraints using a Basic-like 'Little Language'. SEQ-REVIEW represents the first application of the Quality Function Deployment (QFD) method to sequence software development at JPL. The paper will show how the requirements for SEQ-REVIEW were defined and converted into a design based on object-oriented principles. The process starts with interviews of potential users, a small but diverse group that spans multiple disciplines and 'cultures'. It continues with the development of QFD matrices that related product functions and characteristics to user-demanded qualities. These matrices are then turned into a formal Software Requirements Document (SRD). The process concludes with the design phase, in which the CRC (Class, Responsibility, Collaboration) approach was used to convert requirements into a blueprint for the final product.
Technical development of PubMed interact: an improved interface for MEDLINE/PubMed searches.
Muin, Michael; Fontelo, Paul
2006-11-03
The project aims to create an alternative search interface for MEDLINE/PubMed that may provide assistance to the novice user and added convenience to the advanced user. An earlier version of the project was the 'Slider Interface for MEDLINE/PubMed searches' (SLIM) which provided JavaScript slider bars to control search parameters. In this new version, recent developments in Web-based technologies were implemented. These changes may prove to be even more valuable in enhancing user interactivity through client-side manipulation and management of results. PubMed Interact is a Web-based MEDLINE/PubMed search application built with HTML, JavaScript and PHP. It is implemented on a Windows Server 2003 with Apache 2.0.52, PHP 4.4.1 and MySQL 4.1.18. PHP scripts provide the backend engine that connects with E-Utilities and parses XML files. JavaScript manages client-side functionalities and converts Web pages into interactive platforms using dynamic HTML (DHTML), Document Object Model (DOM) tree manipulation and Ajax methods. With PubMed Interact, users can limit searches with JavaScript slider bars, preview result counts, delete citations from the list, display and add related articles and create relevance lists. Many interactive features run on the client side, allowing instant feedback without reloading or refreshing the page and resulting in a more efficient user experience. PubMed Interact is a highly interactive Web-based search application for MEDLINE/PubMed that explores recent trends in Web technologies like DOM tree manipulation and Ajax. It may become a valuable technical development for online medical search applications.
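The backend described above is PHP; the same E-Utilities round trip can be sketched in Python (the ESearch endpoint and its Count/Id XML elements are part of NCBI's public API).

```python
from urllib.request import urlopen
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Query NCBI E-Utilities (ESearch) for PubMed and parse the XML response.
params = urlencode({"db": "pubmed", "term": "asthma", "retmax": 5})
url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?" + params

with urlopen(url) as response:
    tree = ET.parse(response)

count = tree.findtext("Count")             # total matching citations
pmids = [e.text for e in tree.iter("Id")]  # PMIDs for the first page
print(count, pmids)
```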
SURF: Taking Sustainable Remediation from Concept to Standard Operating Procedure (Invited)
NASA Astrophysics Data System (ADS)
Smith, L. M.; Wice, R. B.; Torrens, J.
2013-12-01
Over the last decade, many sectors of industrialized society have been rethinking behavior and re-engineering practices to reduce consumption of energy and natural resources. During this time, green and sustainable remediation (GSR) has evolved from conceptual discussions to standard operating procedure for many environmental remediation practitioners. Government agencies and private sector entities have incorporated GSR metrics into their performance criteria and contracting documents. One of the early think tanks for the development of GSR was the Sustainable Remediation Forum (SURF). SURF brings together representatives of government, industry, consultancy, and academia to parse the means and ends of incorporating societal and economic considerations into environmental cleanup projects. Faced with decades-old treatment programs with high energy outputs and no endpoints in sight, a small group of individuals distilled the institutional knowledge gathered in two years of ad hoc meetings into a 2009 White Paper on sustainable remediation drivers, practices, objectives, and case studies. Since then, SURF has expanded on those introductory topics, publishing its Framework for Integrating Sustainability into Remediation Projects, Guidance for Performing Footprint Analyses and Life-Cycle Assessments for the Remediation Industry, a compendium of metrics, and a call to improve the integration of land remediation and reuse. SURF's research and members have also been instrumental in the development of additional guidance through ASTM International and the Interstate Technology and Regulatory Council. SURF's current efforts focus on water reuse, the international perspective on GSR (continuing the conversations that were the basis of SURF's December 2012 meeting at the National Academy of Sciences in Washington, DC), and ways to capture and evaluate the societal benefits of site remediation. SURF also promotes and supports student chapters at universities across the US, encouraging the incorporation of sustainability concepts into environmental science and engineering in undergraduate curricula and graduate research, and student participation at professional conferences. This presentation will provide an overview of the evolution of GSR to date and a history of SURF's technical and outreach work. Examples will be provided--using both qualitative and quantitative metrics--that document and support the benefits of GSR.
Constructing storyboards based on hierarchical clustering analysis
NASA Astrophysics Data System (ADS)
Hasebe, Satoshi; Sami, Mustafa M.; Muramatsu, Shogo; Kikuchi, Hisakazu
2005-07-01
There is a growing need for quick previews of video content, both to improve the accessibility of video archives and to reduce network traffic. In this paper, a storyboard that contains a user-specified number of keyframes is produced from a given video sequence. It is based on hierarchical cluster analysis of feature vectors derived from wavelet coefficients of video frames. Consistent reuse of the extracted feature vectors is the key to avoiding repeated, computationally intensive parsing of the same video sequence. Experimental results suggest that a significant reduction in computational time is gained by this strategy.
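A sketch of keyframe selection by hierarchical clustering, with random vectors standing in for the wavelet-based feature vectors; the linkage method and the nearest-to-centroid rule are plausible choices, not necessarily the authors'.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(1)
features = rng.random((120, 16))   # one 16-d feature vector per video frame
n_keyframes = 5                    # user-specified storyboard size

tree = linkage(features, method="ward")
labels = fcluster(tree, t=n_keyframes, criterion="maxclust")

# Pick the frame nearest each cluster centroid as that cluster's keyframe.
keyframes = []
for c in range(1, n_keyframes + 1):
    idx = np.flatnonzero(labels == c)
    centroid = features[idx].mean(axis=0)
    keyframes.append(idx[np.argmin(np.linalg.norm(features[idx] - centroid, axis=1))])
print(sorted(int(k) for k in keyframes))
```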
Parsing brain activity with fMRI and mixed designs: what kind of a state is neuroimaging in?
Donaldson, David I
2004-08-01
Neuroimaging is often pilloried for providing little more than pretty pictures that simply show where activity occurs in the brain. Strong critics (notably Uttal) have even argued that neuroimaging is nothing more than a modern day version of phrenology: destined to fail, and fundamentally uninformative. Here, I make the opposite case, arguing that neuroimaging is in a vibrant and healthy state of development. As recent investigations of memory illustrate, when used well, neuroimaging goes beyond asking 'where' activity is occurring, to ask questions concerned more with 'what' functional role the activity reflects.
Grieving a loss: the lived experience for elders residing in an institution.
Pilkington, F Beryl
2005-07-01
Grieving a loss is a profound and universal human experience. This phenomenological-hermeneutic study was an inquiry into the lived experience of grieving a loss. The nursing perspective was Parse's human becoming theory. Participants were 10 elderly persons residing in a long-term care facility. The study finding specifies the structure of the lived experience of grieving a loss as aching solitude amid enduring cherished affiliations, as serene acquiescence arises with sorrowful curtailments. Findings are discussed in relation to the guiding theoretical perspective and related literature. Recommendations for additional research and insights for practice are presented.
The Living Experience of Feeling Playful.
Baumann, Steven L; Tanzi, Donna; Lewis, Tricia A
2017-07-01
The purpose of this study was to investigate the living experience of feeling playful. Parse's research method was used to answer the question: What is the structure of the living experience of feeling playful? The participants were 10 persons, ages 9 to 83, living in the United States. The central finding of the study is the living experience of feeling playful is entertaining amusements amid burdens with uplifting endeavors strengthening affiliations with blissful moments of unfettered unfolding. The living experience of feeling playful is discussed in relation to the principles of the humanbecoming paradigm and in relation to how it can inform further research.
NASA Astrophysics Data System (ADS)
Gonzales, Kalim
It is argued that infants build a foundation for learning about the world through their incidental acquisition of the spatial and temporal regularities surrounding them. A challenge is that learning occurs across multiple contexts whose statistics can greatly differ. Two artificial language studies with 12-month-olds demonstrate that infants come prepared to parse statistics across contexts using the temporal and perceptual features that distinguish one context from another. These results suggest that infants can organize their statistical input with a wider range of features than is typically considered. Possible attention, decision making, and memory mechanisms are discussed.
Diana Reference Manual. Revision 3,
1983-02-28
will help the reader to understand DIANA. Section 1.1.1 presents those principles that motivated the original design of DIANA, and Section 1.1.2 ... parenthesized node whose offspring was the addition, since Ada's parsing rules require the parentheses. The motivation for this requirement is to ease the ...
Human Time-Frequency Acuity Beats the Fourier Uncertainty Principle
NASA Astrophysics Data System (ADS)
Oppenheim, Jacob N.; Magnasco, Marcelo O.
2013-01-01
The time-frequency uncertainty principle states that the product of the temporal and frequency extents of a signal cannot be smaller than 1/(4π). We study human ability to simultaneously judge the frequency and the timing of a sound. Our subjects often exceeded the uncertainty limit, sometimes by more than tenfold, mostly through remarkable timing acuity. Our results establish a lower bound for the nonlinearity and complexity of the algorithms employed by our brains in parsing transient sounds, rule out simple “linear filter” models of early auditory processing, and highlight timing acuity as a central feature in auditory object processing.
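For reference, the bound in question (the Gabor limit) can be stated as follows, with sigma_t and sigma_f the standard deviations of the signal's energy distribution in time and frequency:

```latex
\[
  \sigma_t \, \sigma_f \;\ge\; \frac{1}{4\pi}
\]
% Equality holds only for Gaussian (Gabor) wave packets. Subjects beating
% this product implies that auditory processing is not a simple linear
% filtering of the acoustic signal, as the abstract argues.
```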
A standard format and a graphical user interface for spin system specification.
Biternas, A G; Charnock, G T P; Kuprov, Ilya
2014-03-01
We introduce a simple and general XML format for spin system description that is the result of extensive consultations within the Magnetic Resonance community and unifies under one roof all major existing spin interaction specification conventions. The format is human-readable, easy to edit, and easy to parse using standard XML libraries. We also describe a graphical user interface that was designed to facilitate construction and visualization of complicated spin systems. The interface is capable of generating input files for several popular spin dynamics simulation packages. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
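A sketch of how readily such a format parses with a standard XML library; the element and attribute names below are invented for illustration and will differ from the published schema.

```python
import xml.etree.ElementTree as ET

# Made-up XML in the spirit of the described format (not the actual schema).
doc = """
<spin_system>
  <spin id="1" isotope="1H"/>
  <spin id="2" isotope="13C"/>
  <interaction type="scalar" spin_a="1" spin_b="2" value_hz="140.0"/>
</spin_system>
"""

root = ET.fromstring(doc)
spins = {s.get("id"): s.get("isotope") for s in root.iter("spin")}
for j in root.iter("interaction"):
    print(spins[j.get("spin_a")], spins[j.get("spin_b")], j.get("value_hz"), "Hz")
```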
RGSS-ID: an approach to new radiologic reporting system.
Ikeda, M; Sakuma, S; Maruyama, K
1990-01-01
RGSS-ID is a developmental computer system that applies artificial intelligence (AI) methods to a reporting system. The representation scheme called Generalized Finding Representation (GFR) is proposed to bridge the gap between natural language expressions in the radiology report and AI methods. The entry process of RGSS-ID consists mainly of selecting items; our system allows a radiologist to compose a sentence which can be completely parsed by the computer. Further, RGSS-ID encodes findings into the expression corresponding to GFR and stores this expression in the knowledge base. The final printed report is produced in natural language.
Proscene: A feature-rich framework for interactive environments
NASA Astrophysics Data System (ADS)
Charalambos, Jean Pierre
We introduce Proscene, a feature-rich, open-source framework for interactive environments. The design of Proscene comprises a three-layered onion-like software architecture, promoting different possible development scenarios. The framework's innermost layer decouples user gesture parsing from user-defined actions. The in-between layer implements a feature-rich set of widely used motion actions allowing the selection and manipulation of objects, including the scene viewpoint. The outermost layer exposes those features as a Processing library. The results have shown the feasibility of our approach together with the simplicity and flexibility of the Proscene framework API.
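A toy sketch of the innermost-layer idea, decoupling parsed gestures from user-defined actions through a binding table (Proscene itself is a Java/Processing library; the names here are illustrative):

```python
# Parsed gestures are dispatched through a binding table, so gesture parsing
# never needs to know what the user-defined actions do.
bindings = {}

def bind(gesture, action):
    bindings[gesture] = action

def handle(gesture, *args):
    if gesture in bindings:
        bindings[gesture](*args)

bind("drag", lambda dx, dy: print(f"rotate scene by ({dx}, {dy})"))
bind("wheel", lambda delta: print(f"zoom viewpoint by {delta}"))

handle("drag", 4, -2)   # dispatched to the user-defined rotate action
handle("wheel", 1)      # dispatched to the zoom action
```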
The lived experience of feeling sad.
Bunkers, Sandra Schmidt
2010-07-01
The purpose of this study was to enhance understanding of the lived experience of feeling sad. Parse's phenomenological-hermeneutic research method was used to answer the research question: What is the structure of the lived experience of feeling sad? Participants were 7 elders who had lost a pet. Data were collected with dialogical engagement. The major finding of the study is the structure: Feeling sad is penetrating anguish surfacing with contemplating absent-yet-present intimacies, while prevailing amid misfortune. Feeling sad is discussed in relation to the principles of humanbecoming and in relation to how it can inform future nursing research and nursing practice.
Smelter, Andrey; Astra, Morgan; Moseley, Hunter N B
2017-03-17
The Biological Magnetic Resonance Data Bank (BMRB) is a public repository of Nuclear Magnetic Resonance (NMR) spectroscopic data of biological macromolecules. It is an important resource for many researchers using NMR to study structural, biophysical, and biochemical properties of biological macromolecules. It is primarily maintained and accessed in a flat file ASCII format known as NMR-STAR. While the format is human readable, the size of most BMRB entries makes computer readability and explicit representation a practical requirement for almost any rigorous systematic analysis. To aid in the use of this public resource, we have developed a package called nmrstarlib in the popular open-source programming language Python. The nmrstarlib's implementation is very efficient, both in design and execution. The library has facilities for reading and writing both NMR-STAR version 2.1 and 3.1 formatted files, parsing them into usable Python dictionary- and list-based data structures, making access and manipulation of the experimental data very natural within Python programs (i.e. "saveframe" and "loop" records represented as individual Python dictionary data structures). Another major advantage of this design is that data stored in original NMR-STAR can be easily converted into its equivalent JavaScript Object Notation (JSON) format, a lightweight data interchange format, facilitating data access and manipulation using Python and any other programming language that implements a JSON parser/generator (i.e., all popular programming languages). We have also developed tools to visualize assigned chemical shift values and to convert between NMR-STAR and JSONized NMR-STAR formatted files. Full API Reference Documentation, User Guide and Tutorial with code examples are also available. We have tested this new library on all current BMRB entries: 100% of all entries are parsed without any errors for both NMR-STAR version 2.1 and version 3.1 formatted files. We also compared our software to three currently available Python libraries for parsing NMR-STAR formatted files: PyStarLib, NMRPyStar, and PyNMRSTAR. The nmrstarlib package is a simple, fast, and efficient library for accessing data from the BMRB. The library provides an intuitive dictionary-based interface with which Python programs can read, edit, and write NMR-STAR formatted files and their equivalent JSONized NMR-STAR files. The nmrstarlib package can be used as a library for accessing and manipulating data stored in NMR-STAR files and as a command-line tool to convert from NMR-STAR file format into its equivalent JSON file format and vice versa, and to visualize chemical shift values. Furthermore, the nmrstarlib implementation provides a guide for effectively JSONizing other older scientific formats, improving the FAIRness of data in these formats.
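A generic illustration of the JSONizing idea, not the nmrstarlib API itself: once saveframes and loops are parsed into plain dictionaries and lists, conversion to JSON and back is direct.

```python
import json

# A hypothetical parsed entry: a saveframe as a dictionary, a loop as a
# list of row dictionaries (key and tag names are illustrative).
entry = {
    "save_assigned_chemical_shifts": {
        "Sf_category": "assigned_chemical_shifts",
        "loop_0": [
            {"Atom_ID": "CA", "Val": "58.21"},
            {"Atom_ID": "CB", "Val": "30.02"},
        ],
    }
}

json_text = json.dumps(entry, indent=2)   # NMR-STAR -> JSONized NMR-STAR
assert json.loads(json_text) == entry     # round-trips losslessly
print(json_text)
```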
Fast and Efficient XML Data Access for Next-Generation Mass Spectrometry.
Röst, Hannes L; Schmitt, Uwe; Aebersold, Ruedi; Malmström, Lars
2015-01-01
In mass spectrometry-based proteomics, XML formats such as mzML and mzXML provide an open and standardized way to store and exchange the raw data (spectra and chromatograms) of mass spectrometric experiments. These file formats are being used by a multitude of open-source and cross-platform tools which allow the proteomics community to access algorithms in a vendor-independent fashion and perform transparent and reproducible data analysis. Recent improvements in mass spectrometry instrumentation have increased the data size produced in a single LC-MS/MS measurement and put substantial strain on open-source tools, particularly those that are not equipped to deal with XML data files that reach dozens of gigabytes in size. Here we present a fast and versatile parsing library for mass spectrometric XML formats available in C++ and Python, based on the mature OpenMS software framework. Our library implements an API for obtaining spectra and chromatograms under memory constraints using random access or sequential access functions, allowing users to process datasets that are much larger than system memory. For fast access to the raw data structures, small XML files can also be completely loaded into memory. In addition, we have improved the parsing speed of the core mzML module by over 4-fold (compared to OpenMS 1.11), making our library suitable for a wide variety of algorithms that need fast access to dozens of gigabytes of raw mass spectrometric data. Our C++ and Python implementations are available for the Linux, Mac, and Windows operating systems. All proposed modifications to the OpenMS code have been merged into the OpenMS mainline codebase and are available to the community at https://github.com/OpenMS/OpenMS.
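A short sketch of the two access modes using the pyopenms bindings built on OpenMS; the method names follow the pyopenms API but should be checked against current documentation, and the file name is illustrative.

```python
import pyopenms

# In-memory access: the whole mzML file is loaded; fine for small files.
exp = pyopenms.MSExperiment()
pyopenms.MzMLFile().load("sample.mzML", exp)
print(exp.getNrSpectra())

# Indexed on-disk access: spectra are fetched lazily, so files much larger
# than system memory can be processed one spectrum at a time.
od_exp = pyopenms.OnDiscMSExperiment()
od_exp.openFile("sample.mzML")
for i in range(od_exp.getNrSpectra()):
    spectrum = od_exp.getSpectrum(i)   # loaded on demand, then discarded
```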
Fast and Efficient XML Data Access for Next-Generation Mass Spectrometry
Röst, Hannes L.; Schmitt, Uwe; Aebersold, Ruedi; Malmström, Lars
2015-01-01
Motivation In mass spectrometry-based proteomics, XML formats such as mzML and mzXML provide an open and standardized way to store and exchange the raw data (spectra and chromatograms) of mass spectrometric experiments. These file formats are being used by a multitude of open-source and cross-platform tools which allow the proteomics community to access algorithms in a vendor-independent fashion and perform transparent and reproducible data analysis. Recent improvements in mass spectrometry instrumentation have increased the data size produced in a single LC-MS/MS measurement and put substantial strain on open-source tools, particularly those that are not equipped to deal with XML data files that reach dozens of gigabytes in size. Results Here we present a fast and versatile parsing library for mass spectrometric XML formats available in C++ and Python, based on the mature OpenMS software framework. Our library implements an API for obtaining spectra and chromatograms under memory constraints using random access or sequential access functions, allowing users to process datasets that are much larger than system memory. For fast access to the raw data structures, small XML files can also be completely loaded into memory. In addition, we have improved the parsing speed of the core mzML module by over 4-fold (compared to OpenMS 1.11), making our library suitable for a wide variety of algorithms that need fast access to dozens of gigabytes of raw mass spectrometric data. Availability Our C++ and Python implementations are available for the Linux, Mac, and Windows operating systems. All proposed modifications to the OpenMS code have been merged into the OpenMS mainline codebase and are available to the community at https://github.com/OpenMS/OpenMS. PMID:25927999
Teng, Xiangbin; Tian, Xing; Doelling, Keith; Poeppel, David
2017-10-17
Parsing continuous acoustic streams into perceptual units is fundamental to auditory perception. Previous studies have uncovered a cortical entrainment mechanism in the delta and theta bands (~1-8 Hz) that correlates with formation of perceptual units in speech, music, and other quasi-rhythmic stimuli. Whether cortical oscillations in the delta-theta bands are passively entrained by regular acoustic patterns or play an active role in parsing the acoustic stream is debated. Here, we investigate cortical oscillations using novel stimuli with 1/f modulation spectra. These 1/f signals have no rhythmic structure but contain information over many timescales because of their broadband modulation characteristics. We chose 1/f modulation spectra with varying exponents of f, which simulate the dynamics of environmental noise, speech, vocalizations, and music. While undergoing magnetoencephalography (MEG) recording, participants listened to 1/f stimuli and detected embedded target tones. Tone detection performance varied across stimuli of different exponents and can be explained by local signal-to-noise ratio computed using a temporal window around 200 ms. Furthermore, theta band oscillations, surprisingly, were observed for all stimuli, but robust phase coherence was preferentially displayed by stimuli with exponents 1 and 1.5. We constructed an auditory processing model to quantify acoustic information on various timescales and correlated the model outputs with the neural results. We show that cortical oscillations reflect a chunking of the acoustic stream into segments longer than 200 ms. These results suggest an active auditory segmentation mechanism, complementary to entrainment, operating on a timescale of ~200 ms to organize acoustic information. © 2017 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
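A sketch of generating a 1/f^a modulation envelope like the stimuli described: shape complex white noise in the frequency domain, then invert the FFT. The duration, sampling rate, and exponent values are illustrative.

```python
import numpy as np

def one_over_f(n_samples, exponent, rate=1000.0, seed=0):
    """Real-valued noise whose power spectrum falls off as 1/f**exponent."""
    rng = np.random.default_rng(seed)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / rate)
    spectrum = rng.normal(size=freqs.size) + 1j * rng.normal(size=freqs.size)
    spectrum[1:] /= freqs[1:] ** (exponent / 2.0)  # amplitude ~ f^(-a/2)
    spectrum[0] = 0.0                              # remove the DC component
    return np.fft.irfft(spectrum, n=n_samples)

envelope = one_over_f(10_000, exponent=1.5)  # one of the tested exponents
print(envelope.std())
```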
Detecting modification of biomedical events using a deep parsing approach
2012-01-01
Background This work describes a system for identifying event mentions in bio-molecular research abstracts that are either speculative (e.g. analysis of IkappaBalpha phosphorylation, where it is not specified whether phosphorylation did or did not occur) or negated (e.g. inhibition of IkappaBalpha phosphorylation, where phosphorylation did not occur). The data comes from a standard dataset created for the BioNLP 2009 Shared Task. The system uses a machine-learning approach, where the features used for classification are a combination of shallow features derived from the words of the sentences and more complex features based on the semantic outputs produced by a deep parser. Method To detect event modification, we use a Maximum Entropy learner with features extracted from the data relative to the trigger words of the events. The shallow features are bag-of-words features based on a small sliding context window of 3-4 tokens on either side of the trigger word. The deep parser features are derived from parses produced by the English Resource Grammar and the RASP parser. The outputs of these parsers are converted into the Minimal Recursion Semantics formalism, and from this, we extract features motivated by linguistics and the data itself. All of these features are combined to create training or test data for the machine learning algorithm. Results Over the test data, our methods produce approximately a 4% absolute increase in F-score for detection of event modification compared to a baseline based only on the shallow bag-of-words features. Conclusions Our results indicate that grammar-based techniques can enhance the accuracy of methods for detecting event modification. PMID:22595089
NASA Astrophysics Data System (ADS)
Melchert, O.; Hartmann, A. K.
2015-02-01
In this work we consider information-theoretic observables to analyze short symbolic sequences, comprising time series that represent the orientation of a single spin in a two-dimensional (2D) Ising ferromagnet on a square lattice of size L^2 = 128^2 for different system temperatures T. The latter were chosen from an interval enclosing the critical point T_c of the model. At small temperatures the sequences are thus very regular; at high temperatures they are maximally random. In the vicinity of the critical point, nontrivial, long-range correlations appear. Here we implement estimators for the entropy rate, excess entropy (i.e., "complexity"), and multi-information. First, we implement a Lempel-Ziv string-parsing scheme, providing seemingly elaborate entropy rate and multi-information estimates and an approximate estimator for the excess entropy. Furthermore, we apply easy-to-use black-box data-compression utilities, providing approximate estimators only. For comparison and to yield results for benchmarking purposes, we implement the information-theoretic observables also based on the well-established M-block Shannon entropy, which is more tedious to apply compared to the first two "algorithmic" entropy estimation procedures. To test how well one can exploit the potential of such data-compression techniques, we aim at detecting the critical point of the 2D Ising ferromagnet. Among the above observables, the multi-information, which is known to exhibit an isolated peak at the critical point, is very easy to replicate by means of both efficient algorithmic entropy estimation procedures. Finally, we assess how good the various algorithmic entropy estimates compare to the more conventional block entropy estimates and illustrate a simple modification that yields enhanced results.
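A minimal, simplified Lempel-Ziv (LZ76-style) parsing sketch: the phrase count, normalized by n/log2(n), gives a rough entropy-rate estimate that separates regular (low-temperature-like) from random (high-temperature-like) spin sequences. The exact parsing variant and normalization used in the paper may differ.

```python
import math
import random

def lz76_phrases(s):
    """Count phrases in a simplified LZ76-style parsing of string s."""
    phrases, i = 0, 0
    while i < len(s):
        length = 1
        # grow the candidate phrase while it already occurs in the prefix
        while i + length <= len(s) and s[i:i + length] in s[:i]:
            length += 1
        phrases += 1
        i += length
    return phrases

def entropy_rate(s):
    return lz76_phrases(s) * math.log2(len(s)) / len(s)

random.seed(0)
regular = "01" * 512                                        # low-T-like
noisy = "".join(random.choice("01") for _ in range(1024))   # high-T-like
print(entropy_rate(regular), entropy_rate(noisy))  # small vs near 1 bit/symbol
```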
Identifying elemental genomic track types and representing them uniformly
2011-01-01
Background With the recent advances and availability of various high-throughput sequencing technologies, data on many molecular aspects, such as gene regulation, chromatin dynamics, and the three-dimensional organization of DNA, are rapidly being generated in an increasing number of laboratories. The variation in biological context, and the increasingly dispersed mode of data generation, imply a need for precise, interoperable and flexible representations of genomic features through formats that are easy to parse. A host of alternative formats are currently available and in use, complicating analysis and tool development. The issue of whether and how the multitude of formats reflects varying underlying characteristics of data has to our knowledge not previously been systematically treated. Results We here identify intrinsic distinctions between genomic features, and argue that the distinctions imply that a certain variation in the representation of features as genomic tracks is warranted. Four core informational properties of tracks are discussed: gaps, lengths, values and interconnections. From this we delineate fifteen generic track types. Based on the track type distinctions, we characterize major existing representational formats and find that the track types are not adequately supported by any single format. We also find, in contrast to the XML formats, that none of the existing tabular formats are conveniently extendable to support all track types. We thus propose two unified formats for track data, an improved XML format, BioXSD 1.1, and a new tabular format, GTrack 1.0. Conclusions The defined track types are shown to capture relevant distinctions between genomic annotation tracks, resulting in varying representational needs and analysis possibilities. The proposed formats, GTrack 1.0 and BioXSD 1.1, cater to the identified track distinctions and emphasize preciseness, flexibility and parsing convenience. PMID:22208806
Irurtzun, Aritz
2015-01-01
In recent research, Boeckx and Benítez-Burraco (2014a,b) have advanced the hypothesis that our species-specific language-ready brain should be understood as the outcome of developmental changes that occurred in our species after the split from Neanderthals-Denisovans, which resulted in a more globular braincase configuration in comparison to our closest relatives, who had elongated endocasts. According to these authors, the development of a globular brain is an essential ingredient for the language faculty and in particular, it is the centrality occupied by the thalamus in a globular brain that allows its modulatory or regulatory role, essential for syntactico-semantic computations. Their hypothesis is that the syntactico-semantic capacities arise in humans as a consequence of a process of globularization, which significantly takes place postnatally (cf. Neubauer et al., 2010). In this paper, I show that Boeckx and Benítez-Burraco's hypothesis makes an interesting developmental prediction regarding the path of language acquisition: it teases apart the onset of phonological acquisition and the onset of syntactic acquisition (the latter starting significantly later, after globularization). I argue that this hypothesis provides a developmental rationale for the prosodic bootstrapping hypothesis of language acquisition (cf. i.a. Gleitman and Wanner, 1982; Mehler et al., 1988, et seq.; Gervain and Werker, 2013), which claims that prosodic cues are employed for syntactic parsing. The literature converges in the observation that a large number of such prosodic cues (in particular, rhythmic cues) are already acquired before the completion of the globularization phase, which paves the way for the premises of the prosodic bootstrapping hypothesis, allowing babies to have a rich knowledge of the prosody of their target language before they can start parsing the primary linguistic data syntactically.
Chen, W; Kowatch, R; Lin, S; Splaingard, M; Huang, Y
2015-01-01
Nationwide Children's Hospital established an i2b2 (Informatics for Integrating Biology & the Bedside) application for sleep disorder cohort identification. Discrete data were gleaned from semistructured sleep study reports. The system was shown to work more efficiently than the traditional manual chart review method, and it also enabled searching capabilities that were previously not possible. We report on the development and implementation of the sleep disorder i2b2 cohort identification system using natural language processing of semi-structured documents. We developed a natural language processing approach to automatically parse concepts and their values from semi-structured sleep study documents. Two parsers were developed: a regular expression parser for extracting numeric concepts and an NLP-based tree parser for extracting textual concepts. Concepts were further organized into i2b2 ontologies based on document structures and in-domain knowledge. 26,550 concepts were extracted, 99% of them textual concepts. 1.01 million facts were extracted from sleep study documents, including demographic information, sleep study lab results, medications, procedures, and diagnoses, among others. The average accuracy of terminology parsing was over 83% when compared against annotations by experts. The system is capable of capturing both standard and non-standard terminologies. The time for cohort identification has been reduced significantly, from a few weeks to a few seconds. Natural language processing was shown to be powerful for quickly converting large amounts of semi-structured or unstructured clinical data into discrete concepts, which, in combination with intuitive domain-specific ontologies, allows fast and effective interactive cohort identification through the i2b2 platform for research and clinical use.
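A sketch of the regular-expression parser for numeric concepts; the report snippet and concept names are hypothetical, as real report layouts vary by institution.

```python
import re

# Hypothetical snippet of a semi-structured sleep study report.
report = """
Total Sleep Time: 412.5 minutes
Sleep Efficiency: 86.2 %
Apnea-Hypopnea Index (AHI): 4.8 events/hour
"""

# Pull "Concept: numeric value" pairs out of each line for i2b2 fact storage.
numeric = re.compile(r"^(?P<concept>[^:\n]+):\s*(?P<value>[\d.]+)", re.MULTILINE)
facts = {m["concept"].strip(): float(m["value"]) for m in numeric.finditer(report)}
print(facts)
```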
Neural Correlates of Three Promising Endophenotypes of Depression: Evidence from the EMBARC Study
Webb, Christian A; Dillon, Daniel G; Pechtel, Pia; Goer, Franziska K; Murray, Laura; Huys, Quentin JM; Fava, Maurizio; McGrath, Patrick J; Weissman, Myrna; Parsey, Ramin; Kurian, Benji T; Adams, Phillip; Weyandt, Sarah; Trombello, Joseph M; Grannemann, Bruce; Cooper, Crystal M; Deldin, Patricia; Tenke, Craig; Trivedi, Madhukar; Bruder, Gerard; Pizzagalli, Diego A
2016-01-01
Major depressive disorder (MDD) is clinically, and likely pathophysiologically, heterogeneous. A potentially fruitful approach to parsing this heterogeneity is to focus on promising endophenotypes. Guided by the NIMH Research Domain Criteria initiative, we used source localization of scalp-recorded EEG resting data to examine the neural correlates of three emerging endophenotypes of depression: neuroticism, blunted reward learning, and cognitive control deficits. Data were drawn from the ongoing multi-site EMBARC study. We estimated intracranial current density for standard EEG frequency bands in 82 unmedicated adults with MDD, using Low-Resolution Brain Electromagnetic Tomography. Region-of-interest and whole-brain analyses tested associations between resting state EEG current density and endophenotypes of interest. Neuroticism was associated with increased resting gamma (36.5–44 Hz) current density in the ventral (subgenual) anterior cingulate cortex (ACC) and orbitofrontal cortex (OFC). In contrast, reduced cognitive control correlated with decreased gamma activity in the left dorsolateral prefrontal cortex (dlPFC), decreased theta (6.5–8 Hz) and alpha2 (10.5–12 Hz) activity in the dorsal ACC, and increased alpha2 activity in the right dlPFC. Finally, blunted reward learning correlated with lower OFC and left dlPFC gamma activity. Computational modeling of trial-by-trial reinforcement learning further indicated that lower OFC gamma activity was linked to reduced reward sensitivity. Three putative endophenotypes of depression were found to have partially dissociable resting intracranial EEG correlates, reflecting different underlying neural dysfunctions. Overall, these findings highlight the need to parse the heterogeneity of MDD by focusing on promising endophenotypes linked to specific pathophysiological abnormalities. PMID:26068725
Chiang, Michael F; Casper, Daniel S; Cimino, James J; Starren, Justin
2005-02-01
To assess the adequacy of 5 controlled medical terminologies (International Classification of Diseases 9, Clinical Modification [ICD9-CM]; Current Procedural Terminology 4 [CPT-4]; Systematized Nomenclature of Medicine, Clinical Terms [SNOMED-CT]; Logical Identifiers, Names, and Codes [LOINC]; Medical Entities Dictionary [MED]) for representing concepts in ophthalmology. Noncomparative case series. Twenty complete ophthalmology case presentations were sequentially selected from a publicly available ophthalmology journal. Each of the 20 cases was parsed into discrete concepts, and each concept was classified along 2 axes: (1) diagnosis, finding, or procedure and (2) ophthalmic or medical concept. Electronic or paper browsers were used to assign a code for every concept in each of the 5 terminologies. Adequacy of assignment for each concept was scored on a 3-point scale. Findings from all 20 case presentations were combined and compared based on a coverage score, which was the average score for all concepts in that terminology. Adequacy of assignment for concepts in each terminology, based on a 3-point Likert scale (0, no match; 1, partial match; 2, complete match). Cases were parsed into 1603 concepts. SNOMED-CT had the highest mean overall coverage score (1.625 ± 0.667), followed by MED (0.974 ± 0.764), LOINC (0.781 ± 0.929), ICD9-CM (0.280 ± 0.619), and CPT-4 (0.082 ± 0.337). SNOMED-CT also had higher coverage scores than any of the other terminologies for concepts in the diagnosis, finding, and procedure categories. Average coverage scores for ophthalmic concepts were lower than those for medical concepts. Controlled terminologies are required for electronic representation of ophthalmology data. SNOMED-CT had significantly higher content coverage than any other terminology in this study.
Walenski, Matthew; Swinney, David
2009-01-01
The central question underlying this study revolves around how children process co-reference relationships—such as those evidenced by pronouns (him) and reflexives (himself)—and how a slowed rate of speech input may critically affect this process. Previous studies of child language processing have demonstrated that typical language developing (TLD) children as young as 4 years of age process co-reference relations in a manner similar to adults on-line. In contrast, off-line measures of pronoun comprehension suggest a developmental delay for pronouns (relative to reflexives). The present study examines dependency relations in TLD children (ages 5–13) and investigates how a slowed rate of speech input affects the unconscious (on-line) and conscious (off-line) parsing of these constructions. For the on-line investigations (using a cross-modal picture priming paradigm), results indicate that at a normal rate of speech TLD children demonstrate adult-like syntactic reflexes. At a slowed rate of speech the typical language developing children displayed a breakdown in automatic syntactic parsing (again, similar to the pattern seen in unimpaired adults). As demonstrated in the literature, our off-line investigations (sentence/picture matching task) revealed that these children performed much better on reflexives than on pronouns at a regular speech rate. However, at the slow speech rate, performance on pronouns was substantially improved, whereas performance on reflexives was not different than at the regular speech rate. We interpret these results in light of a distinction between fast automatic processes (relied upon for on-line processing in real time) and conscious reflective processes (relied upon for off-line processing), such that slowed speech input disrupts the former, yet improves the latter. PMID:19343495
Comparing different kinds of words and word-word relations to test an habituation model of priming.
Rieth, Cory A; Huber, David E
2017-06-01
Huber and O'Reilly (2003) proposed that neural habituation exists to solve a temporal parsing problem, minimizing blending between one word and the next when words are visually presented in rapid succession. They developed a neural dynamics habituation model, explaining the finding that short duration primes produce positive priming whereas long duration primes produce negative repetition priming. The model contains three layers of processing, including a visual input layer, an orthographic layer, and a lexical-semantic layer. The predicted effect of prime duration depends both on this assumed representational hierarchy and the assumption that synaptic depression underlies habituation. The current study tested these assumptions by comparing different kinds of words (e.g., words versus non-words) and different kinds of word-word relations (e.g., associative versus repetition). For each experiment, the predictions of the original model were compared to an alternative model with different representational assumptions. Experiment 1 confirmed the prediction that non-words and inverted words require longer prime durations to eliminate positive repetition priming (i.e., a slower transition from positive to negative priming). Experiment 2 confirmed the prediction that associative priming increases and then decreases with increasing prime duration, but remains positive even with long duration primes. Experiment 3 replicated the effects of repetition and associative priming using a within-subjects design and combined these effects by examining target words that were expected to repeat (e.g., viewing the target word 'BACK' after the prime phrase 'back to'). These results support the originally assumed representational hierarchy and more generally the role of habituation in temporal parsing and priming. Copyright © 2017 Elsevier Inc. All rights reserved.
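A toy sketch of habituation via synaptic depression, the mechanism the model assumes: input consumes a slowly recovering resource, so responses to a long-duration prime fade. The parameter values are illustrative, not those of Huber and O'Reilly's model.

```python
# Resource-depletion dynamics: each step of input consumes transmitter
# resources, which recover slowly toward 1.0 when input stops.
dt, tau_recover, depletion = 1.0, 200.0, 0.2   # ms, ms, fraction per step
resources, output = 1.0, []
stimulus = [1.0] * 300 + [0.0] * 200           # 300 ms prime, then silence

for x in stimulus:
    response = resources * x                   # output gated by resources
    resources += dt * (1.0 - resources) / tau_recover - depletion * response
    output.append(response)

# Initial response, habituated response at prime offset, recovered pool.
print(output[0], round(output[299], 3), round(resources, 3))
```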
Chen, W.; Kowatch, R.; Lin, S.; Splaingard, M.
2015-01-01
Summary Nationwide Children's Hospital established an i2b2 (Informatics for Integrating Biology & the Bedside) application for sleep disorder cohort identification. Discrete data were gleaned from semistructured sleep study reports. The system was shown to work more efficiently than the traditional manual chart review method, and it also enabled searching capabilities that were previously not possible. Objective We report on the development and implementation of the sleep disorder i2b2 cohort identification system using natural language processing of semi-structured documents. Methods We developed a natural language processing approach to automatically parse concepts and their values from semi-structured sleep study documents. Two parsers were developed: a regular expression parser for extracting numeric concepts and an NLP-based tree parser for extracting textual concepts. Concepts were further organized into i2b2 ontologies based on document structures and in-domain knowledge. Results 26,550 concepts were extracted, 99% of them textual concepts. 1.01 million facts were extracted from sleep study documents, including demographic information, sleep study lab results, medications, procedures, and diagnoses, among others. The average accuracy of terminology parsing was over 83% when compared against annotations by experts. The system is capable of capturing both standard and non-standard terminologies. The time for cohort identification has been reduced significantly, from a few weeks to a few seconds. Conclusion Natural language processing was shown to be powerful for quickly converting large amounts of semi-structured or unstructured clinical data into discrete concepts, which, in combination with intuitive domain-specific ontologies, allows fast and effective interactive cohort identification through the i2b2 platform for research and clinical use. PMID:26171080
The taxonomic name resolution service: an online tool for automated standardization of plant names
2013-01-01
Background The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of misspelled names and bad taxonomy are erroneous scientific conclusions and faulty policy decisions. The lack of tools for correcting this ‘names problem’ has become a fundamental obstacle to integrating disparate data sources and advancing the progress of biodiversity science. Results The TNRS, or Taxonomic Name Resolution Service, is an online application for automated and user-supervised standardization of plant scientific names. The TNRS builds upon and extends existing open-source applications for name parsing and fuzzy matching. Names are standardized against multiple reference taxonomies, including the Missouri Botanical Garden's Tropicos database. Capable of processing thousands of names in a single operation, the TNRS parses and corrects misspelled names and authorities, standardizes variant spellings, and converts nomenclatural synonyms to accepted names. Family names can be included to increase match accuracy and resolve many types of homonyms. Partial matching of higher taxa combined with extraction of annotations, accession numbers and morphospecies allows the TNRS to standardize taxonomy across a broad range of active and legacy datasets. Conclusions We show how the TNRS can resolve many forms of taxonomic semantic heterogeneity, correct spelling errors and eliminate spurious names. As a result, the TNRS can aid the integration of disparate biological datasets. Although the TNRS was developed to aid in standardizing plant names, its underlying algorithms and design can be extended to all organisms and nomenclatural codes. The TNRS is accessible via a web interface at http://tnrs.iplantcollaborative.org/ and as a RESTful web service and application programming interface. Source code is available at https://github.com/iPlantCollaborativeOpenSource/TNRS/. PMID:23324024
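A minimal sketch of the fuzzy-matching step in the spirit of the TNRS (the service itself layers name parsing, authority handling, and multiple reference taxonomies on top of this):

```python
import difflib

# A tiny stand-in reference taxonomy; real references hold many thousands
# of accepted names.
reference = ["Quercus alba", "Quercus rubra", "Acer saccharum"]

def resolve(name, taxa, cutoff=0.8):
    """Return the closest accepted name above the similarity cutoff, or None."""
    match = difflib.get_close_matches(name, taxa, n=1, cutoff=cutoff)
    return match[0] if match else None

print(resolve("Quercus albaa", reference))  # 'Quercus alba' (misspelling fixed)
print(resolve("Pinus strobus", reference))  # None (absent from the reference)
```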
Roth, Dan
2013-01-01
Objective This paper presents a coreference resolution system for clinical narratives. Coreference resolution aims at clustering all mentions in a single document to coherent entities. Materials and methods A knowledge-intensive approach for coreference resolution is employed. The domain knowledge used includes several domain-specific lists, knowledge-intensive mention parsing, and a task-informed discourse model. Mention parsing allows us to abstract over the surface form of the mention and represent each mention using a higher-level representation, which we call the mention's semantic representation (SR). SR reduces the mention to a standard form and hence provides better support for comparing and matching. Existing coreference resolution systems tend to ignore discourse aspects and rely heavily on lexical and structural cues in the text. The authors break from this tradition and present a discourse model for “person” type mentions in clinical narratives, which greatly simplifies the coreference resolution. Results This system was evaluated on four different datasets that were made available in the 2011 i2b2/VA coreference challenge. The unweighted average of F1 scores (over B-cubed, MUC and CEAF) varied from 84.2% to 88.1%. These experiments show that domain knowledge is effective for different mention types for all the datasets. Discussion Error analysis shows that most of the recall errors made by the system can be handled by further addition of domain knowledge. The precision errors, on the other hand, are more subtle and indicate the need to understand the relations in which mentions participate for building a robust coreference system. Conclusion This paper presents an approach that makes extensive use of domain knowledge to significantly improve coreference resolution. The authors state that their system and the knowledge sources developed will be made publicly available. PMID:22781192
NASA Astrophysics Data System (ADS)
Butcher, G. J.; Roberts-Harris, D.
2013-12-01
A set of innovative classroom lessons was developed based on informal learning activities in the 'Sensors, Circuits, and Satellites' kit manufactured by littleBits™ Electronics. The lessons are designed to lead students through a logical science content storyline about energy using sound and light, and they fully implement an integrated approach to the three dimensions of the Next Generation Science Standards (NGSS). This session will illustrate the integration of NGSS into curriculum by deconstructing lesson design to parse out the unique elements of the three dimensions of NGSS. We will demonstrate ways in which we have incorporated the NGSS as we believe they were intended. According to the NGSS, 'The real innovation in the NGSS is the requirement that students operate at the intersection of practice, content, and connection. Performance expectations are the right way to integrate the three dimensions. It provides specificity for educators, but it also sets the tone for how science instruction should look in classrooms' (p. 3). The 'Sensors, Circuits, and Satellites' series of lessons accomplishes this by going beyond just focusing on the conceptual knowledge (the disciplinary core ideas) - traditionally approached by mapping lessons to standards. These lessons incorporate the other two dimensions - crosscutting concepts and the eight practices of science and engineering - via an authentic and exciting connection to NASA science, thus implementing the NGSS in the way they were designed to be used: practices and content with the crosscutting concepts. When the NGSS are properly integrated, students are engaged in science and engineering content through the coupling of practice, content, and connection. In the past, these dimensions have been treated as distinct entities. We know now that coupling content and practices better demonstrates what goes on in real-world science and engineering. We set out to accomplish what is called for in the NGSS by integrating these three dimensions to 'provide students with a context for the content of science, how science knowledge is acquired and understood, and how the sciences are connected through concepts that have universal meaning across the disciplines,' which include connections to authentic NASA science (NGSS, p. 2). The NASA context is embedded in the lessons and designed to interest students in Earth and space science. Research suggests that personal interest, experience, and enthusiasm, critical to children's learning of science at school or in other settings, may also be linked to later educational and career choices (Framework for K-12 Science Education: Practices, Crosscutting Concepts, Core Ideas, p. 28). Students are encouraged to follow their interests through additional online resources, real-world NASA applications, and career connections offering insight into course offerings and possible majors. Combined with the innovative electronic component kit manufactured by littleBits™ Electronics, the lessons excite and engage students in authentic science and engineering. [Figure: Sample circuit used in the Sensors, Circuits, and Satellites kit.]