Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics
NASA Technical Reports Server (NTRS)
Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.; Stanley, H. E.
1995-01-01
We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of coding regions. In particular, (i) an n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior, whereas the coding regions show logarithmic behavior over a wide interval, and (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced, and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of zeroth- and first-order Markovian models and simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.
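The two tests adapted in this study, rank-ordered n-tuple frequencies (Zipf analysis) and n-gram entropy with its redundancy, can be sketched in a few lines of Python. This is an illustrative reconstruction under the standard definitions, not the authors' code, and the toy sequence is invented:

```python
from collections import Counter
from math import log2

def ngram_counts(seq, n):
    """Count overlapping n-grams (n-tuples) in a symbol sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))

def zipf_ranked_frequencies(seq, n):
    """Relative n-gram frequencies sorted decreasingly, for a rank-frequency (Zipf) plot."""
    counts = ngram_counts(seq, n)
    total = sum(counts.values())
    return [c / total for c in sorted(counts.values(), reverse=True)]

def ngram_entropy(seq, n, alphabet_size=4):
    """Shannon n-gram entropy H_n in bits and redundancy R_n = 1 - H_n / (n * log2(k))."""
    counts = ngram_counts(seq, n)
    total = sum(counts.values())
    h = -sum((c / total) * log2(c / total) for c in counts.values())
    r = 1 - h / (n * log2(alphabet_size))
    return h, r

seq = "ATGCGATATATGCGC" * 20  # invented toy sequence, not real DNA
h2, r2 = ngram_entropy(seq, 2)
freqs = zipf_ranked_frequencies(seq, 2)
```

A regime close to a straight line on a log-log plot of the ranked frequencies would correspond to the power-law behavior reported for noncoding regions; a lower H_n (larger R_n) corresponds to the higher n-gram redundancy.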
Spectral and textural processing of ERTS imagery. [Kansas
NASA Technical Reports Server (NTRS)
Haralick, R. M.; Bosley, R. J.
1974-01-01
A procedure is developed to simultaneously extract textural features from all bands of ERTS multispectral scanner imagery for automatic analysis. Multi-images lead to excessively large grey tone N-tuple co-occurrence matrices; therefore, neighboring grey tone N-tuple differences are measured and an ellipsoidally symmetric functional form is assumed for the co-occurrence distribution of multi-image grey tone N-tuple differences. On the basis of past data the ellipsoidally symmetric approximation is shown to be reasonable. Initial evaluation of the procedure is encouraging.
Hybrid normed ideal perturbations of n-tuples of operators I
NASA Astrophysics Data System (ADS)
Voiculescu, Dan-Virgil
2018-06-01
In hybrid normed ideal perturbations of n-tuples of operators, the normed ideal is allowed to vary with the component operators. We begin extending to this setting the machinery we developed for normed ideal perturbations based on the modulus of quasicentral approximation and an adaptation of our non-commutative generalization of the Weyl-von Neumann theorem. For commuting n-tuples of hermitian operators, the modulus of quasicentral approximation remains essentially the same when C_n^- is replaced by a hybrid n-tuple C_{p_1}^-, …, C_{p_n}^-, where 1/p_1 + ⋯ + 1/p_n = 1. The proof involves singular integrals of mixed homogeneity.
Signature Verification Using N-tuple Learning Machine.
Maneechot, Thanin; Kitjaidure, Yuttana
2005-01-01
This research presents a new algorithm for signature verification using an N-tuple learning machine. The features are taken from handwritten signatures on a digital tablet (on-line). The recognition algorithm uses four extracted features, namely horizontal and vertical pen-tip position (x-y position), pen-tip pressure, and pen altitude angles. Verification uses an N-tuple technique with Gaussian thresholding.
True reason for Zipf's law in language
NASA Astrophysics Data System (ADS)
Dahui, Wang; Menghui, Li; Zengru, Di
2005-12-01
Analyses of word frequency have historically used data from English, French, and other languages, data typically described by Zipf's law. Using data on traditional and modern Chinese literature, we show here that Chinese character frequency followed Zipf's law in literature written before the Qin dynasty, but departed from Zipf's law in literature written after the Qin dynasty. Combined with data from English and Chinese dictionaries, we show that the true reason for Zipf's law in language is the growth and preferential selection mechanism of words or characters in a given language.
Hasan, Mehedi; Guemri, Rabiaa; Maldonado-Basilio, Ramón; Lucarz, Frédéric; de Bougrenet de la Tocnaye, Jean-Louis; Hall, Trevor
2015-12-15
A novel photonic circuit design for implementing frequency 8-tupling and 24-tupling was presented [Opt. Lett. 39, 6950 (2014), doi:10.1364/OL.39.006950], and although its key message remains unaltered, there were typographical errors in the equations that are corrected in this erratum.
Co-ordination of Mobile Information Agents in TuCSoN.
ERIC Educational Resources Information Center
Omicini, Andrea; Zambonelli, Franco
1998-01-01
Examines mobile agent coordination and presents TuCSoN, a coordination model for Internet applications based on mobile information agents that uses a tuple centre, a tuple space enhanced with the capability of programming its behavior in response to communication events. Discusses the effectiveness of the TuCSoN model in the contexts of Internet…
Zipf's Law Application To Oil Spill Detection In The Ocean
NASA Astrophysics Data System (ADS)
Platonov, A.; Redondo, J. M.
One of the results of the CLEAN SEAS European Union project, using SAR imaging of European coastal waters, was the statistical analysis and detection of thousands of oil spills and slicks in three compared regions: the Baltic Sea, the North Sea, and the N.W. Mediterranean. The results of another European project, OIL WATCH, together with the past 30 years of recorded major tanker accidental oil spills, have been used in a predictive scheme which, subject to spatial and temporal normalization of these two different-scale processes, clearly shows that the annual probability of the occurrence of an oil spill follows Zipf's law. Local deviations from the law may also be explained in terms of multifractal analysis.
Information Theory Applied to Animal Communication Systems and Its Possible Application to SETI
NASA Astrophysics Data System (ADS)
Hanser, Sean F.; Doyle, Laurance R.; McCowan, Brenda; Jenkins, Jon M.
2004-06-01
Information theory, as first introduced by Claude Shannon (Shannon & Weaver 1949), quantitatively evaluates the organizational complexity of communication systems. At the same time, George Zipf was examining linguistic structure in a way that was mathematically similar to the components of the Shannon first-order entropy (Zipf 1949). Both Shannon's and Zipf's mathematical procedures have been applied to animal communication and have recently been providing insightful results. The Zipf plot is a useful tool for a first estimate of the characterization of a communication system's complexity (which can later be examined for complex structure at deeper levels using Shannon entropic analysis). In this paper we shall discuss some of the applications and pitfalls of using the Zipf distribution as a preliminary evaluator of the communication complexity of a signaling system.
The emergence of Zipf's law - Spontaneous encoding optimization by users of a command language
NASA Technical Reports Server (NTRS)
Ellis, S. R.; Hitchcock, R. J.
1986-01-01
The distribution of commands issued by experienced users of a computer operating system allowing command customization tends to conform to Zipf's law. This result documents the emergence of a statistical property of natural language as users master an artificial language. Analysis of Zipf's law by Mandelbrot and Cherry shows that its emergence in the computer interaction of experienced users may be interpreted as evidence that these users optimize their encoding of commands. Accordingly, the extent to which users of a command language exhibit Zipf's law can provide a metric of the naturalness and efficiency with which that language is used.
Towards Informetrics: Haitun, Laplace, Zipf, Bradford and the Alvey Programme.
ERIC Educational Resources Information Center
Brookes, B. C.
1984-01-01
Review of recent developments in statistical theories for social sciences highlights Haitun's statistical distributions, Laplace's "Law of Succession" and distribution, Laplace and Bradford analysis of book-index data, inefficiency of frequency distribution analysis, Laws of Bradford and Zipf, natural categorization, and Bradford Law and…
García-Jacas, C R; Marrero-Ponce, Y; Barigye, S J; Hernández-Ortega, T; Cabrera-Leyva, L; Fernández-Castillo, A
2016-12-01
Novel N-tuple topological/geometric cutoffs to consider specific inter-atomic relations in the QuBiLS-MIDAS framework are introduced in this manuscript. These molecular cutoffs make it possible to take into account relations among more than two atoms, by using (dis-)similarity multi-metrics and concepts related to topological and Euclidean-geometric distances. To this end, the kth two-, three- and four-tuple topological and geometric neighbourhood quotient (NQ) total (or local-fragment) spatial-(dis)similarity matrices are defined, to represent 3D information corresponding to the relations between two, three and four atoms of the molecular structures that satisfy certain cutoff criteria. First, an analysis of a diverse chemical space for the most common values of topological/Euclidean-geometric distances, bond/dihedral angles, triangle/quadrilateral perimeters, triangle area and volume was performed in order to determine the intervals to take into account in the cutoff procedures. A variability analysis based on Shannon's entropy reveals that better distribution patterns are attained with the descriptors based on the proposed cutoffs (QuBiLS-MIDAS NQ-MDs) relative to the results obtained when all inter-atomic relations are considered (QuBiLS-MIDAS KA-MDs, 'Keep All'). A principal component analysis shows that the novel molecular cutoffs codify chemical information captured by the respective QuBiLS-MIDAS KA-MDs, as well as information not captured by the latter. Lastly, a QSAR study was carried out to gain deeper knowledge of the contribution of the proposed methods, using four molecular datasets (steroids (STER), angiotensin converting enzyme (ACE), thermolysin inhibitors (THER) and thrombin inhibitors (THR)) widely used as benchmarks in the evaluation of several methodologies. One- to four-variable QSAR models based on multiple linear regression were developed for each compound dataset, following the original division into training and test sets.
The results obtained reveal that the novel cutoff procedures yield superior performance relative to the QuBiLS-MIDAS KA-MDs in the prediction of the biological activities considered. From the results achieved, it can be suggested that the proposed N-tuple topological/geometric cutoffs constitute relevant criteria for generating MDs codifying particular atomic relations, ultimately useful in enhancing the modelling capacity of the QuBiLS-MIDAS 3D-MDs.
Emergence of good conduct, scaling and Zipf laws in human behavioral sequences in an online world.
Thurner, Stefan; Szell, Michael; Sinatra, Roberta
2012-01-01
We study behavioral action sequences of players in a massive multiplayer online game. In their virtual life players use eight basic actions which allow them to interact with each other. These actions are communication, trade, establishing or breaking friendships and enmities, attack, and punishment. We measure the probabilities for these actions conditional on previously taken and received actions and find a dramatic increase of negative behavior immediately after receiving negative actions. Similarly, positive behavior is intensified by receiving positive actions. We observe a tendency towards antipersistence in communication sequences. Classifying actions as positive (good) and negative (bad) allows us to define binary 'world lines' of the lives of individuals. Positive and negative actions are persistent and occur in clusters, indicated by large scaling exponents α ~ 0.87 of the mean square displacement of the world lines. For all eight action types we find strong signs of high levels of repetitiveness, especially for negative actions. We partition behavioral sequences into segments of length n (behavioral 'words' and 'motifs') and study their statistical properties. We find two approximate power laws in the word ranking distribution, one with an exponent of κ ~ -1 for ranks up to 100, and another with a lower exponent for higher ranks. The Shannon n-tuple redundancy yields large values and increases with word length, further underscoring the non-trivial statistical properties of behavioral sequences. On the collective, societal level the time series of particular actions per day can be understood by a simple mean-reverting log-normal model.
Hasan, Mehedi; Guemri, Rabiaa; Maldonado-Basilio, Ramón; Lucarz, Frédéric; de Bougrenet de la Tocnaye, Jean-Louis; Hall, Trevor
2014-12-15
A photonic circuit design for implementing frequency 8-tupling and 24-tupling is proposed. The front- and back-end of the circuit comprises 4×4 MMI couplers enclosing an array of four pairs of phase modulators and 2×2 MMI couplers. The proposed design for frequency multiplication requires no optical or electrical filters, the operation is not limited to carefully adjusted modulation indexes, and the drift originated from static DC bias is mitigated by making use of the intrinsic phase relations of multi-mode interference couplers. A transfer matrix approach is used to represent the main building blocks of the design and hence to describe the operation of the frequency 8-tupling and 24-tupling. The concept is theoretically developed and demonstrated by simulations. Ideal and imperfect power imbalances in the multi-mode interference couplers, as well as ideal and imperfect phases of the electric drives to the phase modulators, are analyzed.
The languages of health in general practice electronic patient records: a Zipf's law analysis.
Kalankesh, Leila R; New, John P; Baker, Patricia G; Brass, Andy
2014-01-10
Natural human languages show a power law behaviour in which word frequency (in any large enough corpus) is inversely proportional to word rank - Zipf's law. We have therefore asked whether similar power law behaviours could be seen in data from electronic patient records. In order to examine this question, anonymised data were obtained from all general practices in Salford covering a seven year period and captured in the form of Read codes. It was found that data for patient diagnoses and procedures followed Zipf's law. However, the medication data behaved very differently, looking much more like a referential index. We also observed differences in the statistical behaviour of the language used to describe patient diagnosis as a function of an anonymised GP practice identifier. This work demonstrates that data from electronic patient records do follow Zipf's law. We also found significant differences in Zipf's law behaviour in data from different GP practices. This suggests that computational linguistic techniques could become a useful additional tool to help understand and monitor the data quality of health records.
Using Zipf-Mandelbrot law and graph theory to evaluate animal welfare
NASA Astrophysics Data System (ADS)
de Oliveira, Caprice G. L.; Miranda, José G. V.; Japyassú, Hilton F.; El-Hani, Charbel N.
2018-02-01
This work deals with the construction and testing of metrics of welfare based on behavioral complexity, using assumptions derived from Zipf-Mandelbrot law and graph theory. To test these metrics we compared yellow-breasted capuchins (Sapajus xanthosternos) (Wied-Neuwied, 1826) (PRIMATES CEBIDAE) found in two institutions, subjected to different captive conditions: a Zoobotanical Garden (hereafter, ZOO; n = 14), in good welfare condition, and a Wildlife Rescue Center (hereafter, WRC; n = 8), in poor welfare condition. In the Zipf-Mandelbrot-based analysis, the power law exponent was calculated using behavior frequency values versus behavior rank value. These values allow us to evaluate variations in individual behavioral complexity. For each individual we also constructed a graph using the sequence of behavioral units displayed in each recording (average recording time per individual: 4 h 26 min in the ZOO, 4 h 30 min in the WRC). Then, we calculated the values of the main graph attributes, which allowed us to analyze the complexity of the connectivity of the behaviors displayed in the individuals' behavioral sequences. We found significant differences between the two groups for the slope values in the Zipf-Mandelbrot analysis. The slope values for the ZOO individuals approached -1, with graphs representing a power law, while the values for the WRC individuals diverged from -1, differing from a power law pattern. Likewise, we found significant differences for the graph attributes average degree, weighted average degree, and clustering coefficient when comparing the ZOO and WRC individual graphs. However, no significant difference was found for the attributes modularity and average path length. Both analyses were effective in detecting differences between the patterns of behavioral complexity in the two groups. The slope values for the ZOO individuals indicated a higher behavioral complexity when compared to the WRC individuals. 
Similarly, graph construction and the calculation of its attributes values allowed us to show that the complexity of the connectivity among the behaviors was higher in the ZOO than in the WRC individual graphs. These results show that the two measuring approaches introduced and tested in this paper were capable of capturing the differences in welfare levels between the two conditions, as shown by differences in behavioral complexity.
Beyond Zipf's Law: The Lavalette Rank Function and Its Properties.
Fontanelli, Oscar; Miramontes, Pedro; Yang, Yaning; Cocho, Germinal; Li, Wentian
Although Zipf's law is widespread in natural and social data, one often encounters situations where one or both ends of the ranked data deviate from the power-law function. Previously we proposed the Beta rank function to improve the fitting of data which does not follow a perfect Zipf's law. Here we show that when the two parameters in the Beta rank function have the same value, the Lavalette rank function, the probability density function can be derived analytically. We also show both computationally and analytically that Lavalette distribution is approximately equal, though not identical, to the lognormal distribution. We illustrate the utility of Lavalette rank function in several datasets. We also address three analysis issues on the statistical testing of Lavalette fitting function, comparison between Zipf's law and lognormal distribution through Lavalette function, and comparison between lognormal distribution and Lavalette distribution.
Variation of Zipf's exponent in one hundred live languages: A study of the Holy Bible translations
NASA Astrophysics Data System (ADS)
Mehri, Ali; Jamaati, Maryam
2017-08-01
Zipf's law, as a power-law regularity, reflects long-range correlations between the elements of natural and artificial systems. In this article, the law is evaluated for one hundred living languages: for this purpose, we calculate Zipf's exponent for translations of the Holy Bible into these languages. The results show that the average Zipf exponent across the studied texts is slightly above unity. Within some language families, all of the studied languages have a Zipf exponent consistently below or above unity. It seems that geographical distribution influences communication between speakers of different languages within a family and affects the similarity of their Zipf exponents. The Bible conveys the same content regardless of language, but discrepancies in grammatical rules and in the syntactic regularities governing how stop words are used to build sentences and convey a given concept lead to differences in Zipf's exponent across languages.
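The exponent measurement behind such cross-language comparisons can be sketched with a least-squares fit of log-frequency against log-rank. This is a minimal illustration of a common estimator, not necessarily the authors' procedure (which may differ, e.g. by using maximum likelihood); the synthetic corpus below is invented so that it follows an exact f(r) = C/r law:

```python
from collections import Counter
from math import log

def zipf_exponent(tokens):
    """Estimate the Zipf exponent s in f(r) ~ r^(-s) by ordinary least
    squares on log-frequency versus log-rank (a rough but common estimator)."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [log(r) for r in range(1, len(freqs) + 1)]
    ys = [log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return -slope  # Zipf exponent is minus the log-log slope

# synthetic corpus: word r occurs 2520/r times (2520 is divisible by 1..10),
# i.e. an exact Zipf law with exponent 1
tokens = [w for r in range(1, 11) for w in [f"w{r}"] * (2520 // r)]
s = zipf_exponent(tokens)  # should be very close to 1.0
```

An exponent slightly above unity for a real text, as reported in the abstract, would appear here as a log-log slope slightly steeper than -1.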
Zipf's law holds for phrases, not words.
Williams, Jake Ryland; Lessard, Paul R; Desu, Suma; Clark, Eric M; Bagrow, James P; Danforth, Christopher M; Dodds, Peter Sheridan
2015-08-11
With Zipf's law being originally and most famously observed for word frequency, it is surprisingly limited in its applicability to human language, holding over no more than three to four orders of magnitude before hitting a clear break in scaling. Here, building on the simple observation that phrases of one or more words comprise the most coherent units of meaning in language, we show empirically that Zipf's law for phrases extends over as many as nine orders of rank magnitude. In doing so, we develop a principled and scalable statistical mechanical method of random text partitioning, which opens up a rich frontier of rigorous text analysis via a rank ordering of mixed length phrases.
The evolution of Zipf's law indicative of city development
NASA Astrophysics Data System (ADS)
Chen, Yanguang
2016-02-01
Zipf's law of city-size distributions can be expressed by three types of mathematical models: one-parameter form, two-parameter form, and three-parameter form. The one-parameter model and one of the two-parameter models are familiar to urban scientists. However, the three-parameter model and the other type of two-parameter model have not attracted attention. This paper is devoted to exploring the conditions and scopes of application of these Zipf models. By mathematical reasoning and empirical analysis, new discoveries are made as follows. First, if the size distribution of cities in a geographical region cannot be described with the one- or two-parameter model, it may be characterized by the three-parameter model with a scaling factor and a scale-translational factor. Second, all these Zipf models can be unified by hierarchical scaling laws based on cascade structure. Third, the patterns of city-size distributions seem to evolve from the three-parameter mode to the two-parameter mode, and then to the one-parameter mode. Four-year census data of Chinese cities are employed to verify the three-parameter Zipf's law and the corresponding hierarchical structure of rank-size distributions. This study sheds light on the scientific laws of social systems and the nature of urban development.
Zipf's law and city size distribution: A survey of the literature and future research agenda
NASA Astrophysics Data System (ADS)
Arshad, Sidra; Hu, Shougeng; Ashraf, Badar Nadeem
2018-02-01
This study provides a systematic review of the existing literature on Zipf's law for city size distribution. Existing empirical evidence suggests that Zipf's law is not always observable even for the upper-tail cities of a territory. However, the controversy over empirical findings arises from sample selection biases, methodological weaknesses and data limitations. The hypothesis of Zipf's law is more likely to be rejected for the entire city size distribution and, in such cases, alternative distributions have been suggested. On the contrary, the hypothesis is more likely to be accepted if better empirical methods are employed and cities are properly defined. The debate is still far from conclusive. In addition, we identify four emerging areas in Zipf's law and city size distribution research: the size distribution of lower-tail cities, the size distribution of cities in sub-national regions, the alternative forms of Zipf's law, and the relationship between Zipf's law and the coherence property of the urban system.
Bankruptcy risk model and empirical tests
Podobnik, Boris; Horvatic, Davor; Petersen, Alexander M.; Urošević, Branko; Stanley, H. Eugene
2010-01-01
We analyze the size dependence and temporal stability of firm bankruptcy risk in the US economy by applying Zipf scaling techniques. We focus on a single risk factor—the debt-to-asset ratio R—in order to study the stability of the Zipf distribution of R over time. We find that the Zipf exponent increases during market crashes, implying that firms go bankrupt with larger values of R. Based on the Zipf analysis, we employ Bayes’s theorem and relate the conditional probability that a bankrupt firm has a ratio R with the conditional probability of bankruptcy for a firm with a given R value. For 2,737 bankrupt firms, we demonstrate size dependence in assets change during the bankruptcy proceedings. Prepetition firm assets and petition firm assets follow Zipf distributions but with different exponents, meaning that firms with smaller assets adjust their assets more than firms with larger assets during the bankruptcy process. We compare bankrupt firms with nonbankrupt firms by analyzing the assets and liabilities of two large subsets of the US economy: 2,545 Nasdaq members and 1,680 New York Stock Exchange (NYSE) members. We find that both assets and liabilities follow a Pareto distribution. The finding is not a trivial consequence of the Zipf scaling relationship of firm size quantified by employees—although the market capitalization of Nasdaq stocks follows a Pareto distribution, the same distribution does not describe NYSE stocks. We propose a coupled Simon model that simultaneously evolves both assets and debt with the possibility of bankruptcy, and we also consider the possibility of firm mergers. PMID:20937903
The Evolution of the Exponent of Zipf's Law in Language Ontogeny
Baixeries, Jaume; Elvevåg, Brita; Ferrer-i-Cancho, Ramon
2013-01-01
It is well-known that word frequencies arrange themselves according to Zipf's law. However, little is known about the dependency of the parameters of the law and the complexity of a communication system. Many models of the evolution of language assume that the exponent of the law remains constant as the complexity of a communication systems increases. Using longitudinal studies of child language, we analysed the word rank distribution for the speech of children and adults participating in conversations. The adults typically included family members (e.g., parents) or the investigators conducting the research. Our analysis of the evolution of Zipf's law yields two main unexpected results. First, in children the exponent of the law tends to decrease over time while this tendency is weaker in adults, thus suggesting this is not a mere mirror effect of adult speech. Second, although the exponent of the law is more stable in adults, their exponents fall below 1 which is the typical value of the exponent assumed in both children and adults. Our analysis also shows a tendency of the mean length of utterances (MLU), a simple estimate of syntactic complexity, to increase as the exponent decreases. The parallel evolution of the exponent and a simple indicator of syntactic complexity (MLU) supports the hypothesis that the exponent of Zipf's law and linguistic complexity are inter-related. The assumption that Zipf's law for word ranks is a power-law with a constant exponent of one in both adults and children needs to be revised. PMID:23516390
Design and Implementation of A Backend Multiple-Processor Relational Data Base Computer System.
1981-12-01
propagated to other parts of the data base. Cost. As mentioned earlier, a primary motivation for the backend DBMS work is the development of an...uniquely identify the n-tuples of the relation is called the primary key. For example, in Figure 3, the primary key is NUMBER. A primary key is said to...identifying the tuple. For example, in Figure 3, (NUMBER,TITLE) would not be a nonredundant primary key for COURSE. A relation can contain more than one
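The snippet's notions of a unique and a nonredundant (candidate) key can be illustrated directly. A minimal sketch; the COURSE rows below are invented stand-ins for the report's Figure 3 relation:

```python
from itertools import combinations

def is_unique(rows, attrs):
    """True if the attribute set uniquely identifies every tuple in the relation."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def is_candidate_key(rows, attrs):
    """Unique AND nonredundant: no proper subset of attrs is itself unique."""
    if not is_unique(rows, attrs):
        return False
    return not any(is_unique(rows, list(sub))
                   for k in range(1, len(attrs))
                   for sub in combinations(attrs, k))

# invented COURSE relation echoing the attributes named in the record
COURSE = [
    {"NUMBER": "CS101", "TITLE": "Databases"},
    {"NUMBER": "CS102", "TITLE": "Operating Systems"},
]
```

Here NUMBER alone is a candidate key, while (NUMBER, TITLE) is unique but redundant, matching the snippet's point that it would not be a nonredundant primary key.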
Zipf law: an extreme perspective
NASA Astrophysics Data System (ADS)
Eliazar, Iddo
2016-04-01
Extreme value theory (EVT) asserts that the Fréchet law emerges universally from linearly scaled maxima of collections of independent and identically distributed random variables that are positive-valued. Observations of many real-world sizes, e.g. city-sizes, give rise to the Zipf law: if we rank the sizes decreasingly, and plot the log-sizes versus the log-ranks, then an affine line emerges. In this paper we present an EVT approach to the Zipf law. Specifically, we establish that whenever the Fréchet law emerges from the EVT setting, then the Zipf law follows. The EVT generation of the Zipf law, its universality, and its associated phase transition, are analyzed and described in detail.
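The rank-size picture described here can be illustrated with a toy simulation: independent heavy-tailed (Pareto) sizes, which lie in the domain of attraction of the Fréchet law, trace an approximately affine line of slope near -1/α when log-sizes are plotted against log-ranks. A sketch under these assumptions, not the paper's derivation:

```python
import random
from math import log

def loglog_rank_slope(sizes):
    """OLS slope of log(size) versus log(rank) for decreasingly ranked sizes."""
    ranked = sorted(sizes, reverse=True)
    xs = [log(r) for r in range(1, len(ranked) + 1)]
    ys = [log(s) for s in ranked]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
           sum((x - mx) ** 2 for x in xs)

random.seed(7)
alpha = 1.5  # Pareto tail index; the Zipf-plot slope should be near -1/alpha
sizes = [random.paretovariate(alpha) for _ in range(5000)]
slope = loglog_rank_slope(sizes)  # expected to be close to -2/3
```

The near-affine log-log line is exactly the Zipf-law signature the abstract describes emerging from the Fréchet setting.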
Empirical tests of Zipf's law mechanism in open source Linux distribution.
Maillart, T; Sornette, D; Spaeth, S; von Krogh, G
2008-11-21
Zipf's power law is a ubiquitous empirical regularity found in many systems, thought to result from proportional growth. Here, we establish empirically the usually assumed ingredients of stochastic growth models that have been previously conjectured to be at the origin of Zipf's law. We use exceptionally detailed data on the evolution of open source software projects in Linux distributions, which offer a remarkable example of a growing complex self-organizing adaptive system, exhibiting Zipf's law over four full decades.
Empirical and Theoretical Bases of Zipf's Law.
ERIC Educational Resources Information Center
Wyllys, Ronald E.
1981-01-01
Explains Zipf's Law of Vocabulary Distribution (i.e., relationship between frequency of a word in a corpus and its rank), noting the discovery of the law, alternative forms, and literature relating to the search for a rationale for Zipf's Law. Thirty-eight references are cited. (EJS)
Evolution of Scaling Emergence in Large-Scale Spatial Epidemic Spreading
Wang, Lin; Li, Xiang; Zhang, Yi-Qing; Zhang, Yan; Zhang, Kan
2011-01-01
Background Zipf's law and Heaps' law are two representatives of the scaling concepts, which play a significant role in the study of complexity science. The coexistence of the Zipf's law and the Heaps' law motivates different understandings on the dependence between these two scalings, which has still hardly been clarified. Methodology/Principal Findings In this article, we observe an evolution process of the scalings: the Zipf's law and the Heaps' law are naturally shaped to coexist at the initial time, while the crossover comes with the emergence of their inconsistency at the larger time before reaching a stable state, where the Heaps' law still exists with the disappearance of strict Zipf's law. Such findings are illustrated with a scenario of large-scale spatial epidemic spreading, and the empirical results of pandemic disease support a universal analysis of the relation between the two laws regardless of the biological details of disease. Employing the United States domestic air transportation and demographic data to construct a metapopulation model for simulating the pandemic spread at the U.S. country level, we uncover that the broad heterogeneity of the infrastructure plays a key role in the evolution of scaling emergence. Conclusions/Significance The analyses of large-scale spatial epidemic spreading help understand the temporal evolution of scalings, indicating the coexistence of the Zipf's law and the Heaps' law depends on the collective dynamics of epidemic processes, and the heterogeneity of epidemic spread indicates the significance of performing targeted containment strategies at the early time of a pandemic disease. PMID:21747932
Mapping of the Tuple1 gene to mouse chromosome 16A-B1
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mattei, M.G.; Halford, S.; Scambler, P.J.
The human TUPLE1 gene encodes a putative transcriptional regulator and maps to chromosome 22, and therefore may play a role in DiGeorge syndrome (DGS), velo-cardio-facial syndrome (VCFS), or a related pathology. The murine Tuple1 gene has also been cloned and shows strong sequence similarity to human TUPLE1. Comparative mapping is useful in the study of chromosome evolution and is sometimes able to indicate possible mouse mutations that are potential models of human genetic disorders. As TUPLE1 is a candidate gene for the haploinsufficient phenotype in DGS, we mapped Tuple1 to mouse chromosome 16A-B1. 6 refs., 1 fig.
Zipf's Law and Avoidance of Excessive Synonymy
ERIC Educational Resources Information Center
Manin, Dmitrii Y.
2008-01-01
Zipf's law states that if words of language are ranked in the order of decreasing frequency in texts, the frequency of a word is inversely proportional to its rank. It is very reliably observed in the data, but to date it escaped satisfactory theoretical explanation. This article suggests that Zipf's law may result from a hierarchical organization…
Zipf's word frequency law in natural language: a critical review and future directions.
Piantadosi, Steven T
2014-10-01
The frequency distribution of words has been a key object of study in statistical linguistics for the past 70 years. This distribution approximately follows a simple mathematical form known as Zipf's law. This article first shows that human language has a highly complex, reliable structure in the frequency distribution over and above this classic law, although prior data visualization methods have obscured this fact. A number of empirical phenomena related to word frequencies are then reviewed. These facts are chosen to be informative about the mechanisms giving rise to Zipf's law and are then used to evaluate many of the theoretical explanations of Zipf's law in language. No prior account straightforwardly explains all the basic facts or is supported with independent evaluation of its underlying assumptions. To make progress at understanding why language obeys Zipf's law, studies must seek evidence beyond the law itself, testing assumptions and evaluating novel predictions with new, independent data.
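The inverse rank-frequency relation at the heart of Zipf's law is easy to probe directly. A minimal sketch (the toy corpus and function name are illustrative, not drawn from the reviewed papers):

```python
from collections import Counter

def zipf_check(words, top=50):
    """Rank words by frequency and return (rank, freq, rank*freq) triples.

    Under an ideal Zipf law f(r) ~ C/r, the product rank*freq is
    roughly constant across ranks.
    """
    freqs = [f for _, f in Counter(words).most_common(top)]
    return [(r, f, r * f) for r, f in enumerate(freqs, start=1)]

# Toy corpus built so that word i appears ~600/i times (illustration only).
corpus = []
for i, w in enumerate(["the", "of", "and", "to", "a"], start=1):
    corpus += [w] * (600 // i)

for rank, freq, product in zipf_check(corpus):
    print(rank, freq, product)   # rank*freq is constant (600) for this toy corpus
```

On real text the product rank × frequency is only approximately constant; such systematic deviations are precisely the structure the review above examines.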
On power series expansions of the S-resolvent operator and the Taylor formula
NASA Astrophysics Data System (ADS)
Colombo, Fabrizio; Gantner, Jonathan
2016-12-01
The S-functional calculus is based on the theory of slice hyperholomorphic functions and defines functions of n-tuples of not necessarily commuting operators or of quaternionic operators. This calculus relies on the notions of S-spectrum and of S-resolvent operator. Since most of the properties that hold for the Riesz-Dunford functional calculus extend to the S-functional calculus, it can be considered its noncommutative version. In this paper we show that the Taylor formula of the Riesz-Dunford functional calculus can be generalized to the S-functional calculus. The proof is not a trivial extension of the classical case, because several obstructions due to the noncommutativity of the setting in which we work have to be overcome. To prove the Taylor formula we need to introduce a new series expansion of the S-resolvent operators associated with the sum of two n-tuples of operators. This result is a crucial step in the proof of our main results, but it is also of independent interest because it gives a new series expansion for the S-resolvent operators. This paper is addressed to researchers working in operator theory and in hypercomplex analysis.
Log-Log Convexity of Type-Token Growth in Zipf's Systems
NASA Astrophysics Data System (ADS)
Font-Clos, Francesc; Corral, Álvaro
2015-06-01
It is traditionally assumed that Zipf's law implies the power-law growth of the number of different elements with the total number of elements in a system—the so-called Heaps' law. We show that a careful definition of Zipf's law leads to the violation of Heaps' law in random systems, with growth curves that have a convex shape in log-log scale. These curves fulfill universal data collapse that only depends on the value of Zipf's exponent. We observe that real books behave very much in the same way as random systems, despite the presence of burstiness in word occurrence. We advance an explanation for this unexpected correspondence.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wadey, R.; Roberts, C.; Daw, S.
1994-09-01
Deletions within chromosome 22q11 cause a wide variety of birth defects including DiGeorge syndrome and Shprintzen syndrome. We have defined a commonly deleted region of over 2 Mb, and a critical region of 300 kb. A gene, TUPLE1, has been isolated from this critical region encoding a transcriptional regulator similar to the yeast HIR1 histone regulator gene. Since it has been suggested that DGS results from a defective neural crest, the expression of Tuple1 was examined in whole mouse and chick embryos, tissue sections, and neural tube explants: Tuple1 is expressed in a dynamic pattern with high levels in regions containing migrating crest. Prior to crest migration, Tuple1 is expressed in a rhombomere-specific pattern. Later, Tuple1 is expressed in discrete domains within the developing neural tube. A remarkable feature of the experiments was the detection of a similar dynamic pattern with the sense probe; i.e., there is an antisense Tuple1 transcript. This was confirmed using RPA. Tuple1 is being screened for mutations in non-deletion patients and constructs assembled for homologous recombination in ES cells. Tuple1 maps to MMU16, extending the homology of linkage with human chromosome 22. From these data we predict that the human homologue of the murine scid mutation maps to 22q11.
Word frequencies: A comparison of Pareto type distributions
NASA Astrophysics Data System (ADS)
Wiegand, Martin; Nadarajah, Saralees; Si, Yuancheng
2018-03-01
Mehri and Jamaati (2017) [18] used Zipf's law to model word frequencies in Holy Bible translations for one hundred live languages. We compare the fit of Zipf's law to a number of Pareto type distributions. The latter distributions are shown to provide the best fit, as judged by a number of comparative plots and error measures. The fit of Zipf's law appears generally poor.
Modeling Fractal Structure of City-Size Distributions Using Correlation Functions
Chen, Yanguang
2011-01-01
Zipf's law is one of the most conspicuous empirical facts for cities; however, there is no convincing explanation for the scaling relation between rank and size or for its scaling exponent. Using ideas from general fractals and scaling, I propose a dual competition hypothesis of city development to explain the value intervals and the special value, 1, of the power exponent. Zipf's law and Pareto's law can be mathematically transformed into one another, but represent different processes of urban evolution. Based on the Pareto distribution, a frequency correlation function can be constructed. By scaling analysis and the multifractal spectrum, the parameter interval of the Pareto exponent is derived as (0.5, 1]. Based on the Zipf distribution, a size correlation function can be built, and it is the opposite of the first one. By the second correlation function and multifractal notions, the Pareto exponent interval is derived as [1, 2). Thus the process of urban evolution involves two effects: one is the Pareto effect, indicating city number increase (external complexity); the other is the Zipf effect, indicating city size growth (internal complexity). Because of the struggle between the two effects, the scaling exponent varies from 0.5 to 2; but if the two effects reach equilibrium with each other, the scaling exponent approaches 1. A series of mathematical experiments on hierarchical correlation are employed to verify the models, and a conclusion can be drawn that if cities in a given region follow Zipf's law, the frequency and size correlations will follow the scaling law. This theory can be generalized to interpret inverse power-law distributions in various fields of the physical and social sciences. PMID:21949753
Understanding Zipf's law of word frequencies through sample-space collapse in sentence formation
Thurner, Stefan; Hanel, Rudolf; Liu, Bo; Corominas-Murtra, Bernat
2015-01-01
The formation of sentences is a highly structured and history-dependent process. The probability of using a specific word in a sentence strongly depends on the ‘history’ of word usage earlier in that sentence. We study a simple history-dependent model of text generation assuming that the sample-space of word usage reduces along sentence formation, on average. We first show that the model explains the approximate Zipf law found in word frequencies as a direct consequence of sample-space reduction. We then empirically quantify the amount of sample-space reduction in the sentences of 10 famous English books, by analysis of corresponding word-transition tables that capture which words can follow any given word in a text. We find a highly nested structure in these transition tables and show that this ‘nestedness’ is tightly related to the power law exponents of the observed word frequency distributions. With the proposed model, it is possible to understand that the nestedness of a text can be the origin of the actual scaling exponent and that deviations from the exact Zipf law can be understood by variations of the degree of nestedness on a book-by-book basis. On a theoretical level, we are able to show that in the case of weak nesting, Zipf's law breaks down in a fast transition. Unlike previous attempts to understand Zipf's law in language the sample-space reducing model is not based on assumptions of multiplicative, preferential or self-organized critical mechanisms behind language formation, but simply uses the empirically quantifiable parameter ‘nestedness’ to understand the statistics of word frequencies. PMID:26063827
Uniqueness of the joint measurement and the structure of the set of compatible quantum measurements
NASA Astrophysics Data System (ADS)
Guerini, Leonardo; Terra Cunha, Marcelo
2018-04-01
We address the problem of characterising the compatible tuples of measurements that admit a unique joint measurement. We derive a uniqueness criterion based on the method of perturbations and apply it to show that extremal points of the set of compatible tuples admit a unique joint measurement, while all tuples that admit a unique joint measurement lie in the boundary of such a set. We also provide counter-examples showing that none of these properties are both necessary and sufficient, thus completely describing the relation between the joint measurement uniqueness and the structure of the compatible set. As a by-product of our investigations, we completely characterise the extremal and boundary points of the set of general tuples of measurements and of the subset of compatible tuples.
Deviation of Zipf's and Heaps' Laws in Human Languages with Limited Dictionary Sizes
Lü, Linyuan; Zhang, Zi-Ke; Zhou, Tao
2013-01-01
Zipf's law on word frequency and Heaps' law on the growth of distinct words are observed in the Indo-European language family, but they do not hold for languages like Chinese, Japanese, and Korean. These languages consist of characters and have very limited dictionary sizes. Extensive experiments show that: (i) the character frequency distribution follows a power law with exponent close to one, at which the corresponding Zipf's exponent diverges; indeed, the character frequency decays exponentially in the Zipf plot; (ii) the number of distinct characters grows with the text length in three stages: it grows linearly in the beginning, then turns to a logarithmic form, and eventually saturates. A theoretical model for the writing process is proposed, which embodies the rich-get-richer mechanism and the effects of limited dictionary size. Experiments, simulations, and analytical solutions agree well with each other. This work refines the understanding of Zipf's and Heaps' laws in human language systems. PMID:23378896
Towards a seascape typology. I. Zipf versus Pareto laws
NASA Astrophysics Data System (ADS)
Seuront, Laurent; Mitchell, James G.
Two data analysis methods, referred to as the Zipf and Pareto methods, initially introduced in economics and linguistics two centuries ago and subsequently used in a wide range of fields (word frequency in languages and literature, human demographics, finance, city formation, genomics and physics), are described and proposed here as a potential tool to classify space-time patterns in marine ecology. The aim of this paper is, first, to present the theoretical bases of Zipf and Pareto laws, and to demonstrate that they are strictly equivalent. In that way, we provide a one-to-one correspondence between their characteristic exponents and argue that the choice of technique is a matter of convenience. Second, we argue that the appeal of this technique is that it is assumption-free for the distribution of the data and regularity of sampling interval, as well as being extremely easy to implement. Finally, in order to allow marine ecologists to identify and classify any structure in their data sets, we provide a step by step overview of the characteristic shapes expected for Zipf's law for the cases of randomness, power law behavior, power law behavior contaminated by internal and external noise, and competing power laws illustrated on the basis of typical ecological situations such as mixing processes involving non-interacting and interacting species, phytoplankton growth processes and differential grazing by zooplankton.
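The one-to-one correspondence between the two characteristic exponents can be checked numerically: a rank-size law s(r) = C·r^(-α) implies a complementary cumulative distribution P(S ≥ s) ∝ s^(-1/α). A sketch of that textbook relation (function names are illustrative):

```python
import math

def zipf_sizes(n=1000, alpha=1.5, c=1.0):
    """Exact rank-size sequence s(r) = c * r**(-alpha)."""
    return [c * r ** (-alpha) for r in range(1, n + 1)]

def pareto_exponent(sizes):
    """Estimate the Pareto (complementary-CDF) exponent by regressing
    log(count of items >= s) on log(s) over the sorted sizes."""
    sizes = sorted(sizes, reverse=True)
    xs = [math.log(s) for s in sizes]
    ys = [math.log(r) for r in range(1, len(sizes) + 1)]  # count of items >= s
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return -slope

alpha = 1.5
est = pareto_exponent(zipf_sizes(alpha=alpha))
print(alpha, est)   # Pareto exponent ≈ 1/alpha
```

For exact Zipf data the two exponents are reciprocal to machine precision, which is one way to see why the choice between the two methods is a matter of convenience.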
Design of a lattice-based faceted classification system
NASA Technical Reports Server (NTRS)
Eichmann, David A.; Atkins, John
1992-01-01
We describe a software reuse architecture supporting component retrieval by facet classes. The facets are organized into a lattice of facet sets and facet n-tuples. The query mechanism supports precise retrieval and flexible browsing.
Zipf's Law for Word Frequencies: Word Forms versus Lemmas in Long Texts.
Corral, Álvaro; Boleda, Gemma; Ferrer-i-Cancho, Ramon
2015-01-01
Zipf's law is a fundamental paradigm in the statistics of written and spoken natural language as well as in other communication systems. We raise the question of the elementary units for which Zipf's law should hold in the most natural way, studying its validity for plain word forms and for the corresponding lemma forms. We analyze several long literary texts comprising four languages, with different levels of morphological complexity. In all cases Zipf's law is fulfilled, in the sense that a power-law distribution of word or lemma frequencies is valid for several orders of magnitude. We investigate the extent to which the word-lemma transformation preserves two parameters of Zipf's law: the exponent and the low-frequency cut-off. We are not able to demonstrate a strict invariance of the tail, as for a few texts both exponents deviate significantly, but we conclude that the exponents are very similar, despite the remarkable transformation that going from words to lemmas represents, considerably affecting all ranges of frequencies. In contrast, the low-frequency cut-offs are less stable, tending to increase substantially after the transformation.
Explaining Zipf's law via a mental lexicon
NASA Astrophysics Data System (ADS)
Allahverdyan, Armen E.; Deng, Weibing; Wang, Q. A.
2013-12-01
Zipf's law is the major regularity of statistical linguistics that has served as a prototype for rank-frequency relations and scaling laws in natural sciences. Here we show that Zipf's law—together with its applicability for a single text and its generalizations to high and low frequencies including hapax legomena—can be derived from assuming that the words are drawn into the text with random probabilities. Their a priori density relates, via the Bayesian statistics, to the mental lexicon of the author who produced the text.
Pareto-Zipf law in growing systems with multiplicative interactions
NASA Astrophysics Data System (ADS)
Ohtsuki, Toshiya; Tanimoto, Satoshi; Sekiyama, Makoto; Fujihara, Akihiro; Yamamoto, Hiroshi
2018-06-01
Numerical simulations of multiplicatively interacting stochastic processes with weighted selections were conducted. A feedback mechanism to control the weight w of selections was proposed. It becomes evident that when w is moderately controlled around 0, such systems spontaneously exhibit the Pareto-Zipf distribution. The simulation results are universal in the sense that microscopic details, such as parameter values and the type of control and weight, are irrelevant. The central ingredient of the Pareto-Zipf law is argued to be the mild control of interactions.
Estimates of Storage Capacity of Multilayer Perceptron with Threshold Logic Hidden Units.
Kowalczyk, Adam
1997-11-01
We estimate the storage capacity of a multilayer perceptron with n inputs, h_1 threshold logic units in the first hidden layer, and one output. We show that if the network can memorize 50% of all dichotomies of a randomly selected N-tuple of points of R^n with probability 1, then N = 2(nh_1 + 1), while at 100% memorization N = nh_1 + 1. Furthermore, if the bounds are reached, then the first hidden layer must be fully connected to the input. It is shown that such a network has memory capacity (in the sense of Cover) between nh_1 + 1 and 2(nh_1 + 1) input patterns, and for the most efficient networks in this class between 1 and 2 input patterns per connection. Comparing these results with recent estimates of the VC-dimension, we find that, in contrast to the single-neuron case, the VC-dimension exceeds the capacity for sufficiently large n and h_1. The results are based on the derivation of an explicit expression for the number of dichotomies which can be implemented by such a network for a special class of N-tuples of input patterns which has a positive probability of being randomly chosen.
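The two capacity bounds quoted above are simple to evaluate for concrete network sizes. A trivial sketch (the function name is mine, not from the paper):

```python
def mlp_capacity_bounds(n, h1):
    """Capacity bounds for a perceptron with n inputs and h1
    threshold-logic hidden units, per the abstract above:
    N = n*h1 + 1 patterns at 100% memorization,
    N = 2*(n*h1 + 1) at 50% memorization of random dichotomies."""
    full = n * h1 + 1
    return full, 2 * full

print(mlp_capacity_bounds(10, 5))   # → (51, 102)
```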
Optimal shortening of uniform covering arrays
Rangel-Valdez, Nelson; Avila-George, Himer; Carrizalez-Turrubiates, Oscar
2017-01-01
Software test suites based on the concept of interaction testing are very useful for testing software components in an economical way. Test suites of this kind may be created using mathematical objects called covering arrays. A covering array, denoted by CA(N; t, k, v), is an N × k array over Z_v = {0, …, v-1} with the property that every N × t sub-array covers all t-tuples of Z_v^t at least once. Covering arrays can be used to test systems in which failures occur as a result of interactions among components or subsystems. They are often used in areas such as hardware Trojan detection, software testing, and network design. Because system testing is expensive, it is critical to reduce the amount of testing required. This paper addresses the Optimal Shortening of Covering ARrays (OSCAR) problem, an optimization problem whose objective is to construct, from an existing covering array matrix of uniform level, an array with dimensions of (N − δ) × (k − Δ) such that the number of missing t-tuples is minimized. Two applications of the OSCAR problem are (a) to produce smaller covering arrays from larger ones and (b) to obtain quasi-covering arrays (covering arrays in which the number of missing t-tuples is small) to be used as input to a meta-heuristic algorithm that produces covering arrays. In addition, it is proven that the OSCAR problem is NP-complete, and twelve different algorithms are proposed to solve it. An experiment was performed on 62 problem instances, and the results demonstrate the effectiveness of solving the OSCAR problem to facilitate the construction of new covering arrays. PMID:29267343
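The missing-tuple count that the OSCAR objective minimizes can be computed by brute force for small arrays. A sketch of the definition only, not the paper's algorithms:

```python
from itertools import combinations

def missing_tuples(array, t, v):
    """Count the t-tuples over {0,...,v-1} not covered by any row of the
    N x k array, summed over every choice of t columns (the quantity
    the OSCAR problem minimizes)."""
    k = len(array[0])
    missing = 0
    for cols in combinations(range(k), t):
        covered = {tuple(row[c] for c in cols) for row in array}
        missing += v ** t - len(covered)
    return missing

# CA(4; 2, 3, 2): a classic binary covering array of strength 2.
ca = [
    (0, 0, 0),
    (0, 1, 1),
    (1, 0, 1),
    (1, 1, 0),
]
print(missing_tuples(ca, t=2, v=2))       # → 0: every column pair covers all 4 tuples

# Deleting one row (δ = 1) leaves some pairs uncovered: a quasi-covering array.
print(missing_tuples(ca[:3], t=2, v=2))   # → 3
```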
A knowledge base architecture for distributed knowledge agents
NASA Technical Reports Server (NTRS)
Riedesel, Joel; Walls, Bryan
1990-01-01
A tuple space based object oriented model for knowledge base representation and interpretation is presented. An architecture for managing distributed knowledge agents is then implemented within the model. The general model is based upon a database implementation of a tuple space. Objects are then defined as an additional layer upon the database. The tuple space may or may not be distributed depending upon the database implementation. A language for representing knowledge and inference strategy is defined whose implementation takes advantage of the tuple space. The general model may then be instantiated in many different forms, each of which may be a distinct knowledge agent. Knowledge agents may communicate using tuple space mechanisms as in the LINDA model as well as using more well known message passing mechanisms. An implementation of the model is presented describing strategies used to keep inference tractable without giving up expressivity. An example applied to a power management and distribution network for Space Station Freedom is given.
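A Linda-style tuple space of the kind this architecture builds on can be sketched in a few lines. This is a generic illustration, not the paper's implementation; the power-system tuples are made-up examples echoing the Space Station application:

```python
import threading

class TupleSpace:
    """Minimal Linda-style tuple space: out() deposits a tuple, in_()
    withdraws the first tuple matching a template, rd() reads without
    removing. None in a template matches any field."""
    def __init__(self):
        self._tuples = []
        self._lock = threading.Lock()

    def out(self, tup):
        with self._lock:
            self._tuples.append(tup)

    def _match(self, template, tup):
        return len(template) == len(tup) and all(
            t is None or t == f for t, f in zip(template, tup))

    def rd(self, template):
        with self._lock:
            return next((x for x in self._tuples
                         if self._match(template, x)), None)

    def in_(self, template):
        with self._lock:
            for i, x in enumerate(self._tuples):
                if self._match(template, x):
                    return self._tuples.pop(i)
        return None

ts = TupleSpace()
ts.out(("fact", "power-bus", "nominal"))
ts.out(("fact", "battery", "degraded"))
print(ts.rd(("fact", "battery", None)))   # → ('fact', 'battery', 'degraded')
print(ts.in_(("fact", None, "nominal")))  # withdraws the matching tuple
```

Because agents interact only through such templates, the space itself can be centralized or distributed without changing agent code, which is the decoupling the paper exploits.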
Deviations in the Zipf and Heaps laws in natural languages
NASA Astrophysics Data System (ADS)
Bochkarev, Vladimir V.; Lerner, Eduard Yu; Shevlyakova, Anna V.
2014-03-01
This paper is devoted to verifying the empirical Zipf and Heaps laws in natural languages using Google Books Ngram corpus data. The connection between the Zipf and Heaps laws, which predicts a power dependence of the vocabulary size on the text size, is discussed. In fact, the Heaps exponent in this dependence varies as the text corpus grows. To explain this, the obtained results are compared with a probabilistic model of text generation. Quasi-periodic variations with characteristic periods of 60-100 years were also found.
The mathematical relationship between Zipf’s law and the hierarchical scaling law
NASA Astrophysics Data System (ADS)
Chen, Yanguang
2012-06-01
The empirical studies of city-size distribution show that Zipf's law and the hierarchical scaling law are linked in many ways. The rank-size scaling and hierarchical scaling seem to be two different sides of the same coin, but their relationship has never been revealed by strict mathematical proof. In this paper, the Zipf's distribution of cities is abstracted as a q-sequence. Based on this sequence, a self-similar hierarchy consisting of many levels is defined and the numbers of cities in different levels form a geometric sequence. An exponential distribution of the average size of cities is derived from the hierarchy. Thus we have two exponential functions, from which follows a hierarchical scaling equation. The results can be statistically verified by simple mathematical experiments and observational data of cities. A theoretical foundation is then laid for the conversion from Zipf's law to the hierarchical scaling law, and the latter can show more information about city development than the former. Moreover, the self-similar hierarchy provides a new perspective for studying networks of cities as complex systems. A series of mathematical rules applied to cities such as the allometric growth law, the 2n principle and Pareto's law can be associated with one another by the hierarchical organization.
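The conversion described above, from an exact Zipf rank-size sequence to a self-similar hierarchy with geometrically growing level sizes, can be illustrated numerically (the dyadic level sizes and parameter values are illustrative choices, not the paper's):

```python
def zipf_hierarchy(q=1.0, levels=8, p1=10_000.0):
    """Group an exact Zipf rank-size sequence P_k = p1 * k**(-q) into a
    self-similar hierarchy whose level m holds N_m = 2**(m-1) cities,
    returning (N_m, mean size S_m) per level."""
    out = []
    k = 1
    for m in range(1, levels + 1):
        n_m = 2 ** (m - 1)
        sizes = [p1 * (k + j) ** (-q) for j in range(n_m)]
        out.append((n_m, sum(sizes) / n_m))
        k += n_m
    return out

# The inter-level size ratio S_m / S_(m+1) approaches 2**q, so the
# hierarchical scaling exponent ln(r_n)/ln(r_s) approaches 1/q.
h = zipf_hierarchy(q=1.0)
ratios = [s1 / s2 for (_, s1), (_, s2) in zip(h, h[1:])]
print([round(r, 3) for r in ratios])
```

For q = 1 the ratios converge to 2 from above, giving city-number ratio r_n = 2 and size ratio r_s → 2, consistent with the exponential-hierarchy correspondence the paper derives.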
A probabilistic NF2 relational algebra for integrated information retrieval and database systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fuhr, N.; Roelleke, T.
The integration of information retrieval (IR) and database systems requires a data model which allows for modelling documents as entities, representing uncertainty and vagueness and performing uncertain inference. For this purpose, we present a probabilistic data model based on relations in non-first-normal-form (NF2). Here, tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. Thus, the set of weighted index terms of a document are represented as a probabilistic subrelation. In a similar way, imprecise attribute values are modelled as a set-valued attribute. We redefine the relational operators for this type of relations such that the result of each operator is again a probabilistic NF2 relation, where the weight of a tuple gives the probability that this tuple belongs to the result. By ordering the tuples according to decreasing probabilities, the model yields a ranking of answers like in most IR models. This effect also can be used for typical database queries involving imprecise attribute values as well as for combinations of database and IR queries.
The span of correlations in dolphin whistle sequences
NASA Astrophysics Data System (ADS)
Ferrer-i-Cancho, Ramon; McCowan, Brenda
2012-06-01
Long-range correlations are found in symbolic sequences from human language, music and DNA. Determining the span of correlations in dolphin whistle sequences is crucial for shedding light on their communicative complexity. Dolphin whistles share various statistical properties with human words, i.e. Zipf's law for word frequencies (namely, that the probability of the i-th most frequent word of a text is about i^{-α}) and a parallel of the tendency of more frequent words to have more meanings. The finding of Zipf's law for word frequencies in dolphin whistles has been the topic of an intense debate on its implications. One of the major arguments against the relevance of Zipf's law in dolphin whistles is that it is not possible to distinguish the outcome of a die-rolling experiment from that of a linguistic or communicative source producing Zipf's law for word frequencies. Here we show that statistically significant whistle-whistle correlations extend back to the second previous whistle in the sequence, using a global randomization test, and to the fourth previous whistle, using a local randomization test. None of these correlations are expected by a die-rolling experiment and other simple explanations of Zipf's law for word frequencies, such as Simon's model, that produce sequences of unpredictable elements.
Zipf rank approach and cross-country convergence of incomes
NASA Astrophysics Data System (ADS)
Shao, Jia; Ivanov, Plamen Ch.; Urošević, Branko; Stanley, H. Eugene; Podobnik, Boris
2011-05-01
We employ a concept popular in physics, the Zipf rank approach, in order to estimate the number of years that EU members would need in order to achieve "convergence" of their per capita incomes. Assuming that trends in the past twenty years continue to hold in the future, we find that after t ≈ 30 years both developing and developed EU countries indexed by i will have comparable values of their per capita gross domestic product G_{i,t}. Besides the traditional Zipf rank approach we also propose a weighted Zipf rank method. In contrast to the EU block, on the world level the Zipf rank approach shows that, between 1960 and 2009, cross-country income differences increased over time. For a brief period during the 2007-2008 global economic crisis, at the world level the G_{i,t} of richer countries declined more rapidly than the G_{i,t} of poorer countries, in contrast to the EU, where the G_{i,t} of developing EU countries declined faster than the G_{i,t} of developed EU countries, indicating that the recession interrupted the convergence between EU members. We propose a simple model of GDP evolution that accounts for the scaling we observe in the data.
NASA Technical Reports Server (NTRS)
Haralick, R. H. (Principal Investigator); Bosley, R. J.
1974-01-01
The author has identified the following significant results. A procedure was developed to extract cross-band textural features from ERTS MSS imagery. Evolving from a single image texture extraction procedure which uses spatial dependence matrices to measure relative co-occurrence of nearest neighbor grey tones, the cross-band texture procedure uses the distribution of neighboring grey tone N-tuple differences to measure the spatial interrelationships, or co-occurrences, of the grey tone N-tuples present in a texture pattern. In both procedures, texture is characterized in such a way as to be invariant under linear grey tone transformations. However, the cross-band procedure complements the single image procedure by extracting texture information and spectral information contained in ERTS multi-images. Classification experiments show that when used alone, without spectral processing, the cross-band texture procedure extracts more information than the single image texture analysis. Results show an improvement in average correct classification from 86.2% to 88.8% for ERTS image no. 1021-16333 with the cross-band texture procedure. However, when used together with spectral features, the single image texture plus spectral features perform better than the cross-band texture plus spectral features, with an average correct classification of 93.8% and 91.6%, respectively.
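The spatial grey-tone dependence matrix underlying the single-image procedure is simple to write down. A generic sketch of the co-occurrence counting step, not the authors' cross-band N-tuple extension:

```python
def glcm(image, dr=0, dc=1, levels=4):
    """Grey-tone spatial-dependence (co-occurrence) matrix: count how
    often grey level i has grey level j at offset (dr, dc), the basic
    object behind Haralick texture features."""
    m = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                m[image[r][c]][image[r2][c2]] += 1
    return m

# Tiny 4-level test image; offset (0, 1) counts horizontal neighbours.
img = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
for row in glcm(img):
    print(row)
```

Texture features (contrast, entropy, and so on) are then computed from the normalized matrix; the cross-band procedure instead histograms differences of grey-tone N-tuples taken across spectral bands.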
Detection of core-periphery structure in networks based on 3-tuple motifs
NASA Astrophysics Data System (ADS)
Ma, Chuang; Xiang, Bing-Bing; Chen, Han-Shuang; Small, Michael; Zhang, Hai-Feng
2018-05-01
Detecting mesoscale structure, such as community structure, is of vital importance for analyzing complex networks. Recently, a new mesoscale structure, core-periphery (CP) structure, has been identified in many real-world systems. In this paper, we propose an effective algorithm for detecting CP structure based on a 3-tuple motif. In this algorithm, we first define a 3-tuple motif in terms of the patterns of edges as well as the properties of nodes, and then a motif adjacency matrix is constructed based on the 3-tuple motif. Finally, the problem is converted to finding a cluster that minimizes the motif conductance. Our algorithm works well for different CP structures, including single and multiple CP structures, and local and global CP structures. Results on synthetic and empirical networks validate the high performance of our method.
Zipf's law in city size from a resource utilization model.
Ghosh, Asim; Chatterjee, Arnab; Chakrabarti, Anindya S; Chakrabarti, Bikas K
2014-10-01
We study a resource utilization scenario characterized by intrinsic fitness. To describe the growth and organization of different cities, we consider a model for resource utilization where many restaurants compete, as in a game, to attract customers using an iterative learning process. Results for the case of restaurants with uniform fitness are reported. When fitness is uniformly distributed, it gives rise to a Zipf law for the number of customers. We perform an exact calculation for the utilization fraction for the case when choices are made independent of fitness. A variant of the model is also introduced where the fitness can be treated as an ability to stay in the business. When a restaurant loses customers, its fitness is replaced by a random fitness. The steady state fitness distribution is characterized by a power law, while the distribution of the number of customers still follows the Zipf law, implying the robustness of the model. Our model serves as a paradigm for the emergence of Zipf law in city size distribution.
Phylogenetic Invariants for Metazoan Mitochondrial Genome Evolution.
Sankoff; Blanchette
1998-01-01
The method of phylogenetic invariants was developed to apply to aligned sequence data generated, according to a stochastic substitution model, for N species related through an unknown phylogenetic tree. The invariants are functions of the probabilities of the observable N-tuples, which are identically zero, over all choices of branch length, for some trees. Evaluating the invariants associated with all possible trees, using observed N-tuple frequencies over all sequence positions, enables us to rapidly infer the generating tree. An aspect of evolution at the genomic level much studied recently is the rearrangement of gene order along the chromosome from one species to another. Instead of the substitutions responsible for sequence evolution, we examine the non-local processes responsible for genome rearrangements, such as inversion of arbitrarily long segments of chromosomes. By treating the potential adjacency of each possible pair of genes as a "position", an appropriate "substitution" model can be recognized as governing the rearrangement process, and a probabilistically principled phylogenetic inference can be set up. We calculate the invariants for this process for N=5, and apply them to mitochondrial genome data from coelomate metazoans, showing how they resolve key aspects of branching order.
Typed Multiset Rewriting Specifications of Security Protocols
2011-10-01
to define the type of a tuple as the sequence of the types of its components. Therefore, if A is a principal name and kA is a public key for A, the...tuple (A, kA ) would have type “principal × pubK A” (the Cartesian product symbol “×” is the standard constructor for tuple types). This construction...allows us to associate a generic principal with A’s public key: if B is another principal, then (B, kA ) will have this type as well. We will often need
Term Dependence: A Basis for Luhn and Zipf Models.
ERIC Educational Resources Information Center
Losee, Robert M.
2001-01-01
Discusses relationships between the frequency-based characteristics of neighboring terms in natural language and the rank or frequency of the terms. Topics include information theory measures, including expected mutual information measure (EMIM); entropy and rank; Luhn's model of term aboutness; Zipf's law; and implications for indexing and…
Kanwal, Jasmeen; Smith, Kenny; Culbertson, Jennifer; Kirby, Simon
2017-08-01
The linguist George Kingsley Zipf made a now classic observation about the relationship between a word's length and its frequency: the more frequent a word is, the shorter it tends to be. He claimed that this "Law of Abbreviation" is a universal structural property of language. The Law of Abbreviation has since been documented in a wide range of human languages, and extended to animal communication systems and even computer programming languages. Zipf hypothesised that this universal design feature arises as a result of individuals optimising form-meaning mappings under competing pressures to communicate accurately but also efficiently (his famous Principle of Least Effort). In this study, we use a miniature artificial language learning paradigm to provide direct experimental evidence for this explanatory hypothesis. We show that language users optimise form-meaning mappings only when pressures for accuracy and efficiency both operate during a communicative task, supporting Zipf's conjecture that the Principle of Least Effort can explain this universal feature of word length distributions. Copyright © 2017 Elsevier B.V. All rights reserved.
The language of gene ontology: a Zipf's law analysis.
Kalankesh, Leila Ranandeh; Stevens, Robert; Brass, Andy
2012-06-07
Most major genome projects and sequence databases provide a GO annotation of their data, either automatically or through human annotators, creating a large corpus of data written in the language of GO. Texts written in natural language show a statistical power law behaviour, Zipf's law, the exponent of which can provide useful information on the nature of the language being used. We have therefore explored the hypothesis that collections of GO annotations will show similar statistical behaviours to natural language. Annotations from the Gene Ontology Annotation project were found to follow Zipf's law. Surprisingly, the measured power law exponents were consistently different between annotations captured using the three GO sub-ontologies in the corpora (function, process and component). On filtering the corpora using GO evidence codes we found that the value of the measured power law exponent responded in a predictable way as a function of the evidence codes used to support the annotation. Techniques from computational linguistics can provide new insights into the annotation process. GO annotations show similar statistical behaviours to those seen in natural language, with measured exponents that provide a signal which correlates with the nature of the evidence codes used to support the annotations, suggesting that the measured exponent might provide a signal regarding the information content of the annotation.
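The power-law exponent measurements mentioned above amount to fitting a straight line to a log-log rank-frequency plot. A minimal sketch of that step (the GOA corpora and evidence-code filtering are beyond this illustration; the synthetic frequencies are an assumption):

```python
import numpy as np

def zipf_exponent(frequencies):
    """Estimate the Zipf exponent as the least-squares slope of
    log(frequency) versus log(rank), with ranks starting at 1."""
    freqs = np.sort(np.asarray(frequencies, dtype=float))[::-1]
    ranks = np.arange(1, len(freqs) + 1)
    slope, _ = np.polyfit(np.log(ranks), np.log(freqs), 1)
    return slope

# Synthetic term frequencies that follow an exact r^-1 law.
freqs = [1000.0 / r for r in range(1, 101)]
alpha = zipf_exponent(freqs)   # recovers a slope close to -1
```

On real annotation corpora one would restrict the fit to the mid-rank region, since the head and the hapax tail typically deviate from the power law.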
Aono, Masashi; Kim, Song-Ju; Hara, Masahiko; Munakata, Toshinori
2014-03-01
The true slime mold Physarum polycephalum, a single-celled amoeboid organism, is capable of efficiently allocating a constant amount of intracellular resource to its pseudopod-like branches that best fit the environment where dynamic light stimuli are applied. Inspired by the resource allocation process, the authors formulated a concurrent search algorithm, called the Tug-of-War (TOW) model, for maximizing the profit in the multi-armed Bandit Problem (BP). A player (gambler) of the BP should decide as quickly and accurately as possible which slot machine to invest in out of the N machines and faces an "exploration-exploitation dilemma." The dilemma is a trade-off between the speed and the accuracy of the decision making, which are conflicting objectives. The TOW model maintains a constant intracellular resource volume while collecting environmental information by concurrently expanding and shrinking its branches. The conservation law entails a nonlocal correlation among the branches, i.e., a volume increment in one branch is immediately compensated by volume decrement(s) in the other branch(es). Owing to this nonlocal correlation, the TOW model can efficiently manage the dilemma. In this study, we extend the TOW model to apply it to a stretched variant of the BP, the Extended Bandit Problem (EBP), which is the problem of selecting the best M-tuple of the N machines. We demonstrate that the extended TOW model exhibits better performance for 2-tuple-3-machine and 2-tuple-4-machine instances of the EBP compared with the extended versions of well-known algorithms for the BP, the ϵ-Greedy and SoftMax algorithms, particularly in terms of its short-term decision-making capability, which is essential for the survival of the amoeba in a hostile environment. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Coding Instead of Splitting - Algebraic Combinations in Time and Space
2016-06-09
sources message. For certain classes of two-unicast-Z networks, we show that the rate-tuple (N, 1) is achievable as long as the individual source...destination cuts for the two source-destination pairs are respectively at least as large as N and 1, and the generalized network sharing cut - a bound...previously defined by Kamath et al. - is at least as large as N + 1. We show this through a novel achievable scheme which is based on random linear coding at
A self-similar hierarchy of the Korean stock market
NASA Astrophysics Data System (ADS)
Lim, Gyuchang; Min, Seungsik; Yoo, Kun-Woo
2013-01-01
A scaling analysis is performed on market values of stocks listed on Korean stock exchanges such as the KOSPI and the KOSDAQ. Unlike previous studies, which focused on price fluctuations, this work deals with market capitalizations. First, we show that the sum of the two stock exchanges shows a clear rank-size distribution, i.e., Zipf's law, just as each separate one does. Second, by abstracting Zipf's law as a γ-sequence, we define a self-similar hierarchy consisting of many levels, with the numbers of firms at each level forming a geometric sequence. We also use two exponential functions to describe the hierarchy and derive a scaling law from them. Lastly, we propose a self-similar hierarchical process and perform an empirical analysis on our data set. Based on our findings, we argue that all money invested in the stock market is distributed in a hierarchical way and that a slight difference exists between the two exchanges.
Systems of conservation laws with third-order Hamiltonian structures
NASA Astrophysics Data System (ADS)
Ferapontov, Evgeny V.; Pavlov, Maxim V.; Vitolo, Raffaele F.
2018-06-01
We investigate n-component systems of conservation laws that possess third-order Hamiltonian structures of differential-geometric type. The classification of such systems is reduced to the projective classification of linear congruences of lines in P^{n+2} satisfying additional geometric constraints. Algebraically, the problem can be reformulated as follows: for a vector space W of dimension n + 2, classify n-tuples of skew-symmetric 2-forms A^α ∈ Λ^2(W) such that φ_{βγ} A^β ∧ A^γ = 0 for some non-degenerate symmetric φ.
A Trust-Based Adaptive Probability Marking and Storage Traceback Scheme for WSNs
Liu, Anfeng; Liu, Xiao; Long, Jun
2016-01-01
Security is a pivotal issue for wireless sensor networks (WSNs), which are emerging as a promising platform that enables a wide range of military, scientific, industrial and commercial applications. Traceback, a key cyber-forensics technology, can play an important role in tracing and locating a malicious source to guarantee cybersecurity. In this work a trust-based adaptive probability marking and storage (TAPMS) traceback scheme is proposed to enhance security for WSNs. In a TAPMS scheme, the marking probability is adaptively adjusted according to the security requirements of the network and can substantially reduce the number of marking tuples and improve network lifetime. More importantly, a high trust node is selected to store marking tuples, which can avoid the problem of marking information being lost. Experimental results show that the total number of marking tuples can be reduced in a TAPMS scheme, thus improving network lifetime. At the same time, since the marking tuples are stored in high trust nodes, storage reliability can be guaranteed, and the traceback time can be reduced by more than 80%. PMID:27043566
Zipf 's law and the effect of ranking on probability distributions
NASA Astrophysics Data System (ADS)
Günther, R.; Levitin, L.; Schapiro, B.; Wagner, P.
1996-02-01
Ranking procedures are widely used in the description of many different types of complex systems. Zipf's law is one of the most remarkable frequency-rank relationships and has been observed independently in physics, linguistics, biology, demography, etc. We show that ranking plays a crucial role in making it possible to detect empirical relationships in systems that exist in one realization only, even when the statistical ensemble to which the systems belong has a very broad probability distribution. Analytical results and numerical simulations are presented which clarify the relations between the probability distributions and the behavior of expected values for unranked and ranked random variables. This analysis is performed, in particular, for the evolutionary model presented in our previous papers which leads to Zipf's law and reveals the underlying mechanism of this phenomenon in terms of a system with interdependent and interacting components as opposed to the “ideal gas” models suggested by previous researchers. The ranking procedure applied to this model leads to a new, unexpected phenomenon: a characteristic “staircase” behavior of the mean values of the ranked variables (ranked occupation numbers). This result is due to the broadness of the probability distributions for the occupation numbers and does not follow from the “ideal gas” model. Thus, it provides an opportunity, by comparison with empirical data, to obtain evidence as to which model relates to reality.
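The contrast between unranked and ranked random variables can be demonstrated numerically: for a broad distribution, the value at a fixed (unranked) index fluctuates like a single draw, while an order statistic such as the sample median concentrates across realizations. A hedged sketch with synthetic Pareto data (not the authors' evolutionary model):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_realizations = 201, 300

# Broad (Pareto-tailed) samples: each row is one realization of the system.
data = rng.pareto(2.5, size=(n_realizations, n_samples)) + 1.0

# Unranked: the variable at a fixed index, one value per realization.
unranked = data[:, 0]
# Ranked: sort each realization, then take the middle rank (the median).
ranked_median = np.sort(data, axis=1)[:, n_samples // 2]

# The ranked variable fluctuates far less across realizations,
# which is what makes rank-based empirical laws detectable.
spread_unranked = unranked.std()
spread_ranked = ranked_median.std()
```

This is the generic mechanism the abstract refers to: ranking turns a single broad realization into a set of well-concentrated quantities.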
Nonlinear, nonbinary cyclic group codes
NASA Technical Reports Server (NTRS)
Solomon, G.
1992-01-01
New cyclic group codes of length 2^m - 1 over (m - j)-bit symbols are introduced. These codes can be systematically encoded and decoded algebraically. The code rates are very close to Reed-Solomon (RS) codes and are much better than Bose-Chaudhuri-Hocquenghem (BCH) codes (a former alternative). The binary (m - j)-tuples are identified with a subgroup of the binary m-tuples which represents the field GF(2^m). Encoding is systematic and involves a two-stage procedure consisting of the usual linear feedback register (using the division or check polynomial) and a small table lookup. For low rates, a second shift-register encoding operation may be invoked. Decoding uses the RS error-correcting procedures for the m-tuple codes for m = 4, 5, and 6.
Learning to Understand Natural Language with Less Human Effort
2015-05-01
j); if one of these has the correct logical form, ℓ_j = ℓ_i, then t_j is taken as the approximate maximizer. 2.3 Discussion This chapter...where j indexes entity tuples (e1, e2). Training optimizes the semantic parser parameters θ to predict Y = y_j, Z = z_j given S = s_j. The parameters θ...[garbled residue of a CCG derivation figure omitted]
NASA Astrophysics Data System (ADS)
Zhang, Wancheng; Xu, Yejun; Wang, Huimin
2016-01-01
The aim of this paper is to put forward a consensus reaching method for multi-attribute group decision-making (MAGDM) problems with linguistic information, in which the weight information of experts and attributes is unknown. First, some basic concepts and operational laws of the 2-tuple linguistic label are introduced. Then, a grey relational analysis method and a maximising deviation method are proposed to calculate the incomplete weight information of experts and attributes, respectively. To eliminate conflict in the group, a weight-updating model is employed to derive the weights of experts based on their contribution to the consensus reaching process. After conflict elimination, the final group preference can be obtained, which gives the ranking of the alternatives. The model can effectively avoid the information distortion which regularly occurs in linguistic information processing. Finally, an illustrative example is given to demonstrate the application of the proposed method, and comparative analyses with existing methods are offered to show its advantages.
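The 2-tuple linguistic representation referred to above encodes an aggregation result β ∈ [0, g] as a pair (s_i, α), where s_i is the closest linguistic label and α ∈ [-0.5, 0.5) is the symbolic translation. A minimal sketch of the standard Δ and Δ⁻¹ operators (the label names are illustrative, not from the paper):

```python
# Sketch of the 2-tuple linguistic model (Herrera-Martinez style).
LABELS = ["none", "very_low", "low", "medium", "high", "very_high", "perfect"]

def delta(beta):
    """Convert beta in [0, len(LABELS)-1] to a 2-tuple (label index, alpha).
    Note: Python's banker's rounding is a simplification at exact .5 values."""
    i = int(round(beta))
    return i, beta - i          # alpha = symbolic translation

def delta_inv(i, alpha):
    """Convert a 2-tuple back to its numerical value."""
    return i + alpha

# Aggregate three expert opinions by averaging their numerical values,
# then translate the result back into a 2-tuple without information loss.
opinions = [(4, 0.0), (5, -0.2), (3, 0.3)]
mean_beta = sum(delta_inv(i, a) for i, a in opinions) / len(opinions)
label_index, alpha = delta(mean_beta)   # e.g. ("high", small positive alpha)
```

The point of the (s_i, α) pair is exactly the "no information distortion" property the abstract mentions: Δ⁻¹(Δ(β)) = β, so aggregation results are never forced onto the discrete label set.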
Greedy algorithms and Zipf laws
NASA Astrophysics Data System (ADS)
Moran, José; Bouchaud, Jean-Philippe
2018-04-01
We consider a simple model of firm/city/etc. growth based on a multi-item criterion: whenever entity B fares better than entity A on a subset of M items out of K, the agent originally in A moves to B. We solve the model analytically in the cases K = 1 and K → ∞. The resulting stationary distribution of sizes is generically a Zipf law provided M > K/2. When M < K/2, no selection occurs and the size distribution remains thin-tailed. In the special case M = K, one needs to regularize the problem by introducing a small 'default' probability ϕ. We find that the stationary distribution has a power-law tail that becomes a Zipf law when ϕ → 0. The approach to the stationary state can also be characterized, with strong similarities to a simple 'aging' model considered by Barrat and Mézard.
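The migration rule described above is simple to simulate. A Monte Carlo sketch in which each entity carries K fixed item scores drawn i.i.d. uniform (an illustrative assumption, not the authors' analytical setup):

```python
import random

random.seed(1)
K, M = 5, 4                  # compare K items; move when B wins on at least M
n_entities, n_agents, steps = 50, 500, 20000

# Each entity has a fixed score on each of the K items (illustrative choice).
scores = [[random.random() for _ in range(K)] for _ in range(n_entities)]
sizes = [n_agents // n_entities] * n_entities   # agents currently in each entity

for _ in range(steps):
    a, b = random.sample(range(n_entities), 2)
    if sizes[a] == 0:
        continue
    # Entity B "fares better" than A on each item where its score is higher.
    wins_for_b = sum(sb > sa for sa, sb in zip(scores[a], scores[b]))
    if wins_for_b >= M:      # M of K criterion: one agent migrates A -> B
        sizes[a] -= 1
        sizes[b] += 1

total = sum(sizes)           # agents are conserved by construction
largest = max(sizes)         # for M > K/2, mass condenses onto few entities
```

With M > K/2 the dynamics concentrate agents onto a few dominant entities, which is the qualitative origin of the fat-tailed size distribution; the Zipf exponent itself requires the full analytical treatment.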
Two Universality Properties Associated with the Monkey Model of Zipf's Law
NASA Astrophysics Data System (ADS)
Perline, Richard; Perline, Ron
2016-03-01
The distribution of word probabilities in the monkey model of Zipf's law is associated with two universality properties: (1) the power law exponent converges strongly to -1 as the alphabet size increases and the letter probabilities are specified as the spacings from a random division of the unit interval for any distribution with a bounded density function on [0, 1]; and (2) on a logarithmic scale, the version of the model with a finite word length cutoff and unequal letter probabilities is approximately normally distributed in the part of the distribution away from the tails. The first property is proved using a remarkably general limit theorem for the logarithm of sample spacings due to Shao and Hahn, and the second property follows from Anscombe's central limit theorem for a random number of i.i.d. random variables. The finite word length model leads to a hybrid Zipf-lognormal mixture distribution closely related to work in other areas.
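The monkey model itself is easy to enumerate: letters are typed with fixed probabilities and a word ends when the space key (probability q) is hit, so a word's probability is the product of its letter probabilities times q. A small sketch with a finite word-length cutoff (the alphabet and q are illustrative choices):

```python
from itertools import product

q = 0.3                          # probability of hitting the space bar
letters = {"a": 0.4, "b": 0.3}   # letter probabilities, summing to 1 - q
max_len = 6                      # finite word-length cutoff

# Probability of a word = product of its letter probabilities, times q.
word_prob = {}
for length in range(1, max_len + 1):
    for word in product(letters, repeat=length):
        p = q
        for ch in word:
            p *= letters[ch]
        word_prob["".join(word)] = p

# Ranked probabilities, ready for a rank-frequency (Zipf) analysis.
ranked = sorted(word_prob.values(), reverse=True)

# Total mass of words of length 1..max_len is (1-q) * (1 - (1-q)^max_len),
# since words of length L carry total probability q * (1-q)^L.
total_mass = sum(ranked)
```

Plotting log(ranked) against log(rank) for larger alphabets exhibits the near -1 slope of property (1); the unequal letter probabilities here are what produce the lognormal-like spread of property (2).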
Scaling Property of Period-n-Tupling Sequences in One-Dimensional Mappings
NASA Astrophysics Data System (ADS)
Zeng, Wan-Zhen; Hao, Bai-Lin; Wang, Guang-Rui; Chen, Shi-Gang
1984-05-01
We calculated the universal scaling function g(x) and the scaling factor α as well as the convergence rate δ for period-tripling, -quadrupling and -quintupling sequences of the RL, RL^2, RLR^2, RL^2R and RL^3 types. The superstable periods are closely connected to a set of polynomials P_n defined recursively by the original mapping. Some notable properties of these polynomials are studied. Several approaches to solving the renormalization group equation and estimating the scaling factors are suggested.
A Study of Memory Effects in a Chess Database.
Schaigorodsky, Ana L; Perotti, Juan I; Billoni, Orlando V
2016-01-01
A series of recent works studying a database of chronologically sorted chess games (containing 1.4 million games played by humans between 1998 and 2007) have shown that the popularity distribution of chess game-lines follows a Zipf's law, and that time series inferred from the sequences of those game-lines exhibit long-range memory effects. The presence of Zipf's law together with long-range memory effects has been observed in several systems; however, the simultaneous emergence of these two phenomena was always studied separately up to now. In this work, by making use of a variant of the Yule-Simon preferential growth model, introduced by Cattuto et al., we provide an explanation for the simultaneous emergence of Zipf's law and long-range memory effects in a chess database. We find that Cattuto's Model (CM) is able to reproduce both Zipf's law and the long-range correlations, including the size-dependent scaling of the Hurst exponent for the corresponding time series. CM allows an explanation for the simultaneous emergence of these two phenomena via a preferential growth dynamics, including a memory kernel, in the popularity distribution of chess game-lines. This mechanism results in an aging process in the chess game-line choice as the database grows. Moreover, we find burstiness in the activity of subsets of the most active players, although the aggregated activity of the pool of players displays inter-event times without burstiness. We show that CM is not able to produce time series with bursty behavior, providing evidence that burstiness is not required for the explanation of the long-range correlation effects in the chess database. Our results provide further evidence favoring the hypothesis that long-range correlation effects are a consequence of the aging of game-lines and not of burstiness, and shed light on the mechanism that operates in the simultaneous emergence of Zipf's law and long-range correlations in a community of chess players.
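The Yule-Simon preferential growth mechanism invoked above is compact to simulate: with probability p a brand-new game-line enters, otherwise an earlier occurrence is copied uniformly from the history, so already-popular lines grow preferentially. A bare sketch without the memory kernel of Cattuto et al.:

```python
import random
from collections import Counter

random.seed(42)
p, n_steps = 0.1, 5000           # innovation probability, sequence length

history = []                     # chronological sequence of game-line ids
next_id = 0
for _ in range(n_steps):
    if not history or random.random() < p:
        history.append(next_id)  # introduce a brand-new game-line
        next_id += 1
    else:
        # Copy a uniformly chosen past occurrence: lines that already
        # occur k times are k times as likely to be copied (rich get richer).
        history.append(random.choice(history))

popularity = Counter(history)    # heavy-tailed (Zipf-like) for large n_steps
```

Cattuto's variant replaces the uniform choice over the whole history with a kernel that favours recent occurrences, which is what additionally produces the aging and long-range correlations discussed in the abstract.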
Stable power laws in variable economies; Lotka-Volterra implies Pareto-Zipf
NASA Astrophysics Data System (ADS)
Solomon, S.; Richmond, P.
2002-05-01
In recent years we have found that logistic systems of the Generalized Lotka-Volterra type (GLV), describing statistical systems of auto-catalytic elements, possess power law distributions of the Pareto-Zipf type. In particular, when applied to economic systems, GLV leads to power laws in the relative individual wealth distribution and in market returns. These power laws and their exponent α are invariant to arbitrary variations in the total wealth of the system and to other endogenously and exogenously induced variations.
Efficient Synthesis of Network Updates
2015-06-17
model include switches S_i, links L_j, and a single controller element C, and a network N is a tuple containing these. Each switch S_i is encoded as a...and the ports they should be forwarded to respectively. Each link L_j is represented by a record consisting of two locations loc and loc0 and a list...the union of multisets m1 and m2. We write [x] for a singleton list, and l1@l2 for the concatenation of l1 and l2. Each transition N -o-> N' is annotated
On spectra of Lüders operations
NASA Astrophysics Data System (ADS)
Nagy, Gabriel
2008-02-01
We show that all the eigenvalues of certain generalized Lüders operations are non-negative real numbers in two cases of interest. In particular, given a commuting n-tuple A = (A_1, …, A_n) consisting of positive operators on a Hilbert space H, satisfying ∑_{j=1}^{n} A_j = I, we show that the spectrum of the Lüders operation Λ_A : B(H) ∋ X ↦ ∑_{j=1}^{n} A_j^{1/2} X A_j^{1/2} ∈ B(H) is contained in [0, ∞), so the only solution of the equation Λ_A(X) = I - X is the "expected" one: X = (1/2)I.
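For a commuting (here jointly diagonal) pair the claim is easy to check numerically by building the matrix of Λ_A acting on vec(X). A hedged numpy sketch of that finite-dimensional verification, not the paper's general proof:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# A commuting pair of positive operators with A1 + A2 = I (diagonal case).
a = rng.uniform(0.1, 0.9, size=n)
A = [np.diag(a), np.diag(1.0 - a)]

# Matrix of Lambda_A(X) = sum_j A_j^{1/2} X A_j^{1/2} acting on vec(X);
# for symmetric S, vec(S X S) = kron(S, S) vec(X) in row-major convention.
sqrt_A = [np.sqrt(Aj) for Aj in A]   # diagonal, so entrywise sqrt suffices
L = sum(np.kron(S, S) for S in sqrt_A)

eigvals = np.linalg.eigvals(L)       # spectrum of Lambda_A: non-negative

# The equation Lambda_A(X) = I - X, i.e. (L + Id) vec(X) = vec(I), is
# uniquely solvable since spec(L) >= 0; the solution is the expected X = I/2.
vec_X = np.linalg.solve(L + np.eye(n * n), np.eye(n).reshape(-1))
X = vec_X.reshape(n, n)
```

Since the spectrum of Λ_A sits in [0, ∞), the operator Λ_A + Id is invertible, which is exactly why X = (1/2)I is the only solution.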
Beyond description. Comment on "Approaching human language with complex networks" by Cong and Liu
NASA Astrophysics Data System (ADS)
Ferrer-i-Cancho, R.
2014-12-01
In their historical overview, Cong & Liu highlight Saussure as the father of modern linguistics [1]. They apparently miss G.K. Zipf as a pioneer of the view of language as a complex system. His idea of a balance between unification and diversification forces in the organization of natural systems, e.g., vocabularies [2], can be seen as a precursor of the view of complexity as a balance between order (unification) and disorder (diversification) near the edge of chaos [3]. Although not mentioned by Cong & Liu elsewhere, trade-offs between hearer and speaker needs are very important in Zipf's view, which has inspired research on the optimal networks mapping words into meanings [4-6]. Quantitative linguists regard G.K. Zipf as the founder of modern quantitative linguistics [7], a discipline where statistics plays a central role, as in network science. Interestingly, that centrality of statistics is missing in Saussure's work and that of many of his successors.
Rank-frequency relation for Chinese characters
NASA Astrophysics Data System (ADS)
Deng, Weibing; Allahverdyan, Armen E.; Li, Bo; Wang, Qiuping A.
2014-02-01
We show that Zipf's law for Chinese characters perfectly holds for sufficiently short texts (a few thousand different characters). The scenario of its validity is similar to that of Zipf's law for words in short English texts. For long Chinese texts (or for mixtures of short Chinese texts), rank-frequency relations for Chinese characters display a two-layer, hierarchic structure that combines a Zipfian power-law regime for frequent characters (first layer) with an exponential-like regime for less frequent characters (second layer). For these two layers we provide different (though related) theoretical descriptions that include the range of low-frequency characters (hapax legomena). We suggest that this hierarchic structure of the rank-frequency relation connects to semantic features of Chinese characters (number of different meanings and homographies). The comparative analysis of rank-frequency relations for Chinese characters versus English words illustrates the extent to which the characters play for Chinese writers the same role as the words for those writing within alphabetical systems.
NASA Astrophysics Data System (ADS)
Liu, Bingsheng; Fu, Meiqing; Zhang, Shuibo; Xue, Bin; Zhou, Qi; Zhang, Shiruo
2018-01-01
The Choquet integral (CI) operator is an effective approach for handling interdependence among decision attributes in complex decision-making problems. However, the fuzzy measures of attributes and attribute sets required by the CI are difficult to obtain directly, which limits the application of the CI. This paper proposes a new method for determining fuzzy measures of attributes by extending Marichal's concept of entropy for a fuzzy measure. To well represent the assessment information, an interval-valued 2-tuple linguistic context is utilised to represent information. Then, we propose a Choquet integral operator in an interval-valued 2-tuple linguistic environment, which can effectively handle the correlation between attributes. In addition, we apply these methods to solve multi-attribute group decision-making problems. The feasibility and validity of the proposed operator are demonstrated by comparisons with other models in an illustrative example.
Cárdenas-García, Maura; González-Pérez, Pedro Pablo
2013-03-01
Apoptotic cell death plays a crucial role in development and homeostasis. This process is driven by mitochondrial permeabilization and activation of caspases. In this paper we adopt a tuple spaces-based modelling and simulation approach, and show how it can be applied to the simulation of this intracellular signalling pathway. Specifically, we are working to explore and to understand the complex interaction patterns of the apoptotic caspases and the role of the mitochondria. As a first approximation, using the tuple spaces-based in silico approach, we model and simulate both the extrinsic and intrinsic apoptotic signalling pathways and the interactions between them. During apoptosis, mitochondrial proteins released from the mitochondria to the cytosol are decisively involved in the process. If the decision is to die, from this point there is normally no return; cancer cells, however, offer resistance to mitochondrial induction.
Weiqi games as a tree: Zipf's law of openings and beyond
NASA Astrophysics Data System (ADS)
Xu, Li-Gong; Li, Ming-Xia; Zhou, Wei-Xing
2015-06-01
Weiqi is one of the most complex board games played by two persons. The placement strategies adopted by Weiqi players are often used as an analogue of the philosophy of human warfare. In contrast to Western chess, Weiqi games are less studied by academics, partially because Weiqi is popular only in East Asia, especially in China, Japan and Korea. Here, we propose to construct a directed tree using a database of extensive Weiqi games and perform a quantitative analysis of the Weiqi tree. We find that the popularity distribution of Weiqi openings with the same number of moves is distributed according to a power law and that the tail exponent increases with the number of moves. Intriguingly, the superposition of the popularity distributions of Weiqi openings with a number of moves not higher than a given number also has a power-law tail in which the tail exponent increases with the number of moves, and the superposed distribution approaches the Zipf law. These findings are the same as for chess and support the conjecture that the popularity distribution of board game openings follows the Zipf law with a universal exponent. We also find that the distribution of out-degrees has a power-law form, that the distribution of branching ratios has a very complicated pattern, and that the distribution of uniqueness scores, defined by the path lengths from the root vertex to the leaf vertices, exhibits a unimodal shape. Our work provides a promising direction for the study of the decision-making process of Weiqi playing from the perspective of a directed branching tree.
SH^c realization of minimal model CFT: triality, poset and Burge condition
NASA Astrophysics Data System (ADS)
Fukuda, M.; Nakamura, S.; Matsuo, Y.; Zhu, R.-D.
2015-11-01
Recently an orthogonal basis of the W_N-algebra (AFLT basis) labeled by N-tuples of Young diagrams was found in the context of 4D/2D duality. Recursion relations among the basis are summarized in the form of an algebra SH^c which is universal for any N. We show that it has an S_3 automorphism which is referred to as triality. We study the level-rank duality between minimal models, which is a special example of the automorphism. It is shown that the nonvanishing states in both systems are described by N or M Young diagrams with the rows of boxes appropriately shuffled. The reshuffling of rows implies there exists a partial ordering of the set which labels them. For the simplest example, one can compute the partition functions for the partially ordered set (poset) explicitly, which reproduces the Rogers-Ramanujan identities. We also study the description of minimal models by SH^c. Simple analysis reproduces some known properties of minimal models, the structure of singular vectors and the N-Burge condition in the Hilbert space.
NASA Astrophysics Data System (ADS)
Hasan, Mehedi; Hall, Trevor
2016-11-01
In the title paper, Li et al. presented a scheme for filter-less photonic millimetre-wave (mm-wave) generation based on two polarization-multiplexed parallel dual-parallel Mach-Zehnder modulators (DP-MZMs). For frequency octo-tupling, all harmonics are suppressed except those of order 4l, where l is an integer. The carrier is then suppressed by the polarization-multiplexing technique, which is the principal innovative step in their design. Frequency 12-tupling and 16-tupling are also described following a similar method. The two DP-MZMs are driven similarly and provide identical outputs for the same RF modulation indices. Consequently, a demerit of their design is the requirement to apply two different RF modulation indices within a particular range and to set the polarizer to a precise angle, which depends on the pair of modulation indices used, in order to suppress the unwanted harmonics (e.g. the carrier) without simultaneously suppressing the wanted harmonics. The aim of this comment is to show that, by adjusting the RF drive phases with a fixed polarizer angle in the design presented by Li et al., all harmonics can be suppressed except those of order 4l, where l is an odd integer. Hence, filter-less frequency octo-tupling can be achieved whose performance is not limited by careful adjustment of the RF drive signal; rather, it can be operated over a wide range of modulation indices (m = 2.5 → 7.5). If the modulation index is adjusted to suppress the 4th harmonics, the design can instead perform frequency 24-tupling. Since the carrier is suppressed by design in the modified architecture, the strict requirement to adjust the RF drive (and polarizer angle) can be avoided without any significant change to the circuit complexity.
A Logical Basis In The Layered Computer Vision Systems Model
NASA Astrophysics Data System (ADS)
Tejwani, Y. J.
1986-03-01
In this paper a four-layer computer vision system model is described. The model uses a finite-memory scratch pad. In this model, planar objects are defined as predicates. Predicates are relations on a k-tuple, which consists of primitive points and the relationships between primitive points. The relationship between points can be of the direct type or the indirect type. Entities are goals which are satisfied by a set of clauses. The grammar used to construct these clauses is examined.
NASA Astrophysics Data System (ADS)
Zhu, Huatao; Wang, Rong; Xiang, Peng; Pu, Tao; Fang, Tao; Zheng, Jilin; Li, Yuandong
2017-10-01
In this paper, a novel approach for photonic generation of microwave signals based on frequency multiplication using an injected distributed-feedback (DFB) semiconductor laser is proposed and demonstrated by a proof-of-concept experiment. The proposed system is mainly made up of a dual-parallel Mach-Zehnder modulator (DPMZM) and an injected DFB laser. By properly setting the bias voltage of the DPMZM, ±2-order sidebands with carrier suppression are generated, which are then injected into the slave laser. Due to the optical sideband locking and four-wave mixing (FWM) nonlinearity in the slave laser, new sidebands are generated. Then these sidebands are sent to an optical notch filter where all the undesired sidebands are removed. Finally, after photodetector detection, frequency multiplied microwave signals can be generated. Thanks to the flexibility of the optical sideband locking and FWM, frequency octupling, 12-tupling, 14-tupling and 16-tupling can be obtained.
WWWinda Orchestrator: a mechanism for coordinating distributed flocks of Java Applets
NASA Astrophysics Data System (ADS)
Gutfreund, Yechezkal-Shimon; Nicol, John R.
1997-01-01
The WWWinda Orchestrator is a simple but powerful tool for coordinating distributed Java applets. Loosely derived from the Linda programming language developed by David Gelernter and Nicholas Carriero of Yale, WWWinda implements a distributed shared object space called TupleSpace, where applets can post, read, or permanently store arbitrary Java objects. In this manner, applets can easily share information without being aware of the underlying communication mechanisms. WWWinda is very useful for orchestrating flocks of distributed Java applets: coordination events can be posted to the WWWinda TupleSpace and used to orchestrate the actions of remote applets, and applets can easily share information via the TupleSpace. The technology combines several functions in one simple metaphor: distributed web objects, remote messaging between applets, distributed synchronization mechanisms, an object-oriented database, and a distributed event-signaling mechanism. WWWinda can be used as a platform for implementing shared VRML environments, shared groupware environments, control of remote devices such as cameras, distributed karaoke, distributed gaming, and shared audio and video experiences.
Smart Drill-Down: A New Data Exploration Operator
Joglekar, Manas; Garcia-Molina, Hector; Parameswaran, Aditya
2015-01-01
We present a data exploration system equipped with smart drill-down, a novel operator for interactively exploring a relational table to discover and summarize “interesting” groups of tuples. Each such group of tuples is represented by a rule. For instance, the rule (a, b, ★, 1000) tells us that there are a thousand tuples with value a in the first column and b in the second column (and any value in the third column). Smart drill-down presents an analyst with a list of rules that together describe interesting aspects of the table. The analyst can tailor the definition of interesting, and can interactively apply smart drill-down on an existing rule to explore that part of the table. In the demonstration, conference attendees will be able to use the data exploration system equipped with smart drill-down, and will be able to contrast smart drill-down to traditional drill-down, for various interestingness measures, and resource constraints. PMID:26844008
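A rule such as (a, b, ★, 1000) is a pattern with wildcard columns plus the count of tuples it covers. As a toy illustration (the `rule_count` helper is hypothetical, not part of the authors' system), counting the tuples covered by a rule reduces to:

```python
def rule_count(table, rule):
    """Count the tuples covered by a rule, where '*' matches any value
    in that column."""
    return sum(all(r == '*' or r == v for r, v in zip(rule, row))
               for row in table)

table = [('a', 'b', 'x'), ('a', 'b', 'y'), ('a', 'c', 'x'), ('d', 'b', 'x')]
count = rule_count(table, ('a', 'b', '*'))  # tuples with a, b in columns 1-2
```

Smart drill-down then searches for a small list of such rules whose covered-tuple counts jointly summarize the table well.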
Dual-function photonic integrated circuit for frequency octo-tupling or single-side-band modulation.
Hasan, Mehedi; Maldonado-Basilio, Ramón; Hall, Trevor J
2015-06-01
A dual-function photonic integrated circuit for microwave photonic applications is proposed. The circuit consists of four linear electro-optic phase modulators connected optically in parallel within a generalized Mach-Zehnder interferometer architecture. The photonic circuit is arranged to have two separate output ports. A first port provides frequency up-conversion of a microwave signal from the electrical to the optical domain; equivalently single-side-band modulation. A second port provides tunable millimeter wave carriers by frequency octo-tupling of an appropriate amplitude RF carrier. The circuit exploits the intrinsic relative phases between the ports of multi-mode interference couplers to provide substantially all the static optical phases needed. The operation of the proposed dual-function photonic integrated circuit is verified by computer simulations. The performance of the frequency octo-tupling and up-conversion functions is analyzed in terms of the electrical signal to harmonic distortion ratio and the optical single side band to unwanted harmonics ratio, respectively.
Zipf exponent of trajectory distribution in the hidden Markov model
NASA Astrophysics Data System (ADS)
Bochkarev, V. V.; Lerner, E. Yu
2014-03-01
This paper is the first step in generalizing the previously obtained full classification of the asymptotic behavior of the probability of Markov chain trajectories to the case of hidden Markov models. The main goal is to study the power-law (Zipf) and non-power asymptotics of the frequency list of trajectories of hidden Markov chains and to obtain explicit formulae for the exponent of the power asymptotics. We consider several simple classes of hidden Markov models. We prove that the asymptotics for a hidden Markov model and for the corresponding Markov chain can be essentially different.
Equilibrium and dynamic methods when comparing an English text and its Esperanto translation
NASA Astrophysics Data System (ADS)
Ausloos, M.
2008-11-01
Two texts by Lewis Carroll, one (Alice in Wonderland) also translated into Esperanto, the other (Through the Looking Glass), are compared in order to observe whether natural and artificial languages differ significantly from each other. One-dimensional time-series-like signals are constructed using only word frequencies (FTS) or word lengths (LTS). The data are studied through (i) a Zipf method for sorting out correlations in the FTS and (ii) a method based on the Grassberger-Procaccia (GP) technique for finding correlations in the LTS. The methods correspond to an equilibrium and a dynamic approach, respectively, to the features of human texts. There are quantitative statistical differences between the original English text and its Esperanto translation, but the qualitative differences are very minute. However, different power laws are observed, with characteristic exponents for the ranking properties and for the phase-space attractor dimensionality. The Zipf exponent can take values much less than unity (∼0.50 or 0.30) depending on how a sentence is defined. This variety in exponents can be conjectured to be an intrinsic measure of the book's style or purpose, rather than of the language or the author's vocabulary richness, since a similar exponent is obtained whatever the text. Moreover, the attractor dimension r is a simple function of the so-called phase-space dimension n, i.e., r = n^λ, with λ = 0.79. Such an exponent could also be conjectured to be a measure of the author's stylistic versatility, here well preserved in the translation.
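The Zipf part of such an analysis reduces to ranking word frequencies and fitting a power law to the rank-frequency curve. A minimal sketch (a plain least-squares fit on log-log data, not the author's exact procedure):

```python
import math
from collections import Counter

def zipf_exponent(words):
    """Estimate the Zipf exponent as minus the slope of a least-squares
    fit of log(frequency) against log(rank)."""
    freqs = sorted(Counter(words).values(), reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope
```

On a synthetic corpus in which the r-th most common word occurs about 100/r times, the estimate comes out close to 1, the classical Zipf value; exponents well below 1, as reported above, signal a flatter rank-frequency curve.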
Diversity and antimicrobial potential in sea anemone and holothurian microbiomes.
León-Palmero, Elizabeth; Joglar, Vanessa; Álvarez, Pedro A; Martín-Platero, Antonio; Llamas, Inmaculada; Reche, Isabel
2018-01-01
Marine invertebrates, as holobionts, contain symbiotic bacteria that coevolve with them and produce antimicrobial substances. These symbiotic bacteria are an underexplored source of new bioactive molecules for facing the emerging antibiotic resistance in pathogens. Here, we explored the antimicrobial activity of bacteria retrieved from the microbiota of two sea anemones (Anemonia sulcata, Actinia equina) and two holothurians (Holothuria tubulosa, Holothuria forskali). We tested the antimicrobial activity of the isolated bacteria against pathogens of interest for human health, agriculture and aquaculture. We isolated 27 strains with antibacterial activity, 12 of which also showed antifungal activity. We identified these strains taxonomically, with Bacillus and Vibrio species being the most representative producers of antimicrobial substances. Microbiome species composition of the two sea anemones was similar between them but differed substantially from that of seawater bacteria. In contrast, microbiome species composition of the two holothurian species differed both between them and in comparison with the bacteria in holothurian feces and seawater. In all the holobiont microbiomes, Bacteroidetes was the predominant phylum. For each microbiome, we determined diversity and rank-abundance dominance using five fitted models (null, pre-emption, log-Normal, Zipf and Zipf-Mandelbrot). The models with less evenness (i.e. Zipf and Zipf-Mandelbrot) showed the best fits for all the microbiomes. Finally, we tracked (using the V4 hypervariable region of the 16S rRNA gene) the relative abundance of these 27 isolates with antibacterial activity in the total pool of sequences obtained for the microbiome of each holobiont. Coincidences, although at extremely low frequencies, were detected only in the microbiome of H. forskali, suggesting that these isolated bacteria belong to the long tail of rare symbiotic bacteria. Therefore, more and more sophisticated culture techniques are necessary to explore this apparently vast pool of rare symbiotic bacteria and to determine their biotechnological potential.
Logical definability and asymptotic growth in optimization and counting problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Compton, K.
1994-12-31
There has recently been a great deal of interest in the relationship between logical definability and NP-optimization problems. Let MS_n (resp. MP_n) be the class of problems to compute, for a given finite structure A, the maximum number of tuples x̄ in A satisfying a Σ_n (resp. Π_n) formula ψ(x̄, S̄) as S̄ ranges over predicates on A. Kolaitis and Thakur showed that the classes MS_n and MP_n collapse to a hierarchy of four levels. Papadimitriou and Yannakakis previously showed that problems in the two lowest levels MS_0 and MS_1 (which they called Max Snp and Max Np) are approximable to within a constant factor in polynomial time. Similarly, Saluja, Subrahmanyam, and Thakur defined SS_n (resp. SP_n) to be the class of problems to compute, for a given finite structure A, the number of tuples (x̄, S̄) satisfying a given Σ_n (resp. Π_n) formula ψ(x̄, S̄) in A. They showed that the classes SS_n and SP_n collapse to a hierarchy of five levels and that problems in the two lowest levels SS_0 and SS_1 have a fully polynomial-time randomized approximation scheme. We define extended classes MSF_n, MPF_n, SSF_n, and SPF_n by allowing formulae to contain predicates definable in a logic known as least-fixpoint logic. The resulting hierarchies collapse to the same number of levels, and problems in the bottom levels can be approximated as before, but now some problems descend from the highest levels in the original hierarchies to the lowest levels in the new hierarchies. We introduce a method for characterizing rates of growth of average solution sizes, thereby showing that a number of important problems do not belong to MSF_1 and SSF_1. This method is related to limit laws for logics and the probabilistic method from combinatorics.
Computer systems and methods for the query and visualization of multidimensional databases
Stolte, Chris; Tang, Diane L.; Hanrahan, Patrick
2006-08-08
A method and system for producing graphics. A hierarchical structure of a database is determined. A visual table, comprising a plurality of panes, is constructed by providing a specification that is in a language based on the hierarchical structure of the database. In some cases, this language can include fields that are in the database schema. The database is queried to retrieve a set of tuples in accordance with the specification. A subset of the set of tuples is associated with a pane in the plurality of panes.
Computer systems and methods for the query and visualization of multidimensional database
Stolte, Chris; Tang, Diane L.; Hanrahan, Patrick
2010-05-11
A method and system for producing graphics. A hierarchical structure of a database is determined. A visual table, comprising a plurality of panes, is constructed by providing a specification that is in a language based on the hierarchical structure of the database. In some cases, this language can include fields that are in the database schema. The database is queried to retrieve a set of tuples in accordance with the specification. A subset of the set of tuples is associated with a pane in the plurality of panes.
Design and Implementation of a Pretty Printer for the Functional Specification Language SPEC
1988-06-01
language-independent pretty printer using Kodiyak and attribute grammars. These general guidelines are a direct result of the insight gained from the...outlined. The final subject in this chapter is the general and specific rules for the pretty printer. A user of the pretty printer code needs only a working...an extension of a context-free grammar whose generated language includes syntax and semantics. A context-free grammar (CFG) is a four-tuple G = (N, T, P, S
Trimodal interpretation of constraints for planning
NASA Technical Reports Server (NTRS)
Krieger, David; Brown, Richard
1987-01-01
Constraints are used in the CAMPS knowledge-based planning system to represent those propositions that must be true for a plan to be acceptable. CAMPS introduces the make-mode for interpreting a constraint. Given an unsatisfied constraint, make-mode evaluation suggests planning actions which, if taken, would result in a modified plan in which the constraint in question may be satisfied. These suggested planning actions, termed delta-tuples, are the raw material of intelligent plan repair. They are used both in debugging an almost-right plan and in replanning due to changing situations. Given a defective plan in which some set of constraints is violated, a problem-solving strategy selects one or more constraints as a focus of attention. These selected constraints are evaluated in the make-mode to produce delta-tuples. The problem-solving strategy then reviews the delta-tuples according to its application- and problem-specific criteria to find the most acceptable change in terms of success likelihood and plan disruption. Finally, the problem-solving strategy makes the suggested alteration to the plan and then rechecks constraints to find any unexpected consequences.
Jiang, Yuyi; Shao, Zhiqing; Guo, Yi
2014-01-01
A complex computing problem can be solved efficiently on a system with multiple computing nodes by dividing its implementation code into several parallel processing modules or tasks that can be formulated as directed acyclic graph (DAG) problems. The DAG jobs may be mapped to and scheduled on the computing nodes to minimize the total execution time. Searching for an optimal DAG scheduling solution is considered to be NP-complete. This paper proposes a tuple molecular structure-based chemical reaction optimization (TMSCRO) method for DAG scheduling on heterogeneous computing systems, based on a very recently proposed metaheuristic method, chemical reaction optimization (CRO). Compared with other CRO-based algorithms for DAG scheduling, the design of the tuple reaction molecular structure and the four elementary reaction operators of TMSCRO is more reasonable. TMSCRO also applies the concepts of constrained critical paths (CCPs), the constrained-critical-path directed acyclic graph (CCPDAG) and the super molecule for accelerating convergence. In this paper, we have also conducted simulation experiments to verify the effectiveness and efficiency of TMSCRO upon a large set of randomly generated graphs and graphs for real-world problems. PMID:25143977
Interactive Data Exploration with Smart Drill-Down
Joglekar, Manas; Garcia-Molina, Hector; Parameswaran, Aditya
2017-01-01
We present smart drill-down, an operator for interactively exploring a relational table to discover and summarize “interesting” groups of tuples. Each group of tuples is described by a rule. For instance, the rule (a, b, ⋆, 1000) tells us that there are a thousand tuples with value a in the first column and b in the second column (and any value in the third column). Smart drill-down presents an analyst with a list of rules that together describe interesting aspects of the table. The analyst can tailor the definition of interesting, and can interactively apply smart drill-down on an existing rule to explore that part of the table. We demonstrate that the underlying optimization problems are NP-Hard, and describe an algorithm for finding the approximately optimal list of rules to display when the user uses a smart drill-down, and a dynamic sampling scheme for efficiently interacting with large tables. Finally, we perform experiments on real datasets on our experimental prototype to demonstrate the usefulness of smart drill-down and study the performance of our algorithms. PMID:28210096
Coupled-cluster based basis sets for valence correlation calculations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Claudino, Daniel; Bartlett, Rodney J., E-mail: bartlett@qtp.ufl.edu; Gargano, Ricardo
Novel basis sets are generated that target the description of valence correlation in atoms H through Ar. The new contraction coefficients are obtained according to the Atomic Natural Orbital (ANO) procedure from CCSD(T) (coupled-cluster singles and doubles with perturbative triples correction) density matrices, starting from the primitive functions of Dunning et al. [J. Chem. Phys. 90, 1007 (1989); ibid. 98, 1358 (1993); ibid. 100, 2975 (1993)] (correlation-consistent polarized valence X-tuple zeta, cc-pVXZ). The exponents of the primitive Gaussian functions are subject to uniform scaling in order to ensure satisfaction of the virial theorem for the corresponding atoms. These new sets, named ANO-VT-XZ (Atomic Natural Orbital Virial Theorem X-tuple Zeta), have the same number of contracted functions as their cc-pVXZ counterparts in each subshell. The performance of these basis sets is assessed by evaluating the contraction errors in four distinct computations: correlation energies in atoms, probing the density in different regions of space via ⟨r^n⟩ (−3 ≤ n ≤ 3) in atoms, correlation energies in diatomic molecules, and the quality of fitted potential energy curves as measured by spectroscopic constants. All energy calculations with ANO-VT-QZ have contraction errors within the "chemical accuracy" of 1 kcal/mol, which is not true for cc-pVQZ, suggesting some improvement compared to the correlation-consistent series of Dunning and co-workers.
NASA Astrophysics Data System (ADS)
Ausloos, Marcel; Vandewalle, Nicolas; Ivanova, Kristinka
Specialized topics in financial data analysis from a numerical and physical point of view are discussed as they pertain to the analysis of coherent and random sequences in financial fluctuations within (i) the extended detrended fluctuation analysis method, (ii) the multi-affine analysis technique, (iii) mobile-average intersection rules and distributions, (iv) sandpile avalanche models for crash prediction, (v) the (m,k)-Zipf method and (vi) the i-variability diagram technique for sorting out short-range correlations. The most baffling result, which needs further thought from mathematicians and physicists, is recalled: the crossing of two mobile averages is an original method for measuring the "signal" roughness exponent, but why this is so is not yet understood.
Arithmetic Data Cube as a Data Intensive Benchmark
NASA Technical Reports Server (NTRS)
Frumkin, Michael A.; Shabano, Leonid
2003-01-01
Data movement across computational grids and across the memory hierarchy of individual grid machines is known to be a limiting factor for applications involving large data sets. In this paper we introduce the Data Cube Operator on an Arithmetic Data Set, which we call the Arithmetic Data Cube (ADC). We propose to use the ADC to benchmark grid capabilities to handle large distributed data sets. The ADC stresses all levels of grid memory by producing the 2^d views of an Arithmetic Data Set of d-tuples described by a small number of parameters. We control the data intensity of the ADC by controlling the sizes of the views through the choice of the tuple parameters.
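The 2^d views of a d-attribute data cube are simply all subsets of the attribute set, one group-by per subset. A sketch of the enumeration (illustrative only, not the benchmark's actual view generator):

```python
from itertools import chain, combinations

def all_views(attributes):
    """Enumerate the 2^d group-by views of a d-attribute data cube:
    every subset of the attributes, from () up to the full set."""
    d = len(attributes)
    return list(chain.from_iterable(
        combinations(attributes, k) for k in range(d + 1)))

views = all_views(['a', 'b', 'c'])  # 2^3 = 8 views
```

Materializing each view requires grouping and aggregating the full tuple set, which is what makes the operator a natural stress test for data movement.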
Kang, Hyunchul
2015-01-01
We investigate the in-network processing of an iceberg join query in wireless sensor networks (WSNs). An iceberg join is a special type of join where only those joined tuples whose cardinality exceeds a certain threshold (called iceberg threshold) are qualified for the result. Processing such a join involves the value matching for the join predicate as well as the checking of the cardinality constraint for the iceberg threshold. In the previous scheme, the value matching is carried out as the main task for filtering non-joinable tuples while the iceberg threshold is treated as an additional constraint. We take an alternative approach, meeting the cardinality constraint first and matching values next. In this approach, with a logical fragmentation of the join operand relations on the aggregate counts of the joining attribute values, the optimal sequence of 2-way fragment semijoins is generated, where each fragment semijoin employs a Bloom filter as a synopsis of the joining attribute values. This sequence filters non-joinable tuples in an energy-efficient way in WSNs. Through implementation and a set of detailed experiments, we show that our alternative approach considerably outperforms the previous one. PMID:25774710
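The Bloom-filter synopsis used in each fragment semijoin can be sketched as below; the filter size and hash construction here are illustrative choices, not those of the paper:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: a compact, lossy synopsis of a value set.
    Membership tests can give false positives but never false negatives,
    so tuples the filter rejects are guaranteed non-joinable and safe
    to drop before transmission."""

    def __init__(self, m=1024, k=3):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, value):
        # Derive k bit positions from salted SHA-256 digests.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        return all(self.bits >> pos & 1 for pos in self._positions(value))
```

One node builds the filter over its joining attribute values and ships only the m-bit array; the other node probes it to discard tuples that cannot possibly join, saving transmission energy.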
Explaining the uneven distribution of numbers in nature: the laws of Benford and Zipf
NASA Astrophysics Data System (ADS)
Pietronero, L.; Tosatti, E.; Tosatti, V.; Vespignani, A.
2001-04-01
The distribution of first digits in number series obtained from very different origins shows a marked asymmetry in favor of small digits, which goes by the name of Benford's law. We analyze this property in detail for different data sets and give a general explanation for the origin of Benford's law in terms of multiplicative processes. We show that this law can also be generalized to series of numbers generated by more complex systems, such as catalogs of seismic activity. Finally, we derive a relation between the generalized Benford's law and the popular Zipf's law, which characterizes rank-order statistics and has been extensively applied to many problems, ranging from city populations to linguistics.
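Benford's law is easy to check numerically: compare observed leading-digit frequencies against log10(1 + 1/d). A minimal sketch, using powers of 2 as a stand-in multiplicative process (not the paper's data sets):

```python
import math
from collections import Counter

def first_digit_distribution(numbers):
    """Observed relative frequency of the leading digits 1-9."""
    digits = [int(str(abs(n)).lstrip("0.")[0]) for n in numbers if n]
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}

def benford(d):
    """Benford's law: P(d) = log10(1 + 1/d)."""
    return math.log10(1 + 1 / d)

# A multiplicative process: successive powers of 2.
obs = first_digit_distribution([2 ** i for i in range(1, 1001)])
```

For such a sequence the observed frequency of a leading 1 sits near log10(2) ≈ 0.301, with the familiar monotone decline toward digit 9.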
Forensic Memory Analysis for Apple OS X
2012-06-14
x86. Table 5. Template interface fields: template (dict), template implementing the C struct interface; MBR_NAME (str), dictionary key, variable name for a struct member; template[MBR_NAME] (tuple), dictionary value, a struct member description; MBR_TYPE (str), C type of the named member; OFFSET (int), offset in bytes for the member; SIZE (int), size in bytes for the member type; FIELD (str), lsof field represented by the member.
Familial 18 centromere variant resulting in difficulties in interpreting prenatal interphase FISH.
Bourthoumieu, S; Esclaire, F; Terro, F; Brosset, P; Fiorenza, M; Aubard, V; Beguet, M; Yardin, C
2010-08-01
We report here on a familial case of centromeric heteromorphism of chromosome 18, detected by prenatal interphase fluorescence in situ hybridization (FISH) analysis and transmitted by the mother to her fetus, resulting in complete loss of one chromosome 18 signal. The prenatal diagnosis was performed by interphase FISH (AneuVysion probe set, and LSI DiGeorge 22q11.2 kit) because of the presence of an isolated fetal cardiac abnormality, and was at first difficult to interpret: only one centromeric 18 signal was detectable in prenatal interphase nuclei, along with one signal for the Y and one for the X chromosome. The LSI DiGeorge 22q11.2 kit also showed the absence of one TUPLE1 signal in all examined nuclei. In fact, FISH performed on a maternal buccal smear displayed the same absence of one chromosome 18 centromeric signal, combined with the presence of two TUPLE1 signals. All these results led to the diagnosis of an isolated 22q11.2 fetal microdeletion, which was confirmed on metaphase spreads. This case illustrates once again that locus-specific (LSI) probes are more effective than alpha centromeric probes for interphase analysis. The development of high-quality LSI probes for chromosomes 18, X and Y could avoid the misinterpretation of prenatal interphase FISH that leads to numerous additional and expensive investigations.
PseKNC: a flexible web server for generating pseudo K-tuple nucleotide composition.
Chen, Wei; Lei, Tian-Yu; Jin, Dian-Chuan; Lin, Hao; Chou, Kuo-Chen
2014-07-01
The pseudo oligonucleotide composition, or pseudo K-tuple nucleotide composition (PseKNC), can be used to represent a DNA or RNA sequence with a discrete model or vector yet still keep considerable sequence-order information, particularly the global or long-range sequence-order information, via the physicochemical properties of its constituent oligonucleotides. The PseKNC approach may therefore hold very high potential for enhancing the power of many methods in computational genomics and genome sequence analysis. However, different DNA or RNA problems may require different kinds of PseKNC. Here, we present a flexible and user-friendly web server for PseKNC (at http://lin.uestc.edu.cn/pseknc/default.aspx) with which users can easily generate many different modes of PseKNC according to their needs by selecting various parameters and physicochemical properties. Furthermore, for the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the current web server to generate the desired PseKNC without the need to follow the complicated mathematical equations, which are presented in this article just for the integrity of the PseKNC formulation and its development. It is anticipated that the PseKNC web server will become a very useful tool in computational genomics and genome sequence analysis.
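The basic K-tuple composition underlying PseKNC can be sketched as below. Note this covers only the plain composition vector; the "pseudo" components, correlation terms derived from physicochemical properties of the oligonucleotides, are omitted from this sketch:

```python
from collections import Counter
from itertools import product

def ktuple_composition(seq, k=2):
    """Normalized K-tuple (K-mer) composition of a DNA sequence:
    the frequency of every length-k word over the alphabet ACGT.
    PseKNC augments this 4^k-dimensional vector with sequence-order
    correlation terms, which are not implemented here."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(windows)
    return {m: counts[m] / len(windows) for m in kmers}

comp = ktuple_composition("ACGTACGTAC", k=2)  # 16 dinucleotide frequencies
```

The resulting fixed-length vector is what makes sequences of different lengths comparable in downstream statistical and machine-learning analyses.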
Do neural nets learn statistical laws behind natural language?
Takahashi, Shuntaro; Tanaka-Ishii, Kumiko
2017-01-01
The performance of deep learning in natural language processing has been spectacular, but the reasons for this success remain unclear because of the inherent complexity of deep learning. This paper provides empirical evidence of its effectiveness and of a limitation of neural networks for language engineering. Precisely, we demonstrate that a neural language model based on long short-term memory (LSTM) effectively reproduces Zipf's law and Heaps' law, two representative statistical properties underlying natural language. We discuss the quality of reproducibility and the emergence of Zipf's law and Heaps' law as training progresses. We also point out that the neural language model has a limitation in reproducing long-range correlation, another statistical property of natural language. This understanding could provide a direction for improving the architectures of neural networks.
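Heaps' law concerns vocabulary growth: the number of distinct word types V(n) after n tokens grows roughly as K·n^β with β < 1. A minimal sketch of the measurement (illustrative only, not the paper's experimental setup):

```python
def vocabulary_growth(tokens):
    """Heaps'-law curve: vocabulary size V(n) after the first n tokens."""
    seen, growth = set(), []
    for token in tokens:
        seen.add(token)
        growth.append(len(seen))
    return growth

# For natural text (and text sampled from a good language model) this
# curve is sublinear: new types keep appearing, but ever more slowly.
growth = vocabulary_growth(list("abracadabra"))
```

Comparing this curve for training text versus text generated by the trained model is one way to test whether the model has reproduced the statistical law.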
Sandia Unstructured Triangle Tabular Interpolation Package v 0.1 beta
DOE Office of Scientific and Technical Information (OSTI.GOV)
2013-09-24
The software interpolates tabular data, such as for equations of state, provided on an unstructured triangular grid. In particular, interpolation occurs in a two-dimensional space by looking up the triangle in which the desired evaluation point resides and then performing a linear interpolation over the n-tuples associated with the nodes of the chosen triangle. The interface to the interpolation routines allows for automated conversion of units from those tabulated to the desired output units. When multiple tables are included in a data file, new tables may be generated by on-the-fly mixing of the provided tables.
Benjafield, John G
2016-05-01
The digital humanities are being applied with increasing frequency to the analysis of historically important texts. In this study, the methods of G. K. Zipf are used to explore the digital history of the vocabulary of psychology. Zipf studied a great many phenomena, from word frequencies to city sizes, showing that they tend to have a characteristic distribution in which a few cases occur very frequently and many more cases occur very infrequently. We find that the number of new words and word senses that writers contribute to the vocabulary of psychology has such a Zipfian distribution. Moreover, those who make the most contributions, such as William James, tend also to invent new metaphorical senses of words rather than new words. By contrast, those who make the fewest contributions tend to invent entirely new words. The use of metaphor makes a text easier for a reader to understand. While the use of new words requires more effort on the part of the reader, it may lead to more precise understanding than does metaphor. On average, new words and word senses became a part of psychology's vocabulary in the time leading up to World War I, suggesting that psychology was "finding its language" (Danziger, 1997) during this period.
Quantiprot - a Python package for quantitative analysis of protein sequences.
Konopka, Bogumił M; Marciniak, Marta; Dyrka, Witold
2017-07-17
The field of protein sequence analysis is dominated by tools rooted in substitution matrices and alignments. A complementary approach is provided by methods of quantitative characterization. A major advantage of this approach is that quantitative properties define a multidimensional solution space in which sequences can be related to each other and differences can be meaningfully interpreted. Quantiprot is a software package in Python which provides a simple and consistent interface to multiple methods for the quantitative characterization of protein sequences. The package can be used to calculate dozens of characteristics directly from sequences or using physico-chemical properties of amino acids. Besides basic measures, Quantiprot performs quantitative analysis of recurrence and determinism in a sequence, calculates the distribution of n-grams, and computes the Zipf's law coefficient. We propose three main fields of application of the Quantiprot package. First, quantitative characteristics can be used in alignment-free similarity searches and in clustering of large and/or divergent sequence sets. Second, a feature space defined by quantitative properties can be used in comparative studies of protein families and organisms. Third, the feature space can be used for evaluating generative models, where a large number of sequences generated by a model can be compared to actually observed sequences.
Isotropic probability measures in infinite-dimensional spaces
NASA Technical Reports Server (NTRS)
Backus, George
1987-01-01
Let R be the real numbers, R(n) the linear space of all real n-tuples, and R(infinity) the linear space of all infinite real sequences x = (x sub 1, x sub 2,...). Let P sub n :R(infinity) approaches R(n) be the projection operator with P sub n (x) = (x sub 1,...,x sub n). Let p(infinity) be a probability measure on the smallest sigma-ring of subsets of R(infinity) which includes all of the cylinder sets P sub n(-1) (B sub n), where B sub n is an arbitrary Borel subset of R(n). Let p sub n be the marginal distribution of p(infinity) on R(n), so p sub n(B sub n) = p(infinity) (P sub n to the -1 (B sub n)) for each B sub n. A measure on R(n) is isotropic if it is invariant under all orthogonal transformations of R(n). All members of the set of all isotropic probability distributions on R(n) are described. The result calls into question both stochastic inversion and Bayesian inference, as currently used in many geophysical inverse problems.
On splice site prediction using weight array models: a comparison of smoothing techniques
NASA Astrophysics Data System (ADS)
Taher, Leila; Meinicke, Peter; Morgenstern, Burkhard
2007-11-01
In most eukaryotic genes, protein-coding exons are separated by non-coding introns, which are removed from the primary transcript by a process called "splicing". The positions where introns are cut and exons are spliced together are called "splice sites". Thus, computational prediction of splice sites is crucial for gene finding in eukaryotes. Weight array models are a powerful probabilistic approach to splice site detection. Parameters for these models are usually derived from m-tuple frequencies in trusted training data and subsequently smoothed to avoid zero probabilities. In this study we compare three different ways of parameter estimation from m-tuple frequencies, namely (a) non-smoothed probability estimation, (b) standard pseudo counts, and (c) a Gaussian smoothing procedure that we recently developed.
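Variant (b), standard pseudo counts, amounts to additive smoothing of the m-tuple frequencies so that no m-tuple is ever assigned probability zero. A minimal sketch under that reading (the function name and defaults are ours, not the paper's):

```python
from collections import Counter
from itertools import product

def smoothed_mtuple_probs(sequences, m=2, pseudo=1.0):
    """m-tuple probabilities with additive (pseudo-count) smoothing.

    Every possible m-tuple over the ACGT alphabet receives at least
    pseudo / (total + pseudo * 4**m) probability mass, so unseen
    tuples never produce zero probabilities downstream.
    """
    counts = Counter()
    for s in sequences:
        for i in range(len(s) - m + 1):
            counts[s[i:i + m]] += 1
    total = sum(counts.values())
    k = 4 ** m
    return {"".join(t): (counts["".join(t)] + pseudo) / (total + pseudo * k)
            for t in product("ACGT", repeat=m)}
```

Setting pseudo to 0 recovers variant (a), the non-smoothed estimate.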
Empowering Provenance in Data Integration
NASA Astrophysics Data System (ADS)
Kondylakis, Haridimos; Doerr, Martin; Plexousakis, Dimitris
The provenance of data has recently been recognized as central to the trust one places in data. This paper presents a novel framework to empower provenance in a mediator-based data integration system. We use a simple mapping language for mapping schema constructs between an ontology and relational sources, capable of carrying provenance information. This language extends the traditional data exchange setting by translating our mapping specifications into source-to-target tuple-generating dependencies (s-t tgds). We then formally define the provenance information we want to retrieve, i.e., annotation, source, and tuple provenance. We provide three algorithms to retrieve provenance information using information stored in the mappings and the sources. We show the feasibility of our solution and the advantages of our framework.
Statistical Analysis of the Indus Script Using n-Grams
Yadav, Nisha; Joglekar, Hrishikesh; Rao, Rajesh P. N.; Vahia, Mayank N.; Adhikari, Ronojoy; Mahadevan, Iravatham
2010-01-01
The Indus script is one of the major undeciphered scripts of the ancient world. The small size of the corpus, the absence of bilingual texts, and the lack of definite knowledge of the underlying language have frustrated efforts at decipherment since the discovery of the remains of the Indus civilization. Building on previous statistical approaches, we apply the tools of statistical language processing, specifically n-gram Markov chains, to analyze the syntax of the Indus script. We find that unigrams follow a Zipf-Mandelbrot distribution. Text beginner and ender distributions are unequal, providing internal evidence for syntax. We see clear evidence of strong bigram correlations and extract significant pairs and triplets using a log-likelihood measure of association. Highly frequent pairs and triplets are not always highly significant. The model performance is evaluated using information-theoretic measures and cross-validation. The model can restore doubtfully read texts with an accuracy of about 75%. We find that a quadrigram Markov chain saturates information theoretic measures against a held-out corpus. Our work forms the basis for the development of a stochastic grammar which may be used to explore the syntax of the Indus script in greater detail.
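A common log-likelihood measure of association for extracting significant pairs is Dunning's G-squared statistic; the sketch below assumes that measure (the paper's exact formulation may differ). It compares the observed bigram count with what independence of the two signs would predict:

```python
import math

def g2_bigram(c_ab, c_a, c_b, n):
    """Dunning's log-likelihood ratio G^2 for the bigram (a, b).

    c_ab -- count of the bigram
    c_a  -- count of sign a as a bigram start
    c_b  -- count of sign b as a bigram end
    n    -- total number of bigrams in the corpus
    """
    def ll(k, n_, p):
        # binomial log-likelihood; 0 * log(0) terms are taken as 0
        s = 0.0
        if p > 0:
            s += k * math.log(p)
        if p < 1:
            s += (n_ - k) * math.log(1 - p)
        return s
    p = c_b / n                       # null hypothesis: b independent of a
    p1 = c_ab / c_a                   # P(b | a)
    p2 = (c_b - c_ab) / (n - c_a)     # P(b | not a)
    return 2 * (ll(c_ab, c_a, p1) + ll(c_b - c_ab, n - c_a, p2)
                - ll(c_ab, c_a, p) - ll(c_b - c_ab, n - c_a, p))
```

A score near zero means the pair occurs about as often as chance predicts; large scores flag significant pairs, which is why a highly frequent pair need not be highly significant.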
Where Gibrat meets Zipf: Scale and scope of French firms
NASA Astrophysics Data System (ADS)
Bee, Marco; Riccaboni, Massimo; Schiavo, Stefano
2017-09-01
The proper characterization of the size distribution and growth of firms represents an important issue in economics and business. We use the Maximum Entropy approach to assess the plausibility of the assumption that firm size follows Lognormal or Pareto distributions, which underlies most recent works on the subject. A comprehensive dataset covering the universe of French firms allows us to draw two major conclusions. First, the Pareto hypothesis for the whole distribution should be rejected. Second, by discriminating across firms based on the number of products sold and markets served, we find that, within the class of multi-product companies active in multiple markets, the distribution converges to a Zipf's law. Conversely, Lognormal distribution is a good benchmark for small single-product firms. The size distribution of firms largely depends on firms' diversification patterns.
Inferring Human Activity in Mobile Devices by Computing Multiple Contexts.
Chen, Ruizhi; Chu, Tianxing; Liu, Keqiang; Liu, Jingbin; Chen, Yuwei
2015-08-28
This paper introduces a framework for inferring human activities in mobile devices by computing spatial contexts, temporal contexts, spatiotemporal contexts, and user contexts. A spatial context is a significant location that is defined as a geofence, which can be a node associated with a circle, or a polygon; a temporal context contains time-related information that can be, e.g., a local time tag, a time difference between geographical locations, or a timespan; a spatiotemporal context is defined as a dwelling length at a particular spatial context; and a user context includes user-related information that can be the user's mobility contexts, environmental contexts, psychological contexts, or social contexts. Using the measurements of the built-in sensors and radio signals in mobile devices, we can snapshot a contextual tuple for every second that includes the aforementioned contexts. Given a contextual tuple, the framework evaluates the posterior probability of each candidate activity in real time using a Naïve Bayes classifier. A large dataset containing 710,436 contextual tuples was recorded over one week in an experiment carried out at Texas A&M University Corpus Christi with three participants. The test results demonstrate that the multi-context solution significantly outperforms the spatial-context-only solution. A classification accuracy of 61.7% is achieved for the spatial-context-only solution, while 88.8% is achieved for the multi-context solution.
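The classification step can be sketched as a categorical Naïve Bayes over contextual tuples. This toy version with add-one smoothing is our own illustration of the technique, not the authors' implementation, and the feature values are hypothetical:

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Naive Bayes for tuples of categorical features."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.prior = {c: math.log(y.count(c) / len(y)) for c in self.classes}
        self.counts = defaultdict(Counter)  # (class, feature idx) -> value counts
        self.values = defaultdict(set)      # feature idx -> values seen
        for xs, c in zip(X, y):
            for i, v in enumerate(xs):
                self.counts[(c, i)][v] += 1
                self.values[i].add(v)
        return self

    def predict(self, xs):
        def score(c):
            s = self.prior[c]
            for i, v in enumerate(xs):
                cnt = self.counts[(c, i)]
                # add-one smoothing over the values seen for this feature
                s += math.log((cnt[v] + 1)
                              / (sum(cnt.values()) + len(self.values[i])))
            return s
        return max(self.classes, key=score)
```

Each per-second contextual tuple is scored against every candidate activity and the activity with the highest posterior is reported.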
PseKRAAC: a flexible web server for generating pseudo K-tuple reduced amino acids composition.
Zuo, Yongchun; Li, Yuan; Chen, Yingli; Li, Guangpeng; Yan, Zhenhe; Yang, Lei
2017-01-01
Reduced amino acid alphabets provide a powerful means of both simplifying protein complexity and identifying functionally conserved regions. However, different protein problems may require different kinds of clustering methods. Encouraged by the success of the pseudo-amino acid composition algorithm, we developed a freely available web server, called PseKRAAC (pseudo K-tuple reduced amino acids composition). By implementing reduced amino acid alphabets, protein complexity can be significantly simplified, which decreases the chance of overfitting, lowers the computational burden, and reduces information redundancy. PseKRAAC delivers more capability for protein research by incorporating three crucial parameters that describe protein composition. Users can easily generate many different modes of PseKRAAC tailored to their needs by selecting various reduced amino acid alphabets and other characteristic parameters. It is anticipated that the PseKRAAC web server will become a very useful tool in computational proteomics and protein sequence analysis. Freely available on the web at http://bigdata.imu.edu.cn/psekraac. Contacts: yczuo@imu.edu.cn, imu.hema@foxmail.com, or yanglei_hmu@163.com. Supplementary information: Supplementary data are available at Bioinformatics online.
An Efficient Conflict Detection Algorithm for Packet Filters
NASA Astrophysics Data System (ADS)
Lee, Chun-Liang; Lin, Guan-Yu; Chen, Yaw-Chung
Packet classification is essential for supporting advanced network services such as firewalls, quality-of-service (QoS), virtual private networks (VPN), and policy-based routing. The rules that routers use to classify packets are called packet filters. If two or more filters overlap, a conflict occurs and leads to ambiguity in packet classification. This study proposes an algorithm that can efficiently detect and resolve filter conflicts using tuple-based search. The time complexity of the proposed algorithm is O(nW+s), and the space complexity is O(nW), where n is the number of filters, W is the number of bits in a header field, and s is the number of conflicts. This study uses synthetic filter databases generated by ClassBench to evaluate the proposed algorithm. Simulation results show that the proposed algorithm achieves better performance than existing conflict detection algorithms in both time and space, particularly for databases with large numbers of conflicts.
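The overlap test at the heart of conflict detection can be sketched for filters given as per-field binary prefixes. This is a simplified illustration of when two filters conflict (they overlap but neither contains the other); the paper's tuple-based search additionally indexes filters by their prefix-length tuples to avoid pairwise comparison:

```python
def prefixes_overlap(p, q):
    """Two binary prefix strings match a common address range
    iff one is a prefix of the other."""
    return p.startswith(q) or q.startswith(p)

def filters_conflict(f, g):
    """f, g: tuples of per-field binary prefixes.

    The filters overlap iff every field pair overlaps; the overlap is a
    conflict when, in addition, neither filter is a subset of the other,
    so neither is unambiguously more specific.
    """
    if not all(prefixes_overlap(a, b) for a, b in zip(f, g)):
        return False
    f_in_g = all(a.startswith(b) for a, b in zip(f, g))
    g_in_f = all(b.startswith(a) for a, b in zip(f, g))
    return not (f_in_g or g_in_f)
```

For example, ("00", "1") and ("0", "10") conflict: packets matching ("00", "10") match both filters, yet neither filter is more specific in every field.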
Languages cool as they expand: Allometric scaling and the decreasing need for new words
Petersen, Alexander M.; Tenenbaum, Joel N.; Havlin, Shlomo; Stanley, H. Eugene; Perc, Matjaž
2012-01-01
We analyze the occurrence frequencies of over 15 million words recorded in millions of books published during the past two centuries in seven different languages. For all languages and chronological subsets of the data we confirm that two scaling regimes characterize the word frequency distributions, with only the more common words obeying the classic Zipf law. Using corpora of unprecedented size, we test the allometric scaling relation between the corpus size and the vocabulary size of growing languages to demonstrate a decreasing marginal need for new words, a feature that is likely related to the underlying correlations between words. We calculate the annual growth fluctuations of word use which has a decreasing trend as the corpus size increases, indicating a slowdown in linguistic evolution following language expansion. This “cooling pattern” forms the basis of a third statistical regularity, which unlike the Zipf and the Heaps law, is dynamical in nature.
Modeling species-abundance relationships in multi-species collections
Peng, S.; Yin, Z.; Ren, H.; Guo, Q.
2003-01-01
The species-abundance relationship is one of the most fundamental aspects of community ecology. Since Motomura first developed the geometric series model to describe the structure of communities, ecologists have developed many other models to fit species-abundance data. These models can be classified into empirical and theoretical ones, including (1) statistical models, i.e., the negative binomial distribution (and its extension), log-series distribution (and its extension), geometric distribution, lognormal distribution, and Poisson-lognormal distribution; (2) niche models, i.e., the geometric series, broken stick, overlapping niche, particulate niche, random assortment, dominance pre-emption, dominance decay, random fraction, weighted random fraction, composite niche, and Zipf or Zipf-Mandelbrot models; and (3) dynamic models describing community dynamics and the restrictive function of the environment on the community. These models have different characteristics and fit species-abundance data in various communities or collections. Among them, the log-series distribution, lognormal distribution, geometric series, and broken stick model have been most widely used.
Capital death in the world market
NASA Astrophysics Data System (ADS)
Avakian, Adam; Podobnik, Boris; Piskor, Manuela; Stanley, H. Eugene
2014-03-01
We study the gross domestic product (GDP) per capita together with the market capitalization (MCAP) per capita as two indicators of the effect of globalization. We find that g, the GDP per capita, as a function of m, the MCAP per capita, follows a power law with average exponent close to 1/3. In addition, the Zipf ranking approach confirms that the m for countries with initially lower values of m tends to grow more rapidly than for countries with initially larger values of m. If the trends over the past 20 years continue to hold in the future, then the Zipf ranking approach leads to the prediction that in about 50 years, all countries participating in globalization will have comparable values of their MCAP per capita. We call this economic state "capital death," in analogy to the physics state of "heat death" predicted by thermodynamic arguments.
Delay correlation analysis and representation for vital complaint VHDL models
Rich, Marvin J.; Misra, Ashutosh
2004-11-09
A method and system unbind a rise/fall tuple of a VHDL generic variable and create rise time and fall time generics of each generic variable that are independent of each other. Then, according to a predetermined correlation policy, the method and system collect delay values in a VHDL standard delay file, sort the delay values, remove duplicate delay values, group the delay values into correlation sets, and output an analysis file. The correlation policy may include collecting all generic variables in a VHDL standard delay file, selecting each generic variable, and performing reductions on the set of delay values associated with each selected generic variable.
2005-12-01
...simulation engine of the Integrated Performance Modelling Environment is used together with the approach to demonstrate how... to be used in computer-generated forces. Future work will include more testing, integration with simulator engines and... Aspects are reasoning units relevant to simulated tasks. Each Aspect schema is a 4-tuple: AspectSchema = <MA, WM, LM, CL>, where MA refers to meta
Quantum catastrophes: a case study
NASA Astrophysics Data System (ADS)
Znojil, Miloslav
2012-11-01
The bound-state spectrum of a Hamiltonian H is assumed real in a non-empty domain D of physical values of parameters. This means that for these parameters, H may be called crypto-Hermitian, i.e. made Hermitian via an ad hoc choice of the inner product in the physical Hilbert space of quantum bound states (i.e. via an ad hoc construction of the operator Θ called the metric). The name quantum catastrophe is then assigned to the N-tuple-exceptional-point crossing, i.e. to the scenario in which we leave the domain D along such a path that at the boundary of D, an N-plet of bound-state energies degenerates and, subsequently, complexifies. At any fixed N ⩾ 2, this process is simulated via an N × N benchmark effective matrix Hamiltonian H. It is being assigned such a closed-form metric which is made unique via an N-extrapolation-friendliness requirement. This article is part of a special issue of Journal of Physics A: Mathematical and Theoretical devoted to ‘Quantum physics with non-Hermitian operators’.
Zipf's Law and the Visible Stars
NASA Astrophysics Data System (ADS)
Upgren, A. R.
2003-12-01
Zipf's law is a power law that describes the frequency of occurrence of many events in nature and in human affairs. It has recently found a resonance among some of the natural phenomena in the Earth sciences. Here we extend its applicability to astronomy. The essence of this law is that the sizes of objects or the frequencies of events adhere closely to a linear power law in which P, their frequency or size, is a function of rank; that is, P(i) ~ 1/i^a with the exponent, a, lying close to unity for many widely differing phenomena. Thus, for i = 1, 2, 3, 4, . . ., P(i) ~ 1, 1/2, 1/3, 1/4, . . . We examine the brightnesses of the naked-eye stars and compare their distribution to city populations and other examples of Zipf's law in other disciplines. At first glance it would seem that everything about these stars has long been known, but this is far from the case. In their classic 1953 text, Statistical Astronomy, Robert Trumpler and Harold Weaver (along with other authors since) list eight parameters that can define a star rather uniquely, to which we add a ninth. At that time, the only reliably known data for all or almost all of the 9110 BSC stars were their apparent magnitudes and proper motions. Reliable distances and parallaxes were known for few, and accurate radial velocities for almost none. But today photoelectric photometry and radial velocities are known for most, and parallaxes from the recent Hipparcos Catalogue for all. Only the age, through [Fe/H] or other indicators, is not known for many of the stars; we can hope that this will be corrected in years to come.
Benchmarking Memory Performance with the Data Cube Operator
NASA Technical Reports Server (NTRS)
Frumkin, Michael A.; Shabanov, Leonid V.
2004-01-01
Data movement across a computer memory hierarchy and across computational grids is known to be a limiting factor for applications processing large data sets. We use the Data Cube Operator on an Arithmetic Data Set, called ADC, to benchmark capabilities of computers and of computational grids to handle large distributed data sets. We present a prototype implementation of a parallel algorithm for computation of the operator. The algorithm follows a known approach for computing views from the smallest parent. The ADC stresses all levels of grid memory and storage by producing some of the 2^d views of an Arithmetic Data Set of d-tuples described by a small number of integers. We control the data intensity of the ADC by selecting the tuple parameters, the sizes of the views, and the number of realized views. Benchmarking results of memory performance of a number of computer architectures and of a small computational grid are presented.
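The 2^d views of a set of d-tuples are the group-by aggregations over every subset of the d attributes. A minimal sketch of what "views" means here, with counting as the aggregate (the names and the count measure are our own; the ADC benchmark uses its own generated data and aggregates):

```python
from itertools import combinations
from collections import defaultdict

def data_cube(tuples, d):
    """All 2**d group-by views of a collection of d-tuples.

    Each view is keyed by the subset of attribute positions grouped on,
    and maps each projected key to the number of tuples matching it.
    """
    views = {}
    for k in range(d + 1):
        for dims in combinations(range(d), k):
            view = defaultdict(int)
            for t in tuples:
                view[tuple(t[i] for i in dims)] += 1
            views[dims] = dict(view)
    return views
```

Computing a view from its smallest already-materialized parent, as the abstract describes, avoids rescanning the base data for every one of the 2^d views.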
NASA Astrophysics Data System (ADS)
Zhang, L.; Li, Y.; Wu, Q.
2013-05-01
Integrated Project Delivery (IPD) is a newly-developed project delivery approach for construction projects, and the level of collaboration of the project management team is crucial to the success of its implementation. Existing research has shown that collaborative satisfaction is one of the key indicators of team collaboration. By reviewing the literature on team collaborative satisfaction and taking into consideration the characteristics of IPD projects, this paper summarizes the factors that influence the collaborative satisfaction of IPD project management teams. Based on these factors, this research develops a fuzzy linguistic method to effectively evaluate the level of team collaborative satisfaction, in which the authors adopt 2-tuple linguistic variables and 2-tuple linguistic hybrid average operators to enhance the objectivity and accuracy of the evaluation. The paper demonstrates the practicality and effectiveness of the method through a case study.
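The 2-tuple linguistic representation mentioned above maps a numeric aggregation result β in [0, g] (g + 1 being the number of linguistic labels) to a pair (label index i, symbolic translation α) with α in [-0.5, 0.5). The sketch below shows the conversion and a plain weighted average; it is a simplification of the hybrid average operator used in such methods, with function names of our own:

```python
import math

def to_two_tuple(beta):
    """Translate beta in [0, g] into (label index i, symbolic
    translation alpha), where alpha = beta - i lies in [-0.5, 0.5)."""
    i = math.floor(beta + 0.5)
    return i, beta - i

def two_tuple_average(betas, weights=None):
    """Weighted average of numeric assessments, returned as a 2-tuple
    so no information is lost to rounding."""
    if weights is None:
        weights = [1.0 / len(betas)] * len(betas)
    beta = sum(b * w for b, w in zip(betas, weights))
    return to_two_tuple(beta)
```

Keeping α alongside the label index is what lets these methods aggregate linguistic assessments without the information loss of rounding to the nearest label.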
Finding exact constants in a Markov model of Zipf's law generation
NASA Astrophysics Data System (ADS)
Bochkarev, V. V.; Lerner, E. Yu.; Nikiforov, A. A.; Pismenskiy, A. A.
2017-12-01
According to the classical Zipf's law, word frequency is a power function of word rank with exponent -1. The objective of this work is to find the multiplicative constant in a Markov model of word generation. Previously, the case of independent letters was rigorously investigated in [Bochkarev V V and Lerner E Yu 2017 International Journal of Mathematics and Mathematical Sciences Article ID 914374]. Unfortunately, the methods used in that paper cannot be generalized to the case of Markov chains. The search for the correct formulation of the Markov generalization of these results was performed using experiments with different ergodic matrices of transition probabilities P. A combinatorial technique allowed taking into account all words with probability greater than e^-300 in the case of 2-by-2 matrices. It was experimentally shown that the required constant in the limit equals the reciprocal of the conditional entropy of the rows of P, with weights given by the elements of the vector π of the stationary distribution of the Markov chain.
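The quantity identified above, the conditional entropy of the rows of P weighted by the stationary distribution π, can be computed directly. A sketch follows, in which the use of power iteration to obtain π is our own choice:

```python
import math

def stationary_distribution(P):
    """Stationary vector pi of a row-stochastic matrix P (pi P = pi),
    obtained by power iteration from the uniform distribution."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(1000):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def conditional_entropy(P):
    """H = -sum_i pi_i * sum_j P_ij * ln(P_ij); the constant discussed
    above is the reciprocal 1 / H."""
    pi = stationary_distribution(P)
    return -sum(pi[i] * sum(p * math.log(p) for p in P[i] if p > 0)
                for i in range(len(P)))
```

For the symmetric 2-by-2 matrix with all entries 1/2 this reduces to ln 2, the entropy of a fair coin, as expected.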
NASA Astrophysics Data System (ADS)
Young, F.; Siegel, Edward Carl-Ludwig
2011-03-01
(So-miscalled) "complexity" exhibits both inherent scale-invariance symmetry-restoring and a 1/ω (exponent 1.000...) "pink" Zipf-law power-spectrum power-law decay (Archimedes hyperbolicity inevitability). The connection between the two is the simple calculus identity that the derivative of the scale-invariance-restoring logarithm is hyperbolic: (d/dω) ln(ω) = 1/ω. Via the Noether-theorem relation of continuous symmetries to conservation laws, this corresponds to the vanishing divergence of an inter-scale 4-current. Hence (so-miscalled) "complexity" is inter-scale conservation of information, in agreement with the experimental psychology of Anderson and Mandell [Fractals of Brain/Mind, G. Stamov ed. (1994)]; that is, "complexity" is utter simplicity. It stands in contrast to complicatedness, the additive or multiplicative complications of various system specifics; complicatedness measures the deviation from this simplicity, namely the difference between the power-spectrum power-law decays of scale-invariance symmetry-breaking ("red", Pareto) and symmetry-restoring ("pink", Zipf, Archimedes hyperbolicity).
Zipf's law from scale-free geometry.
Lin, Henry W; Loeb, Abraham
2016-03-01
The spatial distribution of people exhibits clustering across a wide range of scales, from household (∼10(-2) km) to continental (∼10(4) km) scales. Empirical data indicate simple power-law scalings for the size distribution of cities (known as Zipf's law) and the population density fluctuations as a function of scale. Using techniques from random field theory and statistical physics, we show that these power laws are fundamentally a consequence of the scale-free spatial clustering of human populations and the fact that humans inhabit a two-dimensional surface. In this sense, the symmetries of scale invariance in two spatial dimensions are intimately connected to urban sociology. We test our theory by empirically measuring the power spectrum of population density fluctuations and show that the logarithmic slope α=2.04 ± 0.09, in excellent agreement with our theoretical prediction α=2. The model enables the analytic computation of many new predictions by importing the mathematical formalism of random fields.
Lexical Frequency Profiles and Zipf's Law
ERIC Educational Resources Information Center
Edwards, Roderick; Collins, Laura
2011-01-01
Laufer and Nation (1995) proposed that the Lexical Frequency Profile (LFP) can estimate the size of a second-language writer's productive vocabulary. Meara (2005) questioned the sensitivity and the reliability of LFPs for estimating vocabulary sizes, based on the results obtained from probabilistic simulations of LFPs. However, the underlying…
ERIC Educational Resources Information Center
Ivancheva, Ludmila E.
2001-01-01
Discusses the concept of the hyperbolic or skew distribution as a universal statistical law in information science and socioeconomic studies. Topics include Zipf's law; Stankov's universal law; non-Gaussian distributions; and why most bibliometric and scientometric laws reveal characters of non-Gaussian distribution. (Author/LRW)
Rescuing Computerized Testing by Breaking Zipf's Law.
ERIC Educational Resources Information Center
Wainer, Howard
2000-01-01
Suggests that because of the nonlinear relationship between item usage and item security, the problems of test security posed by continuous administration of standardized tests cannot be resolved merely by increasing the size of the item pool. Offers alternative strategies to overcome these problems, distributing test items so as to avoid the…
NASA Astrophysics Data System (ADS)
Petersson, George A.; Malick, David K.; Frisch, Michael J.; Braunstein, Matthew
2006-07-01
Examination of the convergence of full valence complete active space self-consistent-field configuration interaction including all single and double excitations (CASSCF-CISD) energies with expansion of the one-electron basis set reveals a pattern very similar to the convergence of single determinant energies. Calculations on the lowest four singlet states and the lowest four triplet states of N2 with the sequence of n-tuple-ζ augmented polarized (nZaP) basis sets (n = 2, 3, 4, 5, and 6) are used to establish the complete basis set limits. Full configuration-interaction (CI) and core electron contributions must be included for very accurate potential energy surfaces. However, a simple extrapolation scheme that has no adjustable parameters and requires nothing more demanding than CAS(10e-, 8orb)-CISD/3ZaP calculations gives the Re, ωe, ωexe, Te, and De for these eight states with rms errors of 0.0006 Å, 4.43 cm^-1, 0.35 cm^-1, 0.063 eV, and 0.018 eV, respectively.
Lin, Hao; Deng, En-Ze; Ding, Hui; Chen, Wei; Chou, Kuo-Chen
2014-01-01
The σ54 promoters are unique in prokaryotic genomes and responsible for transcribing carbon- and nitrogen-related genes. With the avalanche of genome sequences generated in the postgenomic age, it is highly desired to develop automated methods for rapidly and effectively identifying the σ54 promoters. Here, a predictor called 'iPro54-PseKNC' was developed. In the predictor, the samples of DNA sequences were formulated by a novel feature vector called 'pseudo k-tuple nucleotide composition', which was further optimized by the incremental feature selection procedure. The performance of iPro54-PseKNC was examined by rigorous jackknife cross-validation tests on a stringent benchmark data set. As a user-friendly web-server, iPro54-PseKNC is freely accessible at http://lin.uestc.edu.cn/server/iPro54-PseKNC. For the convenience of the vast majority of experimental scientists, a step-by-step protocol guide was provided on how to use the web-server to get the desired results without the need to follow the complicated mathematics that are presented in this paper just for its integrity. Meanwhile, we also discovered through an in-depth statistical analysis that the distribution of distances between the transcription start sites and the translation initiation sites is governed by the gamma distribution, which may provide a fundamental physical principle for studying the σ54 promoters.
Paradise: A Parallel Information System for EOSDIS
NASA Technical Reports Server (NTRS)
DeWitt, David
1996-01-01
The Paradise project was begun in 1993 in order to explore the application of the parallel and object-oriented database system technology developed as part of the Gamma, Exodus, and Shore projects to the design and development of a scalable, geo-spatial database system for storing both massive spatial and satellite image data sets. Paradise is based on an object-relational data model. In addition to the standard attribute types such as integers, floats, strings and time, Paradise also provides a set of spatial and multimedia data types, designed to facilitate the storage and querying of complex spatial and multimedia data sets. An individual tuple can contain any combination of this rich set of data types. For example, in the EOSDIS context, a tuple might mix terrain and map data for an area along with the latest satellite weather photo of the area. The use of a geo-spatial metaphor simplifies the task of fusing disparate forms of data from multiple data sources including text, image, map, and video data sets.
Do Young Children Have Adult-Like Syntactic Categories? Zipf's Law and the Case of the Determiner
ERIC Educational Resources Information Center
Pine, Julian M.; Freudenthal, Daniel; Krajewski, Grzegorz; Gobet, Fernand
2013-01-01
Generativist models of grammatical development assume that children have adult-like grammatical categories from the earliest observable stages, whereas constructivist models assume that children's early categories are more limited in scope. In the present paper, we test these assumptions with respect to one particular syntactic category, the…
Compression as a Universal Principle of Animal Behavior
ERIC Educational Resources Information Center
Ferrer-i-Cancho, Ramon; Hernández-Fernández, Antoni; Lusseau, David; Agoramoorthy, Govindasamy; Hsu, Minna J.; Semple, Stuart
2013-01-01
A key aim in biology and psychology is to identify fundamental principles underpinning the behavior of animals, including humans. Analyses of human language and the behavior of a range of non-human animal species have provided evidence for a common pattern underlying diverse behavioral phenomena: Words follow Zipf's law of brevity (the…
Between disorder and order: A case study of power law
NASA Astrophysics Data System (ADS)
Cao, Yong; Zhao, Youjie; Yue, Xiaoguang; Xiong, Fei; Sun, Yongke; He, Xin; Wang, Lichao
2016-08-01
Power law is an important feature of phenomena with long-memory behaviors. Zipf famously found a power law in the distribution of word frequencies. In physics, the terms order and disorder are originally concepts of thermodynamics and statistical physics, and much research has focused on the self-organization of the disordered ingredients of simple physical systems. It is interesting to ask what drives the disorder-order transition. We devise an experiment-based method using random symbolic sequences to investigate regular patterns between disorder and order. The experimental results reveal that the power law is indeed an important regularity in the transition from disorder to order. A preliminary study and analysis of these results is presented to explain the underlying reasons.
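A minimal sketch of the rank-frequency (Zipf) analysis such symbolic-sequence experiments rely on; the alphabet, sequence, and helper names here are hypothetical. A least-squares slope near -1 on the log-log rank-frequency plot indicates Zipf-like order, while a fully random sequence over a small alphabet is nearly flat.

```python
import math
import random
from collections import Counter

def rank_frequency(tokens):
    """Frequencies sorted in descending order (rank 1 = most frequent)."""
    return sorted(Counter(tokens).values(), reverse=True)

def loglog_slope(freqs):
    """Least-squares slope of log(frequency) vs log(rank); ~ -1 for Zipf."""
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# A perfectly Zipfian rank-frequency list (f ~ 1000/rank) has slope close to -1:
zipf_slope = loglog_slope([1000, 500, 333, 250])

# A fully random (disordered) symbolic sequence is nearly flat instead:
random.seed(0)
flat_slope = loglog_slope(rank_frequency(random.choices("ABCD", k=10000)))
```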
Statistical physics in foreign exchange currency and stock markets
NASA Astrophysics Data System (ADS)
Ausloos, M.
2000-09-01
Problems in economy and finance have attracted the interest of statistical physicists all over the world. Fundamental problems pertain to the existence or not of long-, medium- and/or short-range power-law correlations in various economic systems, to the presence of financial cycles, and to economic considerations, including economic policy. A method like detrended fluctuation analysis is recalled, emphasizing its value in sorting out correlation ranges and thereby leading to predictability at short horizons. The (m, k)-Zipf method is presented for sorting out short-range correlations in the sign and amplitude of the fluctuations. A well-known financial analysis technique, the so-called moving average, is shown to raise questions for physicists about fractional Brownian motion properties. Among spectacular results, the possibility of crash predictions has been demonstrated through the log-periodicity of financial index oscillations.
NASA Astrophysics Data System (ADS)
Hertz, Anaelle; Vanbever, Luc; Cerf, Nicolas J.
2018-01-01
The uncertainty relation for continuous variables due to Białynicki-Birula and Mycielski [I. Białynicki-Birula and J. Mycielski, Commun. Math. Phys. 44, 129 (1975), 10.1007/BF01608825] expresses the complementarity between two n-tuples of canonically conjugate variables (x1,x2,...,xn) and (p1,p2,...,pn) in terms of Shannon differential entropy. Here we consider the generalization to variables that are not canonically conjugate and derive an entropic uncertainty relation expressing the balance between any two n-variable Gaussian projective measurements. The bound on the entropies is expressed in terms of the determinant of a matrix of commutators between the measured variables. This uncertainty relation also captures the complementarity between any two incompatible linear canonical transforms, the bound being written in terms of the corresponding symplectic matrices in phase space. Finally, we extend this uncertainty relation to Rényi entropies and also prove a covariance-based uncertainty relation which generalizes the Robertson relation.
Parental Relationships, Autonomy, and Identity Processes of High School Students
ERIC Educational Resources Information Center
Mullis, Ronald L.; Graf, Shruti Chatterjee; Mullis, Ann K.
2009-01-01
To examine the interrelations among parental relationships, emotional autonomy, and identity statuses, the authors asked 234 (105 male, 129 female) high school students to complete the Parental Bonding Scale (G. Parker, H. Tupling, & L. B. Brown, 1979), Emotional Autonomy Scale (L. D. Steinberg & S. B. Silverberg, 1986), and Extended Objective…
Tool independence for the Web Accessibility Quantitative Metric.
Vigo, Markel; Brajnik, Giorgio; Arrue, Myriam; Abascal, Julio
2009-07-01
The Web Accessibility Quantitative Metric (WAQM) aims at accurately measuring the accessibility of web pages. One of the main features of WAQM among others is that it is evaluation tool independent for ranking and accessibility monitoring scenarios. This article proposes a method to attain evaluation tool independence for all foreseeable scenarios. After demonstrating that homepages have a more similar error profile than any other web page in a given web site, 15 homepages were measured with 10,000 different values of WAQM parameters using EvalAccess and LIFT, two automatic evaluation tools for accessibility. A similar procedure was followed with random pages and with several test files obtaining several tuples that minimise the difference between both tools. One thousand four hundred forty-nine web pages from 15 web sites were measured with these tuples and those values that minimised the difference between the tools were selected. Once the WAQM was tuned, the accessibility of 15 web sites was measured with two metrics for web sites, concluding that even if similar values can be produced, obtaining the same scores is undesirable since evaluation tools behave in a different way.
Randomness versus specifics for word-frequency distributions
NASA Astrophysics Data System (ADS)
Yan, Xiaoyong; Minnhagen, Petter
2016-02-01
The text-length dependence of real word-frequency distributions can be connected to the general properties of a random book. It is pointed out that this finding has strong implications when deciding between two conceptually different views on word-frequency distributions, i.e. the specific 'Zipf's-view' and the non-specific 'Randomness-view', as is discussed. It is also noted that the text-length transformation of a random book has an exact scaling property precisely for the power-law index γ = 1, as opposed to the Zipf's exponent γ = 2, and the implication of this exact scaling property is discussed. However, a real text has γ > 1, and as a consequence γ increases when a real text is shortened. The connections to the predictions from RGF (Random Group Formation) and to the infinite-length limit of a meta-book are also discussed. The difference between 'curve-fitting' and 'predicting' word-frequency distributions is stressed. It is pointed out that the question of randomness versus specifics for the distribution of outcomes in the case of sufficiently complex systems has a much wider relevance than just the word-frequency example analyzed in the present work.
Multiscale volatility duration characteristics on financial multi-continuum percolation dynamics
NASA Astrophysics Data System (ADS)
Wang, Min; Wang, Jun
A random stock price model based on the multi-continuum percolation system is developed to investigate the nonlinear dynamics of stock price volatility duration, in an attempt to explain various statistical facts found in financial data and to gain a deeper understanding of mechanisms in the financial market. The continuum percolation system, usually referred to as a random coverage process or a Boolean model, is a member of a class of statistical physics systems. In this paper, multi-continuum percolation (with different values of the radius) is employed to model and reproduce the dispersal of information among investors. To test the validity of the proposed model, nonlinear analyses of return volatility duration series are performed by multifractal detrending moving average analysis and Zipf analysis. The comparative empirical results indicate similar nonlinear behaviors for the proposed model and the actual Chinese stock market.
Architecture for interoperable software in biology.
Bare, James Christopher; Baliga, Nitin S
2014-07-01
Understanding biological complexity demands a combination of high-throughput data and interdisciplinary skills. One way to bring to bear the necessary combination of data types and expertise is by encapsulating domain knowledge in software and composing that software to create a customized data analysis environment. To this end, simple flexible strategies are needed for interconnecting heterogeneous software tools and enabling data exchange between them. Drawing on our own work and that of others, we present several strategies for interoperability and their consequences, in particular, a set of simple data structures--list, matrix, network, table and tuple--that have proven sufficient to achieve a high degree of interoperability. We provide a few guidelines for the development of future software that will function as part of an interoperable community of software tools for biological data analysis and visualization. © The Author 2012. Published by Oxford University Press.
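As a hypothetical sketch (not the paper's actual API) of how one of those simple structures, a table of tuples with named columns, supports composition of heterogeneous tools:

```python
from typing import List, NamedTuple

class Table(NamedTuple):
    """A minimal 'table': named columns plus rows of plain tuples,
    easy to serialize and exchange between independent tools."""
    columns: List[str]
    rows: List[tuple]

def select(table: Table, column: str, predicate) -> Table:
    """Keep only the rows whose value in `column` satisfies `predicate`."""
    idx = table.columns.index(column)
    return Table(table.columns, [r for r in table.rows if predicate(r[idx])])

# Hypothetical example data: gene names with expression values.
genes = Table(columns=["gene", "expression"],
              rows=[("zipA", 2.4), ("ftsZ", 0.7), ("recA", 3.1)])
high = select(genes, "expression", lambda x: x > 1.0)
```

Because the structure is just names plus tuples, any tool that can read lists of tuples can participate, which is the interoperability argument the abstract makes.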
1994-03-16
2.10 Decidability … 3 Declaring Refinements of Recursive Data Types … However, when we introduce polymorphic constructors in Chapter 5, tuples will become a polymorphic data type very similar to other polymorphic data types … Chapter 3, Introduction: The previous chapter defined refinement type inference in terms of
High Performance Active Database Management on a Shared-Nothing Parallel Processor
1998-05-01
…either stored or virtual. A stored node is like a materialized view: it actually contains the specified tuples. A virtual node is like a real view…
Modeling stock price dynamics by continuum percolation system and relevant complex systems analysis
NASA Astrophysics Data System (ADS)
Xiao, Di; Wang, Jun
2012-10-01
The continuum percolation system is developed to model a random stock price process in this work. Recent empirical research has demonstrated various statistical features of stock price changes; a financial model aiming at understanding price fluctuations needs to define a mechanism for price formation, in an attempt to reproduce and explain this set of empirical facts. The continuum percolation model is usually referred to as a random coverage process or a Boolean model; the local interaction or influence among traders is constructed by continuum percolation, and a cluster of the continuum percolation is applied to define the cluster of traders sharing the same opinion about the market. We investigate and analyze the statistical behaviors of normalized returns of the price model by several analysis methods, including power-law tail distribution analysis, chaotic behavior analysis and Zipf analysis. Moreover, we consider the daily returns of the Shanghai Stock Exchange Composite Index from January 1997 to July 2011, and comparisons of return behaviors between the actual data and the simulation data are exhibited.
Weak vector boson production with many jets at the LHC √s = 13 TeV
NASA Astrophysics Data System (ADS)
Anger, F. R.; Febres Cordero, F.; Höche, S.; Maître, D.
2018-05-01
Signatures with an electroweak vector boson and many jets play a crucial role at the Large Hadron Collider, both in the measurement of Standard Model parameters and in searches for new physics. Precise predictions for these multiscale processes are therefore indispensable. We present next-to-leading-order QCD predictions for W±/Z + jets at √s = 13 TeV, including up to five/four jets in the final state. All production channels are included, and leptonic decays of the vector bosons are considered at the amplitude level. We assess theoretical uncertainties arising from renormalization- and factorization-scale dependence by considering fixed-order dynamical scales based on the HT variable as well as on the MiNLO procedure. We also explore uncertainties associated with different choices of parton-distribution functions. We provide event samples that can be explored through publicly available n-tuple sets, generated with BlackHat in combination with Sherpa.
Statistics of Language Morphology Change: From Biconsonantal Hunters to Triconsonantal Farmers
Agmon, Noam; Bloch, Yigal
2013-01-01
Linguistic evolution mirrors cultural evolution, of which one of the most decisive steps was the "agricultural revolution" that occurred 11,000 years ago in W. Asia. Traditional comparative historical linguistics becomes inaccurate for time depths greater than, say, 10 kyr. Therefore it is difficult to determine whether decisive events in human prehistory have had an observable impact on human language. Here we supplement the traditional methodology with independent statistical measures showing that following the transition to agriculture, languages of W. Asia underwent a transition from biconsonantal (2c) to triconsonantal (3c) morphology. Two independent proofs for this are provided. Firstly the reconstructed Proto-Semitic fire and hunting lexicons are predominantly 2c, whereas the farming lexicon is almost exclusively 3c in structure. Secondly, while Biblical verbs show the usual Zipf exponent of about 1, their 2c subset exhibits a larger exponent. After the 2c > 3c transition, this could arise from a faster decay in the frequency of use of the less common 2c verbs. Using an established frequency-dependent word replacement rate, we calculate that the observed increase in the Zipf exponent has occurred over the 7,500 years predating Biblical Hebrew namely, starting with the transition to agriculture. PMID:24367613
Understanding scaling through history-dependent processes with collapsing sample space.
Corominas-Murtra, Bernat; Hanel, Rudolf; Thurner, Stefan
2015-04-28
History-dependent processes are ubiquitous in natural and social systems. Many such stochastic processes, especially those that are associated with complex systems, become more constrained as they unfold, meaning that their sample space, or their set of possible outcomes, reduces as they age. We demonstrate that these sample-space-reducing (SSR) processes necessarily lead to Zipf's law in the rank distributions of their outcomes. We show that by adding noise to SSR processes the corresponding rank distributions remain exact power laws, p(x) ~ x^(-λ), where the exponent directly corresponds to the mixing ratio of the SSR process and noise. This allows us to give a precise meaning to the scaling exponent in terms of the degree to which a given process reduces its sample space as it unfolds. Noisy SSR processes further allow us to explain a wide range of scaling exponents in frequency distributions ranging from α = 2 to ∞. We discuss several applications showing how SSR processes can be used to understand Zipf's law in word frequencies, and how they are related to diffusion processes in directed networks, or to aging processes such as fragmentation. SSR processes provide a new alternative to understanding the origin of scaling in complex systems without recourse to multiplicative, preferential, or self-organized critical processes.
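The SSR mechanism is easy to simulate. In the following sketch (names and parameters hypothetical), each run starts at state N and repeatedly jumps to a uniformly chosen strictly smaller state until it reaches 1; the visit frequency of state x then falls off as 1/x, i.e. Zipf's law:

```python
import random
from collections import Counter

def ssr_run(n):
    """One sample-space-reducing run: from state n, jump to a uniformly
    chosen strictly smaller state until state 1 is reached; return visits."""
    visits, state = [], n
    while state > 1:
        state = random.randint(1, state - 1)
        visits.append(state)
    return visits

random.seed(1)
counts = Counter()
for _ in range(20000):
    counts.update(ssr_run(100))
# Visit probabilities follow Zipf's law p(x) ~ 1/x: state 1 is reached in
# every run, state 2 in about half of them, state 3 in about a third, ...
```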
Beyond Word Frequency: Bursts, Lulls, and Scaling in the Temporal Distributions of Words
Altmann, Eduardo G.; Pierrehumbert, Janet B.; Motter, Adilson E.
2009-01-01
Background Zipf's discovery that word frequency distributions obey a power law established parallels between biological and physical processes, and language, laying the groundwork for a complex systems perspective on human communication. More recent research has also identified scaling regularities in the dynamics underlying the successive occurrences of events, suggesting the possibility of similar findings for language as well. Methodology/Principal Findings By considering frequent words in USENET discussion groups and in disparate databases where the language has different levels of formality, here we show that the distributions of distances between successive occurrences of the same word display bursty deviations from a Poisson process and are well characterized by a stretched exponential (Weibull) scaling. The extent of this deviation depends strongly on semantic type – a measure of the logicality of each word – and less strongly on frequency. We develop a generative model of this behavior that fully determines the dynamics of word usage. Conclusions/Significance Recurrence patterns of words are well described by a stretched exponential distribution of recurrence times, an empirical scaling that cannot be anticipated from Zipf's law. Because the use of words provides a uniquely precise and powerful lens on human thought and activity, our findings also have implications for other overt manifestations of collective human dynamics. PMID:19907645
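A minimal, hypothetical sketch of the first step of such an analysis: extracting the distances between successive occurrences of a word, to which a stretched-exponential (Weibull) tail could then be fitted.

```python
def recurrence_distances(tokens, word):
    """Distances between successive occurrences of `word` in a token list."""
    positions = [i for i, t in enumerate(tokens) if t == word]
    return [b - a for a, b in zip(positions, positions[1:])]

# Under a Poisson (memoryless) model these distances are geometrically
# distributed; bursty real usage shows an excess of short and long gaps,
# captured by a stretched-exponential tail P(d > x) ~ exp(-(x/s)**beta), beta < 1.
text = "the cat sat on the mat and the dog saw the cat".split()
dists = recurrence_distances(text, "the")   # occurrences at positions 0, 4, 7, 10
```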
NASA Astrophysics Data System (ADS)
Young, F.; Siegel, E.
2010-03-01
(so MIScalled) ``complexity''(sMc) associated BOTH SCALE- INVARIANCE Symmetry-RESTORING(S-I S-R) [vs. S-I S-B!!!], AND X (w) P(w ) 1/w^(1.000...) ``pink''/Zipf/Archimedes-HYPERBOLICITY INEVITABILITY CONNECTION is by simple-calculus SISR's logarithm- function derivative: (d/dw)ln(w)=1/w=1/w^(1.000...), hence: (d/dw) [SISR](w)=1/w=1/w^(1.000...)=(via Noether-theorem relating continuous-(SISR)-symmetries to conservation-laws)=(d/dw)[4-DIV (J(INTER-SCALE)=0](w)=1/w =1/w^(1.000...). Hence sMc is information inter-scale conservation [as Anderson-Mandell, Fractals of Brain; Fractals of Mind(1994)-experimental- psychology!!!], i.e. sMciUS!!!, VERSUS ``COMPLICATEDNESS", is sMcciUS!!!: EITHER: PLUS (Additive: Murphy's-law absence) OR TIMES (Multiplicative: Murphy's-law dominance) various disparate system-specificity ``COMPLICATIONS". ``COMPLICATEDNESS" MEASURES: DEVIATIONS FROM sMciUS!!!: EITHER [S-I S-B] MINUS [S- I S-R] AND/OR [``red"/Pareto X(w) P(w) 1/w^(#=/=1.000...)] MINUS [X(w) P(w) 1/w^(1.000...) ``pink"/Zipf/Archimedes-HYPERBOLICITY INEVITABILITY] = [1/w^(#=/=1.000...)] MINUS [1/w^(1.000...)]; almost but not exactly a fractals Hurst-exponent-like [# - 1.000...]!!!
FastSim: A Fast Simulation for the SuperB Detector
NASA Astrophysics Data System (ADS)
Andreassen, R.; Arnaud, N.; Brown, D. N.; Burmistrov, L.; Carlson, J.; Cheng, C.-h.; Di Simone, A.; Gaponenko, I.; Manoni, E.; Perez, A.; Rama, M.; Roberts, D.; Rotondo, M.; Simi, G.; Sokoloff, M.; Suzuki, A.; Walsh, J.
2011-12-01
We have developed a parameterized (fast) simulation for detector optimization and physics reach studies of the proposed SuperB Flavor Factory in Italy. Detector components are modeled as thin sections of planes, cylinders, disks or cones. Particle-material interactions are modeled using simplified cross-sections and formulas. Active detectors are modeled using parameterized response functions. Geometry and response parameters are configured using xml files with a custom-designed schema. Reconstruction algorithms adapted from BaBar are used to build tracks and clusters. Multiple sources of background signals can be merged with primary signals. Pattern recognition errors are modeled statistically by randomly misassigning nearby tracking hits. Standard BaBar analysis tuples are used as the event output. Hadronic B meson pair events can be simulated at roughly 10 Hz.
Evens, Nicholas P; Buchner, Peter; Williams, Lorraine E; Hawkesford, Malcolm J
2017-10-01
Understanding the molecular basis of zinc (Zn) uptake and transport in staple cereal crops is critical for improving both Zn content and tolerance to low-Zn soils. This study demonstrates the importance of group F bZIP transcription factors and ZIP transporters in responses to Zn deficiency in wheat (Triticum aestivum). Seven group F TabZIP genes and 14 ZIPs with homeologs were identified in hexaploid wheat. Promoter analysis revealed the presence of Zn-deficiency-response elements (ZDREs) in a number of the ZIPs. Functional complementation of the zrt1/zrt2 yeast mutant by TaZIP3, -6, -7, -9 and -13 supported an ability to transport Zn. Group F TabZIPs contain the group-defining cysteine-histidine-rich motifs, which are the predicted binding site of Zn2+ in the Zn-deficiency response. Conservation of these motifs varied between the TabZIPs suggesting that individual TabZIPs may have specific roles in the wheat Zn-homeostatic network. Increased expression in response to low Zn levels was observed for several of the wheat ZIPs and bZIPs; this varied temporally and spatially suggesting specific functions in the response mechanism. The ability of the group F TabZIPs to bind to specific ZDREs in the promoters of TaZIPs indicates a conserved mechanism in monocots and dicots in responding to Zn deficiency. In support of this, TabZIPF1-7DL and TabZIPF4-7AL afforded a strong level of rescue to the Arabidopsis hypersensitive bzip19 bzip23 double mutant under Zn deficiency. These results provide a greater understanding of Zn-homeostatic mechanisms in wheat, demonstrating an expanded repertoire of group F bZIP transcription factors, adding to the complexity of Zn homeostasis. © 2017 The Authors The Plant Journal published by John Wiley & Sons Ltd and Society for Experimental Biology.
Analysis of mass incident diffusion in Weibo based on self-organization theory
NASA Astrophysics Data System (ADS)
Pan, Jun; Shen, Huizhang
2018-02-01
This study introduces theories and methods of self-organization systems to the research on the diffusion mechanism of mass incidents in Weibo (Chinese Twitter). Based on the analysis of massive Weibo data from the Songjiang battery factory incident of 2013 and the Jiangsu Qidong OJI Paper incident of 2012, we find that the diffusion system of mass incidents in Weibo satisfies a power law, Zipf's law, 1/f noise, and self-similarity. This means the system is a self-organized criticality system, and dissemination bursts can be understood as a kind of self-organization behavior. As a consequence, self-organized criticality (SOC) theory can be used to explain the evolution of mass incident diffusion, and the right strategy to control such diffusion may be devised by handling the key ingredients of self-organization well. Such a study is of practical importance and can offer policy makers opportunities to manage these events well.
There is More than a Power Law in Zipf
Cristelli, Matthieu; Batty, Michael; Pietronero, Luciano
2012-01-01
The largest cities, the most frequently used words, the income of the richest countries, and the most wealthy billionaires, can be all described in terms of Zipf’s Law, a rank-size rule capturing the relation between the frequency of a set of objects or events and their size. It is assumed to be one of many manifestations of an underlying power law like Pareto’s or Benford’s, but contrary to popular belief, from a distribution of, say, city sizes and a simple random sampling, one does not obtain Zipf’s law for the largest cities. This pathology is reflected in the fact that Zipf’s Law has a functional form depending on the number of events N. This requires a fundamental property of the sample distribution which we call ‘coherence’ and it corresponds to a ‘screening’ between various elements of the set. We show how it should be accounted for when fitting Zipf’s Law. PMID:23139862
Parallelizing Data-Centric Programs
2013-09-25
…results than current techniques, such as ImageWebs [HGO+10], given the same budget of matches performed. 4.2 Scalable Parallel Similarity Search: The work…algorithms. 5 Data-Driven Applications in the Cloud: In this project, we investigated what happens when data-centric software is moved from expensive custom…returns appropriate answer tuples. Figure 9(b) shows the mutual constraint satisfaction that takes place in answering for 122. The intent is that
Knowledge Base Refinement as Improving an Incorrect and Incomplete Domain Theory
1990-04-01
…(Ginsberg et al., 1985), and RL (Fu and Buchanan, 1985), which perform empirical induction over a library of test cases. This chapter describes a new…state knowledge. Examples of high-level goals are: to test a hypothesis, to differentiate between several plausible hypotheses, to ask a clarifying…one tuple when we… (figure: strategy metarules for the goals Group Hypotheses, Test Hypothesis, Applyrule, and Findout)
Optimization of Extended Relational Database Systems
1986-07-23
…control functions are integrated into a single system in a homogeneous way. As a first example, consider previous work in supporting various semantic…sizes are reduced and, consequently, the number of materializations that will be needed is also lower. For example, in the above query…retrieve (EMP.name) where EMP.hobbies.instrument = 'violin'. When the various entries in the hobbies field are materialized, only those queries that
Extending Phrase-Based Decoding with a Dependency-Based Reordering Model
2009-11-01
strictly within the confines of phrase-based translation. The hope was to introduce an approach that could take advantage of monolingual syntactic...tuple represents one element of the XML markup, where element is the name of this element, attributes is a dictionary (mapping strings to strings...representing the range of possible compressions, in the form of a dictionary mapping the latter to the former. To represent multiple dependency
Analytical solution of a stochastic model of risk spreading with global coupling
NASA Astrophysics Data System (ADS)
Morita, Satoru; Yoshimura, Jin
2013-11-01
We study a stochastic matrix model to understand the mechanics of risk spreading (or bet hedging) by dispersion. Up to now, this model has been mostly dealt with numerically, except for the well-mixed case. Here, we present an analytical result that shows that optimal dispersion leads to Zipf's law. Moreover, we found that the arithmetic ensemble average of the total growth rate converges to the geometric one, because the sample size is finite.
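A hedged numerical sketch (rates and parameters hypothetical) of the contrast the abstract draws between the arithmetic and geometric ensemble averages of a multiplicative total growth rate:

```python
import math
import random

def total_growth(steps, rates, rng):
    """Total multiplicative growth over `steps` i.i.d. random growth rates."""
    g = 1.0
    for _ in range(steps):
        g *= rng.choice(rates)
    return g

rng = random.Random(2)
rates = [0.6, 1.5]       # hypothetical bad-year / good-year growth rates
samples = [total_growth(100, rates, rng) for _ in range(2000)]
arith = sum(samples) / len(samples)
geom = math.exp(sum(math.log(s) for s in samples) / len(samples))
# The arithmetic ensemble average is dominated by rare lucky trajectories,
# while the geometric average reflects the typical long-run fate; here the
# per-step geometric mean sqrt(0.6 * 1.5) < 1, so geom collapses toward 0.
```

Risk spreading by dispersion matters precisely because the long-run fate of a finite population tracks the geometric, not the arithmetic, average.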
Zhu, Zihang; Zhao, Shanghong; Zheng, Wanze; Wang, Wei; Lin, Baoqin
2015-11-10
A novel frequency-12-tupling optical millimeter-wave (mm-wave) generation scheme using two cascaded dual-parallel Mach-Zehnder modulators (DP-MZMs) without an optical filter is proposed and demonstrated by computer simulation. By properly adjusting the amplitude and phase of the radio frequency (RF) driving signal and the direct current (DC) bias points of the two DP-MZMs, a 120 GHz mm-wave with an optical sideband suppression ratio (OSSR) of 25.1 dB and a radio frequency spurious suppression ratio (RFSSR) of 19.1 dB is shown to be generated from a 10 GHz RF driving signal, which largely reduces the required response frequency of the electronic devices. Furthermore, it is shown that even if the phase difference of the RF driving signals, the RF driving voltage, and the DC bias voltage deviate from the ideal values to a certain degree, the performance remains acceptable. Since no optical filter is employed to suppress the undesired optical sidebands, a high-spectral-purity mm-wave signal tunable from 48 to 216 GHz can be obtained theoretically when an RF driving signal from 4 to 18 GHz is applied to the DP-MZMs, and the system can be readily implemented in wavelength-division-multiplexing upconversion systems to provide a high-quality optical local oscillator signal.
Scientific Production about the Adherence to Antiretroviral Therapy
de Oliveira, Regina Célia; de Andrade Moraes, Danielle Chianca; Santos, Cleytiane Stephany Silva; da Silva Monteiro, Gicely Regina Sobral; da Rocha Cabral, Juliana; Beltrão, Roberta Andrade; da Silva, Calos Roberto Lyra
2017-01-01
Objective To identify the elite of authors on the subject of adherence to antiretroviral therapy; to identify the journals devoted to publishing articles about adherence to antiretroviral therapy; and to identify and analyze the most commonly used words in abstracts of articles about adherence to antiretroviral therapy. Method A bibliometric study conducted through the Scopus database. We used articles published between 1996 and 2014; after application of the eligibility criteria, the sample comprised 24 articles. The data were analyzed descriptively, using the bibliometric laws (Lotka, Bradford and Zipf) and a conceptual cloud map of words built with the CmapTools program. Results Lotka's Law identified the 5 most productive authors (46% of the total published). Bradford's Law was impaired in this study. Concerning Zipf's Law, 3 zones were determined, with 31.47% of the words in the first zone, 26.46% in the second, and 42.06% in the third. In the conceptual map, the words/factors that positively and negatively influence adherence were emphasized, among them the need for more research in the health services. Conclusion There are few publications about adherence to antiretroviral therapy, and the scientific production is still in the process of maturation. One can infer that the researched theme is not yet obsolete. It should be noted that bibliometrics was a relevant statistical tool for generating information about publications on antiretroviral therapy. PMID:28979571
Automorphic Forms and Mock Modular Forms in String Theory
NASA Astrophysics Data System (ADS)
Nazaroglu, Caner
We study a variety of modular invariant objects in relation to string theory. First, we focus on Jacobi forms over generic-rank lattices and Siegel forms that appear in N = 2, D = 4 compactifications of the heterotic string with Wilson lines. Constraints from the low-energy spectrum and modularity are employed to deduce the relevant supersymmetric partition functions entirely. This procedure is applied to models that lead to Jacobi forms of index 3, 4, 5 as well as Jacobi forms over the root lattices A2 and A3. These computations are then checked against an explicit orbifold model which can be Higgsed to the models in question. Models with a single Wilson line are then studied in detail, with their relation to the paramodular group Γm as T-duality group made explicit. These results on the heterotic string side are then turned into predictions for geometric invariants using Type II-heterotic duality. Secondly, we study theta functions for indefinite lattices of generic signature. Building on results in the literature for signature (n-1,1) and (n-2,2) lattices, we work out the properties of generalized error functions which we call r-tuple error functions. We then use these functions to build such indefinite theta functions and describe their modular completions.
Complexity multiscale asynchrony measure and behavior for interacting financial dynamics
NASA Astrophysics Data System (ADS)
Yang, Ge; Wang, Jun; Niu, Hongli
2016-08-01
A stochastic financial price process is proposed and investigated by the finite-range multitype contact dynamical system, in an attempt to study the nonlinear behaviors of real asset markets. A virus-spreading process in a finite-range multitype system is used to imitate the interacting behaviors of diverse investment attitudes in a financial market, and empirical research on descriptive statistics and autocorrelation behaviors of return time series is performed for different values of the propagation rates. Then multiscale entropy analysis is adopted to study several different shuffled return series, including the original return series, the corresponding reversal series, the random shuffled series, the volatility shuffled series and the Zipf-type shuffled series. Furthermore, we propose and compare the multiscale cross-sample entropy and its modified algorithm, composite multiscale cross-sample entropy. We apply them to study the asynchrony of pairs of time series under different time scales.
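The building block of multiscale entropy is the sample entropy statistic SampEn(m, r). A minimal pure-Python sketch follows; the function name, parameter defaults and the toy series are illustrative choices, not taken from the paper:

```python
import math

def sample_entropy(series, m=2, r=0.2):
    """Sample entropy SampEn(m, r) = -ln(A/B), where B counts pairs of
    matching templates of length m and A pairs of length m+1 (Chebyshev
    distance <= r, self-matches excluded)."""
    n = len(series)

    def count_matches(length):
        templates = [series[i:i + length] for i in range(n - length + 1)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined for too-short or too-irregular series
    return -math.log(a / b)

# A strictly periodic series is maximally regular, so its sample entropy
# is close to 0 (boundary effects keep it slightly above 0).
periodic = [0.0, 1.0] * 50
print(sample_entropy(periodic, m=2, r=0.1))
```

Multiscale entropy then repeats this computation on coarse-grained (block-averaged) copies of the series, one per time scale.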
Dependence of exponents on text length versus finite-size scaling for word-frequency distributions
NASA Astrophysics Data System (ADS)
Corral, Álvaro; Font-Clos, Francesc
2017-08-01
Some authors have recently argued that a finite-size scaling law for the text-length dependence of word-frequency distributions cannot be conceptually valid. Here we give solid quantitative evidence for the validity of this scaling law, using both careful statistical tests and analytical arguments based on the generalized central-limit theorem applied to the moments of the distribution (and obtaining a novel derivation of Heaps' law as a by-product). We also find that the picture of word-frequency distributions with power-law exponents that decrease with text length [X. Yan and P. Minnhagen, Physica A 444, 828 (2016), 10.1016/j.physa.2015.10.082] does not stand with rigorous statistical analysis. Instead, we show that the distributions are perfectly described by power-law tails with stable exponents, whose values are close to 2, in agreement with the classical Zipf's law. Some misconceptions about scaling are also clarified.
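The rank-frequency side of Zipf's law referred to above can be checked numerically with a log-log regression. The sketch below builds synthetic Zipfian word frequencies (an illustrative construction, not the authors' corpora) and recovers the rank-domain exponent:

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

# Synthetic Zipfian frequencies: the r-th most frequent word occurs
# about C / r times (rank-domain exponent 1, which corresponds to a
# frequency-distribution power-law exponent of 2).
C = 100000
freqs = [max(1, round(C / r)) for r in range(1, 2001)]
ranks = list(range(1, len(freqs) + 1))
print(round(loglog_slope(ranks, freqs), 2))  # ≈ -1.0
```

Note that a least-squares fit on log-log axes is only a rough diagnostic; the paper's point is precisely that rigorous statistical tests (e.g. maximum likelihood on the distribution tail) are needed to settle exponent stability.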
A method for determining customer requirement weights based on TFMF and TLR
NASA Astrophysics Data System (ADS)
Ai, Qingsong; Shu, Ting; Liu, Quan; Zhou, Zude; Xiao, Zheng
2013-11-01
'Customer requirements' (CRs) management plays an important role in enterprise systems (ESs) by processing customer-focused information. Quality function deployment (QFD) is one of the main CRs analysis methods. Because CR weights are crucial for the input of QFD, we developed a method for determining CR weights based on trapezoidal fuzzy membership function (TFMF) and 2-tuple linguistic representation (TLR). To improve the accuracy of CR weights, we propose to apply TFMF to describe CR weights so that they can be appropriately represented. Because the fuzzy logic is not capable of aggregating information without loss, TLR model is adopted as well. We first describe the basic concepts of TFMF and TLR and then introduce an approach to compute CR weights. Finally, an example is provided to explain and verify the proposed method.
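The two ingredients named in the abstract can be sketched compactly. Below is a minimal illustration of a trapezoidal membership function and of the 2-tuple linguistic conversion (label index plus symbolic translation); the parameter values and label set are invented for the example, not taken from the paper:

```python
def trapezoidal_mf(x, a, b, c, d):
    """Trapezoidal fuzzy membership: 0 outside [a, d], rising linearly on
    [a, b], equal to 1 on [b, c], and falling linearly on [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

def to_two_tuple(beta, labels):
    """2-tuple linguistic representation: round a score beta in [0, g]
    to the nearest label index i and keep the symbolic translation
    alpha = beta - i in [-0.5, 0.5)."""
    i = min(len(labels) - 1, max(0, round(beta)))
    return labels[i], beta - i

# Hypothetical "high importance" trapezoid and a 5-label linguistic scale.
print(trapezoidal_mf(0.75, 0.6, 0.7, 0.8, 0.9))          # 1.0
print(to_two_tuple(2.3, ["VL", "L", "M", "H", "VH"])[0])  # 'M'
```

The 2-tuple form is what lets aggregated linguistic scores be carried without rounding loss, which is the motivation the abstract gives for adopting TLR.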
Rank-frequency distributions of Romanian words
NASA Astrophysics Data System (ADS)
Cocioceanu, Adrian; Raportaru, Carina Mihaela; Nicolin, Alexandru I.; Jakimovski, Dragan
2017-12-01
The calibration of voice biometrics solutions requires detailed analyses of spoken texts, and in this context we investigate by computational means the rank-frequency distributions of Romanian words and word series to determine the most common words and word series of the language. To this end, we have constructed a corpus of approximately 2.5 million words and then determined that the rank-frequency distributions of Romanian words, as well as of series of two and three consecutive words, obey the celebrated Zipf law.
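The rank-frequency tables described here are straightforward to compute. A minimal sketch for words and word n-grams (the helper name and toy sentence are illustrative, not from the study's 2.5-million-word corpus):

```python
from collections import Counter

def rank_frequency(text, n=1):
    """Rank-frequency table of word n-grams (n=1 gives single words,
    n=2 pairs of consecutive words, ...), sorted by descending count."""
    words = text.lower().split()
    grams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return Counter(grams).most_common()

text = "the cat sat on the mat and the dog sat on the rug"
print(rank_frequency(text)[0])      # ('the', 4)
print(rank_frequency(text, 2)[0])   # most frequent word pair, count 2
```

Plotting count against rank on log-log axes for a real corpus is then the standard way to check the Zipf law visually.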
Storage and Database Management for Big Data
2015-07-27
and value ), each cell is actually a seven tuple where the column is broken into three parts, and there is an additional field for a timestamp as seen...questions require a careful understanding of the technology field in addition to the types of problems that are being solved. This chapter aims to address...formats such as comma separated values (CSV), JavaScript Object Notation (JSON) [21], or other proprietary sensor formats. Most often, this raw data
System Engineering and Evolution Decision Support
2001-09-30
collection of Tuple Space model was first conceived in the mid- 1980 at objects that isolates the requestor of services from the Yale University by...Academic Press, NY, 1980 , pp. 325-347. [2] V. Berzins, M. Shing, Luqi, M. Saluto and J. Williams, Re-engineering the Janus(A) Combat Simulation System...Naval Postgraduate School Monterey, CA 93943-5100 3. Research Office, Code 09 Naval Postgraduate School Monterey, CA 93943-5000 4. Dr. David Hislop U.S
2007-06-01
particle accelerators cannot run unless enough network band- width is available to absorb their data streams. DOE scientists running simulations routinely...send tuples to TelegraphCQ. To simulate a less-powerful machine, I increased the playback rate of the trace by a factor of 10 and reduced the query...III CPUs and 1.5 GB of main memory. To simulate using a less powerful embedded CPU, I wrote a program that would “play back” the trace at a multiple
NASA Technical Reports Server (NTRS)
Haralick, R. M.; Kanemasu, E. T.; Morain, S. A.; Yarger, H. L.; Ulaby, F. T.; Davis, J. C. (Principal Investigator); Bosley, R. J.; Williams, D. L.; Mccauley, J. R.; Mcnaughton, J. L.
1973-01-01
The author has identified the following significant results. Improvement in the land use classification accuracy of ERTS-1 MSS multi-images over Kansas can be made using two distances between neighboring grey tone N-tuples instead of one distance. Much more information is contained texturally than spectrally on the Kansas image. Ground truth measurements indicate that reflectance ratios of the 545 and 655 nm wavebands provide an index of plant development and possibly physiological stress. Preliminary analysis of MSS 4 and 5 channels substantiates the ground truth interpretation. Results of the land use mapping experiment indicate that ERTS-1 imagery has major potential in regionalization. The ways in which land is utilized within these regions may then be studied more effectively than if no adequate regionalization is available. A model for estimating wheat yield per acre has been applied to acreage estimates derived from ERTS-1 imagery to project the 1973 wheat yields for a ten county area in southwest Kansas. The results are within 3% of the preharvest estimates for the same area prepared by the USDA. Visual identification of winter wheat is readily achieved by using a temporal sequence of images. Identification can be improved by stratifying the project area into subregions having more or less homogeneous agricultural practices and crop mixes.
Optical implementation of inner product neural associative memory
NASA Technical Reports Server (NTRS)
Liu, Hua-Kuang (Inventor)
1995-01-01
An optical implementation of an inner-product neural associative memory is realized with a first spatial light modulator for entering an initial two-dimensional N-tuple vector and for entering a thresholded output vector image after each iteration until convergence is reached, and a second spatial light modulator for entering M weighted vectors of inner-product scalars multiplied with each of the M stored vectors, where the inner-product scalars are produced by multiplication of the initial input vector in the first iterative cycle (and thresholded vectors in subsequent iterative cycles) with each of the M stored vectors, and the weighted vectors are produced by multiplication of the scalars with corresponding ones of the stored vectors. A Hughes liquid crystal light valve is used for the dual function of summing the weighted vectors and thresholding the sum vector. The thresholded vector is then entered through the first spatial light modulator for reiteration of the process cycle until convergence is reached.
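The optical recall cycle described above has a simple numerical analogue: weight each stored vector by its inner product with the input, sum the weighted vectors, threshold, and iterate. The sketch below is a digital illustration of that loop (the function names and toy vectors are invented; they do not model the optical hardware):

```python
def recall(x, stored):
    """One iteration of inner-product associative recall: weight each
    stored vector by its inner product with the input, sum the weighted
    vectors, and threshold back to a bipolar (+1/-1) vector."""
    n = len(x)
    summed = [0.0] * n
    for v in stored:
        s = sum(a * b for a, b in zip(x, v))  # inner-product scalar
        for i in range(n):
            summed[i] += s * v[i]             # weighted stored vector
    return [1 if c >= 0 else -1 for c in summed]

def converge(x, stored, max_iters=10):
    """Iterate the recall cycle until the output vector stops changing."""
    for _ in range(max_iters):
        nxt = recall(x, stored)
        if nxt == x:
            return x
        x = nxt
    return x

stored = [[1, 1, 1, -1, -1, -1], [1, -1, 1, -1, 1, -1]]
noisy = [1, 1, 1, -1, -1, 1]   # stored[0] with the last bit flipped
print(converge(noisy, stored))  # recovers [1, 1, 1, -1, -1, -1]
```

In the patent's optical realization, the summation and thresholding steps are performed by the liquid crystal light valve rather than in software.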
Similarity of Symbol Frequency Distributions with Heavy Tails
NASA Astrophysics Data System (ADS)
Gerlach, Martin; Font-Clos, Francesc; Altmann, Eduardo G.
2016-04-01
Quantifying the similarity between symbolic sequences is a traditional problem in information theory which requires comparing the frequencies of symbols in different sequences. In numerous modern applications, ranging from DNA over music to texts, the distribution of symbol frequencies is characterized by heavy-tailed distributions (e.g., Zipf's law). The large number of low-frequency symbols in these distributions poses major difficulties to the estimation of the similarity between sequences; e.g., they hinder an accurate finite-size estimation of entropies. Here, we show analytically how the systematic (bias) and statistical (fluctuations) errors in these estimations depend on the sample size N and on the exponent γ of the heavy-tailed distribution. Our results are valid for the Shannon entropy (α = 1), its corresponding similarity measures (e.g., the Jensen-Shannon divergence), and also for measures based on the generalized entropy of order α. For small α's, including α = 1, the errors decay more slowly than the 1/N decay observed in short-tailed distributions. For α larger than a critical value α* = 1 + 1/γ ≤ 2, the 1/N decay is recovered. We show the practical significance of our results by quantifying the evolution of the English language over the last two centuries using a complete α spectrum of measures. We find that frequent words change more slowly than less frequent words and that α = 2 provides the most robust measure to quantify language change.
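The finite-size bias discussed here is easy to reproduce: the naive (plug-in) entropy estimate from a small sample of a Zipf-like distribution falls systematically below the true entropy, because most low-frequency symbols are never observed. A minimal demonstration (vocabulary size, sample size and seed are arbitrary illustrative choices):

```python
import math
import random

def plugin_entropy(sample):
    """Naive (plug-in) Shannon entropy estimate, in bits, from observed
    symbol frequencies."""
    n = len(sample)
    counts = {}
    for s in sample:
        counts[s] = counts.get(s, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A Zipf-like distribution over V symbols: p(i) proportional to 1/i.
V = 1000
norm = sum(1 / i for i in range(1, V + 1))
probs = [1 / (i * norm) for i in range(1, V + 1)]
true_h = -sum(p * math.log2(p) for p in probs)

random.seed(0)
sample = random.choices(list(range(1, V + 1)), weights=probs, k=500)
est = plugin_entropy(sample)
print(est < true_h)  # True: the plug-in estimator is biased downward
```

The paper's contribution is the analytical form of this bias (and of the fluctuations) as a function of N and the tail exponent γ; the code only exhibits the effect.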
Molina-Casado, José M; Carmona, Enrique J; García-Feijoó, Julián
2017-10-01
The anatomical structure detection in retinal images is an open problem. However, most of the works in the related literature are oriented to the detection of each structure individually, or assume the previous detection of a structure which is used as a reference. The objective of this paper is to obtain simultaneous detection of the main retinal structures (optic disc, macula, network of vessels and vascular bundle) in a fast and robust way. We propose a new methodology oriented to accomplish the mentioned objective. It consists of two stages. In an initial stage, a set of operators is applied to the retinal image. Each operator uses intra-structure relational knowledge in order to produce a set of candidate blobs that belong to the desired structure. In a second stage, a set of tuples is created, each of which contains a different combination of the candidate blobs. Next, filtering operators, using inter-structure relational knowledge, are used in order to find the winner tuple. A method using template matching and mathematical morphology is implemented following the proposed methodology. A success is achieved if the distance between the automatically detected blob center and the actual structure center is less than or equal to one optic disc radius. The success rates obtained in the different public databases analyzed were: MESSIDOR (99.33%, 98.58%, 97.92%), DIARETDB1 (96.63%, 100%, 97.75%), DRIONS (100%, n/a, 100%) and ONHSD (100%, 98.85%, 97.70%) for optic disc (OD), macula (M) and vascular bundle (VB), respectively. Finally, the overall success rate obtained in this study for each structure was: 99.26% (OD), 98.69% (M) and 98.95% (VB). The average time of processing per image was 4.16 ± 0.72 s. The main advantage of the use of inter-structure relational knowledge was the reduction of the number of false positives in the detection process. The implemented method is able to simultaneously detect four structures.
It is fast, robust and its detection results are competitive in relation to other methods of the recent literature. Copyright © 2017 Elsevier B.V. All rights reserved.
Inferring cultural regions from correlation networks of given baby names
NASA Astrophysics Data System (ADS)
Pomorski, Mateusz; Krawczyk, Małgorzata J.; Kułakowski, Krzysztof; Kwapień, Jarosław; Ausloos, Marcel
2016-03-01
We report investigations on the statistical characteristics of the baby names given between 1910 and 2010 in the United States of America. For each year, the 100 most frequent names in the USA are sorted out. For these names, the correlations between the name profiles are calculated for all pairs of states (minus Hawaii and Alaska). The correlations are used to form a weighted network which is found to vary mildly in time. In fact, the structure of communities in the network remains quite stable till about 1980. The calculated structure approximately reproduces the usually accepted geopolitical regions: the Northeast, the South, and the "Midwest + West" as the third one. Furthermore, the dataset reveals that the name distribution satisfies the Zipf law, separately for each state and each year, i.e. the name frequency f ∝ r^(-α), where r is the name rank. Between 1920 and 1980, the exponent α is the largest for the set of states classified as "the South", but the smallest for the set of states classified as "Midwest + West". Our interpretation is that the pool of selected names was quite narrow in the Southern states. The data are compared with some related statistics of names in Belgium, a country also with different regions, but of quite a different scale than the USA. There, the Zipf exponent is low for young people and for Brussels citizens.
On the orthogonal dissipative Lax-Phillips scattering theory
NASA Astrophysics Data System (ADS)
Neidhardt, Hagen
1988-08-01
The paper is devoted to the so-called orthogonal dissipative Lax-Phillips scattering theory. A parametrization of all possible orthogonal dissipative Lax-Phillips scattering theories is obtained in terms of ordered 6-tuples consisting of unilateral shifts and contractions which can be, roughly speaking, freely chosen. In this parametrization the wave and scattering operators as well as the scattering matrix are explicitly calculated. Moreover, a description of all analytical contraction-valued functions admitting a Darlington synthesis is found.
1992-10-16
the DNA Fingerprint Laboratory. The Los Angeles Police Department and its former Chief, Daryl Gates for permitting a secret unit, the ...authorized to change information in. Conclusions Where angels fear .... Of all the reasons for compartmentation for which the level of evaluation...database, and a security label attribute is associated with data in each tuple in a relation. The range and distribution of security levels may
Cross-language Babel structs—making scientific interfaces more efficient
NASA Astrophysics Data System (ADS)
Prantl, Adrian; Ebner, Dietmar; Epperly, Thomas G. W.
2013-01-01
Babel is an open-source language interoperability framework tailored to the needs of high-performance scientific computing. As an integral element of the Common Component Architecture, it is employed in a wide range of scientific applications where it is used to connect components written in different programming languages. In this paper we describe how we extended Babel to support interoperable tuple data types (structs). Structs are a common idiom in (mono-lingual) scientific application programming interfaces (APIs); they are an efficient way to pass tuples of nonuniform data between functions, and are supported natively by most programming languages. Using our extended version of Babel, developers of scientific codes can now pass structs as arguments between functions implemented in any of the supported languages. In C, C++, Fortran 2003/2008 and Chapel, structs can be passed without the overhead of data marshaling or copying, providing language interoperability at minimal cost. Other supported languages are Fortran 77, Fortran 90/95, Java and Python. We will show how we designed a struct implementation that is interoperable with all of the supported languages and present benchmark data to compare the performance of all language bindings, highlighting the differences between languages that offer native struct support and an object-oriented interface with getter/setter methods. A case study shows how structs can help simplify the interfaces of scientific codes significantly.
Experiments in fault tolerant software reliability
NASA Technical Reports Server (NTRS)
Mcallister, David F.; Vouk, Mladen A.
1989-01-01
Twenty functionally equivalent programs were built and tested in a multiversion software experiment. Following unit testing, all programs were subjected to an extensive system test. In the process sixty-one distinct faults were identified among the versions. Less than 12 percent of the faults exhibited varying degrees of positive correlation. The common-cause (or similar) faults spanned as many as 14 components. However, a majority of these faults were trivial, and easily detected by proper unit and/or system testing. Only two of the seven similar faults were difficult faults, and both were caused by specification ambiguities. One of these faults exhibited variable identical-and-wrong response span, i.e. response span which varied with the testing conditions and input data. Techniques that could have been used to avoid the faults are discussed. For example, it was determined that back-to-back testing of 2-tuples could have been used to eliminate about 90 percent of the faults. In addition, four of the seven similar faults could have been detected by using back-to-back testing of 5-tuples. It is believed that most, if not all, similar faults could have been avoided had the specifications been written using more formal notation, had the unit testing phase been subject to more stringent standards and controls, and had better tools for measuring the quality and adequacy of the test data (e.g. coverage) been used.
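Back-to-back testing of 2-tuples simply runs every pair of versions on the same inputs and flags any disagreement. A minimal sketch (the three `abs`-like versions and the seeded fault are invented for illustration, not drawn from the experiment's twenty programs):

```python
from itertools import combinations

def back_to_back(versions, test_inputs):
    """Back-to-back test every 2-tuple of program versions on the same
    inputs; return (i, j, input) for each disagreement found."""
    discrepancies = []
    for (i, f), (j, g) in combinations(list(enumerate(versions)), 2):
        for x in test_inputs:
            if f(x) != g(x):
                discrepancies.append((i, j, x))
    return discrepancies

# Three hypothetical versions of abs(); version 2 has a seeded fault at 0.
v0 = abs
v1 = lambda x: x if x >= 0 else -x
v2 = lambda x: x if x > 0 else (-x if x < 0 else 1)  # faulty: abs(0) -> 1

found = back_to_back([v0, v1, v2], [-2, 0, 5])
print(found)  # [(0, 2, 0), (1, 2, 0)]: both pairs involving v2 disagree at 0
```

The experiment's caveat carries over: a fault common to both members of a pair (an identical-and-wrong response) produces no disagreement and so escapes this style of testing.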
Multiple alignment-free sequence comparison
Ren, Jie; Song, Kai; Sun, Fengzhu; Deng, Minghua; Reinert, Gesine
2013-01-01
Motivation: Recently, a range of new statistics has become available for the alignment-free comparison of two sequences based on k-tuple word content. Here, we extend these statistics to the simultaneous comparison of more than two sequences. Our suite of statistics contains, first, extensions of statistics for pairwise comparison to the joint k-tuple content of all the sequences and, second, averages of sums of pairwise comparison statistics. The two tasks we consider are, first, to identify sequences that are similar to a set of target sequences and, second, to measure the similarity within a set of sequences. Results: Our investigation uses both simulated data as well as cis-regulatory module data, where the task is to identify cis-regulatory modules with similar transcription factor binding sites. We find that although for real data all of our statistics show a similar performance, on simulated data the Shepp-type statistics are in some instances outperformed by star-type statistics. The multiple alignment-free statistics are more sensitive to contamination in the data than the pairwise average statistics. Availability: Our implementation of the five statistics is available as an R package named 'multiAlignFree' at http://www-rcf.usc.edu/∼fsun/Programs/multiAlignFree/multiAlignFreemain.html. Contact: reinert@stats.ox.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23990418
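The pairwise building block behind these statistics is the k-tuple (k-mer) count vector and its inner product, the classic D2 statistic. A minimal sketch (the paper's centered and standardized variants, and its multi-sequence extensions, are not shown):

```python
from collections import Counter

def ktuple_counts(seq, k):
    """Counts of all overlapping k-tuples (k-mers) in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def d2(seq_a, seq_b, k):
    """Basic D2 statistic: inner product of the two k-tuple count
    vectors (larger values mean more shared k-tuple content)."""
    ca, cb = ktuple_counts(seq_a, k), ktuple_counts(seq_b, k)
    return sum(ca[w] * cb[w] for w in ca if w in cb)

# Shared 3-tuples ACG and CGT each occur twice in the first sequence
# and once in the second, so D2 = 2*1 + 2*1 = 4.
print(d2("ACGTACGT", "ACGTTTTT", 3))  # 4
```

The multiple-sequence statistics in the paper either apply such counts jointly across all sequences or average the pairwise values over all pairs.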
Statistical properties of DNA sequences
NASA Technical Reports Server (NTRS)
Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.
1995-01-01
We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
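Detrended fluctuation analysis, as described above, integrates the sequence, linearly detrends it inside boxes of increasing size, and reads the scaling exponent α off a log-log fit of the fluctuation F(n) against n. A minimal pure-Python sketch (box sizes and the deterministic test series are illustrative choices, not from the paper; real DNA analyses map nucleotides to a numeric walk first):

```python
import math

def dfa(series, box_sizes):
    """DFA-1: integrate the mean-subtracted series, linearly detrend it
    inside non-overlapping boxes of each size n, and return the scaling
    exponent alpha from a log-log fit of F(n) versus n."""
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for x in series:
        total += x - mean
        profile.append(total)

    def rms_fluct(n):
        sq, boxes = 0.0, len(profile) // n
        for b in range(boxes):
            seg = profile[b * n:(b + 1) * n]
            tm, sm = (n - 1) / 2, sum(seg) / n
            slope = (sum((t - tm) * (y - sm) for t, y in enumerate(seg))
                     / sum((t - tm) ** 2 for t in range(n)))
            inter = sm - slope * tm
            sq += sum((y - (inter + slope * t)) ** 2 for t, y in enumerate(seg))
        return math.sqrt(sq / (boxes * n))

    logs = [(math.log(n), math.log(rms_fluct(n))) for n in box_sizes]
    mx = sum(l for l, _ in logs) / len(logs)
    my = sum(f for _, f in logs) / len(logs)
    return (sum((l - mx) * (f - my) for l, f in logs)
            / sum((l - mx) ** 2 for l, _ in logs))

# A deterministic, strongly trending series gives alpha close to 2;
# uncorrelated noise would give alpha near 0.5, long-range-correlated
# sequences something in between or above.
trend = [float(i) for i in range(512)]
print(round(dfa(trend, [8, 16, 32, 64]), 1))
```

The detrending step is what removes the "non-stationarity" problem mentioned in the abstract: local trends in the profile are subtracted before the fluctuation is measured.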
Word lengths are optimized for efficient communication.
Piantadosi, Steven T; Tily, Harry; Gibson, Edward
2011-03-01
We demonstrate a substantial improvement on one of the most celebrated empirical laws in the study of language, Zipf's 75-y-old theory that word length is primarily determined by frequency of use. In accord with rational theories of communication, we show across 10 languages that average information content is a much better predictor of word length than frequency. This indicates that human lexicons are efficiently structured for communication by taking into account interword statistical dependencies. Lexical systems result from an optimization of communicative pressures, coding meanings efficiently given the complex statistics of natural language use.
Trust-Based Service Composition and Binding for Tactical Networks with Multiple Objectives
2013-12-01
services, then it will have a set of four-tuple records for each abstract service that it can provide. We assume that the service quality of a SP in...user (i.e., a SR) does not have knowledge of the “best” service quality , so its satisfaction level with services received is based on what has been...hand, when USRm is less than USTm, identifies the culprits with low performance (by comparing the advertised service quality profile with the
Flexible Decision Support in Device-Saturated Environments
2003-10-01
also output tuples to a remote MySQL or Postgres database. 3.3 GUI The GUI allows the user to pose queries using SQL and to display query...DatabaseConnection.java – handles connections to an external database (such as MySQL or Postgres ). • Debug.java – contains the code for printing out Debug messages...also provided. It is possible to output the results of queries to a MySQL or Postgres database for archival and the GUI can query those results
Using information theory to assess the communicative capacity of circulating microRNA.
Finn, Nnenna A; Searles, Charles D
2013-10-11
The discovery of extracellular microRNAs (miRNAs) and their transport modalities (i.e., microparticles, exosomes, proteins and lipoproteins) has sparked theories regarding their role in intercellular communication. Here, we assessed the information transfer capacity of different miRNA transport modalities in human serum by utilizing basic principles of information theory. Zipf statistics were calculated for each of the miRNA transport modalities identified in human serum. Our analyses revealed that miRNA-mediated information transfer is redundant, as evidenced by negative Zipf statistics with magnitudes greater than one. In healthy subjects, the potential communicative capacity of miRNA in complex with circulating proteins was significantly lower than that of miRNA encapsulated in circulating microparticles and exosomes. Moreover, the presence of coronary heart disease significantly lowered the communicative capacity of all circulating miRNA transport modalities. To assess the internal organization of circulating miRNA signals, Shannon's zero- and first-order entropies were calculated. Microparticles (MPs) exhibited the lowest Shannon entropic slope, indicating a relatively high capacity for information transfer. Furthermore, compared to the other miRNA transport modalities, MPs appeared to be the most efficient at transferring miRNA to cultured endothelial cells. Taken together, these findings suggest that although all transport modalities have the capacity for miRNA-based information transfer, MPs may be the simplest and most robust way to achieve miRNA-based signal transduction in sera. This study presents a novel method for analyzing the quantitative capacity of miRNA-mediated information transfer while providing insight into the communicative characteristics of distinct circulating miRNA transport modalities. Published by Elsevier Inc.
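Zero- and first-order Shannon entropies of a symbol sequence, as used in analyses like the one above, can be estimated directly from counts. A minimal sketch on a toy two-letter sequence (the example string is invented; it is not miRNA data):

```python
import math
from collections import Counter

def zero_order_entropy(seq):
    """H0: Shannon entropy of single-symbol frequencies (bits/symbol)."""
    counts, n = Counter(seq), len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def first_order_entropy(seq):
    """H1: conditional entropy of a symbol given the previous symbol,
    H(X_t | X_{t-1}) = H(pairs) - H(previous symbol), from counts."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    n = len(pairs)
    h_pairs = -sum((c / n) * math.log2(c / n)
                   for c in Counter(pairs).values())
    h_prev = -sum((c / n) * math.log2(c / n)
                  for c in Counter(p[0] for p in pairs).values())
    return h_pairs - h_prev

s = "ABABABABABABABAB"
print(zero_order_entropy(s))   # 1.0: A and B are equally frequent
print(first_order_entropy(s))  # 0.0: the next symbol is fully determined
```

The gap between successive entropy orders is what the abstract calls the entropic slope: a steep drop from H0 to H1 signals strong internal structure, i.e. redundancy available for robust signaling.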
NASA Astrophysics Data System (ADS)
Siegel, Edward
2008-03-01
Classic digit statistics obey the Newcomb [Am. J. Math. 4, 39, 1881] - Weyl [Goett. Nachr., 1912] - Benford [Proc. Am. Phil. Soc. 78, 4, 51, 1938] ("NeWBe") on-average/mean logarithmic probability law P(d) = log[1 + 1/d] = log[(d+1)/d] [see "Benford's Law"; "FUZZYICS": Siegel, AMS Nat. Mtg. 2002 & 2008; Raimi, Sci. Am. 221, 109, 1969; Hill, Proc. AMS 123, 3, 887, 1996]; independence of the log base and of the units expresses scale invariance. The algebraic inverse d = 1/[e^w - 1] identifies bosons (1924) with digits (<1881): energy levels ground (d = 0), first excited (d = 1), and so on; no fractions, only integer digit differences, i.e. quanta.
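The first-digit law quoted above, P(d) = log10(1 + 1/d), is easy to verify numerically on a scale-invariant sequence. A minimal sketch using powers of 2, a classic Benford-conforming example (the choice of sequence and sample size is illustrative):

```python
import math

def benford_expected(d):
    """Newcomb-Benford first-digit law: P(d) = log10(1 + 1/d)."""
    return math.log10(1 + 1 / d)

def leading_digit(x):
    """Most significant decimal digit of a positive integer."""
    while x >= 10:
        x //= 10
    return x

# Powers of 2 follow Benford's law; compare the observed frequency of
# leading digit 1 with the predicted log10(2) ≈ 0.301.
digits = [leading_digit(2 ** n) for n in range(1, 1001)]
observed = {d: digits.count(d) / len(digits) for d in range(1, 10)}
print(round(observed[1], 3), round(benford_expected(1), 3))
```

The nine probabilities sum to log10(10) = 1 exactly, and the law is unchanged if the data are rescaled by a constant, which is the scale-invariance property the abstract emphasizes.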
Highlighting entanglement of cultures via ranking of multilingual Wikipedia articles.
Eom, Young-Ho; Shepelyansky, Dima L
2013-01-01
How do different cultures evaluate a person? Is an important person in one culture also important in another culture? We address these questions via ranking of multilingual Wikipedia articles. With three ranking algorithms based on the network structure of Wikipedia, we assign rankings to all articles in 9 multilingual editions of Wikipedia and investigate the general ranking structure of PageRank, CheiRank and 2DRank. In particular, we focus on articles related to persons, identify the top 30 persons for each rank among the different editions and analyze the distinctions of their distributions over activity fields such as politics, art, science, religion and sport for each edition. We find that local heroes are dominant, but global heroes also exist and create an effective network representing the entanglement of cultures. The Google matrix analysis of the network of cultures shows signs of the Zipf law distribution. This approach allows us to examine the diversity and shared characteristics of knowledge organization between cultures. The developed computational, data-driven approach highlights cultural interconnections in a new perspective. Dated: June 26, 2013.
Highlighting Entanglement of Cultures via Ranking of Multilingual Wikipedia Articles
Eom, Young-Ho; Shepelyansky, Dima L.
2013-01-01
How do different cultures evaluate a person? Is an important person in one culture also important in another culture? We address these questions via ranking of multilingual Wikipedia articles. With three ranking algorithms based on the network structure of Wikipedia, we assign rankings to all articles in 9 multilingual editions of Wikipedia and investigate the general ranking structure of PageRank, CheiRank and 2DRank. In particular, we focus on articles related to persons, identify the top 30 persons for each rank among the different editions and analyze the distinctions of their distributions over activity fields such as politics, art, science, religion and sport for each edition. We find that local heroes are dominant, but global heroes also exist and create an effective network representing the entanglement of cultures. The Google matrix analysis of the network of cultures shows signs of the Zipf law distribution. This approach allows us to examine the diversity and shared characteristics of knowledge organization between cultures. The developed computational, data-driven approach highlights cultural interconnections in a new perspective. Dated: June 26, 2013. PMID:24098338
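PageRank, the first of the three algorithms named above, can be sketched as a short power iteration. The toy link graph below is invented for illustration (it is not Wikipedia data), and the paper's CheiRank and 2DRank variants are not shown:

```python
def pagerank(links, damping=0.85, iters=100):
    """Power-iteration PageRank on a dict node -> list of outgoing links.
    Dangling nodes spread their rank mass uniformly over all nodes."""
    nodes = list(links)
    n = len(nodes)
    rank = {u: 1 / n for u in nodes}
    for _ in range(iters):
        nxt = {u: (1 - damping) / n for u in nodes}
        for u in nodes:
            out = links[u]
            if out:
                share = damping * rank[u] / len(out)
                for v in out:
                    nxt[v] += share
            else:  # dangling node: distribute uniformly
                for v in nodes:
                    nxt[v] += damping * rank[u] / n
        rank = nxt
    return rank

# A toy 'article network': most pages link to A, so A ranks highest.
toy = {"A": ["B"], "B": ["A"], "C": ["A"], "D": ["A", "B"]}
r = pagerank(toy)
print(max(r, key=r.get))  # A
```

CheiRank applies the same iteration to the graph with all links reversed, and 2DRank combines the two orderings; the cross-edition comparison in the paper is then done on the resulting person rankings.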
NASA Astrophysics Data System (ADS)
Ramsden, Jeremy J.; Naran, Deven
2007-03-01
The word frequencies of the speeches of some contemporary politicians have been determined over a decade of office. By fitting Mandelbrot's simple canonical law (a development of Zipf's law) to the data, the average cybernetic temperature θ was determined for each year of office. Two contrasting cases were examined. The first, that of the British Prime Minister Tony Blair, showed a steady decline of θ punctuated by partial recovery following certain key events such as re-election. The second, that of the Australian Prime Minister John Howard, showed a more uniform temperature. It is suggested that the first case is an example of the phenomenon of fatigue or habituation, inevitable in any complex system rich in equilibrium states, and the partial de-habituation observed is a consequence of a sharp disturbance to the system. Given the relative ease of carrying out the analysis, it could become a routine tool regularly applied to holders of high office to determine their continuing fitness to occupy the office.
Physico-Chemical and Structural Interpretation of Discrete Derivative Indices on N-Tuples Atoms
Martínez-Santiago, Oscar; Marrero-Ponce, Yovani; Barigye, Stephen J.; Le Thi Thu, Huong; Torres, F. Javier; Zambrano, Cesar H.; Muñiz Olite, Jorge L.; Cruz-Monteagudo, Maykel; Vivas-Reyes, Ricardo; Vázquez Infante, Liliana; Artiles Martínez, Luis M.
2016-01-01
This report examines the interpretation of the Graph Derivative Indices (GDIs) from three different perspectives (i.e., in structural, steric and electronic terms). It is found that the individual vertex frequencies may be expressed in terms of the geometrical and electronic reactivity of the atoms and bonds, respectively. On the other hand, it is demonstrated that the GDIs are sensitive to progressive structural modifications in terms of size, ramifications, electronic richness, conjugation effects and molecular symmetry. Moreover, it is observed that the GDIs quantify the interaction capacity among molecules and codify information on the activation entropy. A structure-property relationship study reveals that there exists a direct correspondence between the individual frequencies of atoms and Hückel's free valence, as well as between the atomic GDIs and the chemical shift in NMR, which collectively validates the theory that these indices codify steric and electronic information of the atoms in a molecule. Taking into consideration the regularity and coherence found in experiments performed with the GDIs, it is possible to say that GDIs possess a plausible interpretation in structural and physicochemical terms. PMID:27240357
Sawamura, Jitsuki; Morishita, Shigeru; Ishigooka, Jun
2016-02-09
Previously, we applied basic group theory and related concepts to scales of measurement of clinical disease states and clinical findings (including laboratory data). To gain a more concrete comprehension, we here apply the concept of matrix representation, which was not explicitly exploited in our previous work. Starting with a set of orthonormal vectors, called the basis, an operator Rj (an N-tuple patient disease state at the j-th session) was expressed as a set of stratified vectors representing plural operations on individual components, so as to satisfy the group matrix representation. The stratified vectors containing individual unit operations were combined into one-dimensional square matrices [Rj]s. The [Rj]s meet the matrix representation of a group (ring) as a K-algebra. Using the same-sized matrix of stratified vectors, we can also express changes in the plural set of [Rj]s. The method is demonstrated on simple examples. Despite the incompleteness of our model, the group matrix representation of stratified vectors offers a formal mathematical approach to clinical medicine, aligning it with other branches of natural science.
NASA Astrophysics Data System (ADS)
Furusawa, Chikara; Kaneko, Kunihiko
2003-02-01
Using data from gene expression databases on various organisms and tissues, including yeast, nematodes, human normal and cancer tissues, and embryonic stem cells, we found that the abundances of expressed genes exhibit a power-law distribution with an exponent close to -1; i.e., they obey Zipf’s law. Furthermore, by simulations of a simple model with an intracellular reaction network, we found that Zipf’s law of chemical abundance is a universal feature of cells where such a network optimizes the efficiency and faithfulness of self-reproduction. These findings provide novel insights into the nature of the organization of reaction dynamics in living cells.
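The rank-abundance analysis described in the abstract above amounts to sorting abundances, plotting log(abundance) against log(rank), and estimating the slope. A minimal numerical sketch with synthetic data (the function name and the synthetic abundances are ours, not from the paper):

```python
import math

def zipf_exponent(abundances):
    """Least-squares slope of log(abundance) vs log(rank)."""
    xs = sorted(abundances, reverse=True)
    pts = [(math.log(r), math.log(a)) for r, a in enumerate(xs, start=1)]
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    num = sum((x - mx) * (y - my) for x, y in pts)
    den = sum((x - mx) ** 2 for x, _ in pts)
    return num / den

# Synthetic abundances that obey Zipf's law exactly (exponent -1)
data = [1000.0 / r for r in range(1, 501)]
slope = zipf_exponent(data)  # -1.0 up to floating-point error
```

With real expression data the log-log plot is noisy, so the fitted slope only approximates the exponent; the paper reports values close to -1 across organisms.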
Checking Equivalence of SPMD Programs Using Non-Interference
2010-01-29
with it hopes to go beyond the limits of Moore’s law, but also worries that programming will become harder [5]. One of the reasons why parallel...array name in G or L, and e is an arithmetic expression of integer type. In the CUDA code shown in Section 3, b and t are represented by coreId and...b+ t. A second, optimized version of the program (using function “reverse2”, see Section 3) can be modeled as a tuple P2 = ( G ,L2, F 2), with G same
1993-09-23
answer to the question: Is subject s allowed access type a on object o? An authorization was thus seen as a 3-tuple (s,o,a). This view of access...called trusted in a Bell-LaPadula architecture. Work at Carnegie Mellon University on type enforcement contemporaneous with Denning’s was not addressed in..."Implementation Considerations for the Typed Access Matrix Model in a Distributed Environment," Proceedings of the 15th National Computer Security
Scaling and allometry in the building geometries of Greater London
NASA Astrophysics Data System (ADS)
Batty, M.; Carvalho, R.; Hudson-Smith, A.; Milton, R.; Smith, D.; Steadman, P.
2008-06-01
Many aggregate distributions of urban activities such as city sizes reveal scaling but hardly any work exists on the properties of spatial distributions within individual cities, notwithstanding considerable knowledge about their fractal structure. We redress this here by examining scaling relationships in a world city using data on the geometric properties of individual buildings. We first summarise how power laws can be used to approximate the size distributions of buildings, in analogy to city-size distributions, which have been widely studied as rank-size and lognormal distributions following Zipf [Human Behavior and the Principle of Least Effort (Addison-Wesley, Cambridge, 1949)] and Gibrat [Les Inégalités Économiques (Librarie du Recueil Sirey, Paris, 1931)]. We then extend this analysis to allometric relationships between buildings in terms of their different geometric size properties. We present some preliminary analysis of building heights from the Emporis database which suggests very strong scaling in world cities. The database for Greater London is then introduced, from which we extract 3.6 million buildings whose scaling properties we explore. We examine key allometric relationships between these different properties, illustrating how building shape changes according to size, and we extend this analysis to the classification of buildings according to land use types. We conclude with an analysis of two-point correlation functions of building geometries which supports our non-spatial analysis of scaling.
Molecular characterization of deletion breakpoints in adults with 22q11 deletion syndrome
Stachon, Andrea C.; Squire, Jeremy A.; Moldovan, Laura; Bayani, Jane; Meyn, Stephen; Chow, Eva; Bassett, Anne S.
2011-01-01
22q11 Deletion syndrome (22q11DS) is a common microdeletion syndrome with variable expression, including congenital and later onset conditions such as schizophrenia. Most studies indicate that expression does not appear to be related to length of the deletion but there is limited information on the endpoints of even the common deletion breakpoint regions in adults. We used a real-time quantitative PCR (qPCR) approach to fine map 22q11.2 deletions in 44 adults with 22q11DS, 22 with schizophrenia (SZ; 12 M, 10 F; mean age 35.7 SD 8.0 years) and 22 with no history of psychosis (NP; 8 M, 14 F; mean age 27.1 SD 8.6 years). QPCR data were consistent with clinical FISH results using the TUPLE1 or N25 probes. Two subjects (one SZ, one NP) negative for clinical FISH had atypical 22q11.2 deletions confirmed by FISH using the RP11-138C22 probe. Most (n = 34; 18 SZ, 16 NP) subjects shared a common 3 Mb hemizygous 22q11.2 deletion. However, eight subjects showed breakpoint variability: a more telomeric proximal breakpoint (n = 2), or more centromeric (n = 3) or more telomeric distal breakpoint (n = 3). One NP subject had a proximal nested 1.4 Mb deletion. COMT and TBX1 were deleted in all 44 subjects, and PRODH in 40 subjects (19 SZ, 21 NP). The results delineate proximal and distal breakpoint variants in 22q11DS. Neither deletion extent nor PRODH haploinsufficiency appeared to explain the clinical expression of schizophrenia in the present study. Further studies are needed to elucidate the molecular basis of schizophrenia and clinical heterogeneity in 22q11DS. PMID:17028864
Wong, Wing-Cheong; Ng, Hong-Kiat; Tantoso, Erwin; Soong, Richie; Eisenhaber, Frank
2018-02-12
Though earlier works on modelling transcript abundance from vertebrates to lower eukaryotes have specifically singled out Zipf's law, the observed distributions often deviate from a single power-law slope. In hindsight, while power-laws of critical phenomena are derived asymptotically under the condition of infinite observations, real-world observations are finite, and finite-size effects set in to force a power-law distribution into an exponential decay, which manifests as a curvature (i.e., varying exponent values) in a log-log plot. If transcript abundance is truly power-law distributed, the varying exponent signifies changing mathematical moments (e.g., mean, variance) and creates heteroskedasticity, which compromises statistical rigor in analysis. The impact of this deviation from the asymptotic power-law on sequencing count data has never truly been examined and quantified. The anecdotal description of transcript abundance as almost Zipf's law-like distributed can be conceptualized as the imperfect mathematical rendition of the Pareto power-law distribution when subjected to finite-size effects in the real world, regardless of advances in sequencing technology, since sampling is finite in practice. Our conceptualization agrees well with our empirical analysis of two modern-day NGS (next-generation sequencing) datasets: an in-house generated dilution miRNA study of two gastric cancer cell lines (NUGC3 and AGS) and a publicly available spike-in miRNA dataset. First, the finite-size effects cause the deviations of sequencing count data from Zipf's law and raise issues of reproducibility in sequencing experiments. Second, they manifest as heteroskedasticity among experimental replicates, bringing about statistical woes.
Surprisingly, a straightforward power-law correction that restores the distorted distribution to a single exponent value can dramatically reduce data heteroskedasticity, invoking an instant increase in signal-to-noise ratio of 50% and in statistical/detection sensitivity of as much as 30%, regardless of the downstream mapping and normalization methods. Most importantly, the power-law correction improves concordance in significant calls among different normalization methods of a data series by 22% on average. When presented with a higher sequencing depth (4 times difference), the improvement in concordance is asymmetrical (32% for the higher sequencing depth instance versus 13% for the lower instance), demonstrating that the simple power-law correction can increase significant detection at higher sequencing depths. Finally, the correction dramatically enhances the statistical conclusions and elucidates the metastasis potential of the NUGC3 cell line against AGS in our dilution analysis. The finite-size effects due to undersampling generally plague transcript count data with reproducibility issues but can be minimized through a simple power-law correction of the count distribution. This distribution correction has direct implications for the biological interpretation of the study and the rigor of the scientific findings. This article was reviewed by Oliviero Carugo, Thomas Dandekar and Sandor Pongor.
Strong regularities in world wide web surfing
Huberman; Pirolli; Pitkow; Lukose
1998-04-03
One of the most common modes of accessing information in the World Wide Web is surfing from one document to another along hyperlinks. Several large empirical studies have revealed common patterns of surfing behavior. A model that assumes that users make a sequence of decisions to proceed to another page, continuing as long as the value of the current page exceeds some threshold, yields the probability distribution for the number of pages that a user visits within a given Web site. This model was verified by comparing its predictions with detailed measurements of surfing patterns. The model also explains the observed Zipf-like distributions in page hits observed at Web sites.
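The threshold model summarized above can be simulated directly: the perceived value of the current page performs a random walk, and the user stops surfing when it first falls below a threshold. A minimal sketch with illustrative parameter values (the start value, threshold, and step size are our assumptions, not figures from the study):

```python
import random

def surf_depth(rng, start_value=5.0, threshold=0.0, sigma=2.0, max_pages=1000):
    """Number of pages visited before the page value first drifts below threshold."""
    value, pages = start_value, 1
    while value > threshold and pages < max_pages:
        value += rng.gauss(0.0, sigma)  # random-walk step in perceived page value
        pages += 1
    return pages

rng = random.Random(42)
depths = [surf_depth(rng) for _ in range(10000)]
mean_depth = sum(depths) / len(depths)  # the depth distribution is right-skewed
```

In the paper this stopping rule yields an inverse Gaussian distribution for the number of pages visited, which in turn produces the Zipf-like page-hit distributions observed at Web sites.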
1985-09-30
further discussed in Sections 4 and 5. NRL REPORT 8902: Notice that I have used the plural form, OBJECTS, in Fig. 2.1 to indicate that there...Washington, DC. Artificial Intelligence Center, SRI International, Menlo Park, CA, 1978. 8. G. Gentzen, "Investigations into Logical Deduction," The...one of the form (relation-name term-1 term-2) or (tuple-name term-1 ... term-n) with or without the negation operator, and atm-exp denotes a timed
The effect of data structures on INGRES performance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Creighton, J.R.
1987-01-01
Computer experiments were conducted to determine the effect of using Heap, ISAM, Hash and B-tree data structures for INGRES relations. Average times for retrieve, append and update were determined for searches by unique key and non-key data. The experiments were conducted on relations of approximately 1000 tuples of 332 byte width. Multiple operations were performed, where appropriate, to obtain average times. Simple models of the data structures are presented and shown to be consistent with experimental results. The models can be used to predict performance, and to select the appropriate data structure for various applications.
1985-05-01
unit in the data base, with knowing one generic assembly language. The 5-tuple describing single operation execution time of the operations...computing machinery capable of performing these tasks within a given time constraint. Because the majority of the available computing machinery is general
NASA Astrophysics Data System (ADS)
Bößwetter, Daniel
Much has been written about the pros and cons of column-orientation as a means to speed up read-mostly analytic workloads in relational databases. In this paper we try to dissect the primitive mechanisms of a database that help express the coherence of tuples, and present a novel way of organizing relational data in order to exploit the advantages of both the row-oriented and the column-oriented world. As we go, we break with yet another bad habit of databases, namely the equal granularity of reads and writes, which leads us to the introduction of consecutive clusters of disk pages called super-pages.
NASA Astrophysics Data System (ADS)
Bergstrom, Lars; Reppy, John
Compilers for polymorphic languages are required to treat values in programs in an abstract and generic way at the source level. The challenges of optimizing the boxing of raw values, flattening of argument tuples, and raising the arity of functions that handle complex structures to reduce memory usage are old ones, but take on newfound import with processors that have twice as many registers. We present a novel strategy that uses both control-flow and type information to provide an arity raising implementation addressing these problems. This strategy is conservative - no matter the execution path, the transformed program will not perform extra operations.
On the problem of boundaries and scaling for urban street networks
Masucci, A. Paolo; Arcaute, Elsa; Hatna, Erez; Stanilov, Kiril; Batty, Michael
2015-01-01
Urban morphology has presented significant intellectual challenges to mathematicians and physicists ever since the eighteenth century, when Euler first explored the famous Königsberg bridges problem. Many important regularities and scaling laws have been observed in urban studies, including Zipf's law and Gibrat's law, rendering cities attractive systems for analysis within statistical physics. Nevertheless, a broad consensus on how cities and their boundaries are defined is still lacking. Applying an elementary clustering technique to the street intersection space, we show that growth curves for the maximum cluster size of the largest cities in the UK and in California collapse to a single curve, namely the logistic. Subsequently, by introducing the concept of the condensation threshold, we show that natural boundaries of cities can be well defined in a universal way. This allows us to study and discuss systematically some of the regularities that are present in cities. We show that some scaling laws present consistent behaviour in space and time, thus suggesting the presence of common principles at the basis of the evolution of urban systems. PMID:26468071
NASA Astrophysics Data System (ADS)
Kummer, E. E.; Siegel, Edward Carl-Ludwig
2011-03-01
Clock-model Archimedes [http://linkage.rockeller.edu/ wli/moved.8.04/ 1fnoise/ index. ru.html] HYPERBOLICITY inevitability throughout physics/pure-maths: Newton-law F=ma, Heisenberg and classical uncertainty-principle=Parseval/Plancherel-theorems causes FUZZYICS definition: (so miscalled) "complexity" = UTTER-SIMPLICITY!!! Watkins[www.secamlocal.ex.ac.uk/people/staff/mrwatkin/]-Hubbard[World According to Wavelets (96)-p.14!]-Franklin[1795]-Fourier[1795;1822]-Brillouin[1922] dual/inverse-space(k,w) analysis key to Fourier-unification in Archimedes hyperbolicity inevitability progress up Siegel cognition hierarchy-of-thinking (HoT): data-info.-know.-understand.-meaning-...-unity-simplicity = FUZZYICS!!! Frohlich-Mossbauer-Goldanskii-del Guidice [Nucl.Phys.B:251,375(85);275,185 (86)]-Young [arXiv-0705.4678y2, (5/31/07] theory of health/life=aqueous-electret/ ferroelectric protoplasm BEC = Archimedes-Siegel [Schrodinger Cent.Symp.(87); Symp.Fractals, MRS Fall Mtg.(89)-5-pprs] 1/w-"noise" Zipf-law power-spectrum hyperbolicity INEVITABILITY= Chi; Dirac delta-function limit w=0 concentration= BEC = Chi-Quong.
On the problem of boundaries and scaling for urban street networks.
Masucci, A Paolo; Arcaute, Elsa; Hatna, Erez; Stanilov, Kiril; Batty, Michael
2015-10-06
Urban morphology has presented significant intellectual challenges to mathematicians and physicists ever since the eighteenth century, when Euler first explored the famous Königsberg bridges problem. Many important regularities and scaling laws have been observed in urban studies, including Zipf's law and Gibrat's law, rendering cities attractive systems for analysis within statistical physics. Nevertheless, a broad consensus on how cities and their boundaries are defined is still lacking. Applying an elementary clustering technique to the street intersection space, we show that growth curves for the maximum cluster size of the largest cities in the UK and in California collapse to a single curve, namely the logistic. Subsequently, by introducing the concept of the condensation threshold, we show that natural boundaries of cities can be well defined in a universal way. This allows us to study and discuss systematically some of the regularities that are present in cities. We show that some scaling laws present consistent behaviour in space and time, thus suggesting the presence of common principles at the basis of the evolution of urban systems. © 2015 The Authors.
Rank distributions: A panoramic macroscopic outlook
NASA Astrophysics Data System (ADS)
Eliazar, Iddo I.; Cohen, Morrel H.
2014-01-01
This paper presents a panoramic macroscopic outlook of rank distributions. We establish a general framework for the analysis of rank distributions, which classifies them into five macroscopic "socioeconomic" states: monarchy, oligarchy-feudalism, criticality, socialism-capitalism, and communism. Oligarchy-feudalism is shown to be characterized by discrete macroscopic rank distributions, and socialism-capitalism is shown to be characterized by continuous macroscopic size distributions. Criticality is a transition state between oligarchy-feudalism and socialism-capitalism, which can manifest allometric scaling with multifractal spectra. Monarchy and communism are extreme forms of oligarchy-feudalism and socialism-capitalism, respectively, in which the intrinsic randomness vanishes. The general framework is applied to three different models of rank distributions—top-down, bottom-up, and global—and unveils each model's macroscopic universality and versatility. The global model yields a macroscopic classification of the generalized Zipf law, an omnipresent form of rank distributions observed across the sciences. An amalgamation of the three models establishes a universal rank-distribution explanation for the macroscopic emergence of a prevalent class of continuous size distributions, ones governed by unimodal densities with both Pareto and inverse-Pareto power-law tails.
Rank distributions: a panoramic macroscopic outlook.
Eliazar, Iddo I; Cohen, Morrel H
2014-01-01
This paper presents a panoramic macroscopic outlook of rank distributions. We establish a general framework for the analysis of rank distributions, which classifies them into five macroscopic "socioeconomic" states: monarchy, oligarchy-feudalism, criticality, socialism-capitalism, and communism. Oligarchy-feudalism is shown to be characterized by discrete macroscopic rank distributions, and socialism-capitalism is shown to be characterized by continuous macroscopic size distributions. Criticality is a transition state between oligarchy-feudalism and socialism-capitalism, which can manifest allometric scaling with multifractal spectra. Monarchy and communism are extreme forms of oligarchy-feudalism and socialism-capitalism, respectively, in which the intrinsic randomness vanishes. The general framework is applied to three different models of rank distributions-top-down, bottom-up, and global-and unveils each model's macroscopic universality and versatility. The global model yields a macroscopic classification of the generalized Zipf law, an omnipresent form of rank distributions observed across the sciences. An amalgamation of the three models establishes a universal rank-distribution explanation for the macroscopic emergence of a prevalent class of continuous size distributions, ones governed by unimodal densities with both Pareto and inverse-Pareto power-law tails.
The fast decoding of Reed-Solomon codes using number theoretic transforms
NASA Technical Reports Server (NTRS)
Reed, I. S.; Welch, L. R.; Truong, T. K.
1976-01-01
It is shown that Reed-Solomon (RS) codes can be encoded and decoded by using a fast Fourier transform (FFT) algorithm over finite fields. The arithmetic utilized to perform these transforms requires only integer additions, circular shifts and a minimum number of integer multiplications. The computing time of this transform encoder-decoder for RS codes is less than the time of the standard method for RS codes. More generally, the field GF(q) is also considered, where q is a prime of the form K·2^n + 1 and K and n are integers. GF(q) can be used to decode very long RS codes by an efficient FFT algorithm with an improvement in the number of symbols. It is shown that a radix-8 FFT algorithm over GF(q^2) can be utilized to encode and decode very long RS codes with a large number of symbols. For eight symbols in GF(q^2), this transform over GF(q^2) can be made simpler than any other known number theoretic transform with a similar capability. Of special interest is the decoding of a 16-tuple RS code with four errors.
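The transform technique described above can be illustrated over a small field of the stated form: 257 = 1·2^8 + 1 is prime, and 64 is a primitive 8th root of unity modulo 257, so an exact length-8 transform exists. A naive O(N^2) sketch (not the paper's radix-8 algorithm) of the forward and inverse number-theoretic transform:

```python
P = 257  # prime of the form K*2^n + 1 (K = 1, n = 8)
W = 64   # primitive 8th root of unity mod 257: 64**8 % 257 == 1, 64**4 % 257 == 256
N = 8

def ntt(a, root=W, p=P):
    """Naive number-theoretic transform of a length-N vector over GF(p)."""
    return [sum(a[j] * pow(root, i * j, p) for j in range(len(a))) % p
            for i in range(len(a))]

def intt(A, root=W, p=P):
    """Inverse transform: use the inverse root and scale by 1/N mod p."""
    inv_root = pow(root, p - 2, p)          # modular inverse via Fermat's little theorem
    inv_n = pow(len(A), p - 2, p)
    return [(x * inv_n) % p for x in ntt(A, inv_root, p)]

msg = [3, 1, 4, 1, 5, 9, 2, 6]
assert intt(ntt(msg)) == msg  # exact round trip: no floating-point error at all
```

The exactness of the round trip is the point: unlike a complex FFT, the arithmetic is all integer, which is why such transforms suit RS encoding and decoding.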
NASA Astrophysics Data System (ADS)
Lazo, Edmundo; Saavedra, Eduardo; Humire, Fernando; Castro, Cristobal; Cortés-Cortés, Francisco
2015-09-01
We study the localization properties of direct transmission lines when we distribute two values of inductances LA and LB according to a generalized Thue-Morse aperiodic sequence generated by the inflation rule: A → ABm-1, B → BAm-1, m ≥ 2 and integer. We regain the usual Thue-Morse sequence for m = 2. We numerically study the changes produced in the localization properties of the I (ω) electric current function with increasing m values. We demonstrate that the m = 2 case does not belong to the family m ≥ 3, because when m changes from m = 2 to m = 3, the number of extended states decreases significantly. However, for m ≫ 3, the localization properties become similar to the m = 2 case. Also, the
Bilinear effect in complex systems
NASA Astrophysics Data System (ADS)
Lam, Lui; Bellavia, David C.; Han, Xiao-Pu; Alston Liu, Chih-Hui; Shu, Chang-Qing; Wei, Zhengjin; Zhou, Tao; Zhu, Jichen
2010-09-01
The distribution of the lifetime of Chinese dynasties (as well as that of the British Isles and Japan) in a linear Zipf plot is found to consist of two straight lines intersecting at a transition point. This two-section piecewise-linear distribution is different from the power law or the stretched exponent distribution, and is called the Bilinear Effect for short. With assumptions mimicking the organization of ancient Chinese regimes, a 3-layer network model is constructed. Numerical results of this model show the bilinear effect, providing a plausible explanation of the historical data. The bilinear effect in two other social systems is presented, indicating that such a piecewise-linear effect is widespread in social systems.
Gradually truncated log-normal in USA publicly traded firm size distribution
NASA Astrophysics Data System (ADS)
Gupta, Hari M.; Campanha, José R.; de Aguiar, Daniela R.; Queiroz, Gabriel A.; Raheja, Charu G.
2007-03-01
We study the statistical distribution of firm size for USA and Brazilian publicly traded firms through the Zipf plot technique. Sale size is used to measure firm size. The Brazilian firm size distribution is given by a log-normal distribution without any adjustable parameter. However, we also need to consider different parameters of log-normal distribution for the largest firms in the distribution, which are mostly foreign firms. The log-normal distribution has to be gradually truncated after a certain critical value for USA firms. Therefore, the original hypothesis of proportional effect proposed by Gibrat is valid with some modification for very large firms. We also consider the possible mechanisms behind this distribution.
The game of go as a complex network
NASA Astrophysics Data System (ADS)
Georgeot, B.; Giraud, O.
2012-03-01
We study the game of go from a complex network perspective. We construct a directed network using a suitable definition of tactical moves including local patterns, and study this network for different datasets of professional and amateur games. The move distribution follows Zipf's law and the network is scale free, with statistical peculiarities different from other real directed networks, such as, e.g., the World Wide Web. These specificities reflect in the outcome of ranking algorithms applied to it. The fine study of the eigenvalues and eigenvectors of matrices used by the ranking algorithms singles out certain strategic situations. Our results should pave the way to a better modelization of board games and other types of human strategic scheming.
VPipe: Virtual Pipelining for Scheduling of DAG Stream Query Plans
NASA Astrophysics Data System (ADS)
Wang, Song; Gupta, Chetan; Mehta, Abhay
There are data streams all around us that can be harnessed for tremendous business and personal advantage. For an enterprise-level stream processing system such as CHAOS [1] (Continuous, Heterogeneous Analytic Over Streams), handling of complex query plans with resource constraints is challenging. While several scheduling strategies exist for stream processing, efficient scheduling of complex DAG query plans is still largely unsolved. In this paper, we propose a novel execution scheme for scheduling complex directed acyclic graph (DAG) query plans with meta-data enriched stream tuples. Our solution, called Virtual Pipelined Chain (or VPipe Chain for short), effectively extends the "Chain" pipelining scheduling approach to complex DAG query plans.
Reinforcement Learning Based Web Service Compositions for Mobile Business
NASA Astrophysics Data System (ADS)
Zhou, Juan; Chen, Shouming
In this paper, we propose a new solution to Reactive Web Service Composition by modeling it with Reinforcement Learning and introducing modified (alterable) QoS variables into the model as elements of the Markov Decision Process tuple. Moreover, we give an example of Reactive-WSC-based mobile banking to demonstrate the intrinsic capability of the solution to obtain the optimized service composition, characterized by (alterable) target QoS variable sets with optimized values. Consequently, we conclude that the solution has decent potential for boosting customer experience and quality of service in Web Services, and in applications across the whole electronic commerce and business sector.
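The abstract frames service selection as a Markov Decision Process with QoS-derived rewards. A minimal tabular Q-learning sketch on a made-up three-step workflow with two candidate services per step (all QoS scores, parameter values, and names here are illustrative assumptions, not the paper's model):

```python
import random

# Hypothetical QoS reward for choosing candidate service a at workflow step s
QOS = {(0, 0): 0.2, (0, 1): 0.8,
       (1, 0): 0.9, (1, 1): 0.1,
       (2, 0): 0.5, (2, 1): 0.6}

def q_learn(episodes=2000, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning over a 3-step service-composition workflow."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(3) for a in (0, 1)}
    for _ in range(episodes):
        for s in range(3):  # one pass through the workflow per episode
            if rng.random() < eps:              # epsilon-greedy exploration
                a = rng.choice((0, 1))
            else:
                a = max((0, 1), key=lambda x: Q[(s, x)])
            future = gamma * max(Q[(s + 1, b)] for b in (0, 1)) if s < 2 else 0.0
            Q[(s, a)] += alpha * (QOS[(s, a)] + future - Q[(s, a)])
    return Q

Q = q_learn()
best = [max((0, 1), key=lambda a: Q[(s, a)]) for s in range(3)]  # greedy composition
```

Changing a QoS entry and re-learning changes the greedy composition, which is the "alterable QoS variables" idea in miniature.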
3-base periodicity in coding DNA is affected by intercodon dinucleotides
Sánchez, Joaquín
2011-01-01
All coding DNAs exhibit 3-base periodicity (TBP), which may be defined as the tendency of nucleotides and higher order n-tuples, e.g. trinucleotides (triplets), to be preferentially spaced by 3, 6, 9 etc, bases, and we have proposed an association between TBP and clustering of same-phase triplets. We here investigated if TBP was affected by intercodon dinucleotide tendencies and whether clustering of same-phase triplets was involved. Under constant protein sequence intercodon dinucleotide frequencies depend on the distribution of synonymous codons. So, possible effects were revealed by randomly exchanging synonymous codons without altering protein sequences to subsequently document changes in TBP via frequency distribution of distances (FDD) of DNA triplets. A tripartite positive correlation was found between intercodon dinucleotide frequencies, clustering of same-phase triplets and TBP. So, intercodon C|A (where “|” indicates the boundary between codons) was more frequent in native human DNA than in the codon-shuffled sequences; higher C|A frequency occurred along with more frequent clustering of C|AN triplets (where N jointly represents A, C, G and T) and with intense CAN TBP. The opposite was found for C|G, which was less frequent in native than in shuffled sequences; lower C|G frequency occurred together with reduced clustering of C|GN triplets and with less intense CGN TBP. We hence propose that intercodon dinucleotides affect TBP via same-phase triplet clustering. A possible biological relevance of our findings is briefly discussed. PMID:21814388
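The frequency distribution of distances (FDD) between triplet occurrences described above is straightforward to compute. A minimal sketch on a short synthetic sequence in which one codon recurs in-frame, so all pairwise distances are multiples of 3 (the sequence and function name are ours, not from the paper):

```python
from collections import Counter

def triplet_distance_spectrum(seq, triplet, max_dist=30):
    """Counts of start-to-start distances between occurrences of one triplet."""
    starts = [i for i in range(len(seq) - 2) if seq[i:i + 3] == triplet]
    return Counter(b - a
                   for i, a in enumerate(starts)
                   for b in starts[i + 1:]
                   if b - a <= max_dist)

# Synthetic "coding-like" sequence: the codon GCA recurs in-frame, so all
# pairwise distances between its occurrences are multiples of 3.
seq = "ATG" + "GCA" * 2 + "CAT" + "GCA" + "TTG" + "GCA"
spec = triplet_distance_spectrum(seq, "GCA")  # {6: 2, 3: 1, 9: 1, 12: 1, 15: 1}
```

In real coding DNA the spectrum is not confined to multiples of 3, but peaks at 3, 6, 9, ... reveal the 3-base periodicity the paper links to same-phase triplet clustering.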
Resource-efficient generation of linear cluster states by linear optics with postselection
Uskov, D. B.; Alsing, P. M.; Fanto, M. L.; ...
2015-01-30
Here we report on theoretical research in photonic cluster-state computing. Finding optimal schemes of generating non-classical photonic states is of critical importance for this field, as physically implementable photon-photon entangling operations are currently limited to measurement-assisted stochastic transformations. A critical parameter for assessing the efficiency of such transformations is the success probability of a desired measurement outcome. At present there are several experimental groups that are capable of generating multi-photon cluster states carrying more than eight qubits. Separate photonic qubits or small clusters can be fused into a single cluster state by a probabilistic optical CZ gate conditioned on simultaneous detection of all photons, with 1/9 success probability for each gate. This design mechanically follows the original theoretical scheme of cluster state generation proposed more than a decade ago by Raussendorf, Browne, and Briegel. The optimality of the destructive CZ gate in application to linear optical cluster state generation has not been analyzed previously. Our results reveal that this method is far from optimal. Employing numerical optimization, we have identified that the maximal success probability of fusing n unentangled dual-rail optical qubits into a linear cluster state is equal to 1/2^(n-1); an m-tuple of photonic Bell pair states, commonly generated via spontaneous parametric down-conversion, can be fused into a single cluster with a maximal success probability of 1/4^(m-1).
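The success probabilities quoted above are easy to compare exactly. A small arithmetic sketch contrasting a chain of 1/9-probability CZ fusions with the reported optima 1/2^(n-1) and 1/4^(m-1) (function names are ours):

```python
from fractions import Fraction

def chained_cz_success(n):
    """Probability that all n - 1 independent 1/9-probability CZ fusions succeed."""
    return Fraction(1, 9) ** (n - 1)

def optimal_qubit_fusion(n):
    """Reported optimum for fusing n unentangled dual-rail qubits: 1/2^(n-1)."""
    return Fraction(1, 2) ** (n - 1)

def optimal_bell_fusion(m):
    """Reported optimum for fusing m Bell pairs: 1/4^(m-1)."""
    return Fraction(1, 4) ** (m - 1)

# For an 8-qubit linear cluster built from single photons, the optimal scheme
# wins by a factor of (9/2)**7 over chained 1/9 CZ gates.
gain = optimal_qubit_fusion(8) / chained_cz_success(8)
```

Exact rational arithmetic makes the gap vivid: chaining 1/9 gates decays far faster than the 1/2^(n-1) optimum as the cluster grows.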
A Z-number-based decision making procedure with ranking fuzzy numbers method
NASA Astrophysics Data System (ADS)
Mohamad, Daud; Shaharani, Saidatull Akma; Kamis, Nor Hanimah
2014-12-01
The theory of fuzzy sets has been in the limelight of various applications in decision making problems due to its usefulness in portraying human perception and subjectivity. Generally, the evaluation in the decision making process is represented in the form of linguistic terms and the calculation is performed using fuzzy numbers. In 2011, Zadeh extended this concept by presenting the idea of the Z-number, a 2-tuple of fuzzy numbers that describes the restriction and the reliability of the evaluation. The element of reliability in the evaluation is essential, as it will affect the final result. Since this concept can still be considered new, available methods that incorporate reliability for solving decision making problems are still scarce. In this paper, a decision making procedure based on Z-numbers is proposed. Due to the limitation of its basic properties, Z-numbers will first be transformed to fuzzy numbers for simpler calculations. A method of ranking fuzzy numbers is later used to prioritize the alternatives. A risk analysis problem is presented to illustrate the effectiveness of this proposed procedure.
[Analysis of microdeletions in 22q11 in Colombian patients with congenital heart disease].
Salazar, Marleny; Villalba, Guiovanny; Mateus, Heidi; Villegas, Victoria; Fonseca, Dora; Núñez, Federico; Caicedo, Víctor; Pachón, Sonia; Bernal, Jaime E
2011-12-01
Cardiac defects are the most frequent congenital malformations, with an incidence estimated between 4 and 12 per 1000 newborns. Their etiology is multifactorial and might be attributed to genetic predispositions and environmental factors. Since 1990 these types of pathologies have been associated with 22q11 microdeletion. In this study, the frequency of microdeletion 22q11 was determined in 61 patients with non-syndromic congenital heart disease. DNA was extracted from peripheral blood and TUPLE1 and STR D10S2198 genes were amplified by multiplex PCR and visualized in agarose gels. Gene content was quantified by densitometry. Three patients were found with microdeletion 22q11, representing a 4.9% frequency. This microdeletion was associated with two cases of Tetralogy of Fallot and a third case with atrial septal defect (ASD). In conclusion, the frequency for microdeletion 22q11 in the population analyzed was 4.9%. The cases that presented Tetralogy of Fallot had a frequency for this microdeletion of 7.4% and for ASD of 11.1%.
Chiu, Pei-Hsun; Hsieh, Hsin-Ying; Wang, Sun-Chong
2012-01-01
Targeted cancer therapies, with specific molecular targets, ameliorate the side-effect issues of radiation and chemotherapy and also point toward the development of personalized medicine. Combining drugs that target multiple pathways of carcinogenesis is potentially more fruitful. Traditional Chinese medicine (TCM) has been tailoring herbal mixtures for individualized healthcare for two thousand years, so a systematic study of the patterns of TCM formulas and herbs prescribed for cancers is valuable. We analysed a total of 187,230 TCM prescriptions for 30 types of cancer in Taiwan in 2007, a year's worth of records from the National Health Insurance reimbursement database (Taiwan). We found that a TCM cancer prescription consists on average of two formulas and four herbs. We show that the percentage weights of TCM formulas and herbs in a prescription follow Zipf's law with an exponent around 0.6, and that prescriptions for benign neoplasms have a larger Zipf exponent than those for malignant cancers. Furthermore, we show that TCM prescriptions, via weighted combinations of formulas and herbs, are specific not only to the malignancy of neoplasms but also to the sites of origin of malignant cancers. From the effects of the formulas and the natures of the herbs most heavily prescribed for cancers, it can be proposed that cancer is a 'warm and stagnant' syndrome in TCM, suggesting anti-inflammatory regimens for better prevention and treatment of cancers. We show that TCM incorporated relevant formulas into the prescriptions of cancer patients with a secondary morbidity. Comparing TCM prescriptions made in different seasons, we identified temperature as the environmental factor that correlates with changes in TCM prescriptions in Taiwan; lung cancer patients were among those whose prescriptions were adjusted when temperatures dropped. The findings of our study provide insight into TCM cancer treatment, helping the dialogue between modern Western medicine and TCM for better cancer care.
PMID:22359613
Optimal growth entails risky localization in population dynamics
NASA Astrophysics Data System (ADS)
Gueudré, Thomas; Martin, David G.
2018-03-01
Growth and exploration, essential to each other, are jointly observed in living and inanimate entities such as animals, cells, or goods. But how the environment's structural and temporal properties weigh in this balance remains elusive. We analyze a model of stochastic growth with time correlations and diffusive dynamics that sheds light on the way populations grow and spread over general networks. The model suggests natural explanations of empirical facts in econophysics and ecology, such as the risk-return trade-off and Zipf's law. We conclude that optimal growth leads to a localized population distribution, but that such a risky position can be mitigated by the geometry of the space. These results have broad applicability and are illustrated with an empirical study of financial data.
Stochastic Model for Phonemes Uncovers an Author-Dependency of Their Usage.
Deng, Weibing; Allahverdyan, Armen E
2016-01-01
We study rank-frequency relations for phonemes, the minimal units that still relate to linguistic meaning. We show that these relations can be described by the Dirichlet distribution, a direct analogue of the ideal-gas model in statistical mechanics. This description allows us to demonstrate that the rank-frequency relations for the phonemes of a text depend on its author. The author-dependency effect is not caused by the author's vocabulary (common words used in different texts), and is confirmed by several alternative means, suggesting that it relates directly to phonemes. These features contrast with rank-frequency relations for words, which are both author- and text-independent and are governed by Zipf's law.
A formation control strategy with coupling weights for the multi-robot system
NASA Astrophysics Data System (ADS)
Liang, Xudong; Wang, Siming; Li, Weijie
2017-12-01
The distributed formation problem of multi-robot systems with general linear dynamics and a directed communication topology is discussed. To prevent the multi-robot system from failing to maintain the desired formation in complex communication environments, a distributed cooperative algorithm with coupling weights based on the Zipf distribution is designed. An asymptotic stability condition for the formation is given, and graph theory and Lyapunov theory are used to prove that, under this condition, the formation converges to the desired geometry and to the desired motion of the virtual leader. Nontrivial simulations validate the effectiveness of the distributed cooperative algorithm with coupling weights.
Scaling laws and fluctuations in the statistics of word frequencies
NASA Astrophysics Data System (ADS)
Gerlach, Martin; Altmann, Eduardo G.
2014-11-01
In this paper, we combine statistical analysis of written texts and simple stochastic models to explain the appearance of scaling laws in the statistics of word frequencies. The average vocabulary of an ensemble of fixed-length texts is known to scale sublinearly with the total number of words (Heaps' law). Analyzing the fluctuations around this average in three large databases (Google-ngram, English Wikipedia, and a collection of scientific articles), we find that the standard deviation scales linearly with the average (Taylor's law), in contrast to the prediction of decaying fluctuations obtained using simple sampling arguments. We explain both scaling laws (Heaps' and Taylor's) by modeling the usage of words as a Poisson process with a fat-tailed distribution of word frequencies (Zipf's law) and topic-dependent frequencies of individual words (as in topic models). Considering topical variations leads to quenched averages, turns the vocabulary size into a non-self-averaging quantity, and explains the empirical observations. For the numerous practical applications relying on estimations of vocabulary size, our results show that uncertainties remain large even for long texts. We show how to account for these uncertainties in measurements of the lexical richness of texts with different lengths.
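The sampling picture described above is easy to reproduce in a minimal simulation: drawing words independently with Zipf-distributed frequencies already yields sublinear (Heaps-like) vocabulary growth. The parameters and seed below are illustrative assumptions, not those used in the paper.

```python
# Minimal sketch: i.i.d. word sampling with Zipf frequencies, tracking how
# the vocabulary (number of distinct words) grows with text length.
import random

def zipf_frequencies(n_types, exponent=1.0):
    """Normalized Zipf frequencies f_r proportional to r**(-exponent)."""
    weights = [r ** -exponent for r in range(1, n_types + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def vocabulary_growth(n_types, text_length, exponent=1.0, seed=0):
    """Sample a text word by word and record the running vocabulary size."""
    rng = random.Random(seed)
    freqs = zipf_frequencies(n_types, exponent)
    words = rng.choices(range(n_types), weights=freqs, k=text_length)
    seen, growth = set(), []
    for w in words:
        seen.add(w)
        growth.append(len(seen))
    return growth

growth = vocabulary_growth(n_types=10000, text_length=5000)
# Doubling the text length far less than doubles the vocabulary.
```

Repeating the run over many seeds and measuring the standard deviation of the final vocabulary size is the natural next step toward the Taylor's-law fluctuation analysis the abstract describes.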
Orbifold genera, product formulas and power operations
NASA Astrophysics Data System (ADS)
Ganter, Nora
2004-07-01
We generalize the definition of orbifold elliptic genus, and introduce orbifold genera of chromatic level h, using h-tuples rather than pairs of commuting elements. We show that our genera are in fact orbifold invariants, and we prove integrality results for them. If the genus arises from an H-infinity-map into the Morava-Lubin-Tate theory E_h, then we give a formula expressing the orbifold genus of the symmetric powers of a stably almost complex manifold M in terms of the genus of M itself. Our formula is the p-typical analogue of the Dijkgraaf-Moore-Verlinde-Verlinde formula for the orbifold elliptic genus. It depends only on h and not on the genus.
Relationship of order and number of siblings to perceived parental attitudes in childhood.
Kitamura, T; Sugawara, M; Shima, S; Toda, M A
1998-06-01
Despite the increasingly recognized link between perceived parenting behavior and the onset of psychopathology in adults, studies of the possible determinants of perceptions of parenting behavior are rare. In a sample of 1,145 pregnant Japanese women, correlations were examined between the numbers and sexes of siblings and perceived rearing practices, as rated by the Parental Bonding Instrument (PBI; Parker, Tupling, & Brown, 1979). The participants with more elder sisters viewed their parents' attitudes as less caring, whereas those with more brothers, particularly younger brothers, viewed their parents' attitudes as less overprotective. However, the proportion of the variance of all the PBI scores explained by different types of siblings was very small.
A simple system for 160GHz optical terahertz wave generation and data modulation
NASA Astrophysics Data System (ADS)
Li, Yihan; He, Jingsuo; Sun, Xueming; Shi, Zexia; Wang, Ruike; Cui, Hailin; Su, Bo; Zhang, Cunlin
2018-01-01
A simple system based on two cascaded Mach-Zehnder modulators, which generates 160 GHz optical terahertz waves from a 40 GHz microwave source, is simulated and tested in this paper. A fiber grating filter is used in the system to filter out the optical carrier. By properly adjusting the modulators' DC bias voltages and the signal voltages and phases, a frequency-quadrupled (4-tupling) optical terahertz wave can be generated with the fiber grating. This notch fiber grating filter is well suited to terahertz-over-fiber (TOF) communication systems, and the scheme greatly reduces the cost of long-distance terahertz communication. Furthermore, a 10 Gbps digital signal is modulated onto the 160 GHz optical terahertz wave.
Statistical and linguistic features of DNA sequences
NASA Technical Reports Server (NTRS)
Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.
1995-01-01
We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range: bases thousands of base pairs apart are correlated. We do not find such long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
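A bare-bones sketch of the DFA procedure mentioned above, in its usual formulation: integrate the mean-subtracted sequence into a "walk" profile, split the profile into non-overlapping windows of size n, remove a least-squares linear trend in each window, and record the RMS fluctuation F(n). The window sizes and the test series below are illustrative choices, not the paper's data.

```python
# Hedged sketch of Detrended Fluctuation Analysis (DFA).
import random

def linear_detrend_rms(segment):
    """RMS residual of a least-squares line fitted to the segment."""
    n = len(segment)
    xs = range(n)
    mean_x = (n - 1) / 2.0
    mean_y = sum(segment) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, segment))
    slope = sxy / sxx
    resid2 = sum((y - (mean_y + slope * (x - mean_x))) ** 2
                 for x, y in zip(xs, segment))
    return (resid2 / n) ** 0.5

def dfa(series, window_sizes):
    """Return [(n, F(n))] fluctuation values for the given window sizes."""
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for v in series:
        total += v - mean
        profile.append(total)  # integrated ("random walk") profile
    result = []
    for n in window_sizes:
        rms = [linear_detrend_rms(profile[i:i + n])
               for i in range(0, len(profile) - n + 1, n)]
        f_n = (sum(r * r for r in rms) / len(rms)) ** 0.5
        result.append((n, f_n))
    return result

rng = random.Random(1)
noise = [rng.choice((-1.0, 1.0)) for _ in range(4096)]
fluctuations = dfa(noise, [8, 16, 32, 64])
# For an uncorrelated series F(n) grows roughly as n**0.5; long-range
# correlated sequences show a steeper power law.
```

The scaling exponent is estimated from the slope of log F(n) versus log n; values near 0.5 indicate no long-range correlation, which is the contrast the abstract draws between coding and noncoding DNA.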
NASA Astrophysics Data System (ADS)
Matassini, Lorenzo; Franci, Fabio
2001-01-01
Starting from observations of real trading activity, we propose a model of a stock market that simulates all the typical phases of a stock exchange. We show that there is no need for several classes of agents once realistic constraints are introduced to confine money, time, gain, and loss within appropriate ranges. The main ingredients are local and global coupling, randomness, a Zipf distribution of resources, and price formation at order insertion. The simulation starts with the initial public offering and comprises the broadcasting of news/advertisements and the building of the book, where all selling and buying orders are stored. The model reproduces fat tails and clustered volatility, the two most significant characteristics of a real stock market, while being driven by very intuitive parameters.
The game of go as a complex network
NASA Astrophysics Data System (ADS)
Georgeot, Bertrand; Giraud, Olivier; Kandiah, Vivek
2014-03-01
We have studied the game of go, one of the most ancient and complex board games, from a complex-network perspective. We have defined a proper categorization of moves taking into account the local environment, and shown that in this case Zipf's law emerges from data taken from real games. The network shows differences between professional and amateur games, between different levels of amateurs, and between different phases of the game. Certain eigenvectors are localized on specific groups of moves which correspond to different strategies (communities of moves). The point of view developed here should allow such games to be modeled better and could also help in designing simulators that may in the future beat good human players. Our approach could be applied to other types of games and, in parallel, shed light on the human decision-making process.
Fault-Tolerant and Elastic Streaming MapReduce with Decentralized Coordination
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kumbhare, Alok; Frincu, Marc; Simmhan, Yogesh
2015-06-29
The MapReduce programming model, due to its simplicity and scalability, has become an essential tool for processing large data volumes in distributed environments. Recent Stream Processing Systems (SPS) extend this model to provide low-latency analysis of high-velocity continuous data streams. However, integrating MapReduce with streaming poses challenges: first, runtime variations in data characteristics such as data rates and key distribution cause resource overload, which in turn leads to fluctuations in the Quality of Service (QoS); and second, stateful reducers, whose state depends on the complete tuple history, necessitate efficient fault-recovery mechanisms to maintain the desired QoS in the presence of resource failures. We propose an integrated streaming MapReduce architecture leveraging the concept of consistent hashing to support runtime elasticity, along with locality-aware data and state replication, to provide efficient load balancing with low-overhead fault tolerance and parallel recovery from multiple simultaneous failures. Our evaluation on a private cloud shows up to 2.8x improvement in peak throughput compared to the Apache Storm SPS, and a low recovery latency of 700-1500 ms from multiple failures.
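The elasticity mechanism above rests on consistent hashing, whose defining property is that removing a node remaps only the keys that node owned. A generic hash-ring sketch with virtual nodes follows; node names and parameters are hypothetical, and this is not the paper's implementation.

```python
# Hedged sketch of a consistent-hash ring with virtual nodes.
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node)
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node):
        # Each node appears at `vnodes` positions for smoother balance.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get(self, key):
        """Owner of `key`: the first ring position clockwise of its hash."""
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h, chr(0x10FFFF)))
        return self._ring[idx % len(self._ring)][1]

ring = HashRing(["reducer-a", "reducer-b", "reducer-c"])
owner = ring.get("some-stream-key")
```

When a reducer leaves (or fails), only keys whose clockwise successor was that reducer move to a new owner; all other key-to-reducer assignments, and hence their state, stay put.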
Mathematical Theory of Generalized Duality Quantum Computers Acting on Vector-States
NASA Astrophysics Data System (ADS)
Cao, Huai-Xin; Long, Gui-Lu; Guo, Zhi-Hua; Chen, Zheng-Li
2013-06-01
Following the idea of duality quantum computation, a generalized duality quantum computer (GDQC) acting on vector-states is defined as a tuple consisting of a generalized quantum wave divider (GQWD) and a finite number of unitary operators as well as a generalized quantum wave combiner (GQWC). It is proved that the GQWD and GQWC of a GDQC are an isometry and a co-isometry, respectively, and mutually dual. It is also proved that every GDQC gives a contraction, called a generalized duality quantum gate (GDQG). A classification of GDQCs is given and the properties of GDQGs are discussed. Some applications are obtained, including two orthogonal duality quantum computer algorithms for unsorted database search and an understanding of the Mach-Zehnder interferometer.
Integrated 3-D vision system for autonomous vehicles
NASA Astrophysics Data System (ADS)
Hou, Kun M.; Shawky, Mohamed; Tu, Xiaowei
1992-03-01
Nowadays, autonomous vehicles have become a multidisciplinary field whose evolution takes advantage of recent progress in computer architectures. As development tools become more sophisticated, the trend is toward more specialized, or even dedicated, architectures. In this paper we focus on a parallel vision subsystem integrated into the overall system architecture. The system modules work in parallel, communicating through a hierarchical blackboard, an extension of the 'tuple space' from the LINDA concept, where they may exchange data or synchronization messages. The general-purpose processing elements have different skills, built around 40 MHz Intel i860 RISC processors for high-level processing and pipelined systolic array processors based on PLAs or FPGAs for low-level processing.
TEQUEL: The query language of SADDLE
NASA Technical Reports Server (NTRS)
Rajan, S. D.
1984-01-01
A relational database management system tailored for engineering applications is presented. A wide variety of engineering data types are supported, and the data definition language (DDL) and data manipulation language (DML) are extended to handle matrices. The system can be used either in standalone mode or through a FORTRAN or PASCAL application program. The query language is of the relational calculus type and allows the user to store, retrieve, update, and delete tuples from relations. The relational operations, including union, intersect, and differ, facilitate the creation of temporary relations that can be used to manipulate information in a powerful manner. Sample applications illustrate the creation of data through a FORTRAN program and data manipulation using the TEQUEL DML.
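TEQUEL's concrete syntax is not given in the abstract, so the relational operations it names (union, intersect, differ, plus calculus-style selection over tuples) are sketched here with Python sets of tuples; the relation contents and attribute positions are made up for illustration.

```python
# Relations as sets of tuples; attribute 0 = node id, attribute 1 = material.
nodes_a = {(1, "steel"), (2, "steel"), (3, "concrete")}
nodes_b = {(2, "steel"), (4, "aluminium")}

union = nodes_a | nodes_b       # all tuples from either relation
intersect = nodes_a & nodes_b   # tuples present in both relations
differ = nodes_a - nodes_b      # tuples in A but not in B

# Relational-calculus-style selection: { t | t in nodes_a AND material = steel }
steel = {t for t in nodes_a if t[1] == "steel"}

print(sorted(union))
```

In the actual system these operations would be issued through the TEQUEL DML (or a host FORTRAN/PASCAL program) rather than set expressions, but the semantics of union, intersect, and differ over tuples are the same.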
DataSpread: Unifying Databases and Spreadsheets.
Bendre, Mangesh; Sun, Bofan; Zhang, Ding; Zhou, Xinyan; Chang, Kevin ChenChuan; Parameswaran, Aditya
2015-08-01
Spreadsheet software is often the tool of choice for ad-hoc tabular data management, processing, and visualization, especially on tiny data sets. On the other hand, relational database systems offer significant power, expressivity, and efficiency over spreadsheet software for data management, while lacking in the ease of use and ad-hoc analysis capabilities. We demonstrate DataSpread, a data exploration tool that holistically unifies databases and spreadsheets. It continues to offer a Microsoft Excel-based spreadsheet front-end, while in parallel managing all the data in a back-end database, specifically, PostgreSQL. DataSpread retains all the advantages of spreadsheets, including ease of use, ad-hoc analysis and visualization capabilities, and a schema-free nature, while also adding the advantages of traditional relational databases, such as scalability and the ability to use arbitrary SQL to import, filter, or join external or internal tables and have the results appear in the spreadsheet. DataSpread needs to reason about and reconcile differences in the notions of schema, addressing of cells and tuples, and the current "pane" (which exists in spreadsheets but not in traditional databases), and support data modifications at both the front-end and the back-end. Our demonstration will center on our first and early prototype of the DataSpread, and will give the attendees a sense for the enormous data exploration capabilities offered by unifying spreadsheets and databases.
PMID:26900487
Super-bridges suspended over carbon nanotube cables
NASA Astrophysics Data System (ADS)
Carpinteri, Alberto; Pugno, Nicola M.
2008-11-01
In this paper the new concept of 'super-bridges', i.e. kilometre-long bridges suspended over carbon nanotube cables, is introduced. The analysis shows that the use of realistic (thus defective) carbon nanotube bundles as suspension cables can enlarge the current limit main span by a factor of ~3. Excessive compliance and dynamic self-excited resonances could be avoided by additional strands, anchoring the super-bridge like a spider's cobweb. As an example, we have computed the limit main spans of the 19 existing suspended-deck bridges longer than 1 km, assuming their cables were replaced with carbon nanotube bundles (maintaining the same geometry, with the exception of the length), finding spans of up to ~6.3 km. We thus suggest that the design of the Messina bridge in Italy, which would require a main span of ~3.3 km, could benefit from the use of carbon nanotube bundles, and we believe their use represents a feasible and economically convenient solution. The plausibility of these affirmations is confirmed by a statistical analysis of the 100 longest existing suspended bridges, which follow a Zipf's law with an exponent of 1.1615: we have found a Moore-like (i.e. exponential) law in which the doubling of the capacity (here the main span) per year is replaced by a factor of 1.0138. Such a law predicts that the realization of the Messina bridge using conventional materials will only occur around the middle of the present century, whereas it could be expected in the near future if carbon nanotube bundles were used. A simple cost analysis concludes the paper.
Black swans and dragon kings: A unified model
NASA Astrophysics Data System (ADS)
Eliazar, Iddo
2017-09-01
The term “black swan” is a metaphor for outlier events whose statistics are characterized by Pareto's Law and by Zipf's Law; namely, statistics governed by power-law tails. The term “dragon king” is a metaphor for a singular outlier event which, in comparison with all other outlier events, is in a league of its own. As an illustrative example consider the wealth of a family that is sampled at random from a medieval society: the nobility constitutes the black-swan category, and the royal family constitutes the dragon-king category. In this paper we present and analyze a dynamical model that generates, universally and jointly, black swans and dragon kings. According to this model, growing from the microscopic scale to the macroscopic scale, black swans and dragon kings emerge together and invariantly with respect to initial conditions.
Abe, Sumiyoshi
2002-10-01
The q-exponential distributions, which are generalizations of the Zipf-Mandelbrot power-law distribution, are frequently encountered in complex systems at their stationary states. From the viewpoint of the principle of maximum entropy, they can apparently be derived from three different generalized entropies: the Rényi entropy, the Tsallis entropy, and the normalized Tsallis entropy. Accordingly, mere fittings of observed data by the q-exponential distributions do not lead to identification of the correct physical entropy. Here, stabilities of these entropies, i.e., their behaviors under arbitrary small deformation of a distribution, are examined. It is shown that, among the three, the Tsallis entropy is stable and can provide an entropic basis for the q-exponential distributions, whereas the others are unstable and cannot represent any experimentally observable quantities.
Martínez-Santiago, O; Marrero-Ponce, Y; Vivas-Reyes, R; Rivera-Borroto, O M; Hurtado, E; Treto-Suarez, M A; Ramos, Y; Vergara-Murillo, F; Orozco-Ugarriza, M E; Martínez-López, Y
2017-05-01
Graph derivative indices (GDIs) have recently been defined over N-atoms (N = 2, 3 and 4) simultaneously, which are based on the concept of derivatives in discrete mathematics (finite difference), metaphorical to the derivative concept in classical mathematical analysis. These molecular descriptors (MDs) codify topo-chemical and topo-structural information based on the concept of the derivative of a molecular graph with respect to a given event (S) over duplex, triplex and quadruplex relations of atoms (vertices). These GDIs have been successfully applied in the description of physicochemical properties like reactivity, solubility and chemical shift, among others, and in several comparative quantitative structure activity/property relationship (QSAR/QSPR) studies. Although satisfactory results have been obtained in previous modelling studies with the aforementioned indices, it is necessary to develop new, more rigorous analysis to assess the true predictive performance of the novel structure codification. So, in the present paper, an assessment and statistical validation of the performance of these novel approaches in QSAR studies are executed, as well as a comparison with those of other QSAR procedures reported in the literature. To achieve the main aim of this research, QSARs were developed on eight chemical datasets widely used as benchmarks in the evaluation/validation of several QSAR methods and/or many different MDs (fundamentally 3D MDs). Three to seven variable QSAR models were built for each chemical dataset, according to the original dissection into training/test sets. The models were developed by using multiple linear regression (MLR) coupled with a genetic algorithm as the feature wrapper selection technique in the MobyDigs software. Each family of GDIs (for duplex, triplex and quadruplex) behaves similarly in all modelling, although there were some exceptions. 
However, when all families were used in combination, the results achieved were quantitatively higher than those reported by other authors in similar experiments. Comparisons with respect to external correlation coefficients (q 2 ext ) revealed that the models based on GDIs possess superior predictive ability in seven of the eight datasets analysed, outperforming methodologies based on similar or more complex techniques and confirming the good predictive power of the obtained models. For the q 2 ext values, the non-parametric comparison revealed significantly different results to those reported so far, which demonstrated that the models based on DIVATI's indices presented the best global performance and yielded significantly better predictions than the 12 0-3D QSAR procedures used in the comparison. Therefore, GDIs are suitable for structure codification of the molecules and constitute a good alternative to build QSARs for the prediction of physicochemical, biological and environmental endpoints.
Long frame sync words for binary PSK telemetry
NASA Technical Reports Server (NTRS)
Levitt, B. K.
1975-01-01
Correlation criteria have previously been established for identifying whether a given binary sequence would be a good frame sync word for phase-shift keyed telemetry. In the past, the search for a good K-bit sync word has involved the application of these criteria to the entire set of 2^K binary K-tuples. It is shown that restricting this search to a much smaller subset consisting of K-bit prefixes of pseudonoise sequences results in sync words of comparable quality, with greatly reduced computer search times for larger values of K. As an example, this procedure is used to find good sync words of length 16-63; from a storage viewpoint, each of these sequences can be generated by a 5- or 6-bit linear feedback shift register.
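The construction described above, taking K-bit prefixes of pseudonoise sequences, can be sketched with a small linear feedback shift register. The tap set used here (corresponding to x^5 + x^3 + 1, a known maximal-length polynomial for 5 stages) and the seed are illustrative assumptions, not the paper's selections.

```python
# Hedged sketch: K-bit prefix of a PN (m-)sequence from a 5-stage Fibonacci LFSR.

def lfsr_sequence(taps, state, length):
    """Output `length` bits from a Fibonacci LFSR with 1-indexed taps."""
    bits = []
    for _ in range(length):
        bits.append(state[-1])       # output the last stage
        fb = 0
        for t in taps:
            fb ^= state[t - 1]       # XOR of the tapped stages
        state = [fb] + state[:-1]    # shift right, insert feedback
    return bits

def pn_prefix(k, taps=(5, 3), seed=(1, 0, 0, 0, 0)):
    """First k bits of the PN sequence: a candidate k-bit sync word."""
    return lfsr_sequence(taps, list(seed), k)

word = pn_prefix(31)
```

Candidate words produced this way would then be screened with the correlation criteria the abstract refers to; the point of the paper is that this prefix subset is far smaller than all 2^K K-tuples while yielding words of comparable quality.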
What Is Spatio-Temporal Data Warehousing?
NASA Astrophysics Data System (ADS)
Vaisman, Alejandro; Zimányi, Esteban
In recent years, extending OLAP (On-Line Analytical Processing) systems with spatial and temporal features has attracted the attention of the GIS (Geographic Information Systems) and database communities. However, there is no commonly agreed definition of what a spatio-temporal data warehouse is or what functionality such a data warehouse should support. Further, the solutions proposed in the literature vary considerably in the kind of data that can be represented as well as the kind of queries that can be expressed. In this paper we present a conceptual framework for defining spatio-temporal data warehouses using an extensible data type system. We also define a taxonomy of classes of queries of increasing expressive power, and show how to express such queries using an extension of the tuple relational calculus with aggregate functions.
Alignment-free sequence comparison (II): theoretical power of comparison statistics.
Wan, Lin; Reinert, Gesine; Sun, Fengzhu; Waterman, Michael S
2010-11-01
Rapid methods for alignment-free sequence comparison make large-scale comparisons between sequences increasingly feasible. Here we study the power of the statistic D2, which counts the number of matching k-tuples between two sequences, as well as D2*, which uses centralized counts, and D2S, which is a self-standardized version, both from a theoretical viewpoint and numerically, providing an easy-to-use program. The power is assessed under two alternative hidden Markov models; the first assumes that the two sequences share a common motif, whereas the second is a pattern transfer model; the null model is that the two sequences are composed of independent and identically distributed letters and are independent of each other. Under the first alternative model, the means of the tuple counts in the individual sequences change, whereas under the second alternative model, the marginal means are the same as under the null model. Using the limit distributions of the count statistics under the null and alternative models, we find that, asymptotically, D2S generally has the largest power, followed by D2*, whereas the power of D2 can even be zero in some cases. In contrast, even for sequences of length 140,000 bp, in simulations D2* generally has the largest power. Under the first alternative model of a shared motif, the power of D2* approaches 100% when sufficiently many motifs are shared, and we recommend the use of D2* for such practical applications. Under the second alternative model of pattern transfer, the power of all three count statistics does not increase with sequence length once the sequence is sufficiently long, and hence none of the three statistics under consideration can be recommended in such a situation. We illustrate the approach on 323 transcription factor binding motifs of length at most 10 from JASPAR CORE (October 12, 2009 version), verifying that D2* is generally more powerful than D2.
The program to calculate the power of D2, D2* and D2S can be downloaded from http://meta.cmb.usc.edu/d2. Supplementary Material is available at www.liebertonline.com/cmb.
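The count statistic at the heart of this comparison is simple to state: D2 is the total number of matching k-tuple pairs between two sequences. A minimal sketch of that definition (function names are illustrative, not from the authors' program):

```python
def kmer_counts(seq, k):
    """Count occurrences of each k-tuple (k-mer) in a sequence."""
    counts = {}
    for i in range(len(seq) - k + 1):
        w = seq[i:i + k]
        counts[w] = counts.get(w, 0) + 1
    return counts

def d2(seq_a, seq_b, k):
    """D2: total number of matching k-tuple pairs between two sequences,
    i.e. the sum over k-mers of (count in A) * (count in B)."""
    ca, cb = kmer_counts(seq_a, k), kmer_counts(seq_b, k)
    return sum(ca[w] * cb.get(w, 0) for w in ca)
```

D2* and D2S then centralize and standardize these counts under a background letter model; the raw D2 above is the starting point for both.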
Si, Sheng-Li; You, Xiao-Yue; Liu, Hu-Chen; Huang, Jia
2017-08-19
Performance analysis is an important way for hospitals to achieve higher efficiency and effectiveness in providing services to their customers. The performance of the healthcare system can be measured by many indicators, but it is difficult to improve them simultaneously due to the limited resources. A feasible way is to identify the central and influential indicators to improve healthcare performance in a stepwise manner. In this paper, we propose a hybrid multiple criteria decision making (MCDM) approach to identify key performance indicators (KPIs) for holistic hospital management. First, through integrating evidential reasoning approach and interval 2-tuple linguistic variables, various assessments of performance indicators provided by healthcare experts are modeled. Then, the decision making trial and evaluation laboratory (DEMATEL) technique is adopted to build an interactive network and visualize the causal relationships between the performance indicators. Finally, an empirical case study is provided to demonstrate the proposed approach for improving the efficiency of healthcare management. The results show that "accidents/adverse events", "nosocomial infection", "incidents/errors", "number of operations/procedures" are significant influential indicators. Also, the indicators of "length of stay", "bed occupancy" and "financial measures" play important roles in performance evaluation of the healthcare organization. The proposed decision making approach could be considered as a reference for healthcare administrators to enhance the performance of their healthcare institutions.
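The DEMATEL step referred to above turns a matrix of pairwise direct influences into a total-relation matrix from which central and influential indicators are read off. A minimal sketch, using the Neumann series T = N + N² + ... in place of the usual closed form T = N(I − N)⁻¹ (the matrix values are illustrative, not from the case study):

```python
def mat_mul(A, B):
    """Plain dense matrix product for small square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dematel_total(direct, terms=200):
    """Total-relation matrix T = N + N^2 + ... , where N is the
    direct-influence matrix normalized by the largest row/column sum."""
    n = len(direct)
    s = max(max(sum(row) for row in direct),
            max(sum(direct[i][j] for i in range(n)) for j in range(n)))
    N = [[v / s for v in row] for row in direct]
    T = [row[:] for row in N]
    P = N
    for _ in range(terms):
        P = mat_mul(P, N)
        T = [[T[i][j] + P[i][j] for j in range(n)] for i in range(n)]
    return T
```

Row sums r_i and column sums c_i of T then give each indicator's prominence (r_i + c_i) and net relation (r_i − c_i), the quantities DEMATEL visualizes.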
Tarafder, Sumit; Toukir Ahmed, Md; Iqbal, Sumaiya; Tamjidul Hoque, Md; Sohel Rahman, M
2018-03-14
Accessible surface area (ASA) of a protein residue is an effective feature for protein structure prediction, binding region identification, fold recognition problems etc. Improving the prediction of ASA by the application of effective feature variables is a challenging but explorable task to consider, especially in the field of machine learning. Among the existing predictors of ASA, REGAd3p is a highly accurate ASA predictor which is based on regularized exact regression with polynomial kernel of degree 3. In this work, we present a new predictor RBSURFpred, which extends REGAd3p on several dimensions by incorporating 58 physicochemical, evolutionary and structural properties into 9-tuple peptides via Chou's general PseAAC, which allowed us to obtain higher accuracies in predicting both real-valued and binary ASA. We have compared RBSURFpred for both real and binary space predictions with state-of-the-art predictors, such as REGAd3p and SPIDER2. We also have carried out a rigorous analysis of the performance of RBSURFpred in terms of different amino acids and their properties, and also with biologically relevant case-studies. The performance of RBSURFpred establishes itself as a useful tool for the community. Copyright © 2018 Elsevier Ltd. All rights reserved.
Zhang, Kejiang; Achari, Gopal; Pei, Yuansheng
2010-10-01
Different types of uncertain information (linguistic, probabilistic, and possibilistic) exist in site characterization. Their representation and propagation significantly influence the management of contaminated sites. In the absence of a framework with which to properly represent and integrate these quantitative and qualitative inputs, decision makers cannot fully take advantage of the available and necessary information to identify all the plausible alternatives. A systematic methodology was developed in the present work to incorporate linguistic, probabilistic, and possibilistic information into the Preference Ranking Organization METHod for Enrichment Evaluation (PROMETHEE), a subgroup of Multi-Criteria Decision Analysis (MCDA) methods, for ranking contaminated sites. The identification of criteria based on the paradigm of comparative risk assessment provides a rationale for risk-based prioritization. Uncertain linguistic, probabilistic, and possibilistic information identified in characterizing contaminated sites can be properly represented as numerical values, intervals, probability distributions, fuzzy sets or possibility distributions, and linguistic variables, according to its nature. These different kinds of representation are first transformed into a 2-tuple linguistic representation domain. The propagation of hybrid uncertainties is then carried out in the same domain. This methodology can use the original site information directly as much as possible. The case study shows that this systematic methodology provides more reasonable results. © 2010 SETAC.
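The 2-tuple linguistic representation used above (due to Herrera and Martínez) keeps aggregation lossless by pairing a linguistic label with a symbolic translation α ∈ [−0.5, 0.5). A minimal sketch with an illustrative five-label scale (not the paper's own label set):

```python
def to_two_tuple(beta, labels):
    """Delta: map a value beta in [0, g] (g = len(labels) - 1) to the
    closest label plus a symbolic translation alpha in [-0.5, 0.5)."""
    i = min(max(int(round(beta)), 0), len(labels) - 1)
    return labels[i], beta - i

def from_two_tuple(label, alpha, labels):
    """Inverse Delta: recover the numeric value of a 2-tuple."""
    return labels.index(label) + alpha

# Illustrative linguistic scale
scale = ["very low", "low", "medium", "high", "very high"]
```

Because Delta and its inverse are bijective, averages of expert assessments can be computed numerically and mapped back to a label without rounding error accumulating, which is what makes the representation suitable for combining heterogeneous inputs.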
Distance-weighted city growth.
Rybski, Diego; García Cantú Ros, Anselmo; Kropp, Jürgen P
2013-04-01
Urban agglomerations exhibit complex emergent features of which Zipf's law, i.e., a power-law size distribution, and fractality may be regarded as the most prominent ones. We propose a simplistic model for the generation of citylike structures which is solely based on the assumption that growth is more likely to take place close to inhabited space. The model involves one parameter, an exponent determining how strongly the attraction decays with the distance. In addition, the model is run iteratively so that existing clusters can grow (together) and new ones can emerge. The model is capable of reproducing the size distribution and the fractality of the boundary of the largest cluster. Although the power-law distribution depends on both the imposed exponent and the iteration, the fractality seems to be independent of the former and only depends on the latter. Analyzing land-cover data, we estimate the parameter value γ≈2.5 for Paris and its surroundings.
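The one-parameter rule described above (growth more likely near inhabited space, with attraction decaying as distance^−γ) can be sketched on a small grid. This single-seed variant is illustrative only; the model in the paper is run iteratively so that several clusters can grow together and new ones emerge:

```python
import random

def grow_city(steps, gamma, size=21, seed=0):
    """Sketch of distance-weighted growth: at each step an empty cell is
    occupied with probability proportional to the sum over occupied
    cells of distance**(-gamma)."""
    rng = random.Random(seed)
    occupied = {(size // 2, size // 2)}  # single seed at the grid center
    for _ in range(steps):
        cells, weights = [], []
        for x in range(size):
            for y in range(size):
                if (x, y) in occupied:
                    continue
                w = sum(((x - ox) ** 2 + (y - oy) ** 2) ** (-gamma / 2)
                        for ox, oy in occupied)
                cells.append((x, y))
                weights.append(w)
        occupied.add(rng.choices(cells, weights=weights)[0])
    return occupied
```

Larger γ concentrates growth next to the existing cluster (compact shapes); smaller γ lets distant cells fire, producing scattered, more fractal patterns.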
NASA Astrophysics Data System (ADS)
Lyons, M.; Siegel, Edward Carl-Ludwig
2011-03-01
Weiss-Page-Holthaus [Physica A 341, 586 (2004); http://arxiv.org/abs/cond-mat/0403295] number factorization via BEQS BEC vs.(?) the Shor algorithm, strongly supporting Watkins' [www.secamlocal.ex.ac.uk/people/staff/mrwatkin/] intersection of "pure" number theory with (statistical) physics, as in Siegel [AMS Joint Mtg. (2002), Abs. 973-60-124]: Benford logarithmic-law algebraic inversion to only BEQS with d=0 digit, P(d=0) >= BEC gap. Siegel's Riemann-hypothesis proof via Rayleigh [Phil. Trans. CLXI (1870)], Polya [Math. Ann. (1921)], Random Walks and Electric Networks [MAA (1981)], Anderson [PRL (1958)] localization, and Siegel [Symp. Fractals, MRS Fall Mtg. (1989), 5 papers]. FUZZYICS = CATEGORYICS: [locality] morphism/crossover/antonym -> (globality) functor/synonym; concomitance to noise = fluctuation-dissipation theorem; equivalence/proportionality to the generalized-susceptibility power spectrum: [flat/functionless/white] morphism -> hyperbolicity/Zipf-law inevitability, in intersection with only BEQS BEC.
Power Laws and Market Crashes ---Empirical Laws on Bursting Bubbles---
NASA Astrophysics Data System (ADS)
Kaizoji, T.
In this paper, we quantitatively investigate the statistical properties of a statistical ensemble of stock prices. We selected 1200 stocks traded on the Tokyo Stock Exchange, and formed a statistical ensemble of daily stock prices for each trading day in the 3-year period from January 4, 1999 to December 28, 2001, corresponding to the period of the forming of the internet bubble in Japan, and its bursting in the Japanese stock market. We found that the tail of the complementary cumulative distribution function of the ensemble of stock prices in the high value of the price is well described by a power-law distribution, P(S > x) ∼ x^{-α}, with an exponent that moves in the range 1.09 < α < 1.27. Furthermore, we found that as the power-law exponent α approached unity, the bubbles collapsed. This suggests that Zipf's law for stock prices is a sign that bubbles are going to burst.
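Tail exponents like the α tracked here are usually estimated by maximum likelihood (the Hill estimator) rather than by least-squares fitting of the log-log CCDF. A minimal sketch, checked on a synthetic sample rather than the Tokyo Stock Exchange data:

```python
import math
import random

def powerlaw_alpha_mle(data, xmin):
    """MLE of alpha for the CCDF P(X > x) = (x / xmin)^(-alpha), using
    only observations x >= xmin: alpha_hat = n / sum(ln(x_i / xmin))."""
    tail = [x for x in data if x >= xmin]
    return len(tail) / sum(math.log(x / xmin) for x in tail)

# Synthetic check: inverse-CDF sampling gives X = xmin * U^(-1/alpha)
rng = random.Random(1)
sample = [1.0 * rng.random() ** (-1 / 1.2) for _ in range(20000)]
estimate = powerlaw_alpha_mle(sample, 1.0)
```

With 20,000 points the standard error of the estimate is roughly α/√n ≈ 0.009, so the estimator recovers the generating exponent closely.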
Evolution of the most common English words and phrases over the centuries.
Perc, Matjaz
2012-12-07
By determining the most common English words and phrases since the beginning of the sixteenth century, we obtain a unique large-scale view of the evolution of written text. We find that the most common words and phrases in any given year had a much shorter popularity lifespan in the sixteenth century than they had in the twentieth century. By measuring how their usage propagated across the years, we show that for the past two centuries, the process has been governed by linear preferential attachment. Along with the steady growth of the English lexicon, this provides an empirical explanation for the ubiquity of Zipf's law in language statistics and confirms that writing, although undoubtedly an expression of art and skill, is not immune to the same influences of self-organization that are known to regulate processes as diverse as the making of new friends and World Wide Web growth.
NASA Astrophysics Data System (ADS)
Lin, W.; Ren, P.; Zheng, H.; Liu, X.; Huang, M.; Wada, R.; Qu, G.
2018-05-01
The experimental measures of the multiplicity derivatives—the moment parameters, the bimodal parameter, the fluctuation of maximum fragment charge number (normalized variance of Zmax, or NVZ), the Fisher exponent (τ), and the Zipf law parameter (ξ)—are examined to search for the liquid-gas phase transition in nuclear multifragmentation processes within the framework of the statistical multifragmentation model (SMM). The sensitivities of these measures are studied. All these measures predict a critical signature at or near the critical point both for the primary and secondary fragments. Among these measures, the total multiplicity derivative and the NVZ provide accurate measures for the critical point from the final cold fragments as well as the primary fragments. The present study will provide a guide for future experiments and analyses in the study of the nuclear liquid-gas phase transition.
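Of the measures listed, the NVZ is the simplest to state: the variance of the largest fragment charge across events, normalized by its mean. A sketch with illustrative event values, not SMM output:

```python
def nvz(zmax_events):
    """Normalized variance of Zmax: var(Zmax) / mean(Zmax), where
    zmax_events holds the largest fragment charge of each event."""
    n = len(zmax_events)
    mean = sum(zmax_events) / n
    var = sum((z - mean) ** 2 for z in zmax_events) / n
    return var / mean
```

A peak of this quantity as a function of excitation energy is the critical signature the study refers to: fluctuations of the largest fragment are maximal near the transition.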
Rêgo, Hênio Henrique Aragão; Braunstein, Lidia A.; D'Agostino, Gregorio; Stanley, H. Eugene; Miyazima, Sasuke
2014-01-01
In linguistic studies, the academic level of the vocabulary in a text can be described in terms of statistical physics by using a “temperature” concept related to the text's word-frequency distribution. We propose a “comparative thermo-linguistic” technique to analyze the vocabulary of a text to determine its academic level and its target readership in any given language. We apply this technique to a large number of books by several authors and examine how the vocabulary of a text changes when it is translated from one language to another. Unlike the uniform results produced using the Zipf law, using our “word energy” distribution technique we find variations in the power-law behavior. We also examine some common features that span across languages and identify some intriguing questions concerning how to determine when a text is suitable for its intended readership. PMID:25353343
Power-law regularities in human language
NASA Astrophysics Data System (ADS)
Mehri, Ali; Lashkari, Sahar Mohammadpour
2016-11-01
Complex structure of human language enables us to exchange very complicated information. This communication system obeys some common nonlinear statistical regularities. We investigate four important long-range features of human language, performing our calculations for adopted works of seven famous litterateurs. Zipf's law and Heaps' law, which imply well-known power-law behaviors, hold in human language and show a qualitative inverse relation with each other. Furthermore, the informational content associated with word ordering is measured using an entropic metric. We also calculate the fractal dimension of words in the text using the box-counting method. The fractal dimension of each word, a positive value less than or equal to one, characterizes its spatial distribution in the text. Generally, we can claim that human language follows the mentioned power-law regularities. Power-law relations imply the existence of long-range correlations between the word types used to convey a particular idea.
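Both laws named above are straightforward to extract from a tokenized text: Zipf's law relates frequency to rank, while Heaps' law tracks vocabulary growth against text length. A minimal sketch:

```python
from collections import Counter

def zipf_ranks(words):
    """Rank-frequency pairs (rank, frequency), most frequent word first."""
    freqs = sorted(Counter(words).values(), reverse=True)
    return list(enumerate(freqs, start=1))

def heaps_curve(words):
    """Vocabulary size V(n) after the first n tokens (Heaps' law data)."""
    seen, curve = set(), []
    for w in words:
        seen.add(w)
        curve.append(len(seen))
    return curve
```

Fitting a line to each output on log-log axes gives the Zipf exponent and the Heaps exponent respectively; the qualitative inverse relation the authors report is between those two fitted slopes.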
Healthcare4VideoStorm: Making Smart Decisions Based on Storm Metrics.
Zhang, Weishan; Duan, Pengcheng; Chen, Xiufeng; Lu, Qinghua
2016-04-23
Storm-based stream processing is widely used for real-time large-scale distributed processing. Knowing the run-time status and ensuring performance is critical to providing expected dependability for some applications, e.g., continuous video processing for security surveillance. Existing scheduling strategies are too coarse-grained to achieve good performance, and they consider only network resources, neglecting computing resources, while scheduling. In this paper, we propose Healthcare4Storm, a framework that finds Storm insights based on Storm metrics to gain knowledge from the health status of an application, finally ending up with smart scheduling decisions. It takes into account both network and computing resources and conducts scheduling at a fine-grained level using tuples instead of topologies. The comprehensive evaluation shows that the proposed framework has good performance and can improve the dependability of Storm-based applications.
Isolation of expressed sequences from the region commonly deleted in Velo-cardio-facial syndrome
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sirotkin, H.; Morrow, B.; DasGupta, R.
Velo-cardio-facial syndrome (VCFS) is a relatively common autosomal dominant genetic disorder characterized by cleft palate, cardiac abnormalities, learning disabilities and a characteristic facial dysmorphology. Most VCFS patients have interstitial deletions of 22q11 of 1-2 Mb. In an effort to isolate the gene(s) responsible for VCFS we have utilized a hybrid selection protocol to recover expressed sequences from three non-overlapping YACs comprising almost 1 Mb of the commonly deleted region. Total yeast genomic DNA or isolated YAC DNA was immobilized on Hybond-N filters, blocked with yeast and human ribosomal and human repetitive sequences and hybridized with a mixture of random primed short fragment cDNA libraries. Six human short fragment libraries derived from total fetus, fetal brain, adult brain, testes, thymus and spleen have been used for the selections. Short fragment cDNAs retained on the filter were passed through a second round of selection and cloned into lambda gt10. cDNAs shown to originate from the YACs and from chromosome 22 are being used to isolate full length cDNAs. Three genes known to be present on these YACs, catechol-O-methyltransferase, tuple 1 and clathrin heavy chain, have been recovered. Additionally, a gene related to the murine p120 gene and a number of novel short cDNAs have been isolated. The role of these genes in VCFS is being investigated.
Adaptive Bloom Filter: A Space-Efficient Counting Algorithm for Unpredictable Network Traffic
NASA Astrophysics Data System (ADS)
Matsumoto, Yoshihide; Hazeyama, Hiroaki; Kadobayashi, Youki
The Bloom Filter (BF), a space- and time-efficient hash-coding method, is used as one of the fundamental modules in several network processing algorithms and applications such as route lookups, cache hits, packet classification, per-flow state management or network monitoring. BF is a simple space-efficient randomized data structure used to represent a data set in order to support membership queries. However, BF generates false positives, and cannot count the number of distinct elements. A counting Bloom Filter (CBF) can count the number of distinct elements, but CBF needs more space than BF. We propose an alternative data structure to CBF, which we call an Adaptive Bloom Filter (ABF). Although ABF uses the same-sized bit vector used in BF, the number of hash functions employed by ABF is dynamically changed to record the number of appearances of each key element. Considering hash collisions, the multiplicity of each key element in ABF can be estimated from the number of hash functions used to decode its membership. Although ABF can realize the same functionality as CBF, ABF requires the same memory size as BF. We describe the construction of ABF and IABF (Improved ABF), and provide a mathematical analysis and simulation using Zipf's distribution. Finally, we show that ABF can be used for an unpredictable data set such as real network traffic.
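The plain BF that ABF builds on can be sketched in a few lines: k hash positions are set per inserted item, and a query reports membership only if all k of its positions are set, which allows false positives but never false negatives. A minimal version (parameters m and k are illustrative; ABF's trick of varying the number of hash functions per key is not shown):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k derived hashes over an m-bit vector."""
    def __init__(self, m=1024, k=4):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, item):
        # Derive k positions by salting one cryptographic hash.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def __contains__(self, item):
        return all(self.bits >> p & 1 for p in self._positions(item))
```

A CBF replaces each bit with a small counter so deletions and multiplicity queries become possible, which is exactly the space overhead ABF avoids by encoding multiplicity in the number of hash functions instead.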
Parameterized Complexity of k-Anonymity: Hardness and Tractability
NASA Astrophysics Data System (ADS)
Bonizzoni, Paola; Della Vedova, Gianluca; Dondi, Riccardo; Pirola, Yuri
The problem of publishing personal data without giving up privacy is becoming increasingly important. A precise formalization that has been recently proposed is the k-anonymity, where the rows of a table are partitioned into clusters of size at least k and all rows in a cluster become the same tuple after the suppression of some entries. The natural optimization problem, where the goal is to minimize the number of suppressed entries, is hard even when the stored values are over a binary alphabet or the table consists of a bounded number of columns. In this paper we study how the complexity of the problem is influenced by different parameters. First we show that the problem is W[1]-hard when parameterized by the value of the solution (and k). Then we exhibit a fixed-parameter algorithm when the problem is parameterized by the number of columns and the number of different values in any column.
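The k-anonymity condition and the suppression cost being minimized can both be checked directly; a minimal sketch using "*" for suppressed entries (the rows are illustrative, not from the paper):

```python
from collections import Counter

def is_k_anonymous(rows, k):
    """True if every row (after suppression, '*' marking suppressed
    entries) occurs at least k times in the table."""
    return all(c >= k for c in Counter(map(tuple, rows)).values())

def suppression_cost(rows):
    """Number of suppressed entries: the objective being minimized."""
    return sum(row.count("*") for row in rows)
```

The hardness results concern choosing which entries to suppress so that `is_k_anonymous` holds while `suppression_cost` is minimal; the checks themselves are trivial, the optimization is not.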
Non-unique key B-Tree implementation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ries, D.R.
1980-12-23
The B-Trees are an indexed method to allow fast retrieval and order-preserving updates to a FRAMIS relation based on a designated set of keys in the relation. A B-Tree access method is being implemented to provide indexed and sequential (in index order) access to FRAMIS relations. The implementation modifies the basic B-Tree structure to correctly allow multiple key values and still maintain the balanced page fill property of B-Trees. The data structures of the B-Tree are presented first, including the FRAMIS solution to the duplicate key value problem. Then the access level routines and utilities are presented. These routines include the original B-Tree creation; searching the B-Tree; and inserting, deleting, and replacing tuples on the B-Tree. In conclusion, the uses of the B-Tree access structures at the semantic level to enhance the FRAMIS performance are discussed. 10 figures.
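A common solution to the duplicate-key problem described here is to make each index entry unique by appending the tuple identifier to the key, which preserves sort order while allowing repeated user keys. A minimal sketch of that idea over a sorted list with Python's bisect (illustrative only; the abstract does not give the page-based FRAMIS internals):

```python
import bisect

class NonUniqueIndex:
    """Ordered index with duplicate user keys: each entry is the pair
    (key, tuple id), so entries are unique and stay sorted by key."""
    def __init__(self):
        self.entries = []  # sorted list of (key, tid)

    def insert(self, key, tid):
        bisect.insort(self.entries, (key, tid))

    def delete(self, key, tid):
        i = bisect.bisect_left(self.entries, (key, tid))
        if i < len(self.entries) and self.entries[i] == (key, tid):
            del self.entries[i]

    def lookup(self, key):
        """All tuple ids stored under key, in insertion-order of tid."""
        lo = bisect.bisect_left(self.entries, (key,))
        hi = bisect.bisect_right(self.entries, (key, float("inf")))
        return [tid for _, tid in self.entries[lo:hi]]
```

In an actual B-Tree the same composite key keeps duplicates contiguous on pages, so range scans and the balanced page-fill property are unaffected.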
Computer systems and methods for the query and visualization of multidimensional databases
Stolte, Chris; Tang, Diane L.; Hanrahan, Patrick
2017-04-25
A method of generating a data visualization is performed at a computer having a display, one or more processors, and memory. The memory stores one or more programs for execution by the one or more processors. The process receives user specification of a plurality of characteristics of a data visualization. The data visualization is based on data from a multidimensional database. The characteristics specify at least x-position and y-position of data marks corresponding to tuples of data retrieved from the database. The process generates a data visualization according to the specified plurality of characteristics. The data visualization has an x-axis defined based on data for one or more first fields from the database that specify x-position of the data marks and the data visualization has a y-axis defined based on data for one or more second fields from the database that specify y-position of the data marks.
Data Auditor: Analyzing Data Quality Using Pattern Tableaux
NASA Astrophysics Data System (ADS)
Srivastava, Divesh
Monitoring databases maintain configuration and measurement tables about computer systems, such as networks and computing clusters, and serve important business functions, such as troubleshooting customer problems, analyzing equipment failures, planning system upgrades, etc. These databases are prone to many data quality issues: configuration tables may be incorrect due to data entry errors, while measurement tables may be affected by incorrect, missing, duplicate and delayed polls. We describe Data Auditor, a tool for analyzing data quality and exploring data semantics of monitoring databases. Given a user-supplied constraint, such as a boolean predicate expected to be satisfied by every tuple, a functional dependency, or an inclusion dependency, Data Auditor computes "pattern tableaux", which are concise summaries of subsets of the data that satisfy or fail the constraint. We discuss the architecture of Data Auditor, including the supported types of constraints and the tableau generation mechanism. We also show the utility of our approach on an operational network monitoring database.
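Given a constraint, a pattern tableau summarizes which subsets of the data satisfy or fail it; the basic building block is the confidence of a pattern, i.e. the fraction of tuples matching the pattern that satisfy the constraint. A minimal sketch with an illustrative table and predicate (not Data Auditor's actual interface):

```python
def pattern_confidence(rows, pattern, predicate):
    """Fraction of tuples matching a pattern ('*' = wildcard) that
    satisfy the constraint; None if no tuple matches the pattern."""
    matches = [r for r in rows
               if all(p == "*" or p == v for p, v in zip(pattern, r))]
    if not matches:
        return None
    return sum(1 for r in matches if predicate(r)) / len(matches)
```

Tableau generation then searches for a small set of high-confidence (or low-confidence) patterns covering the data, which is what makes the summaries concise.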
A data fusion approach to indications and warnings of terrorist attacks
NASA Astrophysics Data System (ADS)
McDaniel, David; Schaefer, Gregory
2014-05-01
Indications and Warning (I&W) of terrorist attacks, particularly IED attacks, require detection of networks of agents and patterns of behavior. Social Network Analysis tries to detect a network; activity analysis tries to detect anomalous activities. This work builds on both to detect elements of an activity model of terrorist attack activity - the agents, resources, networks, and behaviors. The activity model is expressed as RDF triple statements where the tuple positions are elements or subsets of a formal ontology for activity models. The advantage of a model is that elements are interdependent and evidence for or against one will influence others so that there is a multiplier effect. The advantage of the formality is that detection could occur hierarchically, that is, at different levels of abstraction. The model matching is expressed as a likelihood ratio between input text and the model triples. The likelihood ratio is designed to be analogous to track correlation likelihood ratios common in JDL fusion level 1. This required development of a semantic distance metric for positive and null hypotheses as well as for complex objects. The metric uses the Web 1T (one-terabyte) database of one- to five-gram frequencies for priors. This size requires the use of big data technologies, so a Hadoop cluster is used in conjunction with OpenNLP natural language and Mahout clustering software. Distributed data-fusion MapReduce jobs distribute parts of the data fusion problem to the Hadoop nodes. For the purposes of this initial testing, open source models and text inputs of similar complexity to terrorist events were used as surrogates for the intended counter-terrorist application.
Li, Hongtao; Guo, Feng; Zhang, Wenyin; Wang, Jie; Xing, Jinsheng
2018-02-14
The wide use of IoT technologies in healthcare services has raised the medical intelligence level of those services. However, it also brings potential privacy threats to data collection. In healthcare service systems, health and medical data containing privacy information are often transmitted over networks, and such privacy information should be protected. Therefore, there is a need for a privacy-preserving data collection (PPDC) scheme to protect clients' (patients') data. We adopt the (a,k)-anonymity model as the privacy protection scheme for data collection, and propose a novel anonymity-based PPDC method for healthcare services in this paper. The threat model is analyzed in the client-server-to-user (CS2U) model. On the client side, we utilize the (a,k)-anonymity notion to generate anonymous tuples which can resist possible attacks, and adopt a bottom-up clustering method to create clusters that satisfy a base privacy level of (a1,k1)-anonymity. On the server side, we reduce the communication cost through generalization technology, and compress (a1,k1)-anonymous data through a UPGMA-based cluster combination method to make the data meet the deeper privacy level of (a2,k2)-anonymity (a1 ≥ a2, k2 ≥ k1). Theoretical analysis and experimental results prove that our scheme is effective in privacy preservation and data quality.
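The (a,k)-anonymity notion used on the client side combines a minimum group size k with a cap a on the within-group frequency of any sensitive value. A minimal check, with illustrative rows and column indices:

```python
from collections import defaultdict

def satisfies_ak_anonymity(rows, qi_idx, sens_idx, a, k):
    """(a,k)-anonymity: every quasi-identifier group has at least k rows,
    and no sensitive value exceeds fraction a within any group."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[i] for i in qi_idx)].append(row[sens_idx])
    for vals in groups.values():
        if len(vals) < k:
            return False
        if max(vals.count(v) for v in set(vals)) / len(vals) > a:
            return False
    return True
```

The clustering and generalization steps in the scheme exist to make this predicate hold, first at the base level (a1, k1) on clients and then at the stricter level (a2, k2) on the server.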
Innovation and nested preferential growth in chess playing behavior
NASA Astrophysics Data System (ADS)
Perotti, J. I.; Jo, H.-H.; Schaigorodsky, A. L.; Billoni, O. V.
2013-11-01
Complexity develops via the incorporation of innovative properties. Chess is one of the most complex strategy games, where expert contenders exercise decision making by imitating old games or introducing innovations. In this work, we study innovation in chess by analyzing how different move sequences are played at the population level. It is found that the probability of exploring a new or innovative move decreases as a power law with the frequency of the preceding move sequence. Chess players also exploit already known move sequences according to their frequencies, following a preferential growth mechanism. Furthermore, innovation in chess exhibits Heaps' law, suggesting similarities with the process of vocabulary growth. We propose a robust generative mechanism based on nested Yule-Simon preferential growth processes that reproduces the empirical observations. These results, supporting the self-similar nature of innovations in chess, are important in the context of decision making in a competitive scenario, and extend the scope of relevant findings recently discovered regarding the emergence of Zipf's law in chess.
Text mixing shapes the anatomy of rank-frequency distributions
NASA Astrophysics Data System (ADS)
Williams, Jake Ryland; Bagrow, James P.; Danforth, Christopher M.; Dodds, Peter Sheridan
2015-05-01
Natural languages are full of rules and exceptions. One of the most famous quantitative rules is Zipf's law, which states that the frequency of occurrence of a word is approximately inversely proportional to its rank. Though this "law" of ranks has been found to hold across disparate texts and forms of data, analyses of increasingly large corpora since the late 1990s have revealed the existence of two scaling regimes. These regimes have thus far been explained by a hypothesis suggesting a separability of languages into core and noncore lexica. Here we present and defend an alternative hypothesis that the two scaling regimes result from the act of aggregating texts. We observe that text mixing leads to an effective decay of word introduction, which we show provides accurate predictions of the location and severity of breaks in scaling. Upon examining large corpora from 10 languages in the Project Gutenberg eBooks collection, we find emphatic empirical support for the universality of our claim.
Coarse-Grained Models for Automated Fragmentation and Parametrization of Molecular Databases.
Fraaije, Johannes G E M; van Male, Jan; Becherer, Paul; Serral Gracià, Rubèn
2016-12-27
We calibrate coarse-grained interaction potentials suitable for screening large data sets in top-down fashion. Three new algorithms are introduced: (i) automated decomposition of molecules into coarse-grained units (fragmentation); (ii) Coarse-Grained Reference Interaction Site Model-Hypernetted Chain (CG RISM-HNC) as an intermediate proxy for dissipative particle dynamics (DPD); and (iii) a simple top-down coarse-grained interaction potential/model based on activity coefficient theories from engineering (using COSMO-RS). We find that the fragment distribution follows Zipf and Heaps scaling laws. The accuracy in Gibbs energy of mixing calculations is a few tenths of a kilocalorie per mole. As a final proof of principle, we use full coarse-grained sampling through DPD thermodynamics integration to calculate log P_OW for 4627 compounds with an average error of 0.84 log unit. The computational speeds per calculation are a few seconds for CG RISM-HNC and a few minutes for DPD thermodynamic integration.
Zipf’s word frequency law in natural language: A critical review and future directions
2014-01-01
The frequency distribution of words has been a key object of study in statistical linguistics for the past 70 years. This distribution approximately follows a simple mathematical form known as Zipf's law. This article first shows that human language has a highly complex, reliable structure in the frequency distribution over and above this classic law, although prior data visualization methods have obscured this fact. A number of empirical phenomena related to word frequencies are then reviewed. These facts are chosen to be informative about the mechanisms giving rise to Zipf's law and are then used to evaluate many of the theoretical explanations of Zipf's law in language. No prior account straightforwardly explains all the basic facts or is supported with independent evaluation of its underlying assumptions. To make progress at understanding why language obeys Zipf's law, studies must seek evidence beyond the law itself, testing assumptions and evaluating novel predictions with new, independent data. PMID:24664880
NASA Astrophysics Data System (ADS)
Buldyrev, S. V.; Pammolli, F.; Riccaboni, M.; Yamasaki, K.; Fu, D.-F.; Matia, K.; Stanley, H. E.
2007-05-01
We present a preferential attachment growth model to obtain the distribution P(K) of number of units K in the classes which may represent business firms or other socio-economic entities. We found that P(K) is described in its central part by a power law with an exponent ϕ = 2+b/(1-b) which depends on the probability of entry of new classes, b. In a particular problem of city population this distribution is equivalent to the well known Zipf law. In the absence of the new classes entry, the distribution P(K) is exponential. Using analytical form of P(K) and assuming proportional growth for units, we derive P(g), the distribution of business firm growth rates. The model predicts that P(g) has a Laplacian cusp in the central part and asymptotic power-law tails with an exponent ζ = 3. We test the analytical expressions derived using heuristic arguments by simulations. The model might also explain the size-variance relationship of the firm growth rates.
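The growth rule described in the abstract can be sketched with a toy simulation (parameter values and implementation details here are assumptions, not the authors' code): with probability b a new class enters with one unit; otherwise a new unit joins an existing class with probability proportional to its current size:

```python
import random

def simulate_classes(n_units, b, seed=0):
    """Toy preferential-attachment growth of classes (illustrative).

    Each step adds one unit: with probability b it founds a new class;
    otherwise it joins an existing class chosen with probability
    proportional to class size.
    """
    rng = random.Random(seed)
    sizes = [1]   # one class holding one unit to start
    units = [0]   # one entry per unit, recording its class index
    for _ in range(n_units - 1):
        if rng.random() < b:
            sizes.append(1)
            units.append(len(sizes) - 1)
        else:
            # picking a uniformly random *unit* selects its class
            # with probability proportional to class size
            k = units[rng.randrange(len(units))]
            sizes[k] += 1
            units.append(k)
    return sizes

sizes = simulate_classes(10_000, b=0.1)
```

With b > 0 the tail of the size distribution approaches the power law with exponent 2 + b/(1 - b) quoted in the abstract; with b = 0 it degenerates to the exponential case.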
NASA Astrophysics Data System (ADS)
Gershenson, Carlos
Studies of rank distributions have been popular for decades, especially since the work of Zipf. For example, if we rank words of a given language by use frequency (most used word in English is 'the', rank 1; second most common word is 'of', rank 2), the distribution can be approximated roughly with a power law. The same applies for cities (most populated city in a country ranks first), earthquakes, metabolism, the Internet, and dozens of other phenomena. We recently proposed ``rank diversity'' to measure how ranks change in time, using the Google Books Ngram dataset. Studying six languages between 1800 and 2009, we found that the rank diversity curves of languages are universal, well fitted by a sigmoid on a log-normal scale. We are studying several other datasets (sports, economies, social systems, urban systems, earthquakes, artificial life). Rank diversity seems to be universal, independently of the shape of the rank distribution. I will present our work in progress towards a general description of the features of rank change in time, along with simple models which reproduce it.
Exploring empirical rank-frequency distributions longitudinally through a simple stochastic process.
Finley, Benjamin J; Kilkki, Kalevi
2014-01-01
The frequent appearance of empirical rank-frequency laws, such as Zipf's law, in a wide range of domains reinforces the importance of understanding and modeling these laws and rank-frequency distributions in general. In this spirit, we utilize a simple stochastic cascade process to simulate several empirical rank-frequency distributions longitudinally. We focus especially on limiting the process's complexity to increase accessibility for non-experts in mathematics. The process provides a good fit for many empirical distributions because the stochastic multiplicative nature of the process leads to an often observed concave rank-frequency distribution (on a log-log scale) and the finiteness of the cascade replicates real-world finite size effects. Furthermore, we show that repeated trials of the process can roughly simulate the longitudinal variation of empirical ranks. However, we find that the empirical variation is often less than the average simulated process variation, likely due to longitudinal dependencies in the empirical datasets. Finally, we discuss the process limitations and practical applications.
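A minimal multiplicative cascade of the kind the abstract alludes to can be written in a few lines (the binary split and the uniform split fraction are illustrative assumptions, not the paper's exact process):

```python
import random

def cascade(levels, seed=0):
    """Minimal binary multiplicative cascade (illustrative sketch).

    A unit mass is repeatedly split in two with a random fraction;
    after a finite number of levels the sorted leaf masses give a
    concave rank-frequency curve on a log-log scale, and the finite
    depth mimics real-world finite-size effects.
    """
    rng = random.Random(seed)
    masses = [1.0]
    for _ in range(levels):
        nxt = []
        for m in masses:
            w = rng.uniform(0.1, 0.9)   # random split fraction
            nxt.extend((m * w, m * (1.0 - w)))
        masses = nxt
    return sorted(masses, reverse=True)

ranked = cascade(10)   # 2**10 = 1024 leaf "frequencies", largest first
```

Repeating the cascade with different seeds gives the longitudinal variation in ranks that the paper compares against empirical datasets.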
Text mixing shapes the anatomy of rank-frequency distributions.
Williams, Jake Ryland; Bagrow, James P; Danforth, Christopher M; Dodds, Peter Sheridan
2015-05-01
Natural languages are full of rules and exceptions. One of the most famous quantitative rules is Zipf's law, which states that the frequency of occurrence of a word is approximately inversely proportional to its rank. Though this "law" of ranks has been found to hold across disparate texts and forms of data, analyses of increasingly large corpora since the late 1990s have revealed the existence of two scaling regimes. These regimes have thus far been explained by a hypothesis suggesting a separability of languages into core and noncore lexica. Here we present and defend an alternative hypothesis that the two scaling regimes result from the act of aggregating texts. We observe that text mixing leads to an effective decay of word introduction, which we show provides accurate predictions of the location and severity of breaks in scaling. Upon examining large corpora from 10 languages in the Project Gutenberg eBooks collection, we find emphatic empirical support for the universality of our claim.
Size distribution of Portuguese firms between 2006 and 2012
NASA Astrophysics Data System (ADS)
Pascoal, Rui; Augusto, Mário; Monteiro, A. M.
2016-09-01
This study aims to describe the size distribution of Portuguese firms, as measured by annual sales and total assets, between 2006 and 2012, giving an economic interpretation for the evolution of the distribution over time. Three distributions are fitted to the data: the lognormal, the Pareto (and, as a particular case, Zipf) and the Simplified Canonical Law (SCL). We present the main arguments found in the literature to justify the use of these distributions and emphasize the interpretation of the SCL coefficients. Methods of estimation include Maximum Likelihood, modified Ordinary Least Squares in log-log scale and Nonlinear Least Squares using the Levenberg-Marquardt algorithm. When applying these approaches to data on Portuguese firms, we analyze whether the evolution of the estimated parameters in both the lognormal and SCL cases is in accordance with the known existence of a recession period after 2008. This is confirmed for sales but not for assets, leading to the conclusion that the former variable is a better proxy for firm size.
Zipf's Law in Short-Time Timbral Codings of Speech, Music, and Environmental Sound Signals
Haro, Martín; Serrà, Joan; Herrera, Perfecto; Corral, Álvaro
2012-01-01
Timbre is a key perceptual feature that allows discrimination between different sounds. Timbral sensations are highly dependent on the temporal evolution of the power spectrum of an audio signal. In order to quantitatively characterize such sensations, the shape of the power spectrum has to be encoded in a way that preserves certain physical and perceptual properties. Therefore, it is common practice to encode short-time power spectra using psychoacoustical frequency scales. In this paper, we study and characterize the statistical properties of such encodings, here called timbral code-words. In particular, we report on rank-frequency distributions of timbral code-words extracted from 740 hours of audio coming from disparate sources such as speech, music, and environmental sounds. Analogously to text corpora, we find a heavy-tailed Zipfian distribution with exponent close to one. Importantly, this distribution is found independently of different encoding decisions and regardless of the audio source. Further analysis on the intrinsic characteristics of most and least frequent code-words reveals that the most frequent code-words tend to have a more homogeneous structure. We also find that speech and music databases have specific, distinctive code-words while, in the case of the environmental sounds, these database-specific code-words are not present. Finally, we find that a Yule-Simon process with memory provides a reasonable quantitative approximation for our data, suggesting the existence of a common simple generative mechanism for all considered sound sources. PMID:22479497
Bell, Michael J; Gillespie, Colin S; Swan, Daniel; Lord, Phillip
2012-09-15
Annotations are a key feature of many biological databases, used to convey our knowledge of a sequence to the reader. Ideally, annotations are curated manually; however, manual curation is costly, time consuming and requires expert knowledge and training. Given these issues and the exponential increase of data, many databases implement automated annotation pipelines in an attempt to avoid un-annotated entries. Both manual and automated annotations vary in quality between databases and annotators, making assessment of annotation reliability problematic for users. The community lacks a generic measure for determining annotation quality and correctness, which we address within this article. Specifically, we investigate word reuse within bulk textual annotations and relate this to Zipf's Principle of Least Effort. We use the UniProt Knowledgebase (UniProtKB) as a case study to demonstrate this approach since it allows us to compare annotation change, both over time and between automated and manually curated annotations. By applying power-law distributions to word reuse in annotation, we show clear trends in UniProtKB over time, which are consistent with existing studies of quality on free-text English. Further, we show a clear distinction between manual and automated analysis and investigate cohorts of protein records as they mature. These results suggest that this approach holds distinct promise as a mechanism for judging annotation quality. Source code is available at the authors' website: http://homepages.cs.ncl.ac.uk/m.j.bell1/annotation. phillip.lord@newcastle.ac.uk.
Spatio-Temporal Evolution and Scaling Properties of Human Settlements (Invited)
NASA Astrophysics Data System (ADS)
Small, C.; Milesi, C.; Elvidge, C.; Baugh, K.; Henebry, G. M.; Nghiem, S. V.
2013-12-01
Growth and evolution of cities and smaller settlements is usually studied in the context of population and other socioeconomic variables. While this is logical in the sense that settlements are groups of humans engaged in socioeconomic processes, our means of collecting information about spatio-temporal distributions of population and socioeconomic variables often lack the spatial and temporal resolution to represent the processes at scales which they are known to occur. Furthermore, metrics and definitions often vary with country and through time. However, remote sensing provides globally consistent, synoptic observations of several proxies for human settlement at spatial and temporal resolutions sufficient to represent the evolution of settlements over the past 40 years. We use several independent but complementary proxies for anthropogenic land cover to quantify spatio-temporal (ST) evolution and scaling properties of human settlements globally. In this study we begin by comparing land cover and night lights in 8 diverse settings - each spanning gradients of population density and degree of land surface modification. Stable anthropogenic night light is derived from multi-temporal composites of emitted luminance measured by the VIIRS and DMSP-OLS sensors. Land cover is represented as mixtures of sub-pixel fractions of rock, soil and impervious Substrates, Vegetation and Dark surfaces (shadow, water and absorptive materials) estimated from Landsat imagery with > 94% accuracy. Multi-season stability and variability of land cover fractions effectively distinguishes between spectrally similar land covers that corrupt thematic classifications based on single images. We find that temporal stability of impervious substrates combined with persistent shadow cast between buildings results in temporally stable aggregate reflectance across seasons at the 30 m scale of a Landsat pixel. 
Comparison of night light brightness with land cover composition, stability and variability yields several consistent relationships that persist across a variety of settlement types and physical environments. We use the multiple threshold method of Small et al (2011) to represent a continuum of settlement density by segmenting both night light brightness and multi-season land cover characteristics. Rank-size distributions of spatially contiguous segments quantify scaling and connectivity of land cover. Spatial and temporal evolution of rank-size distributions is consistent with power laws as suggested by Zipf's Law for city size based on population. However, unlike Zipf's Law, the observed distributions persist to global scales in which the larger agglomerations are much larger than individual cities. The scaling relations observed extend from the scale of cities and smaller settlements up to vast spatial networks of interconnected settlements.
Data analytics for simplifying thermal efficiency planning in cities
Abdolhosseini Qomi, Mohammad Javad; Noshadravan, Arash; Sobstyl, Jake M.; Toole, Jameson; Ferreira, Joseph; Pellenq, Roland J.-M.; Ulm, Franz-Josef; Gonzalez, Marta C.
2016-01-01
More than 44% of building energy consumption in the USA is used for space heating and cooling, and this accounts for 20% of national CO2 emissions. This prompts the need to identify among the 130 million households in the USA those with the greatest energy-saving potential and the associated costs of the path to reach that goal. Whereas current solutions address this problem by analysing each building in detail, we herein reduce the dimensionality of the problem by simplifying the calculations of energy losses in buildings. We present a novel inference method that can be used via a ranking algorithm that allows us to estimate the potential energy saving for heating purposes. To that end, we only need consumption from records of gas bills integrated with a building's footprint. The method entails a statistical screening of the intricate interplay between weather, infrastructural and residents' choice variables to determine building gas consumption and potential savings at a city scale. We derive a general statistical pattern of consumption in an urban settlement, reducing it to a set of the most influential buildings' parameters that operate locally. By way of example, the implications are explored using records of a set of (N = 6200) buildings in Cambridge, MA, USA, which indicate that retrofitting only 16% of buildings entails a 40% reduction in gas consumption of the whole building stock. We find that the inferred heat loss rate of buildings exhibits a power-law data distribution akin to Zipf's law, which provides a means to map an optimum path for gas savings per retrofit at a city scale. These findings have implications for improving the thermal efficiency of cities' building stock, as outlined by current policy efforts seeking to reduce home heating and cooling energy consumption and lower associated greenhouse gas emissions. PMID:27097652
iRSpot-EL: identify recombination spots with an ensemble learning approach.
Liu, Bin; Wang, Shanyi; Long, Ren; Chou, Kuo-Chen
2017-01-01
Coexisting in a DNA system, meiosis and recombination are two indispensable aspects of cell reproduction and growth. With the avalanche of genome sequences emerging in the post-genomic age, it is an urgent challenge to acquire information on DNA recombination spots because it can provide timely and useful insights into the mechanism of meiotic recombination and the process of genome evolution. To address such a challenge, we have developed a predictor, called iRSpot-EL, by fusing different modes of pseudo K-tuple nucleotide composition and the mode of dinucleotide-based auto-cross covariance into an ensemble classifier built with a clustering approach. Five-fold cross-validation tests on a widely used benchmark dataset have indicated that the new predictor remarkably outperforms its existing counterparts. Particularly, far beyond their reach, the new predictor can be easily used to conduct genome-wide analysis, and the results obtained are quite consistent with the experimental map. For the convenience of most experimental scientists, a user-friendly web-server for iRSpot-EL has been established at http://bioinformatics.hitsz.edu.cn/iRSpot-EL/, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved. bliu@gordonlifescience.org or bliu@insun.hit.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
A new approach to preserve privacy data mining based on fuzzy theory in numerical database
NASA Astrophysics Data System (ADS)
Cui, Run; Kim, Hyoung Joong
2014-01-01
With the rapid development of information technology, data mining has become one of the most important tools for discovering deep associations among tuples in large-scale databases. Protecting private information, especially during the data mining procedure, is therefore a major challenge. In this paper, a new privacy-protection method based on fuzzy theory is proposed. Traditional fuzzy approaches in this area apply fuzzification to the data without considering its readability. A new style of obscured data expression is introduced that provides more detail about the subsets without reducing readability. We also adopt a balanced approach between privacy level and utility when forming suitable subgroups. An experiment shows that this approach is suitable for classification without loss of accuracy. In the future, the approach can be adapted to data streams, given the low computational complexity of the fuzzy function, with suitable modification.
Tuple spaces in hardware for accelerated implicit routing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Zachary Kent; Tripp, Justin
2010-12-01
Organizing and optimizing data objects on networks with support for data migration and failing nodes is a complicated problem to handle as systems grow. The goal of this work is to demonstrate that high levels of speedup can be achieved by moving responsibility for finding, fetching, and staging data into an FPGA-based network card. We present a system for implicit routing of data via FPGA-based network cards. In this system, data structures are requested by name, and the network of FPGAs finds the data within the network and relays the structure to the requester. This is achieved through successive examination of hardware hash tables implemented in the FPGA. By avoiding software stacks between nodes, the data is quickly fetched entirely through FPGA-FPGA interaction. The performance of this system is orders of magnitude faster than software implementations due to the improved speed of the hash tables and lowered latency between the network nodes.
Scalable Regression Tree Learning on Hadoop using OpenPlanet
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yin, Wei; Simmhan, Yogesh; Prasanna, Viktor
As scientific and engineering domains attempt to effectively analyze the deluge of data arriving from sensors and instruments, machine learning is becoming a key data mining tool to build prediction models. The regression tree is a popular learning model that combines decision trees and linear regression to forecast numerical target variables based on a set of input features. MapReduce is well suited for addressing such data-intensive learning applications, and a proprietary regression tree algorithm, PLANET, using MapReduce has been proposed earlier. In this paper, we describe an open-source implementation of this algorithm, OpenPlanet, on the Hadoop framework using a hybrid approach. Further, we evaluate the performance of OpenPlanet using real-world datasets from the Smart Power Grid domain to perform energy use forecasting, and propose tuning strategies of Hadoop parameters to improve the performance of the default configuration by 75% for a training dataset of 17 million tuples on a 64-core Hadoop cluster on FutureGrid.
Working with HITRAN Database Using Hapi: HITRAN Application Programming Interface
NASA Astrophysics Data System (ADS)
Kochanov, Roman V.; Hill, Christian; Wcislo, Piotr; Gordon, Iouli E.; Rothman, Laurence S.; Wilzewski, Jonas
2015-06-01
A HITRAN Application Programming Interface (HAPI) has been developed to allow users on their local machines much more flexibility and power. HAPI is a programming interface for the main data-searching capabilities of the new "HITRANonline" web service (http://www.hitran.org). It provides the possibility to query spectroscopic data from the HITRAN database in a flexible manner using either functions or a query language. Some of the prominent current features of HAPI are: a) Downloading line-by-line data from the HITRANonline site to a local machine b) Filtering and processing the data in SQL-like fashion c) Conventional Python structures (lists, tuples, and dictionaries) for representing spectroscopic data d) Possibility to use a large set of third-party Python libraries to work with the data e) Python implementation of the HT lineshape which can be reduced to a number of conventional line profiles f) Python implementation of total internal partition sums (TIPS-2011) for spectra simulations g) High-resolution spectra calculation accounting for pressure, temperature and optical path length h) Providing instrumental functions to simulate experimental spectra i) Possibility to extend HAPI's functionality by custom line profiles, partition sums and instrumental functions Currently the API is a module written in Python and uses the NumPy library, providing fast array operations. The API is designed to deal with data in multiple formats such as ASCII, CSV, HDF5 and XSAMS. This work has been supported by NASA Aura Science Team Grant NNX14AI55G and NASA Planetary Atmospheres Grant NNX13AI59G. L.S. Rothman et al. JQSRT, Volume 130, 2013, Pages 4-50 N.H. Ngo et al. JQSRT, Volume 129, November 2013, Pages 89-100 A. L. Laraia et al. Icarus, Volume 215, Issue 1, September 2011, Pages 391-400
Multi-material decomposition of spectral CT images
NASA Astrophysics Data System (ADS)
Mendonça, Paulo R. S.; Bhotika, Rahul; Maddah, Mahnaz; Thomsen, Brian; Dutta, Sandeep; Licato, Paul E.; Joshi, Mukta C.
2010-04-01
Spectral Computed Tomography (Spectral CT), and in particular fast kVp switching dual-energy computed tomography, is an imaging modality that extends the capabilities of conventional computed tomography (CT). Spectral CT enables the estimation of the full linear attenuation curve of the imaged subject at each voxel in the CT volume, instead of a scalar image in Hounsfield units. Because the space of linear attenuation curves in the energy ranges of medical applications can be accurately described through a two-dimensional manifold, this decomposition procedure would be, in principle, limited to two materials. This paper describes an algorithm that overcomes this limitation, allowing for the estimation of N-tuples of material-decomposed images. The algorithm works by assuming that the mixing of substances and tissue types in the human body has the physicochemical properties of an ideal solution, which yields a model for the density of the imaged material mix. Under this model the mass attenuation curve of each voxel in the image can be estimated, immediately resulting in a material-decomposed image triplet. Decomposition into an arbitrary number of pre-selected materials can be achieved by automatically selecting adequate triplets from an application-specific material library. The decomposition is expressed in terms of the volume fractions of each constituent material in the mix; this provides for a straightforward, physically meaningful interpretation of the data. One important application of this technique is in the digital removal of contrast agent from a dual-energy exam, producing a virtual nonenhanced image, as well as in the quantification of the concentration of contrast observed in a targeted region, thus providing an accurate measure of tissue perfusion.
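The constrained three-material solve the abstract describes reduces, in the simplest setting, to a small linear system: two measured attenuation values plus the constraint that volume fractions sum to one. The sketch below is a simplified illustration of that idea, not the paper's algorithm; the basis material names and attenuation pairs are hypothetical numbers, not calibrated values:

```python
def material_triplet(mu_low, mu_high, basis):
    """Solve for volume fractions of a three-material mix (illustrative).

    `basis` maps material name -> (mu_low, mu_high) attenuation pair.
    Rows of the system: low-energy attenuation, high-energy attenuation,
    and volume conservation (fractions sum to 1), mirroring the ideal
    solution mixing model in the abstract.
    """
    names = list(basis)
    A = [
        [basis[n][0] for n in names],
        [basis[n][1] for n in names],
        [1.0, 1.0, 1.0],
    ]
    b = [mu_low, mu_high, 1.0]
    # Gaussian elimination with partial pivoting on the 3x3 system
    for i in range(3):
        piv = max(range(i, 3), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        b[i], b[piv] = b[piv], b[i]
        for r in range(i + 1, 3):
            f = A[r][i] / A[i][i]
            for c in range(i, 3):
                A[r][c] -= f * A[i][c]
            b[r] -= f * b[i]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):   # back substitution
        x[i] = (b[i] - sum(A[i][c] * x[c] for c in range(i + 1, 3))) / A[i][i]
    return dict(zip(names, x))

# hypothetical attenuation pairs (low kVp, high kVp) per unit volume
basis = {"water": (0.20, 0.18), "iodine": (2.00, 0.90), "air": (0.0, 0.0)}
fracs = material_triplet(0.52, 0.288, basis)
```

Expressing the result as volume fractions is what gives the decomposition its direct physical reading, e.g. the iodine fraction quantifies contrast-agent concentration in a voxel.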
NASA Astrophysics Data System (ADS)
Nicolis, John S.; Katsikas, Anastassis A.
Collective parameters such as the Zipf's law-like statistics, the Transinformation, the Block Entropy and the Markovian character are compared for natural, genetic, musical and artificially generated long texts from generating partitions (alphabets) on homogeneous as well as on multifractal chaotic maps. It appears that minimal requirements for a language at the syntactical level such as memory, selectivity of few keywords and broken symmetry in one dimension (polarity) are more or less met by dynamically iterating simple maps or flows, e.g. very simple chaotic hardware. The same selectivity is observed at the semantic level, where the aim refers to partitioning a set of environmental impinging stimuli onto coexisting attractors-categories. Under the regime of pattern recognition and classification, few key features of a pattern or few categories claim the lion's share of the information stored in this pattern and, practically, only these key features are persistently scanned by the cognitive processor. A multifractal attractor model can in principle explain this high selectivity, both at the syntactical and the semantic levels.
NASA Technical Reports Server (NTRS)
Zipf, E. C.
1986-01-01
The ratio of the cross sections for the direct and dissociative excitation of the OI(3s 3S0-2p 3P; 1304 A wavelength) transition, sigma A/sigma D, is accurately determined, and the sigma A/sigma D ratio is directly normalized to the ratio of the O(+) and O2(+) ionization cross sections using a high-density diffuse gas source, an electrostatically focused electron gun, a vacuum-ultraviolet monochromator, and a quadrupole mass spectrometer for simultaneous optical and composition measurements. Using revised sigma A(1304 A) values calculated with new calibration standards, the shape of the cross section for the excitation of the O(3s 3S0) state agrees well with previous results, though the absolute magnitude of sigma A(1304 A) is smaller than the results of Stone and Zipf (1974) by a factor of 2.8. The revised cross sections agree well with recent quantum calculations when cascade excitation of the 3s 3S0 state is taken into account.
Cache Scheme Based on Pre-Fetch Operation in ICN
Duan, Jie; Wang, Xiong; Xu, Shizhong; Liu, Yuanni; Xu, Chuan; Zhao, Guofeng
2016-01-01
Much recent research focuses on ICN (Information-Centric Networking), in which named content, rather than the end-host, becomes the first-class citizen. In ICN, named content can be further divided into many small chunks, and chunk-based communication has merits over content-based communication. The universal in-network cache is one of the fundamental infrastructures of ICN. In this work, a chunk-level cache mechanism based on a pre-fetch operation is proposed. The main idea is that routers with cache stores should pre-fetch and cache the next chunks that may be accessed in the near future, according to received requests and the cache policy, in order to reduce users' perceived latency. Two pre-fetch-driven modes are presented to answer when and how to pre-fetch. LRU (Least Recently Used) is employed for cache replacement. Simulation results show that the average user-perceived latency and hop count can be decreased by employing this pre-fetch-based cache mechanism. Furthermore, we also demonstrate that the results are influenced by many factors, such as the cache capacity, Zipf parameters and pre-fetch window size. PMID:27362478
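The chunk-level pre-fetch idea can be sketched as an LRU cache that, on every request for chunk i, also fetches the next chunks within a configurable window (the class name, window parameter and toy fetch function below are assumptions, not the paper's implementation):

```python
from collections import OrderedDict

class PrefetchCache:
    """Chunk-level LRU cache with sequential pre-fetch (illustrative).

    On a request for chunk (name, i) the cache serves it and also
    fetches chunks (name, i+1) .. (name, i+window) ahead of time, on
    the assumption that sequential chunks tend to be requested together.
    """
    def __init__(self, capacity, window=1):
        self.capacity = capacity
        self.window = window
        self.cache = OrderedDict()   # (name, chunk_index) -> data
        self.hits = self.misses = 0

    def _insert(self, key, data):
        self.cache[key] = data
        self.cache.move_to_end(key)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict least recently used

    def request(self, name, i, fetch):
        key = (name, i)
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)
            data = self.cache[key]
        else:
            self.misses += 1
            data = fetch(name, i)
            self._insert(key, data)
        # pre-fetch the next `window` chunks not already cached
        for j in range(i + 1, i + 1 + self.window):
            if (name, j) not in self.cache:
                self._insert((name, j), fetch(name, j))
        return data

cache = PrefetchCache(capacity=8, window=1)
upstream = lambda name, i: f"{name}#{i}"   # stand-in for an upstream fetch
cache.request("video", 0, upstream)   # miss; chunk 1 is pre-fetched
cache.request("video", 1, upstream)   # hit, thanks to the pre-fetch
```

The trade-off the abstract measures falls out directly: a larger window lowers perceived latency for sequential reads but spends cache capacity and upstream bandwidth on chunks that may never be requested.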
Li, Zhendong; Liu, Wenjian
2010-08-14
The spin-adaptation of single-reference quantum chemical methods for excited states of open-shell systems has been nontrivial. The primary reason is that the configuration space, generated by a truncated rank of excitations from only one component of a reference multiplet, is spin-incomplete. Those "missing" configurations are of higher ranks and can, in principle, be recaptured by a particular class of excitation operators. However, the resulting formalisms are then quite involved and there are situations [e.g., time-dependent density functional theory (TD-DFT) under the adiabatic approximation] that prevent one from doing so. To solve this issue, we propose here a tensor-coupling scheme that invokes all the components of a reference multiplet (i.e., a tensor reference) rather than increases the excitation ranks. A minimal spin-adapted n-tuply excited configuration space can readily be constructed by tensor products between the n-tuple tensor excitation operators and the chosen tensor reference. Further combined with the tensor equation-of-motion formalism, very compact expressions for excitation energies can be obtained. As a first application of this general idea, a spin-adapted open-shell random phase approximation is first developed. The so-called "translation rule" is then adopted to formulate a spin-adapted, restricted open-shell Kohn-Sham (ROKS)-based TD-DFT (ROKS-TD-DFT). Here, a particular symmetry structure has to be imposed on the exchange-correlation kernel. While the standard ROKS-TD-DFT can access only excited states due to singlet-coupled single excitations, i.e., only some of the singly excited states of the same spin (S(i)) as the reference, the new scheme can capture all the excited states of spin S(i)-1, S(i), or S(i)+1 due to both singlet- and triplet-coupled single excitations. The actual implementation and computation are very much like the (spin-contaminated) unrestricted Kohn-Sham-based TD-DFT. 
It is also shown that spin-contaminated spin-flip configuration interaction approaches can easily be spin-adapted via the tensor-coupling scheme.
Scaling Linguistic Characterization of Precipitation Variability
NASA Astrophysics Data System (ADS)
Primo, C.; Gutierrez, J. M.
2003-04-01
Rainfall variability is influenced by changes in the aggregation of daily rainfall. This problem is of great importance for hydrological, agricultural and ecological applications. Rainfall averages, or accumulations, are widely used as standard climatic parameters. However, different aggregation schemes may lead to the same average or accumulated values. In this paper we present a fractal method to characterize different aggregation schemes. The method provides scaling exponents characterizing weekly or monthly rainfall patterns for a given station. To this aim, we establish an analogy with linguistic analysis, considering precipitation as a discrete variable (e.g., rain, no rain). Each weekly, or monthly, symbolic precipitation sequence of observed precipitation is then considered as a "word" (in this case, a binary word) which defines a specific weekly rainfall pattern. Thus, each site defines a "language" characterized by the words observed at that site during a period representative of the climatology. Then, the more variable the observed weekly precipitation sequences, the more complex the obtained language. To characterize these languages, we first applied Zipf's method, obtaining scaling histograms of rank-ordered frequencies. However, to obtain significant exponents, the scaling must be maintained over some orders of magnitude, requiring long sequences of daily precipitation which are not available at particular stations. Thus this analysis is not suitable for applications involving particular stations (such as regionalization). We then introduce an alternative fractal method applicable to data from local stations. The so-called Chaos-Game method uses Iterated Function Systems (IFS) to graphically represent rainfall languages, in a way that complex languages define complex graphical patterns. The box-counting dimension and the entropy of the resulting patterns are used as linguistic parameters to quantitatively characterize the complexity of the patterns.
We illustrate the high climatological discrimination power of the linguistic parameters in the Iberian peninsula, when compared with other standard techniques (such as seasonal mean accumulated precipitation). As an example, standard and linguistic parameters are used as inputs for a clustering regionalization method, comparing the resulting clusters.
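The Chaos-Game encoding of precipitation "words" can be illustrated with a one-dimensional IFS (a sketch of the idea, not the authors' exact construction): each dry/wet symbol applies one of two contractions, so every distinct weekly pattern lands in its own dyadic subinterval, and a site's "language" becomes a point set whose complexity can then be measured:

```python
def chaos_game_code(word):
    """Encode a binary rain/no-rain 'word' as a point in [0, 1).

    Each symbol applies one of two contractions of the unit interval,
    x -> x/2 (dry day, '0') or x -> x/2 + 1/2 (wet day, '1'), so
    distinct weekly patterns map to distinct dyadic subintervals.
    """
    x = 0.0
    for bit in word:          # word is a string like "0110100"
        x = x / 2 + (0.5 if bit == "1" else 0.0)
    return x

# three hypothetical weekly patterns: all dry, all wet, mixed
points = sorted({chaos_game_code(w) for w in ["0000000", "1111111", "0110100"]})
```

Over a long climatological record, the box-counting dimension of the resulting point set quantifies how varied the site's weekly patterns are, which is the linguistic parameter the paper feeds into regionalization.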
Detection of Cardiovascular Disease Risk's Level for Adults Using Naive Bayes Classifier.
Miranda, Eka; Irwansyah, Edy; Amelga, Alowisius Y; Maribondang, Marco M; Salim, Mulyadi
2016-07-01
The number of deaths caused by cardiovascular disease and stroke is predicted to reach 23.3 million in 2030. As a contribution to the prevention of this phenomenon, this paper proposes a mining model using a naïve Bayes classifier that can detect cardiovascular disease and identify its risk level for adults. The design process began by identifying the knowledge related to the cardiovascular disease profile and the level of cardiovascular disease risk factors for adults based on medical records, and then designing a mining technique model using a naïve Bayes classifier. Evaluation employed two methods: calculation of accuracy, sensitivity, and specificity, as well as an evaluation session with cardiologists and internists. The characteristics of cardiovascular disease are identified by its primary risk factors: diabetes mellitus, the level of lipids in the blood, coronary artery function, and kidney function. Class labels were assigned according to the values of these factors: risk level 1, risk level 2, and risk level 3. The evaluation of classifier performance (accuracy, sensitivity, and specificity) showed that the proposed model predicted the class label of tuples correctly (above 80%). More than eighty percent of the respondents (cardiologists and internists) who participated in the evaluation session agreed to strongly agreed that this research followed medical procedures and that the results can support medical analysis related to cardiovascular disease. The research showed that the proposed model achieves good performance for risk level detection of cardiovascular disease.
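A categorical naïve Bayes classifier of the kind the abstract describes can be sketched in a few lines. The risk-factor names and toy records below are hypothetical, not the paper's data, and the add-one smoothing is a standard choice rather than the authors' documented one.

```python
from collections import defaultdict, Counter

def train_nb(records, labels):
    """Fit a categorical naive Bayes model: class counts and, per
    (feature, class) pair, counts of each observed feature value."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)
    for rec, y in zip(records, labels):
        for f, v in rec.items():
            value_counts[(f, y)][v] += 1
    return class_counts, value_counts

def predict_nb(model, rec):
    """Pick the class maximizing P(class) * prod P(value | class),
    with add-one (Laplace) smoothing of each likelihood."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best, best_p = None, -1.0
    for y, cy in class_counts.items():
        p = cy / total
        for f, v in rec.items():
            counts = value_counts[(f, y)]
            p *= (counts[v] + 1) / (cy + len(counts) + 1)  # smoothed
        if p > best_p:
            best, best_p = y, p
    return best

# Toy records with hypothetical risk factors (not the paper's data).
X = [{"diabetes": "yes", "lipids": "high"},
     {"diabetes": "no", "lipids": "normal"},
     {"diabetes": "yes", "lipids": "high"},
     {"diabetes": "no", "lipids": "high"}]
y = ["risk3", "risk1", "risk3", "risk2"]
model = train_nb(X, y)
pred = predict_nb(model, {"diabetes": "yes", "lipids": "high"})
```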
Case study of open-source enterprise resource planning implementation in a small business
NASA Astrophysics Data System (ADS)
Olson, David L.; Staley, Jesse
2012-02-01
Enterprise resource planning (ERP) systems have been recognised as offering great benefit to some organisations, although they are expensive and problematic to implement. The cost and risk make well-developed proprietary systems unaffordable to small businesses. Open-source software (OSS) has become a viable means of producing ERP system products. The question this paper addresses is the feasibility of OSS ERP systems for small businesses. A case is reported involving two efforts to implement freely distributed ERP software products in a small US make-to-order engineering firm. The case emphasises the potential of freely distributed ERP systems, as well as some of the hurdles involved in their implementation. The paper briefly reviews highlights of OSS ERP systems, with the primary focus on reporting the case experiences of efforts to implement ERPLite software and xTuple software. While both systems worked from a technical perspective, both failed due to economic factors. While these economic conditions led to imperfect results, the case demonstrates the feasibility of OSS ERP for small businesses. Both experiences are evaluated in terms of risk dimensions.
Progress toward a Semantic eScience Framework; building on advanced cyberinfrastructure
NASA Astrophysics Data System (ADS)
McGuinness, D. L.; Fox, P. A.; West, P.; Rozell, E.; Zednik, S.; Chang, C.
2010-12-01
The configurable and extensible semantic eScience framework (SESF) has begun development and implementation of several semantic application components. Extensions and improvements to several ontologies have been made based on distinct interdisciplinary use cases ranging from solar physics to biological and chemical oceanography. Importantly, these semantic representations mediate access to a diverse set of existing and emerging cyberinfrastructure. Among the advances is the population of triple stores with web-accessible query services. A triple store is akin to a relational data store where the basic stored unit is a subject-predicate-object tuple. Access via a query is provided by the W3C Recommendation query language SPARQL. Upon this middle tier of semantic cyberinfrastructure, we have developed several forms of semantic faceted search, including provenance-awareness. We report on the rapid advances in semantic technologies and tools and how we are sustaining the software path for the required technical advances, as well as the ontology improvements and increased functionality of the semantic applications, including how they are integrated into web-based portals (e.g., Drupal) and web services. Lastly, we indicate future work directions and opportunities for collaboration.
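The subject-predicate-object storage model described above can be illustrated with a toy in-memory triple store, where None plays the role of a SPARQL variable. The dataset names below are invented for illustration; a real deployment would use an actual triple store queried via SPARQL.

```python
def match(triples, pattern):
    """Match a (subject, predicate, object) pattern against a triple
    store; None acts like a SPARQL variable and matches anything.
    SPARQL analogue: SELECT ?s WHERE { ?s :hasDomain "oceanography" }"""
    s, p, o = pattern
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Toy triples in the spirit of a science-data ontology (illustrative names).
triples = [
    ("dataset42", "hasDomain", "oceanography"),
    ("dataset42", "derivedFrom", "cruiseLog7"),
    ("dataset99", "hasDomain", "solar-physics"),
]
oceano = match(triples, (None, "hasDomain", "oceanography"))
```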
Shi, Hua; Liu, Hu-Chen; Li, Ping; Xu, Xue-Guo
2017-01-01
With increased worldwide awareness of environmental issues, healthcare waste (HCW) management has received much attention from both researchers and practitioners over the past decade. The task of selecting the optimum treatment technology for HCWs is a challenging decision making problem involving conflicting evaluation criteria and multiple stakeholders. In this paper, we develop an integrated decision making framework based on cloud model and MABAC method for evaluating and selecting the best HCW treatment technology from a multiple stakeholder perspective. The introduced framework deals with uncertain linguistic assessments of alternatives by using interval 2-tuple linguistic variables, determines decision makers' relative weights based on the uncertainty and divergence degrees of every decision maker, and obtains the ranking of all HCW disposal alternatives with the aid of an extended MABAC method. Finally, an empirical example from Shanghai, China, is provided to illustrate the feasibility and effectiveness of the proposed approach. Results indicate that the proposed methodology is more suitable and effective for handling the HCW treatment technology selection problem in a vague and uncertain information environment. Copyright © 2016 Elsevier Ltd. All rights reserved.
Information theory, animal communication, and the search for extraterrestrial intelligence
NASA Astrophysics Data System (ADS)
Doyle, Laurance R.; McCowan, Brenda; Johnston, Simon; Hanser, Sean F.
2011-02-01
We present ongoing research in the application of information theory to animal communication systems with the goal of developing additional detectors and estimators for possible extraterrestrial intelligent signals. Regardless of the species, for intelligence (i.e., complex knowledge) to be transmitted certain rules of information theory must still be obeyed. We demonstrate some preliminary results of applying information theory to socially complex marine mammal species (bottlenose dolphins and humpback whales) as well as arboreal squirrel monkeys, because they almost exclusively rely on vocal signals for their communications, producing signals which can be readily characterized by signal analysis. Metrics such as Zipf's Law and higher-order information-entropic structure are emerging as indicators of the communicative complexity characteristic of an "intelligent message" content within these animals' signals, perhaps not surprising given these species' social complexity. In addition to human languages, for comparison we also apply these metrics to pulsar signals—perhaps (arguably) the most "organized" of stellar systems—as an example of astrophysical systems that would have to be distinguished from an extraterrestrial intelligence message by such information theoretic filters. We also look at a message transmitted from Earth (Arecibo Observatory) that contains a lot of meaning but little information in the mathematical sense we define it here. We conclude that the study of non-human communication systems on our own planet can make a valuable contribution to the detection of extraterrestrial intelligence by providing quantitative general measures of communicative complexity. Studying the complex communication systems of other intelligent species on our own planet may also be one of the best ways to deprovincialize our thinking about extraterrestrial communication systems in general.
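The information-theoretic metrics mentioned above can be sketched as follows: a least-squares Zipf rank-frequency slope and a first-order Shannon entropy for a symbol repertoire. This is a minimal sketch of the general idea, not the authors' signal-analysis pipeline; the synthetic sample is built to follow a 1/rank frequency profile.

```python
import math
from collections import Counter

def zipf_exponent(symbols):
    """Least-squares slope of log(frequency) vs log(rank): roughly -1
    for Zipf-like repertoires."""
    freqs = sorted(Counter(symbols).values(), reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

def shannon_entropy(symbols):
    """First-order entropy in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# Synthetic repertoire whose frequencies follow 1/rank.
sample = []
for rank in range(1, 51):
    sample.extend([rank] * (1000 // rank))
slope = zipf_exponent(sample)
h1 = shannon_entropy(sample)
```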
Data analytics for simplifying thermal efficiency planning in cities.
Abdolhosseini Qomi, Mohammad Javad; Noshadravan, Arash; Sobstyl, Jake M; Toole, Jameson; Ferreira, Joseph; Pellenq, Roland J-M; Ulm, Franz-Josef; Gonzalez, Marta C
2016-04-01
More than 44% of building energy consumption in the USA is used for space heating and cooling, and this accounts for 20% of national CO2 emissions. This prompts the need to identify, among the 130 million households in the USA, those with the greatest energy-saving potential and the associated costs of the path to reach that goal. Whereas current solutions address this problem by analysing each building in detail, we herein reduce the dimensionality of the problem by simplifying the calculations of energy losses in buildings. We present a novel inference method, applied via a ranking algorithm, that allows us to estimate the potential energy saving for heating purposes. To that end, we need only consumption records from gas bills integrated with a building's footprint. The method entails a statistical screening of the intricate interplay between weather, infrastructural and residents' choice variables to determine building gas consumption and potential savings at a city scale. We derive a general statistical pattern of consumption in an urban settlement, reducing it to a set of the most influential building parameters that operate locally. By way of example, the implications are explored using records of a set of (N = 6200) buildings in Cambridge, MA, USA, which indicate that retrofitting only 16% of buildings entails a 40% reduction in gas consumption of the whole building stock. We find that the inferred heat loss rate of buildings exhibits a power-law data distribution akin to Zipf's law, which provides a means to map an optimum path for gas savings per retrofit at a city scale. These findings have implications for improving the thermal efficiency of cities' building stock, as outlined by current policy efforts seeking to reduce home heating and cooling energy consumption and lower associated greenhouse gas emissions. © 2016 The Author(s).
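The ranking idea, retrofit the leakiest buildings first under a Zipf-like heat-loss distribution, can be sketched as follows; the toy loss values are illustrative, not the Cambridge data.

```python
def retrofit_share(losses, target_saving):
    """Greedily retrofit the leakiest buildings first and return the
    fraction of the stock needed to cut total consumption by
    `target_saving` (a fraction between 0 and 1)."""
    ranked = sorted(losses, reverse=True)
    total = sum(ranked)
    saved, k = 0.0, 0
    for loss in ranked:
        if saved >= target_saving * total:
            break
        saved += loss
        k += 1
    return k / len(ranked)

# Zipf-like (heavy-tailed) toy losses: a few buildings dominate consumption.
losses = [1000 / r for r in range(1, 1001)]
share = retrofit_share(losses, 0.40)
```

Because the loss distribution is heavy-tailed, a small share of buildings accounts for a large share of the potential savings, which is the qualitative point of the abstract's 16%-for-40% result.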
Bettencourt, Luís M. A.; Lobo, José
2016-01-01
Over the last few decades, in disciplines as diverse as economics, geography and complex systems, a perspective has arisen proposing that many properties of cities are quantitatively predictable due to agglomeration or scaling effects. Using new harmonized definitions for functional urban areas, we examine to what extent these ideas apply to European cities. We show that while most large urban systems in Western Europe (France, Germany, Italy, Spain, UK) approximately agree with theoretical expectations, the small number of cities in each nation and their natural variability preclude drawing strong conclusions. We demonstrate how this problem can be overcome so that cities from different urban systems can be pooled together to construct larger datasets. This leads to a simple statistical procedure to identify urban scaling relations, which then clearly emerge as a property of European cities. We compare the predictions of urban scaling to Zipf's law for the size distribution of cities and show that while the former holds well, the latter is a poor descriptor of European cities. We conclude with scenarios for the size and properties of future pan-European megacities and their implications for the economic productivity, technological sophistication and regional inequalities of an integrated European urban system. PMID:26984190
Genus age, provincial area and the taxonomic structure of marine faunas.
Harnik, Paul G; Jablonski, David; Krug, Andrew Z; Valentine, James W
2010-11-22
Species are unevenly distributed among genera within clades and regions, with most genera species-poor and few species-rich. At regional scales, this structure to taxonomic diversity is generated via speciation, extinction and geographical range dynamics. Here, we use a global database of extant marine bivalves to characterize the taxonomic structure of climate zones and provinces. Our analyses reveal a general, Zipf-Mandelbrot form to the distribution of species among genera, with faunas from similar climate zones exhibiting similar taxonomic structure. Provinces that contain older taxa and/or encompass larger areas are expected to be more species-rich. Although both median genus age and provincial area correlate with measures of taxonomic structure, these relationships are interdependent, nonlinear and driven primarily by contrasts between tropical and extra-tropical faunas. Provincial area and taxonomic structure are largely decoupled within climate zones. Counter to the expectation that genus age and species richness should positively covary, diverse and highly structured provincial faunas are dominated by young genera. The marked differences between tropical and temperate faunas suggest strong spatial variation in evolutionary rates and invasion frequencies. Such variation contradicts biogeographic models that scale taxonomic diversity to geographical area.
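The Zipf-Mandelbrot form mentioned above is a one-line formula; the sketch below just shows how a positive offset q flattens the head of the distribution relative to pure Zipf. Parameter values are illustrative, not fitted to the bivalve data.

```python
def zipf_mandelbrot(rank, q, s):
    """Expected relative abundance of the rank-th genus under a
    Zipf-Mandelbrot law f(r) proportional to 1/(r + q)^s; q > 0
    flattens the head, s controls how fast the tail decays."""
    return 1.0 / (rank + q) ** s

# Head ratio f(1)/f(2): exactly 2 for pure Zipf (q = 0, s = 1),
# smaller when the Mandelbrot offset q is positive.
pure = zipf_mandelbrot(1, 0, 1) / zipf_mandelbrot(2, 0, 1)
offset = zipf_mandelbrot(1, 2, 1) / zipf_mandelbrot(2, 2, 1)
```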
The dynamics of correlated novelties.
Tria, F; Loreto, V; Servedio, V D P; Strogatz, S H
2014-07-31
Novelties are a familiar part of daily life. They are also fundamental to the evolution of biological systems, human society, and technology. By opening new possibilities, one novelty can pave the way for others in a process that Kauffman has called "expanding the adjacent possible". The dynamics of correlated novelties, however, have yet to be quantified empirically or modeled mathematically. Here we propose a simple mathematical model that mimics the process of exploring a physical, biological, or conceptual space that enlarges whenever a novelty occurs. The model, a generalization of Polya's urn, predicts statistical laws for the rate at which novelties happen (Heaps' law) and for the probability distribution on the space explored (Zipf's law), as well as signatures of the process by which one novelty sets the stage for another. We test these predictions on four data sets of human activity: the edit events of Wikipedia pages, the emergence of tags in annotation systems, the sequence of words in texts, and listening to new songs in online music catalogues. By quantifying the dynamics of correlated novelties, our results provide a starting point for a deeper understanding of the adjacent possible and its role in biological, cultural, and technological evolution.
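The urn model with triggering can be sketched as follows, under assumed reinforcement and triggering parameters (rho extra copies per draw, nu + 1 brand-new colours per novelty); this is a simulation sketch of the model class, not the paper's exact specification.

```python
import random

def urn_with_triggering(steps, rho=2, nu=1, seed=0):
    """Simulate a Polya urn with triggering: each draw is reinforced
    with rho extra copies, and drawing a never-seen colour adds nu + 1
    brand-new colours, enlarging the 'adjacent possible'."""
    rng = random.Random(seed)
    urn = list(range(nu + 1))      # initial colours
    next_colour = nu + 1
    seen, sequence = set(), []
    for _ in range(steps):
        ball = rng.choice(urn)
        sequence.append(ball)
        urn.extend([ball] * rho)   # reinforcement
        if ball not in seen:       # novelty: trigger the adjacent possible
            seen.add(ball)
            urn.extend(range(next_colour, next_colour + nu + 1))
            next_colour += nu + 1
    return sequence

seq = urn_with_triggering(3000)
distinct = len(set(seq))   # grows sublinearly, Heaps-like, roughly n**(nu/rho)
```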
Power Law Distributions in Two Community Currencies
NASA Astrophysics Data System (ADS)
Kichiji, N.; Nishibe, M.
2007-07-01
The purpose of this paper is to highlight certain newly discovered social phenomena that accord with Zipf's law, in addition to the famous natural and social phenomena, including word frequencies, earthquake magnitudes, city sizes, incomes, etc., that are already known to follow it. These phenomena have recently been discovered in the transaction amount (payments or receipts) distributions within two different Community Currencies (CCs) that had been initiated as social experiments. One is a local CC circulating in a specific geographical area, such as a town. The other is a virtual CC used among members who belong to a certain community of interest (COI) on the Internet. We conducted two empirical studies to estimate the economic vitalization effects they had on their respective local economies. We found that the transaction amounts (payments and receipts) of the two CCs were distributed according to a power-law distribution with a unity rank exponent. In addition, we found differences between the two CCs with regard to the shapes of their distributions over the low-transaction range. The result may originate from the difference in methods of issuing CCs or in the magnitudes of the minimum-value unit; however, this result calls for further investigation.
Scaling features of noncoding DNA
NASA Technical Reports Server (NTRS)
Stanley, H. E.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.
1999-01-01
We review evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, bases thousands of base pairs apart are correlated. We do not find such a long-range correlation in the coding regions of the gene, and we utilize this fact to build a Coding Sequence Finder Algorithm, which uses statistical ideas to locate the coding regions of an unknown DNA sequence. Finally, we describe briefly some recent work adapting to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function, which reports that noncoding regions in eukaryotes display a larger redundancy than coding regions. Specifically, we consider the possibility that this result is solely a consequence of nucleotide concentration differences, as first noted by Bonhoeffer and his collaborators. We find that cytosine-guanine (CG) concentration does have a strong "background" effect on redundancy. However, we find that for the purine-pyrimidine binary mapping rule, which is not affected by the difference in CG concentration, the Shannon redundancy for the set of analyzed sequences is larger for noncoding regions compared to coding regions.
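The n-gram redundancy measurement under the purine-pyrimidine binary mapping can be sketched as follows; the toy sequences are illustrative (not GenBank data), and the redundancy definition (1 - H_n/n for a binary alphabet) is the standard Shannon one.

```python
import math
import random
from collections import Counter

PURINE_PYRIMIDINE = {"A": "R", "G": "R", "C": "Y", "T": "Y"}

def ngram_redundancy(dna, n):
    """Map DNA to the purine/pyrimidine binary alphabet and return the
    Shannon n-gram redundancy 1 - H_n / n (for a binary alphabet the
    maximum n-gram entropy is n bits)."""
    s = "".join(PURINE_PYRIMIDINE[b] for b in dna)
    grams = Counter(s[i:i + n] for i in range(len(s) - n + 1))
    total = sum(grams.values())
    h = -sum(c / total * math.log2(c / total) for c in grams.values())
    return 1.0 - h / n

random.seed(0)
repetitive = "ACAC" * 500                                      # dinucleotide repeat
shuffled = "".join(random.choice("ACGT") for _ in range(2000))
r_rep = ngram_redundancy(repetitive, 4)   # high redundancy
r_rnd = ngram_redundancy(shuffled, 4)     # near zero
```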
NASA Astrophysics Data System (ADS)
Rota, G.-C.; Siegel, Edward Carl-Ludwig
2011-03-01
Seminal Apostol[Math.Mag.81,3,178(08);Am.Math.Month.115,9,795(08)]-Rota[Intro.Prob. Thy.(95)-p.50-55] Dichotomy equivalence-class: set-theory: sets V multisets; closed V open; to Abromowitz-Stegun[Hdbk.Math.Fns.(64)]-ch.23,p.803!]: numbers/polynomials generating-functions: Euler V Bernoulli; to Siegel[Schrodinger Cent.Symp.(87); Symp.Fractals, MRS Fall Mtg.,(1989)-5-papers!] power-spectrum: 1/f^0-White V 1/f^1-Zipf/Pink (Archimedes) HYPERBOLICITY INEVITABILITY; to analytic-geometry Conic-Sections: Ellipse V (via Parabola) V Hyperbola; to Extent/Scale/Radius: Locality V Globality, Root-Causes/Ultimate-Origins: Dimensionality: odd-Z V (via fractal) V even-Z, to Symmetries/(Noether's-theorem connected)/Conservation-Laws Dichotomy: restored/conservation/convergence=0 V broken/non-conservation/divergence=/=0: with asymptotic-limit antipodes morphisms/crossovers: Eureka!!!; "FUZZYICS"="CATEGORYICS"!!! Connection to Kummer(1850) Bernoulli-numbers proof of FLT is via Siegel(CCNY;1964) < (1994)[AMS Joint Mtg. (2002)-Abs.973-60-124] short succinct physics proof: FLT = Least-Action Principle!!!
A worldwide model for boundaries of urban settlements.
Oliveira, Erneson A; Furtado, Vasco; Andrade, José S; Makse, Hernán A
2018-05-01
The shape of urban settlements plays a fundamental role in their sustainable planning. Properly defining the boundaries of cities is challenging and remains an open problem in the science of cities. Here, we propose a worldwide model to define urban settlements beyond their administrative boundaries through a bottom-up approach that takes into account geographical biases intrinsically associated with most societies around the world, and reflected in their different regional growing dynamics. The generality of the model allows one to study the scaling laws of cities at all geographical levels: countries, continents and the entire world. Our definition of cities is robust and conforms to one of the most famous results in the social sciences: Zipf's law. According to our results, the largest cities in the world are not in line with what was recently reported by the United Nations. For example, we find that the largest city in the world is an agglomeration of several small settlements close to each other, connecting three large settlements: Alexandria, Cairo and Luxor. Our definition of cities opens the door to the study of the economy of cities in a systematic way, independently of arbitrary definitions that employ administrative boundaries.
Quasirandom geometric networks from low-discrepancy sequences
NASA Astrophysics Data System (ADS)
Estrada, Ernesto
2017-08-01
We define quasirandom geometric networks using low-discrepancy sequences, such as Halton, Sobol, and Niederreiter. The networks are built in d dimensions by considering the d-tuples of digits generated by these sequences as the coordinates of the vertices of the networks in a d-dimensional unit hypercube I^d. Then, two vertices are connected by an edge if they are at a distance smaller than a connection radius. We investigate computationally 11 network-theoretic properties of two-dimensional quasirandom networks and compare them with analogous random geometric networks. We also study their degree distributions and their spectral density distributions. We conclude from this intensive computational study that, in terms of the uniformity of the distribution of the vertices in the unit square, the quasirandom networks look more random than the random geometric networks. We include an analysis of potential strategies for generating higher-dimensional quasirandom networks, where it is known that some of the low-discrepancy sequences are highly correlated. In this respect, we conclude that up to dimension 20, the use of scrambling, skipping and leaping strategies generates quasirandom networks with the desired properties of uniformity. Finally, we consider a diffusive process taking place on the nodes and edges of the quasirandom and random geometric graphs. We show that the diffusion time is shorter in the quasirandom graphs as a consequence of their larger structural homogeneity. In the random geometric graphs the diffusion produces clusters of concentration that slow the process down. Such clusters are a direct consequence of the heterogeneous and irregular distribution of the nodes in the unit square on which the generation of random geometric graphs is based.
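The construction can be sketched directly: Halton points (bases 2 and 3) as vertex coordinates, with an edge whenever two points fall within the connection radius. The O(n^2) neighbour search and the particular bases are simplifications for a small example, not the paper's full setup.

```python
def halton(index, base):
    """Radical-inverse (van der Corput / Halton) value for a given index:
    reflect the base-b digits of the index about the radix point."""
    f, r = 1.0, 0.0
    while index > 0:
        f /= base
        r += f * (index % base)
        index //= base
    return r

def quasirandom_geometric_graph(n, radius):
    """Place n vertices at 2-D Halton points (bases 2 and 3) and connect
    every pair closer than `radius`."""
    pts = [(halton(i, 2), halton(i, 3)) for i in range(1, n + 1)]
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)
             if (pts[i][0] - pts[j][0]) ** 2
              + (pts[i][1] - pts[j][1]) ** 2 < radius ** 2]
    return pts, edges

pts, edges = quasirandom_geometric_graph(200, 0.1)
```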
The Stochastic Evolutionary Game for a Population of Biological Networks Under Natural Selection
Chen, Bor-Sen; Ho, Shih-Ju
2014-01-01
In this study, a population of evolutionary biological networks is described by a stochastic dynamic system with intrinsic random parameter fluctuations due to genetic variations and external disturbances caused by environmental changes in the evolutionary process. Since information on environmental changes is unavailable and their occurrence is unpredictable, they can be considered as a game player with the potential to destroy phenotypic stability. The biological network needs to develop an evolutionary strategy to improve phenotypic stability as much as possible, so it can be considered as another game player in the evolutionary process, i.e., a stochastic Nash game of minimizing the maximum network evolution level caused by the worst environmental disturbances. Based on the nonlinear stochastic evolutionary game strategy, we find that some genetic variations can be used in natural selection to construct negative feedback loops, efficiently improving network robustness. This provides larger genetic robustness as a buffer against neutral genetic variations, as well as larger environmental robustness to resist environmental disturbances and maintain network phenotypic traits in the evolutionary process. In this situation, the robust phenotypic traits of stochastic biological networks can be more frequently selected by natural selection in evolution. However, if the harbored neutral genetic variations accumulate to a sufficiently large degree, and environmental disturbances are strong enough that network robustness can no longer confer enough genetic robustness and environmental robustness, then the phenotype robustness might break down. In this case, a network phenotypic trait may be pushed from one equilibrium point to another, changing the phenotypic trait and starting a new phase of network evolution through the hidden neutral genetic variations harbored in network robustness by adaptive evolution.
Further, the proposed evolutionary game is extended to an n-tuple evolutionary game of stochastic biological networks with m players (competitive populations) and k environmental dynamics. PMID:24558296
Aita, Takuyo; Nishigaki, Koichi
2012-11-01
To visualize a bird's-eye view of an ensemble of mitochondrial genome sequences for various species, we recently developed a novel method of mapping a biological sequence ensemble into three-dimensional (3D) vector space. First, we represented a biological sequence of a species s by a word-composition vector x(s), where its length |x(s)| represents the sequence length, its unit vector x(s)/|x(s)| represents the relative composition of the K-tuple words through the sequence, and the dimension, N = 4^K, is the number of all possible words of length K. Second, we mapped the vector x(s) to the 3D position vector y(s), based on the two following simple principles: (1) |y(s)| = |x(s)| and (2) the angle between y(s) and y(t) maximally correlates with the angle between x(s) and x(t). The mitochondrial genome sequences for 311 species, including 177 Animalia, 85 Fungi and 49 Green plants, were mapped into 3D space by using K = 7. The mapping was successful because the angles between vectors before and after the mapping highly correlated with each other (correlation coefficients were 0.92-0.97). Interestingly, the Animalia kingdom is distributed along a single arc belt (just like the Milky Way on a celestial globe), and the Fungi and Green plant kingdoms are distributed in a similar arc belt. These two arc belts intersect at their respective middle regions and form a cross structure, just like a jet aircraft fuselage and its wings. This new mapping method will allow researchers to intuitively interpret the visual information presented in the maps in a highly effective manner. Copyright © 2012 Elsevier Inc. All rights reserved.
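The word-composition vector and the angle it defines can be sketched as follows; the scaling that makes |x(s)| equal the sequence length is modeled on the abstract's principle (1), and K = 3 is used only to keep the toy example small (the paper uses K = 7).

```python
import math
from collections import Counter

def composition_vector(seq, k):
    """K-tuple word-composition vector x(s): overlapping k-word counts,
    scaled so that |x(s)| equals the sequence length."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    scale = len(seq) / norm
    return {w: c * scale for w, c in counts.items()}

def angle(u, v):
    """Angle (radians) between two sparse composition vectors."""
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return math.acos(max(-1.0, min(1.0, dot / (nu * nv))))

a = composition_vector("ACGT" * 100, 3)
b = composition_vector("AATT" * 100, 3)
theta = angle(a, b)   # disjoint 3-word sets give orthogonal vectors
```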
Demazure Modules, Fusion Products and Q-Systems
NASA Astrophysics Data System (ADS)
Chari, Vyjayanthi; Venkatesh, R.
2015-01-01
In this paper, we introduce a family of indecomposable finite-dimensional graded modules for the current algebra associated to a simple Lie algebra. These modules are indexed by a tuple of partitions, where α varies over a set of positive roots of the Lie algebra, and we assume that they satisfy a natural compatibility condition. In the case when the partitions are all rectangular, for instance, we prove that these modules are Demazure modules in various levels. As a consequence, we see that the defining relations of Demazure modules can be greatly simplified. We use this simplified presentation to relate our results to the fusion products, defined in (Feigin and Loktev in Am Math Soc Transl Ser (2) 194:61-79, 1999), of representations of the current algebra. We prove that the Q-system of (Hatayama et al. in Contemporary Mathematics, vol. 248, pp. 243-291, American Mathematical Society, Providence, 1998) extends to a canonical short exact sequence of fusion products of representations associated to certain special partitions. Finally, in the last section we deal with a special case and prove that the modules we define are just fusion products of irreducible representations of the associated current algebra, and we give monomial bases for these modules.
Long-Range Memory in Literary Texts: On the Universal Clustering of the Rare Words.
Tanaka-Ishii, Kumiko; Bunde, Armin
2016-01-01
A fundamental problem in linguistics is how literary texts can be quantified mathematically. It is well known that the frequency of a (rare) word in a text is roughly inversely proportional to its rank (Zipf's law). Here we address the complementary question: can the rhythm of the text, characterized by the arrangement of the rare words in the text, also be quantified mathematically in a similarly basic way? To this end, we consider representative classic single-authored texts from England/Ireland, France, Germany, China, and Japan. In each text, we classify each word by its rank. We focus on the rare words with ranks above some threshold Q and study the lengths of the (return) intervals between them. We find that for all texts considered, the probability S_Q(r) that the length of an interval exceeds r follows a perfect Weibull function, S_Q(r) = exp(-b_β r^β), with β around 0.7. The return intervals themselves are arranged in a long-range correlated, self-similar fashion, where the autocorrelation function C_Q(s) of the intervals follows a power law, C_Q(s) ∼ s^(-γ), with an exponent γ between 0.14 and 0.48. We show that these features lead to a pronounced clustering of the rare words in the text.
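The return-interval statistics can be sketched as follows: rank words by frequency, keep only those ranked above a threshold Q, and collect the gaps between their successive positions. The toy text and threshold are illustrative only; the Weibull fit to the empirical survival function is omitted.

```python
from collections import Counter

def return_intervals(words, q):
    """Gaps between successive occurrences of 'rare' words, i.e. words
    whose frequency rank exceeds the threshold q."""
    ranks = {w: r for r, (w, _) in
             enumerate(Counter(words).most_common(), start=1)}
    positions = [i for i, w in enumerate(words) if ranks[w] > q]
    return [b - a for a, b in zip(positions, positions[1:])]

def survival(intervals, r):
    """Empirical S_Q(r): fraction of intervals longer than r."""
    return sum(1 for x in intervals if x > r) / len(intervals)

text = ("the quick brown fox jumps over the lazy dog " * 50).split()
iv = return_intervals(text, 3)
```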
Furlanello, Cesare; Serafini, Maria; Merler, Stefano; Jurman, Giuseppe
2003-11-06
We describe the E-RFE method for gene ranking, which is useful for the identification of markers in the predictive classification of array data. The method supports a practical modeling scheme designed to avoid the construction of classification rules based on the selection of too small gene subsets (an effect known as selection bias, in which the estimated predictive errors are too optimistic due to testing on samples already considered in the feature selection process). With E-RFE, we speed up recursive feature elimination (RFE) with SVM classifiers by eliminating chunks of uninteresting genes using an entropy measure of the SVM weight distribution. An optimal subset of genes is selected according to a two-strata model evaluation procedure: modeling is replicated by an external stratified-partition resampling scheme and, within each run, an internal K-fold cross-validation is used for E-RFE ranking. Also, the optimal number of genes can be estimated according to the saturation of Zipf's law profiles. Without decreasing classification accuracy, E-RFE allows a speed-up factor of 100 with respect to standard RFE, while improving on alternative parametric RFE reduction strategies. Thus, a process for gene selection and error estimation is made practical, ensuring control of the selection bias and providing additional diagnostic indicators of gene importance.
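The entropy-driven chunk elimination at the heart of E-RFE can be sketched as follows. The mapping from weight entropy to chunk size (a fraction of the feature count scaled by the normalized entropy) is an assumption made here for illustration; the paper's exact schedule may differ.

```python
import math

def weight_entropy(weights):
    """Normalized Shannon entropy of the |w| distribution: near 1 when
    weights are spread evenly, near 0 when a few features dominate."""
    mags = [abs(w) for w in weights]
    total = sum(mags)
    ps = [m / total for m in mags if m > 0]
    h = -sum(p * math.log2(p) for p in ps)
    return h / math.log2(len(mags))

def chunk_size(weights, frac=0.5):
    """When entropy is high, many features look equally uninteresting and
    a large chunk can be dropped at once; when entropy is low, fall back
    toward cautious one-by-one elimination (standard RFE behaviour)."""
    return max(1, round(frac * weight_entropy(weights) * len(weights)))

flat = [1.0] * 100                   # uninformative, uniform weights
peaked = [10.0] * 2 + [0.01] * 98    # two dominant features
```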
NASA Astrophysics Data System (ADS)
Chavira, Aldo; Gregson, Victor, Jr.; Green, Sidney; Siegel, Edward
2011-06-01
SHOCK impulse-jerk (I-J) [apply strain/impulse to get stress/jerk], vs. non-shock [apply stress to get strain], plasticity/fracture BAE [E. S.: MSE 8, 310 (71); PSS (a) 5, 601/607 (71); Cryst.-Latt. Defects 5, 277 (74); Scripta Met. 6, 785 (72); 8, 587/617 (74); 3rd Tokyo A.-E. Symp. (76); Acta Met. 25, 383 (77); JMMM 7, 312 (78)]: "1/ω-noise" Zipf (non-Pareto) power-law universality power-spectrum is manifestly demonstrated, in a purely mathematical way, to be nothing but a rediscovery of d[F(t) = m(t)a(t) = Newton's law of motion = (I-J)]/dt, the I-J derivative d(I-J)/dt = dF(t)/dt = m(t) da(t)/dt + a(t) dm(t)/dt. The non-shock physics derivation fails; pure maths: dF(t)/dt = d²p(t)/dt² = m(t) da(t)/dt + a(t) dm(t)/dt, a triple integral [vs. the non-shock F = ma time-series double integral]. Dichotomy: s(t) = v0 + (1/2)a(t)t² + extra term(s), vs. s(t) = v0t + (1/2)at²; an integral transform formally defines the power-spectrum dichotomy.
NASA Astrophysics Data System (ADS)
Siegel, Edward; Nabarro, Frank; Brailsford, Alan; Tatro, Clement
2011-06-01
Non-shock plasticity/fracture BAE [E.S.: MSE 8, 310 (71); PSS (a) 5, 601/607 (71); Cryst.-Latt. Defects 5, 277 (74); Scripta Met. 6, 785 (72); 8, 587/617 (74); 3rd Tokyo AE Symp. (76); Acta Met. 25, 383 (77); JMMM 7, 312 (78)] "1/ω-noise" power-spectrum "pink"-Zipf (not "red"-Pareto) power-law universality is manifestly demonstrated in two distinct ways to be nothing but a rediscovery of Newton's law of motion F = ma (aka "Bak" (1988) "SOC"; 1687 <<< 1988: 1988 − 1687 = 301 years!). Physics (1687): cross-multiplied, F = ma is rewritten as 1/m = a/F = output/input = effect/cause = inverse-mass mechanical susceptibility X(ω); X(ω) ~ (fluctuation-dissipation theorem) ~ P(ω) noise power-spectrum. ("Max & Al show"): E ~ ω, and E ~ m (in any/all media with upper-limiting speeds). Thus ω ~ E ~ m; inverting, 1/ω ~ 1/E ~ 1/m ~ a/F = X(ω) ~ P(ω); thus the F = ma integral transform is "SOC"'s P(ω) ~ 1/ω. Pure maths: the F = ma double-integral time-series s(t) = v0t + (1/2)at², whose integral transform formally defines the power spectrum.
Yoshikura, Hiroshi
2018-04-27
The relation between the number of measles patients (y) and population size (x) was expressed by the equation y = ax^s, where a is a constant and s the slope of the plot; s was 2.04-2.17 for prefectures in Japan, i.e., the number of patients was proportional to the square of the prefecture population size. For European countries that joined the European Union no later than 2009, the slope was 1.43-1.87. The measles population dependency found among prefectures in Japan was thus scalable up to European countries. This was surprising because, unlike in Japan, population density in EU countries was not uniform and not proportional to population size. The population size dependency was not observed among Western Pacific and South-East Asian countries, probably on account of confounding interacting socioeconomic factors. Correlation between measles incidence and birth rate, infant mortality, or GDP per capita was almost insignificant. The size distribution of local infection clusters (LICs) of measles and rubella in Japan followed a power law. For measles, though the population dependency remained unchanged after "elimination", there was a change in the Zipf-type plot of LIC sizes. After the "elimination", LICs linked to importation-related outbreaks in less populated prefectures emerged as the top-ranked LICs.
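The exponent s in y = ax^s is conventionally estimated by ordinary least squares on log-transformed data. A minimal sketch (the populations and the constant 3e-9 are synthetic, chosen so the true exponent is 2, roughly the Japanese prefecture value reported above):

```python
import math

def power_law_fit(x, y):
    """Least-squares fit of log y = log a + s log x; returns (a, s)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(x)
    mx, my = sum(lx) / n, sum(ly) / n
    s = (sum((p - mx) * (q - my) for p, q in zip(lx, ly))
         / sum((p - mx) ** 2 for p in lx))
    return math.exp(my - s * mx), s

# synthetic prefecture data with patient counts proportional to population^2
pops = [1e5, 5e5, 1e6, 5e6, 1e7]
cases = [3e-9 * p ** 2 for p in pops]
a, s = power_law_fit(pops, cases)
```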
The Role of Discrete Global Grid Systems in the Global Statistical Geospatial Framework
NASA Astrophysics Data System (ADS)
Purss, M. B. J.; Peterson, P.; Minchin, S. A.; Bermudez, L. E.
2016-12-01
The United Nations Committee of Experts on Global Geospatial Information Management (UN-GGIM) has proposed the development of a Global Statistical Geospatial Framework (GSGF) as a mechanism for the establishment of common analytical systems that enable the integration of statistical and geospatial information. Conventional coordinate reference systems address the globe with a continuous field of points suitable for repeatable navigation and analytical geometry. While this continuous field is represented on a computer in a digitized and discrete fashion by tuples of fixed-precision floating point values, it is a non-trivial exercise to relate point observations spatially referenced in this way to areal coverages on the surface of the Earth. The GSGF states the need to move to gridded data delivery and the importance of using common geographies and geocoding. The challenges associated with meeting these goals are not new and there has been a significant effort within the geospatial community to develop nested gridding standards to tackle these issues over many years. These efforts have recently culminated in the development of a Discrete Global Grid Systems (DGGS) standard which has been developed under the auspices of the Open Geospatial Consortium (OGC). DGGS provide a fixed areal based geospatial reference frame for the persistent location of measured Earth observations, feature interpretations, and modelled predictions. DGGS address the entire planet by partitioning it into a discrete hierarchical tessellation of progressively finer resolution cells, which are referenced by a unique index that facilitates rapid computation, query and analysis. The geometry and location of the cell is the principal aspect of a DGGS. Data integration, decomposition, and aggregation are optimised in the DGGS hierarchical structure and can be exploited for efficient multi-source data processing, storage, discovery, transmission, visualization, computation, analysis, and modelling.
During the 6th Session of the UN-GGIM in August 2016 the role of DGGS in the context of the GSGF was formally acknowledged. This paper proposes to highlight the synergies and role of DGGS in the Global Statistical Geospatial Framework and to show examples of the use of DGGS to combine geospatial statistics with traditional geoscientific data.
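The hierarchical cell-index idea can be illustrated with a toy quadtree over a lat/lon rectangle. This is emphatically not a real DGGS (OGC-conformant systems use equal-area tessellations of polyhedral faces, not lat/lon boxes, and the function below is invented for illustration); it only shows the property the abstract relies on: coarser cells are prefixes of finer ones, so aggregation and decomposition are index operations.

```python
def dggs_cell(lat, lon, level):
    """Toy hierarchical cell index: recursive quadrant subdivision of the
    lat/lon rectangle. Each refinement level appends one quadrant digit,
    so a level-n cell identifier is a prefix of every descendant's."""
    lat0, lat1, lon0, lon1 = -90.0, 90.0, -180.0, 180.0
    digits = []
    for _ in range(level):
        mid_lat, mid_lon = (lat0 + lat1) / 2, (lon0 + lon1) / 2
        q = (2 if lat >= mid_lat else 0) + (1 if lon >= mid_lon else 0)
        digits.append(str(q))
        lat0, lat1 = (mid_lat, lat1) if lat >= mid_lat else (lat0, mid_lat)
        lon0, lon1 = (mid_lon, lon1) if lon >= mid_lon else (lon0, mid_lon)
    return "".join(digits)
```

Because identifiers nest by prefix, rolling statistics up from fine cells to coarse ones reduces to grouping by identifier prefix, which is the kind of integration-friendly behaviour the GSGF is after.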
Mahnke, Donna K.; Larson, Joshua M.; Ghanta, Sujana; Feng, Ying; Simpson, Pippa M.; Broeckel, Ulrich; Duffy, Kelly; Tweddell, James S.; Grossman, William J.; Routes, John M.; Mitchell, Michael E.
2010-01-01
22q11.2 Deletion syndrome (22q11.2 DS) [DiGeorge syndrome type 1 (DGS1)] occurs in ∼1:3,000 live births; 75% of children with DGS1 have severe congenital heart disease requiring early intervention. The gold standard for detection of DGS1 is fluorescence in situ hybridization (FISH) with a probe at the TUPLE1 gene. However, FISH is costly and is typically ordered in conjunction with a karyotype analysis that takes several days. Therefore, FISH is underutilized and the diagnosis of 22q11.2 DS is frequently delayed, often resulting in profound clinical consequences. Our goal was to determine whether multiplexed, quantitative real-time PCR (MQPCR) could be used to detect the haploinsufficiency characteristic of 22q11.2 DS. A retrospective blinded study was performed on 382 subjects who had undergone congenital heart surgery. MQPCR was performed with a probe localized to the TBX1 gene on human chromosome 22, a gene typically deleted in 22q11.2 DS. Cycle threshold (Ct) was used to calculate the relative gene copy number (rGCN). Confirmation analysis was performed with the Affymetrix 6.0 Genome-Wide SNP Array. With MQPCR, 361 subjects were identified as nondeleted with an rGCN near 1.0 and 21 subjects were identified as deleted with an rGCN near 0.5, indicative of a hemizygous deletion. The sensitivity (21/21) and specificity (361/361) of MQPCR to detect 22q11.2 deletions was 100% at an rGCN value drawn at 0.7. One of 21 subjects with a prior clinical (not genetically confirmed) DGS1 diagnosis was found not to carry the deletion, while another subject, not previously identified as DGS1, was detected as deleted and subsequently confirmed via microarray. The MQPCR assay is a rapid, inexpensive, sensitive, and specific assay that can be used to screen for 22q11.2 deletion syndrome. The assay is readily adaptable to high throughput. PMID:20551144
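The rGCN arithmetic can be sketched as follows. The abstract does not give the exact formula, so this assumes the standard ΔΔCt method with perfect two-fold amplification per PCR cycle (both assumptions of this sketch), under which a hemizygous deletion delays the target Ct by one cycle and halves the computed copy number:

```python
def relative_copy_number(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative gene copy number by the ddCt method, assuming exact
    two-fold amplification per cycle (an assumption of this sketch)."""
    ddct = (ct_target - ct_reference) - (ct_target_cal - ct_reference_cal)
    return 2.0 ** (-ddct)

def classify(rgcn, cutoff=0.7):
    """Call a hemizygous deletion when rGCN falls below the cutoff,
    mirroring the 0.7 decision value reported above."""
    return "deleted" if rgcn < cutoff else "nondeleted"

# a hemizygous deletion shifts the target Ct ~1 cycle later than the calibrator
normal = relative_copy_number(25.0, 24.0, 25.0, 24.0)    # rGCN = 1.0
deleted = relative_copy_number(26.0, 24.0, 25.0, 24.0)   # rGCN = 0.5
```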
Conceptual-driven classification for coding advice in health insurance reimbursement.
Li, Sheng-Tun; Chen, Chih-Chuan; Huang, Fernando
2011-01-01
With the non-stop increases in medical treatment fees, the economic survival of a hospital in Taiwan relies on the reimbursements received from the Bureau of National Health Insurance, which in turn depend on the accuracy and completeness of the content of the discharge summaries as well as the correctness of their International Classification of Diseases (ICD) codes. The purpose of this research is to reinforce the entire disease classification framework by supporting disease classification specialists in the coding process. This study developed an ICD code advisory system (ICD-AS) that performed knowledge discovery from discharge summaries and suggested ICD codes. Natural language processing and information retrieval techniques based on Zipf's law were applied to process the content of discharge summaries, and fuzzy formal concept analysis was used to analyze and represent the relationships between the medical terms identified by MeSH. In addition, a certainty factor used as reference during the coding process was calculated to account for uncertainty and strengthen the credibility of the outcome. Two sets of 360 and 2579 textual discharge summaries of patients suffering from cerebrovascular disease were processed to build up ICD-AS and to evaluate the prediction performance. A number of experiments were conducted to investigate the impact of system parameters on accuracy and compare the proposed model to traditional classification techniques including linear-kernel support vector machines. The comparison results showed that the proposed system achieves better overall performance in terms of several measures. In addition, some useful implication rules were obtained, which improve comprehension of the field of cerebrovascular disease and give insight into the relationships between relevant medical terms.
Our system contributes valuable guidance to disease classification specialists in the process of coding discharge summaries, which consequently brings benefits to patients, hospitals, and the healthcare system. Copyright © 2010 Elsevier B.V. All rights reserved.
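A common Zipf's-law heuristic in this kind of term selection is to discard both the highest-ranked terms (mostly function words) and hapax-like rare terms, keeping the mid-frequency band. A minimal sketch (the cutoff fractions and the toy token list are invented; the paper's actual pipeline also uses MeSH and fuzzy formal concept analysis, not shown):

```python
from collections import Counter

def zipf_filter(tokens, upper=0.05, lower=2):
    """Keep mid-frequency terms: drop the top fraction of ranks and any
    term occurring fewer than 'lower' times."""
    freq = Counter(tokens)
    ranked = [w for w, _ in freq.most_common()]
    cutoff = max(1, int(upper * len(ranked)))
    too_common = set(ranked[:cutoff])
    return [w for w in ranked if w not in too_common and freq[w] >= lower]

tokens = (["the"] * 6 + ["stroke"] * 3 + ["infarct"] * 2
          + ["scan"] * 2 + ["embolism"])
kept = zipf_filter(tokens)
```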
Efficient Execution Methods of Pivoting for Bulk Extraction of Entity-Attribute-Value-Modeled Data
Luo, Gang; Frey, Lewis J.
2017-01-01
Entity-attribute-value (EAV) tables are widely used to store data in electronic medical records and clinical study data management systems. Before they can be used by various analytical (e.g., data mining and machine learning) programs, EAV-modeled data usually must be transformed into conventional relational table format through pivot operations. This time-consuming and resource-intensive process is often performed repeatedly on a regular basis, e.g., to provide a daily refresh of the content in a clinical data warehouse. Thus, it would be beneficial to make pivot operations as efficient as possible. In this paper, we present three techniques for improving the efficiency of pivot operations: 1) filtering out EAV tuples related to unneeded clinical parameters early on; 2) supporting pivoting across multiple EAV tables; and 3) conducting multi-query optimization. We demonstrate the effectiveness of our techniques through implementation. We show that our optimized execution method of pivoting using these techniques significantly outperforms the current basic execution method of pivoting. Our techniques can be used to build a data extraction tool to simplify the specification of and improve the efficiency of extracting data from the EAV tables in electronic medical records and clinical study data management systems. PMID:25608318
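The basic pivot operation, including the early filtering of unneeded clinical parameters that the first technique describes, can be sketched in a few lines (toy in-memory version with invented column names; the paper's methods operate on database tables and add multi-table pivoting and multi-query optimization on top):

```python
def pivot_eav(eav_rows, attributes):
    """Pivot EAV triples (entity, attribute, value) into one relational
    row per entity, keeping only the requested attributes; tuples for
    unneeded attributes are filtered out early."""
    wanted = set(attributes)
    table = {}
    for entity, attr, value in eav_rows:
        if attr in wanted:
            table.setdefault(entity, dict.fromkeys(attributes))[attr] = value
    return [{"entity": e, **cols} for e, cols in sorted(table.items())]

eav = [
    (1, "age", 54), (1, "bp", "120/80"), (1, "note", "x"),
    (2, "age", 61), (2, "bp", "135/85"),
]
rows = pivot_eav(eav, ["age", "bp"])
```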
Repressive coping and self-reports of parenting.
Myers, L B; Brewin, C R; Winter, D A
1999-03-01
To investigate whether women who possess a repressive coping style (repressors) self-report more positive judgments of their childhood on questionnaire and repertory grid measures compared with non-repressors. Repressors (low anxiety-high defensiveness) were compared with a composite group of non-repressors, containing some low anxious (low anxiety-low defensiveness), some high anxious (high anxiety-low defensiveness), some defensive high anxious (high anxiety-high defensiveness) and some non-extreme scorers. Participants completed the Parental Bonding Instrument (PBI; Parker, Tupling & Brown, 1979) and a 10 x 10 repertory grid, Self-Identification Form. On the PBI, repressors scored significantly higher than non-repressors on paternal care and significantly lower on paternal overprotection. There were no group differences for maternal measures. On the repertory grid, repressors compared with non-repressors perceived (a) themselves as significantly closer to their father, a woman they like, and their ideal partner, and significantly further from a woman they dislike, and a man they dislike; and (b) their father as significantly closer to a woman they like, a partner/person they admire, and an ideal partner. In addition, repressors were significantly tighter on construing than non-repressors. The results supported the hypothesis that repressors would rate their interactions with their fathers more positively than non-repressors when allowed to do so on self-report measures.
Empirical cost models for estimating power and energy consumption in database servers
NASA Astrophysics Data System (ADS)
Valdivia Garcia, Harold Dwight
The explosive growth in the size of data centers, coupled with the widespread use of virtualization technology has brought power and energy consumption as major concerns for data center administrators. Provisioning decisions must take into consideration not only target application performance but also the power demands and total energy consumption incurred by the hardware and software to be deployed at the data center. Failure to do so will result in damaged equipment, power outages, and inefficient operation. Since database servers comprise one of the most popular and important server applications deployed in such facilities, it becomes necessary to have accurate cost models that can predict the power and energy demands that each database workloads will impose in the system. In this work we present an empirical methodology to estimate the power and energy cost of database operations. Our methodology uses multiple-linear regression to derive accurate cost models that depend only on readily available statistics such as selectivity factors, tuple size, numbers columns and relational cardinality. Moreover, our method does not need measurement of individual hardware components, but rather total power and energy consumption measured at a server. We have implemented our methodology, and ran experiments with several server configurations. Our experiments indicate that we can predict power and energy more accurately than alternative methods found in the literature.
The Structure and Evolution of Buyer-Supplier Networks
Mizuno, Takayuki; Souma, Wataru; Watanabe, Tsutomu
2014-01-01
In this paper, we investigate the structure and evolution of customer-supplier networks in Japan using a unique dataset that contains information on customer and supplier linkages for more than 500,000 incorporated non-financial firms for the five years from 2008 to 2012. We find, first, that the number of customer links is unequal across firms; the customer link distribution has a power-law tail with an exponent of unity (i.e., it follows Zipf's law). We interpret this as implying that competition among firms to acquire new customers yields winners with a large number of customers, as well as losers with fewer customers. We also show that the shortest path length for any pair of firms is, on average, 4.3 links. Second, we find that link switching is relatively rare. Our estimates indicate that the survival rate per year for customer links is 92 percent and for supplier links 93 percent. Third and finally, we find that firm growth rates tend to be more highly correlated the closer two firms are to each other in a customer-supplier network (i.e., the smaller is the shortest path length for the two firms). This suggests that a non-negligible portion of fluctuations in firm growth stems from the propagation of microeconomic shocks – shocks affecting only a particular firm – through customer-supplier chains. PMID:25000368
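The 4.3-link figure is an average BFS distance over connected firm pairs. A toy sketch of that computation (the five-firm adjacency structure is invented; the paper's network has over 500,000 firms):

```python
from collections import deque

def avg_shortest_path(adj):
    """Mean BFS distance over all connected ordered pairs in an
    undirected graph given as {node: set(neighbours)}."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(d for n, d in dist.items() if n != src)
        pairs += len(dist) - 1
    return total / pairs

# a toy supply chain: hub firm 0 supplies firms 1-3, and 3 supplies 4
adj = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4}, 4: {3}}
```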
Electoral Susceptibility and Entropically Driven Interactions
NASA Astrophysics Data System (ADS)
Caravan, Bassir; Levine, Gregory
2013-03-01
In the United States electoral system the election is usually decided by the electoral votes cast by a small number of ``swing states'' where the two candidates historically have roughly equal probabilities of winning. The effective value of a swing state is determined not only by the number of its electoral votes but by the frequency of its appearance in the set of winning partitions of the electoral college. Since the electoral vote values of swing states are not identical, the presence or absence of a state in a winning partition is generally correlated with the frequency of appearance of other states and, hence, their effective values. We quantify the effective value of states by an electoral susceptibility, χj, the variation of the winning probability with the ``cost'' of changing the probability of winning state j. Associating entropy with the logarithm of the number of appearances of a state within the set of winning partitions, the entropy per state (in effect, the chemical potential) is not additive and the states may be said to ``interact.'' We study χj for a simple model with a Zipf's law type distribution of electoral votes. We show that the susceptibility for small states is largest in ``one-sided'' electoral contests and smallest in close contests. This research was supported by Department of Energy DE-FG02-08ER64623, Research Corporation CC6535 (GL) and HHMI Scholar Program (BC)
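The "frequency of appearance in the set of winning partitions" can be computed by brute-force enumeration for a small toy college (the four states and their Zipf-like 12/rank vote counts are invented; the real 51-unit problem needs dynamic programming rather than the exponential enumeration below):

```python
from itertools import combinations

def winning_frequency(votes, majority):
    """For each state, count the winning coalitions (subsets of states
    whose electoral votes reach the majority) that contain it."""
    states = list(votes)
    freq = dict.fromkeys(states, 0)
    for r in range(1, len(states) + 1):
        for coalition in combinations(states, r):
            if sum(votes[s] for s in coalition) >= majority:
                for s in coalition:
                    freq[s] += 1
    return freq

# a toy electoral college with a Zipf-like (12/rank) vote distribution
votes = {"A": 12, "B": 6, "C": 4, "D": 3}
freq = winning_frequency(votes, majority=13)
```

Note how the largest state appears in more winning partitions than the others even beyond its vote share, the kind of non-additive "interaction" the abstract quantifies via the susceptibility χj.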
Spatial Linkage and Urban Expansion: AN Urban Agglomeration View
NASA Astrophysics Data System (ADS)
Jiao, L. M.; Tang, X.; Liu, X. P.
2017-09-01
Urban expansion displays different characteristics in each period. From the perspective of the urban agglomeration, studying the spatial and temporal characteristics of urban expansion plays an important role in understanding the complex relationship between urban expansion and the network structure of the urban agglomeration. We analyze urban expansion in the Yangtze River Delta Urban Agglomeration (YRD) through accessibility to and spatial interaction intensity from core cities as well as accessibility of the road network. Results show that: (1) The correlation between urban expansion intensity and spatial indicators such as location and space syntax variables is remarkable and positive, while it decreases after rapid expansion. (2) Urban expansion velocity displays a positive correlation with the spatial indicators mentioned above in the first (1980-1990) and second (1990-2000) periods. However, it exhibits a negative relationship in the third period (2000-2010), i.e., cities located on the periphery of the urban agglomeration developed more quickly. Consequently, we put forward the hypothesis of convergence of urban expansion in the rapid expansion stage. (3) Results of Zipf's law and Gibrat's law show that urban expansion in the YRD displays a convergent trend in the rapid expansion stage, with small and medium-sized cities growing faster. This study shows that spatial linkage plays an important but evolving role in urban expansion within the urban agglomeration. In addition, it serves as a reference for the planning of the Yangtze River Delta Urban Agglomeration and the regulation of urban expansion in other urban agglomerations.
Do young children have adult-like syntactic categories? Zipf's law and the case of the determiner.
Pine, Julian M; Freudenthal, Daniel; Krajewski, Grzegorz; Gobet, Fernand
2013-06-01
Generativist models of grammatical development assume that children have adult-like grammatical categories from the earliest observable stages, whereas constructivist models assume that children's early categories are more limited in scope. In the present paper, we test these assumptions with respect to one particular syntactic category, the determiner. This is done by comparing controlled measures of overlap in the set of nouns with which children and their caregivers use different instances of the determiner category in their spontaneous speech. In a series of studies, we show, first, that it is important to control for both sample size and vocabulary range when comparing child and adult overlap measures; second, that, once the appropriate controls have been applied, there is significantly less overlap in the nouns with which young children use the determiners a/an and the in their speech than in the nouns with which their caregivers use these same determiners; and, third, that the level of (controlled) overlap in the nouns that the children use with the determiners a/an and the increases significantly over the course of development. The implication is that children do not have an adult-like determiner category during the earliest observable stages, and that their knowledge of the determiner category only gradually approximates that of adults as a function of their linguistic experience. Copyright © 2013 Elsevier B.V. All rights reserved.
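The raw statistic at the heart of this comparison is an overlap measure over noun types. The sketch below shows only that raw measure with an optional, deliberately crude sample-size cap (the noun lists and the truncation-based "control" are invented; the paper's controls for sample size and vocabulary range are considerably more careful):

```python
def determiner_overlap(a_nouns, the_nouns, sample=None):
    """Fraction of noun types used with BOTH determiners, out of all noun
    types used with either. 'sample' crudely caps each token list to
    mimic a sample-size control (hypothetical simplification)."""
    if sample is not None:
        a_nouns, the_nouns = a_nouns[:sample], the_nouns[:sample]
    a_set, the_set = set(a_nouns), set(the_nouns)
    return len(a_set & the_set) / len(a_set | the_set)

child_a = ["ball", "dog", "car"]     # nouns a child used with a/an
child_the = ["dog", "spoon"]         # nouns the same child used with the
overlap = determiner_overlap(child_a, child_the)
```

On this measure, the paper's finding is that children's controlled overlap is significantly lower than their caregivers' and rises with development.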
Detecting overlapping instances in microscopy images using extremal region trees.
Arteta, Carlos; Lempitsky, Victor; Noble, J Alison; Zisserman, Andrew
2016-01-01
In many microscopy applications the images may contain both regions of low and high cell densities corresponding to different tissues or colonies at different stages of growth. This poses a challenge to most previously developed automated cell detection and counting methods, which are designed to handle either the low-density scenario (through cell detection) or the high-density scenario (through density estimation or texture analysis). The objective of this work is to detect all the instances of an object of interest in microscopy images. The instances may be partially overlapping and clustered. To this end we introduce a tree-structured discrete graphical model that is used to select and label a set of non-overlapping regions in the image by a global optimization of a classification score. Each region is labeled with the number of instances it contains; for example, regions can be selected that contain two or three object instances, by defining separate classes for tuples of objects in the detection process. We show that this formulation can be learned within the structured output SVM framework and that the inference in such a model can be accomplished using dynamic programming on a tree-structured region graph. Furthermore, the learning requires only weak annotations: a dot on each instance. The candidate regions for the selection are obtained as extremal regions of a surface computed from the microscopy image, and we show that the performance of the model can be improved by considering a proxy problem for learning the surface that allows better selection of the extremal regions. Furthermore, we consider a number of variations for the loss function used in the structured output learning. The model is applied and evaluated over six quite disparate data sets of images covering: fluorescence microscopy, weak-fluorescence molecular images, phase contrast microscopy and histopathology images, and is shown to exceed the state of the art in performance.
Copyright © 2015 Elsevier B.V. All rights reserved.
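The dynamic program over the region tree reduces to a simple recursion: for each node, either take the node's own best score or the best selections from its (nested, hence mutually exclusive) subtrees. A toy sketch (the tree, scores, and the single-score simplification are invented; the paper optimizes learned per-class classification scores, not one scalar per region):

```python
def best_selection(tree, scores, root="root"):
    """DP on a region tree: choose non-nested regions maximizing total
    score. At each node, either keep the node itself or the combined
    best selections from its child subtrees."""
    def solve(node):
        child_total = sum(solve(c) for c in tree.get(node, []))
        return max(scores.get(node, 0), child_total)
    return solve(root)

# extremal-region nesting: 'root' contains A and B; A contains A1, A2
tree = {"root": ["A", "B"], "A": ["A1", "A2"]}
scores = {"root": 1, "A": 5, "A1": 3, "A2": 4, "B": 2}
best = best_selection(tree, scores)
```

Here the optimum (9) keeps A1, A2 and B rather than the large overlapping regions A or root, which is exactly the non-overlap selection behaviour described above.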
Intermediate grouping on remotely sensed data using Gestalt algebra
NASA Astrophysics Data System (ADS)
Michaelsen, Eckart
2014-10-01
Human observers often achieve striking recognition performance on remotely sensed data unmatched by machine vision algorithms. This holds even for thermal images (IR) or synthetic aperture radar (SAR). Psychologists refer to these capabilities as Gestalt perceptive skills. Gestalt Algebra is a mathematical structure recently proposed for such laws of perceptual grouping. It gives operations for mirror symmetry, continuation in rows and rotational symmetric patterns. Each of these operations forms an aggregate-Gestalt of a tuple of part-Gestalten. Each Gestalt is attributed with a position, an orientation, a rotational frequency, a scale, and an assessment respectively. Any Gestalt can be combined with any other Gestalt using any of the three operations. Most often the assessment of the new aggregate-Gestalt will be close to zero. Only if the part-Gestalten perfectly fit into the desired pattern the new aggregate-Gestalt will be assessed with value one. The algebra is suitable in both directions: It may render an organized symmetric mandala using random numbers. Or it may recognize deep hidden visual relationships between meaningful parts of a picture. For the latter primitives must be obtained from the image by some key-point detector and a threshold. Intelligent search strategies are required for this search in the combinatorial space of possible Gestalt Algebra terms. Exemplarily, maximal assessed Gestalten found in selected aerial images as well as in IR and SAR images are presented.
iEnhancer-EL: Identifying enhancers and their strength with ensemble learning approach.
Liu, Bin; Li, Kai; Huang, De-Shuang; Chou, Kuo-Chen
2018-06-07
Identification of enhancers and their strength is important because they play a critical role in controlling gene expression. Although some bioinformatics tools have been developed, they are limited to discriminating enhancers from non-enhancers only. Recently, a two-layer predictor called "iEnhancer-2L" was developed that can be used to predict the enhancer's strength as well. However, its prediction quality needs further improvement to enhance the practical application value. A new predictor called "iEnhancer-EL" was proposed that contains two layers of predictors: the first one (for identifying enhancers) is formed by fusing an array of six key individual classifiers, and the second one (for their strength) is formed by fusing an array of ten key individual classifiers. All these key classifiers were selected from 171 elementary classifiers formed by SVM (Support Vector Machine) based on kmer, subsequence profile, and PseKNC (Pseudo K-tuple Nucleotide Composition), respectively. Rigorous cross-validations have indicated that the proposed predictor is remarkably superior to the existing state-of-the-art one in this area. A web server for iEnhancer-EL has been established at http://bioinformatics.hitsz.edu.cn/iEnhancer-EL/, by which users can easily get their desired results without the need to go through the mathematical details. bliu@hit.edu.cn, dshuang@tongji.edu.cn or kcchou@gordonlifescience.org. Supplementary data are available at Bioinformatics online.
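The kmer feature family mentioned above is just a normalized k-tuple frequency vector over the DNA sequence. A minimal sketch (PseKNC additionally appends sequence-order correlation terms, which are not shown here; the example sequence is invented):

```python
from collections import Counter

def kmer_composition(seq, k=2):
    """Normalized k-tuple (k-mer) frequency vector over a DNA sequence,
    the composition part underlying kmer/PseKNC encodings."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}

feats = kmer_composition("ACGTACGT", k=2)
```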
Optimal choice of word length when comparing two Markov sequences using a χ²-statistic.
Bai, Xin; Tang, Kujin; Ren, Jie; Waterman, Michael; Sun, Fengzhu
2017-10-03
Alignment-free sequence comparison using counts of word patterns (grams, k-tuples) has become an active research topic due to the large amount of sequence data from the new sequencing technologies. Genome sequences are frequently modelled by Markov chains, and the likelihood ratio test or the corresponding approximate χ²-statistic has been suggested to compare two sequences. However, it is not known how to best choose the word length k in such studies. We develop an optimal strategy to choose k by maximizing the statistical power of detecting differences between two sequences. Let the orders of the Markov chains for the two sequences be r1 and r2, respectively. We show through both simulations and theoretical studies that the optimal k = max(r1, r2) + 1 for both long sequences and next generation sequencing (NGS) read data. The orders of the Markov chains may be unknown, and several methods have been developed to estimate the orders of Markov chains based on both long sequences and NGS reads. We study the power loss of the statistics when the estimated orders are used. It is shown that the power loss is minimal for some of the estimators of the orders of Markov chains. Our studies provide guidelines on choosing the optimal word length for the comparison of Markov sequences.
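A word-count comparison of this kind can be sketched with a plain two-sample Pearson χ² over k-word spectra (a simplified homogeneity statistic; the paper's statistic additionally conditions on the Markov orders, which is what makes k = max(r1, r2) + 1 optimal):

```python
from collections import Counter

def chi2_word_counts(seq1, seq2, k):
    """Pearson chi-square statistic comparing the k-word count spectra
    of two sequences (toy two-sample homogeneity test)."""
    c1 = Counter(seq1[i:i + k] for i in range(len(seq1) - k + 1))
    c2 = Counter(seq2[i:i + k] for i in range(len(seq2) - k + 1))
    n1, n2 = sum(c1.values()), sum(c2.values())
    stat = 0.0
    for w in set(c1) | set(c2):
        pooled = (c1[w] + c2[w]) / (n1 + n2)
        stat += (c1[w] - n1 * pooled) ** 2 / (n1 * pooled)
        stat += (c2[w] - n2 * pooled) ** 2 / (n2 * pooled)
    return stat

same = chi2_word_counts("ACGTACGTACGT", "ACGTACGTACGT", k=2)
```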
Thermodynamics of firms' growth
Zambrano, Eduardo; Hernando, Alberto; Hernando, Ricardo; Plastino, Angelo
2015-01-01
The distribution of firms' growth and firms' sizes is a topic under intense scrutiny. In this paper, we show that a thermodynamic model based on the maximum entropy principle, with dynamical prior information, can be constructed that adequately describes the dynamics and distribution of firms' growth. Our theoretical framework is tested against a comprehensive database of Spanish firms, which covers, to a very large extent, Spain's economic activity, with a total of 1 155 142 firms evolving along a full decade. We show that the empirical exponent of Pareto's law, a rule often observed in the rank distribution of large-size firms, is explained by the capacity of the economic system for creating/destroying firms, and can be used to measure the health of a capitalist-based economy. Indeed, our model predicts that when the exponent is larger than 1, creation of firms is favoured; when it is smaller than 1, destruction of firms is favoured instead; and when it equals 1 (matching Zipf's law), the system is in a full macroeconomic equilibrium, entailing ‘free' creation and/or destruction of firms. For medium and smaller firm sizes, the dynamical regime changes, the whole distribution can no longer be fitted to a single simple analytical form and numerical prediction is required. Our model constitutes the basis for a full predictive framework regarding the economic evolution of an ensemble of firms. Such a structure can be potentially used to develop simulations and test hypothetical scenarios, such as economic crisis or the response to specific policy measures. PMID:26510828
Functional roles affect diversity-succession relationships for boreal beetles.
Gibb, Heloise; Johansson, Therese; Stenbacka, Fredrik; Hjältén, Joakim
2013-01-01
Species diversity commonly increases with succession and this relationship is an important justification for conserving large areas of old-growth habitats. However, species with different ecological roles respond differently to succession. We examined the relationship between a range of diversity measures and time since disturbance for boreal forest beetles collected over a 285-year forest chronosequence. We compared responses of "functional" groups related to threat status, dependence on dead wood habitats, diet and the type of trap in which they were collected (indicative of the breadth of ecologies of species). We examined fits of commonly used rank-abundance models for each age class and traditional and derived diversity indices. Rank-abundance distributions were closest to the Zipf-Mandelbrot distribution, suggesting little role for competition in structuring most assemblages. Diversity measures for most functional groups increased with succession, but differences in slopes were common. Evenness declined with succession; more so for red-listed species than common species. Saproxylic species increased in diversity with succession while non-saproxylic species did not. Slopes for fungivores were steeper than other diet groups, while detritivores were not strongly affected by succession. Species trapped using emergence traps (log specialists) responded more weakly to succession than those trapped using flight intercept traps (representing a broader set of ecologies). Species associated with microhabitats that accumulate with succession (fungi and dead wood) thus showed the strongest diversity responses to succession. These clear differences between functional group responses to forest succession should be considered in planning landscapes for optimum conservation value, particularly functional resilience.
Thermodynamics of firms' growth.
Zambrano, Eduardo; Hernando, Alberto; Fernández Bariviera, Aurelio; Hernando, Ricardo; Plastino, Angelo
2015-11-06
The distribution of firms' growth and firms' sizes is a topic under intense scrutiny. In this paper, we show that a thermodynamic model based on the maximum entropy principle, with dynamical prior information, can be constructed that adequately describes the dynamics and distribution of firms' growth. Our theoretical framework is tested against a comprehensive database of Spanish firms, which covers, to a very large extent, Spain's economic activity, with a total of 1,155,142 firms evolving along a full decade. We show that the empirical exponent of Pareto's law, a rule often observed in the rank distribution of large-size firms, is explained by the capacity of the economic system for creating/destroying firms, and can be used to measure the health of a capitalist-based economy. Indeed, our model predicts that when the exponent is larger than 1, creation of firms is favoured; when it is smaller than 1, destruction of firms is favoured instead; and when it equals 1 (matching Zipf's law), the system is in a full macroeconomic equilibrium, entailing 'free' creation and/or destruction of firms. For medium and smaller firm sizes, the dynamical regime changes, the whole distribution can no longer be fitted to a single simple analytical form and numerical prediction is required. Our model constitutes the basis for a full predictive framework regarding the economic evolution of an ensemble of firms. Such a structure can potentially be used to develop simulations and test hypothetical scenarios, such as an economic crisis or the response to specific policy measures. © 2015 The Authors. PMID:26510828
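The Pareto exponent discussed above is commonly estimated from the upper tail of a size distribution by maximum likelihood (the Hill estimator). A minimal, self-contained sketch with synthetic data (illustrative values, not the Spanish firm database):

```python
import math

def pareto_exponent_mle(sizes, xmin):
    """Maximum-likelihood (Hill) estimate of the Pareto exponent alpha for
    the tail x >= xmin: alpha = n / sum(ln(x / xmin))."""
    tail = [x for x in sizes if x >= xmin]
    return len(tail) / sum(math.log(x / xmin) for x in tail)

# Synthetic Pareto(alpha = 1.5) sample via the inverse CDF on a uniform grid,
# so the check is deterministic.
alpha, xmin, n = 1.5, 1.0, 10000
sizes = [xmin * (1 - (i + 0.5) / n) ** (-1 / alpha) for i in range(n)]
alpha_hat = pareto_exponent_mle(sizes, xmin)
```

In the paper's reading, an estimate above 1 would indicate net firm creation, below 1 net destruction, and 1 (Zipf's law) equilibrium.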
NASA Astrophysics Data System (ADS)
Siegel, Edward
2008-03-01
Buzzwordism, Bandwagonism, Sloganeering for: Fun, Profit, Survival, Ego = ethics DYSfunctionality: Digits log-law: Siegel INVERSION: bosons = digits; Excluded d = 0? P(0) = ∞ vs. P(1)
Anatomisation with slicing: a new privacy preservation approach for multiple sensitive attributes.
Susan, V Shyamala; Christopher, T
2016-01-01
An enormous quantity of personal health information has become available in recent decades, and tampering with any part of this information poses a great risk to the health care field. Existing anonymization methods, such as generalization and bucketization, are apt only for low-dimensional data with a single sensitive attribute. In this paper, an anonymization technique is proposed that combines the benefits of anatomization and an enhanced slicing approach, adhering to the principles of k-anonymity and l-diversity, for the purpose of dealing with high-dimensional data along with multiple sensitive attributes. The anatomization approach dissociates the correlation observed between the quasi-identifier attributes and sensitive attributes (SA) and yields two separate tables with non-overlapping attributes. In the enhanced slicing algorithm, vertical partitioning groups the correlated SA in the sensitive table together and thereby minimizes the dimensionality by employing the advanced clustering algorithm. In order to get the optimal size of buckets, tuple partitioning is conducted by MFA. The experimental outcomes indicate that the proposed method can preserve the privacy of data with numerous SA. The anatomization approach minimizes the loss of information, and the slicing algorithm helps in the preservation of correlation and utility, which in turn results in reducing the data dimensionality and information loss. The advanced clustering algorithms prove their efficiency by minimizing time and complexity. Furthermore, this work adheres to the principles of k-anonymity and l-diversity, and thus avoids privacy threats like membership, identity and attribute disclosure.
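The core anatomization idea - publishing quasi-identifiers and sensitive values in two tables linked only by a group id - can be sketched as follows. This is a simplified toy with l-diversity by construction, not the paper's full method (which adds enhanced slicing, advanced clustering and MFA-based tuple partitioning); all column names are illustrative.

```python
from collections import defaultdict

def anatomize(records, qi_cols, sa_col, l):
    """Toy anatomization: form buckets that each contain l records with l
    distinct sensitive values, then publish two tables linked only by a
    group id, so no row-level link between quasi-identifiers and sensitive
    values survives."""
    by_sa = defaultdict(list)
    for rec in records:
        by_sa[rec[sa_col]].append(rec)
    pools = list(by_sa.values())
    qit, st, gid = [], [], 0
    while True:
        pools = sorted((p for p in pools if p), key=len, reverse=True)
        if len(pools) < l:
            break  # leftovers cannot form an l-diverse bucket; suppress them
        for i in range(l):  # one record per distinct sensitive value
            rec = pools[i].pop()
            row = {c: rec[c] for c in qi_cols}
            row["group"] = gid
            qit.append(row)
            st.append({"group": gid, sa_col: rec[sa_col]})
        gid += 1
    return qit, st

records = [
    {"age": 30, "zip": "111", "disease": "flu"},
    {"age": 31, "zip": "112", "disease": "flu"},
    {"age": 40, "zip": "211", "disease": "hiv"},
    {"age": 41, "zip": "212", "disease": "hiv"},
    {"age": 50, "zip": "311", "disease": "cold"},
    {"age": 51, "zip": "312", "disease": "cold"},
]
qit, st = anatomize(records, ["age", "zip"], "disease", l=2)
```

Because each bucket mixes l distinct sensitive values, an attacker who links a person to a group still faces at least l equally plausible sensitive values.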
Luk, Jeremy W; Patock-Peckham, Julie A; King, Kevin M
2015-01-01
Parental warmth and autonomy granting are commonly thought of as protective factors against substance use among Caucasians. However, limited research has examined whether associations between parenting dimensions and substance use outcomes are the same or different among Asian Americans. A final analytic sample of 839 college students was used to test whether race (Caucasian vs. Asian American) moderated the relations between parenting dimensions and substance use outcomes across Caucasians and Asian Americans. We utilized the Parental Bonding Instrument (Parker, Tupling, & Brown, 1979) to measure maternal and paternal warmth, encouragement of behavioral freedom, and denial of psychological autonomy. Multivariate regression models controlling for covariates including age, gender, and paternal education indicated four significant parenting by race interactions on alcohol problems and/or marijuana use. Specifically, maternal warmth was inversely associated with both alcohol problems and marijuana use among Caucasians but not among Asian Americans. Both maternal and paternal denial of psychological autonomy were positively associated with alcohol problems among Caucasians but not among Asian Americans. Consistent with emerging cross-cultural research, the associations between parenting dimensions and substance use behaviors observed in Caucasian populations may not be readily generalized to Asian Americans. These findings highlight the importance of considering different parenting dimensions in understanding substance use etiology among Asian Americans. Future research should use longitudinal data to replicate these findings across development and seek to identify other parenting dimensions that may be more relevant for Asian American youth.
NASA Astrophysics Data System (ADS)
Othman, Rozmie R.; Ahmad, Mohd Zamri Zahir; Ali, Mohd Shaiful Aziz Rashid; Zakaria, Hasneeza Liza; Rahman, Md. Mostafijur
2015-05-01
Consuming 40 to 50 percent of software development cost, software testing is one of the most resource-consuming activities in the software development lifecycle. To ensure an acceptable level of quality and reliability of a typical software product, it is desirable to test every possible combination of input data under various configurations. Due to the combinatorial explosion problem, exhaustive testing is practically impossible. Resource constraints, cost factors as well as strict time-to-market deadlines are amongst the main factors that inhibit such consideration. Earlier work suggests that a sampling strategy (i.e. one based on t-way parameter interaction, called t-way testing) can be effective in reducing the number of test cases without affecting the fault detection capability. However, for a very large system, even a t-way strategy will produce a large test suite that needs to be executed. In the end, only part of the planned test suite can be executed in order to meet the aforementioned constraints. Hence, there is a need for test engineers to measure the effectiveness of a partially executed test suite in order to assess the risk they have to take. Motivated by the abovementioned problem, this paper presents an effectiveness comparison of partially executed t-way test suites generated by existing strategies using the tuples coverage method. With it, test engineers can predict the effectiveness of the testing process if only part of the original test cases are executed.
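The tuples coverage measure used above can be sketched for the t = 2 (pairwise) case: count how many of all possible 2-way parameter-value tuples the executed tests cover. A minimal illustration, not the paper's tool:

```python
from itertools import combinations, product

def pairwise_coverage(executed_tests, domains):
    """Fraction of all 2-way parameter-value tuples covered by the executed
    part of a test suite (the t = 2 case of t-way tuple coverage)."""
    indices = range(len(domains))
    all_pairs = {(i, j, vi, vj)
                 for i, j in combinations(indices, 2)
                 for vi, vj in product(domains[i], domains[j])}
    covered = {(i, j, test[i], test[j])
               for test in executed_tests
               for i, j in combinations(indices, 2)}
    return len(covered) / len(all_pairs)

# Three boolean parameters: 3 parameter pairs x 4 value combinations = 12 tuples.
domains = [(0, 1)] * 3
full = list(product(*domains))  # the exhaustive suite of 8 tests
partial = [(0, 0, 0)]           # a single executed test
```

Running the exhaustive suite yields coverage 1.0, while executing only the single test (0, 0, 0) covers 3 of the 12 tuples, i.e. 0.25.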
On the origins of generalized fractional calculus
NASA Astrophysics Data System (ADS)
Kiryakova, Virginia
2015-11-01
In Fractional Calculus (FC), as in the (classical) Calculus, the notions of derivatives and integrals (of first, second, etc. or arbitrary, incl. non-integer order) are basic and co-related. One of the most frequent approaches in FC is to define first the Riemann-Liouville (R-L) integral of fractional order, and then, by means of a suitable integer-order differentiation operation applied over it (or under its sign), to define a fractional derivative - in the R-L sense (or in the Caputo sense). The first mentioned (R-L type) is closer to the theoretical studies in analysis, but has some shortcomings - from the point of view of the interpretation of the initial conditions for Cauchy problems for fractional differential equations (stated also by means of fractional-order derivatives/integrals), and also because of the analysts' confusion that such a derivative of a constant is not zero in general. The Caputo (C-) derivative, arising first in geophysical studies, helps to overcome these problems and to describe models of applied problems with physically consistent initial conditions. The operators of the Generalized Fractional Calculus - GFC (integrals and derivatives) are based on commuting m-tuple (m = 1, 2, 3, …) compositions of operators of the classical FC with power weights (the so-called Erdélyi-Kober operators), but represented in compact and explicit form by means of integral, integro-differential (R-L type) or differential-integral (C-type) operators, where the kernels are special functions of most general hypergeometric kind. The foundations of this theory are given in Kiryakova [18]. In this survey we present the genesis of the definitions of the GFC - the generalized fractional integrals and derivatives (of fractional multi-order) of R-L type and Caputo type - and analyze their properties and applications.
Their special cases are all the known operators of classical FC, their generalizations introduced by other authors, the hyper-Bessel differential operators of higher integer order m as a multi-order (1, 1,…, 1), the Gelfond-Leontiev generalized differentiation operators, many other integral and differential operators in Calculus that have been used in various topics, some of them not related to FC at all, others involved in differential and integral equations for treating fractional order models.
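The statement above that the R-L derivative of a constant is not zero (while the Caputo derivative of a constant vanishes) can be checked numerically. The sketch below uses the Grünwald-Letnikov limit, which agrees with the R-L derivative for such functions; the R-L half-derivative of the constant 1 is t^(-1/2)/Γ(1/2). Function and parameter names are ours:

```python
import math

def gl_fractional_derivative(f, t, alpha, n=20000):
    """Grunwald-Letnikov approximation (agreeing with the Riemann-Liouville
    derivative here) of fractional order alpha at point t, using n steps."""
    h = t / n
    total, w = 0.0, 1.0  # w_k = (-1)**k * binom(alpha, k), built by recurrence
    for k in range(n + 1):
        total += w * f(t - k * h)
        w *= (k - alpha) / (k + 1)
    return total / h ** alpha

# Half-derivative (alpha = 0.5) of the constant function 1, at t = 1:
approx = gl_fractional_derivative(lambda t: 1.0, t=1.0, alpha=0.5)
exact = 1.0 / math.gamma(0.5)  # t**-alpha / Gamma(1 - alpha) at t = 1, nonzero
```

The nonzero result is exactly the "analysts' confusion" the abstract mentions; a Caputo derivative would first differentiate the constant (giving 0) and then integrate, yielding zero.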
Xu, Zhouying; Wu, Yang; Jiang, Yinghe; Zhang, Xiangling; Li, Junli; Ban, Yihui
2018-05-01
Over the last three decades, the presence of arbuscular mycorrhizal fungi (AMF) in wetland habitats has been proven, and the roles they play in wetland ecosystems and their potential functions in wastewater bioremediation installations are interesting issues. To increase knowledge of the functions of AMF in the plant-based bioremediation of wastewater, we constructed two vertical-flow wetlands planted with Phragmites australis and investigated AMF distribution in plant roots and their roles in the purification of wastewater polluted by heavy metals (HMs), utilizing the Illumina sequencing technique. A total of 17 operational taxonomic units (OTUs) from 33,031 AMF sequences were obtained, with Glomus being the most dominant. P. australis living in the two vertical-flow constructed wetlands (CWs) harbored diverse AMF comparable with the AM fungal communities in upland habitats. The AMF composition profiles of CW1 (vegetated with non-inoculated plants) and CW2 (vegetated with mycorrhizal plants inoculated with Rhizophagus intraradices) were significantly different. CW1 (15 OTUs) harbored more diverse AMF than CW2 (7 OTUs); however, CW2 harbored much more OTU13 than CW1. In addition, a Zipf species abundance distribution (SAD), which might be due to the heavy overdominance of OTU13, was observed across AM fungal taxa in P. australis roots of the two CWs. CW1 and CW2 showed high (> 70%) removal capacity of HMs. CW2 exhibited significantly higher Cd and Zn removal efficiencies than CW1 (CK) (p = 0.005 and p = 0.008, respectively). It was considered that AMF might play a role in HM removal in CWs.
Procura-PALavras (P-PAL): A Web-based interface for a new European Portuguese lexical database.
Soares, Ana Paula; Iriarte, Álvaro; de Almeida, José João; Simões, Alberto; Costa, Ana; Machado, João; França, Patrícia; Comesaña, Montserrat; Rauber, Andreia; Rato, Anabela; Perea, Manuel
2018-05-31
In this article, we present Procura-PALavras (P-PAL), a Web-based interface for a new European Portuguese (EP) lexical database. Based on a contemporary printed corpus of over 227 million words, P-PAL provides a broad range of word attributes and statistics, including several measures of word frequency (e.g., raw counts, per-million word frequency, logarithmic Zipf scale), morpho-syntactic information (e.g., parts of speech [PoSs], grammatical gender and number, dominant PoS, and frequency and relative frequency of the dominant PoS), as well as several lexical and sublexical orthographic (e.g., number of letters; consonant-vowel orthographic structure; density and frequency of orthographic neighbors; orthographic Levenshtein distance; orthographic uniqueness point; orthographic syllabification; and trigram, bigram, and letter type and token frequencies), and phonological measures (e.g., pronunciation, number of phonemes, stress, density and frequency of phonological neighbors, transposed and phonographic neighbors, syllabification, and biphone and phone type and token frequencies) for ~53,000 lemmatized and ~208,000 nonlemmatized EP word forms. To obtain these metrics, researchers can choose between two word queries in the application: (i) analyze words previously selected for specific attributes and/or lexical and sublexical characteristics, or (ii) generate word lists that meet word requirements defined by the user in the menu of analyses. For the measures it provides and the flexibility it allows, P-PAL will be a key resource to support research in all cognitive areas that use EP verbal stimuli. P-PAL is freely available at http://p-pal.di.uminho.pt/tools .
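The "logarithmic Zipf scale" listed among P-PAL's frequency measures is, in the common definition (van Heuven et al.), the base-10 log of a word's frequency per billion words. A minimal sketch (P-PAL's exact variant, e.g. any smoothing of low counts, may differ):

```python
import math

def zipf_scale(word_count, corpus_size):
    """Logarithmic Zipf scale of word frequency: log10 of frequency per
    billion words, i.e. log10(frequency per million) + 3."""
    per_million = word_count * 1e6 / corpus_size
    return math.log10(per_million) + 3

# 1,000 occurrences in a 10-million-word corpus -> 100 per million -> Zipf 5.0
z = zipf_scale(1000, 10_000_000)
```

The scale conveniently places very rare words around 1-2 and very frequent words around 6-7, which is easier to interpret than raw or per-million counts.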
Methods for semi-automated indexing for high precision information retrieval.
Berrios, Daniel C; Cucina, Russell J; Fagan, Lawrence M
2002-01-01
To evaluate a new system, ISAID (Internet-based Semi-automated Indexing of Documents), and to generate textbook indexes that are more detailed and more useful to readers. Pilot evaluation: simple, nonrandomized trial comparing ISAID with manual indexing methods. Methods evaluation: randomized, cross-over trial comparing three versions of ISAID and usability survey. Pilot evaluation: two physicians. Methods evaluation: twelve physicians, each of whom used three different versions of the system for a total of 36 indexing sessions. Total index term tuples generated per document per minute (TPM), with and without adjustment for concordance with other subjects; inter-indexer consistency; ratings of the usability of the ISAID indexing system. Compared with manual methods, ISAID decreased indexing times greatly. Using three versions of ISAID, inter-indexer consistency ranged from 15% to 65% with a mean of 41%, 31%, and 40% for each of three documents. Subjects using the full version of ISAID were faster (average TPM: 5.6) and had higher rates of concordant index generation. There were substantial learning effects, despite our use of a training/run-in phase. Subjects using the full version of ISAID were much faster by the third indexing session (average TPM: 9.1). There was a statistically significant increase in three-subject concordant indexing rate using the full version of ISAID during the second indexing session (p < 0.05). Users of the ISAID indexing system create complex, precise, and accurate indexing for full-text documents much faster than users of manual methods. Furthermore, the natural language processing methods that ISAID uses to suggest indexes contribute substantially to increased indexing speed and accuracy.
Identification of microRNA precursor with the degenerate K-tuple or Kmer strategy.
Liu, Bin; Fang, Longyun; Wang, Shanyi; Wang, Xiaolong; Li, Hongtao; Chou, Kuo-Chen
2015-11-21
The microRNA (miRNA), a small non-coding RNA molecule, plays an important role in transcriptional and post-transcriptional regulation of gene expression. Its abnormal expression, however, has been observed in many cancers and other disease states, implying that miRNA molecules are also deeply involved in these diseases, particularly in carcinogenesis. Therefore, it is important for both basic research and miRNA-based therapy to discriminate the real pre-miRNAs from the false ones (such as hairpin sequences with similar stem-loops). Most existing methods in this regard were based on the strategy in which RNA samples were formulated as vectors of their Kmer components. But the length of Kmers must be very short; otherwise, the vector's dimension would be extremely large, leading to the "high-dimension disaster" or overfitting problem. Inspired by the concept of "degenerate energy levels" in quantum mechanics, we introduced the "degenerate Kmer" (deKmer) to represent RNA samples. By doing so, we can not only accommodate long-range coupling effects but also avoid the high-dimension problem. Rigorous jackknife tests and cross-species experiments indicated that our approach is very promising. It has not escaped our notice that the deKmer approach can also be applied to many other areas of computational biology. A user-friendly web-server for the new predictor has been established at http://bioinformatics.hitsz.edu.cn/miRNA-deKmer/, by which users can easily get their desired results. Copyright © 2015 Elsevier Ltd. All rights reserved.
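The dimension-reduction idea behind the deKmer can be illustrated by collapsing the RNA alphabet into groups before counting Kmers. The grouping below (purines R = {A, G} vs. pyrimidines Y = {C, U}) is our illustrative choice, not the paper's actual deKmer construction; it only shows how merging "degenerate" symbols shrinks a 4^k feature space to 2^k, allowing longer k.

```python
from itertools import product

# Illustrative degeneracy map: purines vs. pyrimidines (assumption for the
# sketch; the paper's deKmer differs in detail).
DEGENERACY = {"A": "R", "G": "R", "C": "Y", "U": "Y"}

def dekmer_vector(rna, k):
    """Normalized k-mer frequencies over the reduced alphabet, collapsing
    4**k ordinary Kmer dimensions to 2**k."""
    keys = ["".join(kmer) for kmer in product("RY", repeat=k)]
    counts = dict.fromkeys(keys, 0)
    reduced = "".join(DEGENERACY[base] for base in rna)
    for i in range(len(reduced) - k + 1):
        counts[reduced[i:i + k]] += 1
    total = max(1, len(reduced) - k + 1)
    return {kmer: c / total for kmer, c in counts.items()}

v = dekmer_vector("GAUUACA", k=2)  # reduced sequence RRYYRYR, six 2-mers
```

With k = 10, an ordinary Kmer vector would have over a million dimensions, while the reduced one has 1024 - the kind of saving that avoids the "high-dimension disaster".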
The right-hand side of the Jacobi identity: to be naught or not to be ?
NASA Astrophysics Data System (ADS)
Kiselev, Arthemy V.
2016-01-01
The geometric approach to iterated variations of local functionals - e.g., of the (master-)action functional - resulted in an extension of the deformation quantisation technique to the set-up of Poisson models of field theory. It also allowed a rigorous proof of the main inter-relations between the Batalin-Vilkovisky (BV) Laplacian Δ and the variational Schouten bracket [,]. The ad hoc use of these relations had been a known analytic difficulty in the BV-formalism for quantisation of gauge systems; now achieved, the proof actually does not require the assumption of graded-commutativity. As explained in our previous work, geometry's self-regularisation is rendered by Gel'fand's calculus of singular linear integral operators supported on the diagonal. We now illustrate that analytic technique by inspecting the validity mechanism for the graded Jacobi identity which the variational Schouten bracket does satisfy (whence Δ² = 0, i.e., the BV-Laplacian is a differential acting in the algebra of local functionals). By using one tuple of three variational multi-vectors twice, we contrast the new logic of iterated variations - where the right-hand side of Jacobi's identity vanishes altogether - with the old method: interlacing its steps and stops, it could produce some non-zero representative of the trivial class in the top-degree horizontal cohomology. But we then show at once, by an elementary counterexample, why, in the frames of the old approach that did not rely on Gel'fand's calculus, the BV-Laplacian failed to be a graded derivation of the variational Schouten bracket.
Phillips, Steven; Niki, Kazuhisa
2002-10-01
Working memory is affected by items stored and the relations between them. However, separating these factors has been difficult, because increased items usually accompany increased associations/relations. Hence, some have argued, relational effects are reducible to item effects. We overcome this problem by manipulating index length: the smallest number of item positions at which there is a unique item, or tuple of items (if length > 1), for every instance in the relational (memory) set. Longer indexes imply greater similarity (number of shared items) between instances and higher load on encoding processes. Subjects were given lists of study pairs and asked to make a recognition judgement. The number of unique items and index length in the three list conditions were: (1) AB, CD: four/one; (2) AB, CD, EF: six/one; and (3) AB, AD, CB: four/two, respectively. Japanese letters were used in Experiments 1 (kanji-ideograms) and 2 (hiragana-phonograms); numbers in Experiment 3; and shapes generated from Fourier descriptors in Experiment 4. Across all materials, right dominant temporoparietal and middle frontal gyral activity was found with increased index length, but not items during study. In Experiment 5, a longer delay was used to isolate retention effects in the absence of visual stimuli. Increased left hemispheric activity was observed in the precuneus, middle frontal gyrus, and superior temporal gyrus with increased index length for the delay period. These results show that relational load is not reducible to item load.
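The index-length definition above is algorithmic: find the smallest set of positions whose joint values distinguish every instance. A short sketch reproducing the study's three list conditions:

```python
from itertools import combinations

def index_length(instances):
    """Smallest number of item positions whose joint values are unique for
    every instance in the relational set (the study's 'index length')."""
    arity = len(instances[0])
    for size in range(1, arity + 1):
        for positions in combinations(range(arity), size):
            projections = {tuple(inst[p] for p in positions)
                           for inst in instances}
            if len(projections) == len(instances):
                return size
    return arity  # duplicate instances: no projection separates them

# The three list conditions from the study:
cond1, cond2, cond3 = ["AB", "CD"], ["AB", "CD", "EF"], ["AB", "AD", "CB"]
```

For conditions 1 and 2 the first position alone is unique per pair (index length one), while in condition 3 both positions are needed (index length two), matching the four/one, six/one and four/two figures quoted above.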
NASA Astrophysics Data System (ADS)
Lewis, Thomas; Siegel, Edward
2011-06-01
ROTATIONAL ["spin-up"/"spin-down"] SHOCK(S) plasticity/fracture BAE [E.S.: MSE 8, 310 (71); PSS (a) 5, 601/607 (71); Xl.-Latt. Defects 5, 277 (74); Scripta Met. 6, 785 (72); 8, 587/617 (74); 3rd Tokyo A.-E. Symp. (76); Acta Met. 25, 383 (77); JMMM 7, 312 (78)] NON-"1/ω noise" Zipf-(Pareto) power-law universality power-spectrum is manifestly demonstrated, in two distinct ways, to be nothing but ROTATIONAL (in 2 or 3 dimensions) ANGULAR-momentum Newton's 3rd law of motion T = Iα = dJ/dt REdiscovery!!! A Siegel PHYSICS derivation FAILS!!! "PURE"-maths: dT(t)/dt = d²J(t)/dt² = I(t) dα(t)/dt + α(t) dI(t)/dt triple-integral vs. T = Iα double-integral time-series (T-S) dichotomy: θ(t) = [ω₀t + α(t)t²/2 + EXTRA-TERM(S)] vs. θ(t) = [ω₀t + αt²/2]. The integral transform formally defining the power-spectrum gives the dichotomy: P(ω) = ∫θ(t)e^(-iωt)dt = ∫[ω₀t + αt²/2]e^(-iωt)dt = ω₀∫t e^(-iωt)dt + ([α ≠ α(t)]/2)∫t² e^(-iωt)dt = ω₀ dδ(ω)/dω + ([α ≠ α(t)]/2) d²δ(ω)/dω²: if α = 0, then P(ω) ~ 1/ω⁰, vs. if α ≠ α(t) ≠ 0, then P(ω) ~ 1/ω^(1.000…)
Statistics of Shared Components in Complex Component Systems
NASA Astrophysics Data System (ADS)
Mazzolini, Andrea; Gherardi, Marco; Caselle, Michele; Cosentino Lagomarsino, Marco; Osella, Matteo
2018-04-01
Many complex systems are modular. Such systems can be represented as "component systems," i.e., sets of elementary components, such as LEGO bricks in LEGO sets. The bricks found in a LEGO set reflect a target architecture, which can be built following a set-specific list of instructions. In other component systems, instead, the underlying functional design and constraints are not obvious a priori, and their detection is often a challenge of both scientific and practical importance, requiring a clear understanding of component statistics. Importantly, some quantitative invariants appear to be common to many component systems, most notably a common broad distribution of component abundances, which often resembles the well-known Zipf's law. Such "laws" affect in a general and nontrivial way the component statistics, potentially hindering the identification of system-specific functional constraints or generative processes. Here, we specifically focus on the statistics of shared components, i.e., the distribution of the number of components shared by different system realizations, such as the common bricks found in different LEGO sets. To account for the effects of component heterogeneity, we consider a simple null model, which builds system realizations by random draws from a universe of possible components. Under general assumptions on abundance heterogeneity, we provide analytical estimates of component occurrence, which quantify exhaustively the statistics of shared components. Surprisingly, this simple null model can positively explain important features of empirical component-occurrence distributions obtained from large-scale data on bacterial genomes, LEGO sets, and book chapters. Specific architectural features and functional constraints can be detected from occurrence patterns as deviations from these null predictions, as we show for the illustrative case of the "core" genome in bacteria.
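The null model described above - realizations built by random draws from a heterogeneous universe of components - can be sketched in a few lines. All sizes and the Zipf-like weighting below are illustrative assumptions, not the paper's calibrated parameters:

```python
import random

def shared_component_occurrences(universe_size, set_sizes, zipf_exp, rng):
    """Null model: each realization draws its components at random (distinct
    components per realization) from a universe with Zipf-like abundance
    weights; returns, per component, the number of realizations containing it."""
    weights = [(rank + 1) ** -zipf_exp for rank in range(universe_size)]
    occurrences = [0] * universe_size
    for size in set_sizes:
        drawn = set()
        while len(drawn) < size:  # rejection sampling until `size` distinct
            drawn.add(rng.choices(range(universe_size), weights=weights)[0])
        for component in drawn:
            occurrences[component] += 1
    return occurrences

rng = random.Random(0)
occ = shared_component_occurrences(500, [30] * 100, zipf_exp=1.0, rng=rng)
```

The resulting occurrence distribution is the null prediction against which empirical sharing patterns (common LEGO bricks, core genes) would be compared; deviations then point to genuine functional constraints.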
Mokam, Didi Gaëlle; Djiéto-Lordon, Champlain; Bilong Bilong, Charles-Félix
2014-01-01
Patterns of species diversity and community structure of insects associated with fruits of domesticated cucurbits were investigated from January 2009 to 2011 in three localities from two agroecological zones in the southern part of Cameroon. Rarefaction curves combined with nonparametric estimators of species richness were used to extrapolate species richness beyond our own data. Sampling efforts of over 92% were reached in each of the three study localities. Data collected revealed a total of 66 insect morphospecies belonging to 37 families and five orders, identified from a set of 57,510 insects. The orders Diptera (especially Tephritidae and Lonchaeidae) and Hymenoptera (mainly Braconidae and Eulophidae) were the most important, in terms of both abundance and species richness on the one hand, and effects on agronomic performance on the other. The calculated values of both the species diversity indices (Shannon and Simpson) and the species richness indices (Margalef and Berger-Parker) showed that the insect communities were species-rich but dominated, all to a similar extent, by five main species (including four fruit fly species and one parasitoid). Species abundance distributions in these communities ranged from the Zipf-Mandelbrot to Mandelbrot models. The communities are structured as tritrophic networks, including cucurbit fruits, fruit-feeding species (fruit flies) and carnivorous species (parasitoids). Within the guild of the parasitoids, about 30% of species, despite their low abundance, may potentially be of use in biological control of important pests. Our field data contribute in important ways to basic knowledge of biodiversity patterns in agrosystems and constitute baseline data for the planned implementation of biological control in Integrated Pest Management. © The Author 2014. Published by Oxford University Press on behalf of the Entomological Society of America.
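The indices named in this abstract have standard textbook formulas, sketched below (the paper may use variants, e.g. a different log base or bias corrections):

```python
import math

def diversity_indices(abundances):
    """Common diversity, richness and dominance indices from a list of
    per-species abundance counts."""
    n = sum(abundances)
    s = len(abundances)
    p = [a / n for a in abundances]
    return {
        "shannon": -sum(pi * math.log(pi) for pi in p if pi > 0),
        "simpson": 1.0 - sum(pi * pi for pi in p),  # Gini-Simpson form
        "margalef": (s - 1) / math.log(n),
        "berger_parker": max(p),  # relative abundance of the top species
    }

idx = diversity_indices([10, 10, 10, 10])  # four equally abundant species
```

For a perfectly even community like the example, Shannon equals ln(S) and Berger-Parker equals 1/S; dominance by a few species, as reported above, pushes Berger-Parker up and Shannon down.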
Taguchi, Katsuyuki; Stierstorfer, Karl; Polster, Christoph; Lee, Okkyun; Kappler, Steffen
2018-05-01
The interpixel cross-talk of energy-sensitive photon counting x-ray detectors (PCDs) has been studied, and an analytical model (version 2.1) has been developed for double-counting between neighboring pixels due to charge sharing and K-shell fluorescence x-ray emission followed by its reabsorption (Taguchi K, et al., Medical Physics 2016;43(12):6386-6404). While the model version 2.1 simulated the spectral degradation well, it had the following problems that have recently been found to be significant: (1) the spectrum is inaccurate with smaller pixel sizes; (2) the charge cloud size must be smaller than the pixel size; (3) the model underestimates the spectrum/counts for 10-40 keV; and (4) the model version 2.1 cannot handle n-tuple-counting with n > 2 (i.e., triple-counting or higher). These problems are inherent to the design of the model version 2.1; therefore, we developed a new model and addressed these problems in this study. We propose a new PCD cross-talk model (version 3.2; PcTK for "photon counting toolkit") that is based on a completely different design concept from the previous version. It uses a numerical approach and starts with a 2-D model of charge sharing (as opposed to an analytical approach and a 1-D model in version 2.1) and addresses all four problems. The model takes the following factors into account: (1) shift-variant electron density of the charge cloud (Gaussian-distributed), (2) detection efficiency, (3) interactions between photons and PCDs via the photoelectric effect, and (4) electronic noise. Correlated noisy PCD data can be generated using either a multivariate normal random number generator or a Poisson random number generator. The effect of the two parameters, the effective charge cloud diameter (d0) and the pixel size (dpix), was studied, and results were compared with Monte Carlo simulations and the previous model version 2.1.
Finally, a script for the workflow for CT image quality assessment has been developed, which starts with a few material density images, generates material-specific sinogram (line integral) data and noisy PCD data with spectral distortion using the model version 3.2, and reconstructs PCD-CT images for four energy windows. The model version 3.2 addressed all four problems listed above. The spectra with dpix = 56-113 μm agreed qualitatively with those of the Medipix3 detector with dpix = 55-110 μm without charge summing mode. The counts for 10-40 keV were larger than with the previous model (version 2.1) and agreed very well with MC simulations (root-mean-square difference values with model version 3.2 decreased to 16%-67% of the values with version 2.1). There were many non-zero off-diagonal elements for n-tuple-counting with n > 2 in the normalized covariance matrix of 3 × 3 neighboring pixels. Reconstructed images showed biases and artifacts attributed to the spectral distortion due to charge sharing and fluorescence x rays. We have developed a new PCD model for spatio-energetic cross-talk and correlation between PCD pixels. The workflow demonstrated the utility of the model for general or task-specific image quality assessments for PCD-CT. Note: The program (PcTK) and the workflow scripts have been made available to academic researchers. Interested readers should visit the website (pctk.jhu.edu) or contact the corresponding author. © 2018 American Association of Physicists in Medicine.
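The basic geometry of charge sharing can be illustrated in one dimension: the fraction of a Gaussian charge cloud centered a distance x0 inside a pixel that spills across the boundary into the neighbor. This is a toy reduction, not PcTK's 2-D shift-variant model with detection efficiency and noise; names are ours.

```python
import math

def shared_fraction(x0, sigma):
    """Fraction of a Gaussian charge cloud (standard deviation sigma),
    centered a distance x0 inside a pixel, collected by the neighboring
    pixel beyond the boundary: the Gaussian tail integral 0.5*erfc(...)."""
    return 0.5 * math.erfc(x0 / (sigma * math.sqrt(2.0)))

half = shared_fraction(0.0, sigma=10.0)   # cloud centered on the boundary
deep = shared_fraction(50.0, sigma=10.0)  # cloud deep inside the pixel
```

A cloud centered exactly on the boundary splits 50/50, while one several sigma inside the pixel shares essentially nothing; shrinking the pixel relative to the cloud diameter therefore increases double-counting, consistent with the small-pixel problems discussed above.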
Large-Scale, Parallel, Multi-Sensor Data Fusion in the Cloud
NASA Astrophysics Data System (ADS)
Wilson, B. D.; Manipon, G.; Hua, H.
2012-12-01
NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over periods of years to decades. However, moving from predominantly single-instrument studies to a multi-sensor, measurement-based model for long-duration analysis of important climate variables presents serious challenges for large-scale data mining and data fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another instrument (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over years of AIRS data. To perform such an analysis, one must discover & access multiple datasets from remote sites, find the space/time "matchups" between instrument swaths and model grids, understand the quality flags and uncertainties for retrieved physical variables, assemble merged datasets, and compute fused products for further scientific and statistical analysis. To efficiently assemble such decade-scale datasets in a timely manner, we are utilizing Elastic Computing in the Cloud and parallel map/reduce-based algorithms. "SciReduce" is a Hadoop-like parallel analysis system, programmed in parallel python, that is designed from the ground up for Earth science. SciReduce executes inside VMWare images and scales to any number of nodes in the Cloud. Unlike Hadoop, in which simple tuples (keys & values) are passed between the map and reduce functions, SciReduce operates on bundles of named numeric arrays, which can be passed in memory or serialized to disk in netCDF4 or HDF5. Thus, SciReduce uses the native datatypes (geolocated grids, swaths, and points) that geo-scientists are familiar with.
We are deploying within SciReduce a versatile set of python operators for data lookup, access, subsetting, co-registration, mining, fusion, and statistical analysis. All operators take in sets of geo-located arrays and generate more arrays. Large, multi-year satellite and model datasets are automatically "sharded" by time and space across a cluster of nodes so that years of data (millions of granules) can be compared or fused in a massively parallel way. Input variables (arrays) are pulled on-demand into the Cloud using OPeNDAP or webification URLs, thereby minimizing the size of the stored input and intermediate datasets. A typical map function might assemble and quality control AIRS Level-2 water vapor profiles for a year of data in parallel, then a reduce function would average the profiles in lat/lon bins (again, in parallel), and a final reduce would aggregate the climatology and write it to output files. We are using SciReduce to automate the production of multiple versions of a multi-year water vapor climatology (AIRS & MODIS), stratified by Cloudsat cloud classification, and compare it to models (ECMWF & MERRA reanalysis). We will present the architecture of SciReduce, describe the achieved "clock time" speedups in fusing huge datasets on our own nodes and in the Amazon Cloud, and discuss the Cloud cost tradeoffs for storage, compute, and data transfer.
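The map/reduce pattern over "bundles of named numeric arrays" described above can be sketched in miniature: a map step that bins records and a reduce step that merges per-bin bundles into a mean. This is a toy echo of the design, not SciReduce itself; the field names and 1-degree binning are our assumptions.

```python
from collections import defaultdict

def map_profiles(granule):
    """Map step: emit (bin_key, bundle) pairs for one granule of records
    (a stand-in for SciReduce's named-array bundles; field names are ours)."""
    for rec in granule:
        key = (round(rec["lat"]), round(rec["lon"]))  # 1-degree lat/lon bin
        yield key, {"sum": rec["water_vapor"], "count": 1}

def reduce_bins(pairs):
    """Reduce step: merge bundles per bin, then finish with the bin mean."""
    acc = defaultdict(lambda: {"sum": 0.0, "count": 0})
    for key, bundle in pairs:
        acc[key]["sum"] += bundle["sum"]
        acc[key]["count"] += bundle["count"]
    return {key: b["sum"] / b["count"] for key, b in acc.items()}

granule = [
    {"lat": 10.2, "lon": 20.1, "water_vapor": 1.0},
    {"lat": 10.4, "lon": 20.3, "water_vapor": 3.0},
    {"lat": -5.0, "lon": 40.0, "water_vapor": 7.0},
]
climatology = reduce_bins(map_profiles(granule))
```

Because the reduce step only merges additive bundles, it can be applied hierarchically - per node, then across nodes - which is what makes the climatology computation embarrassingly parallel across sharded years of granules.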
Large-Scale, Parallel, Multi-Sensor Data Fusion in the Cloud
NASA Astrophysics Data System (ADS)
Wilson, B.; Manipon, G.; Hua, H.
2012-04-01
NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over periods of years to decades. However, moving from predominantly single-instrument studies to a multi-sensor, measurement-based model for long-duration analysis of important climate variables presents serious challenges for large-scale data mining and data fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another instrument (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over years of AIRS data. To perform such an analysis, one must discover & access multiple datasets from remote sites, find the space/time "matchups" between instrument swaths and model grids, understand the quality flags and uncertainties for retrieved physical variables, assemble merged datasets, and compute fused products for further scientific and statistical analysis. To efficiently assemble such decade-scale datasets in a timely manner, we are utilizing Elastic Computing in the Cloud and parallel map/reduce-based algorithms. "SciReduce" is a Hadoop-like parallel analysis system, programmed in parallel Python, that is designed from the ground up for Earth science. SciReduce executes inside VMware images and scales to any number of nodes in the Cloud. Unlike Hadoop, in which simple tuples (keys & values) are passed between the map and reduce functions, SciReduce operates on bundles of named numeric arrays, which can be passed in memory or serialized to disk in netCDF4 or HDF5. Thus, SciReduce uses the native datatypes (geolocated grids, swaths, and points) that geo-scientists are familiar with.
We are deploying within SciReduce a versatile set of Python operators for data lookup, access, subsetting, co-registration, mining, fusion, and statistical analysis. All operators take in sets of geo-located arrays and generate more arrays. Large, multi-year satellite and model datasets are automatically "sharded" by time and space across a cluster of nodes so that years of data (millions of granules) can be compared or fused in a massively parallel way. Input variables (arrays) are pulled on-demand into the Cloud using OPeNDAP or webification URLs, thereby minimizing the size of the stored input and intermediate datasets. A typical map function might assemble and quality control AIRS Level-2 water vapor profiles for a year of data in parallel, then a reduce function would average the profiles in lat/lon bins (again, in parallel), and a final reduce would aggregate the climatology and write it to output files. We are using SciReduce to automate the production of multiple versions of a multi-year water vapor climatology (AIRS & MODIS), stratified by CloudSat cloud classification, and compare it to models (ECMWF & MERRA reanalysis). We will present the architecture of SciReduce, describe the achieved "clock time" speedups in fusing huge datasets on our own nodes and in the Amazon Cloud, and discuss the Cloud cost tradeoffs for storage, compute, and data transfer.
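The map/reduce pipeline described above (a quality-control map step, a bin-averaging reduce step, and a final aggregation) can be sketched in plain Python. This is a minimal illustration only; the granule layout, field names, and function names are hypothetical and do not reflect the actual SciReduce API:

```python
from collections import defaultdict

def map_quality_control(granule, min_quality=0.5):
    # Map step: keep only water vapor values whose quality flag passes the
    # threshold. Each 'granule' is a dict with parallel lists of values/flags.
    return [(granule["lat_bin"], granule["lon_bin"], v)
            for v, q in zip(granule["water_vapor"], granule["quality"])
            if q >= min_quality]

def reduce_bin_average(mapped_records):
    # Reduce step: average the surviving values falling in each lat/lon bin.
    sums = defaultdict(lambda: [0.0, 0])
    for lat_bin, lon_bin, value in mapped_records:
        acc = sums[(lat_bin, lon_bin)]
        acc[0] += value
        acc[1] += 1
    return {bin_key: total / count
            for bin_key, (total, count) in sums.items()}

granules = [
    {"lat_bin": 0, "lon_bin": 0, "water_vapor": [1.0, 3.0, 9.0],
     "quality": [0.9, 0.8, 0.1]},
    {"lat_bin": 0, "lon_bin": 0, "water_vapor": [2.0], "quality": [1.0]},
]
mapped = [rec for g in granules for rec in map_quality_control(g)]
climatology = reduce_bin_average(mapped)
```

In a real deployment the map calls would run in parallel over millions of granules and the reduce would merge partial sums across nodes; the data flow, however, is the same.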
Methods for semi-automated indexing for high precision information retrieval
NASA Technical Reports Server (NTRS)
Berrios, Daniel C.; Cucina, Russell J.; Fagan, Lawrence M.
2002-01-01
OBJECTIVE: To evaluate a new system, ISAID (Internet-based Semi-automated Indexing of Documents), and to generate textbook indexes that are more detailed and more useful to readers. DESIGN: Pilot evaluation: simple, nonrandomized trial comparing ISAID with manual indexing methods. Methods evaluation: randomized, cross-over trial comparing three versions of ISAID and usability survey. PARTICIPANTS: Pilot evaluation: two physicians. Methods evaluation: twelve physicians, each of whom used three different versions of the system for a total of 36 indexing sessions. MEASUREMENTS: Total index term tuples generated per document per minute (TPM), with and without adjustment for concordance with other subjects; inter-indexer consistency; ratings of the usability of the ISAID indexing system. RESULTS: Compared with manual methods, ISAID decreased indexing times greatly. Using three versions of ISAID, inter-indexer consistency ranged from 15% to 65%, with means of 41%, 31%, and 40% for each of three documents. Subjects using the full version of ISAID were faster (average TPM: 5.6) and had higher rates of concordant index generation. There were substantial learning effects, despite our use of a training/run-in phase. Subjects using the full version of ISAID were much faster by the third indexing session (average TPM: 9.1). There was a statistically significant increase in three-subject concordant indexing rate using the full version of ISAID during the second indexing session (p < 0.05). SUMMARY: Users of the ISAID indexing system create complex, precise, and accurate indexing for full-text documents much faster than users of manual methods. Furthermore, the natural language processing methods that ISAID uses to suggest indexes contribute substantially to increased indexing speed and accuracy.
Methods for Semi-automated Indexing for High Precision Information Retrieval
Berrios, Daniel C.; Cucina, Russell J.; Fagan, Lawrence M.
2002-01-01
Objective. To evaluate a new system, ISAID (Internet-based Semi-automated Indexing of Documents), and to generate textbook indexes that are more detailed and more useful to readers. Design. Pilot evaluation: simple, nonrandomized trial comparing ISAID with manual indexing methods. Methods evaluation: randomized, cross-over trial comparing three versions of ISAID and usability survey. Participants. Pilot evaluation: two physicians. Methods evaluation: twelve physicians, each of whom used three different versions of the system for a total of 36 indexing sessions. Measurements. Total index term tuples generated per document per minute (TPM), with and without adjustment for concordance with other subjects; inter-indexer consistency; ratings of the usability of the ISAID indexing system. Results. Compared with manual methods, ISAID decreased indexing times greatly. Using three versions of ISAID, inter-indexer consistency ranged from 15% to 65%, with means of 41%, 31%, and 40% for each of three documents. Subjects using the full version of ISAID were faster (average TPM: 5.6) and had higher rates of concordant index generation. There were substantial learning effects, despite our use of a training/run-in phase. Subjects using the full version of ISAID were much faster by the third indexing session (average TPM: 9.1). There was a statistically significant increase in three-subject concordant indexing rate using the full version of ISAID during the second indexing session (p < 0.05). Summary. Users of the ISAID indexing system create complex, precise, and accurate indexing for full-text documents much faster than users of manual methods. Furthermore, the natural language processing methods that ISAID uses to suggest indexes contribute substantially to increased indexing speed and accuracy. PMID:12386114
Rule-based support system for multiple UMLS semantic type assignments
Geller, James; He, Zhe; Perl, Yehoshua; Morrey, C. Paul; Xu, Julia
2012-01-01
Background When new concepts are inserted into the UMLS, they are assigned one or several semantic types from the UMLS Semantic Network by the UMLS editors. However, not every combination of semantic types is permissible. It was observed that many concepts with rare combinations of semantic types have erroneous semantic type assignments or prohibited combinations of semantic types. The correction of such errors is resource-intensive. Objective We design a computational system to inform UMLS editors as to whether a specific combination of two, three, four, or five semantic types is permissible, prohibited or questionable. Methods We identify a set of inclusion and exclusion instructions in the UMLS Semantic Network documentation and derive corresponding rule-categories, as well as rule-categories from the UMLS concept content. We then design an algorithm, adviseEditor, based on these rule-categories. The algorithm specifies rules that tell an editor how to proceed when considering a tuple (pair, triple, quadruple, quintuple) of semantic types to be assigned to a concept. Results Eight rule-categories were identified. A Web-based system was developed to implement the adviseEditor algorithm, which returns, for an input combination of semantic types, whether it is permitted, prohibited or (in a few cases) requires more research. The numbers of semantic type pairs assigned to each rule-category are reported. Interesting examples for each rule-category are illustrated. Cases of semantic type assignments that contradict rules are listed, including recently introduced ones. Conclusion The adviseEditor system implements explicit and implicit knowledge available in the UMLS in a system that informs UMLS editors about the permissibility of a desired combination of semantic types. Using adviseEditor might help accelerate the work of the UMLS editors and prevent erroneous semantic type assignments. PMID:23041716
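The core of a rule-category lookup like adviseEditor can be sketched as a table keyed by unordered combinations of semantic types. The rule table, the type names, and the default verdict below are invented for illustration; they are not the actual UMLS rule set:

```python
# Hypothetical rule table: maps an unordered combination of semantic
# types (a frozenset) to a verdict. Real rule-categories would be
# derived from the UMLS Semantic Network documentation and content.
RULES = {
    frozenset({"Disease or Syndrome", "Neoplastic Process"}): "prohibited",
    frozenset({"Pharmacologic Substance", "Organic Chemical"}): "permitted",
}

def advise_editor(semantic_types):
    # Return the verdict for a combination of two to five semantic types,
    # defaulting to 'questionable' when no rule covers the tuple.
    if not 2 <= len(semantic_types) <= 5:
        raise ValueError("combination must contain two to five semantic types")
    return RULES.get(frozenset(semantic_types), "questionable")

verdict = advise_editor(("Pharmacologic Substance", "Organic Chemical"))
```

Using a frozenset as the key makes the lookup order-independent, matching the fact that a semantic type combination is a set, not a sequence.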
Magnetic MIMO Signal Processing and Optimization for Wireless Power Transfer
NASA Astrophysics Data System (ADS)
Yang, Gang; Moghadam, Mohammad R. Vedady; Zhang, Rui
2017-06-01
In magnetic resonant coupling (MRC) enabled multiple-input multiple-output (MIMO) wireless power transfer (WPT) systems, multiple transmitters (TXs) each with one single coil are used to enhance the efficiency of simultaneous power transfer to multiple single-coil receivers (RXs) by constructively combining their induced magnetic fields at the RXs, a technique termed "magnetic beamforming". In this paper, we study the optimal magnetic beamforming design in a multi-user MIMO MRC-WPT system. We introduce the multi-user power region that constitutes all the achievable power tuples for all RXs, subject to the given total power constraint over all TXs as well as their individual peak voltage and current constraints. We characterize each boundary point of the power region by maximizing the sum-power deliverable to all RXs subject to their minimum harvested power constraints. For the special case without the TX peak voltage and current constraints, we derive the optimal TX current allocation for the single-RX setup in closed-form as well as that for the multi-RX setup. In general, the problem is a non-convex quadratically constrained quadratic programming (QCQP), which is difficult to solve. For the case of one single RX, we show that the semidefinite relaxation (SDR) of the problem is tight. For the general case with multiple RXs, based on SDR we obtain two approximate solutions by applying time-sharing and randomization, respectively. Moreover, for practical implementation of magnetic beamforming, we propose a novel signal processing method to estimate the magnetic MIMO channel due to the mutual inductances between TXs and RXs. Numerical results show that our proposed magnetic channel estimation and adaptive beamforming schemes are practically effective, and can significantly improve the power transfer efficiency and multi-user performance trade-off in MIMO MRC-WPT systems.
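For the single-RX case without peak voltage and current constraints, the delivered power scales with the square of the projection of the TX current vector onto the mutual-inductance vector, so under a sum-power constraint the maximizing allocation is proportional to that vector (a matched-filter-style result). A numerical sketch under the simplifying assumption of equal TX coil resistances (an assumption of this example, not a claim about the paper's full model):

```python
import numpy as np

def optimal_tx_currents(mutual_inductances, total_power, tx_resistance=1.0):
    # Delivered power at the single RX scales with (m^T i)^2, so under the
    # sum-power constraint sum(R * i_k^2) <= P, the maximizing current
    # vector is proportional to the mutual-inductance vector m.
    m = np.asarray(mutual_inductances, dtype=float)
    direction = m / np.linalg.norm(m)          # beamforming direction
    scale = np.sqrt(total_power / tx_resistance)  # spend the full power budget
    return scale * direction

# Three TX coils; the second and third couple twice as strongly to the RX.
i_opt = optimal_tx_currents([1.0, 2.0, 2.0], total_power=9.0)
```

Note how the stronger-coupled coils receive proportionally larger currents, which is the essence of "magnetic beamforming"; the multi-RX case requires the SDR-based methods described in the abstract.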
Okano, Hiroyuki; Baba, Misato; Kawato, Katsuhiro; Hidese, Ryota; Yanagihara, Itaru; Kojima, Kenji; Takita, Teisuke; Fujiwara, Shinsuke; Yasukawa, Kiyoshi
2018-03-01
One-step RT-PCR has not been widely used, even though some thermostable DNA polymerases with reverse transcriptase (RT) activity have been developed from bacterial and archaeal polymerases, owing to their low cDNA synthesis activity from RNA. In the present study, we developed highly sensitive one-step RT-PCR using the single variant of a family A DNA polymerase with RT activity, K4pol L329A (L329A), from the hyperthermophilic bacterium Thermotoga petrophila K4, or the 16-tuple variant of a family B DNA polymerase with RT activity, RTX, from the hyperthermophilic archaeon Thermococcus kodakarensis. Optimization of the reaction conditions revealed that the cDNA synthesis and PCR activities of K4pol L329A and RTX were strongly affected by the concentrations of MgCl2 and Mn(OCOCH3)2 as well as by those of K4pol L329A or RTX themselves. Under the optimized conditions, 300 copies/μl of target RNA in 10 μl reaction volumes were successfully detected by one-step RT-PCR with K4pol L329A or RTX, a sensitivity almost equal to that of the current RT-PCR method using retroviral RT and a thermostable DNA polymerase. Considering that K4pol L329A and RTX are stable even at 90-100°C, our results suggest that one-step RT-PCR with K4pol L329A or RTX is more advantageous than the current method. Copyright © 2017 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Siegel, Edward
2015-06-01
NON-shock plasticity/fracture BAE [E.S.: MSE 8, 310 (71); PSS (a) 5, 601/607 (71); Cryst. Latt. Defects 5, 277 (74); Scripta Met. 6, 785 (72); 8, 587/617 (74); 3rd Tokyo AE Symp. (76); Acta Met. 5, 383 (77); JMMM 7, 312 (78)] "1/ω-noise" power-spectrum "pink"-Zipf (NOT "red" = Pareto) power-law UNIVERSALITY is manifestly demonstrated in two distinct ways to be nothing but a rediscovery of Newton's third law of motion, F = ma (aka "Bak" (1988) "SOC"; 1687 <<< 1988: 1988 - 1687 = 301 years!). PHYSICS: F = ma cross-multiplied as 1/m = a/F = OUTPUT/INPUT = EFFECT/CAUSE = inverse-mass mechanical susceptibility χ(ω); by the fluctuation-dissipation theorem, χ(ω) ~ P(ω), the "noise" power spectrum. ("Max & Al show"): E ~ ω and E ~ (upper-limiting speeds of media) ~ m. Thus ω ~ E ~ m; inverting, 1/ω ~ 1/E ~ 1/m ~ a/F = χ(ω) ~ P(ω). Thus the integral transform of F = ma is "SOC's" P(ω) ~ 1/ω! "PURE" MATHS: the integral transform of the F = ma double-integral time series s(t) = v₀t + (1/2)at² formally defines the power spectrum: P(ω) ≡ ∫ s(t) e^(-iωt) dt = ∫ [v₀t + (1/2)at²] e^(-iωt) dt = v₀ ∫ t e^(-iωt) dt + (1/2)a ∫ t² e^(-iωt) dt = v₀ (∂/∂ω)δ(ω) + (1/2)a (∂²/∂ω²)δ(ω) = v₀/ω⁰ + (1/2)a/ω^1.000...; for uniform velocity (a = 0) the power spectrum is P(ω) ~ 1/ω⁰ (white), versus for uniform acceleration (a ≠ 0) P(ω) ~ 1/ω^1.000... (pink/flicker/HYPERBOLICITY).
A Fault Oblivious Extreme-Scale Execution Environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
McKie, Jim
The FOX project, funded under the ASCR X-stack I program, developed systems software and runtime libraries for a new approach to the data and work distribution for massively parallel, fault oblivious application execution. Our work was motivated by the premise that exascale computing systems will provide a thousand-fold increase in parallelism and a proportional increase in failure rate relative to today’s machines. To deliver the capability of exascale hardware, the systems software must provide the infrastructure to support existing applications while simultaneously enabling efficient execution of new programming models that naturally express dynamic, adaptive, irregular computation; coupled simulations; and massive data analysis in a highly unreliable hardware environment with billions of threads of execution. Our OS research has prototyped new methods to provide efficient resource sharing, synchronization, and protection in a many-core compute node. We have experimented with alternative task/dataflow programming models and shown scalability in some cases to hundreds of thousands of cores. Much of our software is in active development through open source projects. Concepts from FOX are being pursued in next generation exascale operating systems. Our OS work focused on adaptive, application tailored OS services optimized for multi → many core processors. We developed a new operating system NIX that supports role-based allocation of cores to processes which was released to open source. We contributed to the IBM FusedOS project, which promoted the concept of latency-optimized and throughput-optimized cores. We built a task queue library based on distributed, fault tolerant key-value store and identified scaling issues. A second fault tolerant task parallel library was developed, based on the Linda tuple space model, that used low level interconnect primitives for optimized communication.
We designed fault tolerance mechanisms for task parallel computations employing work stealing for load balancing that scaled to the largest existing supercomputers. Finally, we implemented the Elastic Building Blocks runtime, a library to manage object-oriented distributed software components. To support the research, we won two INCITE awards for time on Intrepid (BG/P) and Mira (BG/Q). Much of our work has had impact in the OS and runtime community through the ASCR Exascale OS/R workshop and report, leading to the research agenda of the Exascale OS/R program. Our project was, however, also affected by attrition of multiple PIs. While the PIs continued to participate and offer guidance as time permitted, losing these key individuals was unfortunate both for the project and for the DOE HPC community.
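The Linda tuple space model mentioned above coordinates tasks by letting workers deposit and withdraw tuples matched against templates. A toy in-memory sketch of the idea (purely illustrative; the project's actual library used low-level interconnect primitives and fault-tolerant storage):

```python
import threading

class TupleSpace:
    # Minimal Linda-style tuple space: 'out' deposits a tuple, 'inp' removes
    # the first tuple matching a template, where None fields are wildcards.
    def __init__(self):
        self._tuples = []
        self._lock = threading.Lock()

    def out(self, tup):
        with self._lock:
            self._tuples.append(tup)

    def _matches(self, template, tup):
        return len(template) == len(tup) and all(
            t is None or t == v for t, v in zip(template, tup))

    def inp(self, template):
        # Non-blocking take: return a matching tuple, or None if absent.
        with self._lock:
            for i, tup in enumerate(self._tuples):
                if self._matches(template, tup):
                    return self._tuples.pop(i)
        return None

space = TupleSpace()
space.out(("task", 1, "pending"))
space.out(("task", 2, "pending"))
taken = space.inp(("task", None, "pending"))  # grab any pending task
```

Because tasks live in the shared space rather than in any worker's private queue, a crashed worker's unfinished tuples can simply be taken by another worker, which is what makes the model attractive for fault-tolerant task parallelism.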
Fourier transform infrared spectroscopy for analysis of kidney stones.
Khan, Aysha Habib; Imran, Sheharbano; Talati, Jamsheer; Jafri, Lena
2018-01-01
To compare the results of a chemical method of kidney stone analysis with the results of Fourier transform infrared (FT-IR) spectroscopy. Kidney stones collected between June and October 2015 were simultaneously analyzed by chemical and FT-IR methods. Kidney stones (n=449) were collected from patients aged 1 to 81 years. Most stones were from adults, with only 11.5% from children (aged 3-16 years) and 1.5% from children aged <2 years. The male to female ratio was 4.6. In adults, the calcium oxalate stone type calcium oxalate monohydrate (COM, n=224) was the most common crystal, followed by uric acid and calcium oxalate dihydrate (COD, n=83). In children, the most frequently occurring type was predominantly COD (n=21), followed by COM (n=11), ammonium urate (n=10), carbonate apatite (n=6), uric acid (n=4), and cystine (n=1). FT-IR analysis of the core composition of 22 stones detected ammonium urate (n=2), COM (n=2), and carbonate apatite (n=1) in five stones, and uric acid crystals in 13; chemical analysis, in contrast, identified only 3 of these stones as uric acid and the rest as calcium oxalate. Agreement between the two methods was moderate, with a kappa statistic of 0.57 (95% confidence interval, 0.5-0.64). Disagreement was noted in the analysis of 77 stones. FT-IR analysis of kidney stones can overcome many limitations associated with chemical analysis.
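The kappa statistic quoted above measures agreement between the two methods corrected for agreement expected by chance. A minimal implementation of Cohen's kappa (the stone-type labels in the example are made up for illustration, not taken from the study's data):

```python
def cohens_kappa(labels_a, labels_b):
    # Observed agreement minus chance agreement, normalized by the
    # maximum possible improvement over chance.
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    p_chance = sum((labels_a.count(c) / n) * (labels_b.count(c) / n)
                   for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical stone classifications by two methods for four stones.
method_1 = ["COM", "COM", "uric acid", "COD"]
method_2 = ["COM", "uric acid", "uric acid", "COD"]
kappa = cohens_kappa(method_1, method_2)
```

Values around 0.4-0.6, like the 0.57 reported here, are conventionally read as moderate agreement.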
Mogusu, Emmanuel O; Wolbert, J Benjamin; Kujawinski, Dorothea M; Jochmann, Maik A; Elsner, Martin
2015-07-01
To assess sources and degradation of the herbicide glyphosate [N-(phosphonomethyl) glycine] and its metabolite AMPA (aminomethylphosphonic acid), concentration measurements are often inconclusive and even (13)C/(12)C analysis alone may give limited information. To advance isotope ratio analysis of an additional element, we present compound-specific (15)N/(14)N analysis of glyphosate and AMPA by a two-step derivatization in combination with gas chromatography/isotope ratio mass spectrometry (GC/IRMS). The N-H group was derivatized with isopropyl chloroformate (iso-PCF), and remaining acidic groups were subsequently methylated with trimethylsilyldiazomethane (TMSD). Iso-PCF treatment at pH <10 gave too low (15)N/(14)N ratios, indicating an incomplete derivatization; in contrast, too high (15)N/(14)N ratios at pH >10 indicated decomposition of the derivative. At pH 10, and with a 10- to 24-fold excess of iso-PCF, greatest yields and accurate (15)N/(14)N ratios were obtained (deviation from elemental analyzer-IRMS: -0.2 ± 0.9% for glyphosate; -0.4 ± 0.7% for AMPA). Limits for accurate δ(15)N analysis of glyphosate and AMPA were 150 and 250 ng injected, respectively. A combination of δ(15)N and δ(13)C analysis by liquid chromatography/isotope ratio mass spectrometry (LC/IRMS) (1) enabled an improved distinction of commercial glyphosate products and (2) showed that glyphosate isotope values during degradation by MnO2 clearly fell outside the commercial product range. This highlights the potential of combined carbon and nitrogen isotope analysis to trace sources and degradation of glyphosate.
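The δ(15)N and δ(13)C values discussed above are conventional per-mil deviations of a sample's isotope ratio from that of a standard. A one-line sketch of the arithmetic (the example numbers are illustrative, not from the study):

```python
def delta_per_mil(r_sample, r_standard):
    # Delta value in per-mil: relative deviation of the sample isotope
    # ratio (e.g. 15N/14N) from the standard ratio, scaled by 1000.
    return (r_sample / r_standard - 1.0) * 1000.0

# A sample whose ratio is 1% higher than the standard has delta = +10 per mil.
delta = delta_per_mil(1.01, 1.0)
```

Because delta values are ratios of ratios, they are insensitive to instrument calibration offsets that affect sample and standard alike, which is what makes them comparable across laboratories.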
Hopkins, F B; Gravett, M R; Self, A J; Wang, M; Chua, Hoe-Chee; Hoe-Chee, C; Lee, H S Nancy; Sim, N Lee Hoi; Jones, J T A; Timperley, C M; Riches, J R
2014-08-01
Detailed chemical analysis of solutions used to decontaminate chemical warfare agents can be used to support verification and forensic attribution. Decontamination solutions are amongst the most difficult matrices for chemical analysis because of their corrosive and potentially emulsion-based nature. Consequently, there are relatively few publications that report their detailed chemical analysis. This paper describes the application of modern analytical techniques to the analysis of decontamination solutions following decontamination of the chemical warfare agent O-ethyl S-2-diisopropylaminoethyl methylphosphonothiolate (VX). We confirm the formation of N,N-diisopropylformamide and N,N-diisopropylamine following decontamination of VX with hypochlorite-based solution, whereas they were not detected in extracts of hydroxide-based decontamination solutions by nuclear magnetic resonance (NMR) spectroscopy or gas chromatography-mass spectrometry. We report the electron ionisation and chemical ionisation mass spectrometric data, retention indices, and NMR spectra of N,N-diisopropylformamide and N,N-diisopropylamine, as well as analytical methods suitable for their analysis and identification in solvent extracts and decontamination residues.
Forecasting Japan's Physician Shortage in 2035 as the First Full-Fledged Aged Society
Yamaguchi, Rui; Matsumura, Tomoko; Murashige, Naoko; Kodama, Yuko; Minayo, Satoru; Imai, Kohzoh; Kami, Masahiro
2012-01-01
Introduction Japan is rapidly becoming a full-fledged aged society, and physician shortage is a significant concern. The Japanese government has increased the number of medical school enrollments since 2008, but some researchers warn that this increase could lead to physician surplus in the future. It is unknown how many physicians will be required to accommodate future healthcare needs. Materials and Methods We simulated changes in age/sex composition of the population, fatalities (the number of deaths over five consecutive years), and number of physicians from 2010 to 2035. Two indicators were defined: fatalities per physician and fatalities per physician working hour, based on data on physicians' working hours for each sex and age group tuple. We estimated the necessary number of physicians in 2035 and the number of new physicians needed to maintain the indicator levels of 2010. Results The number of physicians per 1,000 population is predicted to rise from 2·00 in 2010 to 3·14 in 2035. The number of physicians aged 60 years or older is expected to increase from 55,375 (20% of physicians) to 141,711 (36%). In 2010 and 2035, fatalities per physician were 23·1 and 24·0 for the total population, and 13·9 and 19·2 for those 75 years or older, respectively. Fatalities per physician working hour are predicted to rise from 0·128 to 0·138. If working hours are limited to 48 hours per week in 2035, the number of fatalities per physician working hour is expected to be 0·196, and the number of new physicians must be increased by 53% over the current pace. Discussion The number of physicians per population continues to rise, but the estimated supply will not fulfill the demand for healthcare in the aging society. Strategies to increase the number of physicians and improve working conditions are urgently needed. PMID:23233868
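The "fatalities per physician working hour" indicator combines group-level workforce data: total working hours are summed over sex/age groups and divided into the fatality count. A sketch of that computation with a hypothetical workforce table (the numbers are invented, not the study's data):

```python
def fatalities_per_working_hour(fatalities, workforce):
    # 'workforce' maps (sex, age_group) tuples to
    # (physician_count, weekly_working_hours) pairs.
    total_hours = sum(count * hours for count, hours in workforce.values())
    return fatalities / total_hours

# Hypothetical two-group workforce: 100 physicians working 60 h/week
# and 50 physicians working 48 h/week.
workforce = {
    ("male", "30-39"): (100, 60),
    ("female", "30-39"): (50, 48),
}
indicator = fatalities_per_working_hour(1176, workforce)
```

Capping weekly hours (e.g. at 48, as in the scenario above) shrinks the denominator, which is why the indicator jumps to 0·196 unless the physician headcount grows to compensate.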
Bifunctional 3D porous Cu(I) metal-organic framework with gas sorption and luminescent properties
NASA Astrophysics Data System (ADS)
Xing, Guang'en; Zhang, Yan; Cao, Xiulian
2017-10-01
A new Cu(I) metal-organic framework, namely [Cu(L)]2n·n(H2O) (1; HL = 5-(4-pyridyl)-1H-tetrazole), has been successfully synthesized via the solvothermal reaction of CuI with the 5-(4-pyridyl)-1H-tetrazole ligand, and further characterized by elemental analysis, powder X-ray diffraction analysis, thermal analysis and single-crystal X-ray structural analysis. The L- ligand displays a μ4-N2, N3, N4, N5 coordination mode, bridging Cu(I) ions into a 3D porous framework whose open 1D channels are filled by lattice water molecules. Gas sorption investigations indicated that compound 1 can selectively adsorb CO2 over N2 at 298 K, and luminescence investigations revealed that compound 1 features a luminescent sensing function for nitrobenzene.
Power Analysis Tutorial for Experimental Design Software
2014-11-01
Institute for Defense Analyses, IDA Document D-5205, November 2014: Power Analysis Tutorial for Experimental Design Software. The Test and Evaluation (T&E) community is increasing its employment of Design of Experiments (DOE), a rigorous methodology for planning and evaluating
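A statistical power analysis of the kind such a tutorial covers can be approximated in a few lines. This sketch uses the standard normal approximation for a two-sided, two-sample comparison at α = 0.05; it is a generic illustration and is not taken from the IDA document:

```python
import math

def normal_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def two_sample_power(effect_size, n_per_group):
    # Approximate power of a two-sided, two-sample test at alpha = 0.05:
    # power ~ Phi(d * sqrt(n/2) - z_crit), neglecting the tiny contribution
    # from the lower rejection region.
    z_crit = 1.959963984540054  # z for alpha = 0.05, two-sided
    noncentrality = effect_size * math.sqrt(n_per_group / 2.0)
    return normal_cdf(noncentrality - z_crit)

# Detecting a medium effect (d = 0.5) with 64 subjects per group
# gives roughly 80% power, the conventional planning target.
power = two_sample_power(0.5, 64)
```

Dedicated experimental design software refines this with exact t distributions and handles factorial designs, but the approximation above is what the textbook sample-size rules of thumb come from.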
Analysis of waveguide architectures of InGaN/GaN diode lasers by nearfield optical microscopy
NASA Astrophysics Data System (ADS)
Friede, Sebastian; Tomm, Jens W.; Kühn, Sergei; Hoffmann, Veit; Wenzel, Hans
2017-02-01
Waveguide (WG) architectures of 420-nm emitting InAlGaN/GaN diode lasers are analyzed by photoluminescence (PL) and photocurrent (PC) spectroscopy using a nearfield scanning optical microscope (NSOM) for excitation and detection. The measurements with a spatial resolution of 100 nm are implemented by scanning the fiber tip along the unprepared front facets of standard devices. PL is collected by the fiber tip, whereas PCs are extracted from the contacts that are anyway present for power supply. The mechanisms of signal generation are addressed in detail. The components of the `optical active region', multiple quantum wells (MQW), WGs, and cladding layers are separately inspected. Even separate analysis of p- and n-sections of the WG become possible. Defect levels are detected in the p-part of the WG. Their presence is consistent with the doping by Mg. An increased efficiency of carrier capture into InGaN/GaN WGs compared to GaN WGs is observed. Thus, beyond the improved optical confinement, the electrical confinement is improved, as well. NSOM PL and PC at GaN based devices do not reach the clarity and spatial resolution for WG mode analysis as seen before for GaAs based devices. This is due to higher modal absorption and higher WG losses. NSOM based optical analysis turns out to be an efficient tool for analysis of single layers grown into InAlGaN/GaN diode laser structures, even if this analysis is done at a packaged ready-to-work device.
Liu, Song; Zhang, Yujuan; Chen, Ling; Guan, Wenxian; Guan, Yue; Ge, Yun; He, Jian; Zhou, Zhengyang
2017-10-02
Whole-lesion apparent diffusion coefficient (ADC) histogram analysis has been introduced and proved effective in the assessment of multiple tumors. However, the application of whole-volume ADC histogram analysis to gastrointestinal tumors has only just begun and has never been reported for T and N staging of gastric cancers. Eighty patients with pathologically confirmed gastric carcinomas underwent diffusion-weighted (DW) magnetic resonance imaging before surgery prospectively. Whole-lesion ADC histogram analysis was performed by two radiologists independently. The differences in ADC histogram parameters among different T and N stages were compared with the independent-samples Kruskal-Wallis test. Receiver operating characteristic (ROC) analysis was performed to evaluate the performance of ADC histogram parameters in differentiating particular T or N stages of gastric cancers. There were significant differences in all the ADC histogram parameters for gastric cancers at different T (except ADCmin and ADCmax) and N (except ADCmax) stages. Most ADC histogram parameters differed significantly between T1 vs T3, T1 vs T4, T2 vs T4, N0 vs N1, and N0 vs N3, and some parameters (ADC5%, ADC10%, ADCmin) differed significantly between N0 vs N2 and N2 vs N3 (all P < 0.05). Most parameters except ADCmax performed well in differentiating different T and N stages of gastric cancers. Especially for identifying patients with and without lymph node metastasis, the ADC10% yielded the largest area under the ROC curve of 0.794 (95% confidence interval, 0.677-0.911). All the parameters except ADCmax showed excellent inter-observer agreement, with intra-class correlation coefficients higher than 0.800. Whole-volume ADC histogram parameters hold great potential for differentiating different T and N stages of gastric cancers preoperatively.
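The whole-lesion histogram parameters used in such studies (ADCmin, ADCmax, low percentiles, mean) reduce to simple order statistics over all voxel values in the lesion volume. A sketch using NumPy with synthetic input values (the real inputs would be the ADC map voxels inside the segmented lesion):

```python
import numpy as np

def adc_histogram_parameters(adc_values):
    # Whole-lesion summary statistics computed over every voxel value.
    v = np.asarray(adc_values, dtype=float)
    return {
        "ADCmin": float(v.min()),
        "ADCmax": float(v.max()),
        "ADC5%": float(np.percentile(v, 5)),
        "ADC10%": float(np.percentile(v, 10)),
        "ADCmean": float(v.mean()),
    }

# Synthetic "lesion" of 100 voxels with values 1..100 for illustration.
params = adc_histogram_parameters(range(1, 101))
```

Low percentiles such as ADC5% and ADC10% summarize the most restricted-diffusion voxels while being robust to single-voxel noise, which is one reason they outperformed ADCmin and ADCmax in the staging comparisons above.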
Shoreside Boiler Demonstration of Fuel-Water Emulsions.
1982-08-01
[Fragment of a parts list: metal seat, P/N IE3980; valve plug, 416SST, P/N 1E3981; diaphragm, 302SST, P/N 1E3992; gaskets, P/N 1E3993 and 1P7880; …] …each long term test. The analyses performed were semi-quantitative spectrographic analysis for metallic elements, quantitative analysis for carbon… Samples from the analysis were to be compared with samples of the parent metal to determine the extent of any corrosion and/or erosion during the Long Term Tests, it was
National survey on dose data analysis in computed tomography.
Heilmaier, Christina; Treier, Reto; Merkle, Elmar Max; Alkhadi, Hatem; Weishaupt, Dominik; Schindera, Sebastian
2018-05-28
A nationwide survey was performed assessing current practice of dose data analysis in computed tomography (CT). All radiological departments in Switzerland were asked to participate in the on-line survey composed of 19 questions (16 multiple choice, 3 free text). It consisted of four sections: (1) general information on the department, (2) dose data analysis, (3) use of a dose management software (DMS) and (4) radiation protection activities. In total, 152 out of 241 Swiss radiological departments filled in the whole questionnaire (return rate, 63%). Seventy-nine per cent of the departments (n = 120/152) analyse dose data on a regular basis, with considerable heterogeneity in the frequency (1-2 times per year, 45%, n = 54/120; every month, 35%, n = 42/120) and method of analysis. Manual analysis is carried out by 58% (n = 70/120) compared with 42% (n = 50/120) of departments using a DMS. Purchase of a DMS is planned by 43% (n = 30/70) of the departments with manual analysis. Real-time analysis of dose data is performed by 42% (n = 21/50) of the departments with a DMS; however, residents can access the DMS in clinical routine in only 20% (n = 10/50) of the departments. An interdisciplinary dose team, which among other things communicates dose data internally (63%, n = 76/120) and externally, is already implemented in 57% (n = 68/120) of departments. Swiss radiological departments are committed to radiation safety. However, there is high heterogeneity among them regarding the frequency and method of dose data analysis as well as the use of DMS and radiation protection activities. • Swiss radiological departments are committed to and interested in radiation safety, as proven by a 63% return rate of the survey. • Seventy-nine per cent of departments analyse dose data on a regular basis with differences in the frequency and method of analysis: 42% use a dose management software, while 58% currently perform manual dose data analysis.
Of the latter, 43% plan to buy a dose management software. • Currently, only 25% of the departments add radiation exposure data to the final CT report.
NASA Astrophysics Data System (ADS)
Joshi, Prathmesh
To enhance the surface properties of stainless steel, the substrate was coated with a 1 μm thick Ti-Nb-N coating by reactive DC magnetron sputtering at different N2 flow rates, substrate biases and Nb-Ti ratios. The coated samples were characterized by the following techniques: hardness by Knoop micro-hardness testing, phase analysis by X-ray diffraction (XRD), compositional analysis by energy-dispersive X-ray spectroscopy (EDS) and adhesion by scratch testing. Tribology testing was performed on a linearly reciprocating ball-on-plate wear-testing machine, and wear depth and wear volume were evaluated by white-light interferometry. The micro-hardness test yielded an appreciable enhancement in surface hardness, with the highest value being 1450 HK. XRD analysis revealed three prominent phases, namely NbN, Nb2N3 and TiN. EDS analysis confirmed the presence of Ti, Nb and nitrogen. Adhesion was evaluated on the basis of the critical loads for cohesive (Lc1) and adhesive (Lc2) failure, with values during the scratch test varying between 7-12 N and 16-25 N, respectively, for coatings on SS substrates.
Impact of gate engineering in enhancement mode n++GaN/InAlN/AlN/GaN HEMTs
NASA Astrophysics Data System (ADS)
Adak, Sarosij; Swain, Sanjit Kumar; Rahaman, Hafizur; Sarkar, Chandan Kumar
2016-12-01
This paper illustrates the effect of gate material engineering on the performance of enhancement-mode n++GaN/InAlN/AlN/GaN high electron mobility transistors (HEMTs). A comparative analysis of key device parameters is presented for Triple Material Gate (TMG), Dual Material Gate (DMG) and Single Material Gate (SMG) HEMT structures of the same device dimensions. The simulation results show a significant improvement in key parameters such as drain current (Id), transconductance (gm), cut-off frequency (fT), RF current gain, maximum oscillation frequency (fmax) and RF power gain of the gate-material-engineered devices with respect to the SMG normally-off n++GaN/InAlN/AlN/GaN HEMT. This improvement is due to a perceivable step in the surface potential along the channel, which effectively screens the source side of the channel from drain potential variations in the gate-engineered devices. The analysis suggests that the proposed TMG and DMG enhancement-mode n++GaN/InAlN/AlN/GaN HEMT structures can be considered potential devices for future high-speed, microwave and digital applications.
Environmental Definition Program Cross Sectional Analysis; Summary of Data and Analysis Techniques
1975-12-31
selected locations, and to characterize cloud and precipitation systems during certain tests and experiments conducted at Wallops Flight Center and at...values. The LWCA was on-site at Wallops Flight Center assisting the special flight measurements. Other computer programs developed were designed to...55°58′ N, 37°25′ E; Kiev 50°24′ N, 30°27′ E; Simferopol 45°01′ N, 33°59′ E; Perm 58°01′ N, 56°18′ E; Aktyubinsk 50°20′ N, 57°13′ E; Semipalatinsk 50°21′ N, 80°15′ E
Noguchi, M; Kido, Y; Kubota, H; Kinjo, H; Kohama, G
1999-12-01
The records of 136 patients with N1-3 oral squamous cell carcinoma treated by surgery were investigated retrospectively, with the aim of finding out which factors were predictive of survival on multivariate analysis. Four independent factors significantly influenced survival in the following order: pN stage; T stage; histological grade; and N stage. The most significant was pN stage, the five-year survival for patients with pN0 being 91% and for patients with pN1-3 41%. A further study was carried out on the 80 patients with pN1-3 to find out their prognostic factors for survival and the independent factors identified by multivariate analysis were T stage and presence or absence of extracapsular spread to metastatic lymph nodes.
Knowledge-Based Image Analysis.
1981-04-01
ETL-0258 (AD-A101319), Knowledge-Based Image Analysis. George C. Stockman, Barbara A. Lambird, David Lavine, Laveen N. Kanal. Keywords: ...extraction, verification, region classification, pattern recognition, image analysis.
Tsao, Chia-Wen; Yang, Zhi-Jie
2015-10-14
Desorption/ionization on silicon (DIOS) is a high-performance matrix-free mass spectrometry (MS) analysis method that involves using silicon nanostructures as a matrix for MS desorption/ionization. In this study, gold nanoparticles grafted onto a nanostructured silicon (AuNPs-nSi) surface were demonstrated as a DIOS-MS analysis approach with high sensitivity and high detection specificity for glucose detection. A glucose sample deposited on the AuNPs-nSi surface was directly catalyzed to negatively charged gluconic acid molecules on a single AuNPs-nSi chip for MS analysis. The AuNPs-nSi surface was fabricated using two electroless deposition steps and one electroless etching step. The effects of the electroless fabrication parameters on the glucose detection efficiency were evaluated. Practical application of AuNPs-nSi MS glucose analysis in urine samples was also demonstrated in this study.
Exploratory Factor Analysis with Small Sample Sizes
ERIC Educational Resources Information Center
de Winter, J. C. F.; Dodou, D.; Wieringa, P. A.
2009-01-01
Exploratory factor analysis (EFA) is generally regarded as a technique for large sample sizes ("N"), with N = 50 as a reasonable absolute minimum. This study offers a comprehensive overview of the conditions in which EFA can yield good quality results for "N" below 50. Simulations were carried out to estimate the minimum required "N" for different…
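The minimum-N question can be probed with a toy simulation. The sketch below is illustrative only, a hypothetical one-factor design rather than the authors' simulation protocol: it generates N = 30 observations from a single-factor model, extracts loadings from the first eigenvector of the sample correlation matrix, and scores recovery with Tucker's congruence coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-factor population model (not the authors' design):
# 6 variables, all with true loading 0.8, uniqueness 1 - 0.8**2.
p, n, loading = 6, 30, 0.8
lam = np.full(p, loading)

# Simulate n observations: x = lam*f + sqrt(1 - lam^2)*e
f = rng.standard_normal(n)
e = rng.standard_normal((n, p))
x = np.outer(f, lam) + np.sqrt(1 - lam**2) * e

# One-factor extraction via the leading eigenvector of the sample
# correlation matrix (principal-axis style, single pass).
r = np.corrcoef(x, rowvar=False)
w, v = np.linalg.eigh(r)
est = v[:, -1] * np.sqrt(w[-1])   # estimated loadings
est *= np.sign(est.sum())         # resolve sign indeterminacy

# Tucker's congruence between true and estimated loadings; values
# above ~0.95 are conventionally read as good recovery.
phi = est @ lam / (np.linalg.norm(est) * np.linalg.norm(lam))
print(round(phi, 3))
```

With strong, uniform loadings even N = 30 typically recovers the factor well; recovery degrades as loadings shrink or the number of factors grows, which is the regime the study maps out.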
Molinski, Tadeusz F.; Reynolds, Kirk A.; Morinaka, Brandon I.
2012-01-01
The absolute stereostructures of the components of symplocin A (3), a new N,N-dimethyl-terminated peptide from the Bahamian cyanobacterium Symploca sp., were assigned from spectroscopic analysis, including MS, 2D NMR, and Marfey’s analysis. The complete absolute configuration of symplocin A, including the unexpected D-configurations of the terminal N,N-dimethylisoleucine and valic acid residues, was assigned by chiral-phase HPLC of the corresponding 2-naphthacyl esters, a highly sensitive, complementary strategy for the assignment of N-blocked peptide residues where Marfey’s method is ineffectual or other methods fall short. Symplocin A exhibited potent activity as an inhibitor of cathepsin E (IC50 300 pM). PMID:22360587
Wu, Yike; Sha, Qiuyue; Du, Juan; Wang, Chang; Zhang, Liang; Liu, Bi-Feng; Lin, Yawei; Liu, Xin
2018-02-02
Robust, efficient identification and accurate quantification of N-glycans are of great significance in N-glycomics analysis. Here, a simple and rapid derivatization method, based on the combination of microwave-assisted deglycosylation and 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate (AQC) labeling, was developed for the analysis of N-glycans by high-performance liquid chromatography with fluorescence detection (HPLC-FLD). After optimizing various parameters affecting deglycosylation and derivatization with RNase B, the time for N-glycan labeling was shortened to 50 min with ∼10-fold enhancement in detection sensitivity compared with the conventional 2-aminobenzoic acid (2-AA) labeling method. Additionally, the method showed good linearity (correlation coefficients > 0.991) and reproducibility (RSD < 8.7%). These advantages of the proposed method were further validated by the analysis of complex samples, including fetuin and human serum. Investigation of the serum N-glycome for preliminary diagnosis of human lung cancer was conducted, where significant changes of several N-glycans corresponding to core-fucosylated, mono- and disialylated glycans were evidenced by a series of statistical analyses. Copyright © 2018 Elsevier B.V. All rights reserved.
Analysis of Flue Gas Desulfurization (FGD) Processes for Potential Use on Army Coal-Fired Boilers
1980-09-01
Technical Report N-93 (CERL-TR-N-93), September 1980: Analysis of Flue Gas Desulfurization (FGD) Processes for Potential Use on Army Coal-Fired Boilers.
Knowles, N. Richard; Ries, Stanley K.
1981-01-01
Triacontanol (TRIA) increased fresh and dry weight and total reducible nitrogen (total N) of rice (Oryza sativa L.) seedlings within 40 minutes. Increases in total N in the supernatants from homogenates of corn (Zea mays L.) and rice leaves treated with TRIA for one minute before grinding occurred within 30 and 80 minutes, respectively. The source for the increase was investigated utilizing atmospheric substitution and enrichment and depletion studies with 15N. The increase in total N in seedlings was shown to be independent of method of N analysis and the presence of nitrate in the plants. Automated Kjeldahl determinations showing apparent increases in N composition due to TRIA were shown to be correlated with hand Kjeldahl, elemental analysis, and chemiluminescent analysis in three independent laboratories. TRIA did not alter the nitrate uptake or endogenous levels of nitrate in corn and rice seedlings. Enrichment experiments revealed that the total N increases in rice seedlings, in vivo, and in supernatants of corn leaf homogenates, in vitro, are not due to atmospheric N2. TRIA increased the soluble N pools of the plants, specifically the free amino acid and soluble protein fractions. No differences in depletion or enrichment of 15N incorporated into soluble and insoluble N fractions of rice seedlings could be detected on an atom per cent 15N basis. The apparent short-term total N increases cannot be explained by current knowledge of major N assimilation pathways. TRIA may stimulate a change in the chemical composition of the seedlings, resulting in interference with standard methods of N analysis. PMID:16662092
Consolidating Data of Global Urban Populations: a Comparative Approach
NASA Astrophysics Data System (ADS)
Blankespoor, B.; Khan, A.; Selod, H.
2017-12-01
Global data on city populations are essential for the study of urbanization, city growth and the spatial distribution of human settlements. Such data are either gathered by combining official estimates of urban populations from across countries or extracted from gridded population models that combine these estimates with geospatial data. These data sources provide varying estimates of urban populations, and each approach has its advantages and limitations. In particular, official figures suffer from a lack of consistency in defining urban units (across both space and time) and often provide data for jurisdictions rather than the functionally meaningful urban area. On the other hand, gridded population models require a user-imposed definition to identify urban areas and are constrained by the modelling techniques and input data employed. To address these drawbacks, we combine these approaches by consolidating information from three established sources: (i) Citypopulation.de (Brinkhoff, 2016); (ii) the World Urbanization Prospects data (United Nations, 2014); and (iii) the Global Human Settlements population grid (GHS-POP) (EC - JRC, 2015). We create urban footprints with GHS-POP and spatially merge georeferenced city points from both the UN WUP and Citypopulation.de with these urban footprints to identify city points that belong to a single agglomeration. We create a consolidated dataset by combining population data from the UN WUP and Citypopulation.de. The flexible framework outlined can incorporate information from alternative inputs to identify urban clusters, e.g., by using night-time lights, built-up area or alternative gridded population models (e.g., WorldPop or LandScan), and the parameters employed (e.g., density thresholds for urban footprints) may also be adjusted as a function of city-specific characteristics. Our consolidated dataset provides a wider and more accurate coverage of city populations to support studies of urbanization.
We apply the data to re-examine Zipf's Law. References: Brinkhoff, Thomas. 2016. City Population. EC - JRC; Columbia University, CIESIN. 2015. GHS population grid, derived from GPW4, multi-temporal (1975, 1990, 2000, 2015). United Nations, Department of Economic and Social Affairs, Population Division. 2014. World Urbanization Prospects: 2014 Revision.
Reiss, Rebecca A; Guerra, Peter; Makhnin, Oleg
2016-01-01
Chlorinated solvent contamination of potable water supplies is a serious problem worldwide. Biostimulation protocols can successfully remediate chlorinated solvent contamination through enhanced reductive dechlorination pathways; however, the process is poorly understood and sometimes stalls, creating a more serious problem. Whole-metagenome techniques have the potential to reveal details of microbial community changes induced by biostimulation. Here we compare the metagenome of a tetrachloroethene-contaminated Environmental Protection Agency Superfund Site before and after the application of biostimulation protocols. Environmental DNA was extracted from uncultured microbes that were harvested by on-site filtration of groundwater one month prior to and five months after the injection of emulsified vegetable oil, nutrients, and hydrogen gas bioamendments. Paired-end libraries were prepared for high-throughput DNA sequencing, and 90 basepairs from both ends of randomly fragmented 400-basepair DNA fragments were sequenced. Over 31 million reads were annotated with Metagenome Rapid Annotation using Subsystem Technology, representing 32 prokaryotic phyla, 869 genera, and 3,181 species. A 3.6 log2-fold increase in biomass, as measured by DNA yield per mL of water, was observed, but there was a 9% decrease in the number of genera detected post-remediation. We apply Bayesian statistical methods to assign false discovery rates to fold-change abundance data and use Zipf's power law to filter genera with low read counts. Plotting the log-rank against the log-fold-change facilitates the visualization of the changes in the community in response to the enhanced reductive dechlorination protocol. Members of the Archaea domain increased 4.7 log2-fold, dominated by methanogens. Prior to remediation, the classes Alphaproteobacteria and Betaproteobacteria dominated the community, but they exhibit significant decreases five months after biostimulation.
Geobacter and Sulfurospirillum replace "Sideroxydans" and Burkholderia as the most abundant genera. As a result of biostimulation, Deltaproteobacteria and Epsilonproteobacteria capable of dehalogenation, iron and sulfate reduction, and sulfur oxidation increase. Matches to thermophilic, haloalkane-respiring archaea are evidence for additional species involved in the biodegradation of chlorinated solvents. Additionally, potentially pathogenic bacteria increase, indicating that there may be unintended consequences of bioremediation.
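The filtering-and-plotting step described above can be sketched as follows. This is an illustrative reconstruction with simulated read counts, not the study's data or its exact Bayesian FDR procedure: low-abundance genera are dropped from the Zipf-like tail before fold changes are set against abundance rank.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical pre/post read counts for 200 genera (illustrative only).
genera = 200
pre = rng.pareto(1.2, genera) * 50 + 1       # Zipf-like abundance
post = pre * rng.lognormal(0, 1.0, genera)   # random community shift

# Zipf-style filter: drop genera whose pooled counts fall in the
# low-abundance tail, where fold-change estimates are unreliable.
total = pre + post
keep = total >= np.quantile(total, 0.25)

# Log-rank (by pooled abundance) against log2 fold change, the axes of
# the visualization described in the abstract.
rank = np.argsort(np.argsort(-total[keep])) + 1
log_rank = np.log10(rank)
log_fc = np.log2(post[keep] / pre[keep])
print(len(log_fc))
```

Genera with large |log_fc| at low rank (high abundance) are the candidates worth testing formally with a false-discovery-rate procedure.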
USDA-ARS?s Scientific Manuscript database
Elevated intake of n-3 long chain polyunsaturated fatty acids (n-3 LCPUFA) is associated with reduced risk for cardiovascular disease. Intake of n-3 LCPUFA is often quantified by analysis of plasma phospholipid fatty acids (PLFA); however, the typical analysis by gas chromatography does not allow fo...
Light Helicopter Family Trade-Off Analysis. Volume 4. Appendix N
1985-05-15
Figures N-VII-9 through N-VII-12 show the fuel flow comparisons that correspond to the power requirements shown by Figures N-VII-1 through N-VII-4. ... LIGHT HELICOPTER FAMILY TRADE-OFF ANALYSIS, APPENDIX N, VOLUME IV of XI (ACN: 69396), Final Report, 15 May 1985.
Infrared Reflectance Analysis of Epitaxial n-Type Doped GaN Layers Grown on Sapphire.
Tsykaniuk, Bogdan I; Nikolenko, Andrii S; Strelchuk, Viktor V; Naseka, Viktor M; Mazur, Yuriy I; Ware, Morgan E; DeCuir, Eric A; Sadovyi, Bogdan; Weyher, Jan L; Jakiela, Rafal; Salamo, Gregory J; Belyaev, Alexander E
2017-12-01
Infrared (IR) reflectance spectroscopy is applied to study a Si-doped multilayer n+/n0/n+-GaN structure grown on a GaN buffer with a GaN-template/sapphire substrate. Analysis of the investigated structure by photo-etching, SEM, and SIMS methods revealed an additional layer, with a drastic difference in Si and O doping levels, located between the epitaxial GaN buffer and the template. Simulation of the experimental reflectivity spectra was performed over a wide frequency range. It is shown that modeling the IR reflectance spectrum using the 2 × 2 transfer matrix method and including the additional layer in the analysis makes it possible to obtain the best fit of the experimental spectrum, which yields GaN layer thicknesses in good agreement with the SEM and SIMS data. The spectral dependence of the plasmon-LO-phonon coupled modes for each GaN layer is obtained from the spectral dependence of the dielectric function; the deviation from the Si doping level is attributed to compensation effects by acceptor states.
Heiland, Dieter Henrik; Mader, Irina; Schlosser, Pascal; Pfeifer, Dietmar; Carro, Maria Stella; Lange, Thomas; Schwarzwald, Ralf; Vasilikos, Ioannis; Urbach, Horst; Weyerbrock, Astrid
2016-01-01
The goal of this study was to identify correlations between metabolites from proton MR spectroscopy and genetic pathway activity in glioblastoma multiforme (GBM). Twenty patients with primary GBM were analysed by short echo-time chemical shift imaging and genome-wide expression analyses. Weighted Gene Co-Expression Analysis was used for an integrative analysis of imaging and genetic data. N-acetylaspartate, normalised to the contralateral healthy side (nNAA), was significantly correlated with oligodendrocytic and neural development. For normalised creatine (nCr), a group with low nCr was linked to the mesenchymal subtype, while high nCr could be assigned to the proneural subtype. Moreover, clustering of normalised glutamine and glutamate (nGlx) revealed two groups, one with high nGlx attributed to the neural subtype, and one with low nGlx associated with the classical subtype. Hence, the metabolites nNAA, nCr, and nGlx correlate with a specific gene expression pattern reflecting the previously described subtypes of GBM. Moreover, high nNAA was associated with better clinical prognosis, whereas patients with lower nNAA revealed shorter progression-free survival (PFS). PMID:27350391
NASA Astrophysics Data System (ADS)
Lindholm, D. M.; Wilson, A.
2012-12-01
The steps many scientific data users go through to use data (after discovering it) can be rather tedious, even when dealing with datasets within their own discipline. Accessing data across domains often seems intractable. We present here LaTiS, an open-source brokering solution that bridges the gap between the source data and the user's code by defining a unified data model plus a plugin framework for "adapters" to read data from their native source, "filters" to perform server-side data processing, and "writers" to output any number of desired formats or streaming protocols. A great deal of work is being done in the informatics community to promote multi-disciplinary science with a focus on search and discovery based on metadata - information about the data. The goal of LaTiS is to go that last step: to provide a uniform interface for reading the dataset into computer programs and other applications once it has been identified. The LaTiS solution for integrating a wide variety of data models is to return to mathematical fundamentals. The LaTiS data model emphasizes functional relationships between variables. For example, a time series of temperature measurements can be thought of as a function that maps a time to a temperature. With just three constructs: "Scalar" for a single variable, "Tuple" for a collection of variables, and "Function" to represent a set of independent and dependent variables, the LaTiS data model can represent most scientific datasets at a low level that enables uniform data access. Higher-level abstractions can be built on top of the basic model to add more meaningful semantics for specific user communities. LaTiS defines its data model in terms of the Unified Modeling Language (UML). It also defines a very thin Java interface that can be implemented by numerous existing data interfaces (e.g. NetCDF-Java) such that client code can access any dataset via the Java API, independent of the underlying data access mechanism.
LaTiS also provides a reference implementation of the data model and server framework (with a RESTful service interface) in the Scala programming language. Scala can be thought of as the next generation of Java. It runs on the Java Virtual Machine and can directly use Java code. Scala improves upon Java's object-oriented capabilities and adds support for functional programming paradigms which are particularly well suited for scientific data analysis. The Scala implementation of LaTiS can be thought of as a Domain Specific Language (DSL) which presents an API that better matches the semantics of the problems scientific data users are trying to solve. Instead of working with bytes, ints, or arrays, the data user can directly work with data as "time series" or "spectra". LaTiS provides many layers of abstraction with which users can interact to support a wide variety of data access and analysis needs.
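The Scalar/Tuple/Function model described above can be made concrete with a minimal sketch. This is illustrative Python, not the actual LaTiS API (which is specified in UML with Java/Scala implementations); all type names below are stand-ins.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

# Hypothetical, simplified renderings of the three LaTiS constructs.

@dataclass
class Scalar:
    name: str
    value: float

@dataclass
class Tuple_:
    variables: List["Var"]

@dataclass
class Function:
    # Maps each sample of an independent variable to a dependent variable.
    domain: str
    samples: Dict[float, "Var"]

Var = Union[Scalar, Tuple_, Function]

# A time series of temperatures: a function from time to temperature.
series = Function(
    domain="time",
    samples={0.0: Scalar("temperature", 280.1),
             1.0: Scalar("temperature", 280.4)},
)

# Uniform access: client code iterates the function without knowing
# whether the data came from NetCDF, a database, or a web service.
temps = [s.value for s in series.samples.values()]
print(temps)
```

The point of the three-construct model is exactly this uniformity: once an adapter has expressed a dataset as Scalars, Tuples, and Functions, filters and writers can operate on it without source-specific logic.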
Dong, Guoying; Luo, Jing; Zhang, Hong; Wang, Chengmin; Duan, Mingxing; Deliberto, Thomas Jude; Nolte, Dale Louis; Ji, Guangju; He, Hongxuan
2011-01-01
H9N2 influenza A viruses have become established worldwide in terrestrial poultry and wild birds, and are occasionally transmitted to mammals including humans and pigs. To comprehensively elucidate the genetic and evolutionary characteristics of H9N2 influenza viruses, we performed a large-scale sequence analysis of 571 viral genomes from the NCBI Influenza Virus Resource Database, representing the spectrum of H9N2 influenza viruses isolated from 1966 to 2009. Our study provides a panoramic framework for better understanding the genesis and evolution of H9N2 influenza viruses, and for describing the history of H9N2 viruses circulating in diverse hosts. Panorama phylogenetic analysis of the eight viral gene segments revealed the complexity and diversity of H9N2 influenza viruses. The 571 H9N2 viral genomes were classified into 74 separate lineages, which had marked host and geographical differences in phylogeny. Panorama genotypical analysis also revealed that H9N2 viruses include at least 98 genotypes, which were further divided according to their HA lineages into seven series (A–G). Phylogenetic analysis of the internal genes showed that H9N2 viruses are closely related to H3, H4, H5, H7, H10, and H14 subtype influenza viruses. Our results indicate that H9N2 viruses have undergone extensive reassortments to generate multiple reassortants and genotypes, suggesting that the continued circulation of multiple genotypical H9N2 viruses throughout the world in diverse hosts has the potential to cause future influenza outbreaks in poultry and epidemics in humans. We propose a nomenclature system for identifying and unifying all lineages and genotypes of H9N2 influenza viruses in order to facilitate international communication on the evolution, ecology and epidemiology of H9N2 influenza viruses. PMID:21386964
Jeong, Yeong Ran; Kim, Sun Young; Park, Young Sam; Lee, Gyun Min
2018-03-21
N-glycans of therapeutic glycoproteins are critical quality attributes that should be monitored throughout all stages of biopharmaceutical development. To reduce both the time for sample preparation and the variations in analytical results, we have developed an N-glycan analysis method that includes improved 2-aminobenzoic acid (2-AA) labeling to easily remove deglycosylated proteins. Using this analytical method, 15 major 2-AA-labeled N-glycans of Enbrel® were separated into single peaks in hydrophilic interaction chromatography mode and therefore could be quantitated. 2-AA-labeled N-glycans were also highly compatible with in-line quadrupole time-of-flight mass spectrometry (MS) for structural identification. The structures of 15 major and 18 minor N-glycans were identified from their mass values determined by quadrupole time-of-flight MS. Furthermore, the structures of 14 major N-glycans were confirmed by interpreting the MS/MS data of each N-glycan. This analytical method was also successfully applied to neutral N-glycans of Humira® and highly sialylated N-glycans of NESP®. Furthermore, the analysis data of Enbrel® that were accumulated for 2.5 years demonstrated the high-level consistency of this analytical method. Taken together, the results show that a wide repertoire of N-glycans of therapeutic glycoproteins can be analyzed with high efficiency and consistency using the improved 2-AA labeling-based N-glycan analysis method. Copyright © 2018 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.
Schulte-Uebbing, Lena; de Vries, Wim
2018-02-01
Elevated nitrogen (N) deposition may increase net primary productivity in N-limited terrestrial ecosystems and thus enhance the terrestrial carbon (C) sink. To assess the magnitude of this N-induced C sink, we performed a meta-analysis on data from forest fertilization experiments to estimate N-induced C sequestration in aboveground tree woody biomass, a stable C pool with long turnover times. Our results show that boreal and temperate forests responded strongly to N addition and sequestered on average an additional 14 and 13 kg C per kg N in aboveground woody biomass, respectively. Tropical forests, however, did not respond significantly to N addition. The common hypothesis that tropical forests do not respond to N because they are phosphorus-limited could not be confirmed, as we found no significant response to phosphorus addition in tropical forests. Across climate zones, we found that young forests responded more strongly to N addition, which is important as many previous meta-analyses of N addition experiments rely heavily on data from experiments on seedlings and young trees. Furthermore, the C-N response (defined as additional mass unit of C sequestered per additional mass unit of N addition) was affected by forest productivity, experimental N addition rate, and rate of ambient N deposition. The estimated C-N responses from our meta-analysis were generally lower than those derived with stoichiometric scaling, dynamic global vegetation models, and forest growth inventories along N deposition gradients. We estimated N-induced global C sequestration in tree aboveground woody biomass by multiplying the C-N responses obtained from the meta-analysis by N deposition estimates per biome. We thus derived an N-induced global C sink of about 177 (112-243) Tg C/year in aboveground and belowground woody biomass, which would account for about 12% of the forest biomass C sink (1,400 Tg C/year). © 2017 John Wiley & Sons Ltd.
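The final scaling step, multiplying each biome's C-N response by its N deposition and summing, is simple arithmetic and can be sketched as below. The boreal and temperate responses (14 and 13 kg C per kg N) come from the text, and the tropical response is set to zero since no significant response was found; the deposition figures are hypothetical placeholders rather than the study's inputs, so the total here does not reproduce the reported 177 Tg C/year.

```python
# Sketch of the per-biome scaling described above. Deposition values
# are illustrative placeholders, NOT the study's estimates.
c_n_response = {"boreal": 14.0, "temperate": 13.0, "tropical": 0.0}  # kg C per kg N (from text)
n_deposition_tg = {"boreal": 2.0, "temperate": 8.0, "tropical": 15.0}  # Tg N/year (hypothetical)

# Since kg C / kg N is dimensionless as a mass ratio, Tg N/year scales
# directly to Tg C/year.
sink_tg_c = sum(c_n_response[b] * n_deposition_tg[b] for b in c_n_response)
print(sink_tg_c)
```

Note how the zero tropical response means tropical deposition, however large, contributes nothing to the estimated sink under this calculation.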
Phylogenetic analysis of swine influenza viruses recently isolated in Korea.
Lee, C S; Kang, B K; Kim, H K; Park, S J; Park, B K; Jung, K; Song, D S
2008-10-01
Several influenza A viral subtypes were isolated from pigs during a severe outbreak of respiratory disease in Korea during 2005 and 2006. They included a classical swine H1N1 subtype, two swine-human-avian triple-recombinant H1N2 subtypes, and a swine-human-avian triple-recombinant H3N2 subtype. In the current study, genetic characterization to determine the probable origin of these recent isolates was carried out for the first time. Phylogenetic analysis indicated that all the recent Korean isolates of H1N1, H1N2, and H3N2 influenza are closely related to viruses from the United States. Serologic and genetic analysis indicated that the Korean H1N2 viral subtypes were introduced directly from the United States, and did not arise from recombination between Korean H1N1 and H3N2. We suggest that the H1N1, H1N2, and H3N2 viral subtypes that were isolated from the Korean swine population originated in North America, and that these viruses are currently circulating in the Korean swine population.
Glazoff, Michael V.; Gering, Kevin L.; Garnier, John E.; Rashkeev, Sergey N.; Pyt'ev, Yuri Petrovich
2016-05-17
Embodiments discussed herein in the form of methods, systems, and computer-readable media deal with the application of advanced "projectional" morphological algorithms for solving a broad range of problems. In a method of performing projectional morphological analysis, an N-dimensional input signal is supplied. At least one N-dimensional form indicative of at least one feature in the N-dimensional input signal is identified. The N-dimensional input signal is filtered relative to the at least one N-dimensional form and an N-dimensional output signal is generated indicating results of the filtering at least as differences in the N-dimensional input signal relative to the at least one N-dimensional form.
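The patented "projectional" algorithms are not spelled out in the abstract, so the following is only a generic sketch of the filtering idea it describes, filtering an N-dimensional input signal relative to a form and outputting differences, using plain grey-scale morphological opening on a 1-D signal with a flat structuring form.

```python
import numpy as np

def erode(x, k):
    # Grey-scale erosion of 1-D signal x by a flat form of width k.
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    return np.array([xp[i:i + k].min() for i in range(len(x))])

def dilate(x, k):
    # Grey-scale dilation by the same flat form.
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    return np.array([xp[i:i + k].max() for i in range(len(x))])

# Opening = dilation of the erosion: suppresses features narrower
# than the form while preserving wider structures.
signal = np.array([0, 0, 5, 0, 0, 3, 3, 3, 3, 0], dtype=float)
opened = dilate(erode(signal, 3), 3)

# Output the differences between the input and its filtered version:
# the one-sample spike is flagged; the wide plateau is not.
residue = signal - opened
print(residue.tolist())
```

The same erode/dilate pattern generalizes to N dimensions by taking the min/max over an N-dimensional neighborhood, which is the spirit of identifying an N-dimensional form and reporting the input's differences from it.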
NASA Astrophysics Data System (ADS)
Ohta, Akio; Truyen, Nguyen Xuan; Fujimura, Nobuyuki; Ikeda, Mitsuhisa; Makihara, Katsunori; Miyazaki, Seiichi
2018-06-01
The energy distribution of the electronic state density of wet-cleaned epitaxial GaN surfaces and SiO2/GaN structures has been studied by total photoelectron yield spectroscopy (PYS). By X-ray photoelectron spectroscopy (XPS) analysis, the energy band diagram for a wet-cleaned epitaxial GaN surface, including the energy level of the valence band top and the electron affinity, has been determined to obtain a better understanding of the measured PYS signals. The electronic state density of GaN surfaces with different carrier concentrations in the energy region corresponding to the GaN bandgap has been evaluated. The interface defect state density of SiO2/GaN structures was also estimated, not only by PYS analysis but also from capacitance–voltage (C–V) characteristics. We have demonstrated that PYS analysis enables the evaluation of the defect state density filled with electrons at the SiO2/GaN interface in the energy region corresponding to the GaN midgap, which is difficult to estimate by C–V measurement of MOS capacitors.
Effect of ethanol on human sleep EEG using correlation dimension analysis.
Kobayashi, Toshio; Madokoro, Shigeki; Wada, Yuji; Misaki, Kiwamu; Nakagawa, Hiroki
2002-01-01
Our study was designed to investigate the influence of alcohol on sleep using correlation dimension (D2) analysis. Polysomnography (PSG) was performed on 10 adult human males during a baseline night (BL-N) and an ethanol (0.8 g/kg body weight) night (Et-N). The mean D2 values during the Et-N and BL-N decreased significantly from wakefulness to stages 1, 2, and 3+4 of non-rapid eye movement (non-REM) sleep, and increased during REM sleep. The mean D2 of the sleep electroencephalogram (EEG) during stage 2 of the Et-N was significantly higher than during the BL-N. In addition, the mean D2 values of the sleep EEG for the second, third and fourth sleep cycles during the Et-N were significantly higher than during the BL-N. These significant differences between BL-N and Et-N were not detected by spectral and visual analyses. Our results suggest that D2 is a potentially useful parameter for quantitative analysis of the effect of ethanol on sleep EEGs throughout the entire night. Copyright 2002 S. Karger AG, Basel
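For readers unfamiliar with D2, the Grassberger-Procaccia correlation-sum estimator it rests on can be sketched as follows. This is a generic estimator, not the authors' exact EEG pipeline, and the test signal and embedding parameters are illustrative: a noisy limit cycle whose attractor is a closed curve with D2 near 1.

```python
import numpy as np

def correlation_sum(x, m, tau, r):
    # Delay-embed the scalar series x into m dimensions with lag tau,
    # then report the fraction of point pairs closer than r
    # (the Grassberger-Procaccia correlation sum C(r)).
    n = len(x) - (m - 1) * tau
    emb = np.column_stack([x[i * tau:i * tau + n] for i in range(m)])
    d = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    iu = np.triu_indices(n, k=1)
    return float(np.mean(d[iu] < r))

# Illustrative signal: a noisy sinusoid (limit cycle, D2 ~ 1).
rng = np.random.default_rng(2)
x = np.sin(0.3 * np.arange(1000)) + 0.01 * rng.standard_normal(1000)

# D2 is estimated as the slope of log C(r) vs log r over a scaling range.
radii = np.array([0.1, 0.2, 0.4, 0.8])
c = [correlation_sum(x, m=3, tau=5, r=r) for r in radii]
d2 = np.polyfit(np.log(radii), np.log(c), 1)[0]
print(round(d2, 2))
```

In practice the embedding dimension, lag, and scaling range must be chosen carefully for EEG, and surrogate-data tests are usually added before interpreting D2 changes.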
Kinetic-limited etching of magnesium doping nitrogen polar GaN in potassium hydroxide solution
NASA Astrophysics Data System (ADS)
Jiang, Junyan; Zhang, Yuantao; Chi, Chen; Yang, Fan; Li, Pengchong; Zhao, Degang; Zhang, Baolin; Du, Guotong
2016-01-01
KOH-based wet etching was performed on both undoped and Mg-doped N-polar GaN films grown by metal-organic chemical vapor deposition. It is found that the etching rate for Mg-doped N-polar GaN is noticeably slower than that for undoped N-polar GaN. X-ray photoelectron spectroscopy analysis proved that the Mg oxide formed on the N-polar GaN surface is insoluble in KOH solution, so that kinetic-limited etching occurs as the etching process proceeds. An etching process model of Mg-doped N-polar GaN in KOH solution is tentatively proposed using a simplified ideal atomic configuration. Raman spectroscopy analysis reveals that Mg doping can induce tensile strain in N-polar GaN films. Meanwhile, a p-type N-polar GaN film with a hole concentration of 2.4 × 10^17 cm^-3 was obtained by optimizing the bis-cyclopentadienyl magnesium flow rate.
Krzyżaniak, Agnieszka; Weggemans, Wilko; Schuur, Boelo; de Haan, André B
2011-12-16
Analysis of primary amines in aqueous samples remains a challenging analytical issue. The preferred approach, gas chromatography, is hampered by interactions of free silanol groups with the highly reactive amine groups, resulting in inconsistent measurements. Here, we report a method for direct analysis of aliphatic amines and diamines in aqueous samples by gas chromatography (GC) with silanol deactivation using ionic liquids (ILs). ILs including trihexyl(tetradecyl)phosphonium bis 2,4,4-(trimethylpentyl)phosphinate (Cyphos IL-104), 1-methyl-3-propylimidazolium bis(trifluoromethylsulfonyl)imide [pmim][Tf(2)N] and N″-ethyl-N,N,N',N'-tetramethylguanidinium tris(pentafluoroethyl)trifluorophosphate [etmg][FAP] were tested as deactivating media for the GC liner. Solutions of these ILs in methanol were injected into the system prior to the analysis of primary amines. Butane-1,4-diamine (putrescine, BDA) was used as a reference amine. The best results were obtained using the imidazolium IL [pmim][Tf(2)N]. With this deactivator, excellent reproducibility of the analysis was achieved, and the detection limit of BDA was as low as 1 mM. The applicability of the method was proven for the analysis of two different primary amines (C4-C5) and pentane-1,5-diamine. Copyright © 2011 Elsevier B.V. All rights reserved.
Weng, Yejing; Sui, Zhigang; Jiang, Hao; Shan, Yichu; Chen, Lingfan; Zhang, Shen; Zhang, Lihua; Zhang, Yukui
2015-04-22
Due to the important roles of N-glycoproteins in various biological processes, global N-glycoproteome analysis has received much attention. However, with current strategies for N-glycoproteome profiling, peptides with glycosylated Asn at the N-terminus (PGANs), generated by protease digestion, can hardly be identified, owing to the poor deglycosylation capacity of the enzymes. Yet, theoretically, PGANs account for 10% of the N-glycopeptides in typical tryptic digests. Therefore, in this study, we developed a novel strategy to identify PGANs by releasing N-glycans through N-terminal site-selective succinylation-assisted enzymatic deglycosylation. The obtained PGAN information is beneficial not only for achieving deep-coverage analysis of glycoproteomes, but also for discovering new biological functions of such modifications.
Kobayashi, Ryuji; Patenia, Rebecca; Ashizawa, Satoshi; Vykoukal, Jody
2009-07-21
Alternative translation initiation is a mechanism whereby functionally altered proteins are produced from a single mRNA. Internal initiation of translation generates N-terminally truncated protein isoforms, but such isoforms observed in immunoblot analysis are often overlooked or dismissed as degradation products. We identified an N-terminally truncated isoform of human Dok-1 with N-terminal acetylation as seen in the wild-type. This Dok-1 isoform exhibited distinct perinuclear localization whereas the wild-type protein was distributed throughout the cytoplasm. Targeted analysis of blocked N-terminal peptides provides rapid identification of protein isoforms and could be widely applied for the general evaluation of perplexing immunoblot bands.
Wawer, Iwona; Pisklak, Maciej; Chilmonczyk, Zdzisław
2005-08-10
Sildenafil citrate (SC) (Viagra) and sildenafil base in pure form are easily and unequivocally characterized by multinuclear NMR spectroscopy. Analysis of chemical shifts indicates that: (i) N6-H forms intramolecular hydrogen bonds, (ii) N25 is protonated in the salt, and (iii) intermolecular OH...N hydrogen bonds involving N2 and N4 are present in solid sildenafil citrate. The 13C CPMAS NMR method has been proposed for the identification and quantitation of Viagra in its pharmaceutical formulations.
Wang, Jingrui; Tang, Wei; Zheng, Yongna; Xing, Zhuqing; Wang, Yanping
2016-09-01
A novel lactic acid bacteria strain, Lactobacillus kefiranofaciens ZW3, exhibits high production of exopolysaccharide (EPS). The epsN gene, located in the eps gene cluster of this strain, is associated with EPS biosynthesis. Bioinformatics analysis of this gene was performed; the conserved domain analysis showed that the EpsN protein contains MATE-Wzx-like domains. The epsN gene was then amplified to construct the recombinant expression vector pMG36e-epsN. The results showed that the EPS yields of the recombinants were significantly improved. By determining the yields of EPS and intracellular polysaccharide, it was concluded that the epsN gene could play its Wzx flippase role in EPS biosynthesis. This is the first demonstration of the effect of EpsN on L. kefiranofaciens EPS biosynthesis, further confirming its functional property.
Maeda, Koki; Toyoda, Sakae; Shimojima, Ryosuke; Osada, Takashi; Hanajima, Dai; Morioka, Riki; Yoshida, Naohiro
2010-03-01
A molecular analysis of betaproteobacterial ammonia oxidizers and an N2O isotopomer analysis were conducted to study the sources of N2O emissions during the cow manure composting process. Much NO2−-N and NO3−-N and the Nitrosomonas europaea-like amoA gene were detected at the surface, especially at the top of the composting pile, suggesting that these ammonia-oxidizing bacteria (AOB) significantly contribute to the nitrification which occurs at the surface layer of compost piles. However, the 15N site preference within the asymmetric N2O molecule (SP = δ15Nα − δ15Nβ, where 15Nα and 15Nβ represent the 15N/14N ratios at the center and end sites of the nitrogen atoms, respectively) indicated that the N2O emitted just after the compost was turned originated mainly from the denitrification process. Based on these results, the reduction of accumulated NO2−-N or NO3−-N after turning was identified as the main source of N2O emissions. The site preference and bulk δ15N results also indicate that the rate of N2O reduction was relatively low, and an increased value for the site preference indicates that the nitrification which occurred mainly in the surface layer of the pile partially contributed to N2O emissions between the turnings.
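The isotopomer quantities defined in this abstract reduce to simple arithmetic on the measured δ values. A minimal sketch, with made-up δ values for illustration (the numbers are not data from this study):

```python
def n2o_site_preference(d15N_alpha, d15N_beta):
    """Site preference (per mil) of the asymmetric N2O molecule:
    SP = d15N(alpha) - d15N(beta), alpha being the central N atom."""
    return d15N_alpha - d15N_beta

def n2o_bulk_d15N(d15N_alpha, d15N_beta):
    """Bulk d15N is the mean of the two site-specific values."""
    return 0.5 * (d15N_alpha + d15N_beta)

# Example with hypothetical delta values (per mil):
sp = n2o_site_preference(5.0, -10.0)   # 15.0 per mil
bulk = n2o_bulk_d15N(5.0, -10.0)       # -2.5 per mil
```

In the isotopomer literature, high SP values are generally read as a nitrification signature and values near zero as a denitrification signature, which is the interpretive logic the abstract applies.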
N-Nitrosodimethylamine (NDMA) is a probable human carcinogen that has been identified as a drinking water contaminant of concern. United States Environmental Protection Agency (USEPA) Method 521 has been developed for the analysis of NDMA and six additional N-nitrosamines in dri...
Measurement techniques for analysis of fission fragment excited gases
NASA Technical Reports Server (NTRS)
Schneider, R. T.; Carroll, E. E.; Davis, J. F.; Davie, R. N.; Maguire, T. C.; Shipman, R. G.
1976-01-01
Spectroscopic analyses of fission-fragment-excited He, Ar, Xe, N2, Ne, Ar-N2, and Ne-N2 have been conducted. Boltzmann plot analyses of He, Ar, and Xe indicate a nonequilibrium, recombining plasma, and population inversions have been found in these gases. The observed radiating species in helium are adequately described by a simple kinetic model. A more extensive model for argon, nitrogen, and Ar-N2 mixtures was developed which adequately describes the energy flow in the system and compares favorably with experimental measurements. The kinetic processes involved in these systems are discussed.
N-myc regulates growth and fiber cell differentiation in lens development
Cavalheiro, Gabriel R.; Matos-Rodrigues, Gabriel E.; Zhao, Yilin; Gomes, Anielle L.; Anand, Deepti; Predes, Danilo; de Lima, Silmara; Abreu, Jose G.; Zheng, Deyou; Lachke, Salil A.; Cvekl, Ales; Martins, Rodrigo A. P.
2017-01-01
Myc proto-oncogenes regulate diverse cellular processes during development, but their roles during morphogenesis of specific tissues are not fully understood. We found that c-myc regulates cell proliferation in mouse lens development and previous genome-wide studies suggested functional roles for N-myc in developing lens. Here, we examined the role of N-myc in mouse lens development. Genetic inactivation of N-myc in the surface ectoderm or lens vesicle impaired eye and lens growth, while "late" inactivation in lens fibers had no effect. Unexpectedly, defective growth of N-myc-deficient lenses was not associated with alterations in lens progenitor cell proliferation or survival. Notably, N-myc-deficient lenses exhibited a delay in degradation of DNA in terminally differentiating lens fiber cells. RNA-sequencing analysis of N-myc-deficient lenses identified a cohort of down-regulated genes associated with fiber cell differentiation that included DNaseIIβ. Further, an integrated analysis of differentially expressed genes in N-myc-deficient lens using normal lens expression patterns of iSyTE, N-myc-binding motif analysis and molecular interaction data from the String database led to the derivation of an N-myc-based gene regulatory network in the lens. Finally, analysis of N-myc and c-myc double-deficient lens demonstrated that these Myc genes cooperate to drive lens growth prior to the lens vesicle stage. Together, these findings provide evidence for exclusive and cooperative functions of Myc transcription factors in mouse lens development and identify novel mechanisms by which N-myc regulates cell differentiation during eye morphogenesis. PMID:28716713
Fatty Acid Metabolism is Associated With Disease Severity After H7N9 Infection.
Sun, Xin; Song, Lijia; Feng, Shuang; Li, Li; Yu, Hongzhi; Wang, Qiaoxing; Wang, Xing; Hou, Zhili; Li, Xue; Li, Yu; Zhang, Qiuyang; Li, Kuan; Cui, Chao; Wu, Junping; Qin, Zhonghua; Wu, Qi; Chen, Huaiyong
2018-06-22
Human infections with the H7N9 virus could lead to lung damage and even multiple organ failure, which is closely associated with a high mortality rate. However, the metabolic basis of such systemic alterations remains unknown. This study included hospitalized patients (n = 4) with laboratory-confirmed H7N9 infection, healthy controls (n = 9), and two disease control groups comprising patients with pneumonia (n = 9) and patients with pneumonia who received steroid treatment (n = 10). One H7N9-infected patient underwent lung biopsy for histopathological analysis and expression analysis of genes associated with lung homeostasis. H7N9-induced systemic alterations were investigated using metabolomic analysis of sera collected from the four patients by using ultra-performance liquid chromatography-mass spectrometry. Chest digital radiography and laboratory tests were also conducted. Two of the four patients did not survive the clinical treatments with antiviral medication, steroids, and oxygen therapy. Biopsy revealed disrupted expression of genes associated with lung epithelial integrity. Histopathological analysis demonstrated severe lung inflammation after H7N9 infection. Metabolomic analysis indicated that fatty acid metabolism may be inhibited during H7N9 infection. Serum levels of palmitic acid, erucic acid, and phytal may negatively correlate with the extent of lung inflammation after H7N9 infection. The changes in fatty acid levels may not be due to steroid treatment or pneumonia. Altered structural and secretory properties of the lung epithelium may be associated with the severity of H7N9-infection-induced lung disease. Moreover, fatty acid metabolism level may predict a fatal outcome after H7N9 virus infection. Copyright © 2018. Published by Elsevier B.V.
Analysis of (n,2n) cross-section measurements for nuclei up to mass 238
DOE Office of Scientific and Technical Information (OSTI.GOV)
Davey, W.G.; Goin, R.W.; Ross, J.R.
All suitable measurements of the energy dependence of (n,2n) cross sections of all isotopes up to mass 238 have been analyzed. The objectives were to display the quality of the measured data for each isotope and to examine the systematic dependence of the (n,2n) cross section upon N, Z, and A. Graphs and tables are presented of the ratio of the asymptotic (n,2n) and nonelastic cross section to the neutron-asymmetry parameter (N-Z)/A. Similar data are presented for the derived nuclear temperature, T, and level-density parameter, a, as a function of N, Z, and A. This analysis of the results of over 145 experiments on 61 isotopes is essentially a complete review of the current status of (n,2n) cross-section measurements. (auth)
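The systematization variable used above is elementary to compute; a one-liner, shown here with 238U as a worked example:

```python
def neutron_asymmetry(Z, A):
    """Neutron-asymmetry parameter (N - Z)/A, with N = A - Z, used above
    to systematize asymptotic (n,2n) cross sections across isotopes."""
    N = A - Z
    return (N - Z) / A

# For 238U (Z = 92, A = 238): (146 - 92) / 238, roughly 0.227
u238 = neutron_asymmetry(92, 238)
```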
Detection of non-milk fat in milk fat by gas chromatography and linear discriminant analysis.
Gutiérrez, R; Vega, S; Díaz, G; Sánchez, J; Coronado, M; Ramírez, A; Pérez, J; González, M; Schettino, B
2009-05-01
Gas chromatography was utilized to determine triacylglycerol profiles in milk and non-milk fat. The values of triacylglycerol were subjected to linear discriminant analysis to detect and quantify non-milk fat in milk fat. Two groups of milk fat were analyzed: A) raw milk fat from the central region of Mexico (n = 216) and B) ultrapasteurized milk fat from 3 industries (n = 36), as well as pork lard (n = 2), bovine tallow (n = 2), fish oil (n = 2), peanut (n = 2), corn (n = 2), olive (n = 2), and soy (n = 2). The samples of raw milk fat were adulterated with non-milk fats in proportions of 0, 5, 10, 15, and 20% to form 5 groups. The first function obtained from the linear discriminant analysis allowed the correct classification of 94.4% of the samples with levels <10% of adulteration. The triacylglycerol values of the ultrapasteurized milk fats were evaluated with the discriminant function, demonstrating that one industry added non-milk fat to its product in 80% of the samples analyzed.
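As a minimal sketch of the discriminant step described above, the toy example below fits a two-class Fisher linear discriminant on synthetic "triacylglycerol profiles." The feature count, sample sizes, and mean shift are illustrative assumptions, not the study's chromatographic data.

```python
import numpy as np

def fisher_lda(X0, X1):
    """Two-class Fisher linear discriminant: returns the projection
    vector w and a decision threshold at the midpoint of the
    projected class means."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix
    Sw = np.cov(X0.T) * (len(X0) - 1) + np.cov(X1.T) * (len(X1) - 1)
    w = np.linalg.solve(Sw, mu1 - mu0)
    thresh = 0.5 * ((X0 @ w).mean() + (X1 @ w).mean())
    return w, thresh

rng = np.random.default_rng(0)
pure = rng.normal(0.0, 1.0, size=(100, 6))         # "pure milk fat" profiles
adulterated = rng.normal(0.8, 1.0, size=(100, 6))  # shifted by added non-milk fat
w, t = fisher_lda(pure, adulterated)
# Training accuracy of the midpoint rule
acc = 0.5 * ((pure @ w < t).mean() + (adulterated @ w >= t).mean())
```

With well-separated class means this simple rule classifies most samples correctly; the study's reported 94.4% correct classification refers to its real chromatographic data, not to this synthetic setup.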
Phylogenetic Analysis of Nuclear-Encoded RNA Maturases
Malik, Sunita; Upadhyaya, KC; Khurana, SM Paul
2017-01-01
Posttranscriptional processes, such as splicing, play a crucial role in gene expression and are prevalent not only in nuclear genes but also in plant mitochondria, where splicing of group II introns is catalyzed by a class of proteins termed maturases. In plant mitochondria, there are 22 mitochondrial group II introns. The matR, nMAT1, nMAT2, nMAT3, and nMAT4 proteins have been shown to be required for efficient splicing of several group II introns in Arabidopsis thaliana. Nuclear maturases (nMATs) are necessary for splicing of mitochondrial genes, leading to normal oxidative phosphorylation. Phylogenetic tree analysis (including bootstrapping) of the sequences revealed high homology with maturase sequences of A. thaliana and other plants. This study shows the phylogenetic relationship of nMAT proteins between A. thaliana and other nonredundant plant species taken from BLASTP analysis. PMID:28607538
2011-01-01
Background There is increasing interest by chiropractors in North America regarding integration into mainstream healthcare; however, there is limited information about attitudes towards the profession among conventional healthcare providers, including orthopaedic surgeons. Methods We administered a 43-item cross-sectional survey to 1000 Canadian and American orthopaedic surgeons that inquired about demographic variables and their attitudes towards chiropractic. Our survey included an option for respondents to include written comments, and our present analysis is restricted to these comments. Two reviewers, independently and in duplicate, coded all written comments using thematic analysis. Results 487 surgeons completed the survey (response rate 49%), and 174 provided written comments. Our analysis revealed 8 themes and 24 sub-themes represented in surgeons' comments. Reported themes were: variability amongst chiropractors (n = 55); concerns with chiropractic treatment (n = 54); areas where chiropractic is perceived as effective (n = 43); unethical behavior (n = 43); patient interaction (n = 36); the scientific basis of chiropractic (n = 26); personal experiences with chiropractic (n = 21); and chiropractic training (n = 18). Common sub-themes endorsed by surgeons were diversity within the chiropractic profession as a barrier to increased interprofessional collaboration, endorsement of chiropractic treatment of musculoskeletal complaints, criticism of treatment of non-musculoskeletal complaints, and concern over whether chiropractic care was evidence-based. Conclusions Our analysis identified a number of issues that will have to be considered by the chiropractic profession as part of its efforts to further integrate chiropractic into mainstream healthcare. PMID:21970333
N-linked (N-) glycoproteomics of urinary exosomes. [Corrected].
Saraswat, Mayank; Joenväära, Sakari; Musante, Luca; Peltoniemi, Hannu; Holthofer, Harry; Renkonen, Risto
2015-02-01
Epithelial cells lining the urinary tract secrete urinary exosomes (40-100 nm) that can be targeted to specific cells, modulating their functionality. One potential targeting mechanism is adhesion between vesicle surface glycoproteins and target cells, which makes the glycopeptide analysis of exosomes important. Exosomes reflect the physiological state of their parent cells; therefore, they are a good source of biomarkers for urological and other diseases. Moreover, urine collection is easy and noninvasive, and urinary exosomes give information about renal and systemic organ systems. Accordingly, multiple studies on the proteomic characterization of urinary exosomes in health and disease have been published. However, no systematic analysis of their glycoproteomic profile has been carried out to date, although a conserved glycan signature has been found for exosomes from urine and other sources, including T cell lines and human milk. Here, we have enriched and identified the N-glycopeptides from these vesicles. The enriched N-glycopeptides were resolved for their peptide sequence, glycan composition, structure, and glycosylation site using collision-induced dissociation MS/MS (CID-tandem MS) data interpreted by the publicly available software GlycopeptideId. Released glycans from the same sample were also analyzed with MALDI-MS. We have identified the N-glycoproteome of urinary exosomes: in total, 126 N-glycopeptides from 51 N-glycosylation sites belonging to 37 glycoproteins were found. The peptide sequences of these N-glycopeptides were identified unambiguously, and their glycan compositions (for 125 N-glycopeptides) and structures (for 87 N-glycopeptides) were proposed. A corresponding glycomic analysis of released N-glycans was also performed. We identified 66 unique nonmodified N-glycan compositions, and in addition 13 sulfated/phosphorylated glycans were found. This is the first systematic analysis of the N-glycoproteome of urinary exosomes.
© 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Mejía, Sol M; Flórez, Elizabeth; Mondragón, Fanor
2012-04-14
A computational study of (ethanol)(n)-water, n = 1 to 5 heteroclusters was carried out employing the B3LYP∕6-31+G(d) approach. The molecular (MO) and atomic (AO) orbital analysis and the topological study of the electron density provided results that were successfully correlated. Results were compared with those obtained for (ethanol)(n) and (methanol)(n), n = 1 to 6 clusters and (methanol)(n)-water, n = 1 to 5 heteroclusters. These systems showed the same trends observed in the (ethanol)(n)-water, n = 1 to 5 heteroclusters, such as an O---O distance of 5 Å up to which the O-H---O hydrogen bonds (HBs) can have a significant influence on the constituent monomers. The HOMO of the (hetero)clusters becomes less stable than the HOMO of the isolated alcohol monomer as the (hetero)cluster size increases; this destabilization is greater for linear geometries than for cyclic geometries. Changes in the occupancy and energy of the AOs correlate with the strength of the O-H---O and C-H---O HBs, as well as with the proton donor and/or acceptor character of the molecules involved. In summary, the current MO and AO analysis provides alternative ways to characterize HBs. However, this analysis cannot be applied to the study of the H---H interactions observed in the molecular graphs.
Nishikaze, Takashi
2017-01-01
Mass spectrometry (MS) has become an indispensable tool for analyzing post translational modifications of proteins, including N-glycosylated molecules. Because most glycosylation sites carry a multitude of glycans, referred to as “glycoforms,” the purpose of an N-glycosylation analysis is glycoform profiling and glycosylation site mapping. Matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) has unique characteristics that are suited for the sensitive analysis of N-glycosylated products. However, the analysis is often hampered by the inherent physico-chemical properties of N-glycans. Glycans are highly hydrophilic in nature, and therefore tend to show low ion yields in both positive- and negative-ion modes. The labile nature and complicated branched structures involving various linkage isomers make structural characterization difficult. This review focuses on MALDI-MS-based approaches for enhancing analytical performance in N-glycosylation research. In particular, the following three topics are emphasized: (1) Labeling for enhancing the ion yields of glycans and glycopeptides, (2) Negative-ion fragmentation for less ambiguous elucidation of the branched structure of N-glycans, (3) Derivatization for the stabilization and linkage isomer discrimination of sialic acid residues. PMID:28794918
Mukhtar, Hussnain; Lin, Yu-Pin; Shipin, Oleg V; Petway, Joy R
2017-07-12
This study presents an approach for obtaining realization sets of parameters for nitrogen removal in a pilot-scale waste stabilization pond (WSP) system. The proposed approach was designed for optimal parameterization, local sensitivity analysis, and global uncertainty analysis of a dynamic simulation model for the WSP by using the R software package Flexible Modeling Environment (R-FME) with the Markov chain Monte Carlo (MCMC) method. Additionally, generalized likelihood uncertainty estimation (GLUE) was integrated into the FME to evaluate the major parameters that affect the simulation outputs in the study WSP. Comprehensive modeling analysis was used to simulate and assess nine parameters and the concentrations of ON-N, NH₃-N and NO₃-N. Results indicate that the integrated FME-GLUE-based model, with good Nash-Sutcliffe coefficients (0.53-0.69) and correlation coefficients (0.76-0.83), successfully simulates the concentrations of ON-N, NH₃-N and NO₃-N. Moreover, the Arrhenius constant was the only parameter sensitive for the model performance of the ON-N and NH₃-N simulations, whereas the Nitrosomonas growth rate, the denitrification constant, and the maximum growth rate at 20 °C were sensitive for the ON-N and NO₃-N simulations, as measured by global sensitivity analysis.
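Two of the quantities named above are simple closed-form expressions. The sketch below shows the Arrhenius-type temperature correction in the form an "Arrhenius constant" θ typically enters pond models, and the Nash-Sutcliffe efficiency used to score the simulations; the numeric values in the test are illustrative, not calibrated parameters from this study.

```python
import numpy as np

def rate_at_T(k20, theta, T):
    """Arrhenius-type temperature correction common in waste stabilization
    pond models: k_T = k_20 * theta**(T - 20), with T in degrees Celsius
    and k_20 the rate constant at the 20 C reference temperature."""
    return k20 * theta ** (T - 20.0)

def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 means a perfect fit, 0 means the
    simulation is no better than predicting the observed mean."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)
```

A GLUE analysis would evaluate such an efficiency for each MCMC parameter realization and retain the "behavioural" sets above a likelihood threshold.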
Determining cantilever stiffness from thermal noise.
Lübbe, Jannis; Temmen, Matthias; Rahe, Philipp; Kühnle, Angelika; Reichling, Michael
2013-01-01
We critically discuss the extraction of intrinsic cantilever properties, namely the eigenfrequency f_n, quality factor Q_n and specifically the stiffness k_n of the nth cantilever oscillation mode, from thermal noise by an analysis of the power spectral density of displacement fluctuations of the cantilever in contact with a thermal bath. The practical applicability of this approach is demonstrated for several cantilevers with eigenfrequencies ranging from 50 kHz to 2 MHz. As this approach requires sophisticated spectral analysis, we introduce a new method to determine k_n from a spectral analysis of the demodulated oscillation signal of the excited cantilever that can be performed in the frequency range of 10 Hz to 1 kHz regardless of the eigenfrequency of the cantilever. We demonstrate that the latter method is particularly useful for noncontact atomic force microscopy (NC-AFM), where the simple instrumentation required for spectral analysis is available in most experimental systems.
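The thermal-noise route to stiffness rests on the equipartition theorem: the mean-square displacement of a mode, obtained by integrating its displacement power spectral density, fixes k_n = k_B T / <z_n^2>. A minimal sketch of that step, assuming a calibrated single-mode PSD in m^2/Hz (simplified relative to the full treatment in the paper, which also accounts for mode shape and detector corrections):

```python
import numpy as np

KB = 1.380649e-23  # Boltzmann constant, J/K

def stiffness_from_psd(freqs, psd, T=300.0):
    """Equipartition estimate of the stiffness of one cantilever mode from
    its thermally driven displacement power spectral density (m^2/Hz):
    <z_n^2> is the integral of the PSD over the resonance, and
    k_n = k_B * T / <z_n^2>."""
    # Trapezoidal integration of the PSD -> mean-square displacement (m^2)
    mean_square = np.sum(0.5 * (psd[1:] + psd[:-1]) * np.diff(freqs))
    return KB * T / mean_square
```

As a consistency check, any PSD whose integral equals k_B T / k returns exactly k, independent of the spectral shape.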
Optimization of Parameter Ranges for Composite Tape Winding Process Based on Sensitivity Analysis
NASA Astrophysics Data System (ADS)
Yu, Tao; Shi, Yaoyao; He, Xiaodong; Kang, Chao; Deng, Bo; Song, Shibo
2017-08-01
This study focuses on the parameter sensitivity of the winding process for composite prepreg tape. Methods of multi-parameter relative sensitivity analysis and single-parameter sensitivity analysis are proposed. A polynomial empirical model of interlaminar shear strength is established by the response surface experimental method. Using this model, the relative sensitivity of key process parameters, including temperature, tension, pressure and velocity, is calculated, and the single-parameter sensitivity curves are obtained. According to the analysis of the sensitivity curves, the stable and unstable ranges of each parameter are recognized. Finally, an optimization method for the winding process parameters is developed. The analysis results show that the optimized ranges of the process parameters for interlaminar shear strength are: temperature within [100 °C, 150 °C], tension within [275 N, 387 N], pressure within [800 N, 1500 N], and velocity within [0.2 m/s, 0.4 m/s], respectively.
USDA-ARS?s Scientific Manuscript database
Weather and soil properties are known to affect soil nitrogen (N) availability and plant N uptake. Studies examining N response as affected by soil and weather sometimes give conflicting results. Meta-analysis is a statistical method for estimating treatment effects in a series of experiments...
Emergence of Compulsive Behavior and Tantrums in Children with Prader-Willi Syndrome.
ERIC Educational Resources Information Center
Dimitropoulos, A.; Feurer, I. D.; Butler, M. G.; Thompson, T.
2001-01-01
Analysis of questionnaires completed by parents of young children with Prader-Willi syndrome (N=84), children with Down syndrome (N=56), or typically developing children (N=86) found that children with Prader-Willi syndrome exhibited more compulsions, skin-picking, and tantrums than did the other groups. Discriminant analysis identified two functions (developmental milestones…
Atom Probe Tomography Analysis of Gallium-Nitride-Based Light-Emitting Diodes
NASA Astrophysics Data System (ADS)
Prosa, Ty J.; Olson, David; Giddings, A. Devin; Clifton, Peter H.; Larson, David J.; Lefebvre, Williams
2014-03-01
Thin-film light-emitting diodes (LEDs) composed of GaN/InxGa1-xN/GaN quantum well (QW) structures are integrated into modern optoelectronic devices because the tunable InGaN band gap enables emission across the full visible spectrum. Atom probe tomography (APT) offers unique capabilities for 3D device characterization, including compositional mapping of nano-volumes (>10^6 nm^3), high detection efficiency (>50%), and good sensitivity. In this study, APT is used to understand the distribution of dopants as well as Al and In alloying agents in a GaN device. Measurements using transmission electron microscopy (TEM) and secondary ion mass spectrometry (SIMS) have also been made to improve the accuracy of the APT analysis by correlating the information content of these complementary techniques. APT analysis reveals various QW and other optoelectronic structures, including a Mg-doped p-GaN layer, an Al-rich electron blocking layer, an In-rich multi-QW region, and an In-based super-lattice structure. The multi-QW composition shows good quantitative agreement with the layer thickness and spacing extracted from a high-resolution TEM image intensity analysis.
PIXE analysis on Maya blue in Prehispanic and colonial mural paintings
NASA Astrophysics Data System (ADS)
Sánchez del Río, M.; Martinetto, P.; Solís, C.; Reyes-Valerio, C.
2006-08-01
Particle induced X-ray emission (PIXE) experiments have been carried out at the AGLAE facility (Paris) on several mural samples containing Maya blue from different Prehispanic archaeological sites (Cacaxtla, El Tajín, Tamuin, Santa Cecilia Acatitlán) and from several colonial convents in the Mexican plateau (Jiutepec, Totimehuacán, Tezontepec and Cuauhtinchán). Analysis of the concentrations of several elements permitted the extraction of some information on the technique used for painting the murals, usually fresco. Principal component analysis allowed the samples to be classified into groups. This grouping is discussed in relation to geographic and historic data.
Booth, Brandon D; Vilt, Steven G; McCabe, Clare; Jennings, G Kane
2009-09-01
This Article presents a quantitative comparison of the frictional performance for monolayers derived from n-alkanethiolates on gold and n-alkyl trichlorosilanes on silicon. Monolayers were characterized by pin-on-disk tribometry, contact angle analysis, ellipsometry, and electrochemical impedance spectroscopy (EIS). Pin-on-disk microtribometry provided frictional analysis at applied normal loads from 10 to 1000 mN at a speed of 0.1 mm/s. At low loads (10 mN), methyl-terminated n-alkanethiolate self-assembled monolayers (SAMs) exhibited a 3-fold improvement in coefficient of friction over SAMs with hydroxyl- or carboxylic-acid-terminated surfaces. For monolayers prepared from both n-alkanethiols on gold and n-alkyl trichlorosilanes on silicon, a critical chain length of at least eight carbons is required for beneficial tribological performance at an applied load of 9.8 mN. Evidence for disruption of chemisorbed alkanethiolate SAMs with chain lengths n
Cross sections for n+{sup 14}N from an R-matrix analysis of the {sup 15}N system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hale, G.M.; Young, P.G.; Chadwick, M.B.
1994-06-01
As part of the Hiroshima-Nagasaki Dose Reevaluation Program, a new evaluation of the neutron cross sections for {sup 14}N was made for ENDF/B-VI, based at energies below 2.5 MeV on a multichannel R-matrix analysis of reactions in the {sup 15}N system. The types of data used in the analysis, and the resulting cross sections and resonance structure for {sup 15}N, are briefly described. The resonant features of the neutron cross sections were especially well determined by including precise, high-resolution neutron total cross section measurements from ORNL. While the new evaluated cross sections appear to be significant improvements over the earlier ones, they still need to be tested more extensively against recent measurements of the differential elastic cross section from Oak Ridge.
Gu, Yingchun; Song, Yelin; Liu, Yufeng
2014-09-30
To explore the clinical characteristics and prognostic factors of pulmonary tuberculosis with concurrent lung cancer, comprehensive analyses were conducted of 58 pulmonary tuberculosis patients with lung cancer. Their clinical symptoms, signs and imaging results between January 1998 and January 2005 at Qingdao Chest Hospital were analyzed. The Kaplan-Meier method was used to calculate survival rates. Nine prognostic characteristics were analyzed: single-factor analysis was performed with the log-rank test and multi-factor analysis with the Cox regression model. The initial symptoms were cough, chest tightness, fever and hemoptysis. Chest radiology showed that the two diseases coexisted in the same lobe in 36 cases and in different lobes in 22 cases, with pulmonary nodules (n = 24), cavities (n = 19), infiltration (n = 8) and atelectasis (n = 7). Pathologically, there were squamous carcinoma (n = 33), adenocarcinoma (n = 17), small cell carcinoma (n = 4) and unidentified (n = 4) cases. The TNM stages were I (n = 13), II (n = 22), III (n = 16) and IV (n = 7). The median survival period was 24 months, and the 1-, 3- and 5-year survival rates were 65.5%, 65.5% and 29.0%, respectively. Single-factor analysis showed that lung cancer TNM stage (P = 0.000) and tuberculosis activity (P = 0.024) were significantly associated with patient prognosis, and multi-factor analysis showed that lung cancer TNM stage (RR = 2.629, 95%CI: 1.759-3.928, P = 0.000) and tuberculosis activity (RR = 1.885, 95%CI: 1.023-3.471, P = 0.042) were relatively independent prognostic factors. The clinical and radiological characteristics contribute jointly to the early diagnosis and therapy of tuberculosis with concurrent lung cancer, and the TNM stage of lung cancer and the activity of tuberculosis are major prognostic factors.
Characteristics of cellulose-microalgae composite
NASA Astrophysics Data System (ADS)
Hwang, Kyo-Jung; Kwon, Gu-Joong; Yang, Ji-Wook; Kim, Sung-yeol; Kim, Dae-Young
2017-10-01
The composites were prepared by mixing cellulose with N. commune, followed by a dissolution-regeneration procedure in aqueous LiOH/urea solution and freeze-drying. Before freeze-drying, the internal pores of the composites were solvent-exchanged with an organic solvent. SEM analysis showed that increasing the N. commune content results in blockage of the cellulose network structure. Brunauer-Emmett-Teller (BET) surface area analysis showed a decrease in mesopores and macropores as the N. commune ratio increased, along with a decrease in specific surface area. The composites exhibit thermogravimetric properties different from those of pure N. commune or cellulose. Fourier transform infrared spectroscopy (FT-IR) spectra of the composites contain peaks specific to both cellulose and N. commune, and increasing the N. commune ratio results in broadening of peaks relevant to proteins, lipids, and fatty acids. The composites showed higher adsorptivity as the N. commune ratio increased. Notably, the adsorptivity was higher than that of activated carbon during the first 120 minutes of adsorption. The composite is expected to be useful in situations that require rapid adsorption.
Maeda, Koki; Toyoda, Sakae; Shimojima, Ryosuke; Osada, Takashi; Hanajima, Dai; Morioka, Riki; Yoshida, Naohiro
2010-01-01
A molecular analysis of betaproteobacterial ammonia oxidizers and a N2O isotopomer analysis were conducted to study the sources of N2O emissions during the cow manure composting process. Much NO2−-N and NO3−-N and the Nitrosomonas europaea-like amoA gene were detected at the surface, especially at the top of the composting pile, suggesting that these ammonia-oxidizing bacteria (AOB) significantly contribute to the nitrification which occurs at the surface layer of compost piles. However, the 15N site preference within the asymmetric N2O molecule (SP = δ15Nα − δ15Nβ, where 15Nα and 15Nβ represent the 15N/14N ratios at the center and end sites of the nitrogen atoms, respectively) indicated that the source of N2O emissions just after the compost was turned originated mainly from the denitrification process. Based on these results, the reduction of accumulated NO2−-N or NO3−-N after turning was identified as the main source of N2O emissions. The site preference and bulk δ15N results also indicate that the rate of N2O reduction was relatively low, and an increased value for the site preference indicates that the nitrification which occurred mainly in the surface layer of the pile partially contributed to N2O emissions between the turnings. PMID:20048060
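The isotopomer bookkeeping defined above (SP = δ15Nα − δ15Nβ) can be made explicit in a few lines. This is a minimal sketch: the atmospheric-N2 reference ratio is the standard value, but the delta values in the example are illustrative, not measurements from this study.

```python
R_AIR_N2 = 0.0036765  # 15N/14N of atmospheric N2, the delta-notation standard

def delta15N(ratio):
    """Convert a 15N/14N isotope ratio to per-mil delta notation."""
    return (ratio / R_AIR_N2 - 1.0) * 1000.0

def site_preference(delta_alpha, delta_beta):
    """SP = d15N(alpha) - d15N(beta): central minus end N site of N2O."""
    return delta_alpha - delta_beta
```

A large positive SP is the signature used above to attribute N2O to nitrification (NH2OH oxidation), while values near zero point to denitrification.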
Quaternion normalization in spacecraft attitude determination
NASA Technical Reports Server (NTRS)
Deutschmann, J.; Markley, F. L.; Bar-Itzhack, Itzhack Y.
1993-01-01
Attitude determination of spacecraft usually utilizes vector measurements such as Sun, center of Earth, star, and magnetic field direction to update the quaternion which determines the spacecraft orientation with respect to some reference coordinates in the three dimensional space. These measurements are usually processed by an extended Kalman filter (EKF) which yields an estimate of the attitude quaternion. Two EKF versions for quaternion estimation were presented in the literature; namely, the multiplicative EKF (MEKF) and the additive EKF (AEKF). In the multiplicative EKF, it is assumed that the error between the correct quaternion and its a-priori estimate is, by itself, a quaternion that represents the rotation necessary to bring the attitude which corresponds to the a-priori estimate of the quaternion into coincidence with the correct attitude. The EKF basically estimates this quotient quaternion and then the updated quaternion estimate is obtained by the product of the a-priori quaternion estimate and the estimate of the difference quaternion. In the additive EKF, it is assumed that the error between the a-priori quaternion estimate and the correct one is an algebraic difference between two four-tuple elements and thus the EKF is set to estimate this difference. The updated quaternion is then computed by adding the estimate of the difference to the a-priori quaternion estimate. If the quaternion estimate converges to the correct quaternion, then, naturally, the quaternion estimate has unity norm. This fact was utilized in the past to obtain superior filter performance by applying normalization to the filter measurement update of the quaternion. It was observed for the AEKF that when the attitude changed very slowly between measurements, normalization merely resulted in a faster convergence; however, when the attitude changed considerably between measurements, without filter tuning or normalization, the quaternion estimate diverged. 
However, when the quaternion estimate was normalized, the estimate converged faster and to a lower error than with tuning only. In last year's symposium we presented three new AEKF normalization techniques and compared them to the brute-force method presented in the literature. The present paper addresses the issue of normalization of the MEKF and examines several MEKF normalization techniques.
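The normalization step discussed throughout amounts to projecting the updated estimate back onto the unit sphere in four dimensions. A minimal sketch of this brute-force normalization (not one of the specific filter techniques compared in the paper):

```python
import math

def normalize_quaternion(q):
    """Brute-force normalization: divide a quaternion estimate
    (q1, q2, q3, q4) by its Euclidean norm so that ||q|| = 1."""
    n = math.sqrt(sum(c * c for c in q))
    if n == 0.0:
        raise ValueError("zero quaternion cannot be normalized")
    return tuple(c / n for c in q)
```

In a filter, this would be applied to the a-posteriori quaternion after each measurement update, restoring the unit-norm property that a valid attitude quaternion must satisfy.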
Exploratory Analysis of Supply Chains in the Defense Industrial Base
2012-04-01
Instruments Industry Group 382: Laboratory Apparatus and Analytical, Optical, Measuring, and Controlling Instruments; 3821 Laboratory Apparatus and Furniture ... Institute for Defense Analyses, "Exploratory Analysis of Supply Chains in the Defense Industrial Base," James R. Dominy ... prepared under contract DASW01-04-C-0003, AH-7-3315, for the Director, Industrial Policy.
Market Analysis for Nondevelopmental Items
1992-02-01
Defense Standardization Program, "Market Analysis for Nondevelopmental Items," February 1992. ... market analysis, that task would be much more difficult. This brochure proposes a generic approach to market analysis that can be tailored to a wide ... Distribution Statement A per telecon, Greg Saunders, OASD(P&L)PR/MM, Washington, DC 20301-8000, 6/30/92.
Hanf, Stefan; Keiner, Robert; Yan, Di; Popp, Jürgen; Frosch, Torsten
2014-06-03
Versatile multigas analysis bears high potential for environmental sensing of climate-relevant gases and noninvasive early-stage diagnosis of disease states in human breath. In this contribution, a fiber-enhanced Raman spectroscopic (FERS) analysis of a suite of climate-relevant atmospheric gases is presented, which allowed for reliable quantification of CH4, CO2, and N2O alongside N2 and O2 in a single measurement. A highly improved analytical sensitivity was achieved, down to a sub-parts-per-million limit of detection, with a high dynamic range of 6 orders of magnitude and within a one-second measurement time. The high potential of FERS for the detection of disease markers was demonstrated with the analysis of 27 nL of exhaled human breath. The natural isotopes (13)CO2 and (14)N(15)N were quantified at low levels, simultaneously with the major breath components N2, O2, and (12)CO2. The natural abundances of (13)CO2 and (14)N(15)N were experimentally quantified in very good agreement with theoretical values. A fiber adapter assembly and gas filling setup was designed for rapid and automated analysis of multigas compositions and their fluctuations within seconds and without the need for optical readjustment of the sensor arrangement. On the basis of the abilities of such a miniaturized FERS system, we expect high potential for the diagnosis of clinically administered (13)C-labeled CO2 in human breath and also foresee high impact for disease detection via biologically vital nitrogen compounds.
NASA Astrophysics Data System (ADS)
Liu, Y.; Gao, B.; Gong, M.
2017-06-01
In this paper, we propose using a step-heterojunction emitter spacer (SHES) and an InGaN sub-quantum well in AlGaN/GaN/AlGaN double-barrier resonant tunnelling diodes (RTDs). A theoretical analysis of the RTD with SHES and InGaN sub-quantum well is presented, which indicates that the negative differential resistance (NDR) characteristic is improved. The simulation results (peak current density JP = 82.67 mA/μm², peak-to-valley current ratio PVCR = 3.38, and intrinsic negative differential resistance RN = -0.147 Ω at room temperature) verified the improvement of the NDR characteristic brought about by the SHES and InGaN sub-quantum well. Both the theoretical analysis and the simulation results showed that the device performance, especially the average oscillator output power, improved greatly, reaching 2.77 mW/μm². The resistive cut-off frequency would also benefit from the relatively small RN. Our work provides an important alternative to current approaches in designing new-structure GaN-based RTDs for practical high-frequency and high-power applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Casperson, R. J.; Burke, J. T.; Hughes, R. O.
Directly measuring (n,2n) cross sections on short-lived actinides presents a number of experimental challenges. The surrogate reaction technique is an experimental method for measuring cross sections on short-lived isotopes, and it provides a unique solution for measuring (n,2n) cross sections. This technique involves measuring a charged-particle reaction cross section, where the reaction populates the same compound nucleus as the reaction of interest. To perform these surrogate (n,2n) cross section measurements, a silicon telescope array has been placed along a beam line at the Texas A&M University Cyclotron Institute, surrounded by a large tank of gadolinium-doped liquid scintillator, which acts as a neutron detector. The combination of the charged-particle and neutron-detector arrays is referred to as NeutronSTARS. In the analysis procedure for calculating the (n,2n) cross section, the neutron detection efficiency and time structure play an important role. Due to the lack of availability of isotropic, mono-energetic neutron sources, modeling is an important component in establishing this efficiency and time structure. This report describes the NeutronSTARS array, which was designed and commissioned during this project. It also describes the surrogate reaction technique, specifically referencing a 235U(n,2n) commissioning measurement that was fielded during the past year. Advanced multiplicity analysis techniques have been developed for this work, which should allow for efficient analysis of 241Pu(n,2n) and 239Pu(n,2n) cross section measurements.
Affective, Normative, and Continuance Commitment Levels across Cultures: A Meta-Analysis
ERIC Educational Resources Information Center
Meyer, John P.; Stanley, David J.; Jackson, Timothy A.; McInnis, Kate J.; Maltin, Elyse R.; Sheppard, Leah
2012-01-01
With increasing globalization of business and diversity within the workplace, there has been growing interest in cultural differences in employee commitment. We used meta-analysis to compute mean levels of affective (AC; K=966, N=433,129), continuance (CC; K=428, N=199,831), and normative (NC; K=336, N=133,277) organizational commitment for as…
USDA-ARS?s Scientific Manuscript database
Soil properties and weather conditions are known to affect soil nitrogen (N) availability and plant N uptake. However, studies examining N response as affected by soil and weather sometimes give conflicting results. Meta-analysis is a statistical method for estimating treatment effects in a series o...
Li, Jian-Gang; Shen, Min-Chong; Hou, Jin-Feng; Li, Ling; Wu, Jun-Xia; Dong, Yuan-Hua
2016-04-28
Pyrosequencing-based analyses revealed significant effects among low (N50), medium (N80), and high (N100) fertilization on community composition involving a long-term monoculture of lettuce in a greenhouse in both summer and winter. The non-fertilized control (CK) treatment was characterized by a higher relative abundance of Actinobacteria, Acidobacteria, and Chloroflexi; however, the average abundance of Firmicutes typically increased in summer, and the relative abundance of Bacteroidetes increased in winter in the N-fertilized treatments. Principal component analysis showed that the distribution of the microbial community was separated by a N gradient, with N80 and N100 in the same group in the summer samples, while CK and N50 were in the same group in the winter samples, with the other N-level treatments existing independently. Redundancy analysis revealed that available N, NO3(-)-N, and NH4(+)-N were the main environmental factors affecting the distribution of the bacterial community. Correlation analysis showed that nitrogen affected the shifts of microbial communities by strongly driving those of Firmicutes, Bacteroidetes, and Proteobacteria in summer samples, and Bacteroidetes, Actinobacteria, and Acidobacteria in winter samples. The study demonstrates a novel example of rhizosphere bacterial diversity and the main factors influencing the rhizosphere microbial community in continuous vegetable cropping within an intensive greenhouse ecosystem.
Nitrogen fractionation in high-mass star-forming cores across the Galaxy
NASA Astrophysics Data System (ADS)
Colzi, L.; Fontani, F.; Rivilla, V. M.; Sánchez-Monge, A.; Testi, L.; Beltrán, M. T.; Caselli, P.
2018-04-01
The fractionation of nitrogen (N) in star-forming regions is a poorly understood process. To put more stringent observational constraints on N-fractionation, we have observed with the IRAM-30m telescope a large sample of 66 cores in massive star-forming regions. We targeted the (1-0) rotational transitions of HN13C, H15NC, H13CN and HC15N, and derived the 14N/15N ratio for both HCN and HNC. We have combined this sample with that already observed by Colzi et al. (2018), and thus analysed a total sample of 87 sources. The 14N/15N ratios are distributed around the Proto-Solar Nebula value, with a lower limit near the terrestrial atmosphere value (˜272). We have also derived the 14N/15N ratio as a function of Galactocentric distance and deduced a linear trend based on unprecedented statistics. The Galactocentric dependences that we have found are consistent in slope with past works, but we have found a new local 14N/15N value of ˜400, i.e. closer to the Proto-Solar Nebula value. A second analysis was done, and a parabolic Galactocentric trend was found. Comparison with Galactic chemical evolution models shows that the slope up to 8 kpc is consistent with the linear analysis, while the flattening trend above 8 kpc is well reproduced by the parabolic analysis.
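The 14N/15N ratios discussed above are commonly derived with the double-isotope method: the measured H13CN/HC15N column-density ratio is scaled by an assumed 12C/13C ratio. The sketch below uses an often-quoted linear 12C/13C Galactocentric gradient; the gradient coefficients and the column densities in the example are illustrative assumptions, not values taken from this paper.

```python
# Double-isotope method sketch for 14N/15N from 13C-isotopologue lines.
def carbon_ratio(d_gc_kpc, slope=6.2, intercept=18.7):
    """Assumed linear 12C/13C gradient with Galactocentric distance
    (coefficients are illustrative, roughly literature-like values)."""
    return slope * d_gc_kpc + intercept

def nitrogen_ratio(n_h13cn, n_hc15n, d_gc_kpc):
    """14N/15N = (12C/13C) * N(H13CN) / N(HC15N), assuming optically
    thin emission for both rare isotopologues."""
    return carbon_ratio(d_gc_kpc) * n_h13cn / n_hc15n
```

The same scaling applies to the HNC pair (HN13C/H15NC). Because the result inherits the assumed 12C/13C gradient, uncertainty in that gradient propagates directly into the derived 14N/15N trend.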
NASA Astrophysics Data System (ADS)
Li, Jian-Gang; Shen, Min-Chong; Hou, Jin-Feng; Li, Ling; Wu, Jun-Xia; Dong, Yuan-Hua
2016-04-01
Pyrosequencing-based analyses revealed significant effects among low (N50), medium (N80), and high (N100) fertilization on community composition involving a long-term monoculture of lettuce in a greenhouse in both summer and winter. The non-fertilized control (CK) treatment was characterized by a higher relative abundance of Actinobacteria, Acidobacteria, and Chloroflexi; however, the average abundance of Firmicutes typically increased in summer, and the relative abundance of Bacteroidetes increased in winter in the N-fertilized treatments. Principal component analysis showed that the distribution of the microbial community was separated by a N gradient, with N80 and N100 in the same group in the summer samples, while CK and N50 were in the same group in the winter samples, with the other N-level treatments existing independently. Redundancy analysis revealed that available N, NO3--N, and NH4+-N were the main environmental factors affecting the distribution of the bacterial community. Correlation analysis showed that nitrogen affected the shifts of microbial communities by strongly driving those of Firmicutes, Bacteroidetes, and Proteobacteria in summer samples, and Bacteroidetes, Actinobacteria, and Acidobacteria in winter samples. The study demonstrates a novel example of rhizosphere bacterial diversity and the main factors influencing the rhizosphere microbial community in continuous vegetable cropping within an intensive greenhouse ecosystem.
Assessment of fetal sex chromosome aneuploidy using directed cell-free DNA analysis.
Nicolaides, Kypros H; Musci, Thomas J; Struble, Craig A; Syngelaki, Argyro; Gil, M M
2014-01-01
To examine the performance of chromosome-selective sequencing of cell-free (cf) DNA in maternal blood for assessment of fetal sex chromosome aneuploidies. This was a case-control study of 177 stored maternal plasma samples, obtained before fetal karyotyping at 11-13 weeks of gestation, from 59 singleton pregnancies with fetal sex chromosome aneuploidies (45,X, n = 49; 47,XXX, n = 6; 47,XXY, n = 1; 47,XYY, n = 3) and 118 with euploid fetuses (46,XY, n = 59; 46,XX, n = 59). Digital analysis of selected regions (DANSR™) on chromosomes 21, 18, 13, X and Y was performed and the fetal-fraction optimized risk of trisomy evaluation (FORTE™) algorithm was used to estimate the risk for non-disomic genotypes. Performance was calculated at a risk cut-off of 1:100. Analysis of cfDNA provided risk scores for 172 (97.2%) samples; 4 samples (45,X, n = 2; 46,XY, n = 1; 46,XX, n = 1) had an insufficient fetal cfDNA fraction for reliable testing and 1 case (47,XXX) failed laboratory quality control metrics. The classification was correct in 43 (91.5%) of 47 cases of 45,X, all 5 of 47,XXX, 1 of 47,XXY and 3 of 47,XYY. There were no false-positive results for monosomy X. Analysis of cfDNA by chromosome-selective sequencing can correctly classify fetal sex chromosome aneuploidy with reasonably high sensitivity. © 2013 S. Karger AG, Basel.
Nishikaze, Takashi; Kaneshiro, Kaoru; Kawabata, Shin-ichirou; Tanaka, Koichi
2012-11-06
Negative-ion fragmentation of underivatized N-glycans has been proven to be more informative than positive-ion fragmentation. Fluorescent labeling via reductive amination is often employed for glycan analysis, but little is known about the influence of the labeling group on negative-ion fragmentation. We previously demonstrated that the on-target glycan-labeling method using 3-aminoquinoline/α-cyano-4-hydroxycinnamic acid (3AQ/CHCA) liquid matrix enables highly sensitive, rapid, and quantitative N-glycan profiling analysis. The current study investigates the suitability of 3AQ-labeled N-glycans for structural analysis based on negative-ion collision-induced dissociation (CID) spectra. 3AQ-labeled N-glycans exhibited simple and informative CID spectra similar to those of underivatized N-glycans, with product ions due to cross-ring cleavages of the chitobiose core and ions specific to the two antennae (D and E ions). The interpretation of diagnostic fragment ions suggested for underivatized N-glycans could be directly applied to the 3AQ-labeled N-glycans. However, N-glycans fluorescently labeled by conventional reductive amination, such as 2-aminobenzamide (2AB)- and 2-pyridylamine (2PA)-labeled N-glycans, exhibited complicated CID spectra consisting of numerous signals formed by dehydration and multiple cleavages. The complicated spectra of 2AB- and 2PA-labeled N-glycans were found to be due to their open reducing-terminal N-acetylglucosamine (GlcNAc) ring, rather than structural differences in the labeling group of the N-glycan derivative. Finally, as an example, the on-target 3AQ labeling method followed by negative-ion CID was applied to structurally analyze neutral N-glycans released from human epidermal growth factor receptor type 2 (HER2) protein. The glycan-labeling method using the 3AQ-based liquid matrix should facilitate highly sensitive quantitative and qualitative analyses of glycans.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gustafson, F.W.; Todd, M.E.
1993-09-01
The release of large volumes of water to waste disposal cribs at the Hanford Site's 100-N Area caused contaminants, principally strontium-90, to be carried toward the Columbia River through the groundwater. Since shutdown of the N Reactor, these releases have been discontinued, although small water flows continue to be discharged to the 1325-N crib. Most of the contamination now transported to the river moves with the natural groundwater flow. The contaminated groundwater at N Springs flows into the river through seeps and springs along the river's edge. An expedited response action (ERA) has been proposed to eliminate or restrict the flux of strontium-90 into the river. A cost-benefit analysis of potential remedial alternatives was completed that recommends the alternative which best meets selection criteria prescribed by the Comprehensive Environmental Response, Compensation, and Liability Act (CERCLA). The methodology used for evaluation, cost analysis, and alternative recommendation is the engineering evaluation/cost analysis (EE/CA). Complete remediation of the contaminated groundwater beneath the 100-N Area was not a principal objective of the analysis. The objective of the cost-benefit analysis was to identify a remedial alternative that optimizes the degree of benefit produced for the costs incurred.
Nutrient analysis of the Beef Alternative Merchandising cuts.
Desimone, T L; Acheson, R A; Woerner, D R; Engle, T E; Douglass, L W; Belk, K E
2013-03-01
The objective of this study was to generate raw and cooked nutrient composition data to identify Quality Grade differences in proximate values for eight Beef Alternative Merchandising (BAM) cuts. The data generated will be used to update the nutrient data in the USDA National Nutrient Database for Standard Reference (SR). Beef Rib, Oven-Prepared; Beef Loin, Strip Loin; and Beef Loin, Top Sirloin Butt subprimals were collected from a total of 24 carcasses from four packing plants. The carcasses were a combination of USDA Yield Grades 2 (n=12) and 3 (n=12); USDA Quality Grades upper two-thirds Choice (n=8), low Choice (n=8), and Select (n=8); and two genders, steer (n=16) and heifer (n=8). After aging, subprimals were fabricated into the BAM cuts, dissected, and nutrient analysis was performed. Samples from each animal were homogenized and composited for analysis of the following: proximate analysis, long-chain and trans-fatty acids, conjugated linoleic acid, total cholesterol, vitamin B-12, and selenium. This study identified seven BAM cuts from all three Quality Grades that qualify for USDA Lean; seven Select cuts that qualify for USDA Extra Lean; and three Select cuts that qualify for the American Heart Association's Heart Healthy Check. Copyright © 2012 Elsevier Ltd. All rights reserved.
Mukhtar, Hussnain; Lin, Yu-Pin; Shipin, Oleg V.; Petway, Joy R.
2017-01-01
This study presents an approach for obtaining realization sets of parameters for nitrogen removal in a pilot-scale waste stabilization pond (WSP) system. The proposed approach was designed for optimal parameterization, local sensitivity analysis, and global uncertainty analysis of a dynamic simulation model of the WSP, using the R software package Flexible Modeling Environment (R-FME) with the Markov chain Monte Carlo (MCMC) method. Additionally, generalized likelihood uncertainty estimation (GLUE) was integrated into the FME to evaluate the major parameters that affect the simulation outputs in the study WSP. Comprehensive modeling analysis was used to simulate and assess nine parameters and the concentrations of ON-N, NH3-N and NO3-N. Results indicate that the integrated FME-GLUE-based model, with good Nash–Sutcliffe coefficients (0.53–0.69) and correlation coefficients (0.76–0.83), successfully simulates the concentrations of ON-N, NH3-N and NO3-N. Moreover, the Arrhenius constant was the only parameter to which model performance was sensitive in the ON-N and NH3-N simulations. However, the Nitrosomonas growth rate, the denitrification constant, and the maximum growth rate at 20 °C were sensitive in the ON-N and NO3-N simulations, as measured by global sensitivity analysis. PMID:28704958
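The GLUE screening described above can be sketched generically: draw parameter sets at random, score each simulation against observations with the Nash-Sutcliffe efficiency, and keep the "behavioural" sets whose score exceeds a threshold. The one-parameter first-order decay model below is a stand-in for the WSP nitrogen model, and all numbers are illustrative.

```python
import random

def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, <=0 is no better
    than predicting the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def decay_model(k, n0, steps):
    """Stand-in model: first-order decay of an N concentration."""
    return [n0 * (1 - k) ** t for t in range(steps)]

def glue_screen(obs, n0, n_samples=2000, threshold=0.5, seed=1):
    """Keep (parameter, score) pairs whose efficiency clears the
    behavioural threshold."""
    random.seed(seed)
    behavioural = []
    for _ in range(n_samples):
        k = random.uniform(0.0, 0.5)       # uniform prior on the rate
        ns = nash_sutcliffe(obs, decay_model(k, n0, len(obs)))
        if ns >= threshold:
            behavioural.append((k, ns))
    return behavioural
```

The spread of the retained parameter values is the GLUE uncertainty estimate; a narrow spread marks a parameter the outputs are sensitive to.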
Morrill, K M; Conrad, E; Polo, J; Lago, A; Campbell, J; Quigley, J; Tyler, H
2012-07-01
Our objectives were to evaluate the use of refractometry as a means of estimating immunoglobulin G (IgG) concentration of bovine maternal colostrum (MC) and determine if fractionation of MC using caprylic acid (CA) improved estimates of IgG. Samples (n=85) of MC were collected from a single dairy in California and used to determine the method of CA fraction that produced the best prediction of IgG based on CA fractionation followed by refractometry. Subsequently, samples of MC (n=827) were collected from 67 farms in 12 states to compare refractometry with or without CA fractionation as methods to estimate IgG concentration. Samples were collected from the feeding pool and consisted of fresh (n=196), previously frozen (n=479), or refrigerated (n=152) MC. Samples were further classified by the number freeze-thaw cycles before analysis. Fractionation with CA was conducted by adding 1 mL of MC to a tube containing 75 μL of CA and 1 mL of 0.06 M acetic acid. The tube was shaken and allowed to react for 1 min. Refractive index of the IgG-rich supernatant (nDf) was determined using a digital refractometer. Whole, nonfractionated MC was analyzed for IgG by radial immunodiffusion (RID) and refractive index (nDw). The relationship between nDf and IgG (r=0.53; n=805) was weak, whereas that between nDw and IgG was stronger (r=0.73; n=823). Fresh samples analyzed by refractometry that subsequently went through 1 freeze-thaw cycle before RID analysis resulted in the strongest relationship between IgG and nDf or nDw (r=0.93 and 0.90, respectively). The MC samples collected fresh on the farm but frozen 2 or more times before analysis by refractometry or RID had low correlations between IgG and nDf and nDw (r=0.09 and 0.01). Samples refrigerated or frozen on the farm before analysis had weaker relationships between RID and nDf or nDw (r=0.38 to 0.80), regardless of the number of freeze-thaw cycles. Breed and lactation number did not affect the accuracy of either test. 
These results indicated that refractometry, with or without CA fractionation, was an accurate and rapid method to determine IgG concentration when samples of MC were not stored before refractometry and were frozen only once before RID analysis. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
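The r values reported above are plain Pearson correlations between refractive index and RID-measured IgG. A minimal from-scratch sketch; the paired readings in the test are made up for illustration, not the study's data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Values near +1 (as for fresh samples, r = 0.90-0.93) indicate the refractometer reading tracks IgG tightly; values near 0 (repeatedly frozen samples) indicate it carries almost no information.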
Relative contributions of three descriptive methods: implications for behavioral assessment.
Pence, Sacha T; Roscoe, Eileen M; Bourret, Jason C; Ahearn, William H
2009-01-01
This study compared the outcomes of three descriptive analysis methods (the ABC method, the conditional probability method, and the conditional and background probability method) to each other and to the results obtained from functional analyses. Six individuals who had been diagnosed with developmental delays and exhibited problem behavior participated. Functional analyses indicated that participants' problem behavior was maintained by social positive reinforcement (n = 2), social negative reinforcement (n = 2), or automatic reinforcement (n = 2). Results showed that for all but 1 participant, descriptive analysis outcomes were similar across methods. In addition, for all but 1 participant, the descriptive analysis outcome differed substantially from the functional analysis outcome. This supports the general finding that descriptive analysis is a poor means of determining functional relations.
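The conditional and background probability methods compared above reduce to simple event bookkeeping over observation intervals: P(consequence | behavior) versus the unconditional P(consequence). The interval records below are invented for illustration and are not the study's observation data.

```python
def conditional_probability(intervals, behavior, consequence):
    """P(consequence | behavior): fraction of intervals containing the
    behavior that also contain the consequence. intervals is a list of
    sets of event labels recorded per observation interval."""
    with_behavior = [iv for iv in intervals if behavior in iv]
    if not with_behavior:
        return 0.0
    return sum(consequence in iv for iv in with_behavior) / len(with_behavior)

def background_probability(intervals, consequence):
    """Unconditional P(consequence) across all intervals."""
    return sum(consequence in iv for iv in intervals) / len(intervals)
```

When the conditional probability exceeds the background probability, the descriptive analysis suggests a possible contingency; as the abstract notes, such suggestions often disagree with experimentally derived functional relations.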
Regional assessment of NLEAP NO3-N leaching indices
Wylie, B.K.; Shaffer, M.J.; Hall, M.D.
1995-01-01
Nonpoint source groundwater contamination by nitrate nitrogen (NO3-N) leached from agricultural lands can be substantial and can increase health risks to humans and animals. Accurate and rapid methods are needed to identify and map localities that have a high potential for contamination of shallow aquifers by NO3-N leached from agriculture. Evaluation of Nitrate Leaching and Economic Analysis Package (NLEAP) indices and input variables across an irrigated agricultural area on an alluvial aquifer in Colorado indicated that all leaching indices tested were more strongly correlated with aquifer NO3-N concentration than with aquifer N mass. Of the indices and variables tested, the NO3-N Leached (NL) index was the NLEAP index most strongly associated with groundwater NO3-N concentration (r2 values from 0.37 to 0.39). The NO3-N concentration of the leachate was less well correlated with groundwater NO3-N concentration (r2 values from 0.21 to 0.22). Stepwise regression analysis indicated that, although the inorganic and organic/inorganic fertilizer scenarios had similar r2 values, the Feedlot Indicator (proximity) variable was significant over and above the NO3-N Leached index for the inorganic scenario. The analysis also showed that combining either the Movement Risk Index (MIRI) or the NO3-N concentration of the leachate with the NO3-N Leached index leads to an improved regression, which provides insight into area-wide associations between agricultural activities and groundwater NO3-N concentration.
Qian, Fang; Brewster, Megan; Lim, Sung K; Ling, Yichuan; Greene, Christopher; Laboutin, Oleg; Johnson, Jerry W; Gradečak, Silvija; Cao, Yu; Li, Yat
2012-06-13
We report the controlled synthesis of AlN/GaN multi-quantum well (MQW) radial nanowire heterostructures by metal-organic chemical vapor deposition. The structure consists of a single-crystal GaN nanowire core and an epitaxially grown (AlN/GaN)(m) (m = 3, 13) MQW shell. Optical excitation of individual MQW nanowires yielded strong, blue-shifted photoluminescence in the range 340-360 nm, with respect to the GaN near band-edge emission at 368.8 nm. Cathodoluminescence analysis on the cross-sectional MQW nanowire samples showed that the blue-shifted ultraviolet luminescence originated from the GaN quantum wells, while the defect-associated yellow luminescence was emitted from the GaN core. Computational simulation provided a quantitative analysis of the mini-band energies in the AlN/GaN superlattices and suggested the observed blue-shifted emission corresponds to the interband transitions between the second subbands of GaN, as a result of quantum confinement and strain effect in these AlN/GaN MQW nanowire structures.
NASA Astrophysics Data System (ADS)
Ogura, Kenji; Okamura, Hideyasu
2013-10-01
Growth factor receptor-bound protein 2 (Grb2) is a small adapter protein composed of a single SH2 domain flanked by two SH3 domains. The N-terminal SH3 (nSH3) domain of Grb2 binds a proline-rich region present in the guanine nucleotide releasing factor, son of sevenless (Sos). Using NMR relaxation dispersion and chemical shift analysis methods, we investigated the conformational change of the Sos-derived proline-rich peptide during the transition between the free and Grb2 nSH3-bound states. The chemical shift analysis revealed that the peptide does not present a fully random conformation but has a relatively rigid structure. The relaxation dispersion analysis detected conformational exchange of several residues of the peptide upon binding to Grb2 nSH3.
1985-06-01
12. It was stated that analysis of the gaseous products showed that they consisted of N2O, NO, N2, CO, CO2, F2CO and traces of N... The products of ... IR, UV and mass spectrometry. These were (yields summarized in Table 1) as follows: No. 1: N2O, NO, CO2, CO, HCN, CH2O, and H2O. NO2 and a trace ... Ramirez, "Reaction of Gem-Nitronitroso Compounds with Triethyl Phosphite," Tetrahedron, Vol. 29, p. 4195, 1973. J. Jappy and P.N. Preston
Zhang, Xianyu; Kim, Jin Seuk; Kwon, Younghwan
2017-04-01
Here we describe the synthesis of polyurethane (PU)-based energetic nanocomposites loaded with nano-aluminum (n-Al) particles. The energetic nanocomposite was prepared by the polyurethane reaction of poly(glycidyl azide-co-tetramethylene glycol) (PGT) prepolymers and IPDI/N-100 isocyanates, with a simultaneous catalyst-free azide-alkyne click reaction, in the presence of n-Al. An initial study was carried out with various n-Al/fluorinated PGT blends and demonstrated the potential of fluorinated PGT prepolymers for an energetic PU matrix. Thermal analysis of the n-Al/fluorinated PGT-based PU energetic nanocomposite was performed using DSC and TGA.
Ali, Muhammad; Rathnayake, Rathnayake M L D; Zhang, Lei; Ishii, Satoshi; Kindaichi, Tomonori; Satoh, Hisashi; Toyoda, Sakae; Yoshida, Naohiro; Okabe, Satoshi
2016-10-01
The nitrous oxide (N2O) production pathway in a single-stage nitritation-anammox sequencing batch reactor (SBR) was investigated based on a multilateral approach including real-time N2O monitoring, N2O isotopic composition analysis, and in-situ analyses of the spatial distribution of N2O production rate and microbial populations in granular biomass. The N2O emission rate was high in the initial phase of the operation cycle and gradually decreased with decreasing NH4(+) concentration. The average emission of N2O was 0.98 ± 0.42% and 1.35 ± 0.72% of the incoming nitrogen load and removed nitrogen, respectively. The N2O isotopic composition analysis revealed that N2O was produced via NH2OH oxidation and NO2(-) reduction pathways equally, although there is an unknown influence from N2O reduction and/or anammox N2O production. However, the N2O isotopomer analysis could not discriminate the relative contribution of nitrifier denitrification and heterotrophic denitrification in the NO2(-) reduction pathway. Various in-situ techniques (e.g. microsensor measurements and FISH (fluorescence in-situ hybridization) analysis) were therefore applied to further identify N2O producers. Microsensor measurements revealed that approximately 70% of N2O was produced in the oxic surface zone, where nitrifiers were predominantly localized. Thus, NH2OH oxidation and NO2(-) reduction by nitrifiers (nitrifier denitrification) could be responsible for the N2O production in the oxic zone. The rest of the N2O (ca. 30%) was produced in the anammox bacteria-dominated anoxic zone, suggesting that NO2(-) reduction by coexisting putative heterotrophic denitrifiers and some other unknown pathway(s), including the possibility of the anammox process, account for the anaerobic N2O production. Further study is required to identify the anaerobic N2O production pathways. Our multilateral approach can be useful to quantitatively examine the relative contributions of N2O production pathways.
A good understanding of the key N2O production pathways is essential to establish a strategy to mitigate N2O emission from biological nitrogen removal processes.
Heating and Cooling Master Plan for Fort Bragg, NC, Fiscal Years 2005 to 2030
2009-02-01
Pacific Northwest National Laboratory (PNNL) performed an energy use analysis on these buildings in 2005. These buildings are located in 33 specific...electrical equipment the electrical use is provided in the PNNL analysis.* The central chilled water system energy use is reported in cooling chilled...February 2009, Construction Engineering Research Laboratory. Approved for public release; distribution is unlimited.
NASA Astrophysics Data System (ADS)
Khani, S.; Montazerozohori, M.; Masoudiasl, A.; White, J. M.
2018-02-01
A new manganese(II) coordination polymer, [MnL2 (μ-1,3-N3)2]n, with co-ligands including an azide anion and a Schiff base based on isonicotinoylhydrazone, has been synthesized and characterized. The crystal structure determination shows that the azide ligand acts as an end-to-end (EE) bridging ligand and generates a one-dimensional coordination polymer. In this compound, each manganese(II) metal center is hexa-coordinated by four azide nitrogens and two pyridinic nitrogens, forming an octahedral geometry. The analysis of crystal packing indicates that the 1D chain of [MnL2 (μ-1,3-N3)2]n is stabilized as a 3D supramolecular network by intra- and inter-chain intermolecular interactions of X-H···Y (X = N and C, Y = O and N). Hirshfeld surface analysis and 2D fingerprint plots have been used for a more detailed investigation of the intermolecular interactions. Also, natural bond orbital (NBO) analysis was performed to get information about atomic charge distributions, hybridizations and the strength of interactions. Finally, thermal analysis of the compound showed its complete decomposition in three thermal steps.
Cattle genome-wide analysis reveals genetic signatures in trypanotolerant N'Dama.
Kim, Soo-Jin; Ka, Sojeong; Ha, Jung-Woo; Kim, Jaemin; Yoo, DongAhn; Kim, Kwondo; Lee, Hak-Kyo; Lim, Dajeong; Cho, Seoae; Hanotte, Olivier; Mwai, Okeyo Ally; Dessie, Tadelle; Kemp, Stephen; Oh, Sung Jong; Kim, Heebal
2017-05-12
Indigenous cattle in Africa have adapted to various local environments to acquire superior phenotypes that enhance their survival under harsh conditions. While many studies have investigated the adaptation of African cattle overall, the genetic characteristics of individual breeds have been poorly studied. We performed a comparative genome-wide analysis to assess evidence for subspeciation within the species at the genetic level in trypanotolerant N'Dama cattle. We analysed genetic variation patterns in N'Dama from the genomes of 101 cattle, including 48 samples of five indigenous African cattle breeds and 53 samples of various commercial breeds. Analysis of SNP variances between cattle breeds using wMI, XP-CLR, and XP-EHH detected genes containing N'Dama-specific genetic variants and their potential associations. Functional annotation analysis revealed that these genes are associated with ossification and the neurological and immune systems. In particular, the genes involved in bone formation indicate that local adaptation of N'Dama may involve skeletal growth as well as immune function. Our results imply that N'Dama might have acquired distinct genotypes associated with growth and regulation of regional diseases including trypanosomiasis. Moreover, this study offers significant insights into identifying genetic signatures of natural and artificial selection in diverse African cattle breeds.
Bystrykh, L V; Vonck, J; van Bruggen, E F; van Beeumen, J; Samyn, B; Govorukhina, N I; Arfman, N; Duine, J A; Dijkhuizen, L
1993-01-01
The quaternary protein structure of two methanol:N,N'-dimethyl-4-nitrosoaniline (NDMA) oxidoreductases purified from Amycolatopsis methanolica and Mycobacterium gastri MB19 was analyzed by electron microscopy and image processing. The enzymes are decameric proteins (displaying fivefold symmetry) with estimated molecular masses of 490 to 500 kDa based on their subunit molecular masses of 49 to 50 kDa. Both methanol:NDMA oxidoreductases possess a tightly but noncovalently bound NADP(H) cofactor at an NADPH-to-subunit molar ratio of 0.7. These cofactors are redox active toward alcohol and aldehyde substrates. Both enzymes contain significant amounts of Zn2+ and Mg2+ ions. The primary amino acid sequences of the A. methanolica and M. gastri MB19 methanol:NDMA oxidoreductases share a high degree of identity, as indicated by N-terminal sequence analysis (63% identity among the first 27 N-terminal amino acids), internal peptide sequence analysis, and overall amino acid composition. The amino acid sequence analysis also revealed significant similarity to a decameric methanol dehydrogenase of Bacillus methanolicus C1. PMID:8449887
Vasilaki, V; Volcke, E I P; Nandi, A K; van Loosdrecht, M C M; Katsou, E
2018-04-26
Multivariate statistical analysis was applied to investigate the dependencies and underlying patterns between N2O emissions and online operational variables (dissolved oxygen and nitrogen component concentrations, temperature and influent flow-rate) during biological nitrogen removal from wastewater. The system under study was a full-scale reactor, for which hourly sensor data were available. The 15-month long monitoring campaign was divided into 10 sub-periods based on the profile of N2O emissions, using Binary Segmentation. The dependencies between operating variables and N2O emissions fluctuated according to Spearman's rank correlation. The correlation between N2O emissions and nitrite concentrations ranged between 0.51 and 0.78. Correlation >0.7 between N2O emissions and nitrate concentrations was observed in sub-periods with average temperature lower than 12 °C. Hierarchical k-means clustering and principal component analysis linked N2O emission peaks with precipitation events and ammonium concentrations higher than 2 mg/L, especially in sub-periods characterized by low N2O fluxes. Additionally, the highest ranges of measured N2O fluxes belonged to clusters corresponding to NO3-N concentrations less than 1 mg/L in the upstream plug-flow reactor (middle of the oxic zone), indicating slow nitrification rates. The results showed that the range of N2O emissions partially depends on the prior behavior of the system. The principal component analysis validated the findings from the clustering analysis and showed that ammonium, nitrate, nitrite and temperature explained a considerable percentage of the variance in the system for the majority of the sub-periods. The applied statistical methods linked the different ranges of emissions with the system variables, provided insights into the effect of operating conditions on N2O emissions in each sub-period, and can be integrated into N2O emissions data processing at wastewater treatment plants.
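The Spearman's rank correlations reported above (e.g. 0.51-0.78 between N2O emissions and nitrite) are simply Pearson correlations of rank-transformed series. A minimal pure-Python sketch; the two short series below are hypothetical stand-ins for the hourly sensor data, not values from the study:

```python
def _rank(values):
    # Assign 1-based ranks; tied values share the mean of their positions.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    # Spearman's rho = Pearson correlation of the rank-transformed series.
    rx, ry = _rank(x), _rank(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical hourly N2O flux and nitrite concentration snippets:
n2o = [0.8, 1.1, 1.9, 2.5, 3.0, 2.2, 1.4]
no2 = [0.10, 0.15, 0.30, 0.45, 0.52, 0.38, 0.20]
rho = spearman(n2o, no2)  # 1.0 here: the two series are rank-concordant
```

Because rho depends only on ranks, it captures the monotone (not necessarily linear) dependencies that fluctuated across the 10 sub-periods.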
Bao, Shixing; Watanabe, Yoshiyuki; Takahashi, Hiroto; Tanaka, Hisashi; Arisawa, Atsuko; Matsuo, Chisato; Wu, Rongli; Fujimoto, Yasunori; Tomiyama, Noriyuki
2018-05-31
This study aimed to determine whether whole-tumor histogram analysis of normalized cerebral blood volume (nCBV) and apparent diffusion coefficient (ADC) for contrast-enhancing lesions can be used to differentiate between glioblastoma (GBM) and primary central nervous system lymphoma (PCNSL). Twenty patients, 9 with PCNSL and 11 with GBM, none with hemorrhagic lesions, underwent MRI, including diffusion-weighted imaging and dynamic susceptibility contrast perfusion-weighted imaging, before surgery. Histogram analysis of nCBV and ADC from whole-tumor voxels in contrast-enhancing lesions was performed. An unpaired t-test was used to compare the mean values for each type of tumor. A multivariate logistic regression model (LRM) was constructed to classify GBM and PCNSL using the best parameters of ADC and nCBV. All nCBV histogram parameters of GBMs were larger than those of PCNSLs, but only the average nCBV was statistically significant after Bonferroni correction. Meanwhile, ADC histogram parameters were also larger in GBM compared to those in PCNSL, but these differences were not statistically significant. According to receiver operating characteristic curve analysis, the nCBV average and the ADC 25th percentile demonstrated the largest areas under the curve, with values of 0.869 and 0.838, respectively. The LRM combining these two parameters differentiated between GBM and PCNSL with a higher area under the curve value (Logit(P) = -21.12 + 10.00 × ADC 25th percentile (10^-3 mm^2/s) + 5.420 × nCBV mean, P < 0.001). Our results suggest that whole-tumor histogram analysis of nCBV and ADC combined can be a valuable objective diagnostic method for differentiating between GBM and PCNSL.
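The reported logistic regression model can be turned into a small calculator. The coefficients come from the abstract; the example inputs, the reading of P as the probability of GBM (consistent with the positive nCBV coefficient, since GBM had the higher perfusion), and the 0.5 decision threshold are illustrative assumptions:

```python
import math

def lrm_logit(adc_p25, ncbv_mean):
    # Fitted model as reported in the abstract:
    # Logit(P) = -21.12 + 10.00 * ADC 25th percentile (10^-3 mm^2/s)
    #            + 5.420 * mean nCBV
    return -21.12 + 10.00 * adc_p25 + 5.420 * ncbv_mean

def gbm_probability(adc_p25, ncbv_mean):
    # Convert the logit to a probability with the logistic function.
    return 1.0 / (1.0 + math.exp(-lrm_logit(adc_p25, ncbv_mean)))

# Hypothetical lesions: a high-perfusion (GBM-like) and a low-perfusion one.
p_high = gbm_probability(adc_p25=1.0, ncbv_mean=4.0)
p_low = gbm_probability(adc_p25=0.7, ncbv_mean=1.5)
```

With these (made-up) inputs the model assigns the high-perfusion lesion a probability near 1 and the low-perfusion lesion a probability near 0, mirroring the direction of the group differences reported above.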
Sunwoo, Leonard; Yun, Tae Jin; You, Sung-Hye; Yoo, Roh-Eul; Kang, Koung Mi; Choi, Seung Hong; Kim, Ji-Hoon; Sohn, Chul-Ho; Park, Sun-Won; Jung, Cheolkyu; Park, Chul-Kee
2016-01-01
To evaluate the diagnostic performance of cerebral blood flow (CBF) by using arterial spin labeling (ASL) perfusion magnetic resonance (MR) imaging to differentiate glioblastoma (GBM) from brain metastasis. The institutional review board of our hospital approved this retrospective study. The study population consisted of 128 consecutive patients who underwent surgical resection and were diagnosed as either GBM (n = 89) or brain metastasis (n = 39). All participants underwent preoperative MR imaging including ASL. For qualitative analysis, the tumors were visually graded into five categories based on ASL-CBF maps by two blinded reviewers. For quantitative analysis, the reviewers drew regions of interest (ROIs) on ASL-CBF maps upon the most hyperperfused portion within the tumor and upon peritumoral T2 hyperintensity area. Signal intensities of intratumoral and peritumoral ROIs for each subject were normalized by dividing the values by those of contralateral normal gray matter (nCBFintratumoral and nCBFperitumoral, respectively). Visual grading scales and quantitative parameters between GBM and brain metastasis were compared. In addition, the area under the receiver-operating characteristic curve was used to evaluate the diagnostic performance of ASL-driven CBF to differentiate GBM from brain metastasis. For qualitative analysis, GBM group showed significantly higher grade compared to metastasis group (p = 0.001). For quantitative analysis, both nCBFintratumoral and nCBFperitumoral in GBM were significantly higher than those in metastasis (both p < 0.001). The areas under the curve were 0.677, 0.714, and 0.835 for visual grading, nCBFintratumoral, and nCBFperitumoral, respectively (all p < 0.001). ASL perfusion MR imaging can aid in the differentiation of GBM from brain metastasis.
Nuclear reaction analysis for H, Li, Be, B, C, N, O and F with an RBS check
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lanford, W. A.; Parenti, M.; Nordell, B. J.
2015-11-12
In this paper, 15N nuclear reaction analysis (NRA) for H is combined with 1.2 MeV deuteron (D) NRA, which provides a simultaneous analysis for Li, Be, B, C, N, O and F. The energy dependence of the D NRA has been measured and used to correct for the D energy loss in the film being analyzed. A 2 MeV He RBS measurement is also made. Film composition is determined by a self-consistent analysis of the light-element NRA data combined with an RBS analysis for heavy elements. This composition is used to simulate, with no adjustable parameters, the complete RBS spectrum. Finally, comparison of this simulated RBS spectrum with the measured spectrum provides a powerful check of the analysis.
Lau, Ho Ming; Smit, Johannes H; Fleming, Theresa M; Riper, Heleen
2016-01-01
The development and use of serious games for mental health disorders are on the rise. Yet, little is known about the impact of these games on clinical mental health symptoms. We conducted a systematic review and meta-analysis of randomized controlled trials that evaluated the effectiveness of serious games on symptoms of mental disorder. We conducted a systematic search in the PubMed, PsycINFO, and Embase databases, using mental health and serious games-related keywords. Ten studies met the inclusion criteria and were included in the review, and nine studies were included in the meta-analysis. All of the serious games were provided via personal computer, mostly on CD-ROM without the need for an internet connection. The studies targeted age groups ranging from 7 to 80 years old. The serious games focused on symptoms of depression (n = 2), post-traumatic stress disorder (n = 2), autism spectrum disorder (n = 2), attention deficit hyperactivity disorder (n = 1), cognitive functioning (n = 2), and alcohol use disorder (n = 1). The studies used goal-oriented (n = 4) and cognitive training games (n = 6). A total of 674 participants were included in the meta-analysis (380 in experimental and 294 in control groups). A meta-analysis of 9 studies comprising 10 comparisons, using a random effects model, showed a moderate effect on improvement of symptoms [g = 0.55 (95% confidence interval 0.28-0.83); P < 0.001], favoring serious games over no-intervention controls. Though the number of comparisons in the meta-analysis was small, these findings suggest that serious gaming interventions may be effective for reducing disorder-related symptoms. More studies are needed in order to attain deeper knowledge of the efficacy for specific mental disorders and the longer term effects of this new type of treatment for mental disorders.
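A random-effects pooling of per-study effect sizes, as used in the meta-analysis above, can be sketched with the DerSimonian-Laird estimator (assumed here for illustration; the abstract does not name its estimator). All per-study numbers below are hypothetical:

```python
import math

def dersimonian_laird(effects, ses):
    # Random-effects pooling: estimate the between-study variance tau^2
    # from Cochran's Q under inverse-variance weights, then re-weight.
    k = len(effects)
    w = [1.0 / se ** 2 for se in ses]
    fixed = sum(wi * g for wi, g in zip(w, effects)) / sum(w)
    q = sum(wi * (g - fixed) ** 2 for wi, g in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * g for wi, g in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, se_pooled, tau2

# Hypothetical per-study Hedges' g values and standard errors:
g = [0.30, 0.75, 0.50, 0.90, 0.20]
se = [0.20, 0.25, 0.15, 0.30, 0.20]
pooled, se_p, tau2 = dersimonian_laird(g, se)
ci = (pooled - 1.96 * se_p, pooled + 1.96 * se_p)
```

The extra tau^2 term widens the pooled confidence interval relative to a fixed-effect analysis whenever the studies disagree more than sampling error alone would explain.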
Leisure-time physical activity and sciatica: A systematic review and meta-analysis
Shiri, R.; Falah-Hassani, K.; Viikari-Juntura, E.; Coggon, D.
2016-01-01
Background and objective The role of leisure-time physical activity in sciatica is uncertain. This study aimed to assess the association of leisure-time physical activity with lumbar radicular pain and sciatica. Databases and data treatment Literature searches were conducted in the PubMed, Embase, Web of Science, Scopus, Google Scholar and ResearchGate databases from 1964 through August 2015. A random-effects meta-analysis was performed, and heterogeneity and small-study bias were assessed. Results Ten cohort (N=82,024 participants), 4 case-control (N=9350) and 4 cross-sectional (N=10,046) studies qualified for meta-analysis. In comparison with no regular physical activity, a high level of physical activity (≥4 times/week) was inversely associated with new onset of lumbar radicular pain or sciatica in a meta-analysis of prospective cohort studies (risk ratio (RR)=0.88, 95% CI 0.78-0.99, I2=0%, 7 studies, N=78,065). The association for a moderate level of physical activity (1-3 times/week) was weaker (RR=0.93, CI 0.82-1.05, I2=0%, 6 studies, N=69,049), and there was no association with physical activity at least once/week (RR=0.99, CI 0.86-1.13, 9 studies, N=73,008). In contrast, a meta-analysis of cross-sectional studies showed a higher prevalence of lumbar radicular pain or sciatica in participants who exercised at least once/week (prevalence ratio (PR)=1.29, CI 1.09-1.53, I2=0%, 4 studies, N=10,046) or 1-3 times/week (PR=1.34, CI 1.02-1.77, I2=0%, N=7631) than among inactive participants. There was no evidence of small-study bias. Conclusions This meta-analysis suggests that a moderate to high level of leisure physical activity may have a moderate protective effect against development of lumbar radicular pain. However, a large reduction in risk (>30%) seems unlikely. PMID:27091423
Reliability analysis of InGaN/GaN multi-quantum-well solar cells under thermal stress
NASA Astrophysics Data System (ADS)
Huang, Xuanqi; Fu, Houqiang; Chen, Hong; Lu, Zhijian; Baranowski, Izak; Montes, Jossue; Yang, Tsung-Han; Gunning, Brendan P.; Koleske, Dan; Zhao, Yuji
2017-12-01
We investigate the thermal stability of InGaN solar cells under thermal stress at elevated temperatures from 400 °C to 500 °C. High-resolution X-ray diffraction analysis reveals that the material quality of the InGaN/GaN did not degrade after thermal stress. The external quantum efficiency characteristics of the solar cells were well maintained at all temperatures, which demonstrates the thermal robustness of InGaN materials. Analysis of current density-voltage (J-V) curves shows that the degradation of the conversion efficiency of the solar cells is mainly caused by the decrease in open-circuit voltage (Voc), while the short-circuit current (Jsc) and fill factor remain almost constant. The decrease in Voc after thermal stress is attributed to the compromised metal contacts. Transmission line method results further confirmed that the p-type contacts became Schottky-like after thermal stress. The Arrhenius model was employed to estimate the failure lifetime of InGaN solar cells at different temperatures. These results suggest that while InGaN solar cells have high thermal stability, degradation of the metal contacts could be the major limiting factor for these devices under high-temperature operation.
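An Arrhenius lifetime extrapolation of the kind mentioned above can be sketched as follows. The two-point fit, stress temperatures and lifetimes are hypothetical illustrations, not the paper's data; the model assumes thermally activated failure, lifetime(T) = A·exp(Ea/(kB·T)):

```python
import math

K_B = 8.617e-5  # Boltzmann constant, eV/K

def arrhenius_fit(t1_K, life1, t2_K, life2):
    # Fit lifetime(T) = A * exp(Ea / (kB * T)) through two stress points.
    ea = K_B * math.log(life1 / life2) / (1.0 / t1_K - 1.0 / t2_K)
    a = life1 / math.exp(ea / (K_B * t1_K))
    return a, ea

def lifetime(a, ea, temp_K):
    # Extrapolated time-to-failure at temperature temp_K.
    return a * math.exp(ea / (K_B * temp_K))

# Hypothetical accelerated-stress lifetimes (hours) at 500 C and 450 C:
a, ea = arrhenius_fit(773.15, 120.0, 723.15, 480.0)
# Extrapolate to a 400 C stress temperature (673.15 K):
life_400 = lifetime(a, ea, 673.15)
```

Because lifetime grows exponentially in 1/T, a positive activation energy Ea implies much longer lifetimes at the lower stress temperatures, which is the point of the accelerated-test extrapolation.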
Pauling, L
1992-08-01
Analysis of the gamma-ray energies of 28 excited superdeformed bands of lanthanon nuclei by application of the two-revolving-cluster model yields the result that the central sphere for all 28 has the semimagic-magic composition p40n50, with the range p8n12 to p14n18 for the clusters and the radius of revolution increasing from 7.31 to 7.76 fm. Similar analysis of 28 excited bands of Hg, Tl, and Pb nuclei leads to p56n82 (semimagic-magic) for the central sphere of 24 bands, p64n82 (semimagic-magic) for 2, and p64n90 (doubly semimagic) for 2, with cluster range p8n12 to p14n16 and values of the radius of revolution from 8.70 to 8.92 fm for 26 bands and 9.2 fm for 2.
Identification of key nitrous oxide production pathways in aerobic partial nitrifying granules.
Ishii, Satoshi; Song, Yanjun; Rathnayake, Lashitha; Tumendelger, Azzaya; Satoh, Hisashi; Toyoda, Sakae; Yoshida, Naohiro; Okabe, Satoshi
2014-10-01
The identification of the key nitrous oxide (N2O) production pathways is important to establish a strategy to mitigate N2O emission. In this study, we combined real-time gas-monitoring analysis, (15)N stable isotope analysis, denitrification functional gene transcriptome analysis and microscale N2O concentration measurements to identify the main N2O producers in a partial nitrification (PN) aerobic granule reactor, which was fed with ammonium and acetate. Our results suggest that heterotrophic denitrification was the main contributor to N2O production in our PN aerobic granule reactor. The heterotrophic denitrifiers were probably related to Rhodocyclales bacteria, although different types of bacteria were active in the initial and latter stages of the PN reaction cycles, most likely in response to the presence of acetate. Hydroxylamine oxidation and nitrifier denitrification occurred, but their contribution to N2O emission was relatively small (20-30%) compared with heterotrophic denitrification. Our approach can be useful to quantitatively examine the relative contributions of the three pathways (hydroxylamine oxidation, nitrifier denitrification and heterotrophic denitrification) to N2O emission in mixed microbial populations. © 2014 Society for Applied Microbiology and John Wiley & Sons Ltd.
Modeling abundance using multinomial N-mixture models
Royle, Andy
2016-01-01
Multinomial N-mixture models are a generalization of the binomial N-mixture models described in Chapter 6 that allow for more complex and informative sampling protocols beyond simple counts. Many commonly used protocols, such as multiple-observer sampling, removal sampling, and capture-recapture, produce a multivariate count frequency that has a multinomial distribution and for which multinomial N-mixture models can be developed. Such protocols typically yield more precise estimates than binomial mixture models because they provide direct information about parameters of the observation process. We demonstrate the analysis of these models in BUGS using several distinct formulations that afford great flexibility in the types of models that can be developed, and we demonstrate likelihood analysis using the unmarked package. Spatially stratified capture-recapture models are one class of models that fall into the multinomial N-mixture framework, and we discuss the analysis of stratified versions of classical models such as Mb and Mh, and other classes of models that can only be described within the multinomial N-mixture framework.
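As a concrete instance, the likelihood for a three-pass removal design can be written by marginalizing a multinomial observation model over a Poisson abundance prior. A minimal sketch (the counts, parameters, and truncation bound are hypothetical, and real analyses would use the unmarked package):

```python
import math

def removal_loglik(y, lam, p, n_max=400):
    # Multinomial N-mixture likelihood for a removal design:
    #   N ~ Poisson(lam);  y | N ~ Multinomial(N, pi)
    # with cell probabilities pi_j = p * (1 - p)**(j - 1) for pass j
    # and pi_0 = (1 - p)**J for never-captured individuals.
    j_passes = len(y)
    pi = [p * (1 - p) ** j for j in range(j_passes)]
    pi0 = (1 - p) ** j_passes
    caught = sum(y)
    total = 0.0
    # Sum over the unobserved abundance N (truncated at n_max).
    for n in range(caught, n_max + 1):
        log_pois = -lam + n * math.log(lam) - math.lgamma(n + 1)
        log_mult = (math.lgamma(n + 1)
                    - sum(math.lgamma(c + 1) for c in y)
                    - math.lgamma(n - caught + 1)
                    + sum(c * math.log(pj) for c, pj in zip(y, pi))
                    + (n - caught) * math.log(pi0))
        total += math.exp(log_pois + log_mult)
    return math.log(total)

# Hypothetical three-pass removal counts:
y = [20, 12, 7]
ll_good = removal_loglik(y, lam=50.0, p=0.40)   # parameters near the data
ll_bad = removal_loglik(y, lam=50.0, p=0.05)    # implausibly low detection
```

The declining pass-by-pass counts carry direct information about the detection probability p, which is why such protocols out-perform simple repeated counts.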
Dispersive analysis of the scalar form factor of the nucleon
NASA Astrophysics Data System (ADS)
Hoferichter, M.; Ditsche, C.; Kubis, B.; Meißner, U.-G.
2012-06-01
Based on the recently proposed Roy-Steiner equations for pion-nucleon (πN) scattering [1], we derive a system of coupled integral equations for the ππ → N̄N and K̄K → N̄N S-waves. These equations take the form of a two-channel Muskhelishvili-Omnès problem, whose solution in the presence of a finite matching point is discussed. We use these results to update the dispersive analysis of the scalar form factor of the nucleon, fully including K̄K intermediate states. In particular, we determine the correction Δσ = σ(2M_π²) − σ_πN, which is needed for the extraction of the pion-nucleon σ term from πN scattering, as a function of the pion-nucleon subthreshold parameters and the πN coupling constant.
Jo, Min Sung; Sadasivam, Karthikeyan Giri; Tawfik, Wael Z; Yang, Seung Bea; Lee, Jung Ju; Ha, Jun Seok; Moon, Young Boo; Ryu, Sang Wan; Lee, June Key
2013-01-01
n-type GaN epitaxial layers were regrown on patterned n-type GaN substrates (PNS) with different sizes of silicon dioxide (SiO2) nano dots to improve the crystal quality and optical properties. PNS with SiO2 nano dots promote epitaxial lateral overgrowth (ELOG) for defect reduction and also act as light scattering points. Transmission electron microscopy (TEM) analysis suggested that PNS with SiO2 nano dots have superior crystalline properties. Hall measurements indicated increased electron mobility, a clear indication of a reduction in threading dislocations, which was confirmed by TEM analysis. Photoluminescence (PL) intensity was enhanced by 2.0 times and 3.1 times for 1-step and 2-step PNS, respectively.
Ndah, Elvis; Jonckheere, Veronique
2017-01-01
Proteogenomics is an emerging research field yet lacking a uniform method of analysis. Proteogenomic studies in which N-terminal proteomics and ribosome profiling are combined, suggest that a high number of protein start sites are currently missing in genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After a stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and computational translations of sequences stored in public repositories. Most interestingly, complementary evidence by ribosome profiling was found for 23 protein N termini. Finally, by analyzing protein N-terminal peptides, an in silico analysis demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly-annotated genomes. PMID:28432195
Theoretical analysis of the correlation observed in fatigue crack growth rate parameters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chay, S.C.; Liaw, P.K.
Fatigue crack growth rates have been found to follow the Paris-Erdogan rule, da/dN = C_o(ΔK)^n, for many steels, aluminum, nickel and copper alloys. The fatigue crack growth rate behavior in the Paris regime, thus, can be characterized by the parameters C_o and n, which have been obtained for various materials. When n vs the logarithm of C_o was plotted for various experimental results, a very definite linear relationship was observed by many investigators, and questions have been raised as to the nature of this correlation. This paper presents a theoretical analysis that explains precisely why such a linear correlation should exist between the two parameters, how strong the relationship should be, and how it can be predicted by analysis. This analysis proves that the source of such a correlation is mathematical rather than physical in nature.
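The mathematical origin of such an n vs log C_o correlation can be illustrated numerically: if the growth-rate lines of a family of materials pass near a common pivot point (ΔK*, (da/dN)*), then log C_o = log(da/dN)* − n·log ΔK* is exactly linear in n. A sketch under that pivot-point assumption (this is one common explanation, not necessarily the paper's derivation, and all numbers are hypothetical):

```python
import math

# Assumed common pivot point through which all Paris-law lines pass:
delta_K_star = 30.0   # MPa*sqrt(m), hypothetical
rate_star = 1e-4      # mm/cycle at the pivot, hypothetical

# For each material exponent n, the intercept follows by construction:
#   log10(C_o) = log10(rate_star) - n * log10(delta_K_star)
exponents = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
log_c0 = [math.log10(rate_star) - n * math.log10(delta_K_star)
          for n in exponents]

def pearson(x, y):
    # Plain Pearson correlation coefficient.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

r = pearson(exponents, log_c0)  # essentially -1: a perfect linear trend
```

The slope of the fitted line, −log10(ΔK*), recovers the pivot stress-intensity range, which is why the observed correlation says more about the common ΔK window of the tests than about material physics.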
Built environment analysis for road traffic hotspot locations in Moshi, Tanzania.
Waldon, Meredith; Ibingira, Treasure Joelson; de Andrade, Luciano; Mmbaga, Blandina T; Vissoci, João Ricardo N; Mvungi, Mark; Staton, Catherine A
2018-02-08
Road traffic injuries (RTIs) cause significant morbidity and mortality in low- and middle-income countries. Investigation of high-risk areas for RTIs is needed to guide improvements. This study provides a built-environment analysis of road traffic crash hotspots within Moshi, Tanzania. Spatial analysis of police data identified 36 hotspots. Qualitative comparative analysis revealed that 40% of crash sites were on local roads without night lighting and with increased motorcycle density. Paved narrow roads represented 26% of hotspots, and 13% were unpaved roads with uneven roadsides. Roadside unevenness was more prevalent at low-risk (n = 19, 90.5%) than at high-risk sites (n = 7, 46.7%). Both low-risk (n = 6, 28.6%) and high-risk (n = 1, 6.7%) sites had minimal signage. All sites had informal pedestrian pathways. Little variability between risk sites suggests hazardous conditions are widespread. Findings suggest improvement in municipal infrastructure, signage and enforcement is needed to reduce the RTI burden.
Azospirillum zeae sp. nov., a diazotrophic bacterium isolated from rhizosphere soil of Zea mays.
Mehnaz, Samina; Weselowski, Brian; Lazarovits, George
2007-12-01
Two free-living nitrogen-fixing bacterial strains, N6 and N7(T), were isolated from corn rhizosphere. A polyphasic taxonomic approach, including morphological characterization, Biolog analysis, DNA-DNA hybridization, and 16S rRNA, cpn60 and nifH gene sequence analysis, was taken to analyse the two strains. 16S rRNA gene sequence analysis indicated that strains N6 and N7(T) both belonged to the genus Azospirillum and were closely related to Azospirillum oryzae (98.7 and 98.8 % similarity, respectively) and Azospirillum lipoferum (97.5 and 97.6 % similarity, respectively). DNA-DNA hybridization of strains N6 and N7(T) showed reassociation values of 48 and 37 %, respectively, with A. oryzae and 43 % with A. lipoferum. Sequences of the nifH and cpn60 genes of both strains showed 99 and approximately 95 % similarity, respectively, with those of A. oryzae. Chemotaxonomic characteristics (Q-10 as the quinone system, 18:1ω7c as the major fatty acid) and the G+C content of the DNA (67.6 mol%) were also similar to those of members of the genus Azospirillum. Gene sequences and Biolog and fatty acid analysis showed that strains N6 and N7(T) differed from the closely related species A. lipoferum and A. oryzae. On the basis of these results, it is proposed that these nitrogen-fixing strains represent a novel species. The name Azospirillum zeae sp. nov. is suggested, with N7(T) (=NCCB 100147(T)=LMG 23989(T)) as the type strain.
pN0(i+) Breast Cancer: Treatment Patterns, Locoregional Recurrence, and Survival Outcomes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karam, Irene; Breast Cancer Outcomes Unit, British Columbia Cancer Agency, Vancouver, BC; Lesperance, Maria F.
Purpose: To examine treatment patterns, recurrence, and survival outcomes in patients with pN0(i+) breast cancer. Methods and Materials: Subjects were 5999 women with AJCC (6th edition) pT1-3, pN0-N1a, M0 breast cancer diagnosed between 2003 and 2006. Of these, 4342 (72%) had pN0, 96 (2%) had pN0(i+), 349 (6%) had pNmic (micrometastases >0.2 mm to ≤2 mm), and 1212 (20%) had pN1a (1-3 positive macroscopic nodes) disease. Treatment characteristics and 5-year Kaplan-Meier local recurrence, regional recurrence (RR), locoregional recurrence (LRR), and overall survival were compared between nodal subgroups. Multivariable analysis was performed using Cox regression modeling. A 1:3 case-match analysis examined outcomes in pN0(i+) cases compared with pN0 controls matched for similar tumor and treatment characteristics. Results: Median follow-up was 4.8 years. Adjuvant systemic therapy use increased with nodal stage: 81%, 92%, 95%, and 94% in pN0, pN0(i+), pNmic, and pN1a disease, respectively (P<.001). Nodal radiation therapy (RT) use also increased with nodal stage: 1.7% in pN0, 27% in pN0(i+), 33% in pNmic, and 63% in pN1a cohorts (P<.001). Five-year Kaplan-Meier outcomes in pN0 versus pN0(i+) cases were as follows: local recurrence 1.7% versus 3.7% (P=.20), RR 0.5% versus 2.2% (P=.02), and LRR 2.1% versus 5.8% (P=.02). There were no RR events in 26 patients with pN0(i+) disease who received nodal RT and 2 RR events in 70 patients who did not receive nodal RT. On multivariable analysis, pN0(i+) was not associated with worse locoregional control or survival. On case-match analysis, LRR and overall survival were similar between pN0(i+) and matched pN0 counterparts. Conclusions: Nodal involvement with isolated tumor cells is not a significant prognostic factor for LRR or survival in this study's multivariable and case-match analyses. These data do not support the routine use of nodal RT in the setting of pN0(i+) disease.
Prospective studies are needed to define optimal locoregional management for women with pN0(i+) breast cancer.
Word Criticality Analysis. MOS: 91P. Skill Levels 1 & 2.
1981-09-01
Świętoń, Edyta; Śmietanka, Krzysztof
2018-06-19
Sixty-five poultry outbreaks and sixty-eight events in wild birds were reported during the highly pathogenic H5N8/H5N5 avian influenza epidemic in Poland in 2016-2017. The analysis of all gene segment sequences of selected strains revealed cocirculation of at least four different genome configurations (genotypes) generated through reassortment of clade 2.3.4.4 H5N8 viruses detected in Russia and China in mid-2016. The geographical and temporal distribution of three H5N8 genotypes indicates separate introductions. Additionally, an H5N5 virus with a different gene configuration was detected in wild birds. The compilation of the results with those from studies on the virus' diversity in Germany, Italy and the Netherlands revealed that Europe was affected by at least eight different H5N8/H5N5 reassortants. Analysis of the HA gene sequence of a larger subset of samples showed its diversification corresponding to the genotype classification. The close relationship between poultry and wild bird strains from the same locations observed in several cases points to wild birds as the primary source of the outbreaks in poultry. © 2018 Blackwell Verlag GmbH.
Glycomic Analysis of Prostate Cancer
2012-07-01
allowed measurements of N-glycans and the Clinical Molecular Epidemiology Shared Resources which provided services for biological sample storage and...select N-glycans for the detection of prostate cancer. Aim3. Perform an exploratory study of N-glycans in urine of the participants and correlation of...cases. We have designed a pooled-unpooled study where initial discovery is conducted in smaller number of pooled samples followed by analysis of
ERIC Educational Resources Information Center
Fidler, Deborah J.; Lawson, John E.; Hodapp, Robert M.
2003-01-01
An analysis of educational desires found parents of children with Down syndrome (n=39) wanted changes in speech therapy and reading services, parents of children with Prader-Willi syndrome (n=25) wanted increases in adaptive physical education services, and parents of children with Williams syndrome (n=26) wanted increases in music services and…
Rapid detection of Echinococcus species by a high-resolution melting (HRM) approach.
Santos, Guilherme Brzoskowski; Espínola, Sergio Martín; Ferreira, Henrique Bunselmeyer; Margis, Rogerio; Zaha, Arnaldo
2013-11-14
High-resolution melting (HRM) provides a low-cost, fast and sensitive scanning method that allows the detection of DNA sequence variations in a single step, which makes it appropriate for application in parasite identification and genotyping. The aim of this work was to implement an HRM-PCR assay targeting part of the mitochondrial cox1 gene to achieve an accurate and fast method for Echinococcus spp. differentiation. For melting analysis, a total of 107 samples from seven species were used in this study. The species analyzed included Echinococcus granulosus (n = 41) and Echinococcus ortleppi (n = 50) from bovine, Echinococcus vogeli (n = 2) from paca, Echinococcus oligarthra (n = 3) from agouti, Echinococcus multilocularis (n = 6) from monkey and Echinococcus canadensis (n = 2) and Taenia hydatigena (n = 3) from pig. DNA extraction was performed, and a 444-bp fragment of the cox1 gene was amplified. Two approaches were used, one based on HRM analysis, and a second using SYBR Green Tm-based. In the HRM analysis, a specific profile for each species was observed. Although some species exhibited almost the same melting temperature (Tm) value, the HRM profiles could be clearly discriminated. The SYBR Green Tm-based analysis showed differences between E. granulosus and E. ortleppi and between E. vogeli and E. oligarthra. In this work, we report the implementation of HRM analysis to differentiate species of the genus Echinococcus using part of the mitochondrial gene cox1. This method may be also potentially applied to identify other species belonging to the Taeniidae family.
Adjuvant radiation therapy and lymphadenectomy in esophageal cancer: a SEER database analysis.
Shridhar, Ravi; Weber, Jill; Hoffe, Sarah E; Almhanna, Khaldoun; Karl, Richard; Meredith, Kenneth
2013-08-01
This study seeks to determine the effects of postoperative radiation therapy and lymphadenectomy on survival in esophageal cancer. An analysis of patients with surgically resected esophageal cancer from the SEER database between 2004 and 2008 was performed to determine the association of adjuvant radiation and lymph node dissection with survival. Survival curves were calculated according to the Kaplan-Meier method and log-rank analysis. Multivariate analysis (MVA) was performed by the Cox proportional hazard model. We identified 2109 patients who met inclusion criteria. Radiation was associated with increased survival in stage III patients (p = 0.005), no benefit in stage II (p = 0.075) and IV (p = 0.913) patients, and decreased survival in stage I patients (p < 0.0001). Univariate analysis revealed that radiation therapy was associated with a survival benefit in node-positive (N1) patients, while it was associated with a detriment in survival for node-negative (N0) patients. Removing >12 and >15 lymph nodes was associated with increased survival in N0 patients, while removing >8, >10, >12, >15, and >20 lymph nodes was associated with a survival benefit in N1 patients. MVA revealed that age, gender, tumor and nodal stage, tumor location, and number of lymph nodes removed were prognostic for survival in N0 patients. In N1 patients, MVA showed that age, tumor stage, number of lymph nodes removed, and radiation were prognostic for survival. The number of lymph nodes removed in esophageal cancer is associated with increased survival. The benefit of adjuvant radiation therapy on survival in esophageal cancer is limited to N1 patients.
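The survival comparisons in the abstract above rest on the Kaplan-Meier product-limit estimator. A minimal pure-Python sketch of that estimator, using invented toy follow-up data rather than SEER values:

```python
def kaplan_meier(times, events):
    """Product-limit survival estimates S(t) at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if the event (death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, curve = 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        leaving = deaths = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            leaving += 1
            deaths += data[i][1]
            i += 1
        if deaths:
            surv *= 1.0 - deaths / n_at_risk  # multiply by conditional survival
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve

# Toy cohort: times in months; 1 = death, 0 = censored.
curve = kaplan_meier([3, 5, 5, 8, 12, 12], [1, 1, 0, 1, 0, 0])
# -> [(3, 0.8333...), (5, 0.6666...), (8, 0.4444...)]
```

Production analyses of this kind would use an established survival library rather than hand-rolled code; the sketch only shows the quantity being estimated.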
NASA Astrophysics Data System (ADS)
Singh, Priyanka; Islam, S. S.; Ahmad, Hilal; Prabaharan, A.
2018-02-01
Nitrosoureas play an important role in the treatment of cancer. N-ethyl-N-nitrosourea, also known as ENU (chemical formula C3H7N3O2), is a highly potent mutagen. The chemical is an alkylating agent and acts by transferring the ethyl group of ENU to nucleobases (usually thymine) in nucleic acids. The molecular structure of N-ethyl-N-nitrosourea has been elucidated using experimental (FT-IR and FT-Raman) and theoretical (DFT) techniques. APT charges, Mulliken atomic charges, natural bond orbital, electrostatic potential, HOMO-LUMO and AIM analyses were performed to identify the reactive sites and charge-transfer interactions. Furthermore, to evaluate the anticancer activity of ENU, molecular docking studies were carried out against the 2JIU protein.
Analysis of variances of quasirapidities in collisions of gold nuclei with track-emulsion nuclei
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gulamov, K. G.; Zhokhova, S. I.; Lugovoi, V. V., E-mail: lugovoi@uzsci.net
2012-08-15
A new method of an analysis of variances was developed for studying n-particle correlations of quasirapidities in nucleus-nucleus collisions for a large constant number n of particles. Formulas that generalize the results of the respective analysis to various values of n were derived. Calculations on the basis of simple models indicate that the method is applicable, at least for n ≥ 100. Quasirapidity correlations statistically significant at a level of 36 standard deviations were discovered in collisions between gold nuclei and track-emulsion nuclei at an energy of 10.6 GeV per nucleon. The experimental data obtained in our present study are contrasted against the theory of nucleus-nucleus collisions.
Kim, Tae-Hyung; Yun, Tae Jin; Park, Chul-Kee; Kim, Tae Min; Kim, Ji-Hoon; Sohn, Chul-Ho; Won, Jae Kyung; Park, Sung-Hye; Kim, Il Han; Choi, Seung Hong
2017-03-21
The purpose was to assess the predictive power for overall survival (OS) and the diagnostic performance of a combination of susceptibility-weighted MRI sequences (SWMRI) and dynamic susceptibility contrast (DSC) perfusion-weighted imaging (PWI) for differentiation of recurrence and radionecrosis in high-grade glioma (HGG). We enrolled 51 patients who underwent radiation therapy or gamma knife surgery followed by resection for HGG and who developed new measurable enhancement more than six months after complete response. The lesions were confirmed as recurrence (n = 32) or radionecrosis (n = 19). The mean and each percentile value from cumulative histograms of normalized CBV (nCBV) and the proportion of dark signal intensity on SWMRI (proSWMRI, %) within the enhancement were compared. Multivariate regression was performed for the best differentiator. The cutoff value of the best predictor from ROC analysis was evaluated. OS was determined with the Kaplan-Meier method and log-rank test. Recurrence showed significantly lower proSWMRI and higher mean nCBV and 90th percentile nCBV (nCBV90) than radionecrosis. Regression analysis revealed that both nCBV90 and proSWMRI were independent differentiators. The combination of nCBV90 and proSWMRI achieved 71.9% sensitivity (23/32), 100% specificity (19/19) and 82.3% accuracy (42/51) using the best cutoff values (nCBV90 > 2.07 and proSWMRI ≤ 15.76%) from ROC analysis. In subgroup analysis, radionecrosis with nCBV > 2.07 (n = 5) showed obvious hemorrhage (proSWMRI > 32.9%). Patients with nCBV90 > 2.07 and proSWMRI ≤ 15.76% had significantly shorter OS. In conclusion, compared with DSC PWI alone, the combination of SWMRI and DSC PWI has potential to serve as a prognosticator for OS and to lower the false positive rate in differentiating recurrence from radionecrosis in HGG patients who develop new measurable enhancement more than six months after complete response.
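The diagnostic performance figures quoted in the abstract above (71.9% sensitivity, 100% specificity, 82.3% accuracy) follow directly from the reported counts. A quick sketch of the arithmetic, using only the numbers given in the abstract:

```python
def diagnostic_performance(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# From the abstract: 23 of 32 recurrences flagged, 19 of 19 radionecroses cleared.
sens, spec, acc = diagnostic_performance(tp=23, fn=9, tn=19, fp=0)
# -> sens ≈ 0.719, spec = 1.0, acc ≈ 0.824
```

This reproduces the published percentages and makes explicit that the 82.3% accuracy is simply (23 + 19) / 51.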
RELATIVE CONTRIBUTIONS OF THREE DESCRIPTIVE METHODS: IMPLICATIONS FOR BEHAVIORAL ASSESSMENT
Pence, Sacha T; Roscoe, Eileen M; Bourret, Jason C; Ahearn, William H
2009-01-01
This study compared the outcomes of three descriptive analysis methods—the ABC method, the conditional probability method, and the conditional and background probability method—to each other and to the results obtained from functional analyses. Six individuals who had been diagnosed with developmental delays and exhibited problem behavior participated. Functional analyses indicated that participants' problem behavior was maintained by social positive reinforcement (n = 2), social negative reinforcement (n = 2), or automatic reinforcement (n = 2). Results showed that for all but 1 participant, descriptive analysis outcomes were similar across methods. In addition, for all but 1 participant, the descriptive analysis outcome differed substantially from the functional analysis outcome. This supports the general finding that descriptive analysis is a poor means of determining functional relations. PMID:19949536
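The descriptive methods compared in the abstract above contrast the probability of a consequence given the problem behavior with its background probability. A minimal sketch of that computation over a coded interval stream; the event labels here are invented for illustration, and real descriptive analyses involve timing windows and coding conventions this toy omits:

```python
def conditional_vs_background(events, behavior, consequence):
    """P(consequence | behavior) versus the background P(consequence).

    events -- time-ordered list of coded intervals; each interval is a set
              of labels, e.g. {"problem_behavior", "attention"}.
    """
    n = len(events)
    behavior_intervals = [e for e in events if behavior in e]
    p_background = sum(consequence in e for e in events) / n
    p_conditional = (
        sum(consequence in e for e in behavior_intervals) / len(behavior_intervals)
        if behavior_intervals else 0.0
    )
    return p_conditional, p_background

# Hypothetical coded stream of five observation intervals.
stream = [{"problem_behavior", "attention"}, {"attention"},
          {"problem_behavior"}, set(), {"problem_behavior", "attention"}]
p_cond, p_bg = conditional_vs_background(stream, "problem_behavior", "attention")
# -> p_cond = 2/3, p_bg = 3/5
```

A conditional probability well above background would suggest a contingency; as the study notes, such descriptive estimates often disagree with experimentally derived functions.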
Space Trajectories Error Analysis (STEAP) Programs. Volume 1: Analytic manual, update
NASA Technical Reports Server (NTRS)
1971-01-01
Manual revisions are presented for the modified and expanded STEAP series. The STEAP 2 is composed of three independent but related programs: NOMAL for the generation of n-body nominal trajectories performing a number of deterministic guidance events; ERRAN for the linear error analysis and generalized covariance analysis along specific targeted trajectories; and SIMUL for testing the mathematical models used in the navigation and guidance process. The analytic manual provides general problem description, formulation, and solution and the detailed analysis of subroutines. The programmers' manual gives descriptions of the overall structure of the programs as well as the computational flow and analysis of the individual subroutines. The user's manual provides information on the input and output quantities of the programs. These are updates to N69-36472 and N69-36473.
Khan, Shaheer; Liu, Jenkuei; Szabo, Zoltan; Kunnummal, Baburaj; Han, Xiaorui; Ouyang, Yilan; Linhardt, Robert J; Xia, Qiangwei
2018-06-15
N-linked glycan analysis of recombinant therapeutic proteins, such as monoclonal antibodies, Fc-fusion proteins, and antibody-drug conjugates, provides valuable information regarding protein therapeutics glycosylation profile. Both qualitative identification and quantitative analysis of N-linked glycans on recombinant therapeutic proteins are critical analytical tasks in the biopharma industry during the development of a biotherapeutic. Currently, such analyses are mainly carried out using capillary electrophoresis/laser-induced fluorescence (CE/LIF), liquid chromatography/fluorescence (LC/FLR), and liquid chromatography/fluorescence/mass spectrometry (LC/FLR/MS) technologies. N-linked glycans are first released from glycoproteins by enzymatic digestion, then labeled with fluorescence dyes for subsequent CE or LC separation, and LIF or MS detection. Here we present an on-line CE/LIF/MS N-glycan analysis workflow that incorporates the fluorescent Teal™ dye and an electrokinetic pump-based nanospray sheath liquid capillary electrophoresis/mass spectrometry (CE/MS) ion source. Electrophoresis running buffer systems using ammonium acetate and ammonium hydroxide were developed for the negative ion mode CE/MS analysis of fluorescence-labeled N-linked glycans. Results show that on-line CE/LIF/MS analysis can be readily achieved using this versatile CE/MS ion source on common CE/MS instrument platforms. This on-line CE/LIF/MS method using Teal™ fluorescent dye and electrokinetic pump-based nanospray sheath liquid CE/MS coupling technology holds promise for on-line quantitation and identification of N-linked glycans on recombinant therapeutic proteins. Copyright © 2018 John Wiley & Sons, Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sintonen, Sakari, E-mail: sakari.sintonen@aalto.fi; Suihkonen, Sami; Jussila, Henri
2014-08-28
The crystal quality of bulk GaN crystals is continuously improving due to advances in GaN growth techniques. Defect characterization of the GaN substrates by conventional methods is impeded by the very low dislocation density and a large scale defect analysis method is needed. White beam synchrotron radiation x-ray topography (SR-XRT) is a rapid and non-destructive technique for dislocation analysis on a large scale. In this study, the defect structure of an ammonothermal c-plane GaN substrate was recorded using SR-XRT and the image contrast caused by the dislocation induced microstrain was simulated. The simulations and experimental observations agree excellently and the SR-XRT image contrasts of mixed and screw dislocations were determined. Apart from a few exceptions, defect selective etching measurements were shown to correspond one to one with the SR-XRT results.
Keru, Godfrey; Ndungu, Patrick G.; Mola, Genene T.; Nyamori, Vincent O.
2015-01-01
Nanocomposites of poly(3-hexylthiophene) (P3HT) and nitrogen-doped carbon nanotubes (N-CNTs) have been synthesized by two methods: direct solution mixing and in situ polymerization. The nanocomposites were characterized by means of transmission electron microscopy (TEM), scanning electron microscopy (SEM), X-ray dispersive spectroscopy, UV-Vis spectrophotometry, photoluminescence spectrophotometry (PL), Fourier transform infrared spectroscopy (FTIR), Raman spectroscopy, thermogravimetric analysis, and dispersive surface energy analysis. The nanocomposites were used in the active layer of a bulk heterojunction organic solar cell with the composition ITO/PEDOT:PSS/P3HT:N-CNTS:PCBM/LiF/Al. TEM and SEM analysis showed that the polymer successfully wrapped the N-CNTs. FTIR results indicated good π-π interaction within the nanocomposite synthesized by in situ polymerization as opposed to samples made by direct solution mixing. Dispersive surface energies of the N-CNTs and nanocomposites supported the fact that the polymer covered the N-CNTs well. J-V analysis showed that good devices were formed from both nanocomposites; however, the in situ polymerization nanocomposite showed better photovoltaic characteristics.
Mutual influence between triel bond and cation-π interactions: an ab initio study
NASA Astrophysics Data System (ADS)
Esrafili, Mehdi D.; Mousavian, Parisasadat
2017-12-01
Using ab initio calculations, the cooperative and solvent effects on cation-π and B...N interactions are studied in some model ternary complexes, where these interactions coexist. The nature of the interactions and the mechanism of cooperativity are investigated by means of quantum theory of atoms in molecules (QTAIM), noncovalent interaction (NCI) index and natural bond orbital analysis. The results indicate that all cation-π and B...N binding distances in the ternary complexes are shorter than those of corresponding binary systems. The QTAIM analysis reveals that ternary complexes have higher electron density at their bond critical points relative to the corresponding binary complexes. In addition, according to the QTAIM analysis, the formation of cation-π interaction increases covalency of B...N bonds. The NCI analysis indicates that the cooperative effects in the ternary complexes make a shift in the location of the spike associated with each interaction, which can be regarded as an evidence for the reinforcement of both cation-π and B...N interactions in these systems. Solvent effects on the cooperativity of cation-π and B...N interactions are also investigated.
NASA Astrophysics Data System (ADS)
Toyoda, Sakae; Yano, Midori; Nishimura, Sei-Ichi; Akiyama, Hiroko; Hayakawa, Atsushi; Koba, Keisuke; Sudo, Shigeto; Yagi, Kazuyuki; Makabe, Akiko; Tobari, Yoshifumi; Ogawa, Nanako O.; Ohkouchi, Naohiko; Yamada, Keita; Yoshida, Naohiro
2011-06-01
Isotopomer ratios of N2O (bulk nitrogen and oxygen isotope ratios, δ15Nbulk and δ18O, and intramolecular 15N site preference, SP) are useful parameters that characterize sources of this greenhouse gas and also provide insight into production and consumption mechanisms. We measured isotopomer ratios of N2O emitted from typical Japanese agricultural soils (Fluvisols and Andisols) planted with rice, wheat, soybean, and vegetables, and treated with synthetic (urea or ammonium) and organic (poultry manure) fertilizers. The results were analyzed using a previously reported isotopomeric N2O signature produced by nitrifying/denitrifying bacteria and a characteristic relationship between δ15Nbulk and SP during N2O reduction by denitrifying bacteria. Relative contributions from nitrification (hydroxylamine oxidation) and denitrification (nitrite reduction) to gross N2O production deduced from the analysis depended on soil type and fertilizer. The contribution from nitrification was relatively high (40%-70%) in Andisols amended with synthetic ammonium fertilizer, while denitrification was dominant (50%-90%) in the same soils amended with poultry manure during the period when N2O production occurred in the surface layer. This information on production processes is in accordance with that obtained from flux/concentration analysis of N2O and soil inorganic nitrogen. However, isotopomer analysis further revealed that partial reduction of N2O was pronounced in high-bulk density, alluvial soil (Fluvisol) compared to low-bulk density, volcanic ash soil (Andisol), and that the observed difference in N2O flux between normal and pelleted manure could have resulted from a similar mechanism with different rates of gross production and gross consumption. The isotopomeric analysis is based on data from pure culture bacteria and would be improved by further studies on in situ biological processes in soils including those by fungi. 
When flux/concentration-weighted average isotopomer ratios of N2O from various fertilized soils were examined, linear correlations were found between δ15Nbulk and δ18O, and between SP and δ15Nbulk. These relationships would be useful to parameterize isotopomer ratios of soil-emitted N2O for the modeling of the global N2O isotopomer budget. The results obtained in this study and those from previous firn/ice core studies confirm that the principal source of anthropogenic N2O is fertilized soils.
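The source partitioning described in the abstract above can be framed as a two-endmember mixing problem on the site preference (SP) signal. A minimal sketch under that simplification; the endmember SP values used below are illustrative placeholders for the literature nitrification and denitrification signatures, and the sketch deliberately ignores the partial N2O reduction the authors show complicates real estimates:

```python
def nitrification_fraction(sp_sample, sp_nitrification, sp_denitrification):
    """Fraction of N2O attributed to nitrification under two-endmember SP mixing."""
    f = (sp_sample - sp_denitrification) / (sp_nitrification - sp_denitrification)
    return min(1.0, max(0.0, f))  # clamp to the physically meaningful range

# Illustrative endmembers: SP ~ 33 permil (nitrification), ~ 0 permil (denitrification).
f_nit = nitrification_fraction(16.5, 33.0, 0.0)
# -> 0.5
```

In the study itself, the simultaneous use of δ15Nbulk and SP is what allows mixing to be disentangled from reduction; this sketch shows only the mixing half of that calculation.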
Nuclear-Renewable Hybrid Energy System Market Analysis Plans
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ruth, Mark
2016-06-09
This presentation describes nuclear-renewable hybrid energy systems (N-R HESs), states their potential benefits, provides figures for the four tightly coupled N-R HESs that NREL is currently analyzing, and outlines the analysis process that is underway.
Reducing youth screen time: qualitative metasynthesis of findings on barriers and facilitators.
Minges, Karl E; Owen, Neville; Salmon, Jo; Chao, Ariana; Dunstan, David W; Whittemore, Robin
2015-04-01
An integrated perspective on the relevant qualitative findings on the experience of screen time in youth can inform the development of hypotheses to be tested in future research and can guide the development of interventions to decrease sedentary behavior. The purpose of this qualitative metasynthesis was to explore parent, youth, and educational professionals' perceptions of barriers to, and facilitators of, reducing youth screen time. Qualitative metasynthesis techniques were used to analyze and synthesize 15 qualitative studies of screen time among youth (11-18 years) meeting inclusion criteria. The phrases, quotes, and/or author interpretations (i.e., theme or subtheme) were recorded in a data display matrix to facilitate article comparisons. Codes were collapsed into 23 categories of similar conceptual meaning and 3 overarching themes were derived using thematic analysis procedures. Study sample sizes ranged from 6 to 270 participants from 6 countries. Data collection methods included focus groups (n = 6), interviews (n = 4), focus group and interviews (n = 4), and naturalistic observation (n = 1) with youth and/or parents. Data analysis techniques included thematic analysis (n = 9), content analysis (n = 3), grounded theory (n = 1), observation (n = 1), and interpretive phenomenological analysis (n = 1). Three thematic categories were identified: (a) youth's norms-screen time is an integral part of daily life, and facilitates opportunities for entertainment, social interaction, and escapism; (b) family dynamics and parental roles-parents are conflicted and send mixed messages about the appropriate uses and amounts of screen time; and, (c) resources and environment-engagement in screen time is dependent on school, community, neighborhood, and home environmental contexts. Screen time is an established norm in many youth cultures, presenting barriers to behavior change. 
Parents recognize the importance of reducing youth screen time, but model and promote engagement themselves. For youth and parents, mutually agreed rules, limits, and parental monitoring of screen time were perceived as likely to be effective. (c) 2015 APA, all rights reserved.
Why Do We Miss Rare Targets? Exploring the Boundaries of the Low Prevalence Effect
2008-11-24
effect of prevalence (F(1,8) = 34.2, p < 0.001, partial eta² = 0.81), but no effect of set size (F(1,8) < 1, n.s.) and no interaction (F(1,8) < 1, n.s...Figure 2d; for Prevalence, Target Presence, and all interaction terms, F(1,8) < 1, n.s.; for Set Size, F(1,8) = 1.7, p > 0.2). What hints can we get... < 1, n.s.), and no interaction (F(1,14) < 1, n.s.). There were insufficient errors on target-absent trials for analysis. An analysis by RT quartile
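The partial eta squared reported alongside the F ratio in the excerpt above can be recovered directly from F and its degrees of freedom, since partial η² = F·df_effect / (F·df_effect + df_error). A short sketch reproducing the reported 0.81 from F(1, 8) = 34.2:

```python
def partial_eta_squared(f, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f * df_effect) / (f * df_effect + df_error)

# Prevalence effect from the excerpt: F(1, 8) = 34.2.
eta2 = partial_eta_squared(34.2, 1, 8)
# -> ≈ 0.81
```

This identity is handy for checking reported effect sizes against their F statistics, as done here.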
Melioration of Optical and Electrical Performance of Ga-N Codoped ZnO Thin Films
NASA Astrophysics Data System (ADS)
Narayanan, Nripasree; Deepak, N. K.
2018-06-01
Transparent and conducting p-type zinc oxide (ZnO) thin films doped simultaneously with gallium (Ga) and nitrogen (N) were deposited on glass substrates by the spray pyrolysis technique. Phase composition analysis by X-ray diffraction confirmed the polycrystallinity of the films with a pure ZnO phase. Energy dispersive X-ray analysis showed excellent incorporation of N into the ZnO matrix by means of codoping. The optical transmittance of the N-monodoped film was poor but improved with Ga-N codoping, which also enhanced the optical energy gap. Hole concentration increased with codoping and, consequently, lower resistivity and high stability were obtained.
Wongphatcharachai, Manoosak; Wisedchanwet, Trong; Lapkuntod, Jiradej; Nonthabenjawan, Nutthawan; Jairak, Waleemas; Amonsin, Alongkorn
2012-06-01
Monitoring of influenza A virus (IAV) was conducted in wild bird species in central Thailand. Four IAV subtype H12N1 strains were isolated from a watercock (order Gruiformes, family Rallidae) (n = 1) and lesser whistling ducks (order Anseriformes, family Anatidae) (n = 3). All H12N1 viruses were characterized by whole-genome sequencing. Phylogenetic analysis of all eight genes of the Thai H12N1 viruses indicated that they are most closely related to the Eurasian strains. Analysis of the HA gene revealed the strains to be of low pathogenicity. This study is the first to report the circulation of IAV subtype H12N1 in Thailand and to describe the genetic characteristics of H12N1 in Eurasia. Moreover, the genetic information obtained on H12N1 has contributed a new Eurasian strain of H12N1 to the GenBank database.