-
Information criteria for quantifying loss of reversibility in parallelized KMC
NASA Astrophysics Data System (ADS)
Gourgoulias, Konstantinos; Katsoulakis, Markos A.; Rey-Bellet, Luc
2017-01-01
Parallel Kinetic Monte Carlo (KMC) is a potent tool to simulate stochastic particle systems efficiently. However, although the literature quantifies domain decomposition errors of the particle system for this class of algorithms in both the short- and long-time regimes, no study has yet explored and quantified the loss of time-reversibility in parallel KMC. Inspired by concepts from non-equilibrium statistical mechanics, we propose the entropy production per unit time, or entropy production rate, given in terms of an observable and a corresponding estimator, as a metric that quantifies the loss of reversibility. Typically, this is a quantity that cannot be computed explicitly for parallel KMC, which is why we develop a posteriori estimators that have good scaling properties with respect to the size of the system. Through these estimators, we can connect the different parameters of the scheme, such as the communication time step of the parallelization, the choice of the domain decomposition, and the computational schedule, with its performance in controlling the loss of reversibility. From this point of view, the entropy production rate can be seen both as an information criterion to compare the reversibility of different parallel schemes and as a tool to diagnose reversibility issues with a particular scheme. As a demonstration, we use Sandia Lab's SPPARKS software to compare different parallelization schemes and different domain (lattice) decompositions.
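As a brief illustration (a standard formalization, not necessarily the authors' exact estimator), the entropy production rate of a process with forward path distribution P over [0, T] and time-reversed path distribution P^R can be written as a relative-entropy rate,
\[
\sigma \;=\; \lim_{T\to\infty} \frac{1}{T}\, \mathcal{R}\!\left(P_{[0,T]} \,\Vert\, P^{R}_{[0,T]}\right),
\qquad
\mathcal{R}(P\Vert Q) \;=\; \int \log\frac{dP}{dQ}\, dP ,
\]
so that σ = 0 exactly when the dynamics are time-reversible, and a positive value quantifies the reversibility lost through the parallel decomposition and schedule.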
-
On the design of turbo codes
NASA Technical Reports Server (NTRS)
Divsalar, D.; Pollara, F.
1995-01-01
In this article, we design new turbo codes that can achieve near-Shannon-limit performance. The design criterion for random interleavers is based on maximizing the effective free distance of the turbo code, i.e., the minimum output weight of codewords due to weight-2 input sequences. An upper bound on the effective free distance of a turbo code is derived. This upper bound can be achieved if the feedback connection of convolutional codes uses primitive polynomials. We review multiple turbo codes (parallel concatenation of q convolutional codes), which increase the so-called 'interleaving gain' as q and the interleaver size increase, and a suitable decoder structure derived from an approximation to the maximum a posteriori probability decision rule. We develop new rate 1/3, 2/3, 3/4, and 4/5 constituent codes to be used in the turbo encoder structure. These codes, with 2 to 32 states, are designed using primitive polynomials. The resulting turbo codes have rates b/n (b = 1, 2, 3, 4 and n = 2, 3, 4, 5, 6), and include random interleavers for better asymptotic performance. These codes are suitable for deep-space communications with low throughput and for near-Earth communications where high throughput is desirable. The performance of these codes is within 1 dB of the Shannon limit at a bit-error rate of 10^-6 for throughputs from 1/15 up to 4 bits/s/Hz.
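For reference (a standard formulation, not a quotation from the article), the maximum a posteriori symbol decision rule that the iterative turbo decoder approximates is
\[
\hat{u}_k \;=\; \arg\max_{u\in\{0,1\}} \; P\big(u_k = u \,\big|\, \mathbf{y}\big),
\]
where y is the received sequence; each constituent decoder computes these symbol posteriors and exchanges extrinsic information with the other decoders across the interleavers.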
-
Characterizing the Retrieval of Cloud Optical Thickness and Droplet Effective Radius to Overlying Aerosols Using a General Inverse Theory Approach
NASA Astrophysics Data System (ADS)
Coddington, O.; Pilewskie, P.; Schmidt, S.
2013-12-01
The upwelling shortwave irradiance measured by the airborne Solar Spectral Flux Radiometer (SSFR) flying above a cloud and aerosol layer is influenced by the properties of the cloud and aerosol particles below, just as the radiance measured from a satellite would be. Unlike satellite measurements, those from aircraft provide the unique capability to fly a lower-level leg above the cloud, yet below the aerosol layer, to characterize the extinction of the aerosol layer and account for its impact on the measured cloud albedo. Previous work [Coddington et al., 2010] capitalized on this opportunity to test the effects of aerosol particles (or more appropriately, the effects of neglecting aerosols in forward modeling calculations) on cloud retrievals using data obtained during the Intercontinental Chemical Transport Experiment/Intercontinental Transport and Chemical Transformation of anthropogenic pollution (INTEX-A/ITCT) study. This work showed aerosols can cause a systematic bias in the cloud retrieval and that such a bias would need to be distinguished from a true aerosol indirect effect (i.e., the brightening of a cloud due to aerosol effects on cloud microphysics) as theorized by Haywood et al. [2004]. The effects of aerosols on clouds are typically neglected in forward modeling calculations because their pervasiveness, variable microphysical properties, loading, and lifetimes make forward modeling calculations under all possible combinations completely impractical. Using a general inverse theory technique, which propagates separate contributions from measurement and forward modeling errors into probability distributions of retrieved cloud optical thickness and droplet effective radius, we have demonstrated how the aerosol presence can be introduced as a spectral systematic error in the distributions of the forward modeling solutions. The resultant uncertainty and bias in cloud properties induced by the aerosols are identified by the shape and peak of the a posteriori retrieval distributions. In this work, we apply this general inverse theory approach to extend our analysis of the spectrally-dependent impacts of overlying aerosols on cloud properties over a broad range in cloud optical thickness and droplet effective radius. We investigate the relative impacts of this error source and compare and contrast results to biases and uncertainties in cloud properties induced by varying surface conditions (ocean, land, snow). We perform the analysis for two different measurement accuracies (3% and 0.3%) that are typical of current passive imagers, such as the Moderate Resolution Imaging Spectroradiometer (MODIS) [Platnick et al., 2003], and that are expected for future passive imagers, such as the HyperSpectral Imager for Climate Science (HySICS) [Kopp et al., 2010]. Coddington, O., P. Pilewskie, et al., 2010, J. Geophys. Res., 115, doi: 10.1029/2009JD012829. Haywood, J. M., S. R. Osborne, and S. J. Abel, 2004, Q. J. R. Meteorol. Soc., 130, 779-800. Kopp, G., et al., 2010, Hyperspectral Imagery Radiometry Improvements for Visible and Near-Infrared Climate Studies, paper presented at 2010 Earth Science Technology Forum, Arlington, VA, USA. Platnick, S., et al., 2003, IEEE Trans. Geosci. Remote Sens., 41(2), 459-473.
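A minimal sketch of the kind of inverse formulation described above (a generic optimal-estimation form; the symbols S_y and S_F for the measurement and forward-model error covariances are illustrative, not taken from the paper):
\[
P(\mathbf{x}\mid\mathbf{y}) \;\propto\; \exp\!\Big[-\tfrac{1}{2}\,\big(\mathbf{y}-F(\mathbf{x})\big)^{T}\,\big(S_y + S_F\big)^{-1}\,\big(\mathbf{y}-F(\mathbf{x})\big)\Big],
\]
where x collects the cloud optical thickness and droplet effective radius, F is the forward radiative-transfer model, and a spectrally correlated aerosol contribution enters through S_F (or as a bias in F), shifting and broadening the retrieval distributions.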
-
Analysis of Highly-Resolved Simulations of 2-D Humps Toward Improvement of Second-Moment Closures
NASA Technical Reports Server (NTRS)
Jeyapaul, Elbert; Rumsey, Christopher
2013-01-01
Fully resolved simulation data of flow separation over 2-D humps has been used to analyze the modeling terms in second-moment closures of the Reynolds-averaged Navier-Stokes equations. Existing models for the pressure-strain and dissipation terms have been analyzed using a priori calculations. All pressure-strain models are incorrect in the high-strain region near separation, although a better match is observed downstream, well into the separated-flow region. Near-wall inhomogeneity causes pressure-strain models to predict incorrect signs for the normal components close to the wall. In a posteriori computations, full Reynolds stress and explicit algebraic Reynolds stress models predict the separation point with varying degrees of success. However, as with one- and two-equation models, the separation bubble size is invariably over-predicted.
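For context (standard definitions, not specific to this paper), the modeled terms appear in the exact Reynolds-stress transport equations as
\[
\Pi_{ij} \;=\; \frac{1}{\rho}\,\overline{p'\Big(\frac{\partial u_i'}{\partial x_j}+\frac{\partial u_j'}{\partial x_i}\Big)},
\qquad
\varepsilon_{ij} \;=\; 2\nu\,\overline{\frac{\partial u_i'}{\partial x_k}\,\frac{\partial u_j'}{\partial x_k}} ,
\]
and an a priori test evaluates a closure for Π_ij or ε_ij directly from the resolved-simulation statistics, whereas an a posteriori test runs the full RANS model and compares the predicted flow field.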
-
Technical Note: Introduction of variance component analysis to setup error analysis in radiotherapy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Matsuo, Yukinori, E-mail: ymatsuo@kuhp.kyoto-u.ac.
Purpose: The purpose of this technical note is to introduce variance component analysis to the estimation of systematic and random components in setup error of radiotherapy. Methods: Balanced data according to the one-factor random effect model were assumed. Results: Analysis-of-variance (ANOVA)-based computation was applied to estimate the values and their confidence intervals (CIs) for systematic and random errors and the population mean of setup errors. The conventional method overestimates systematic error, especially in hypofractionated settings. The CI for systematic error becomes much wider than that for random error. The ANOVA-based estimation can be extended to a multifactor model considering multiple causes of setup errors (e.g., interpatient, interfraction, and intrafraction). Conclusions: Variance component analysis may lead to novel applications to setup error analysis in radiotherapy.
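A minimal sketch of the balanced one-factor random-effects model and the ANOVA estimators it leads to (standard textbook form; the symbols below are illustrative):
\[
y_{ij} = \mu + b_i + e_{ij},\qquad b_i \sim N(0,\Sigma^2),\quad e_{ij}\sim N(0,\sigma^2),
\]
with i = 1, ..., a patients and j = 1, ..., n fractions; the ANOVA estimates are
\[
\hat{\sigma}^2 = MS_{\mathrm{within}},\qquad
\hat{\Sigma}^2 = \frac{MS_{\mathrm{between}} - MS_{\mathrm{within}}}{n},
\]
so the systematic (interpatient) component Σ and the random (interfraction) component σ are estimated separately, with confidence intervals obtainable from the corresponding χ² and F distributions.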
-
Adaptive Mesh Refinement for Microelectronic Device Design
NASA Technical Reports Server (NTRS)
Cwik, Tom; Lou, John; Norton, Charles
1999-01-01
Finite element and finite volume methods are used in a variety of design simulations when it is necessary to compute fields throughout regions that contain varying materials or geometry. Convergence of the simulation can be assessed by uniformly increasing the mesh density until an observable quantity stabilizes. Depending on the electrical size of the problem, uniform refinement of the mesh may be computationally infeasible due to memory limitations. Similarly, depending on the geometric complexity of the object being modeled, uniform refinement can be inefficient since regions that do not need refinement add to the computational expense. In either case, convergence to the correct (measured) solution is not guaranteed. Adaptive mesh refinement methods attempt to selectively refine the region of the mesh that is estimated to contain proportionally higher solution errors. The refinement may be obtained by decreasing the element size (h-refinement), by increasing the order of the element (p-refinement), or by a combination of the two (h-p refinement). A successful adaptive strategy refines the mesh to produce an accurate solution measured against the correct fields without undue computational expense. This is accomplished by the use of a) reliable a posteriori error estimates, b) hierarchical elements, and c) automatic adaptive mesh generation. Adaptive methods are also useful when problems with multi-scale field variations are encountered. These occur in active electronic devices that have thin doped layers and also when mixed physics is used in the calculation. The mesh needs to be fine at and near the thin layer to capture rapid field or charge variations, but can coarsen away from these layers where field variations are smoother and charge densities are uniform. This poster will present an adaptive mesh refinement package that runs on parallel computers and is applied to specific microelectronic device simulations. Passive sensors that operate in the infrared portion of the spectrum as well as active device simulations that model charge transport and Maxwell's equations will be presented.
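A minimal, hypothetical Python sketch of the h-refinement idea (not the poster's package; the "solution" is a stand-in interpolant rather than a finite-element solve): a per-element error indicator is estimated a posteriori and only elements whose indicator exceeds a fraction of the maximum are split.

import numpy as np

def solve(nodes):
    # Placeholder "solution": nodal samples of a steep profile, standing in
    # for a finite-element solve on the current mesh.
    return np.tanh(50.0 * (nodes - 0.5))

def error_indicator(nodes, u):
    # A posteriori indicator per element: jump of the discrete gradient
    # across each interior node, attributed to the two adjacent elements.
    h = np.diff(nodes)
    grad = np.diff(u) / h                      # one gradient per element
    jumps = np.abs(np.diff(grad))              # one jump per interior node
    eta = np.zeros(len(h))
    eta[:-1] += 0.5 * jumps
    eta[1:] += 0.5 * jumps
    return eta * np.sqrt(h)                    # scale with element size

nodes = np.linspace(0.0, 1.0, 11)              # coarse initial mesh
for sweep in range(5):                         # adaptive h-refinement loop
    u = solve(nodes)
    eta = error_indicator(nodes, u)
    marked = eta > 0.5 * eta.max()             # maximum marking strategy
    midpoints = 0.5 * (nodes[:-1] + nodes[1:])[marked]
    nodes = np.sort(np.concatenate([nodes, midpoints]))
    print(f"sweep {sweep}: {len(nodes) - 1} elements, max eta {eta.max():.3e}")

Each sweep concentrates new nodes where the indicator is large (around the steep layer), which is the same logic an h-refinement driver applies to 2-D or 3-D device meshes.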
-
Mass-conservative reconstruction of Galerkin velocity fields for transport simulations
NASA Astrophysics Data System (ADS)
Scudeler, C.; Putti, M.; Paniconi, C.
2016-08-01
Accurate calculation of mass-conservative velocity fields from numerical solutions of Richards' equation is central to reliable surface-subsurface flow and transport modeling, for example in long-term tracer simulations to determine catchment residence time distributions. In this study we assess the performance of a local Larson-Niklasson (LN) post-processing procedure for reconstructing mass-conservative velocities from a linear (P1) Galerkin finite element solution of Richards' equation. This approach, originally proposed for a-posteriori error estimation, modifies the standard finite element velocities by imposing local conservation on element patches. The resulting reconstructed flow field is characterized by continuous fluxes on element edges that can be efficiently used to drive a second order finite volume advective transport model. Through a series of tests of increasing complexity that compare results from the LN scheme to those using velocity fields derived directly from the P1 Galerkin solution, we show that a locally mass-conservative velocity field is necessary to obtain accurate transport results. We also show that the accuracy of the LN reconstruction procedure is comparable to that of the inherently conservative mixed finite element approach, taken as a reference solution, but that the LN scheme has much lower computational costs. The numerical tests examine steady and unsteady, saturated and variably saturated, and homogeneous and heterogeneous cases along with initial and boundary conditions that include dry soil infiltration, alternating solute and water injection, and seepage face outflow. Typical problems that arise with velocities derived from P1 Galerkin solutions include outgoing solute flux from no-flow boundaries, solute entrapment in zones of low hydraulic conductivity, and occurrences of anomalous sources and sinks. In addition to inducing significant mass balance errors, such manifestations often lead to oscillations in concentration values that can moreover cause the numerical solution to explode. These problems do not occur when using LN post-processed velocities.
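A hedged sketch of the property the reconstruction enforces (a generic statement of local mass conservation; the notation is illustrative, not taken from the paper): on every element K of the mesh the post-processed flux u_h* is required to balance storage and sources,
\[
\oint_{\partial K} \mathbf{u}_h^{*}\cdot\mathbf{n}\, ds \;+\; \int_{K} \frac{\partial \theta}{\partial t}\, dx \;=\; \int_{K} q \, dx \qquad \text{for every element } K,
\]
with fluxes single-valued on shared edges. This is precisely the property the raw P1 Galerkin velocity field lacks, and its absence is what generates the spurious sources and sinks in the advective transport step.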
-
Bayesian inversion of a CRN depth profile to infer Quaternary erosion of the northwestern Campine Plateau (NE Belgium)
NASA Astrophysics Data System (ADS)
Laloy, Eric; Beerten, Koen; Vanacker, Veerle; Christl, Marcus; Rogiers, Bart; Wouters, Laurent
2017-07-01
The rate at which low-lying sandy areas in temperate regions, such as the Campine Plateau (NE Belgium), have been eroding during the Quaternary is a matter of debate. Current knowledge on the average pace of landscape evolution in the Campine area is largely based on geological inferences and modern analogies. We performed a Bayesian inversion of an in situ-produced 10Be concentration depth profile to infer the average long-term erosion rate together with two other parameters: the surface exposure age and the inherited 10Be concentration. Compared to the latest advances in probabilistic inversion of cosmogenic radionuclide (CRN) data, our approach has the following two innovative components: it (1) uses Markov chain Monte Carlo (MCMC) sampling and (2) accounts (under certain assumptions) for the contribution of model errors to posterior uncertainty. To investigate to what extent our approach differs from the state of the art in practice, a comparison against the Bayesian inversion method implemented in the CRONUScalc program is made. Both approaches identify similar maximum a posteriori (MAP) parameter values, but the posterior parameter and predictive uncertainties derived using the method taken in CRONUScalc are moderately underestimated. A simple way for producing more consistent uncertainty estimates with the CRONUScalc-like method in the presence of model errors is therefore suggested. Our inferred erosion rate of 39 ± 8.9 mm kyr-1 (1σ) is relatively large in comparison with landforms that erode under comparable (paleo-)climates elsewhere in the world. We evaluate this value in the light of the erodibility of the substrate and sudden base level lowering during the Middle Pleistocene. A denser sampling scheme of a two-nuclide concentration depth profile would allow for a better resolution of the inferred erosion rate and for the inclusion of more uncertain parameters in the MCMC inversion.
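For context, a commonly used forward model for such an inversion (spallation-only form, neglecting muon production; not necessarily the exact model used in the paper) relates the measured concentration C at depth z to the three inferred parameters, exposure age t, erosion rate ε, and inherited concentration C_inh:
\[
C(z,t) \;=\; C_{\mathrm{inh}}\, e^{-\lambda t} \;+\; \frac{P_0\, e^{-\rho z/\Lambda}}{\lambda + \rho\varepsilon/\Lambda}\Big(1 - e^{-(\lambda + \rho\varepsilon/\Lambda)\,t}\Big),
\]
with P_0 the surface production rate, λ the radioactive decay constant, ρ the bulk density, and Λ the attenuation length; the MCMC then explores the posterior of (t, ε, C_inh) given the measured profile and an error model.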
-
Variationally consistent discretization schemes and numerical algorithms for contact problems
NASA Astrophysics Data System (ADS)
Wohlmuth, Barbara
We consider variationally consistent discretization schemes for mechanical contact problems. Most of the results can also be applied to other variational inequalities, such as those for phase transition problems in porous media, for plasticity, or for option pricing applications from finance. The starting point is to weakly incorporate the constraint into the setting and to reformulate the inequality in the displacement in terms of a saddle-point problem. Here, the Lagrange multiplier represents the surface forces, and the constraints are restricted to the boundary of the simulation domain. Having a uniform inf-sup bound, one can then establish optimal low-order a priori convergence rates for the discretization error in the primal and dual variables. In addition to the abstract framework of linear saddle-point theory, complementarity terms have to be taken into account. The resulting inequality system is solved by rewriting it equivalently, by means of a non-linear complementarity (NCP) function, as a system of equations. Although it is not differentiable in the classical sense, semi-smooth Newton methods, yielding super-linear convergence rates, can be applied and easily implemented in terms of a primal-dual active set strategy. Quite often the solution of contact problems has a low regularity, and the efficiency of the approach can be improved by using adaptive refinement techniques. Different standard types, such as residual-based and equilibration-based a posteriori error estimators, can be designed based on the interpretation of the dual variable as a Neumann boundary condition. For the fully dynamic setting, it is of interest to apply energy-preserving time-integration schemes. However, the differential algebraic character of the system can result in high oscillations if standard methods are applied. A possible remedy is to modify the fully discretized system by a local redistribution of the mass. Numerical results in two and three dimensions illustrate the wide range of possible applications and show the performance of the space discretization scheme, non-linear solver, adaptive refinement process and time integration.
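A hedged sketch of the reformulation referred to above (the standard non-penetration complementarity conditions and one common choice of NCP function; sign conventions vary between references):
\[
u_n - g \le 0,\qquad \lambda_n \ge 0,\qquad \lambda_n\,(u_n - g) = 0
\quad\Longleftrightarrow\quad
C(\lambda_n, u_n) := \lambda_n - \max\big(0,\ \lambda_n + c\,(u_n - g)\big) = 0,\quad c>0,
\]
where u_n is the normal displacement, g the gap, and λ_n the contact pressure. The discrete saddle-point system together with these conditions becomes a non-smooth system of equations, and each semi-smooth Newton step amounts to updating an active set of contact nodes and solving a linear problem.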
-
Spectroscopic properties of Arx-Zn and Arx-Ag+ (x = 1,2) van der Waals complexes
NASA Astrophysics Data System (ADS)
Oyedepo, Gbenga A.; Peterson, Charles; Schoendorff, George; Wilson, Angela K.
2013-03-01
Potential energy curves have been constructed using coupled cluster with singles, doubles, and perturbative triple excitations (CCSD(T)) in combination with all-electron and pseudopotential-based multiply augmented correlation consistent basis sets [m-aug-cc-pV(n + d)Z; m = singly, doubly, triply, n = D,T,Q,5]. The effect of basis set superposition error on the spectroscopic properties of Ar-Zn, Ar2-Zn, Ar-Ag+, and Ar2-Ag+ van der Waals complexes was examined. The diffuse functions of the doubly and triply augmented basis sets have been constructed using the even-tempered expansion. The a posteriori counterpoise scheme of Boys and Bernardi and its generalized variant by Valiron and Mayer have been utilized to correct for basis set superposition error (BSSE) in the calculated spectroscopic properties for diatomic and triatomic species. It is found that even at the extrapolated complete basis set limit for the energetic properties, the pseudopotential-based calculations still suffer from significant BSSE effects, unlike the all-electron basis sets. This indicates that the quality of the approximations used in the design of pseudopotentials could have a major impact on a seemingly valence-exclusive effect like BSSE. We confirm the experimentally determined equilibrium internuclear distance (re), binding energy (De), harmonic vibrational frequency (ωe), and C1Π ← X1Σ transition energy for ArZn and also predict the spectroscopic properties for the low-lying excited states of linear Ar2-Zn (X1Σg, 3Πg, 1Πg), Ar-Ag+ (X1Σ, 3Σ, 3Π, 3Δ, 1Σ, 1Π, 1Δ), and Ar2-Ag+ (X1Σg, 3Σg, 3Πg, 3Δg, 1Σg, 1Πg, 1Δg) complexes, using the CCSD(T) and MR-CISD + Q methods, to aid in their experimental characterizations.
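For reference, the Boys-Bernardi counterpoise correction has the standard form (generic notation, not a quotation from the paper)
\[
\Delta E^{\mathrm{CP}}_{\mathrm{int}}
= E_{AB}(AB\text{-basis}) - E_{A}(AB\text{-basis}) - E_{B}(AB\text{-basis}),
\]
i.e., each monomer energy is evaluated at the complex geometry in the full dimer basis (with ghost functions on the partner site), and the BSSE estimate relative to the uncorrected interaction energy is
\[
\delta^{\mathrm{BSSE}}
= \big[E_{A}(A\text{-basis}) - E_{A}(AB\text{-basis})\big]
+ \big[E_{B}(B\text{-basis}) - E_{B}(AB\text{-basis})\big].
\]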
-
The Influence of Observation Errors on Analysis Error and Forecast Skill Investigated with an Observing System Simulation Experiment
NASA Technical Reports Server (NTRS)
Prive, N. C.; Errico, R. M.; Tai, K.-S.
2013-01-01
The Global Modeling and Assimilation Office (GMAO) observing system simulation experiment (OSSE) framework is used to explore the response of analysis error and forecast skill to observation quality. In an OSSE, synthetic observations may be created that have much smaller error than real observations, and precisely quantified error may be applied to these synthetic observations. Three experiments are performed in which synthetic observations with magnitudes of applied observation error that vary from zero to twice the estimated realistic error are ingested into the Goddard Earth Observing System Model (GEOS-5) with Gridpoint Statistical Interpolation (GSI) data assimilation for a one-month period representing July. The analysis increment and observation innovation are strongly impacted by observation error, with much larger variances for increased observation error. The analysis quality is degraded by increased observation error, but the change in root-mean-square error of the analysis state is small relative to the total analysis error. Surprisingly, in the 120-hour forecast, increased observation error yields only a slight decline in forecast skill in the extratropics and no discernible degradation of forecast skill in the tropics.
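For reference, the two diagnostics mentioned above have the standard data-assimilation definitions (generic notation):
\[
\mathbf{d} = \mathbf{y}^{o} - H(\mathbf{x}^{b}) \quad\text{(observation innovation)},
\qquad
\delta\mathbf{x} = \mathbf{x}^{a} - \mathbf{x}^{b} \quad\text{(analysis increment)},
\]
where y^o are the (synthetic) observations, H the observation operator, and x^b, x^a the background and analysis states; inflating the applied observation error increases the variance of both quantities even when the analysis error and forecast skill respond only weakly.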
-
Improved identification of the solution space of aerosol microphysical properties derived from the inversion of profiles of lidar optical data, part 1: theory.
PubMed
Kolgotin, Alexei; Müller, Detlef; Chemyakin, Eduard; Romanov, Anton
2016-12-01
Multiwavelength Raman/high spectral resolution lidars that measure backscatter coefficients at 355, 532, and 1064 nm and extinction coefficients at 355 and 532 nm can be used for the retrieval of particle microphysical parameters, such as effective and mean radius, number, surface-area and volume concentrations, and complex refractive index, from inversion algorithms. In this study, we carry out a correlation analysis in order to investigate the degree of dependence that may exist between the optical data taken with lidar and the underlying microphysical parameters. We also investigate if the correlation properties identified in our study can be used as a priori or a posteriori constraints for our inversion scheme so that the inversion results can be improved. We made the simplifying assumption of error-free optical data in order to find out what correlations exist in the best case situation. Clearly, for practical applications, erroneous data need to be considered too. On the basis of simulations with synthetic optical data, we find the following results, which hold true for arbitrary particle size distributions, i.e., regardless of the modality or the shape of the size distribution function: surface-area concentrations and extinction coefficients are linearly correlated with a correlation coefficient above 0.99. We also find a correlation coefficient above 0.99 for the extinction coefficient versus (1) the ratio of the volume concentration to effective radius and (2) the product of the number concentration times the sum of the squares of the mean radius and standard deviation of the investigated particle size distributions. Besides that, we find that for particles of any mode fraction of the particle size distribution, the complex refractive index is uniquely defined by extinction- and backscatter-related Ångström exponents, lidar ratios at two wavelengths, and an effective radius.
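A hedged illustration of why such near-perfect correlations are expected (a geometric-optics argument, not taken from the paper):
\[
\alpha \;=\; \int \pi r^{2}\, Q_{\mathrm{ext}}(r)\, n(r)\, dr \;\approx\; \frac{\bar{Q}_{\mathrm{ext}}}{4}\, s
\;=\; \frac{3\,\bar{Q}_{\mathrm{ext}}}{4}\,\frac{v}{r_{\mathrm{eff}}},
\]
since the surface-area concentration is s = ∫ 4π r² n(r) dr and r_eff = 3v/s. With Q̄_ext ≈ 2 for particles much larger than the wavelength, the extinction coefficient is essentially proportional to s and to v/r_eff, consistent with the correlation coefficients above 0.99 reported above.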
-
An adaptive sparse-grid high-order stochastic collocation method for Bayesian inference in groundwater reactive transport modeling
NASA Astrophysics Data System (ADS)
Zhang, Guannan; Lu, Dan; Ye, Ming; Gunzburger, Max; Webster, Clayton
2013-10-01
Bayesian analysis has become vital to uncertainty quantification in groundwater modeling, but its application has been hindered by the computational cost associated with the numerous model executions required to explore the posterior probability density function (PPDF) of model parameters. This is particularly the case when the PPDF is estimated using Markov Chain Monte Carlo (MCMC) sampling. In this study, a new approach is developed to improve the computational efficiency of Bayesian inference by constructing a surrogate of the PPDF, using an adaptive sparse-grid high-order stochastic collocation (aSG-hSC) method. Unlike previous works using a first-order hierarchical basis, this paper utilizes a compactly supported higher-order hierarchical basis to construct the surrogate system, resulting in a significant reduction in the number of required model executions. In addition, using the hierarchical surplus as an error indicator allows locally adaptive refinement of sparse grids in the parameter space, which further improves computational efficiency. To efficiently build the surrogate system for the PPDF with multiple significant modes, optimization techniques are used to identify the modes, for which high-probability regions are defined and components of the aSG-hSC approximation are constructed. After the surrogate is determined, the PPDF can be evaluated by sampling the surrogate system directly without model execution, resulting in improved efficiency of the surrogate-based MCMC compared with conventional MCMC. The developed method is evaluated using two synthetic groundwater reactive transport models. The first example involves coupled linear reactions and demonstrates the accuracy of our high-order hierarchical basis approach in approximating the high-dimensional posterior distribution. The second example is highly nonlinear because of the reactions of uranium surface complexation, and demonstrates how the iterative aSG-hSC method is able to capture multimodal and non-Gaussian features of the PPDF caused by model nonlinearity. Both experiments show that aSG-hSC is an effective and efficient tool for Bayesian inference.
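A minimal, hypothetical Python sketch of the surrogate-based MCMC idea: a random-walk Metropolis sampler queries an inexpensive surrogate of the log-posterior instead of the groundwater model. The surrogate function here is a bimodal stand-in, not the paper's aSG-hSC construction.

import numpy as np

def log_post_surrogate(theta):
    # Stand-in for the sparse-grid surrogate of the log-PPDF: a bimodal
    # density mimicking a multimodal posterior.
    return np.logaddexp(-0.5 * np.sum((theta - 1.5) ** 2) / 0.1,
                        -0.5 * np.sum((theta + 1.5) ** 2) / 0.1)

def metropolis(logp, theta0, n_steps=20000, step=0.5, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    lp = logp(theta)
    chain = np.empty((n_steps, theta.size))
    for k in range(n_steps):
        prop = theta + step * rng.standard_normal(theta.shape)
        lp_prop = logp(prop)                     # surrogate call, no model run
        if np.log(rng.random()) < lp_prop - lp:  # Metropolis accept/reject
            theta, lp = prop, lp_prop
        chain[k] = theta
    return chain

samples = metropolis(log_post_surrogate, theta0=[0.0, 0.0])
print(samples.mean(axis=0), samples.std(axis=0))

Because every proposal is evaluated on the surrogate, the chain can be run for many more steps than would be affordable with direct executions of the reactive transport model.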
-
On Building an A-Posteriori Index from Survey Data: A Case for Educational Planners' Assessment of Attitudes towards an Educational Innovation.
ERIC Educational Resources Information Center
Vazquez-Abad, Jesus; DePauw, Karen
To simplify data from a large survey, it is desirable to classify subjects according to their attitudes toward certain issues, as measured by questions in the survey. Responses to 12 questions were identified as indicative of attitudes toward deschooling education. These attitudes were explained by means of patterns exhibited within the responses…
-
At the origins of the Trojan Horse Method
NASA Astrophysics Data System (ADS)
Lattuada, Marcello
2018-01-01
During the seventies and eighties, a long experimental research program on quasi-free reactions at low energy was carried out by a small group of nuclear physicists, among whom Claudio Spitaleri was one of the main protagonists. Nowadays, a posteriori, the results of these studies can be considered an essential preparatory step toward the application of the Trojan Horse Method (THM) in Nuclear Astrophysics.
-
A MAP blind image deconvolution algorithm with bandwidth over-constrained
NASA Astrophysics Data System (ADS)
Ren, Zhilei; Liu, Jin; Liang, Yonghui; He, Yulong
2018-03-01
We demonstrate a maximum a posteriori (MAP) blind image deconvolution algorithm with a bandwidth over-constraint and total variation (TV) regularization to recover a clear image from AO-corrected images. The point spread functions (PSFs) are estimated with their bandwidth constrained to be less than the cutoff frequency of the optical system. Our algorithm performs well in avoiding noise amplification. The performance is demonstrated on simulated data.
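A hedged sketch of the generic MAP blind-deconvolution objective behind such algorithms (illustrative notation; the paper's exact functional may differ):
\[
(\hat{o},\hat{h}) \;=\; \arg\min_{o,\,h}\;
\big\| i - h * o \big\|_2^{2} \;+\; \lambda\, \mathrm{TV}(o)
\quad\text{s.t.}\quad \hat{H}(f) = 0 \ \text{ for } |f| > f_c ,
\]
where i is the observed AO-corrected image, h the PSF with optical transfer function Ĥ, f_c the cutoff frequency of the optical system, and TV(o) the total-variation penalty that suppresses noise amplification; the two unknowns are typically updated alternately.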
-
Leukocyte Recognition Using EM-Algorithm
NASA Astrophysics Data System (ADS)
Colunga, Mario Chirinos; Siordia, Oscar Sánchez; Maybank, Stephen J.
This document describes a method for classifying images of blood cells. Three different classes of cells are used: Band Neutrophils, Eosinophils and Lymphocytes. The image pattern is projected onto a lower-dimensional subspace using PCA; the probability density function for each class is modeled with a Gaussian mixture using the EM algorithm. A new cell image is classified using the maximum a posteriori decision rule.
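A minimal, hypothetical Python sketch of the pipeline described above (PCA projection, one Gaussian mixture per class fitted with EM, and a maximum a posteriori decision); the data here are random placeholders, not blood-cell images.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
classes = ["band_neutrophil", "eosinophil", "lymphocyte"]

# Placeholder "image" vectors: 200 flattened 32x32 images per class.
X = {c: rng.normal(loc=i, scale=1.0, size=(200, 32 * 32))
     for i, c in enumerate(classes)}

# Project all training images onto a low-dimensional PCA subspace.
pca = PCA(n_components=10).fit(np.vstack(list(X.values())))

# Fit one Gaussian mixture per class with the EM algorithm.
gmms = {c: GaussianMixture(n_components=3, random_state=0)
            .fit(pca.transform(X[c])) for c in classes}
priors = {c: 1.0 / len(classes) for c in classes}   # assumed equal priors

def classify(image_vector):
    z = pca.transform(image_vector.reshape(1, -1))
    # MAP rule: maximize log p(z | class) + log P(class).
    scores = {c: gmms[c].score_samples(z)[0] + np.log(priors[c])
              for c in classes}
    return max(scores, key=scores.get)

print(classify(rng.normal(loc=2, scale=1.0, size=32 * 32)))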
-
A full-mission data set of H2O and HDO columns from SCIAMACHY 2.3 µm reflectance measurements
NASA Astrophysics Data System (ADS)
Schneider, Andreas; Borsdorff, Tobias; aan de Brugh, Joost; Hu, Haili; Landgraf, Jochen
2018-06-01
A new data set of vertical column densities of the water vapour isotopologues H2O and HDO from the SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY) instrument for the whole of the mission period from January 2003 to April 2012 is presented. The data are retrieved from reflectance measurements in the spectral range 2339 to 2383 nm with the Shortwave Infrared CO Retrieval (SICOR) algorithm, ignoring atmospheric light scattering in the measurement simulation. The retrievals are validated with ground-based Fourier transform infrared measurements obtained within the Multi-platform remote Sensing of Isotopologues for investigating the Cycle of Atmospheric water (MUSICA) project. Good agreement is found for low-altitude stations, with an average bias of -3.6×10^21 molec cm^-2 for H2O and -1.0×10^18 molec cm^-2 for HDO. The a posteriori computed δD shows an average bias of -8 ‰, even though polar stations have a larger negative bias. The latter is due to the large amount of sensor noise in SCIAMACHY in combination with low albedo and high solar zenith angles. To demonstrate the benefit of accounting for light scattering in the retrieval, the quality of the data product fitting effective cloud parameters simultaneously with trace gas columns is evaluated in a dedicated case study for measurements around high-altitude stations. Due to a large altitude difference between the satellite ground pixel and the mountain station, clear-sky scenes yield a large bias, resulting in a δD bias of 125 ‰. When selecting scenes with optically thick clouds within 1000 m above or below the station altitude, the bias in a posteriori δD is reduced from 125 to 44 ‰. The insights from the present study will also benefit the analysis of the data from the new Sentinel-5 Precursor mission.
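For reference, the a posteriori δD quoted above follows the usual definition (with the VSMOW HDO/H2O ratio; the numerical constant is the commonly used literature value):
\[
\delta\mathrm{D} \;=\; \left(\frac{[\mathrm{HDO}]/[\mathrm{H_2O}]}{R_{\mathrm{VSMOW}}} - 1\right)\times 1000\ \text{‰},
\qquad R_{\mathrm{VSMOW}} \approx 3.1152\times 10^{-4},
\]
computed from the two independently retrieved total columns, which is why a small relative bias in either column maps into a δD bias of tens of per mille.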
-
Identifying uniformly mutated segments within repeats.
PubMed
Sahinalp, S Cenk; Eichler, Evan; Goldberg, Paul; Berenbrink, Petra; Friedetzky, Tom; Ergun, Funda
2004-12-01
Given a long string of characters from a constant-size alphabet, we present an algorithm to determine whether its characters have been generated by a single i.i.d. random source. More specifically, consider all possible n-coin models for generating a binary string S, where each bit of S is generated via an independent toss of one of the n coins in the model. The choice of which coin to toss is decided by a random walk on the set of coins where the probability of a coin change is much lower than the probability of using the same coin repeatedly. We present a procedure to evaluate the likelihood of an n-coin model for a given S, subject to a uniform prior distribution over the parameters of the model (that represent mutation rates and probabilities of copying events). In the absence of detailed prior knowledge of these parameters, the algorithm can be used to determine whether the a posteriori probability for n=1 is higher than for any other n>1. Our algorithm runs in time O(l^4 log l), where l is the length of S, through a dynamic programming approach which exploits the assumed convexity of the a posteriori probability for n. Our test can be used in the analysis of long alignments between pairs of genomic sequences in a number of ways. For example, functional regions in genome sequences exhibit much lower mutation rates than non-functional regions. Because our test provides a means for determining variations in the mutation rate, it may be used to distinguish functional regions from non-functional ones. Another application is in determining whether two highly similar, thus evolutionarily related, genome segments are the result of a single copy event or of a complex series of copy events. This is particularly an issue in evolutionary studies of genome regions rich with repeat segments (especially tandemly repeated segments).
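A hedged sketch of the model-selection step in generic Bayesian terms (illustrative notation; the paper's dynamic program evaluates these quantities exactly under its uniform prior):
\[
P(S \mid n) \;=\; \int P(S \mid \theta, n)\,\pi(\theta)\, d\theta,
\qquad
\text{declare a single i.i.d. source if } P(S\mid n=1) \;>\; \max_{n>1} P(S\mid n),
\]
where θ collects the per-coin bias parameters and the coin-switching probability; with a uniform prior π over n, comparing these marginal likelihoods is equivalent to comparing the a posteriori probabilities of the models.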
-
Image-based topology for sensor gridlocking and association
NASA Astrophysics Data System (ADS)
Stanek, Clay J.; Javidi, Bahram; Yanni, Philip
2002-07-01
Correlation engines have been evolving since the implementation of radar. In modern sensor fusion architectures, correlation and gridlock filtering are required to produce common, continuous, and unambiguous tracks of all objects in the surveillance area. The objective is to provide a unified picture of the theatre or area of interest to battlefield decision makers, ultimately enabling them to make better inferences for future action and eliminate fratricide by reducing ambiguities. Here, correlation refers to association, which in this context is track-to-track association. A related process, gridlock filtering or gridlocking, refers to the reduction in navigation errors and sensor misalignment errors so that one sensor's track data can be accurately transformed into another sensor's coordinate system. As platforms gain multiple sensors, the correlation and gridlocking of tracks become significantly more difficult. Much of the existing correlation technology revolves around various interpretations of the generalized Bayesian decision rule: choose the action that minimizes conditional risk. One implementation of this principle equates the risk minimization statement to the comparison of ratios of a priori probability distributions to thresholds. The binary decision problem phrased in terms of likelihood ratios is also known as the famed Neyman-Pearson hypothesis test. Using another restatement of the principle for a symmetric loss function, risk minimization leads to a decision that maximizes the a posteriori probability distribution. Even for deterministic decision rules, situations can arise in correlation where there are ambiguities. For these situations, a common algorithm used is a sparse assignment technique such as the Munkres or JVC algorithm. Furthermore, associated tracks may be combined with the hope of reducing the positional uncertainty of a target or object identified by an existing track from the information of several fused/correlated tracks. Gridlocking is typically accomplished with some type of least-squares algorithm, such as the Kalman filtering technique, which attempts to locate the best bias error vector estimate from a set of correlated/fused track pairs. Here, we will introduce a new approach to this longstanding problem by adapting many of the familiar concepts from pattern recognition, ones certainly familiar to target recognition applications. Furthermore, we will show how this technique can lend itself to specialized processing, such as that available through an optical or hybrid correlator.
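For reference, the two decision rules invoked above have the standard forms (generic notation, not specific to any fielded correlator):
\[
\Lambda(\mathbf{z}) \;=\; \frac{p(\mathbf{z}\mid H_1)}{p(\mathbf{z}\mid H_0)} \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \eta
\qquad\text{(likelihood-ratio / Neyman-Pearson test)},
\]
\[
\hat{H} \;=\; \arg\max_{H_i}\; P(H_i \mid \mathbf{z}) \;=\; \arg\max_{H_i}\; p(\mathbf{z}\mid H_i)\,P(H_i)
\qquad\text{(MAP rule for a symmetric loss)},
\]
where H_1 is the hypothesis that two tracks originate from the same object and z is the track-pair feature, for example a statistical distance between the two state estimates after gridlock bias correction.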