Sample records for maximum length sequence

  1. Statistical properties of filtered pseudorandom digital sequences formed from the sum of maximum-length sequences

    NASA Technical Reports Server (NTRS)

    Wallace, G. R.; Weathers, G. D.; Graf, E. R.

    1973-01-01

    The statistics of filtered pseudorandom digital sequences called hybrid-sum sequences, formed from the modulo-two sum of several maximum-length sequences, are analyzed. The results indicate that a relation exists between the statistics of the filtered sequence and the characteristic polynomials of the component maximum length sequences. An analysis procedure is developed for identifying a large group of sequences with good statistical properties for applications requiring the generation of analog pseudorandom noise. By use of the analysis approach, the filtering process is approximated by the convolution of the sequence with a sum of unit step functions. A parameter reflecting the overall statistical properties of filtered pseudorandom sequences is derived. This parameter is called the statistical quality factor. A computer algorithm to calculate the statistical quality factor for the filtered sequences is presented, and the results for two examples of sequence combinations are included. The analysis reveals that the statistics of the signals generated with the hybrid-sum generator are potentially superior to the statistics of signals generated with maximum-length generators. Furthermore, fewer calculations are required to evaluate the statistics of a large group of hybrid-sum generators than are required to evaluate the statistics of the same size group of approximately equivalent maximum-length sequences.
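
    The construction named in this abstract, a modulo-two sum of maximum-length sequences, can be illustrated with a minimal sketch. The function names, tap choices and register sizes below are illustrative assumptions and are not taken from the report; the filtering step and the statistical quality factor are not reproduced here.

```python
from math import gcd


def lfsr_mls(taps, nbits):
    """Return one period (2**nbits - 1 bits) of a Fibonacci LFSR output.

    taps are 1-based register stages whose XOR is fed back into stage 1.
    The output is maximum-length only if the taps correspond to a
    primitive polynomial (true for the two examples below).
    """
    state = [1] * nbits  # any nonzero start state works
    bits = []
    for _ in range(2 ** nbits - 1):
        bits.append(state[-1])
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]
    return bits


def hybrid_sum(seq_a, seq_b):
    """Modulo-two (XOR) sum of two periodic sequences over their common period."""
    n = len(seq_a) * len(seq_b) // gcd(len(seq_a), len(seq_b))
    return [seq_a[i % len(seq_a)] ^ seq_b[i % len(seq_b)] for i in range(n)]


# Assumed example: a 3-stage and a 4-stage register (periods 7 and 15).
mls3 = lfsr_mls((3, 2), 3)
mls4 = lfsr_mls((4, 3), 4)
hybrid = hybrid_sum(mls3, mls4)  # common period is 7 * 15 = 105 bits
```

    Because the two component periods are coprime in this example, the summed sequence repeats only after 105 bits, which is longer than either component alone.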

  2. A note on chaotic unimodal maps and applications.

    PubMed

    Zhou, C T; He, X T; Yu, M Y; Chew, L Y; Wang, X G

    2006-09-01

    Based on the word-lift technique of symbolic dynamics of one-dimensional unimodal maps, we investigate the relation between chaotic kneading sequences and linear maximum-length shift-register sequences. Theoretical and numerical evidence that the set of the maximum-length shift-register sequences is a subset of the set of the universal sequence of one-dimensional chaotic unimodal maps is given. By stabilizing unstable periodic orbits on superstable periodic orbits, we also develop techniques to control the generation of long binary sequences.

  3. A new molecular evolution model for limited insertion independent of substitution.

    PubMed

    Lèbre, Sophie; Michel, Christian J

    2013-10-01

    We recently introduced a new molecular evolution model called the IDIS model for Insertion Deletion Independent of Substitution [13,14]. In the IDIS model, the three independent processes of substitution, insertion and deletion of residues have constant rates. In order to control the genome expansion during evolution, we generalize here the IDIS model by introducing an insertion rate which decreases when the sequence grows and tends to 0 for a maximum sequence length nmax. This new model, called LIIS for Limited Insertion Independent of Substitution, defines a matrix differential equation satisfied by a vector P(t) describing the sequence content in each residue at evolution time t. An analytical solution is obtained for any diagonalizable substitution matrix M. Thus, the LIIS model gives an expression of the sequence content vector P(t) in each residue under evolution time t as a function of the eigenvalues and the eigenvectors of matrix M, the residue insertion rate vector R, the total insertion rate r, the initial and maximum sequence lengths n0 and nmax, respectively, and the sequence content vector P(t0) at initial time t0. The derivation of the analytical solution is much more technical, compared to the IDIS model, as it involves Gauss hypergeometric functions. Several propositions of the LIIS model are derived: proof that the IDIS model is a particular case of the LIIS model when the maximum sequence length nmax tends to infinity, fixed point, time scale, time step and time inversion. Using a relation between the sequence length l and the evolution time t, an expression of the LIIS model as a function of the sequence length l=n(t) is obtained. Formulas for 'insertion only', i.e. when the substitution rates are all equal to 0, are derived at evolution time t and sequence length l. 
    Analytical solutions of the LIIS model are explicitly derived, as a function of either evolution time t or sequence length l, for two classical substitution matrices: the 3-parameter symmetric substitution matrix [12] (LIIS-SYM3) and the HKY asymmetric substitution matrix [9] (LIIS-HKY). An evaluation of the LIIS model (precisely, LIIS-HKY) based on four statistical analyses of the GC content in complete genomes of four prokaryotic taxonomic groups, namely Chlamydiae, Crenarchaeota, Spirochaetes and Thermotogae, shows the expected improvement from the theory of the LIIS model compared to the IDIS model. Copyright © 2013 Elsevier Inc. All rights reserved.

  4. Correcting for sequencing error in maximum likelihood phylogeny inference.

    PubMed

    Kuhner, Mary K; McGill, James

    2014-11-04

    Accurate phylogenies are critical to taxonomy as well as studies of speciation processes and other evolutionary patterns. Accurate branch lengths in phylogenies are critical for dating and rate measurements. Such accuracy may be jeopardized by unacknowledged sequencing error. We use simulated data to test a correction for DNA sequencing error in maximum likelihood phylogeny inference. Over a wide range of data polymorphism and true error rate, we found that correcting for sequencing error improves recovery of the branch lengths, even if the assumed error rate is up to twice the true error rate. Low error rates have little effect on recovery of the topology. When error is high, correction improves topological inference; however, when error is extremely high, using an assumed error rate greater than the true error rate leads to poor recovery of both topology and branch lengths. The error correction approach tested here was proposed in 2004 but has not been widely used, perhaps because researchers do not want to commit to an estimate of the error rate. This study shows that correction with an approximate error rate is generally preferable to ignoring the issue. Copyright © 2014 Kuhner and McGill.

  5. Importance of Viral Sequence Length and Number of Variable and Informative Sites in Analysis of HIV Clustering.

    PubMed

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor; Essex, M

    2015-05-01

    To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice.

  6. Importance of Viral Sequence Length and Number of Variable and Informative Sites in Analysis of HIV Clustering

    PubMed Central

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor

    2015-01-01

    To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice. PMID:25560745

  7. Genomes: At the edge of chaos with maximum information capacity

    NASA Astrophysics Data System (ADS)

    Kong, Sing-Guan; Chen, Hong-Da; Torda, Andrew; Lee, H. C.

    2016-12-01

    We propose an order index, ϕ, which quantifies the notion of “life at the edge of chaos” when applied to genome sequences. It maps genomes to a number from 0 (random and of infinite length) to 1 (fully ordered) and applies regardless of sequence length and base composition. The 786 complete genomic sequences in GenBank were found to have ϕ values in a very narrow range, 0.037 ± 0.027. We show this implies that genomes are halfway towards being completely random, namely, at the edge of chaos. We argue that this narrow range represents the neighborhood of a fixed-point in the space of sequences, and genomes are driven there by the dynamics of a robust, predominantly neutral evolution process.

  8. On the error probability of general tree and trellis codes with applications to sequential decoding

    NASA Technical Reports Server (NTRS)

    Johannesson, R.

    1973-01-01

    An upper bound on the average error probability for maximum-likelihood decoding of the ensemble of random binary tree codes is derived and shown to be independent of the length of the tree. An upper bound on the average error probability for maximum-likelihood decoding of the ensemble of random L-branch binary trellis codes of rate R = 1/n is derived which separates the effects of the tail length T and the memory length M of the code. It is shown that the bound is independent of the length L of the information sequence. This implication is investigated by computer simulations of sequential decoding utilizing the stack algorithm. These simulations confirm the implication and further suggest an empirical formula for the true undetected decoding error probability with sequential decoding.

  9. Phylogenetic place of guinea pigs: no support of the rodent-polyphyly hypothesis from maximum-likelihood analyses of multiple protein sequences.

    PubMed

    Cao, Y; Adachi, J; Yano, T; Hasegawa, M

    1994-07-01

    Graur et al.'s (1991) hypothesis that the guinea pig-like rodents have an evolutionary origin within mammals that is separate from that of other rodents (the rodent-polyphyly hypothesis) was reexamined by the maximum-likelihood method for protein phylogeny, as well as by the maximum-parsimony and neighbor-joining methods. The overall evidence does not support Graur et al.'s hypothesis, which radically contradicts the traditional view of rodent monophyly. This work demonstrates that we must be careful in choosing a proper method for phylogenetic inference and that an argument based on a small data set (with respect to the length of the sequence and especially the number of species) may be unstable.

  10. Vibration transfer mobility measurements using maximum length sequences

    NASA Astrophysics Data System (ADS)

    Singleton, Herbert L.

    2005-09-01

    Vibration transfer mobility measurements are required under Federal Transit Administration guidelines when developing detailed predictions of ground-borne vibration for rail transit systems. These measurements typically use a large instrumented hammer to generate impulses in the soil. These impulses are measured by an array of accelerometers to characterize the transfer mobility of the ground in a localized area. While effective, these measurements often make use of heavy, custom-engineered equipment to produce the impulse signal. To obtain satisfactory signal-to-noise ratios, it is necessary to generate multiple impulses to generate an average value, but this process involves considerable physical labor in the field. To address these shortcomings, a transfer mobility measurement system utilizing a tactile transducer and maximum length sequences (MLS) was developed. This system uses lightweight off-the-shelf components to significantly reduce the weight and cost of the system. The use of MLS allows for adequate signal-to-noise ratio from the tactile transducer, while minimizing the length of the measurement. Tests of the MLS system show good agreement with the impulse-based method. The combination of the cost savings and reduced weight of this new system facilitates transfer mobility measurements that are less physically demanding, and more economical when compared with current methods.
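
    The reason an MLS excitation can stand in for an impulse is that its circular autocorrelation is nearly a delta function, so cross-correlating the measured response with the excitation recovers the impulse response. The sketch below shows that general principle on a toy, noise-free system; the function names, the 4-stage register and the three-tap response are assumptions for illustration and do not describe the measurement system in the abstract.

```python
def mls_bits(taps, nbits):
    """One period of a Fibonacci LFSR output (taps are 1-based stages)."""
    state = [1] * nbits
    bits = []
    for _ in range(2 ** nbits - 1):
        bits.append(state[-1])
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]
    return bits


def circular_response(x, h):
    """Steady-state output of a linear system h driven by the periodic signal x."""
    n = len(x)
    return [sum(h[m] * x[(i - m) % n] for m in range(len(h))) for i in range(n)]


def recover_impulse_response(x, y):
    """Estimate h from one period of excitation x (+1/-1 MLS) and response y.

    The MLS autocorrelation is N at lag 0 and -1 elsewhere, so the circular
    cross-correlation c[k] equals (N + 1) * h[k] - sum(h). The offset sum(h)
    is obtained from sum(y) / sum(x).
    """
    n = len(x)
    offset = sum(y) / sum(x)
    estimate = []
    for k in range(n):
        c = sum(y[i] * x[(i - k) % n] for i in range(n))
        estimate.append((c + offset) / (n + 1))
    return estimate


# Toy example (assumed values): 15-sample MLS and a 3-tap impulse response.
x = [2 * b - 1 for b in mls_bits((4, 3), 4)]
h_true = [1.0, 0.5, 0.25]
y = circular_response(x, h_true)
h_est = recover_impulse_response(x, y)
```

    In this noise-free case the first three entries of `h_est` match `h_true` and the remaining entries are zero. With real measurements, averaging over repeated periods is what improves the signal-to-noise ratio.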

  11. Mining and gene ontology based annotation of SSR markers from expressed sequence tags of Humulus lupulus

    PubMed Central

    Singh, Swati; Gupta, Sanchita; Mani, Ashutosh; Chaturvedi, Anoop

    2012-01-01

    Humulus lupulus is commonly known as hops, a member of the family Moraceae. Currently many projects are underway leading to the accumulation of voluminous genomic and expressed sequence tag sequences in public databases. The genetically characterized domains in these databases are limited due to non-availability of reliable molecular markers. A large volume of EST sequence data is available for hops. In the present study, simple sequence repeat (SSR) markers extracted from EST data are used as molecular markers for genetic characterization. 25,495 EST sequences were examined and assembled to get full-length sequences. Mononucleotide SSR motifs were the most frequent (60.44% in contigs and 62.16% in singletons), whereas the lowest frequencies were observed for hexanucleotide SSRs in contigs (0.09%) and pentanucleotide SSRs in singletons (0.12%). Most trinucleotide motifs code for glutamic acid (GAA), while AT/TA was the most frequent dinucleotide repeat. Flanking primer pairs were designed in silico for the SSR-containing sequences. Functional categorization of SSR-containing sequences was done through gene ontology terms like biological process, cellular component and molecular function. PMID:22368382
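
    The motif classes counted in this abstract (mono- to hexanucleotide repeats) can be located with a simple regular-expression scan. The sketch below is a generic illustration, not the pipeline used in the study; the minimum repeat counts and the toy sequence are assumptions.

```python
import re

# Assumed minimum repeat counts per motif length (1 = mono ... 6 = hexa).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-base motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are copies of a shorter unit, e.g. "AA".
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Toy sequence: a 12-base A run and six copies of GAA.
toy = "GGC" + "A" * 12 + "CGT" + "GAA" * 6 + "TC"
ssrs = find_ssrs(toy)
```

    On the toy sequence this reports the mononucleotide run starting at position 3 and the GAA trinucleotide repeat starting at position 18.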

  12. De novo transcriptome analysis of an imminent biofuel crop, Camelina sativa L. using Illumina GAIIX sequencing platform and identification of SSR markers.

    PubMed

    Mudalkar, Shalini; Golla, Ramesh; Ghatty, Sreenivas; Reddy, Attipalli Ramachandra

    2014-01-01

    Camelina sativa L. is an emerging biofuel crop with potential applications in industry, medicine, cosmetics and human nutrition. The crop is unexploited owing to very limited availability of transcriptome and genomic data. In order to analyse the various metabolic pathways, we performed de novo assembly of the transcriptome on the Illumina GAIIX platform with paired end sequencing for obtaining short reads. The sequencing output generated a FastQ file size of 2.97 GB with 10.83 million reads having a maximum read length of 101 nucleotides. The number of contigs generated was 53,854 with maximum and minimum lengths of 10,086 and 200 nucleotides respectively. These transcripts were annotated using BLAST search against the Aracyc, Swiss-Prot, TrEMBL, gene ontology and clusters of orthologous groups (KOG) databases. The genes involved in lipid metabolism were studied and the transcription factors were identified. Sequence similarity studies of Camelina with the other related organisms indicated the close relatedness of Camelina with Arabidopsis. In addition, bioinformatics analysis revealed the presence of a total of 19,379 simple sequence repeats. This is the first report on Camelina sativa L. in which the transcriptome of the entire plant, including seedlings, seed, root, leaves and stem, was analysed. Our data establish an excellent resource for gene discovery and provide useful information for functional and comparative genomic studies in this promising biofuel crop.

  13. Maximum-likelihood soft-decision decoding of block codes using the A* algorithm

    NASA Technical Reports Server (NTRS)

    Ekroot, L.; Dolinar, S.

    1994-01-01

    The A* algorithm finds the path in a finite depth binary tree that optimizes a function. Here, it is applied to maximum-likelihood soft-decision decoding of block codes where the function optimized over the codewords is the likelihood function of the received sequence given each codeword. The algorithm considers codewords one bit at a time, making use of the most reliable received symbols first and pursuing only the partially expanded codewords that might be maximally likely. A version of the A* algorithm for maximum-likelihood decoding of block codes has been implemented for block codes up to 64 bits in length. The efficiency of this algorithm makes simulations of codes up to length 64 feasible. This article details the implementation currently in use, compares the decoding complexity with that of exhaustive search and Viterbi decoding algorithms, and presents performance curves obtained with this implementation of the A* algorithm for several codes.

  14. Intensity inhomogeneity correction for magnetic resonance imaging of human brain at 7T.

    PubMed

    Uwano, Ikuko; Kudo, Kohsuke; Yamashita, Fumio; Goodwin, Jonathan; Higuchi, Satomi; Ito, Kenji; Harada, Taisuke; Ogawa, Akira; Sasaki, Makoto

    2014-02-01

    To evaluate the performance and efficacy for intensity inhomogeneity correction of various sequences of the human brain in 7T MRI using the extended version of the unified segmentation algorithm. Ten healthy volunteers were scanned with four different sequences (2D spin echo [SE], 3D fast SE, 2D fast spoiled gradient echo, and 3D time-of-flight) by using a 7T MRI system. Intensity inhomogeneity correction was performed using the "New Segment" module in SPM8 with four different values (120, 90, 60, and 30 mm) of full width at half maximum (FWHM) in Gaussian smoothness. The uniformity in signals in the entire white matter was evaluated using the coefficient of variation (CV); mean signal intensities between the subcortical and deep white matter were compared, and contrast between subcortical white matter and gray matter was measured. The length of the lenticulostriate artery (LSA) was measured on maximum intensity projection (MIP) images in the original and corrected images. In all sequences, the CV decreased as the FWHM value decreased. The differences of mean signal intensities between subcortical and deep white matter also decreased with smaller FWHM values. The contrast between white and gray matter was maintained at all FWHM values. LSA length was significantly greater in corrected MIP than in the original MIP images. Intensity inhomogeneity in 7T MRI can be successfully corrected using SPM8 for various scan sequences.

  15. EXONSAMPLER: a computer program for genome-wide and candidate gene exon sampling for targeted next-generation sequencing.

    PubMed

    Cosart, Ted; Beja-Pereira, Albano; Luikart, Gordon

    2014-11-01

    The computer program EXONSAMPLER automates the sampling of thousands of exon sequences from publicly available reference genome sequences and gene annotation databases. It was designed to provide exon sequences for the efficient, next-generation gene sequencing method called exon capture. The exon sequences can be sampled by a list of gene name abbreviations (e.g. IFNG, TLR1), or by sampling exons from genes spaced evenly across chromosomes. It provides a list of genomic coordinates (a bed file), as well as a set of sequences in fasta format. User-adjustable parameters for collecting exon sequences include a minimum and maximum acceptable exon length, maximum number of exonic base pairs (bp) to sample per gene, and maximum total bp for the entire collection. It allows for partial sampling of very large exons. It can preferentially sample upstream (5 prime) exons, downstream (3 prime) exons, both external exons, or all internal exons. It is written in the Python programming language using its free libraries. We describe the use of EXONSAMPLER to collect exon sequences from the domestic cow (Bos taurus) genome for the design of an exon-capture microarray to sequence exons from related species, including the zebu cow and wild bison. We collected ~10% of the exome (~3 million bp), including 155 candidate genes, and ~16,000 exons evenly spaced genomewide. We prioritized the collection of 5 prime exons to facilitate discovery and genotyping of SNPs near upstream gene regulatory DNA sequences, which control gene expression and are often under natural selection. © 2014 John Wiley & Sons Ltd.
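
    The user-adjustable limits listed in this abstract (minimum and maximum exon length, maximum base pairs per gene, maximum total base pairs) amount to a budgeted filter over annotated exons. The sketch below is a hypothetical illustration of that idea only: the function names, the greedy order, the coordinates and the gene label GENE_X are assumptions, and it does not reproduce EXONSAMPLER's actual interface or its partial sampling of very large exons.

```python
def sample_exons(exons, min_len, max_len, max_bp_per_gene, max_total_bp):
    """Greedily keep exons that fit the length limits and bp budgets.

    exons: iterable of (gene, chrom, start, end) using BED-style
    0-based, half-open coordinates.
    """
    bp_per_gene = {}
    total_bp = 0
    kept = []
    for gene, chrom, start, end in exons:
        length = end - start
        if not (min_len <= length <= max_len):
            continue  # outside the acceptable exon length range
        if bp_per_gene.get(gene, 0) + length > max_bp_per_gene:
            continue  # would exceed the per-gene budget
        if total_bp + length > max_total_bp:
            continue  # would exceed the budget for the whole collection
        bp_per_gene[gene] = bp_per_gene.get(gene, 0) + length
        total_bp += length
        kept.append((gene, chrom, start, end))
    return kept


def to_bed(kept):
    """Format the kept exons as tab-separated BED lines (chrom, start, end, name)."""
    return "\n".join(f"{chrom}\t{start}\t{end}\t{gene}" for gene, chrom, start, end in kept)


# Toy annotation with made-up coordinates.
exons = [
    ("IFNG", "chr5", 100, 250),     # 150 bp, kept
    ("IFNG", "chr5", 400, 430),     # 30 bp, shorter than min_len
    ("IFNG", "chr5", 600, 800),     # 200 bp, exceeds the per-gene budget
    ("TLR1", "chr6", 50, 300),      # 250 bp, kept
    ("GENE_X", "chr7", 900, 1000),  # 100 bp, exceeds the total budget
]
kept = sample_exons(exons, min_len=50, max_len=500,
                    max_bp_per_gene=300, max_total_bp=450)
bed_text = to_bed(kept)
```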

  16. Time optimal paths for high speed maneuvering

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Reister, D.B.; Lenhart, S.M.

    Recent theoretical results have completely solved the problem of determining the minimum length path for a vehicle with a minimum turning radius moving from an initial configuration to a final configuration. Time optimal paths for a constant speed vehicle are a subset of the minimum length paths. This paper uses the Pontryagin maximum principle to find time optimal paths for a constant speed vehicle. The time optimal paths consist of sequences of arcs of circles and straight lines. The maximum principle introduces concepts (dual variables, bang-bang solutions, singular solutions, and transversality conditions) that provide important insight into the nature of the time optimal paths. We explore the properties of the optimal paths and present some experimental results for a mobile robot following an optimal path.

  17. NMR studies on the structure and dynamics of lac operator DNA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, S.C.

    Nuclear Magnetic Resonance spectroscopy was used to elucidate the relationships between structure, dynamics and function of the gene regulatory sequence corresponding to the lactose operon operator of Escherichia coli. The length of the DNA fragments examined varied from 13 to 36 base pairs, containing all or part of the operator sequence. These DNA fragments are either derived genetically or synthesized chemically. Resonances of the imino protons were assigned by one dimensional inter-base pair nuclear Overhauser enhancement (NOE) measurements. Imino proton exchange rates were measured by saturation recovery methods. Results from the kinetic measurements show an interesting dynamic heterogeneity with a maximum opening rate centered about a GTG/CAC sequence which correlates with the biological function of the operator DNA. This particular three base pair sequence occurs frequently and often symmetrically in prokaryotic and eukaryotic DNA sites where one anticipates specific protein interaction for gene regulation. The observed sequence dependent imino proton exchange rate may be a reflection of variation of the local structure of regulatory DNA. The results also indicate that the observed imino proton exchange rates are length dependent.

  18. Fast and accurate estimation of the covariance between pairwise maximum likelihood distances.

    PubMed

    Gil, Manuel

    2014-01-01

    Pairwise evolutionary distances are a model-based summary statistic for a set of molecular sequences. They represent the leaf-to-leaf path lengths of the underlying phylogenetic tree. Estimates of pairwise distances with overlapping paths covary because of shared mutation events. It is desirable to take this covariance structure into account to increase precision in any process that compares or combines distances. This paper introduces a fast estimator for the covariance of two pairwise maximum likelihood distances, estimated under general Markov models. The estimator is based on a conjecture (going back to Nei & Jin, 1989) which links the covariance to path lengths. It is proven here under a simple symmetric substitution model. A simulation shows that the estimator outperforms previously published ones in terms of the mean squared error.

  19. Fast and accurate estimation of the covariance between pairwise maximum likelihood distances

    PubMed Central

    2014-01-01

    Pairwise evolutionary distances are a model-based summary statistic for a set of molecular sequences. They represent the leaf-to-leaf path lengths of the underlying phylogenetic tree. Estimates of pairwise distances with overlapping paths covary because of shared mutation events. It is desirable to take this covariance structure into account to increase precision in any process that compares or combines distances. This paper introduces a fast estimator for the covariance of two pairwise maximum likelihood distances, estimated under general Markov models. The estimator is based on a conjecture (going back to Nei & Jin, 1989) which links the covariance to path lengths. It is proven here under a simple symmetric substitution model. A simulation shows that the estimator outperforms previously published ones in terms of the mean squared error. PMID:25279263

  20. Time optimal paths for high speed maneuvering

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Reister, D.B.; Lenhart, S.M.

    1993-01-01

    Recent theoretical results have completely solved the problem of determining the minimum length path for a vehicle with a minimum turning radius moving from an initial configuration to a final configuration. Time optimal paths for a constant speed vehicle are a subset of the minimum length paths. This paper uses the Pontryagin maximum principle to find time optimal paths for a constant speed vehicle. The time optimal paths consist of sequences of arcs of circles and straight lines. The maximum principle introduces concepts (dual variables, bang-bang solutions, singular solutions, and transversality conditions) that provide important insight into the nature of the time optimal paths. We explore the properties of the optimal paths and present some experimental results for a mobile robot following an optimal path.

  1. Generalized theory on the mechanism of site-specific DNA-protein interactions

    NASA Astrophysics Data System (ADS)

    Niranjani, G.; Murugan, R.

    2016-05-01

    We develop a generalized theoretical framework on the binding of transcription factor proteins (TFs) with specific sites on DNA that takes into account the interplay of various factors regarding overall electrostatic potential at the DNA-protein interface, occurrence of kinetic traps along the DNA sequence, presence of other roadblock protein molecules along DNA and crowded environment, conformational fluctuations in the DNA binding domains (DBDs) of TFs, and the conformational state of the DNA. Starting from a Smoluchowski-type theoretical framework on site-specific binding of TFs we logically build our model by adding the effects of these factors one by one. Our generalized two-step model suggests that the electrostatic attractive forces present between the positively charged DBDs of TFs and the negatively charged phosphate backbone of DNA, along with the counteracting shielding effects of solvent ions, are the core factor that creates a fluidic type environment at the DNA-protein interface. This in turn facilitates various one-dimensional diffusion (1Dd) processes such as sliding, hopping and intersegmental transfers. These facilitating processes as well as flipping dynamics of conformational states of DBDs of TFs between stationary and mobile states can enhance the 1Dd coefficient on a par with three-dimensional diffusion (3Dd). The random coil conformation of DNA also plays critical roles in enhancing the site-specific association rate. The extent of enhancement over the 3Dd controlled rate seems to be directly proportional to the maximum possible 1Dd length. We show that the overall site-specific binding rate scales with the length of DNA in an asymptotic way. For relaxed DNA, the specific binding rate will be independent of the length of DNA as length increases towards infinity. For condensed DNA as in in vivo conditions, the specific binding rate depends on the length of DNA in a turnover way with a maximum. This maximum rate seems to scale with the maximum possible 1Dd length of TFs in a square root manner. Results suggest that 1Dd processes contribute much less to the enhancement of specific binding rate under in vivo conditions for condensed DNA. There exists a critical length of binding stretch of TFs beyond which the probability associated with the random occurrence of similar specific binding sites will be close to zero. TFs in natural systems from prokaryotes to eukaryotes seem to handle sequence-mediated kinetic traps via increasing the length of their recognition stretch or combinatorial binding. TFs overcome the hurdles of roadblocks via switching efficiently between sliding, hopping and intersegmental transfer modes. The site-specific binding rate as well as the maximum possible 1Dd length seem to be directly proportional to the square root of the probability (p_R) of finding a nonspecific binding site to be free from dynamic roadblocks. Here p_R seems to be a function of the number of nsbs available per DNA binding protein (ϕ) inside the living cell. It seems that p_R > 0.8 when ϕ > 10 which is true for the Escherichia coli cell system.

  2. Full-length genomic characterization and molecular evolution of canine parvovirus in China.

    PubMed

    Zhou, Ling; Tang, Qinghai; Shi, Lijun; Kong, Miaomiao; Liang, Lin; Mao, Qianqian; Bu, Bin; Yao, Lunguang; Zhao, Kai; Cui, Shangjin; Leal, Élcio

    2016-06-01

    Canine parvovirus type 2 (CPV-2) can cause acute haemorrhagic enteritis in dogs and myocarditis in puppies. This disease has become one of the most serious infectious diseases of dogs. During 2014 in China, there were many cases of acute infectious diarrhoea in dogs. Some faecal samples were negative for the CPV-2 antigen based on a colloidal gold test strip but were positive based on PCR, and a viral strain was isolated from one such sample. The cytopathic effect on susceptible cells and the results of the immunoperoxidase monolayer assay, PCR, and sequencing indicated that the pathogen was CPV-2. The strain was named CPV-NY-14, and the full-length genome was sequenced and analysed. A maximum likelihood tree was constructed using the full-length genome and all available CPV-2 genomes. New strains have replaced the original strain in Taiwan and Italy, although the CPV-2a strain is still predominant there. However, CPV-2a still causes many cases of acute infectious diarrhoea in dogs in China.

  3. Complete sequence of two tick-borne flaviviruses isolated from Siberia and the UK: analysis and significance of the 5' and 3'-UTRs.

    PubMed

    Gritsun, T S; Venugopal, K; Zanotto, P M; Mikhailov, M V; Sall, A A; Holmes, E C; Polkinghorne, I; Frolova, T V; Pogodina, V V; Lashkevich, V A; Gould, E A

    1997-05-01

    The complete nucleotide sequences of two tick-transmitted flaviviruses, Vasilchenko (Vs) from Siberia and louping ill (LI) from the UK, have been determined. The genomes were, respectively, 10928 and 10871 nucleotides (nt) in length. The coding strategy and functional protein sequence motifs of tick-borne flaviviruses are presented in both Vs and LI viruses. The phylogenies based on maximum likelihood, maximum parsimony and distance analysis of the polyproteins identified Vs virus as a member of the tick-borne encephalitis virus subgroup within the tick-borne serocomplex, genus Flavivirus, family Flaviviridae. Comparative alignment of the 3'-untranslated regions revealed deletions of different lengths essentially at the same position downstream of the stop codon for all tick-borne viruses. Two direct 27 nucleotide repeats at the 3'-end were found only for Vs and LI virus. Immediately following the deletions, a region of 332-334 nt with relatively conserved primary structure (67-94% identity) was observed at the 3'-non-coding end of the virus genome. Pairwise comparisons of the nucleotide sequence data revealed similar levels of variation between the coding region and the 5' and 3'-termini of the genome, implying an equivalent strong selective control for translated and untranslated regions. Indeed, the predicted folding of the 5' and 3'-untranslated regions revealed patterns of stem and loop structures conserved for all tick-borne flaviviruses, suggesting a purifying selection for preservation of essential RNA secondary structures which could be involved in translational control and replication. The possible implications of these findings are discussed.

  4. Air charged and microtip catheters cannot be used interchangeably for urethral pressure measurement: a prospective, single-blind, randomized trial.

    PubMed

    Zehnder, Pascal; Roth, Beat; Burkhard, Fiona C; Kessler, Thomas M

    2008-09-01

    We determined and compared urethral pressure measurements using air charged and microtip catheters in a prospective, single-blind, randomized trial. A consecutive series of 64 women referred for urodynamic investigation underwent sequential urethral pressure measurements using an air charged and a microtip catheter in randomized order. Patients were blinded to the type and sequence of catheter used. Agreement between the 2 catheter systems was assessed using the Bland and Altman 95% limits of agreement method. Intraclass correlation coefficients of air charged and microtip catheters for maximum urethral closure pressure at rest were 0.97 and 0.93, and for functional profile length they were 0.9 and 0.78, respectively. Pearson's correlation coefficients and Lin's concordance coefficients of air charged and microtip catheters were r = 0.82 and rho = 0.79 for maximum urethral closure pressure at rest, and r = 0.73 and rho = 0.7 for functional profile length, respectively. When applying the Bland and Altman method, air charged catheters gave higher readings than microtip catheters for maximum urethral closure pressure at rest (mean difference 7.5 cm H(2)O) and functional profile length (mean difference 1.8 mm). There were wide 95% limits of agreement for differences in maximum urethral closure pressure at rest (-24.1 to 39 cm H(2)O) and functional profile length (-7.7 to 11.3 mm). For urethral pressure measurement the air charged catheter is at least as reliable as the microtip catheter and it generally gives higher readings. However, air charged and microtip catheters cannot be used interchangeably for clinical purposes because of insufficient agreement. Hence, clinicians should be aware that air charged and microtip catheters may yield completely different results, and these differences should be acknowledged during clinical decision making.
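
    The Bland and Altman limits of agreement named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `limits_of_agreement` and the sample readings are hypothetical, and the limits are assumed to be the mean difference plus or minus 1.96 standard deviations of the paired differences.

```python
from statistics import mean, stdev


def limits_of_agreement(a, b, z=1.96):
    """Return (mean difference, lower limit, upper limit) for paired readings."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)           # systematic offset between the two methods
    spread = z * stdev(diffs)    # half-width of the 95% limits of agreement
    return bias, bias - spread, bias + spread


# Hypothetical paired pressure readings in cm H2O (not data from the trial).
air_charged = [62.0, 75.0, 48.0, 90.0, 55.0]
microtip = [55.0, 70.0, 50.0, 78.0, 52.0]
bias, lower, upper = limits_of_agreement(air_charged, microtip)
```

    The limits reported above (-24.1 to 39 cm H2O) lie roughly 31.5 cm H2O either side of the 7.5 cm H2O mean difference, which is the symmetric shape this construction produces.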

  5. Length variation and sequence divergence in mitochondrial control region of Schizothoracine (Teleostei: Cyperinidae) species.

    PubMed

    Syed, Mudasir Ahmad; Bhat, Farooz Ahmad; Balkhi, Masood-ul Hassan; Bhat, Bilal Ahmad

    2016-01-01

    Schizothoracine fish, commonly called snow trouts, inhabit the entire network of snow- and spring-fed cool waters of Kashmir, India. Of over 10 species reported earlier, only five have been found: Schizothorax niger, Schizothorax esocinus, Schizothorax plagiostomus, Schizothorax curvifrons and Schizothorax labiatus. The reported relationships between these species are contradictory. To understand the evolutionary relation of these species, we examined the sequence information of the mitochondrial D-loop of 25 individuals representing five species. Sequence alignment showed the D-loop region to be highly variable, and length variation was observed in the di-nucleotide (TA)n microsatellite between and within species. Interestingly, all these species have a (TA)n microsatellite not associated with longer tandem repeats at the 3' end of the mitochondrial control region and do not show heteroplasmy. Our analysis also indicates the presence of four conserved sequence blocks (CSB), CSB-D, CSB-1, CSB-II and CSB-III, four Termination Associated Sequence (TAS) motifs and a 15 bp pyrimidine block within the mitochondrial control region, which are highly conserved within genus Schizothorax when compared with other species. The phylogenetic analyses carried out by Maximum likelihood (ML), Neighbor Joining (NJ) and Bayesian inference (BI) generated almost identical results. The resultant BI tree showed a close genetic relationship of all the five species and supports two distinct groupings of S. esocinus. Besides the species relation, the presence of length variation in tandem repeats is attributed to differences in predicting the stability of secondary structures. The role of CSBs and TASs, reported so far as main regulatory signals, would explain the conservation of these elements in evolution.

  6. Single molecule sequencing-guided scaffolding and correction of draft assemblies.

    PubMed

    Zhu, Shenglong; Chen, Danny Z; Emrich, Scott J

    2017-12-06

    Although single molecule sequencing is still improving, the lengths of the generated sequences are inevitably an advantage in genome assembly. Prior work that utilizes long reads to conduct genome assembly has mostly focused on correcting sequencing errors and improving contiguity of de novo assemblies. We propose a disassembling-reassembling approach for both correcting structural errors in the draft assembly and scaffolding a target assembly based on error-corrected single molecule sequences. To achieve this goal, we formulate a maximum alternating path cover problem. We prove that this problem is NP-hard, and solve it by a 2-approximation algorithm. Our experimental results show that our approach can improve the structural correctness of target assemblies at the cost of some contiguity, even with smaller amounts of long reads. In addition, our reassembling process can also serve as a competitive scaffolder relative to well-established assembly benchmarks.

  7. Recent horizontal transfer of mellifera subfamily mariner transposons into insect lineages representing four different orders shows that selection acts only during horizontal transfer.

    PubMed

    Lampe, David J; Witherspoon, David J; Soto-Adames, Felipe N; Robertson, Hugh M

    2003-04-01

    We report the isolation and sequencing of genomic copies of mariner transposons involved in recent horizontal transfers into the genomes of the European earwig, Forficula auricularia; the European honey bee, Apis mellifera; the Mediterranean fruit fly, Ceratitis capitata; and a blister beetle, Epicauta funebris, insects from four different orders. These elements are in the mellifera subfamily and are the second documented example of full-length mariner elements involved in this kind of phenomenon. We applied maximum likelihood methods to the coding sequences and determined that the copies in each genome were evolving neutrally, whereas reconstructed ancestral coding sequences appeared to be under selection, which strengthens our previous hypothesis that the primary selective constraint on mariner sequence evolution is the act of horizontal transfer between genomes.

  8. Autocorrelation peaks in congruential pseudorandom number generators

    NASA Technical Reports Server (NTRS)

    Neuman, F.; Merrick, R. B.

    1976-01-01

    The complete correlation structure of several congruential pseudorandom number generators (PRNG) of the same type and small cycle length was studied to deal with the problem of congruential PRNG almost repeating themselves at intervals smaller than their cycle lengths, during simulation of bandpass filtered normal random noise. Maximum period multiplicative and mixed congruential generators were studied, with inferences drawn from examination of several tractable members of a class of random number generators, and moduli from 2 to the 5th power to 2 to the 9th power. High correlation is shown to exist in mixed and multiplicative congruential random number generators and prime moduli Lehmer generators for shifts a fraction of their cycle length. The random noise sequences in question are required when simulating electrical noise, air turbulence, or time variation of wind parameters.
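
    The near-repetition at a fraction of the cycle length can be reproduced with a small mixed congruential generator. The sketch below is only an illustration under stated assumptions: the multiplier 5 and increment 1 are hypothetical choices that give a full period for a power-of-two modulus, not parameters taken from the report, and `lcg_cycle` and `circular_autocorrelation` are names introduced here. It relies on the known property that a full-period generator with modulus m = 2^k satisfies x[n + m/2] = x[n] + m/2 (mod m).

```python
def lcg_cycle(a, c, m, seed=0):
    """Return one full cycle of x -> (a*x + c) mod m, starting at seed."""
    xs = [seed]
    x = (a * seed + c) % m
    while x != seed:
        xs.append(x)
        x = (a * x + c) % m
    return xs


def circular_autocorrelation(xs, shift):
    """Normalized autocorrelation of one cycle at the given circular shift."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    cov = sum((xs[i] - mu) * (xs[(i + shift) % n] - mu) for i in range(n)) / n
    return cov / var


M = 2 ** 9  # largest modulus in the range the abstract mentions
cycle = lcg_cycle(a=5, c=1, m=M)  # illustrative parameters (assumption)
half_shift_corr = circular_autocorrelation(cycle, M // 2)
quarter_shift_corr = circular_autocorrelation(cycle, M // 4)
```

    At a shift of half the cycle every value differs from its partner by exactly m/2 modulo m, so the correlation at that shift is far from the near-zero value expected of independent noise.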

  9. High-Frame-Rate Doppler Ultrasound Using a Repeated Transmit Sequence

    PubMed Central

    Podkowa, Anthony S.; Oelze, Michael L.; Ketterling, Jeffrey A.

    2018-01-01

    The maximum detectable velocity of high-frame-rate color flow Doppler ultrasound is limited by the imaging frame rate when using coherent compounding techniques. Traditionally, high quality ultrasonic images are produced at a high frame rate via coherent compounding of steered plane wave reconstructions. However, this compounding operation results in an effective downsampling of the slow-time signal, thereby artificially reducing the frame rate. To alleviate this effect, a new transmit sequence is introduced where each transmit angle is repeated in succession. This transmit sequence allows for direct comparison between low resolution, pre-compounded frames at a short time interval in ways that are resistant to sidelobe motion. Use of this transmit sequence increases the maximum detectable velocity by a scale factor of the transmit sequence length. The performance of this new transmit sequence was evaluated using a rotating cylindrical phantom and compared with traditional methods using a 15-MHz linear array transducer. Axial velocity estimates were recorded for a range of ±300 mm/s and compared to the known ground truth. Using these new techniques, the root mean square error was reduced from over 400 mm/s to below 50 mm/s in the high-velocity regime compared to traditional techniques. The standard deviation of the velocity estimate in the same velocity range was reduced from 250 mm/s to 30 mm/s. This result demonstrates the viability of the repeated transmit sequence methods in detecting and quantifying high-velocity flow. PMID:29910966
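
    The scaling of the maximum detectable velocity with the transmit sequence length can be illustrated with the standard Doppler aliasing limit, v_max = c * f_slow / (4 * f0). This is a simplified sketch rather than the authors' method: the pulse repetition frequency, the number of angles and the sound speed of 1540 m/s are assumed values, and only the 15-MHz centre frequency comes from the abstract.

```python
def nyquist_velocity(slow_time_rate_hz, center_freq_hz, sound_speed_m_s=1540.0):
    """Aliasing limit v_max = c * f_slow / (4 * f0) for axial Doppler velocity."""
    return sound_speed_m_s * slow_time_rate_hz / (4.0 * center_freq_hz)


PRF_HZ = 9000.0   # hypothetical pulse repetition frequency
N_ANGLES = 9      # hypothetical number of steered plane waves per frame
F0_HZ = 15e6      # 15-MHz transducer, as stated in the abstract

# Conventional compounding: one slow-time sample per N_ANGLES transmits.
v_compounded = nyquist_velocity(PRF_HZ / N_ANGLES, F0_HZ)

# Repeated transmit sequence: consecutive transmits at the same angle
# are compared, so the slow-time rate is the full PRF.
v_repeated = nyquist_velocity(PRF_HZ, F0_HZ)

gain = v_repeated / v_compounded
```

    Under these assumptions the gain equals the number of angles in the sequence, matching the scale factor stated in the abstract.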

  10. Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics

    PubMed Central

    Kolaczkowski, Bryan; Thornton, Joseph W.

    2009-01-01

    Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias—which is apparent under both controlled simulation conditions and in analyses of empirical sequence data—also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages—that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis. PMID:20011052

  11. Long-branch attraction bias and inconsistency in Bayesian phylogenetics.

    PubMed

    Kolaczkowski, Bryan; Thornton, Joseph W

    2009-12-09

    Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias--which is apparent under both controlled simulation conditions and in analyses of empirical sequence data--also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages--that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis.

  12. 49 CFR 236.55 - Dead section; maximum length.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 49 Transportation 4 2010-10-01 2010-10-01 false Dead section; maximum length. 236.55 Section 236... Instructions: All Systems Track Circuits § 236.55 Dead section; maximum length. Where dead section exceeds 35... over such dead section is less than 35 feet, the maximum length of the dead section shall not exceed...

  13. 49 CFR 236.55 - Dead section; maximum length.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 49 Transportation 4 2014-10-01 2014-10-01 false Dead section; maximum length. 236.55 Section 236... Instructions: All Systems Track Circuits § 236.55 Dead section; maximum length. Where dead section exceeds 35... over such dead section is less than 35 feet, the maximum length of the dead section shall not exceed...

  14. 49 CFR 236.55 - Dead section; maximum length.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 49 Transportation 4 2011-10-01 2011-10-01 false Dead section; maximum length. 236.55 Section 236... Instructions: All Systems Track Circuits § 236.55 Dead section; maximum length. Where dead section exceeds 35... over such dead section is less than 35 feet, the maximum length of the dead section shall not exceed...

  15. 49 CFR 236.55 - Dead section; maximum length.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 49 Transportation 4 2013-10-01 2013-10-01 false Dead section; maximum length. 236.55 Section 236... Instructions: All Systems Track Circuits § 236.55 Dead section; maximum length. Where dead section exceeds 35... over such dead section is less than 35 feet, the maximum length of the dead section shall not exceed...

  16. 49 CFR 236.55 - Dead section; maximum length.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 49 Transportation 4 2012-10-01 2012-10-01 false Dead section; maximum length. 236.55 Section 236... Instructions: All Systems Track Circuits § 236.55 Dead section; maximum length. Where dead section exceeds 35... over such dead section is less than 35 feet, the maximum length of the dead section shall not exceed...

  17. The complete chloroplast genome sequence of Euonymus japonicus (Celastraceae).

    PubMed

    Choi, Kyoung Su; Park, SeonJoo

    2016-09-01

    The complete chloroplast (cp) genome sequence of Euonymus japonicus, the first sequenced in the genus Euonymus, is reported in this study. The total length was 157 637 bp, containing a pair of 26 678 bp inverted repeat (IR) regions, which were separated by a small single copy (SSC) region and a large single copy (LSC) region of 18 340 bp and 85 941 bp, respectively. The genome contains 107 unique genes, including 74 coding genes, four rRNA genes, and 29 tRNA genes. Seventeen genes of E. japonicus contain introns, of which three (clpP, ycf3, and rps12) include two introns. The maximum likelihood (ML) phylogenetic analysis revealed that E. japonicus was closely related to Manihot and Populus.
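
    The region lengths quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch assumes the usual quadripartite layout (one LSC, one SSC and two IR copies); the variable names are introduced here for illustration.

```python
# Region lengths in bp, taken from the abstract.
LSC = 85_941
SSC = 18_340
IR = 26_678        # length of each of the two inverted repeat copies
TOTAL = 157_637

# Quadripartite genome: LSC + SSC + two IR copies.
assembled = LSC + SSC + 2 * IR

# Gene counts from the abstract: coding + rRNA + tRNA genes.
gene_total = 74 + 4 + 29
```

    Both sums agree with the reported totals of 157 637 bp and 107 unique genes.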

  18. Maximum step length: relationships to age and knee and hip extensor capacities.

    PubMed

    Schulz, Brian W; Ashton-Miller, James A; Alexander, Neil B

    2007-07-01

    Maximum Step Length may be used to identify older adults at increased risk for falls. Since leg muscle weakness is a risk factor for falls, we tested the hypotheses that maximum knee and hip extension speed, strength, and power capacities would significantly correlate with Maximum Step Length and also that the "step out and back" Maximum Step Length [Medell, J.L., Alexander, N.B., 2000. A clinical measure of maximal and rapid stepping in older women. J. Gerontol. A Biol. Sci. Med. Sci. 55, M429-M433.] would also correlate with the Maximum Step Length of its two sub-tasks: stepping "out only" and stepping "back only". These sub-tasks will be referred to as versions of Maximum Step Length. Unimpaired younger (N=11, age=24[3]years) and older (N=10, age=73[5]years) women performed the above three versions of Maximum Step Length. Knee and hip extension speed, strength, and power capacities were determined on a separate day and regressed on Maximum Step Length and age group. Version and practice effects were quantified and subjective impressions of test difficulty recorded. Hypotheses were tested using linear regressions, analysis of variance, and Fisher's exact test. Maximum Step Length explained 6-22% additional variance in knee and hip extension speed, strength, and power capacities after controlling for age group. Within- and between-block and test-retest correlation values were high (>0.9) for all test versions. Shorter Maximum Step Lengths are associated with reduced knee and hip extension speed, strength, and power capacities after controlling for age. A single out-and-back step of maximal length is a feasible, rapid screening measure that may provide insight into underlying functional impairment, regardless of age.

  19. Streaming current magnetic fields in a charged nanopore.

    PubMed

    Mansouri, Abraham; Taheri, Peyman; Kostiuk, Larry W

    2016-11-11

    Magnetic fields induced by currents created in pressure driven flows inside a solid-state charged nanopore were modeled by numerically solving a system of steady state continuum partial differential equations, i.e., Poisson, Nernst-Planck, Ampere and Navier-Stokes equations (PNPANS). This analysis was based on non-dimensional transport governing equations that were scaled using Debye length as the characteristic length scale, and applied to a finite length cylindrical nano-channel. The comparison of numerical and analytical studies shows an excellent agreement and verified the magnetic fields density both inside and outside the nanopore. The radially non-uniform currents resulted in highly non-uniform magnetic fields within the nanopore that decay as 1/r outside the nanopore. It is worth noting that for either streaming currents or streaming potential cases, the maximum magnetic field occurred inside the pore in the vicinity of nanopore wall, as opposed to a cylindrical conductor that carries a steady electric current where the maximum magnetic fields occur at the perimeter of conductor. Based on these results, it is suggested and envisaged that non-invasive external magnetic fields readouts generated by streaming/ionic currents may be viewed as secondary electronic signatures of biomolecules to complement and enhance current DNA nanopore sequencing techniques.

  20. Streaming current magnetic fields in a charged nanopore

    NASA Astrophysics Data System (ADS)

    Mansouri, Abraham; Taheri, Peyman; Kostiuk, Larry W.

    2016-11-01

    Magnetic fields induced by currents created in pressure driven flows inside a solid-state charged nanopore were modeled by numerically solving a system of steady state continuum partial differential equations, i.e., Poisson, Nernst-Planck, Ampere and Navier-Stokes equations (PNPANS). This analysis was based on non-dimensional transport governing equations that were scaled using Debye length as the characteristic length scale, and applied to a finite length cylindrical nano-channel. The comparison of numerical and analytical studies shows an excellent agreement and verified the magnetic fields density both inside and outside the nanopore. The radially non-uniform currents resulted in highly non-uniform magnetic fields within the nanopore that decay as 1/r outside the nanopore. It is worth noting that for either streaming currents or streaming potential cases, the maximum magnetic field occurred inside the pore in the vicinity of nanopore wall, as opposed to a cylindrical conductor that carries a steady electric current where the maximum magnetic fields occur at the perimeter of conductor. Based on these results, it is suggested and envisaged that non-invasive external magnetic fields readouts generated by streaming/ionic currents may be viewed as secondary electronic signatures of biomolecules to complement and enhance current DNA nanopore sequencing techniques.

  1. Aligner optimization increases accuracy and decreases compute times in multi-species sequence data.

    PubMed

    Robinson, Kelly M; Hawkins, Aziah S; Santana-Cruz, Ivette; Adkins, Ricky S; Shetty, Amol C; Nagaraj, Sushma; Sadzewicz, Lisa; Tallon, Luke J; Rasko, David A; Fraser, Claire M; Mahurkar, Anup; Silva, Joana C; Dunning Hotopp, Julie C

    2017-09-01

    As sequencing technologies have evolved, the tools to analyze these sequences have made similar advances. However, for multi-species samples, we observed important and adverse differences in alignment specificity and computation time for bwa-mem (Burrows-Wheeler aligner-maximum exact matches) relative to bwa-aln. Therefore, we sought to optimize bwa-mem for alignment of data from multi-species samples in order to reduce alignment time and increase the specificity of alignments. In the multi-species cases examined, there was one majority member (i.e. Plasmodium falciparum or Brugia malayi) and one minority member (i.e. human or the Wolbachia endosymbiont wBm) of the sequence data. Increasing bwa-mem seed length from the default value reduced the number of read pairs from the majority sequence member that incorrectly aligned to the reference genome of the minority sequence member. Combining both source genomes into a single reference genome increased the specificity of mapping, while also reducing the central processing unit (CPU) time. In Plasmodium, at a seed length of 18 nt, 24.1% of reads mapped to the human genome using 1.7±0.1 CPU hours, while 83.6% of reads mapped to the Plasmodium genome using 0.2±0.0 CPU hours (total: 107.7% reads mapping; in 1.9±0.1 CPU hours). In contrast, 97.1% of the reads mapped to a combined Plasmodium-human reference in only 0.7±0.0 CPU hours. Overall, the results suggest that combining all references into a single reference database and using a 23 nt seed length reduces the computational time, while maximizing specificity. Similar results were found for simulated sequence reads from a mock metagenomic data set. We found similar improvements to computation time in a publicly available human-only data set.
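
    A run with the recommended settings might be assembled as in the sketch below. It is an illustration, not the authors' pipeline: the helper `bwa_mem_command` and all file names are hypothetical, and it assumes that the combined reference has already been indexed. The `-k` option of `bwa mem` sets the minimum seed length and `-t` the number of threads.

```python
def bwa_mem_command(reference, reads_1, reads_2, seed_length=23, threads=1):
    """Build an argument list for a paired-end `bwa mem` run."""
    if seed_length < 1 or threads < 1:
        raise ValueError("seed_length and threads must be positive")
    return [
        "bwa", "mem",
        "-k", str(seed_length),   # minimum seed length in nt
        "-t", str(threads),       # number of threads
        reference, reads_1, reads_2,
    ]


# Hypothetical file names; the reference holds both genomes concatenated.
cmd = bwa_mem_command("combined_reference.fa", "reads_1.fq", "reads_2.fq")
```

    The list can be passed to `subprocess.run(cmd, stdout=sam_file, check=True)` with an open output file, since `bwa mem` writes its SAM output to standard output.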

  2. Mitochondrial genome sequences of landsnails Aegista diversifamilia and Dolicheulota formosensis (Gastropoda: Pulmonata: Stylommatophora).

    PubMed

    Huang, Chih-Wei; Lin, Si-Min; Wu, Wen-Lung

    2016-07-01

    The first mitochondrial genome sequences of Aegista and Dolicheulota belonging to Bradybaenidae are described in this report. Mitogenomic sequences were generated from Illumina paired-end sequencing. The complete mitogenome of Aegista diversifamilia was 14,039 bp in length and the nearly complete mitogenome of Dolicheulota formosensis was 14,237 bp. Both mitogenomes consisted of 13 protein-coding genes (PCGs), 2 ribosomal RNA genes, and 22 transfer RNA genes. Most genes overlapped with neighboring genes, with overlapping regions ranging from 2 to 64 bp in A. diversifamilia and from 1 to 45 bp in D. formosensis. A novel gene arrangement, tRNA-Tyr-ND3-tRNA-Trp, was identified in A. diversifamilia, whereas D. formosensis showed a gene order identical to other Bradybaenidae mitogenomes. The maximum likelihood phylogenetic tree suggested Aegista as a sister clade to Euhadra and Dolicheulota. Bradybaenidae is a monophyletic sister clade to Camaenidae.

  3. Complete nuclear ribosomal DNA sequence amplification and molecular analyses of Bangia (Bangiales, Rhodophyta) from China

    NASA Astrophysics Data System (ADS)

    Xu, Jiajie; Jiang, Bo; Chai, Sanming; He, Yuan; Zhu, Jianyi; Shen, Zonggen; Shen, Songdong

    2016-09-01

    Filamentous Bangia, which are distributed extensively throughout the world, have simple and similar morphological characteristics. Scientists can classify these organisms using molecular markers in combination with morphology. We successfully sequenced the complete nuclear ribosomal DNA, approximately 13 kb in length, from a marine Bangia population. We further analyzed the small subunit ribosomal DNA gene (nrSSU) and the internal transcribed spacer (ITS) sequence regions along with nine other marine and two freshwater Bangia samples from China. Pairwise distances of the nrSSU and 5.8S ribosomal DNA gene sequences show the marine samples grouping together with low divergences (0-0.003; 0-0.006, respectively) from each other, but high divergences (0.123-0.126; 0.198, respectively) from freshwater samples. An exception is the marine sample collected from Weihai, which shows high divergence from both other marine samples (0.063-0.065; 0.129, respectively) and the freshwater samples (0.097; 0.120, respectively). A maximum likelihood phylogenetic tree based on a combined SSU-ITS dataset shows the samples divided into three clades, with the two marine sample clades containing Bangia spp. from North America, Europe, Asia, and Australia; and one freshwater clade, containing Bangia atropurpurea from North America and China.

  4. Structural analysis of the α subunit of Na(+)/K(+) ATPase genes in invertebrates.

    PubMed

    Thabet, Rahma; Rouault, J-D; Ayadi, Habib; Leignel, Vincent

    2016-01-01

    The Na(+)/K(+) ATPase is a ubiquitous pump coordinating the transport of Na(+) and K(+) across the membrane of cells and its role is fundamental to cellular functions. It is a heteromer in eukaryotes, comprising two or three subunits (α, β and γ, the last being specific to vertebrates). The catalytic functions of the enzyme have been attributed to the α subunit. Several complete α protein sequences are available, but only a few gene structures have been characterized. We identified the genomic sequences coding the α-subunit of the Na(+)/K(+) ATPase, from the whole-genome shotgun contigs (WGS), NCBI Genomes (chromosome), Genomic Survey Sequences (GSS) and High Throughput Genomic Sequences (HTGS) databases across distinct phyla. One copy of the α subunit gene was found in Annelida, Arthropoda, Cnidaria, Echinodermata, Hemichordata, Mollusca, Placozoa, Porifera, Platyhelminthes and Urochordata, but the nematodes seem to possess 2 to 4 copies. The number of introns varied from 0 (Platyhelminthes) to 26 (Porifera), and their localization and length are also highly variable. Molecular phylogenies (Maximum Likelihood and Maximum Parsimony methods) showed some clusters constituted by (Chordata/(Echinodermata/Hemichordata)) or (Platyhelminthes/(Annelida/Mollusca)) and a basal position for Porifera. These structural analyses increase our knowledge about the evolutionary events of the α subunit genes in the invertebrates. Copyright © 2016 Elsevier Inc. All rights reserved.

  5. Characterization of petunia flower mottle virus (PetFMV), a new potyvirus infecting Petunia x hybrida.

    PubMed

    Feldhoff, A; Wetzel, T; Peters, D; Kellner, R; Krczal, G

    1998-01-01

    With the introduction of cutting-grown Petunia x hybrida plants on the European market, a new potyvirus which showed no serological reaction with antisera against any other potyviruses infecting petunias was discovered. Infected leaves contained flexuous rod-shaped virus particles of 750-800 nm in length and inclusion bodies (pinwheel structures) typical for potyviruses in ultrathin leaf sections. The purified coat protein with a Mr of approximately 36 kDa could be detected in Western immunoblots with a specific antibody to the coat protein of the petunia-infecting virus. The 3' end of the viral genome encompassing the 3' non-coding region, the coat protein gene, and part of the NIb gene was amplified from infected leaf material by IC/PCR using degenerate and specific primers. Sequences of PCR-generated cDNA clones were compared to other known sequences of potyviruses. Maximum homology of 56% was found in the 3' non-coding region between the petunia isolate and other potyviruses. A maximum homology of 69% was found between the amino acid sequence of the coat protein of the petunia isolate and corresponding sequences of other potyviruses. These data indicate that the petunia-infecting virus is a previously undescribed potyvirus and the name petunia flower mottle virus (PetFMV) is suggested.

  6. The complete mitochondrial genome of dhole Cuon alpinus: phylogenetic analysis and dating evolutionary divergence within Canidae.

    PubMed

    Zhang, Honghai; Chen, Lei

    2011-03-01

    The dhole (Cuon alpinus) is the only extant species in the genus Cuon (Carnivora: Canidae). In the present study, the complete mitochondrial genome of the dhole was sequenced. The total length is 16672 base pairs, which is the shortest in Canidae. Sequence analysis revealed that most mitochondrial genomic functional regions were highly consistent among canid animals except the CSB domain of the control region. The difference in length among the Canidae mitochondrial genome sequences is mainly due to the number of short tandemly repeated segments in the CSB domain. Phylogenetic analysis was performed on the concatenated data set of 14 mitochondrial genes of 8 canid animals by using maximum parsimony (MP), maximum likelihood (ML) and Bayesian inference (BI) methods. The genera Vulpes and Nyctereutes formed a sister group and split first within Canidae, followed by Cuon. The divergence in the genus Canis was the latest. The divergence of domestic dogs after that of Canis lupus laniger is completely supported by all three topologies. Pairwise sequence divergence data of different mitochondrial genes among canid animals were also determined. Except for the synonymous substitutions in protein-coding genes, the control region exhibits the highest sequence divergences. The synonymous rates are approximately two to six times higher than those of the non-synonymous sites except for a slightly higher rate in the non-synonymous substitution between Cuon alpinus and Vulpes vulpes. 16S rRNA genes have a slightly faster sequence divergence than 12S rRNA and tRNA genes. Based on nucleotide substitutions of tRNA genes and rRNA genes, the times since divergence between dhole and other canid animals, and between domestic dogs and three subspecies of wolves, were evaluated. The result indicates that Vulpes and Nyctereutes have a close phylogenetic relationship and the divergence of Nyctereutes is a little earlier. The Tibetan wolf may be an archaic pedigree within wolf subspecies. The genetic distance between wolves and domestic dogs is less than that among different subspecies of wolves. The domestication of dogs was about 1.56-1.92 million years ago or even earlier.

  7. Host switch during evolution of a genetically distinct hantavirus in the American shrew mole (Neurotrichus gibbsii)

    PubMed Central

    Kang, Hae Ji; Bennett, Shannon N.; Dizney, Laurie; Sumibcay, Laarni; Arai, Satoru; Ruedas, Luis A.; Song, Jin-Won; Yanagihara, Richard

    2009-01-01

    A genetically distinct hantavirus, designated Oxbow virus (OXBV), was detected in tissues of an American shrew mole (Neurotrichus gibbsii), captured in Gresham, Oregon, in September 2003. Pairwise analysis of full-length S- and M- and partial L-segment nucleotide and amino acid sequences of OXBV indicated low sequence similarity with rodent-borne hantaviruses. Phylogenetic analyses using maximum-likelihood and Bayesian methods, and host-parasite evolutionary comparisons, showed that OXBV and Asama virus, a hantavirus recently identified from the Japanese shrew mole (Urotrichus talpoides), were related to soricine shrew-borne hantaviruses from North America and Eurasia, respectively, suggesting parallel evolution associated with cross-species transmission. PMID:19394994

  8. Amazonian waters harbour an ancient freshwater Ceratomyxa lineage (Cnidaria: Myxosporea).

    PubMed

    Zatti, Suellen A; Atkinson, Stephen D; Bartholomew, Jerri L; Maia, Antônio A M; Adriano, Edson A

    2017-05-01

    A new species of Ceratomyxa parasitizing the gall bladder of Cichla monoculus, an endemic cichlid fish from the Amazon basin in Brazil, is described using morphological and molecular data. In the bile, both immature and mature myxospores were found floating freely or inside elongated plasmodia: length 304 (196-402) μm and width 35.7 (18.3-55.1) μm. Mature spores were elongated and only slightly crescent-shaped in frontal view with a prominent sutural line between two valve cells, which had rounded ends. Measurements of formalin-fixed myxospores: length 6.3±0.6 (5.1-7.5) μm, thickness 41.2±2.9 (37.1-47.6) μm, posterior angle 147°. Lateral projections slightly asymmetric, with lengths 19.3±1.4μm and 20.5±1.3μm. Two ovoid, equal size polar capsules, length 2.6±0.3 (2-3.3) μm, width 2.5±0.4 (1.8-3.7) μm, located adjacent to the suture and containing polar filaments with 3-4 turns. The small subunit ribosomal DNA sequence of 1605 nt was no more than 97% similar to any other sequence in GenBank, and together with the host, locality and morphometric data, supports diagnosis of the parasite as a new species, Ceratomyxa brasiliensis n. sp. Maximum parsimony and maximum likelihood analyses showed that C. brasiliensis n. sp. clustered within the marine Ceratomyxa clade, but was in a basally divergent lineage with two other freshwater species from the Amazon basin. Our results are consistent with previous studies that show Ceratomyxa species can cluster according to both geography and host ecotype, and that the few known freshwater species diverged from marine cousins relatively early in evolution of the genus, possibly driven by marine incursions into riverine environments. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. Mitogenomic analysis of the genus Panthera.

    PubMed

    Wei, Lei; Wu, Xiaobing; Zhu, Lixin; Jiang, Zhigang

    2011-10-01

    The complete sequences of the mitochondrial DNA genomes of Panthera tigris, Panthera pardus, and Panthera uncia were determined using the polymerase chain reaction method. The lengths of the complete mitochondrial DNA sequences of the three species were 16990, 16964, and 16773 bp, respectively. Each of the three mitochondrial DNA genomes included 13 protein-coding genes, 22 tRNA, two rRNA, one O(L)R, and one control region. The structures of the genomes were highly similar to those of Felis catus, Acinonyx jubatus, and Neofelis nebulosa. The phylogenies of the genus Panthera were inferred from two combined mitochondrial sequence data sets and the complete mitochondrial genome sequences, by MP (maximum parsimony), ML (maximum likelihood), and Bayesian analysis. The results showed that Panthera was composed of Panthera leo, P. uncia, P. pardus, Panthera onca, P. tigris, and N. nebulosa, which was included as the most basal member. The phylogeny within the genus Panthera was N. nebulosa (P. tigris (P. onca (P. pardus, (P. leo, P. uncia)))). The divergence times for the genus Panthera were estimated based on the ML branch lengths and four well-established calibration points. The results showed that at about 11.3 MYA, the genus Panthera separated from other felid species and then evolved into the several species of the genus. In detail, N. nebulosa was estimated to be founded about 8.66 MYA, P. tigris about 6.55 MYA, P. uncia about 4.63 MYA, and P. pardus about 4.35 MYA. All these estimated times were older than those estimated from the fossil records. The divergence event, evolutionary process, speciation, and distribution pattern of P. uncia, a species endemic to central Asia with core habitats on the Qinghai-Tibetan Plateau and surrounding highlands, mostly correlated with the geological tectonic events and intensive climate shifts that happened at 8, 3.6, 2.5, and 1.7 MYA on the plateau during the late Cenozoic period.

  10. The effects of rest interval length manipulation of the first upper-body resistance exercise in sequence on acute performance of subsequent exercises in men and women.

    PubMed

    Ratamess, Nicholas A; Chiarello, Christina M; Sacco, Anthony J; Hoffman, Jay R; Faigenbaum, Avery D; Ross, Ryan E; Kang, Jie

    2012-11-01

    The purpose of the present study was to investigate the effects of manipulating rest interval (RI) length of the first upper-body exercise in sequence on subsequent resistance exercise performance. Twenty-two men and women with at least 1 year of resistance training experience performed resistance exercise protocols on 3 occasions in random order. Each protocol consisted of performing 4 barbell upper-body exercises in the same sequence (bench press, incline bench press, shoulder press, and bent-over row) for 3 sets of up to 10 repetitions with 75% of 1 repetition maximum. Bench press RIs were 1, 2, or 3 minutes, whereas other exercises were performed with a standard 2-minute rest interval. The number of repetitions completed, average power, and velocity for each set of each exercise were recorded. Gender differences were observed during the bench press and incline press as women performed significantly (p ≤ 0.05) more repetitions than men during all RIs. The magnitude of decline in velocity and power over 3 sets of the bench press and incline press was significantly higher in men than women. Manipulation of RI length during the bench press did not affect performance of the remaining exercises in men. However, significantly more repetitions were performed by women during the first set of the incline press using 3-minute rest interval than 1-minute rest interval. In men and women, performance of the incline press and shoulder press was compromised compared with baseline performances. Manipulation of RI length of the first exercise affected performance of only the first set of 1 subsequent exercise in women. All RIs led to comparable levels of fatigue in men, indicating that reductions in load are necessary for subsequent exercises performed in sequence that stress similar agonist muscle groups when 10 repetitions are desired.

  11. Tsunami focusing and leading wave height

    NASA Astrophysics Data System (ADS)

    Kanoglu, Utku

    2016-04-01

    Field observations from tsunami events show that the maximum tsunami amplitude sometimes does not occur in the first wave; for example, the maximum wave from the 2011 Japan tsunami reached Papeete, Tahiti as the fourth wave, 72 min after the first wave. This might mislead local authorities and give a false sense of security to the public. Recently, Okal and Synolakis (2016, Geophys. J. Int. 204, 719-735) discussed "the factors contributing to the sequencing of tsunami waves in the far field." They consider two different generation mechanisms through an axially symmetric source (a circular plug): one, Le Mehaute and Wang's (1995, World Scientific, 367 pp.) formalism, where irrotational wave propagation is formulated in the framework of investigating tsunamis generated by underwater explosions; and two, Hammack's formulation (1972, Ph.D. Dissertation, Calif. Inst. Tech., 261 pp., Pasadena), which introduces deformation at the ocean bottom and does not represent an immediate deformation of the ocean surface, i.e. time-dependent ocean surface deformation. They identify the critical distance for transition from the first wave being largest to the second wave being largest. To verify sequencing for a finite length source, Okal and Synolakis (2016) then used NOAA's validated and verified real-time forecasting numerical model MOST (Titov and Synolakis, 1998, J. Waterw. Port Coast. Ocean Eng., 124, 157-171) through Synolakis et al. (2008, Pure Appl. Geophys. 165, 2197-2228). As a reference, they used the parameters of the 1 April 2014 Iquique, Chile earthquake over real bathymetry, variants of this source (small, big, wide, thin, and long) over a flat bathymetry, and the 2010 Chile and 2011 Japan tsunamis over both real and flat bathymetries to explore the influence of the fault parameters on sequencing. They found that sequencing is influenced more by the source width than by the length.
    We extend Okal and Synolakis (2016)'s analysis to an initial N-wave form (Tadepalli and Synolakis, 1994, Proc. R. Soc. A: Math. Phys. Eng. Sci., 445, 99-112) with a finite crest length, which is the most common tsunami initial waveform. We fit the earthquake initial waveform calculated through Okada (1985, Bull. Seismol. Soc. Am. 75, 1135-1154) to the N-wave form presented by Tadepalli and Synolakis (1994). First, we investigate focusing phenomena as presented by Kanoglu et al. (2013, Proc. R. Soc. A: Math. Phys. Eng. Sci., 469, 20130015) and compare our results with their non-dispersive and dispersive linear analytical solutions. We confirm focusing phenomena, which amplify the wave height on the leading depression side. We then study sequencing of an N-wave profile with a finite crest length. Our preliminary results show that sequencing is more pronounced on the leading depression side. We perform a parametric study to understand sequencing in terms of N-wave, hence earthquake, parameters. We then discuss the results both in terms of tsunami focusing and leading wave amplitude. Acknowledgment: The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement no 603839 (Project ASTARTE - Assessment, Strategy and Risk Reduction for Tsunamis in Europe).

  12. The complete chloroplast genome of Gentiana straminea (Gentianaceae), an endemic species to the Sino-Himalayan subregion.

    PubMed

    Ni, Lianghong; Zhao, Zhili; Xu, Hongxi; Chen, Shilin; Dorje, Gaawe

    2016-02-15

    Endemic to the Sino-Himalayan subregion, the medicinal alpine plant Gentiana straminea is a threatened species. The genetic and molecular data about it is deficient. Here we report the complete chloroplast (cp) genome sequence of G. straminea, as the first sequenced member of the family Gentianaceae. The cp genome is 148,991bp in length, including a large single copy (LSC) region of 81,240bp, a small single copy (SSC) region of 17,085bp and a pair of inverted repeats (IRs) of 25,333bp. It contains 112 unique genes, including 78 protein-coding genes, 30 tRNAs and 4 rRNAs. The rps16 gene lacks exon2 between trnK-UUU and trnQ-UUG, which is the first rps16 pseudogene found in the nonparasitic plants of Asterids clade. Sequence analysis revealed the presence of 13 forward repeats, 13 palindrome repeats and 39 simple sequence repeats (SSRs). An entire cp genome comparison study of G. straminea and four other species in Gentianales was carried out. Phylogenetic analyses using maximum likelihood (ML) and maximum parsimony (MP) were performed based on 69 protein-coding genes from 36 species of Asterids. The results strongly supported the position of Gentianaceae as one member of the order Gentianales. The complete chloroplast genome sequence will provide intragenic information for its conservation and contribute to research on the genetic and phylogenetic analyses of Gentianales and Asterids. Copyright © 2015 Elsevier B.V. All rights reserved.
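
    The region sizes reported above can be cross-checked with simple arithmetic. The sketch below assumes the usual quadripartite chloroplast layout (one LSC, one SSC and two IR copies) and that the quoted IR length is per copy; the function name is illustrative and not from the paper.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total genome length for one LSC, one SSC and two inverted-repeat copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as quoted in the abstract for G. straminea.
total = quadripartite_length(lsc=81_240, ssc=17_085, ir=25_333)
print(total)
```

    Under these assumptions the sum can be compared directly with the reported genome length of 148,991 bp.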

  13. Streaming current magnetic fields in a charged nanopore

    PubMed Central

    Mansouri, Abraham; Taheri, Peyman; Kostiuk, Larry W.

    2016-01-01

    Magnetic fields induced by currents created in pressure-driven flows inside a solid-state charged nanopore were modeled by numerically solving a system of steady-state continuum partial differential equations, i.e., Poisson, Nernst-Planck, Ampere and Navier-Stokes equations (PNPANS). This analysis was based on non-dimensional transport governing equations that were scaled using the Debye length as the characteristic length scale, and applied to a finite-length cylindrical nano-channel. The comparison of numerical and analytical studies shows excellent agreement and verifies the magnetic field density both inside and outside the nanopore. The radially non-uniform currents resulted in highly non-uniform magnetic fields within the nanopore that decay as 1/r outside the nanopore. It is worth noting that for either streaming currents or streaming potential cases, the maximum magnetic field occurred inside the pore in the vicinity of the nanopore wall, as opposed to a cylindrical conductor that carries a steady electric current, where the maximum magnetic field occurs at the perimeter of the conductor. Based on these results, it is suggested and envisaged that non-invasive external magnetic field readouts generated by streaming/ionic currents may be viewed as secondary electronic signatures of biomolecules to complement and enhance current DNA nanopore sequencing techniques. PMID:27833119

  14. Complete chloroplast genome of Prunus yedoensis Matsum.(Rosaceae), wild and endemic flowering cherry on Jeju Island, Korea.

    PubMed

    Cho, Myong-Suk; Hyun Cho, Chung; Yeon Kim, Su; Su Yoon, Hwan; Kim, Seung-Chul

    2016-09-01

    The complete chloroplast genome sequence of the wild flowering cherry, Prunus yedoensis Matsum., which is native and endemic to Jeju Island, Korea, is reported in this study. The genome is 157 786 bp in length with 36.7% GC content, and is composed of an LSC region of 85 908 bp, an SSC region of 19 120 bp and two IR copies of 26 379 bp each. The cp genome contains 131 genes, including 86 coding genes, 8 rRNA genes and 37 tRNA genes. A maximum likelihood analysis was conducted to verify the phylogenetic position of the newly sequenced cp genome of P. yedoensis using 11 representatives of complete cp genome sequences within the family Rosaceae. The genus Prunus exhibited monophyly and the resulting phylogenetic relationship agreed with previous phylogenetic analyses within Rosaceae.

  15. High-Throughput Sequencing of 16S rRNA Gene Amplicons: Effects of Extraction Procedure, Primer Length and Annealing Temperature

    PubMed Central

    Sergeant, Martin J.; Constantinidou, Chrystala; Cogan, Tristan; Penn, Charles W.; Pallen, Mark J.

    2012-01-01

    The analysis of 16S-rDNA sequences to assess the bacterial community composition of a sample is a widely used technique that has increased with the advent of high throughput sequencing. Although considerable effort has been devoted to identifying the most informative region of the 16S gene and the optimal informatics procedures to process the data, little attention has been paid to the PCR step, in particular annealing temperature and primer length. To address this, amplicons derived from 16S-rDNA were generated from chicken caecal content DNA using different annealing temperatures, primers and different DNA extraction procedures. The amplicons were pyrosequenced to determine the optimal protocols for capture of maximum bacterial diversity from a chicken caecal sample. Even at very low annealing temperatures there was little effect on the community structure, although the abundance of some OTUs such as Bifidobacterium increased. Using shorter primers did not reveal any novel OTUs but did change the community profile obtained. Mechanical disruption of the sample by bead beating had a significant effect on the results obtained, as did repeated freezing and thawing. In conclusion, existing primers and standard annealing temperatures captured as much diversity as lower annealing temperatures and shorter primers. PMID:22666455

  16. High-throughput sequencing of 16S rRNA gene amplicons: effects of extraction procedure, primer length and annealing temperature.

    PubMed

    Sergeant, Martin J; Constantinidou, Chrystala; Cogan, Tristan; Penn, Charles W; Pallen, Mark J

    2012-01-01

    The analysis of 16S-rDNA sequences to assess the bacterial community composition of a sample is a widely used technique that has increased with the advent of high throughput sequencing. Although considerable effort has been devoted to identifying the most informative region of the 16S gene and the optimal informatics procedures to process the data, little attention has been paid to the PCR step, in particular annealing temperature and primer length. To address this, amplicons derived from 16S-rDNA were generated from chicken caecal content DNA using different annealing temperatures, primers and different DNA extraction procedures. The amplicons were pyrosequenced to determine the optimal protocols for capture of maximum bacterial diversity from a chicken caecal sample. Even at very low annealing temperatures there was little effect on the community structure, although the abundance of some OTUs such as Bifidobacterium increased. Using shorter primers did not reveal any novel OTUs but did change the community profile obtained. Mechanical disruption of the sample by bead beating had a significant effect on the results obtained, as did repeated freezing and thawing. In conclusion, existing primers and standard annealing temperatures captured as much diversity as lower annealing temperatures and shorter primers.

  17. 33 CFR 401.4 - Maximum length and weight.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 33 CFR, Navigation and Navigable Waters (2010-07-01): Saint Lawrence Seaway Development Corporation, Department of Transportation; Seaway Regulations and Rules; Regulations; Condition of Vessels; § 401.4 Maximum length and weight ...

  18. Design of a fast echo matching algorithm to reduce crosstalk with Doppler shifts in ultrasonic ranging

    NASA Astrophysics Data System (ADS)

    Liu, Lei; Guo, Rui; Wu, Jun-an

    2017-02-01

    Crosstalk is a main factor for wrong distance measurement by ultrasonic sensors, and this problem becomes more difficult to deal with under Doppler effects. In this paper, crosstalk reduction with Doppler shifts on small platforms is focused on, and a fast echo matching algorithm (FEMA) is proposed on the basis of chaotic sequences and pulse coding technology, then verified through applying it to match practical echoes. Finally, we introduce how to select both better mapping methods for chaotic sequences, and algorithm parameters for higher achievable maximum of cross-correlation peaks. The results indicate the following: logistic mapping is preferred to generate good chaotic sequences, with high autocorrelation even when the length is very limited; FEMA can not only match echoes and calculate distance accurately with an error degree mostly below 5%, but also generates nearly the same calculation cost level for static or kinematic ranging, much lower than that by direct Doppler compensation (DDC) with the same frequency compensation step; The sensitivity to threshold value selection and performance of FEMA depend significantly on the achievable maximum of cross-correlation peaks, and a higher peak is preferred, which can be considered as a criterion for algorithm parameter optimization under practical conditions.
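
    As a rough illustration of the kind of code this abstract refers to, the sketch below generates a bipolar chaotic sequence from the logistic map and evaluates its aperiodic correlation. It is not the authors' FEMA: the map parameter r = 4, the 0.5 threshold, the burn-in length, the seeds and the function names are all assumptions chosen for the example.

```python
def logistic_bits(x0: float, n: int, r: float = 4.0, burn_in: int = 100) -> list[int]:
    """Return n values in {-1, +1} obtained by thresholding logistic-map iterates."""
    x = x0
    for _ in range(burn_in):          # discard the initial transient
        x = r * x * (1.0 - x)
    bits = []
    for _ in range(n):
        x = r * x * (1.0 - x)         # logistic map: x <- r * x * (1 - x)
        bits.append(1 if x >= 0.5 else -1)
    return bits


def correlate(a: list[int], b: list[int], lag: int) -> int:
    """Aperiodic correlation: sum of a[i] * b[i + lag] over the overlapping samples."""
    return sum(a[i] * b[i + lag] for i in range(len(a) - lag))


code_a = logistic_bits(0.3, 32)       # hypothetical code for sensor A
code_b = logistic_bits(0.7001, 32)    # hypothetical code for sensor B

auto_peak = correlate(code_a, code_a, 0)
auto_sidelobe = max(abs(correlate(code_a, code_a, k)) for k in range(1, 32))
cross_peak = max(abs(correlate(code_a, code_b, k)) for k in range(32))
print(auto_peak, auto_sidelobe, cross_peak)
```

    A matching scheme of this kind would look for a zero-lag autocorrelation peak that stands well above both the autocorrelation sidelobes and the cross-correlation with other sensors' codes.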

  19. Complete Mitochondrial Genome of Echinostoma hortense (Digenea: Echinostomatidae).

    PubMed

    Liu, Ze-Xuan; Zhang, Yan; Liu, Yu-Ting; Chang, Qiao-Cheng; Su, Xin; Fu, Xue; Yue, Dong-Mei; Gao, Yuan; Wang, Chun-Ren

    2016-04-01

    Echinostoma hortense (Digenea: Echinostomatidae) is one of the intestinal flukes with medical importance in humans. However, the mitochondrial (mt) genome of this fluke has not been known yet. The present study has determined the complete mt genome sequences of E. hortense and assessed the phylogenetic relationships with other digenean species for which the complete mt genome sequences are available in GenBank using concatenated amino acid sequences inferred from 12 protein-coding genes. The mt genome of E. hortense contained 12 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes, and 1 non-coding region. The length of the mt genome of E. hortense was 14,994 bp, which was somewhat smaller than those of other trematode species. Phylogenetic analyses based on concatenated nucleotide sequence datasets for all 12 protein-coding genes using maximum parsimony (MP) method showed that E. hortense and Hypoderaeum conoideum gathered together, and they were closer to each other than to Fasciolidae and other echinostomatid trematodes. The availability of the complete mt genome sequences of E. hortense provides important genetic markers for diagnostics, population genetics, and evolutionary studies of digeneans.

  20. Complete Mitochondrial Genome of Echinostoma hortense (Digenea: Echinostomatidae)

    PubMed Central

    Liu, Ze-Xuan; Zhang, Yan; Liu, Yu-Ting; Chang, Qiao-Cheng; Su, Xin; Fu, Xue; Yue, Dong-Mei; Gao, Yuan; Wang, Chun-Ren

    2016-01-01

    Echinostoma hortense (Digenea: Echinostomatidae) is one of the intestinal flukes with medical importance in humans. However, the mitochondrial (mt) genome of this fluke has not been known yet. The present study has determined the complete mt genome sequences of E. hortense and assessed the phylogenetic relationships with other digenean species for which the complete mt genome sequences are available in GenBank using concatenated amino acid sequences inferred from 12 protein-coding genes. The mt genome of E. hortense contained 12 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes, and 1 non-coding region. The length of the mt genome of E. hortense was 14,994 bp, which was somewhat smaller than those of other trematode species. Phylogenetic analyses based on concatenated nucleotide sequence datasets for all 12 protein-coding genes using maximum parsimony (MP) method showed that E. hortense and Hypoderaeum conoideum gathered together, and they were closer to each other than to Fasciolidae and other echinostomatid trematodes. The availability of the complete mt genome sequences of E. hortense provides important genetic markers for diagnostics, population genetics, and evolutionary studies of digeneans. PMID:27180575

  1. A maximum entropy model for chromatin structure

    NASA Astrophysics Data System (ADS)

    Farre, Pau; Emberly, Eldon; Emberly Group Team

    The DNA inside the nucleus of eukaryotic cells shows a variety of conserved structures at different length scales. These structures are formed by interactions between protein complexes that bind to the DNA and regulate gene activity. Recent high-throughput sequencing techniques allow measurement of both the genome-wide contact map of the folded DNA within a cell (HiC) and where various proteins are bound to the DNA (ChIP-seq). In this talk I will present a maximum-entropy method capable of both predicting HiC contact maps from binding data, and binding data from HiC contact maps. This method results in an intuitive Ising-type model that is able to predict how altering the presence of binding factors can modify chromosome conformation, without the need for polymer simulations.

  2. A Four-Phase Modulation System for Use with an Adaptive Array.

    DTIC Science & Technology

    1982-07-01

    A Four-Phase Modulation System for Use with an Adaptive Array ... ESL 711679-5 ... UNCLASSIFIED ... interval has a duration of Tb seconds. a(t) is a pseudonoise code, i.e., a maximum length linear shift register sequence [9]. The code symbol interval
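
    The record above mentions a maximum length linear shift register sequence. A minimal sketch of how such a sequence can be generated is given below, assuming a 4-bit Fibonacci register with feedback taps at positions 4 and 3; the register size, taps, seed and function names are illustrative choices, not taken from the report.

```python
def lfsr_sequence(taps: tuple[int, ...], nbits: int, seed: int = 1) -> list[int]:
    """Return 2**nbits - 1 output bits of a Fibonacci LFSR.

    taps are 1-based positions; tap t reads state bit (nbits - t).
    """
    state = seed
    out = []
    for _ in range((1 << nbits) - 1):
        out.append(state & 1)                       # output the low bit
        fb = 0
        for t in taps:
            fb ^= (state >> (nbits - t)) & 1        # XOR of the tapped bits
        state = (state >> 1) | (fb << (nbits - 1))  # shift right, feed back on top
    return out


def circular_autocorrelation(bits: list[int]) -> list[int]:
    """Periodic autocorrelation of the +/-1 version of a binary sequence."""
    s = [1 - 2 * b for b in bits]
    n = len(s)
    return [sum(s[i] * s[(i + k) % n] for i in range(n)) for k in range(n)]


m_seq = lfsr_sequence(taps=(4, 3), nbits=4)
acf = circular_autocorrelation(m_seq)
print(m_seq)
print(acf)
```

    For a maximum length sequence of period N, the periodic autocorrelation takes the value N at zero shift and -1 at every other shift, which is the property that makes these sequences useful as pseudonoise codes.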

  3. Molecular identification based on ITS sequences for Kappaphycus and Eucheuma cultivated in China

    NASA Astrophysics Data System (ADS)

    Zhao, Sufen; He, Peimin

    2011-11-01

    The systematic classification of the Eucheumatoideae is difficult because of their variable morphology and the interpretation of reproductive structures. Kappaphycus and Eucheuma specimens cultivated on the Hainan and Fujian coasts of China were introduced from Vietnam, the Philippines and Indonesia. Combined with morphological characteristics, all Kappaphycus and Eucheuma cultivated strains were identified by internal transcribed spacer (ITS) sequences. The phylogenetic tree was constructed using neighbor-joining and maximum likelihood methods. The results indicate that different ITS sequence lengths occurred in the different genera and species. An obvious difference in morphology could be found in the protuberance shape between Kappaphycus and Eucheuma. The protuberance in Eucheuma was thorn-like and in Kappaphycus was wart-like or papillate. Their ITS sequences differed significantly, with nucleotide variation rates of 58.55%-63.90%. All nucleotide variations occurred in the ITS1 and ITS2 regions except for five nucleotide transversions in the 5.8S rDNA region. In addition, congeneric species differed in their branches. Kappaphycus sp. had branches with small buds, while K. alvarezii did not have such a feature. The nucleotide variation rates varied from 7.02% to 7.48% among species; within the same species of the clades it was <1.20%. Eucheumatoideae algae cultivated in China consisted of three clades, K. alvarezii, Kappaphycus sp., and E. denticulatum. The results indicate that ITS sequence analysis was an effective way to identify interspecies and intraspecies phylogenetic relationships and might provide a clue for molecular identification of algal Eucheumatoideae.

  4. DNA Barcode for Identifying Folium Artemisiae Argyi from Counterfeits.

    PubMed

    Mei, Quanxi; Chen, Xiaolu; Xiang, Li; Liu, Yue; Su, Yanyan; Gao, Yuqiao; Dai, Weibo; Dong, Pengpeng; Chen, Shilin

    2016-01-01

    Folium Artemisiae Argyi is an important herb in traditional Chinese medicine. It is commonly used in moxibustion, medicine, etc. However, identifying Artemisia argyi is difficult because this herb exhibits similar morphological characteristics to closely related species and counterfeits. To verify the applicability of DNA barcoding, ITS2 and psbA-trnH were used to identify A. argyi from 15 closely related species and counterfeits. Results indicated that total DNA was easily extracted from all the samples and that both ITS2 and psbA-trnH fragments can be easily amplified. ITS2 was a more ideal barcode than psbA-trnH and ITS2+psbA-trnH to identify A. argyi from closely related species and counterfeits on the basis of sequence character, genetic distance, and tree methods. The sequence length was 225 bp for the 56 ITS2 sequences of A. argyi, and no variable site was detected. For the ITS2 sequences, A. capillaris, A. anomala, A. annua, A. igniaria, A. maximowicziana, A. princeps, Dendranthema vestitum, and D. indicum had single nucleotide polymorphisms (SNPs). The intraspecific Kimura 2-Parameter distance was zero, which is lower than the minimum interspecific distance (0.005). A. argyi, the closely related species, and counterfeits, except for Artemisia maximowicziana and Artemisia sieversiana, were separated into pairs of divergent clusters by using the neighbor joining, maximum parsimony, and maximum likelihood tree methods. Thus, the ITS2 sequence was an ideal barcode to identify A. argyi from closely related species and counterfeits to ensure the safe use of this plant.
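
    The Kimura 2-parameter distance mentioned above can be sketched as follows, using the standard formula d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitional and transversional differences. The function name, the handling of gaps and the toy sequences are assumptions for illustration and are not data from the study.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned, equal-length DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue                      # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1              # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1            # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy alignment: one transition (A -> G) in ten sites.
print(k2p_distance("ACGTACGTAC", "GCGTACGTAC"))
```

    Identical sequences give a distance of zero, which corresponds to the intraspecific result reported for A. argyi.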

  5. Length-independent structural similarities enrich the antibody CDR canonical class model.

    PubMed

    Nowak, Jaroslaw; Baker, Terry; Georges, Guy; Kelm, Sebastian; Klostermann, Stefan; Shi, Jiye; Sridharan, Sudharsan; Deane, Charlotte M

    2016-01-01

    Complementarity-determining regions (CDRs) are antibody loops that make up the antigen binding site. Here, we show that all CDR types have structurally similar loops of different lengths. Based on these findings, we created length-independent canonical classes for the non-H3 CDRs. Our length variable structural clusters show strong sequence patterns suggesting either that they evolved from the same original structure or result from some form of convergence. We find that our length-independent method not only clusters a larger number of CDRs, but also predicts canonical class from sequence better than the standard length-dependent approach. To demonstrate the usefulness of our findings, we predicted cluster membership of CDR-L3 sequences from 3 next-generation sequencing datasets of the antibody repertoire (over 1,000,000 sequences). Using the length-independent clusters, we can structurally classify an additional 135,000 sequences, which represents a ∼20% improvement over the standard approach. This suggests that our length-independent canonical classes might be a highly prevalent feature of antibody space, and could substantially improve our ability to accurately predict the structure of novel CDRs identified by next-generation sequencing.

  6. Differentiation of Trypanosoma cruzi I subgroups through characterization of cytochrome b gene sequences.

    PubMed

    Spotorno O, Angel E; Córdova, Luis; Solari I, Aldo

    2008-12-01

    To identify and characterize Chilean samples of Trypanosoma cruzi and their association with hosts, the first 516 bp of the mitochondrial cytochrome b gene were sequenced from eight biological samples and phylogenetically compared with 20 other known American sequences. The molecular characterization of these 28 sequences in a maximum likelihood phylogram (-lnL = 1255.12, tree length = 180, consistency index = 0.79) allowed the robust identification (bootstrap % > 99) of three previously known discrete typing units (DTU): DTU IIb, IIa, and I. An apparently undescribed new sequence found in four new Chilean samples was detected and designated as DTU Ib; they were separated by 24.7 differences, but robustly related (bootstrap % = 97 in 500 replicates) to those of DTU I by sharing 12 substitutions, among which four were nonsynonymous ones. This new DTU Ib was also robust (bootstrap % = 100), and characterized by 10 unambiguous substitutions, with a single nonsynonymous G to T change at site 409. The fact that two of these new sequences were found in parasites from a Chilean endemic caviomorph rodent, Octodon degus, and that they were closely related to the ancient DTU I suggested old origins and a long association with caviomorph hosts.

  7. Authentication of an endangered herb Changium smyrnioides from different producing areas based on rDNA ITS sequences and allele-specific PCR.

    PubMed

    Sun, Xiaoqin; Wei, Yanglian; Qin, Minjian; Guo, Qiaosheng; Guo, Jianlin; Zhou, Yifeng; Hang, Yueyu

    2012-03-01

    The rDNA ITS region of 18 samples of Changium smyrnioides from 7 areas and of 2 samples of Chuanminshen violaceum were sequenced and analyzed. The amplified ITS region of the samples, including a partial sequence of ITS1 and complete sequences of 5.8S and ITS2, had a total length of 555 bp. After complete alignment, there were 49 variable sites, of which 45 were informative, when gaps were treated as missing data. Samples of C. smyrnioides from different locations could be identified exactly based on the variable sites. The maximum parsimony (MP) and neighbor joining (NJ) trees constructed from the ITS sequences based on Kimura's two-parameter model showed that the genetic distances of the C. smyrnioides samples from different locations were not always related to their geographical distances. A specific primer set for allele-specific PCR authentication of C. violaceum from Jurong of Jiangsu was designed based on the SNP in the ITS sequence alignment. C. violaceum from the major genuine producing area in Jurong of Jiangsu could be identified exactly and quickly by allele-specific PCR.

  8. An investigation of the uniform random number generator

    NASA Technical Reports Server (NTRS)

    Temple, E. C.

    1982-01-01

    Most random number generators in use today are of the congruential form X(i+1) = (A·X(i) + C) mod M, where A, C, and M are nonnegative integers. If C = 0, the generator is called the multiplicative type; those for which C ≠ 0 are called mixed congruential generators. It is easy to see that congruential generators will repeat a sequence of numbers after a maximum of M values have been generated. The number of values that a procedure generates before restarting the sequence is called the length or the period of the generator. Generally, it is desirable to make the period as long as possible. A detailed discussion of congruential generators is given. Also, several promising procedures that differ from the multiplicative and mixed procedures are discussed.
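
    The recurrence and the notion of period described in this abstract can be illustrated with a minimal Python sketch. The function names and the small parameter values (A = 5, M = 16) are illustrative assumptions chosen so the cycles are easy to check by hand; they are not taken from the report.

```python
from itertools import islice


def lcg(a, c, m, seed):
    """Yield successive values of X(i+1) = (a*X(i) + c) mod m."""
    x = seed % m
    while True:
        x = (a * x + c) % m
        yield x


def period(a, c, m, seed):
    """Return the length of the cycle the generator eventually enters."""
    first_seen = {}          # state -> step at which it first appeared
    x = seed % m
    step = 0
    while x not in first_seen:
        first_seen[x] = step
        x = (a * x + c) % m
        step += 1
    return step - first_seen[x]


# Hypothetical toy parameters, not values from the report.
first_values = list(islice(lcg(5, 3, 16, 0), 3))
mixed_period = period(a=5, c=3, m=16, seed=0)            # C != 0: mixed type
multiplicative_period = period(a=5, c=0, m=16, seed=1)   # C == 0: multiplicative type

print(first_values, mixed_period, multiplicative_period)
```

    With these toy values the mixed generator visits all M = 16 states before repeating, while the multiplicative one cycles after 4 values, which shows why the choice of A, C, and M determines how close the period comes to its upper bound M.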

  9. Using nearly full-genome HIV sequence data improves phylogeny reconstruction in a simulated epidemic

    PubMed Central

    Yebra, Gonzalo; Hodcroft, Emma B.; Ragonnet-Cronin, Manon L.; Pillay, Deenan; Brown, Andrew J. Leigh; Fraser, Christophe; Kellam, Paul; de Oliveira, Tulio; Dennis, Ann; Hoppe, Anne; Kityo, Cissy; Frampton, Dan; Ssemwanga, Deogratius; Tanser, Frank; Keshani, Jagoda; Lingappa, Jairam; Herbeck, Joshua; Wawer, Maria; Essex, Max; Cohen, Myron S.; Paton, Nicholas; Ratmann, Oliver; Kaleebu, Pontiano; Hayes, Richard; Fidler, Sarah; Quinn, Thomas; Novitsky, Vladimir; Haywards, Andrew; Nastouli, Eleni; Morris, Steven; Clark, Duncan; Kozlakidis, Zisis

    2016-01-01

    HIV molecular epidemiology studies analyse viral pol gene sequences due to their availability, but whole genome sequencing allows other genes to be used. We aimed to determine what gene(s) provide(s) the best approximation to the real phylogeny by analysing a simulated epidemic (created as part of the PANGEA_HIV project) with a known transmission tree. We sub-sampled a simulated dataset of 4662 sequences into different combinations of genes (gag-pol-env, gag-pol, gag, pol, env and partial pol) and sampling depths (100%, 60%, 20% and 5%), generating 100 replicates for each case. We built maximum-likelihood trees for each combination using RAxML (GTR + Γ), and compared their topologies to the corresponding true tree’s using CompareTree. The accuracy of the trees was significantly proportional to the length of the sequences used, with the gag-pol-env datasets showing the best performance and gag and partial pol sequences showing the worst. The lowest sampling depths (20% and 5%) greatly reduced the accuracy of tree reconstruction and showed high variability among replicates, especially when using the shortest gene datasets. In conclusion, using longer sequences derived from nearly whole genomes will improve the reliability of phylogenetic reconstruction. With low sample coverage, results can be highly variable, particularly when based on short sequences. PMID:28008945

  10. Using nearly full-genome HIV sequence data improves phylogeny reconstruction in a simulated epidemic.

    PubMed

    Yebra, Gonzalo; Hodcroft, Emma B; Ragonnet-Cronin, Manon L; Pillay, Deenan; Brown, Andrew J Leigh

    2016-12-23

    HIV molecular epidemiology studies analyse viral pol gene sequences due to their availability, but whole genome sequencing allows other genes to be used. We aimed to determine what gene(s) provide(s) the best approximation to the real phylogeny by analysing a simulated epidemic (created as part of the PANGEA_HIV project) with a known transmission tree. We sub-sampled a simulated dataset of 4662 sequences into different combinations of genes (gag-pol-env, gag-pol, gag, pol, env and partial pol) and sampling depths (100%, 60%, 20% and 5%), generating 100 replicates for each case. We built maximum-likelihood trees for each combination using RAxML (GTR + Γ), and compared their topologies to the corresponding true tree's using CompareTree. The accuracy of the trees was significantly proportional to the length of the sequences used, with the gag-pol-env datasets showing the best performance and gag and partial pol sequences showing the worst. The lowest sampling depths (20% and 5%) greatly reduced the accuracy of tree reconstruction and showed high variability among replicates, especially when using the shortest gene datasets. In conclusion, using longer sequences derived from nearly whole genomes will improve the reliability of phylogenetic reconstruction. With low sample coverage, results can be highly variable, particularly when based on short sequences.

  11. Medium- and Long-term Prediction of LOD Change by the Leap-step Autoregressive Model

    NASA Astrophysics Data System (ADS)

    Wang, Qijie

    2015-08-01

    The accuracy of medium- and long-term prediction of length of day (LOD) change based on the combined least-squares and autoregressive (LS+AR) model deteriorates gradually. The leap-step autoregressive (LSAR) model can significantly reduce the edge effect of the observation sequence. In particular, the LSAR model greatly improves the resolution of the signals' low-frequency components. Therefore, it can improve the efficiency of prediction. In this work, LSAR is used to forecast the LOD change. The LOD series from EOP 08 C04 provided by IERS is modeled by both the LSAR and AR models. The results of the two models are analyzed and compared. When the prediction length is between 10 and 30 days, the accuracy improvement is less than 10%. When the prediction length exceeds 30 days, the accuracy improves markedly, with the maximum improvement being around 19%. The results show that the LSAR model has higher prediction accuracy and stability in medium- and long-term prediction.

  12. Cross-Border Sexual Transmission of the Newly Emerging HIV-1 Clade CRF51_01B

    PubMed Central

    Cheong, Hui Ting; Ng, Kim Tien; Ong, Lai Yee; Chook, Jack Bee; Chan, Kok Gan; Takebe, Yutaka; Kamarulzaman, Adeeba; Tee, Kok Keng

    2014-01-01

    A novel HIV-1 recombinant clade (CRF51_01B) was recently identified among men who have sex with men (MSM) in Singapore. As cases of sexually transmitted HIV-1 infection increase concurrently in two socioeconomically intimate countries such as Malaysia and Singapore, cross transmission of HIV-1 between said countries is highly probable. In order to investigate the timeline for the emergence of HIV-1 CRF51_01B in Singapore and its possible introduction into Malaysia, 595 HIV-positive subjects recruited in Kuala Lumpur from 2008 to 2012 were screened. Phylogenetic relationship of 485 amplified polymerase gene sequences was determined through neighbour-joining method. Next, near-full length sequences were amplified for genomic sequences inferred to be CRF51_01B and subjected to further analysis implemented through Bayesian Markov chain Monte Carlo (MCMC) sampling and maximum likelihood methods. Based on the near full length genomes, two isolates formed a phylogenetic cluster with CRF51_01B sequences of Singapore origin, sharing identical recombination structure. Spatial and temporal information from Bayesian MCMC coalescent and maximum likelihood analysis of the protease, gp120 and gp41 genes suggest that Singapore is probably the country of origin of CRF51_01B (as early as in the mid-1990s) and featured a Malaysian who acquired the infection through heterosexual contact as host for its ancestral lineages. CRF51_01B then spread rapidly among the MSM in Singapore and Malaysia. Although the importation of CRF51_01B from Singapore to Malaysia is supported by coalescence analysis, the narrow timeframe of the transmission event indicates a closely linked epidemic. Discrepancies in the estimated divergence times suggest that CRF51_01B may have arisen through multiple recombination events from more than one parental lineage. We report the cross transmission of a novel CRF51_01B lineage between countries that involved different sexual risk groups. 
    Understanding the cross-border transmission of HIV-1 involving sexual networks is crucial for effective intervention strategies in the region. PMID:25340817

  13. Cross-border sexual transmission of the newly emerging HIV-1 clade CRF51_01B.

    PubMed

    Cheong, Hui Ting; Ng, Kim Tien; Ong, Lai Yee; Chook, Jack Bee; Chan, Kok Gan; Takebe, Yutaka; Kamarulzaman, Adeeba; Tee, Kok Keng

    2014-01-01

    A novel HIV-1 recombinant clade (CRF51_01B) was recently identified among men who have sex with men (MSM) in Singapore. As cases of sexually transmitted HIV-1 infection increase concurrently in two socioeconomically intimate countries such as Malaysia and Singapore, cross transmission of HIV-1 between said countries is highly probable. In order to investigate the timeline for the emergence of HIV-1 CRF51_01B in Singapore and its possible introduction into Malaysia, 595 HIV-positive subjects recruited in Kuala Lumpur from 2008 to 2012 were screened. Phylogenetic relationship of 485 amplified polymerase gene sequences was determined through neighbour-joining method. Next, near-full length sequences were amplified for genomic sequences inferred to be CRF51_01B and subjected to further analysis implemented through Bayesian Markov chain Monte Carlo (MCMC) sampling and maximum likelihood methods. Based on the near full length genomes, two isolates formed a phylogenetic cluster with CRF51_01B sequences of Singapore origin, sharing identical recombination structure. Spatial and temporal information from Bayesian MCMC coalescent and maximum likelihood analysis of the protease, gp120 and gp41 genes suggest that Singapore is probably the country of origin of CRF51_01B (as early as in the mid-1990s) and featured a Malaysian who acquired the infection through heterosexual contact as host for its ancestral lineages. CRF51_01B then spread rapidly among the MSM in Singapore and Malaysia. Although the importation of CRF51_01B from Singapore to Malaysia is supported by coalescence analysis, the narrow timeframe of the transmission event indicates a closely linked epidemic. Discrepancies in the estimated divergence times suggest that CRF51_01B may have arisen through multiple recombination events from more than one parental lineage. We report the cross transmission of a novel CRF51_01B lineage between countries that involved different sexual risk groups. 
    Understanding the cross-border transmission of HIV-1 involving sexual networks is crucial for effective intervention strategies in the region.

  14. Speech serial control in healthy speakers and speakers with hypokinetic or ataxic dysarthria: effects of sequence length and practice

    PubMed Central

    Reilly, Kevin J.; Spencer, Kristie A.

    2013-01-01

    The current study investigated the processes responsible for selection of sounds and syllables during production of speech sequences in 10 adults with hypokinetic dysarthria from Parkinson’s disease, five adults with ataxic dysarthria, and 14 healthy control speakers. Speech production data from a choice reaction time task were analyzed to evaluate the effects of sequence length and practice on speech sound sequencing. Speakers produced sequences that were between one and five syllables in length over five experimental runs of 60 trials each. In contrast to the healthy speakers, speakers with hypokinetic dysarthria demonstrated exaggerated sequence length effects for both inter-syllable intervals (ISIs) and speech error rates. Conversely, speakers with ataxic dysarthria failed to demonstrate a sequence length effect on ISIs and were also the only group that did not exhibit practice-related changes in ISIs and speech error rates over the five experimental runs. The exaggerated sequence length effects in the hypokinetic speakers with Parkinson’s disease are consistent with an impairment of action selection during speech sequence production. The absent length effects observed in the speakers with ataxic dysarthria are consistent with previous findings that indicate a limited capacity to buffer speech sequences in advance of their execution. In addition, the lack of practice effects in these speakers suggests that learning-related improvements in the production rate and accuracy of speech sequences involve processing by structures of the cerebellum. Together, the current findings inform models of serial control for speech in healthy speakers and support the notion that sequencing deficits contribute to speech symptoms in speakers with hypokinetic or ataxic dysarthria. In addition, these findings indicate that speech sequencing is differentially impaired in hypokinetic and ataxic dysarthria. PMID:24137121

  15. Laser-zone Growth in a Ribbon-to-ribbon (RTR) Process Silicon Sheet Growth Development for the Large Area Silicon Sheet Task of the Low Cost Solar Array Project

    NASA Technical Reports Server (NTRS)

    Baghdadi, A.; Gurtler, R. W.; Legge, R.; Sopori, B.; Rice, M. J.; Ellis, R. J.

    1979-01-01

    A technique for growing limited-length ribbons continually was demonstrated. This Rigid Edge technique can be used to recrystallize about 95% of the polyribbon feedstock. A major advantage of this method is that only a single, constant-length silicon ribbon is handled throughout the entire process sequence; this may be accomplished using cassettes similar to those presently in use for processing Czochralski wafers. Thus a transition from Cz to ribbon technology can be smoothly effected. The maximum size being considered, 3 inches x 24 inches, is half a square foot, and will generate 6 watts for 12% efficiency at 1 sun. Silicon dioxide has been demonstrated as an effective, practical diffusion barrier for use during the polyribbon formation.
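
    The area and power figures quoted in this abstract can be checked with a short calculation. The sketch below assumes that "1 sun" corresponds to the conventional irradiance of 1000 W/m², a value the abstract does not state; the variable names are illustrative.

```python
# Check of the ribbon area and power figures; "1 sun" = 1000 W/m^2 is an assumption.
INCH_IN_METRES = 0.0254
ONE_SUN_W_PER_M2 = 1000.0   # assumed standard irradiance, not given in the abstract

width_in, length_in = 3.0, 24.0
efficiency = 0.12

area_in2 = width_in * length_in                 # 72 square inches
area_ft2 = area_in2 / 144.0                     # 144 square inches per square foot
area_m2 = area_in2 * INCH_IN_METRES ** 2        # convert to square metres

power_w = area_m2 * ONE_SUN_W_PER_M2 * efficiency

print(f"area = {area_ft2} ft^2 = {area_m2:.5f} m^2, power = {power_w:.2f} W")
```

    Under that assumption the ribbon area is exactly half a square foot and the output is about 5.6 W, consistent with the rounded figure of 6 watts.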

  16. Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions

    DOEpatents

    Gardner, Shea N [San Leandro, CA; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA; Young, Jennifer A [Berkeley, CA; Clague, David S [Livermore, CA

    2011-01-18

    A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.

  17. Thirty years since diffuse sound reflection by maximum length

    NASA Astrophysics Data System (ADS)

    Cox, Trevor J.; D'Antonio, Peter

    2005-09-01

    This year celebrates the 30th anniversary of Schroeder's seminal paper on sound scattering from maximum length sequences. This paper, along with Schroeder's subsequent publication on quadratic residue diffusers, broke new ground, because they contained simple recipes for designing diffusers with known acoustic performance. So, what has happened in the intervening years? As with most areas of engineering, the room acoustic diffuser has been greatly influenced by the rise of digital computing technologies. Numerical methods have become much more powerful, and this has enabled predictions of surface scattering to greater accuracy and for larger scale surfaces than previously possible. Architecture has also gone through a revolution where the forms of buildings have become more extreme and sculptural. Acoustic diffuser designs have had to keep pace with this to produce shapes and forms that are desirable to architects. To achieve this, design methodologies have moved away from Schroeder's simple equations to brute force optimization algorithms. This paper will look back at the past development of the modern diffuser, explaining how the principles of diffuser design have been devised and revised over the decades. The paper will also look at the present state of the art, and dreams for the future.

  18. Novel methodologies for spectral classification of exon and intron sequences

    NASA Astrophysics Data System (ADS)

    Kwan, Hon Keung; Kwan, Benjamin Y. M.; Kwan, Jennifer Y. Y.

    2012-12-01

    Digital processing of a nucleotide sequence requires it to be mapped to a numerical sequence in which the choice of nucleotide to numeric mapping affects how well its biological properties can be preserved and reflected from nucleotide domain to numerical domain. Digital spectral analysis of nucleotide sequences unfolds a period-3 power spectral value which is more prominent in an exon sequence as compared to that of an intron sequence. The success of a period-3 based exon and intron classification depends on the choice of a threshold value. The main purposes of this article are to introduce novel codes for 1-sequence numerical representations for spectral analysis and compare them to existing codes to determine appropriate representation, and to introduce novel thresholding methods for more accurate period-3 based exon and intron classification of an unknown sequence. The main findings of this study are summarized as follows: Among sixteen 1-sequence numerical representations, the K-Quaternary Code I offers an attractive performance. A windowed 1-sequence numerical representation (with window length of 9, 15, and 24 bases) offers a possible speed gain over non-windowed 4-sequence Voss representation which increases as sequence length increases. A winner threshold value (chosen from the best among two defined threshold values and one other threshold value) offers a top precision for classifying an unknown sequence of specified fixed lengths. An interpolated winner threshold value applicable to an unknown and arbitrary length sequence can be estimated from the winner threshold values of fixed length sequences with a comparable performance. In general, precision increases as sequence length increases. The study contributes an effective spectral analysis of nucleotide sequences to better reveal embedded properties, and has potential applications in improved genome annotation.
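
    The period-3 measure discussed in this abstract can be sketched for the 4-sequence Voss representation, in which each nucleotide gets its own binary indicator sequence. The code below is a minimal illustration only: the function names and the normalisation by N² are assumptions, and it implements neither the article's 1-sequence codes nor its thresholding methods.

```python
import cmath


def voss_indicators(seq):
    """Map a nucleotide string to four binary indicator sequences (Voss representation)."""
    seq = seq.upper()
    return {base: [1 if s == base else 0 for s in seq] for base in "ACGT"}


def period3_power(seq):
    """Sum over A, C, G, T of |U_b(N/3)|^2, normalised by N^2 (normalisation is an assumption)."""
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    total = 0.0
    for indicator in voss_indicators(seq).values():
        # Fourier coefficient at frequency 1/3 cycles per base, i.e. DFT bin k = N/3.
        coeff = sum(v * cmath.exp(-2j * cmath.pi * i / 3)
                    for i, v in enumerate(indicator))
        total += abs(coeff) ** 2
    return total / n ** 2


codon_like = period3_power("ATG" * 10)    # strict period-3 toy sequence
period_four = period3_power("ACGT" * 6)   # period-4 toy sequence, no period-3 content

print(codon_like, period_four)
```

    A classifier along the lines described would compare this value against a threshold; the toy sequences here are synthetic and only show that a strictly period-3 sequence scores high while a period-4 one scores near zero.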

  19. Molecular basis of length polymorphism in the human zeta-globin gene complex.

    PubMed Central

    Goodbourn, S E; Higgs, D R; Clegg, J B; Weatherall, D J

    1983-01-01

    The length polymorphism between the human zeta-globin gene and its pseudogene is caused by an allele-specific variation in the copy number of a tandemly repeating 36-base-pair sequence. This sequence is related to a tandemly repeated 14-base-pair sequence in the 5' flanking region of the human insulin gene, which is known to cause length polymorphism, and to a repetitive sequence in intervening sequence (IVS) 1 of the pseudo-zeta-globin gene. Evidence is presented that the latter is also of variable length, probably because of differences in the copy number of the tandem repeat. The homology between the three length polymorphisms may be an indication of the presence of a more widespread group of related sequences in the human genome, which might be useful for generalized linkage studies. PMID:6308667

  20. Kinetic Induction of Oat Shoot Pulvinus Invertase mRNA by Gravistimulation and Partial cDNA Cloning by the Polymerase Chain Reaction

    NASA Technical Reports Server (NTRS)

    Wu, Liu-Lai; Song, Il; Karuppiah, Nadarajah; Kaufman, Peter B.

    1993-01-01

    An asymmetric (top vs. bottom halves of pulvini) induction of invertase mRNA by gravistimulation was analyzed in oat shoot pulvini. Total RNA and poly(A)(+) RNA, isolated from oat pulvini, and two oligonucleotide primers, corresponding to two conserved amino acid sequences (NDPNG and WECPD) found in invertase from other species, were used for the polymerase chain reaction (PCR). A partial-length cDNA (550 bp) was obtained and characterized. A 62% nucleotide sequence homology and 58% deduced amino acid sequence homology, as compared to beta-fructosidase of carrot cell wall, was found. Northern blot analysis showed that there was a clearly transient induction of invertase mRNA by gravistimulation in the oat pulvinus system. The mRNA was rapidly induced to a maximum level at 1 hour after gravistimulation treatment and gradually decreased afterwards. The mRNA level in the bottom half of the oat pulvinus was significantly higher than that in the top half of the pulvinus tissue. The kinetic induction of invertase mRNA was consistent with the transient accumulation of invertase activity during the graviresponse of the pulvinus. This indicates that the expression of the invertase gene(s) could be regulated by gravistimulation at the transcriptional level. Southern blot analysis showed that there were two to three genomic DNA fragments which hybridized with the partial-length invertase cDNA.

  1. MHC class II B diversity in blue tits: a preliminary study.

    PubMed

    Aguilar, Juan Rivero-de; Schut, Elske; Merino, Santiago; Martínez, Javier; Komdeur, Jan; Westerdahl, Helena

    2013-07-01

    In this study, we partly characterize major histocompatibility complex (MHC) class II B in the blue tit (Cyanistes caeruleus). A total of 22 individuals from three different European locations (Spain, The Netherlands, and Sweden) were screened for MHC allelic diversity. The MHC genes were investigated using both PCR-based methods and unamplified genomic DNA with restriction fragment length polymorphism (RFLP) and Southern blots. A total of 13 different exon 2 sequences were obtained independently from DNA and/or RNA, thus confirming gene transcription and likely functionality of the genes. Nine out of 13 alleles were found in more than one country, and two alleles appeared in all countries. Positive selection was detected in the region coding for the peptide binding region (PBR). A maximum of three alleles per individual was detected by sequencing and the RFLP pattern consisted of 4-7 fragments, indicating a minimum number of 2-4 loci per individual. A phylogenetic analysis demonstrated that the blue tit sequences are divergent compared to sequences from other passerines, resembling a different MHC lineage than those possessed by most passerines studied to date.

  2. MHC class II B diversity in blue tits: a preliminary study

    PubMed Central

    Aguilar, Juan Rivero-de; Schut, Elske; Merino, Santiago; Martínez, Javier; Komdeur, Jan; Westerdahl, Helena

    2013-01-01

    In this study, we partly characterize major histocompatibility complex (MHC) class II B in the blue tit (Cyanistes caeruleus). A total of 22 individuals from three different European locations (Spain, The Netherlands, and Sweden) were screened for MHC allelic diversity. The MHC genes were investigated using both PCR-based methods and unamplified genomic DNA with restriction fragment length polymorphism (RFLP) and Southern blots. A total of 13 different exon 2 sequences were obtained independently from DNA and/or RNA, thus confirming gene transcription and likely functionality of the genes. Nine out of 13 alleles were found in more than one country, and two alleles appeared in all countries. Positive selection was detected in the region coding for the peptide binding region (PBR). A maximum of three alleles per individual was detected by sequencing and the RFLP pattern consisted of 4–7 fragments, indicating a minimum number of 2–4 loci per individual. A phylogenetic analysis demonstrated that the blue tit sequences are divergent compared to sequences from other passerines, resembling a different MHC lineage than those possessed by most passerines studied to date. PMID:23919136

  3. 3D morphometry using automated aortic segmentation in native MR angiography: an alternative to contrast enhanced MRA?

    PubMed

    Müller-Eschner, Matthias; Müller, Tobias; Biesdorf, Andreas; Wörz, Stefan; Rengier, Fabian; Böckler, Dittmar; Kauczor, Hans-Ulrich; Rohr, Karl; von Tengg-Kobligk, Hendrik

    2014-04-01

    Native-MR angiography (N-MRA) is considered an imaging alternative to contrast enhanced MR angiography (CE-MRA) for patients with renal insufficiency. Lower intraluminal contrast in N-MRA often leads to failure of the segmentation process in commercial algorithms. This study introduces an in-house 3D model-based segmentation approach used to compare both sequences by automatic 3D lumen segmentation, allowing for evaluation of differences of aortic lumen diameters as well as differences in length comparing both acquisition techniques at every possible location. Sixteen healthy volunteers underwent 1.5-T-MR Angiography (MRA). For each volunteer, two different MR sequences were performed, CE-MRA: gradient echo Turbo FLASH sequence and N-MRA: respiratory-and-cardiac-gated, T2-weighted 3D SSFP. Datasets were segmented using a 3D model-based ellipse-fitting approach with a single seed point placed manually above the celiac trunk. The segmented volumes were manually cropped from left subclavian artery to celiac trunk to avoid error due to side branches. Diameters, volumes and centerline length were computed for intraindividual comparison. For statistical analysis the Wilcoxon signed-rank test was used. Average centerline length obtained based on N-MRA was 239.0±23.4 mm compared to 238.6±23.5 mm for CE-MRA without significant difference (P=0.877). Average maximum diameter obtained based on N-MRA was 25.7±3.3 mm compared to 24.1±3.2 mm for CE-MRA (P<0.001). In agreement with the difference in diameters, volumes obtained based on N-MRA (100.1±35.4 cm³) were consistently and significantly larger compared to CE-MRA (89.2±30.0 cm³) (P<0.001). 3D morphometry shows highly similar centerline lengths for N-MRA and CE-MRA, but systematically higher diameters and volumes for N-MRA.

  4. 3D morphometry using automated aortic segmentation in native MR angiography: an alternative to contrast enhanced MRA?

    PubMed Central

    Müller-Eschner, Matthias; Müller, Tobias; Biesdorf, Andreas; Wörz, Stefan; Rengier, Fabian; Böckler, Dittmar; Kauczor, Hans-Ulrich; Rohr, Karl

    2014-01-01

    Introduction: Native-MR angiography (N-MRA) is considered an imaging alternative to contrast enhanced MR angiography (CE-MRA) for patients with renal insufficiency. Lower intraluminal contrast in N-MRA often leads to failure of the segmentation process in commercial algorithms. This study introduces an in-house 3D model-based segmentation approach used to compare both sequences by automatic 3D lumen segmentation, allowing for evaluation of differences of aortic lumen diameters as well as differences in length comparing both acquisition techniques at every possible location. Methods and materials: Sixteen healthy volunteers underwent 1.5-T-MR Angiography (MRA). For each volunteer, two different MR sequences were performed, CE-MRA: gradient echo Turbo FLASH sequence and N-MRA: respiratory-and-cardiac-gated, T2-weighted 3D SSFP. Datasets were segmented using a 3D model-based ellipse-fitting approach with a single seed point placed manually above the celiac trunk. The segmented volumes were manually cropped from left subclavian artery to celiac trunk to avoid error due to side branches. Diameters, volumes and centerline length were computed for intraindividual comparison. For statistical analysis the Wilcoxon signed-rank test was used. Results: Average centerline length obtained based on N-MRA was 239.0±23.4 mm compared to 238.6±23.5 mm for CE-MRA without significant difference (P=0.877). Average maximum diameter obtained based on N-MRA was 25.7±3.3 mm compared to 24.1±3.2 mm for CE-MRA (P<0.001). In agreement with the difference in diameters, volumes obtained based on N-MRA (100.1±35.4 cm³) were consistently and significantly larger compared to CE-MRA (89.2±30.0 cm³) (P<0.001). Conclusions: 3D morphometry shows highly similar centerline lengths for N-MRA and CE-MRA, but systematically higher diameters and volumes for N-MRA. PMID:24834406

  5. Apollo 12 photography 70 mm, 16 mm, and 35 mm frame index

    NASA Technical Reports Server (NTRS)

    1970-01-01

    For each 70-mm frame, the index presents information on: (1) the focal length of the camera, (2) the photo scale at the principal point of the frame, (3) the selenographic coordinates at the principal point of the frame, (4) the percentage of forward overlap of the frame, (5) the sun angle (medium, low, high), (6) the quality of the photography, (7) the approximate tilt (minimum and maximum) of the camera, and (8) the direction of tilt. A brief description of each frame is also included. The index to the 16-mm sequence photography includes information concerning the approximate surface coverage of the photographic sequence and a brief description of the principal features shown. A column of remarks is included to indicate: (1) if the sequence is plotted on the photographic index map and (2) the quality of the photography. The pictures taken using the lunar surface closeup stereoscopic camera (35 mm) are also described in this same index format.

  6. The complete mitochondrial genome of Papilio glaucus and its phylogenetic implications.

    PubMed

    Shen, Jinhui; Cong, Qian; Grishin, Nick V

    2015-09-01

    Due to the intriguing morphology, lifecycle, and diversity of butterflies and moths, Lepidoptera are emerging as model organisms for the study of genetics, evolution and speciation. The progress of these studies relies on decoding Lepidoptera genomes, both nuclear and mitochondrial. Here we describe a protocol to obtain mitogenomes from Next Generation Sequencing reads performed for whole-genome sequencing and report the complete mitogenome of Papilio (Pterourus) glaucus. The circular mitogenome is 15,306 bp in length and rich in A and T. It contains 13 protein-coding genes (PCGs), 22 transfer-RNA-coding genes (tRNA), and 2 ribosomal-RNA-coding genes (rRNA), with a gene order typical for mitogenomes of Lepidoptera. We performed phylogenetic analyses based on PCG and RNA-coding genes or protein sequences using Bayesian Inference and Maximum Likelihood methods. The phylogenetic trees consistently show that among species with available mitogenomes Papilio glaucus is the closest to Papilio (Agehana) maraho from Asia.

  7. Complete genome sequence of a new begomovirus associated with yellow mosaic disease of Hemidesmus indicus in India.

    PubMed

    Reddy, M Sreekanth; Kanakala, S; Srinivas, K P; Hema, M; Malathi, V G; Sreenivasulu, P

    2014-05-01

    The complete DNA A genome of a virus isolate associated with yellow mosaic disease of a medicinal plant, Hemidesmus indicus, from India was cloned and sequenced. The length of DNA A was 2825 nucleotides, 35 nucleotides longer than the unit genome of monopartite begomoviruses. Comparison of the nucleotide sequence of DNA A of the virus isolate with those of other begomoviruses showed maximum sequence identity of 69 % to DNA A of ageratum yellow vein China virus (AYVCNV; AJ558120) and 68 % with tomato yellow leaf curl virus- LBa4 (TYLCV; EF185318), and it formed a distinct clade in phylogenetic analysis. The genome organization of the present virus isolate was found to be similar to that of Old World monopartite begomoviruses. The genome was considered to be monopartite, because association of DNA B and β satellite DNA components was not detected. Based on its sequence identity (<70 %) to all other begomoviruses known to date and ICTV (International Committee on Taxonomy of Viruses) species demarcating criteria (<89 % identity), it is considered a member of a novel begomovirus species, and the tentative name "Hemidesmus yellow mosaic virus" (HeYMV) is proposed.

  8. Characterization of the complete mitochondrial genome of the hybrid Epinephelus moara♀ × Epinephelus lanceolatus♂, and phylogenetic analysis in subfamily epinephelinae

    NASA Astrophysics Data System (ADS)

    Gao, Fengtao; Wei, Min; Zhu, Ying; Guo, Hua; Chen, Songlin; Yang, Guanpin

    2017-06-01

    This study presents the complete mitochondrial genome of the hybrid Epinephelus moara♀× Epinephelus lanceolatus♂. The genome is 16886 bp in length, and contains 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes, a light-strand replication origin and a control region. Additionally, phylogenetic analysis based on the nucleotide sequences of 13 conserved protein-coding genes using the maximum likelihood method indicated that the mitochondrial genome is maternally inherited. This study presents genomic data for studying phylogenetic relationships and breeding of hybrid Epinephelinae.

  9. [Cloning and sequence analysis of full-length cDNA of secoisolariciresinol dehydrogenase of Dysosma versipellis].

    PubMed

    Xu, Li; Ding, Zhi-Shan; Zhou, Yun-Kai; Tao, Xue-Fen

    2009-06-01

    The aim was to obtain the full-length cDNA sequence of the secoisolariciresinol dehydrogenase gene from Dysosma versipellis by RACE PCR and then to characterize the gene. The full-length cDNA sequence was obtained by 3'-RACE and 5'-RACE. This is the first report of the full cDNA sequence of secoisolariciresinol dehydrogenase in Dysosma versipellis. The acquired gene was 991 bp in full length, including a 5' untranslated region of 42 bp and a 3' untranslated region of 112 bp with poly(A). The open reading frame (ORF) encodes 278 amino acids, with a molecular weight of 29253.3 Daltons and an isoelectric point of 6.328. The GenBank accession number of the nucleotide sequence is EU573789. Semi-quantitative RT-PCR analysis revealed that the gene was highly expressed in stem. Alignment of the amino acid sequence indicated there may be some significant amino acid sequence differences among species. In conclusion, the full-length cDNA sequence of the secoisolariciresinol dehydrogenase gene was obtained from Dysosma versipellis.

  10. Description of Globodera ellingtonae n. sp. (Nematoda: Heteroderidae) from Oregon

    PubMed Central

    Handoo, Zafar A.; Carta, Lynn K.; Skantar, Andrea M.; Chitwood, David J.

    2012-01-01

    A new species of cyst nematode, Globodera ellingtonae, is described from soil collected from a field in Oregon. Second-stage juveniles (J2) of the species are characterized by body length of 365-515 μm, stylet length of 19-22.5 μm, basal knobs rounded posteriorly and pointed anteriorly, tail 39-55 μm, hyaline tail terminus 20-32.5 μm, and tail tapering uniformly but abruptly narrowing and constricted near the posterior third of the hyaline portion, ending with a peg-like, finely rounded to pointed terminus. Cysts are spherical to sub-spherical, dark to light brown and circumfenestrate and cyst wall pattern is ridge-like with heavy punctations. Males have a stylet length of 21-25 μm and spicule length of 30-37 μm with a pointed thorn-like tip. Females have a stylet length of 20-22.5 μm, one head annule and labial disc, heavy punctations on the cuticle, and short vulval slit 7.5-8 μm long. Morphologically this new, round-cyst species differs from the related species G. pallida, G. rostochiensis, G. tabacum complex and G. mexicana by its distinctive J2 tail, and by one or another of the following: shorter mean stylet length in J2, females and males; number of refractive bodies in the hyaline tail terminus of J2; cyst morphology including Granek’s ratio; number of cuticular ridges between the anus and vulva; and in the shape and length of spicules in males. Its relationship to these closely related species are discussed. Based upon analysis of ribosomal internal transcribed spacer (ITS) sequences, G. ellingtonae n. sp. is distinct from G. pallida, G. rostochiensis, G. tabacum and G. mexicana. Bayesian and Maximum Parsimony analysis of cloned ITS rRNA gene sequences indicated three clades, with intraspecific variability as high as 2.8%. In silico analysis revealed ITS restriction fragment length polymorphisms for enzymes Bsh 1236I, Hinf I, and Rsa I that overlap patterns for other Globodera species. PMID:23483076

  11. Description of Globodera ellingtonae n. sp. (Nematoda: Heteroderidae) from Oregon.

    PubMed

    Handoo, Zafar A; Carta, Lynn K; Skantar, Andrea M; Chitwood, David J

    2012-03-01

    A new species of cyst nematode, Globodera ellingtonae, is described from soil collected from a field in Oregon. Second-stage juveniles (J2) of the species are characterized by body length of 365-515 μm, stylet length of 19-22.5 μm, basal knobs rounded posteriorly and pointed anteriorly, tail 39-55 μm, hyaline tail terminus 20-32.5 μm, and tail tapering uniformly but abruptly narrowing and constricted near the posterior third of the hyaline portion, ending with a peg-like, finely rounded to pointed terminus. Cysts are spherical to sub-spherical, dark to light brown and circumfenestrate and cyst wall pattern is ridge-like with heavy punctations. Males have a stylet length of 21-25 μm and spicule length of 30-37 μm with a pointed thorn-like tip. Females have a stylet length of 20-22.5 μm, one head annule and labial disc, heavy punctations on the cuticle, and short vulval slit 7.5-8 μm long. Morphologically this new, round-cyst species differs from the related species G. pallida, G. rostochiensis, G. tabacum complex and G. mexicana by its distinctive J2 tail, and by one or another of the following: shorter mean stylet length in J2, females and males; number of refractive bodies in the hyaline tail terminus of J2; cyst morphology including Granek's ratio; number of cuticular ridges between the anus and vulva; and in the shape and length of spicules in males. Its relationship to these closely related species are discussed. Based upon analysis of ribosomal internal transcribed spacer (ITS) sequences, G. ellingtonae n. sp. is distinct from G. pallida, G. rostochiensis, G. tabacum and G. mexicana. Bayesian and Maximum Parsimony analysis of cloned ITS rRNA gene sequences indicated three clades, with intraspecific variability as high as 2.8%. In silico analysis revealed ITS restriction fragment length polymorphisms for enzymes Bsh 1236I, Hinf I, and Rsa I that overlap patterns for other Globodera species.

  12. Sequence-Dependent Persistence Length of Long DNA

    NASA Astrophysics Data System (ADS)

    Chuang, Hui-Min; Reifenberger, Jeffrey G.; Cao, Han; Dorfman, Kevin D.

    2017-12-01

    Using a high-throughput genome-mapping approach, we obtained circa 50 million measurements of the extension of internal human DNA segments in a 41 nm ×41 nm nanochannel. The underlying DNA sequences, obtained by mapping to the reference human genome, are 2.5-393 kilobase pairs long and contain percent GC contents between 32.5% and 60%. Using Odijk's theory for a channel-confined wormlike chain, these data reveal that the DNA persistence length increases by almost 20% as the percent GC content increases. The increased persistence length is rationalized by a model, containing no adjustable parameters, that treats the DNA as a statistical terpolymer with a sequence-dependent intrinsic persistence length and a sequence-independent electrostatic persistence length.

  13. Polarization signatures for abandoned agricultural fields in the Manix Basin area of the Mojave Desert

    NASA Technical Reports Server (NTRS)

    Ray, Terrill W.; Farr, Tom G.; Vanzyl, Jakob J.

    1991-01-01

    Polarimetric signatures from abandoned circular alfalfa fields in the Manix Basin area of the Mojave desert show systematic changes with length of abandonment. The obliteration of circular planting rows by surface processes could account for the disappearance of bright 'spokes', which seems to be reflection patterns from remnants of the planting rows, with increasing length of abandonment. An observed shift in the location of the maximum L-band copolarization return away from VV, as well as an increase in surface roughness, both occurring with increasing age of abandonment, seems to be attributable to the formation of wind ripple on the relatively vegetationless fields. A Late Pleistocene/Holocene sand bar deposit, which can be identified in the radar images, is probably responsible for the failure of three fields to match the age sequence patterns in roughness and peak shift.

  14. SEQUENCING of TSUNAMI WAVES: Why the first wave is not always the largest?

    NASA Astrophysics Data System (ADS)

    Synolakis, C.; Okal, E.

    2016-12-01

    We discuss what contributes to the `sequencing' of tsunami waves in the far field, that is, to the distribution of the maximum sea surface amplitude inside the dominant wave packet constituting the primary arrival at a distant harbour. Based on simple models of sources for which analytical solutions are available, we show that, as range is increased, the wave pattern evolves from a regime of maximum amplitude in the first oscillation to one of delayed maximum, where the largest amplitude takes place during a subsequent oscillation. In the case of the simple, instantaneous uplift of a circular disk at the surface of an ocean of constant depth, the critical distance for transition between those patterns scales as r_0^3 / h^2 where r0 is the radius of the disk and h the depth of the ocean. This behaviour is explained from simple arguments based on a model where sequencing results from frequency dispersion in the primary wave packet, as the width of its spectrum around its dominant period T0 becomes dispersed in time in an amount comparable to T0, the latter being controlled by a combination of source size and ocean depth. The general concepts in this model are confirmed in the case of more realistic sources for tsunami excitation by a finite-time deformation of the ocean floor, as well as in real-life simulations of tsunamis excited by large subduction events, for which we find that the influence of fault width on the distribution of sequencing is more important than that of fault length. Finally, simulation of the major events of Chile (2010) and Japan (2011) at large arrays of virtual gauges in the Pacific Basin correctly predicts the majority of the sequencing patterns observed on DART buoys during these events.
By providing insight into the evolution with time of wave amplitudes inside primary wave packets for far field tsunamis generated by large earthquakes, our results stress the importance, for civil defense authorities, of issuing warning and evacuation orders of sufficient duration to avoid the hazard

  15. Sequencing of tsunami waves: why the first wave is not always the largest

    NASA Astrophysics Data System (ADS)

    Okal, Emile A.; Synolakis, Costas E.

    2016-02-01

    This paper examines the factors contributing to the `sequencing' of tsunami waves in the far field, that is, to the distribution of the maximum sea surface amplitude inside the dominant wave packet constituting the primary arrival at a distant harbour. Based on simple models of sources for which analytical solutions are available, we show that, as range is increased, the wave pattern evolves from a regime of maximum amplitude in the first oscillation to one of delayed maximum, where the largest amplitude takes place during a subsequent oscillation. In the case of the simple, instantaneous uplift of a circular disk at the surface of an ocean of constant depth, the critical distance for transition between those patterns scales as r_0^3 / h^2 where r0 is the radius of the disk and h the depth of the ocean. This behaviour is explained from simple arguments based on a model where sequencing results from frequency dispersion in the primary wave packet, as the width of its spectrum around its dominant period T0 becomes dispersed in time in an amount comparable to T0, the latter being controlled by a combination of source size and ocean depth. The general concepts in this model are confirmed in the case of more realistic sources for tsunami excitation by a finite-time deformation of the ocean floor, as well as in real-life simulations of tsunamis excited by large subduction events, for which we find that the influence of fault width on the distribution of sequencing is more important than that of fault length. Finally, simulation of the major events of Chile (2010) and Japan (2011) at large arrays of virtual gauges in the Pacific Basin correctly predicts the majority of the sequencing patterns observed on DART buoys during these events. 
By providing insight into the evolution with time of wave amplitudes inside primary wave packets for far field tsunamis generated by large earthquakes, our results stress the importance, for civil defense authorities, of issuing warning and evacuation orders of sufficient duration to avoid the hazard inherent in premature calls for all-clear.
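
    As a worked illustration of the scaling quoted in this abstract, the ratios below show how the critical distance responds to a doubling of the disk radius or of the ocean depth. The symbol D_c for the critical distance and the constant C are notation introduced here for illustration; the abstract gives only the proportionality, not the constant.

```latex
D_c = C\,\frac{r_0^{3}}{h^{2}}, \qquad
\frac{D_c(2r_0,\,h)}{D_c(r_0,\,h)} = 2^{3} = 8, \qquad
\frac{D_c(r_0,\,2h)}{D_c(r_0,\,h)} = 2^{-2} = \frac{1}{4}
```

    Under this scaling the transition distance is far more sensitive to source size than to depth: doubling r0 at fixed h moves it out by a factor of 8, while doubling h at fixed r0 brings it in by a factor of 4.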

  16. Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions

    DOEpatents

    Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S

    2013-06-25

    A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.

  17. An Enumerative Combinatorics Model for Fragmentation Patterns in RNA Sequencing Provides Insights into Nonuniformity of the Expected Fragment Starting-Point and Coverage Profile.

    PubMed

    Prakash, Celine; Haeseler, Arndt Von

    2017-03-01

    RNA sequencing (RNA-seq) has emerged as the method of choice for measuring the expression of RNAs in a given cell population. In most RNA-seq technologies, sequencing the full length of RNA molecules requires fragmentation into smaller pieces. Unfortunately, the issue of nonuniform sequencing coverage across a genomic feature has been a concern in RNA-seq and is attributed to biases for certain fragments in RNA-seq library preparation and sequencing. To investigate the expected coverage obtained from fragmentation, we develop a simple fragmentation model that is independent of bias from the experimental method and is not specific to the transcript sequence. Essentially, we enumerate all configurations for maximal placement of a given fragment length, F, on transcript length, T, to represent every possible fragmentation pattern, from which we compute the expected coverage profile across a transcript. We extend this model to incorporate general empirical attributes such as read length, fragment length distribution, and number of molecules of the transcript. We further introduce the fragment starting-point, fragment coverage, and read coverage profiles. We find that the expected profiles are not uniform and that factors such as fragment length to transcript length ratio, read length to fragment length ratio, fragment length distribution, and number of molecules influence the variability of coverage across a transcript. Finally, we explore a potential application of the model where, with simulations, we show that it is possible to correctly estimate the transcript copy number for any transcript in the RNA-seq experiment.

  18. An Enumerative Combinatorics Model for Fragmentation Patterns in RNA Sequencing Provides Insights into Nonuniformity of the Expected Fragment Starting-Point and Coverage Profile

    PubMed Central

    Haeseler, Arndt Von

    2017-01-01

    RNA sequencing (RNA-seq) has emerged as the method of choice for measuring the expression of RNAs in a given cell population. In most RNA-seq technologies, sequencing the full length of RNA molecules requires fragmentation into smaller pieces. Unfortunately, the issue of nonuniform sequencing coverage across a genomic feature has been a concern in RNA-seq and is attributed to biases for certain fragments in RNA-seq library preparation and sequencing. To investigate the expected coverage obtained from fragmentation, we develop a simple fragmentation model that is independent of bias from the experimental method and is not specific to the transcript sequence. Essentially, we enumerate all configurations for maximal placement of a given fragment length, F, on transcript length, T, to represent every possible fragmentation pattern, from which we compute the expected coverage profile across a transcript. We extend this model to incorporate general empirical attributes such as read length, fragment length distribution, and number of molecules of the transcript. We further introduce the fragment starting-point, fragment coverage, and read coverage profiles. We find that the expected profiles are not uniform and that factors such as fragment length to transcript length ratio, read length to fragment length ratio, fragment length distribution, and number of molecules influence the variability of coverage across a transcript. Finally, we explore a potential application of the model where, with simulations, we show that it is possible to correctly estimate the transcript copy number for any transcript in the RNA-seq experiment. PMID:27661099

  19. Towards computational improvement of DNA database indexing and short DNA query searching.

    PubMed

    Stojanov, Done; Koceski, Sašo; Mileva, Aleksandra; Koceska, Nataša; Bande, Cveta Martinovska

    2014-09-03

    In order to facilitate and speed up the search of massive DNA databases, the database is indexed at the beginning, employing a mapping function. By searching through the indexed data structure, exact query hits can be identified. If the database is searched against an annotated DNA query, such as a known promoter consensus sequence, then the starting locations and the number of potential genes can be determined. This is particularly relevant if unannotated DNA sequences have to be functionally annotated. However, indexing a massive DNA database and searching an indexed data structure with millions of entries is a time-demanding process. In this paper, we propose a fast DNA database indexing and searching approach, identifying all query hits in the database, without having to examine all entries in the indexed data structure, limiting the maximum length of a query that can be searched against the database. By applying the proposed indexing equation, the whole human genome could be indexed in 10 hours on a personal computer, under the assumption that there is enough RAM to store the indexed data structure. Analysing the methodology proposed by Reneker, we observed that hits at starting positions [Formula: see text] are not reported, if the database is searched against a query shorter than [Formula: see text] nucleotides, such that [Formula: see text] is the length of the DNA database words being mapped and [Formula: see text] is the length of the query. A solution of this drawback is also presented.
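
    The general idea of indexing a DNA database with a mapping function and then looking up exact query hits can be sketched as follows. This is a generic k-mer hash index written for illustration only: it is not the indexing equation proposed in the paper, and the names build_index, find_hits, and the word length k are assumptions introduced here. Queries shorter than k are rejected outright rather than handled, which sidesteps (but does not solve) the short-query drawback the abstract mentions.

```python
from collections import defaultdict


def build_index(db: str, k: int) -> dict:
    """Map every length-k word of db to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(db) - k + 1):
        index[db[i:i + k]].append(i)
    return index


def find_hits(db: str, index: dict, k: int, query: str) -> list:
    """Return all start positions where query occurs exactly in db.

    Only the positions listed under the query's first k-mer are examined,
    so most of the index is never touched.
    """
    if len(query) < k:
        raise ValueError("query must be at least k nucleotides long")
    candidates = index.get(query[:k], [])
    # Verify the remainder of the query at each candidate position.
    return [i for i in candidates if db.startswith(query, i)]


if __name__ == "__main__":
    database = "ACGTACGTTACG"
    idx = build_index(database, k=3)
    print(find_hits(database, idx, 3, "ACGT"))
```

    A plain dictionary keeps the sketch short; a real genome-scale index would need a far more compact structure, consistent with the abstract's remark about RAM.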

  20. Impact of sequencing depth and read length on single cell RNA sequencing data of T cells.

    PubMed

    Rizzetto, Simone; Eltahla, Auda A; Lin, Peijie; Bull, Rowena; Lloyd, Andrew R; Ho, Joshua W K; Venturi, Vanessa; Luciani, Fabio

    2017-10-06

    Single cell RNA sequencing (scRNA-seq) provides great potential in measuring the gene expression profiles of heterogeneous cell populations. In immunology, scRNA-seq allowed the characterisation of transcript sequence diversity of functionally relevant T cell subsets, and the identification of the full length T cell receptor (TCRαβ), which defines the specificity against cognate antigens. Several factors, e.g. RNA library capture, cell quality, and sequencing output affect the quality of scRNA-seq data. We studied the effects of read length and sequencing depth on the quality of gene expression profiles, cell type identification, and TCRαβ reconstruction, utilising 1,305 single cells from 8 publicly available scRNA-seq datasets, and simulation-based analyses. Gene expression was characterised by an increased number of unique genes identified with short read lengths (<50 bp), but these featured higher technical variability compared to profiles from longer reads. Successful TCRαβ reconstruction was achieved for 6 datasets (81% - 100%) with at least 0.25 million (PE) reads of length >50 bp, while it failed for datasets with <30 bp reads. Sufficient read length and sequencing depth can control technical noise to enable accurate identification of TCRαβ and gene expression profiles from scRNA-seq data of T cells.

  1. 5 CFR 890.1015 - Minimum and maximum length of permissive debarments.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ...) CIVIL SERVICE REGULATIONS (CONTINUED) FEDERAL EMPLOYEES HEALTH BENEFITS PROGRAM Administrative Sanctions Imposed Against Health Care Providers Permissive Debarments § 890.1015 Minimum and maximum length of...

  2. Phylogenetic relationships among four new complete mitogenome sequences of Pelophylax (Amphibia: Anura) from the Balkans and Cyprus.

    PubMed

    Hofman, Sebastian; Pabijan, Maciej; Osikowski, Artur; Litvinchuk, Spartak N; Szymura, Jacek M

    2016-09-01

    We present the full-length mitogenome sequences of four European water frog species: Pelophylax cypriensis, P. epeiroticus, P. kurtmuelleri and P. shqipericus. The mtDNA size varied from 17,363 to 17,895 bp, and its organization with the LPTF tRNA gene cluster preceding the 12S rRNA gene displayed the typical Neobatrachian arrangement. Maximum likelihood and Bayesian inference revealed a well-resolved mtDNA phylogeny of seven European Pelophylax species. The uncorrected p-distance among Pelophylax mitogenomes was 9.6 (range 0.01-0.13). Most divergent was the P. shqipericus mitogenome, clustering with the "P. lessonae" group, in contrast to the other three new Pelophylax mitogenomes related to the "P. bedriagae/ridibundus" lineage. The new mitogenomes resolve ambiguities of the phylogenetic placement of P. cretensis and P. epeiroticus.

  3. Experiment evaluation of speckle suppression efficiency of 2D quasi-spiral M-sequence-based diffractive optical element.

    PubMed

    Lapchuk, A; Pashkevich, G A; Prygun, O V; Yurlov, V; Borodin, Y; Kryuchyn, A; Korchovyi, A A; Shylo, S

    2015-10-01

    The quasi-spiral 2D diffractive optical element (DOE) based on M-sequence of length N=15 is designed and manufactured. The speckle suppression efficiency by the DOE rotation is measured. The speckle suppression coefficients of 10.5, 6, and 4 are obtained for green, violet, and red laser beams, respectively. The results of numerical simulation and experimental data show that the quasi-spiral binary DOE structure can be as effective in speckle reduction as a periodic 2D DOE structure. The numerical simulation and experimental results show that the speckle suppression efficiency of the 2D DOE structure decreases approximately twice at the boundaries of the visible range. It is shown that a replacement of this structure with the bilateral 1D DOE allows obtaining the maximum speckle suppression efficiency in the entire visible range of light.
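
    For readers unfamiliar with M-sequences, a length N=15 maximum-length sequence can be generated with a 4-bit linear feedback shift register. The sketch below is illustrative only: the feedback polynomial x^4 + x + 1, the seed, and the function names are assumptions made here, and the abstract does not say which polynomial or layout the authors used for their DOE.

```python
def m_sequence(taps=(1, 2), nbits=4, seed=1):
    """Return one period (2**nbits - 1 bits) of a Fibonacci LFSR output.

    taps are 1-based register positions XORed to form the feedback bit;
    (1, 2) with nbits=4 realises s[n+4] = s[n] XOR s[n+1], i.e. the
    primitive polynomial x^4 + x + 1 (an assumed choice).
    """
    state = seed
    out = []
    for _ in range((1 << nbits) - 1):
        out.append(state & 1)                  # output the lowest bit
        feedback = 0
        for t in taps:
            feedback ^= (state >> (t - 1)) & 1
        state = (state >> 1) | (feedback << (nbits - 1))
    return out


def periodic_autocorrelation(bits, shift):
    """Circular autocorrelation of the sequence mapped to +1/-1 values."""
    n = len(bits)
    signs = [1 - 2 * b for b in bits]
    return sum(signs[i] * signs[(i + shift) % n] for i in range(n))


if __name__ == "__main__":
    seq = m_sequence()
    print(seq)
    print([periodic_autocorrelation(seq, s) for s in range(len(seq))])
```

    The two-valued autocorrelation (N at zero shift, -1 elsewhere) is the property that makes such sequences attractive for binary structures of this kind.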

  4. Steady-state MEG responses elicited by a sequence of amplitude-modulated short tones of different carrier frequencies.

    PubMed

    Kuriki, Shinya; Kobayashi, Yusuke; Kobayashi, Takanari; Tanaka, Keita; Uchikawa, Yoshinori

    2013-02-01

    The auditory steady-state response (ASSR) is a weak potential or magnetic response elicited by periodic acoustic stimuli with a maximum response at about a 40-Hz periodicity. In most previous studies using amplitude-modulated (AM) tones of stimulus sound, long lasting tones of more than 10 s in length were used. However, characteristics of the ASSR elicited by short AM tones have remained unclear. In this study, we examined magnetoencephalographic (MEG) ASSR using a sequence of sinusoidal AM tones of 0.78 s in length with various tone frequencies of 440-990 Hz in about one octave variation. It was found that the amplitude of the ASSR was invariant with tone frequencies when the level of sound pressure was adjusted along an equal-loudness curve. The amplitude also did not depend on the existence of preceding tone or difference in frequency of the preceding tone. When the sound level of AM tones was changed with tone frequencies in the same range of 440-990 Hz, the amplitude of ASSR varied in a proportional manner to the sound level. These characteristics are favorable for the use of ASSR in studying temporal processing of auditory information in the auditory cortex. The lack of adaptation in the ASSR elicited by a sequence of short tones may be ascribed to the neural activity of widely accepted generator of magnetic ASSR in the primary auditory cortex.

  5. Maximum-likelihood estimation of recent shared ancestry (ERSA).

    PubMed

    Huff, Chad D; Witherspoon, David J; Simonson, Tatum S; Xing, Jinchuan; Watkins, W Scott; Zhang, Yuhua; Tuohy, Therese M; Neklason, Deborah W; Burt, Randall W; Guthery, Stephen L; Woodward, Scott R; Jorde, Lynn B

    2011-05-01

    Accurate estimation of recent shared ancestry is important for genetics, evolution, medicine, conservation biology, and forensics. Established methods estimate kinship accurately for first-degree through third-degree relatives. We demonstrate that chromosomal segments shared by two individuals due to identity by descent (IBD) provide much additional information about shared ancestry. We developed a maximum-likelihood method for the estimation of recent shared ancestry (ERSA) from the number and lengths of IBD segments derived from high-density SNP or whole-genome sequence data. We used ERSA to estimate relationships from SNP genotypes in 169 individuals from three large, well-defined human pedigrees. ERSA is accurate to within one degree of relationship for 97% of first-degree through fifth-degree relatives and 80% of sixth-degree and seventh-degree relatives. We demonstrate that ERSA's statistical power approaches the maximum theoretical limit imposed by the fact that distant relatives frequently share no DNA through a common ancestor. ERSA greatly expands the range of relationships that can be estimated from genetic data and is implemented in a freely available software package.

  6. Characterization of the complete mitochondrial genomes of Nematodirus oiratianus and Nematodirus spathiger of small ruminants

    PubMed Central

    2014-01-01

    Background: Nematodirus spp. are among the most common nematodes of ruminants worldwide. N. oiratianus and N. spathiger are distributed worldwide as highly prevalent gastrointestinal nematodes, which cause emerging health problems and economic losses. Accurate identification of Nematodirus species is essential to develop effective control strategies for Nematodirus infection in ruminants. Mitochondrial DNA (mtDNA) could provide powerful genetic markers for identifying these closely related species and resolving phylogenetic relationships at different taxonomic levels. Methods: In the present study, the complete mitochondrial (mt) genomes of N. oiratianus and N. spathiger from small ruminants in China were obtained using Long-range PCR and sequencing. Results: The complete mt genomes of N. oiratianus and N. spathiger were 13,765 bp and 13,519 bp in length, respectively. Both mt genomes were circular and consisted of 36 genes, including 12 genes encoding proteins, 2 genes encoding rRNA, and 22 genes encoding tRNA. Phylogenetic analyses based on the concatenated amino acid sequence data of all 12 protein-coding genes by Bayesian inference (BI), Maximum likelihood (ML) and Maximum parsimony (MP) showed that the two Nematodirus species (Molineidae) were closely related to Dictyocaulidae. Conclusions: The availability of the complete mtDNA sequences of N. oiratianus and N. spathiger not only provides new mtDNA sources for a better understanding of nematode mt genomics and phylogeny, but also provides novel and useful genetic markers for studying diagnosis, population genetics and molecular epidemiology of Nematodirus spp. in small ruminants. PMID:25015379

  7. Characterization of the complete mitochondrial genomes of Nematodirus oiratianus and Nematodirus spathiger of small ruminants.

    PubMed

    Zhao, Guang-Hui; Jia, Yan-Qing; Cheng, Wen-Yu; Zhao, Wen; Bian, Qing-Qing; Liu, Guo-Hua

    2014-07-11

    Nematodirus spp. are among the most common nematodes of ruminants worldwide. N. oiratianus and N. spathiger are distributed worldwide as highly prevalent gastrointestinal nematodes, which cause emerging health problems and economic losses. Accurate identification of Nematodirus species is essential to develop effective control strategies for Nematodirus infection in ruminants. Mitochondrial DNA (mtDNA) could provide powerful genetic markers for identifying these closely related species and resolving phylogenetic relationships at different taxonomic levels. In the present study, the complete mitochondrial (mt) genomes of N. oiratianus and N. spathiger from small ruminants in China were obtained using Long-range PCR and sequencing. The complete mt genomes of N. oiratianus and N. spathiger were 13,765 bp and 13,519 bp in length, respectively. Both mt genomes were circular and consisted of 36 genes, including 12 genes encoding proteins, 2 genes encoding rRNA, and 22 genes encoding tRNA. Phylogenetic analyses based on the concatenated amino acid sequence data of all 12 protein-coding genes by Bayesian inference (BI), Maximum likelihood (ML) and Maximum parsimony (MP) showed that the two Nematodirus species (Molineidae) were closely related to Dictyocaulidae. The availability of the complete mtDNA sequences of N. oiratianus and N. spathiger not only provides new mtDNA sources for a better understanding of nematode mt genomics and phylogeny, but also provides novel and useful genetic markers for studying diagnosis, population genetics and molecular epidemiology of Nematodirus spp. in small ruminants.

  8. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    PubMed Central

    Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.

    2015-01-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030

  9. Complete plastid genome sequence of Daucus carota: implications for biotechnology and phylogeny of angiosperms.

    PubMed

    Ruhlman, Tracey; Lee, Seung-Bum; Jansen, Robert K; Hostetler, Jessica B; Tallon, Luke J; Town, Christopher D; Daniell, Henry

    2006-08-31

    Carrot (Daucus carota) is a major food crop in the US and worldwide. Its capacity for storage and its lifecycle as a biennial make it an attractive species for the introduction of foreign genes, especially for oral delivery of vaccines and other therapeutic proteins. Until recently, efforts to express recombinant proteins in carrot have had limited success in terms of protein accumulation in the edible tap roots. Plastid genetic engineering offers the potential to overcome this limitation, as demonstrated by the accumulation of BADH in chromoplasts of carrot taproots to confer exceedingly high levels of salt resistance. The complete plastid genome of carrot provides essential information required for genetic engineering. Additionally, the sequence data add to the rapidly growing database of plastid genomes for assessing phylogenetic relationships among angiosperms. The complete carrot plastid genome is 155,911 bp in length, with 115 unique genes and 21 duplicated genes within the IR. There are four ribosomal RNAs, 30 distinct tRNA genes and 18 intron-containing genes. Repeat analysis reveals 12 direct and 2 inverted repeats ≥ 30 bp with a sequence identity ≥ 90%. Phylogenetic analyses of nucleotide sequences for 61 protein-coding genes using both maximum parsimony (MP) and maximum likelihood (ML) were performed for 29 angiosperms. Phylogenies from both methods provide strong support for the monophyly of several major angiosperm clades, including monocots, eudicots, rosids, asterids, eurosids II, euasterids I, and euasterids II. The carrot plastid genome contains a number of dispersed direct and inverted repeats scattered throughout coding and non-coding regions. This is the first sequenced plastid genome of the family Apiaceae and only the second published genome sequence of the species-rich euasterid II clade. Both MP and ML trees provide very strong support (100% bootstrap) for the sister relationship of Daucus with Panax in the euasterid II clade. These results provide the best taxon sampling of complete chloroplast genomes and the strongest support yet for the sister relationship of Caryophyllales to the asterids. The availability of the complete plastid genome sequence should facilitate improved transformation efficiency and foreign gene expression in carrot through utilization of endogenous flanking sequences and regulatory elements.

  10. Four new topological indices based on the molecular path code.

    PubMed

    Balaban, Alexandru T; Beteringhe, Adrian; Constantinescu, Titus; Filip, Petru A; Ivanciuc, Ovidiu

    2007-01-01

    The sequence of all path counts $p_i$ of lengths $i = 1$ to the maximum possible length in a hydrogen-depleted molecular graph (which sequence is also called the molecular path code) contains significant information on the molecular topology, and as such it is a reasonable choice to be selected as the basis of topological indices (TIs). Four new (or five partly new) TIs with progressively improved performance (judged by correctly reflecting branching, centricity, and cyclicity of graphs, ordering of alkanes, and low degeneracy) have been explored. (i) By summing the squares of all numbers in the sequence one obtains $\sum_i p_i^2$, and by dividing this sum by one plus the cyclomatic number $\mu$, a Quadratic TI is obtained: $Q = \sum_i p_i^2/(\mu+1)$. (ii) On summing the Square roots of all numbers in the sequence one obtains $\sum_i p_i^{1/2}$, and by dividing this sum by one plus the cyclomatic number, the TI denoted by S is obtained: $S = \sum_i p_i^{1/2}/(\mu+1)$. (iii) On dividing terms in this sum by the corresponding topological distances, one obtains the Distance-reduced index $D = \sum_i \{p_i^{1/2}/[i(\mu+1)]\}$. Two similar formulas define the next two indices, the first one with no square roots: (iv) distance-Attenuated index: $A = \sum_i \{p_i/[i(\mu+1)]\}$; and (v) the last TI with two square roots: Path-count index: $P = \sum_i \{p_i^{1/2}/[i^{1/2}(\mu+1)]\}$. These five TIs are compared for their degeneracy, ordering of alkanes, and performance in QSPR (for all alkanes with 3-12 carbon atoms and for all possible chemical cyclic or acyclic graphs with 4-6 carbon atoms) in correlations with six physical properties and one chemical property.
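
    The five formulas in this abstract can be evaluated directly from a path code. The following is a minimal sketch, not the authors' software: the function name path_code_indices and the input convention (a list whose k-th entry is the number of paths of length k + 1) are assumptions made for illustration. The example uses the carbon skeleton of n-butane, an acyclic graph (cyclomatic number 0) with 3, 2 and 1 paths of lengths 1, 2 and 3.

```python
import math


def path_code_indices(path_counts, mu=0):
    """Evaluate Q, S, D, A and P from a molecular path code.

    path_counts[k] is the number of paths of length i = k + 1
    (hypothetical input convention); mu is the cyclomatic number.
    """
    terms = list(enumerate(path_counts, start=1))  # (i, p_i) pairs
    denom = mu + 1
    return {
        "Q": sum(p ** 2 for _, p in terms) / denom,
        "S": sum(math.sqrt(p) for _, p in terms) / denom,
        "D": sum(math.sqrt(p) / i for i, p in terms) / denom,
        "A": sum(p / i for i, p in terms) / denom,
        "P": sum(math.sqrt(p) / math.sqrt(i) for i, p in terms) / denom,
    }


# n-butane skeleton: 3 paths of length 1, 2 of length 2, 1 of length 3.
butane = path_code_indices([3, 2, 1], mu=0)
print(butane)
```

    For this path code, Q is 9 + 4 + 1 = 14 and A is 3 + 1 + 1/3.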

  11. First full-length genome sequence of the polerovirus luffa aphid-borne yellows virus (LABYV) reveals the presence of at least two consensus sequences in an isolate from Thailand.

    PubMed

    Knierim, Dennis; Maiss, Edgar; Kenyon, Lawrence; Winter, Stephan; Menzel, Wulf

    2015-10-01

    Luffa aphid-borne yellows virus (LABYV) was proposed as the name for a previously undescribed polerovirus based on partial genome sequences obtained from samples of cucurbit plants collected in Thailand between 2008 and 2013. In this study, we determined the first full-length genome sequence of LABYV. Based on phylogenetic analysis and genome properties, it is clear that this virus represents a distinct species in the genus Polerovirus. Analysis of sequences from sample TH24, which was collected in 2010 from a luffa plant in Thailand, reveals the presence of two different full-length genome consensus sequences.

  12. Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations.

    PubMed

    Oikonomopoulos, Spyros; Wang, Yu Chang; Djambazian, Haig; Badescu, Dunarel; Ragoussis, Jiannis

    2016-08-24

    To assess the performance of the Oxford Nanopore Technologies MinION sequencing platform, cDNAs from the External RNA Controls Consortium (ERCC) RNA Spike-In mix were sequenced. This mix mimics mammalian mRNA species and consists of 92 polyadenylated transcripts with known concentration. cDNA libraries were generated using a template switching protocol to facilitate the direct comparison between different sequencing platforms. The MinION performance was assessed for its ability to sequence the cDNAs directly with good accuracy in terms of abundance and full length. The abundance of the ERCC cDNA molecules sequenced by MinION agreed with their expected concentration. No length or GC content bias was observed. The majority of cDNAs were sequenced as full length. Additionally, a complex cDNA population derived from a human HEK-293 cell line was sequenced on Illumina HiSeq 2500, PacBio RS II and ONT MinION platforms. We observed that there was a good agreement in the measured cDNA abundance between PacBio RS II and ONT MinION (Pearson r = 0.82, isoforms with length more than 700 bp) and between Illumina HiSeq 2500 and ONT MinION (Pearson r = 0.75). This indicates that the ONT MinION can sequence quantitatively both long and short full length cDNA molecules.

  13. The Complete Mitochondrial Genome of Galba pervia (Gastropoda: Mollusca), an Intermediate Host Snail of Fasciola spp

    PubMed Central

    Huang, Wei-Yi; Zhao, Guang-Hui; Wei, Shu-Jun; Song, Hui-Qun; Xu, Min-Jun; Lin, Rui-Qing; Zhou, Dong-Hui; Zhu, Xing-Quan

    2012-01-01

    Complete mitochondrial (mt) genomes and the gene rearrangements are increasingly used as molecular markers for investigating phylogenetic relationships. Contributing to the complete mt genomes of Gastropoda, especially Pulmonata, we determined the mt genome of the freshwater snail Galba pervia, which is an important intermediate host for Fasciola spp. in China. The complete mt genome of G. pervia is 13,768 bp in length. Its genome is circular, and consists of 37 genes, including 13 genes for proteins, 2 genes for rRNA, and 22 genes for tRNA. The mt gene order of G. pervia showed a novel arrangement (tRNA-His, tRNA-Gly and tRNA-Tyr change positions and directions) when compared with mt genomes of Pulmonata species sequenced to date, indicating divergence among different species within the Pulmonata. A total of 3655 amino acids are encoded by the 13 protein-coding genes. The most frequently used amino acid is Leu (15.05%), followed by Phe (11.24%), Ser (10.76%) and Ile (8.346%). Phylogenetic analyses using the concatenated amino acid sequences of the 13 protein-coding genes, with three different computational algorithms (maximum parsimony, maximum likelihood and Bayesian analysis), all revealed that the families Lymnaeidae and Planorbidae are two closely related snail families, consistent with previous classifications based on morphological and molecular studies. The complete mt genome sequence of G. pervia showed a novel gene arrangement and it represents the first sequenced high quality mt genome of the family Lymnaeidae. These novel mtDNA data provide additional genetic markers for studying the epidemiology, population genetics and phylogeographics of freshwater snails, as well as for understanding interplay between the intermediate snail hosts and the intra-mollusca stages of Fasciola spp. PMID:22844544

  14. Probing the Structures of Viral RNA Regulatory Elements with SHAPE and Related Methodologies

    PubMed Central

    Rausch, Jason W.; Sztuba-Solinska, Joanna; Le Grice, Stuart F. J.

    2018-01-01

    Viral RNAs were selected by evolution to possess maximum functionality in a minimal sequence. Depending on the classification of the virus and the type of RNA in question, viral RNAs must alternately be replicated, spliced, transcribed, transported from the nucleus into the cytoplasm, translated and/or packaged into nascent virions, and in most cases, provide the sequence and structural determinants to facilitate these processes. One consequence of this compact multifunctionality is that viral RNA structures can be exquisitely complex, often involving intermolecular interactions with RNA or protein, intramolecular interactions between sequence segments separated by several thousands of nucleotides, or specialized motifs such as pseudoknots or kissing loops. The fluidity of viral RNA structure can also present a challenge when attempting to characterize it, as genomic RNAs especially are likely to sample numerous conformations at various stages of the virus life cycle. Here we review advances in chemoenzymatic structure probing that have made it possible to address such challenges with respect to cis-acting elements, full-length viral genomes and long non-coding RNAs that play a major role in regulating viral gene expression. PMID:29375504

  15. Genetic programs can be compressed and autonomously decompressed in live cells

    NASA Astrophysics Data System (ADS)

    Lapique, Nicolas; Benenson, Yaakov

    2018-04-01

    Fundamental computer science concepts have inspired novel information-processing molecular systems in test tubes [1-13] and genetically encoded circuits in live cells [14-21]. Recent research has shown that digital information storage in DNA, implemented using deep sequencing and conventional software, can approach the maximum Shannon information capacity [22] of two bits per nucleotide [23]. In nature, DNA is used to store genetic programs, but the information content of the encoding rarely approaches this maximum [24]. We hypothesize that the biological function of a genetic program can be preserved while reducing the length of its DNA encoding and increasing the information content per nucleotide. Here we support this hypothesis by describing an experimental procedure for compressing a genetic program and its subsequent autonomous decompression and execution in human cells. As a test-bed we choose an RNAi cell classifier circuit [25] that comprises redundant DNA sequences and is therefore amenable to compression, as are many other complex gene circuits [15,18,26-28]. In one example, we implement a compressed encoding of a ten-gene four-input AND gate circuit using only four genetic constructs. The compression principles applied to gene circuits can enable fitting complex genetic programs into DNA delivery vehicles with limited cargo capacity, and storing compressed and biologically inert programs in vivo for on-demand activation.
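
    The two-bits-per-nucleotide ceiling follows from the four-letter alphabet: log2(4) = 2. As a rough illustration only, the sketch below computes the single-symbol Shannon entropy of a sequence from its base frequencies. This ignores correlations between positions and is not the measure of information content used in the paper; the function name bits_per_nucleotide and both example sequences are made up for illustration.

```python
import math
from collections import Counter


def bits_per_nucleotide(seq):
    """Single-symbol Shannon entropy (bits per base) from base frequencies."""
    counts = Counter(seq.upper())
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Equal use of A, C, G and T reaches the 2-bit maximum.
uniform = bits_per_nucleotide("ACGT" * 25)
# A composition dominated by one base carries less information per position.
skewed = bits_per_nucleotide("A" * 70 + "C" * 10 + "G" * 10 + "T" * 10)
print(uniform, skewed)
```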

  16. Comparisons between Arabidopsis thaliana and Drosophila melanogaster in relation to Coding and Noncoding Sequence Length and Gene Expression

    PubMed Central

    Caldwell, Rachel; Lin, Yan-Xia; Zhang, Ren

    2015-01-01

    There is a continuing interest in the analysis of gene architecture and gene expression to determine the relationship that may exist. Advances in high-quality sequencing technologies and large-scale resource datasets have increased the understanding of relationships and cross-referencing of expression data to the large genome data. Although a negative correlation between expression level and gene (especially transcript) length has been generally accepted, there have been some conflicting results arising from the literature concerning the impacts of different regions of genes, and the underlying reason is not well understood. The research aims to apply quantile regression techniques for statistical analysis of coding and noncoding sequence length and gene expression data in the plant, Arabidopsis thaliana, and fruit fly, Drosophila melanogaster, to determine if a relationship exists and if there is any variation or similarities between these species. The quantile regression analysis found that the coding sequence length and gene expression correlations varied, and similarities emerged for the noncoding sequence length (5′ and 3′ UTRs) between animal and plant species. In conclusion, the information described in this study provides the basis for further exploration into gene regulation with regard to coding and noncoding sequence length. PMID:26114098

  17. Characterization of a Novel Subgroup of Extracellular Medium-Chain-Length Polyhydroxyalkanoate Depolymerases from Actinobacteria

    PubMed Central

    Gangoiti, Joana; Santos, Marta; Prieto, María Auxiliadora; de la Mata, Isabel; Llama, María J.

    2012-01-01

    Nineteen medium-chain-length (mcl) poly(3-hydroxyalkanoate) (PHA)-degrading microorganisms were isolated from natural sources. From them, seven Gram-positive and three Gram-negative bacteria were identified. The ability of these microorganisms to hydrolyze other biodegradable plastics, such as short-chain-length (scl) PHA, poly(ε-caprolactone) (PCL), poly(ethylene succinate) (PES), and poly(l-lactide) (PLA), has been studied. On the basis of the great ability to degrade different polyesters, Streptomyces roseolus SL3 was selected, and its extracellular depolymerase was biochemically characterized. The enzyme consisted of one polypeptide chain of 28 kDa with a pI value of 5.2. Its maximum activity was observed at pH 9.5 with chromogenic substrates. The purified enzyme hydrolyzed mcl PHA and PCL but not scl PHA, PES, and PLA. Moreover, the mcl PHA depolymerase can hydrolyze various substrates for esterases, such as tributyrin and p-nitrophenyl (pNP)-alkanoates, with its maximum activity being measured with pNP-octanoate. Interestingly, when poly(3-hydroxyoctanoate-co-3-hydroxyhexanoate [11%]) was used as the substrate, the main hydrolysis product was the monomer (R)-3-hydroxyoctanoate. In addition, the genes of several Actinobacteria strains, including S. roseolus SL3, were identified on the basis of the peptide de novo sequencing of the Streptomyces venezuelae SO1 mcl PHA depolymerase by tandem mass spectrometry. These enzymes did not show significant similarity to mcl PHA depolymerases characterized previously. Our results suggest that these distinct enzymes might represent a new subgroup of mcl PHA depolymerases. PMID:22865072

  18. Fishing and bottom water temperature as drivers of change in maximum shell length in Atlantic surfclams (Spisula solidissima)

    NASA Astrophysics Data System (ADS)

    Munroe, D. M.; Narváez, D. A.; Hennen, D.; Jacobson, L.; Mann, R.; Hofmann, E. E.; Powell, E. N.; Klinck, J. M.

    2016-03-01

    Maximum shell length of Atlantic surfclams (Spisula solidissima) on the Middle Atlantic Bight (MAB) continental shelf, obtained from federal fishery survey data from 1982-present, has decreased by 15-20 mm. Two potential causes of this decreasing trend, fishery removal of large animals and stress due to warming bottom temperatures, were investigated using an individual-based model for post-settlement surfclams and a fifty-year hindcast of bottom water temperatures on the MAB. Simulations showed that fishing and/or warming bottom water temperature can cause decreases in maximum surfclam shell length (body size) equivalent to those observed in the fished stock. Independently, either localized fishing rates of 20% or sustained bottom temperatures that are 2 °C warmer than average conditions generate the observed decrease in maximum shell length. However, these independent conditions represent extremes and are not sustained in the MAB. The combined effects of fishing and warmer temperatures can generate simulated length decreases that are similar to observed decreases. Interannual variability in bottom water temperatures can also generate fluctuations in simulated shell length of up to 20 mm over a period of 10-15 years. If the change in maximum size is not genotypic, simulations also suggest that shell size composition of surfclam populations can recover if conditions change; however, that recovery could take a decade to become evident.

  19. Coiled-coil length: Size does matter.

    PubMed

    Surkont, Jaroslaw; Diekmann, Yoan; Ryder, Pearl V; Pereira-Leal, Jose B

    2015-12-01

    Protein evolution is governed by processes that alter not only the primary sequence but also the length of proteins. Protein length may change in different ways, but insertions, deletions and duplications are the most common. An optimal protein size is a trade-off between sequence extension, which may change protein stability or lead to acquisition of a new function, and shrinkage that decreases metabolic cost of protein synthesis. Despite the general tendency for length conservation across orthologous proteins, the propensity to accept insertions and deletions is heterogeneous along the sequence. For example, protein regions rich in repetitive peptide motifs are well known to extensively vary their length across species. Here, we analyze length conservation of coiled-coils, domains formed by a ubiquitous, repetitive peptide motif present in all domains of life, that frequently plays a structural role in the cell. We observed that, despite the repetitive nature, the length of coiled-coil domains is generally highly conserved throughout the tree of life, even when the remaining parts of the protein change, including globular domains. Length conservation is independent of primary amino acid sequence variation, and represents a conservation of domain physical size. This suggests that the conservation of domain size is due to functional constraints. © 2015 Wiley Periodicals, Inc.

  20. Database-independent Protein Sequencing (DiPS) Enables Full-length de Novo Protein and Antibody Sequence Determination.

    PubMed

    Savidor, Alon; Barzilay, Rotem; Elinger, Dalia; Yarden, Yosef; Lindzen, Moshit; Gabashvili, Alexandra; Adiv Tal, Ophir; Levin, Yishai

    2017-06-01

    Traditional "bottom-up" proteomic approaches use proteolytic digestion, LC-MS/MS, and database searching to elucidate peptide identities and their parent proteins. Protein sequences absent from the database cannot be identified, and even if present in the database, complete sequence coverage is rarely achieved even for the most abundant proteins in the sample. Thus, sequencing of unknown proteins such as antibodies or constituents of metaproteomes remains a challenging problem. To date, there is no available method for full-length protein sequencing, independent of a reference database, in high throughput. Here, we present Database-independent Protein Sequencing, a method for unambiguous, rapid, database-independent, full-length protein sequencing. The method is a novel combination of non-enzymatic, semi-random cleavage of the protein, LC-MS/MS analysis, peptide de novo sequencing, extraction of peptide tags, and their assembly into a consensus sequence using an algorithm named "Peptide Tag Assembler." As proof-of-concept, the method was applied to samples of three known proteins representing three size classes and to a previously un-sequenced, clinically relevant monoclonal antibody. Excluding leucine/isoleucine and glutamic acid/deamidated glutamine ambiguities, end-to-end full-length de novo sequencing was achieved with 99-100% accuracy for all benchmarking proteins and the antibody light chain. Accuracy of the sequenced antibody heavy chain, including the entire variable region, was also 100%, but there was a 23-residue gap in the constant region sequence. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  1. High-Resolution Sequence-Function Mapping of Full-Length Proteins

    PubMed Central

    Kowalsky, Caitlin A.; Klesmith, Justin R.; Stapleton, James A.; Kelly, Vince; Reichkitzer, Nolan; Whitehead, Timothy A.

    2015-01-01

    Comprehensive sequence-function mapping involves detailing the fitness contribution of every possible single mutation to a gene by comparing the abundance of each library variant before and after selection for the phenotype of interest. Deep sequencing of library DNA allows frequency reconstruction for tens of thousands of variants in a single experiment, yet short read lengths of current sequencers make it challenging to probe genes encoding full-length proteins. Here we extend the scope of sequence-function maps to entire protein sequences with a modular, universal sequence tiling method. We demonstrate the approach with both growth-based selections and FACS screening, offer parameters and best practices that simplify design of experiments, and present analytical solutions to normalize data across independent selections. Using this protocol, sequence-function maps covering full sequences can be obtained in four to six weeks. Best practices introduced in this manuscript are fully compatible with, and complementary to, other recently published sequence-function mapping protocols. PMID:25790064
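
    Comparing variant abundance before and after selection is commonly summarized as a log2 enrichment ratio relative to wild type. The sketch below shows that generic calculation under stated assumptions; it is not the normalization derived in the paper. The function name log2_enrichment, the pseudocount of 0.5, the variant labels and the read counts are all hypothetical.

```python
import math


def log2_enrichment(pre_counts, post_counts, wild_type="WT", pseudocount=0.5):
    """Log2 change in variant frequency across selection, relative to wild type."""
    pre_total = sum(pre_counts.values())
    post_total = sum(post_counts.values())

    def ratio(variant):
        # The pseudocount keeps the ratio finite when a variant drops out.
        f_pre = (pre_counts.get(variant, 0) + pseudocount) / pre_total
        f_post = (post_counts.get(variant, 0) + pseudocount) / post_total
        return math.log2(f_post / f_pre)

    wt_ratio = ratio(wild_type)
    return {variant: ratio(variant) - wt_ratio for variant in pre_counts}


# Hypothetical read counts before and after selection.
pre = {"WT": 1000, "A10G": 100, "L25P": 100}
post = {"WT": 1000, "A10G": 200, "L25P": 10}
scores = log2_enrichment(pre, post)
print(scores)
```

    In this made-up example the enriched variant scores above zero and the depleted variant below zero, with wild type fixed at zero by construction.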

  2. Cost-effective sequencing of full-length cDNA clones powered by a de novo-reference hybrid assembly.

    PubMed

    Kuroshu, Reginaldo M; Watanabe, Junichi; Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka; Kasahara, Masahiro

    2010-05-07

    Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was needed. We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence approximately 800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by conventional primer walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that it performs well even for non-human species. The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only approximately US$3 per clone, demonstrating a significant advantage over previous approaches.

  3. Influence of time and length size feature selections for human activity sequences recognition.

    PubMed

    Fang, Hongqing; Chen, Long; Srinivasan, Raghavendiran

    2014-01-01

    In this paper, the Viterbi algorithm based on a hidden Markov model is applied to recognize activity sequences from observed sensor events. Alternative selections of time feature values of sensor events and of activity length size feature values are tested, and the resulting activity sequence recognition performance of the Viterbi algorithm is evaluated. The results show that selecting larger time feature values of sensor events and/or smaller activity length size feature values yields relatively better activity sequence recognition performance. © 2013 ISA. Published by ISA. All rights reserved.
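
    For readers unfamiliar with the technique, the following is a minimal sketch of Viterbi decoding on a toy hidden Markov model. The two activities, two sensor events and all probabilities are invented for illustration and are not taken from the paper, which does not give its model parameters in the abstract.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               predecessor state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            column[s] = max(
                (prev[r][0] * trans_p[r][s] * emit_p[s][obs], r) for r in states
            )
        best.append(column)

    # Trace back from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


# Toy model (all values hypothetical).
states = ("cooking", "sleeping")
start_p = {"cooking": 0.5, "sleeping": 0.5}
trans_p = {
    "cooking": {"cooking": 0.8, "sleeping": 0.2},
    "sleeping": {"cooking": 0.2, "sleeping": 0.8},
}
emit_p = {
    "cooking": {"kitchen": 0.9, "bedroom": 0.1},
    "sleeping": {"kitchen": 0.1, "bedroom": 0.9},
}
events = ["kitchen", "kitchen", "bedroom", "bedroom"]
path = viterbi(events, states, start_p, trans_p, emit_p)
print(path)
```

    For long event streams, log-probabilities are normally used instead of raw products to avoid numerical underflow.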

  4. A high HIV-1 strain variability in London, UK, revealed by full-genome analysis: Results from the ICONIC project.

    PubMed

    Yebra, Gonzalo; Frampton, Dan; Gallo Cassarino, Tiziano; Raffle, Jade; Hubb, Jonathan; Ferns, R Bridget; Waters, Laura; Tong, C Y William; Kozlakidis, Zisis; Hayward, Andrew; Kellam, Paul; Pillay, Deenan; Clark, Duncan; Nastouli, Eleni; Leigh Brown, Andrew J

    2018-01-01

    The ICONIC project has developed an automated high-throughput pipeline to generate HIV nearly full-length genomes (NFLG, i.e. from gag to nef) from next-generation sequencing (NGS) data. The pipeline was applied to 420 HIV samples collected at University College London Hospitals NHS Trust and Barts Health NHS Trust (London) and sequenced using an Illumina MiSeq at the Wellcome Trust Sanger Institute (Cambridge). Consensus genomes were generated and subtyped using COMET, and unique recombinants were studied with jpHMM and SimPlot. Maximum-likelihood phylogenetic trees were constructed using RAxML to identify transmission networks using the Cluster Picker. The pipeline generated sequences of at least 1 Kb in length (median = 7.46 Kb, IQR = 4.01 Kb) for 375 out of the 420 samples (89%), with 174 (46.4%) being NFLG. A total of 365 sequences (169 of them NFLG) corresponded to unique subjects and were included in the downstream analyses. The most frequent HIV subtypes were B (n = 149, 40.8%) and C (n = 77, 21.1%) and the circulating recombinant form CRF02_AG (n = 32, 8.8%). We found 14 different CRFs (n = 66, 18.1%) and multiple URFs (n = 32, 8.8%) that involved recombination between 12 different subtypes/CRFs. The most frequent URFs were B/CRF01_AE (4 cases) and A1/D, B/C, and B/CRF02_AG (3 cases each). Most URFs (19/26, 73%) lacked breakpoints in the PR+RT pol region, rendering them undetectable if only that was sequenced. Twelve (37.5%) of the URFs could have emerged within the UK, whereas the rest were probably imported from sub-Saharan Africa, South East Asia and South America. For 2 URFs we found highly similar pol sequences circulating in the UK. We detected 31 phylogenetic clusters using the full dataset: 25 pairs (mostly subtypes B and C), 4 triplets and 2 quadruplets. Some of these were not consistent across different genes due to inter- and intra-subtype recombination. Clusters involved 70 sequences, 19.2% of the dataset. The initial analysis of genome sequences detected substantial hidden variability in the London HIV epidemic. Analysing full genome sequences, as opposed to only PR+RT, identified previously undetected recombinants. It provided a more reliable description of CRFs (that would be otherwise misclassified) and transmission clusters.

  5. Distribution and cluster analysis of predicted intrinsically disordered protein Pfam domains

    PubMed Central

    Williams, Robert W; Xue, Bin; Uversky, Vladimir N; Dunker, A Keith

    2013-01-01

    The Pfam database groups regions of proteins by how well hidden Markov models (HMMs) can be trained to recognize similarities among them. Conservation pressure is probably in play here. The Pfam seed training set includes sequence and structure information, being drawn largely from the PDB. A long standing hypothesis among intrinsically disordered protein (IDP) investigators has held that conservation pressures are also at play in the evolution of different kinds of intrinsic disorder, but we find that predicted intrinsic disorder (PID) is not always conserved across Pfam domains. Here we analyze distributions and clusters of PID regions in 193024 members of the version 23.0 Pfam seed database. To include the maximum information available for proteins that remain unfolded in solution, we employ the 10 linearly independent Kidera factors [1–3] for the amino acids, combined with PONDR [4] predictions of disorder tendency, to transform the sequences of these Pfam members into an 11 column matrix where the number of rows is the length of each Pfam region. Cluster analyses of the set of all regions, including those that are folded, show 6 groupings of domains. Cluster analyses of domains with mean VSL2b scores greater than 0.5 (half predicted disorder or more) show at least 3 separated groups. It is hypothesized that grouping sets into shorter sequences with more uniform length will reveal more information about intrinsic disorder and lead to more finely structured and perhaps more accurate predictions. HMMs could be trained to include this information. PMID:28516017

  6. Advanced Sine Wave Modulation of Continuous Wave Laser System for Atmospheric CO2 Differential Absorption Measurements

    NASA Technical Reports Server (NTRS)

    Campbell, Joel F.; Lin, Bing; Nehrir, Amin R.

    2014-01-01

    NASA Langley Research Center, in collaboration with ITT Exelis, has been experimenting with a Continuous Wave (CW) laser absorption spectrometer (LAS) as a means of performing atmospheric CO2 column measurements from space to support the Active Sensing of CO2 Emissions over Nights, Days, and Seasons (ASCENDS) mission. Because the range-resolving Intensity Modulated (IM) CW lidar techniques presented here rely on matched filter correlations, autocorrelation properties without side lobes or other artifacts are highly desirable, since the autocorrelation function is critical for the measurements of lidar return powers, laser path lengths, and CO2 column amounts. In this paper modulation techniques are investigated that improve autocorrelation properties. The modulation techniques investigated in this paper include sine waves modulated by maximum length (ML) sequences in various hardware configurations. A CW lidar system using sine waves modulated by ML pseudo random noise codes is described, which uses a time shifting approach to separate channels and make multiple, simultaneous online/offline differential absorption measurements. Unlike the pure ML sequence, this technique is useful in hardware that is band pass filtered as the IM sine wave carrier shifts the main power band. Both amplitude and Phase Shift Keying (PSK) modulated IM carriers are investigated that exhibit perfect autocorrelation properties down to one cycle per code bit. In addition, a method is presented to bandwidth limit the ML sequence based on a Gaussian filter implemented in terms of Jacobi theta functions that does not seriously degrade the resolution or introduce side lobes as a means of reducing aliasing and IM carrier bandwidth.
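
    The autocorrelation property that makes ML sequences attractive for matched-filter ranging can be illustrated with a short sketch. The code below is not taken from the paper: it generates one period of a pure ML sequence with a 4-stage linear feedback shift register (the tap choice, seed, and function names are assumptions made for illustration) and computes its circular autocorrelation, which has a single peak at zero lag and a constant value of -1 at every other lag.

```python
def mls(taps, nbits):
    """Generate one period of a maximum length sequence with a Fibonacci LFSR.

    taps:  1-based register stages XORed together to form the feedback bit.
    nbits: register length; the period is 2**nbits - 1 when the tap set
           is a maximal-length configuration (assumed here, not checked).
    """
    state = [1] * nbits                    # any nonzero seed works
    out = []
    for _ in range(2 ** nbits - 1):
        out.append(state[-1])              # output the last register stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]    # shift and insert the feedback bit
    return out


def circular_autocorrelation(bits):
    """Circular autocorrelation of the bipolar (+1/-1) version of `bits`."""
    s = [1 - 2 * b for b in bits]          # map 0 -> +1, 1 -> -1
    n = len(s)
    return [sum(s[i] * s[(i + lag) % n] for i in range(n)) for lag in range(n)]


# Illustrative 4-stage register with feedback from stages 3 and 4 (period 15).
seq = mls(taps=(3, 4), nbits=4)
acf = circular_autocorrelation(seq)
print(acf[0], set(acf[1:]))
```

    The flat off-peak level is what lets a matched filter resolve range without side lobes; the sine-wave carrier and band-limiting filter described in the abstract are not modelled here.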

  7. Designing deep sequencing experiments: detecting structural variation and estimating transcript abundance.

    PubMed

    Bashir, Ali; Bansal, Vikas; Bafna, Vineet

    2010-06-18

    Massively parallel DNA sequencing technologies have enabled the sequencing of several individual human genomes. These technologies are also being used in novel ways for mRNA expression profiling, genome-wide discovery of transcription-factor binding sites, small RNA discovery, etc. The multitude of sequencing platforms, each with their unique characteristics, pose a number of design challenges, regarding the technology to be used and the depth of sequencing required for a particular sequencing application. Here we describe a number of analytical and empirical results to address design questions for two applications: detection of structural variations from paired-end sequencing and estimating mRNA transcript abundance. For structural variation, our results provide explicit trade-offs between the detection and resolution of rearrangement breakpoints, and the optimal mix of paired-read insert lengths. Specifically, we prove that optimal detection and resolution of breakpoints is achieved using a mix of exactly two insert library lengths. Furthermore, we derive explicit formulae to determine these insert length combinations, enabling a 15% improvement in breakpoint detection at the same experimental cost. On empirical short read data, these predictions show good concordance with Illumina 200 bp and 2 Kbp insert length libraries. For transcriptome sequencing, we determine the sequencing depth needed to detect rare transcripts from a small pilot study. With only 1 Million reads, we derive corrections that enable almost perfect prediction of the underlying expression probability distribution, and use this to predict the sequencing depth required to detect low expressed genes with greater than 95% probability. Together, our results form a generic framework for many design considerations related to high-throughput sequencing. 
We provide software tools http://bix.ucsd.edu/projects/NGS-DesignTools to derive platform independent guidelines for designing sequencing experiments (amount of sequencing, choice of insert length, mix of libraries) for novel applications of next generation sequencing.
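
    As a rough illustration of the sequencing-depth question, the sketch below computes how many reads are needed to observe a transcript at least once with a given probability. It is a simplification and not the authors' correction method: it assumes each read independently originates from the transcript with a known probability p, and the function name is hypothetical.

```python
import math


def reads_for_detection(p, confidence=0.95):
    """Smallest number of reads N such that 1 - (1 - p)**N >= confidence.

    p:          assumed probability that a single read comes from the transcript.
    confidence: desired probability of seeing the transcript at least once.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve (1 - p)**N <= 1 - confidence for N and round up.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


# A transcript accounting for one read in a million needs roughly 3 million reads.
print(reads_for_detection(1e-6))
```

    Under this model the required depth scales approximately as 3/p for 95% confidence, which is why rare transcripts dominate the sequencing budget.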

  8. 5 CFR 890.1015 - Minimum and maximum length of permissive debarments.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... debarments. 890.1015 Section 890.1015 Administrative Personnel OFFICE OF PERSONNEL MANAGEMENT (CONTINUED) CIVIL SERVICE REGULATIONS (CONTINUED) FEDERAL EMPLOYEES HEALTH BENEFITS PROGRAM Administrative Sanctions Imposed Against Health Care Providers Permissive Debarments § 890.1015 Minimum and maximum length of...

  9. 5 CFR 890.1015 - Minimum and maximum length of permissive debarments.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... debarments. 890.1015 Section 890.1015 Administrative Personnel OFFICE OF PERSONNEL MANAGEMENT (CONTINUED) CIVIL SERVICE REGULATIONS (CONTINUED) FEDERAL EMPLOYEES HEALTH BENEFITS PROGRAM Administrative Sanctions Imposed Against Health Care Providers Permissive Debarments § 890.1015 Minimum and maximum length of...

  10. 5 CFR 890.1015 - Minimum and maximum length of permissive debarments.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... debarments. 890.1015 Section 890.1015 Administrative Personnel OFFICE OF PERSONNEL MANAGEMENT (CONTINUED) CIVIL SERVICE REGULATIONS (CONTINUED) FEDERAL EMPLOYEES HEALTH BENEFITS PROGRAM Administrative Sanctions Imposed Against Health Care Providers Permissive Debarments § 890.1015 Minimum and maximum length of...

  11. Species identification of mutans streptococci by groESL gene sequence.

    PubMed

    Hung, Wei-Chung; Tsai, Jui-Chang; Hsueh, Po-Ren; Chia, Jean-San; Teng, Lee-Jene

    2005-09-01

    The near full-length sequences of the groESL genes were determined and analysed among eight reference strains (serotypes a to h) representing five species of mutans group streptococci. The groES sequences from these reference strains revealed that there are two lengths (285 and 288 bp) in the five species. The intergenic spacer between groES and groEL appears to be a unique marker for species, with a variable size (ranging from 111 to 310 bp) and sequence. Phylogenetic analysis of groES and groEL separated the eight serotypes into two major clusters. Strains of serotypes b, c, e and f were highly related and had groES gene sequences of the same length, 288 bp, while strains of serotypes a, d, g and h were also closely related and their groES gene sequence lengths were 285 bp. The groESL sequences in clinical isolates of three serotypes of S. mutans were analysed for intraspecies polymorphism. The results showed that the groESL sequences could provide information for differentiation among species, but were unable to distinguish serotypes of the same species. Based on the determined sequences, a PCR assay was developed that could differentiate members of the mutans streptococci by amplicon size and provide an alternative way for distinguishing mutans streptococci from other viridans streptococci.

  12. Baseline studies to select the most sound and sensitive sites to install continuous monitoring per sismo-geochemical networks. The case history of the Norcia-Amatrice-Spoleto seismic sequences (2016-2017)

    NASA Astrophysics Data System (ADS)

    Quattrocchi, F.; Gallo, F.

    2017-12-01

    The paper reviews, methodologically and historically, seismo-geochemical studies in Italy and abroad aimed at selecting the most "sensitive" sites along active faults, mostly where structural geology is not able to discover "blind" faults or complex fault-crossing systems, with maximum fluid-fault interaction. It highlights the "site-specific" case histories and processes that help in network design, gathered on the occasion of strong-to-moderate earthquakes, gas bursts or groundwater evolution in geothermal and hydrocarbon fields during EU projects (i.e., Geochemical Seismic Zonation, 3F-Faults-Fractures-Fluids Corinth). Some concepts are highlighted based on experimental data gathered over 25 years: - if the network is in soil gas, a preliminary study of groundwater is also necessary, to understand which sectors of shallow aquifers, acting as "buffer" bodies, are more prone to be oversaturated by geogas from depth; a preliminary grid should consider the CO2-CH4-Rn fluxes, all gas concentrations and isotope analyses (TDIC, CH4, CO2, rare gases), described here case by case, mostly where the regional faults cross each other and where a carrier gas, i.e. CO2, is acting. It is incorrect to install mono-parametric stations, i.e. Radon only, to understand the real WRI processes. - if the network is in groundwater, a preliminary study before, during and after seismic sequences is very important, to establish where the maximum anomalies (i.e., anomalous animal behavior, temperature increase, geochemical anomalies, new gas release) are and are expected to be, as found for the Umbria-Marche border (the Colfiorito 1997-1998 and the 2016-2017 Norcia-Amatrice seismic sequences), where a deep pore-pressure dominated situation could be constrained by seismo-geochemistry, also along "still silent" nearby fault segments. - if the network is in groundwater, a multidisciplinary geochemical approach is very important to constrain the segment length and the related maximum magnitude.

  13. Thermoelectric effect and its dependence on molecular length and sequence in single DNA molecules.

    PubMed

    Li, Yueqi; Xiang, Limin; Palma, Julio L; Asai, Yoshihiro; Tao, Nongjian

    2016-04-15

    Studying the thermoelectric effect in DNA is important for unravelling charge transport mechanisms and for developing relevant applications of DNA molecules. Here we report a study of the thermoelectric effect in single DNA molecules. By varying the molecular length and sequence, we tune the charge transport in DNA to either a hopping- or tunnelling-dominated regime. The thermoelectric effect is small and insensitive to the molecular length in the hopping regime. In contrast, the thermoelectric effect is large and sensitive to the length in the tunnelling regime. These findings indicate that one may control the thermoelectric effect in DNA by varying its sequence and length. We describe the experimental results in terms of hopping and tunnelling charge transport models.

  14. Thermoelectric effect and its dependence on molecular length and sequence in single DNA molecules

    PubMed Central

    Li, Yueqi; Xiang, Limin; Palma, Julio L.; Asai, Yoshihiro; Tao, Nongjian

    2016-01-01

    Studying the thermoelectric effect in DNA is important for unravelling charge transport mechanisms and for developing relevant applications of DNA molecules. Here we report a study of the thermoelectric effect in single DNA molecules. By varying the molecular length and sequence, we tune the charge transport in DNA to either a hopping- or tunnelling-dominated regime. The thermoelectric effect is small and insensitive to the molecular length in the hopping regime. In contrast, the thermoelectric effect is large and sensitive to the length in the tunnelling regime. These findings indicate that one may control the thermoelectric effect in DNA by varying its sequence and length. We describe the experimental results in terms of hopping and tunnelling charge transport models. PMID:27079152

  15. The kinetics of pre-mRNA splicing in the Drosophila genome and the influence of gene architecture.

    PubMed

    Pai, Athma A; Henriques, Telmo; McCue, Kayla; Burkholder, Adam; Adelman, Karen; Burge, Christopher B

    2017-12-27

    Production of most eukaryotic mRNAs requires splicing of introns from pre-mRNA. The splicing reaction requires definition of splice sites, which are initially recognized in either intron-spanning ('intron definition') or exon-spanning ('exon definition') pairs. To understand how exon and intron length and splice site recognition mode impact splicing, we measured splicing rates genome-wide in Drosophila, using metabolic labeling/RNA sequencing and new mathematical models to estimate rates. We found that the modal intron length range of 60-70 nt represents a local maximum of splicing rates, but that much longer exon-defined introns are spliced even faster and more accurately. We observed unexpectedly low variation in splicing rates across introns in the same gene, suggesting the presence of gene-level influences, and we identified multiple gene level variables associated with splicing rate. Together our data suggest that developmental and stress response genes may have preferentially evolved exon definition in order to enhance the rate or accuracy of splicing.

  16. Dependence of Some Mechanical Properties of Elastic Bands on the Length and Load Time

    ERIC Educational Resources Information Center

    Triana, C. A.; Fajardo, F.

    2012-01-01

    We present a study of the maximum stress supported by elastic bands of nitrile as a function of the natural length and the load time. The maximum tension of rupture and the corresponding variation in length were found by measuring the elongation of an elastic band when a mass is suspended from its free end until it reaches the breaking point. The…

  17. Frequent statistics of link-layer bit stream data based on AC-IM algorithm

    NASA Astrophysics Data System (ADS)

    Cao, Chenghong; Lei, Yingke; Xu, Yiming

    2017-08-01

    At present, there is much research on data processing using classical pattern matching and its improved algorithms, but little on the statistics of link-layer bit stream data. Because classical multi-pattern matching algorithms such as the AC algorithm have high computational complexity and low efficiency and cannot be applied to binary bit stream data, this paper adopts a frequent-statistics method for link-layer bit stream data based on the AC-IM algorithm. The method's maximum jump distance in the mode tree is the length of the shortest mode string plus 3, with no missed matches. In this paper, a theoretical analysis of the principle of the algorithm's construction is made first, and then experimental results show that the algorithm can adapt to the binary bit stream data environment and extract frequent sequences more accurately, with obvious effect. Meanwhile, compared with the classical AC algorithm and other improved algorithms, the AC-IM algorithm has a greater maximum jump distance and is less time-consuming.
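
    To make the task concrete, the sketch below counts frequent fixed-length patterns in a bit stream with a plain sliding window. It is a naive baseline for illustration only and does not implement the AC-IM algorithm or its mode-tree jumps, which the abstract does not specify; the function name and parameters are assumptions.

```python
from collections import Counter


def frequent_bit_patterns(bits, k, min_count):
    """Return every length-k pattern occurring at least `min_count` times.

    bits:      bit stream given as a string of '0' and '1' characters.
    k:         pattern length in bits.
    min_count: minimum number of (possibly overlapping) occurrences.
    """
    if k <= 0 or k > len(bits):
        return {}
    # Slide a window of width k over the stream one bit at a time.
    counts = Counter(bits[i:i + k] for i in range(len(bits) - k + 1))
    return {pattern: n for pattern, n in counts.items() if n >= min_count}


print(frequent_bit_patterns("110110110", k=3, min_count=3))
```

    This baseline examines every bit position; a jump-based matcher of the kind the abstract describes aims to skip positions while still finding every occurrence.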

  18. Cost-Effective Sequencing of Full-Length cDNA Clones Powered by a De Novo-Reference Hybrid Assembly

    PubMed Central

    Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka

    2010-01-01

    Background: Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. Methodology: We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence ∼800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that it also performs competently on non-human species. Conclusions: The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only ∼US$3 per clone, demonstrating a significant advantage over previous approaches. PMID:20479877

  19. MEMS earthworm: a thermally actuated peristaltic linear micromotor

    NASA Astrophysics Data System (ADS)

    Arthur, Craig; Ellerington, Neil; Hubbard, Ted; Kujath, Marek

    2011-03-01

    This paper examines the design, fabrication and testing of a bio-mimetic MEMS (micro-electro mechanical systems) earthworm motor with external actuators. The motor consists of a passive mobile shuttle with two flexible diamond-shaped segments; each segment is independently squeezed by a pair of stationary chevron-shaped thermal actuators. Applying a specific sequence of squeezes to the earthworm segments, the shuttle can be driven backward or forward. Unlike existing inchworm drives that use clamping and thrusting actuators, the earthworm actuators apply only clamping forces to the shuttle, and lateral thrust is produced by the shuttle's compliant geometry. The earthworm assembly is fabricated using the PolyMUMPs process with planar dimensions of 400 µm width by 800 µm length. The stationary actuators operate within the range of 4-9 V and provide a maximum shuttle range of motion of 350 µm (approximately half its size), a maximum shuttle speed of 17 mm s-1 at 10 kHz, and a maximum dc shuttle force of 80 µN. The shuttle speed was found to vary linearly with both input voltage and input frequency. The shuttle force was found to vary linearly with the actuator voltage.

  20. Spike Code Flow in Cultured Neuronal Networks.

    PubMed

    Tamura, Shinichi; Nishitani, Yoshi; Hosokawa, Chie; Miyoshi, Tomomitsu; Sawai, Hajime; Kamimura, Takuya; Yagi, Yasushi; Mizuno-Matsumoto, Yuko; Chen, Yen-Wei

    2016-01-01

    We observed spike trains produced by one-shot electrical stimulation with 8 × 8 multielectrodes in cultured neuronal networks. Each electrode accepted spikes from several neurons. We extracted the short codes from spike trains and obtained a code spectrum with a nominal time accuracy of 1%. We then constructed code flow maps as movies of the electrode array to observe the code flow of "1101" and "1011," which are typical pseudorandom sequences such as those we often encountered in the literature and in our experiments. They seemed to flow from one electrode to the neighboring one and maintained their shape to some extent. To quantify the flow, we calculated the "maximum cross-correlations" among neighboring electrodes, to find the direction of maximum flow of the codes with lengths less than 8. Normalized maximum cross-correlations were almost constant irrespective of code. Furthermore, if the spike trains were shuffled in interval orders or in electrodes, they became significantly small. Thus, the analysis suggested that local codes of approximately constant shape propagated and conveyed information across the network. Hence, the codes can serve as visible and trackable marks of propagating spike waves as well as evaluating information flow in the neuronal network.

  1. Length Variation, Heteroplasmy and Sequence Divergence in the Mitochondrial DNA of Four Species of Sturgeon (Acipenser)

    PubMed Central

    Brown, J. R.; Beckenbach, K.; Beckenbach, A. T.; Smith, M. J.

    1996-01-01

    The extent of mtDNA length variation and heteroplasmy as well as DNA sequences of the control region and two tRNA genes were determined for four North American sturgeon species: Acipenser transmontanus, A. medirostris, A. fulvescens and A. oxyrhynchus. Across the Continental Divide, a division in the occurrence of length variation and heteroplasmy was observed that was concordant with species biogeography as well as with phylogenies inferred from restriction fragment length polymorphisms (RFLP) of whole mtDNA and pairwise comparisons of unique sequences of the control region. In all species, mtDNA length variation was due to repeated arrays of 78-82-bp sequences each containing a D-loop strand synthesis termination associated sequence (TAS). Individual repeats showed greater sequence conservation within individuals and species rather than between species, which is suggestive of concerted evolution. Differences in the frequencies of multiple copy genomes and heteroplasmy among the four species may be ascribed to differences in the rates of recurrent mutation. A mechanism that may offset the high rate of mutation for increased copy number is suggested on the basis that an increase in the number of functional TAS motifs might reduce the frequency of successfully initiated H-strand replications. PMID:8852850

  2. Vibration energy harvesting using piezoelectric unimorph cantilevers with unequal piezoelectric and nonpiezoelectric lengths

    PubMed Central

    Gao, Xiaotong; Shih, Wei-Heng; Shih, Wan Y.

    2010-01-01

    We have examined a piezoelectric unimorph cantilever (PUC) with unequal piezoelectric and nonpiezoelectric lengths for vibration energy harvesting theoretically by extending the analysis of a PUC with equal piezoelectric and nonpiezoelectric lengths. The theoretical approach was validated by experiments. A case study showed that for a fixed vibration frequency, the maximum open-circuit induced voltage which was important for charge storage for later use occurred with a PUC that had a nonpiezoelectric-to-piezoelectric length ratio greater than unity, whereas the maximum power when the PUC was connected to a resistor for immediate power consumption occurred at a unity nonpiezoelectric-to-piezoelectric length ratio. PMID:21200444

  3. Vibration energy harvesting using piezoelectric unimorph cantilevers with unequal piezoelectric and nonpiezoelectric lengths.

    PubMed

    Gao, Xiaotong; Shih, Wei-Heng; Shih, Wan Y

    2010-12-06

    We have examined a piezoelectric unimorph cantilever (PUC) with unequal piezoelectric and nonpiezoelectric lengths for vibration energy harvesting theoretically by extending the analysis of a PUC with equal piezoelectric and nonpiezoelectric lengths. The theoretical approach was validated by experiments. A case study showed that for a fixed vibration frequency, the maximum open-circuit induced voltage which was important for charge storage for later use occurred with a PUC that had a nonpiezoelectric-to-piezoelectric length ratio greater than unity, whereas the maximum power when the PUC was connected to a resistor for immediate power consumption occurred at a unity nonpiezoelectric-to-piezoelectric length ratio.

  4. New powerful statistics for alignment-free sequence comparison under a pattern transfer model.

    PubMed

    Liu, Xuemei; Wan, Lin; Li, Jing; Reinert, Gesine; Waterman, Michael S; Sun, Fengzhu

    2011-09-07

    Alignment-free sequence comparison is widely used for comparing gene regulatory regions and for identifying horizontally transferred genes. Recent studies on the power of a widely used alignment-free comparison statistic D2 and its variants D*2 and D(s)2 showed that their power approximates a limit smaller than 1 as the sequence length tends to infinity under a pattern transfer model. We develop new alignment-free statistics based on D2, D*2 and D(s)2 by comparing local sequence pairs and then summing over all the local sequence pairs of certain length. We show that the new statistics are much more powerful than the corresponding statistics and the power tends to 1 as the sequence length tends to infinity under the pattern transfer model. Copyright © 2011 Elsevier Ltd. All rights reserved.
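
    For orientation, the basic D2 statistic referred to above is the sum, over all words of length k, of the product of that word's occurrence counts in the two sequences. The sketch below computes only this plain D2; it does not implement the D*2 or D(s)2 variants or the new local-pair statistics proposed in the paper, and the function names are illustrative.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count overlapping words of length k in `seq`."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Plain D2 statistic: sum over k-mers of count_a(w) * count_b(w)."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # A Counter returns 0 for missing keys, so unshared words contribute nothing.
    return sum(n * counts_b[word] for word, n in counts_a.items())


print(d2("ACGTACGT", "ACGTTTTT", k=2))
```

    Larger values indicate more shared k-mer content, so no alignment of the two sequences is needed.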

  5. New Powerful Statistics for Alignment-free Sequence Comparison Under a Pattern Transfer Model

    PubMed Central

    Liu, Xuemei; Wan, Lin; Li, Jing; Reinert, Gesine; Waterman, Michael S.; Sun, Fengzhu

    2011-01-01

    Alignment-free sequence comparison is widely used for comparing gene regulatory regions and for identifying horizontally transferred genes. Recent studies on the power of a widely used alignment-free comparison statistic D2 and its variants D2∗ and D2s showed that their power approximates a limit smaller than 1 as the sequence length tends to infinity under a pattern transfer model. We develop new alignment-free statistics based on D2, D2∗ and D2s by comparing local sequence pairs and then summing over all the local sequence pairs of certain length. We show that the new statistics are much more powerful than the corresponding statistics and the power tends to 1 as the sequence length tends to infinity under the pattern transfer model. PMID:21723298

  6. Velocity of sarcomere shortening in rat cardiac muscle: relationship to force, sarcomere length, calcium and time.

    PubMed

    Daniels, M; Noble, M I; ter Keurs, H E; Wohlfart, B

    1984-10-01

    The relation between force and velocity was determined in sixteen trabeculae of rat right ventricle as a function of time during a twitch, of sarcomere length and of external Ca2+ concentration, [Ca2+]o. The trabeculae were studied in modified Krebs-Henseleit solution at 25 degrees C. Force was measured with a semiconductor strain gauge. Sarcomere length was measured with a laser diffraction system. A servomotor system was used in which control could be switched between sarcomere length, muscle length and force. Force-velocity relations were derived from load clamps and from contractions in which sarcomere length was initially held constant followed by a quick release and slower release of the sarcomeres at controlled velocity. Force-velocity relations were fitted by Hill's equation (Hill, 1938), (Po-P) b = (P+a) V, where P = force, V = velocity, Po = isometric force in mN/mm2 and a and b are constants. For [Ca2+]o = 2.5 mM, with both interventions the values (mean +/- S.D.) were: b = 1.00 +/- 0.45 micron/s; a = 9.52 +/- 5.60 mN/mm2; Vo measured = 13.6 +/- 3.0 micron/s; Vo calculated = 13.4 +/- 3.4 micron/s; Po measured = 96.5 +/- 25.0 mN/mm2; Po calculated = 119.3 +/- 34.5 mN/mm2. Vo rose with [Ca2+]o to a maximum at [Ca2+]o = 1.2 mM when Po was about 50% of maximum, while Po rose with [Ca2+]o to a maximum at above 2.5 mM. Vo rose with time during the twitch to a maximum at 25 ms following onset of contraction; Po was then about 50% of the maximum that was obtained at 120 ms. Vo increased with sarcomere length from zero at a sarcomere length of 1.6 micron to a maximum at 1.85 micron. Between 1.85 micron and 2.3 micron, Vo was constant. At 1.85 micron, Po was about 60% of maximum Po. 
These results are compatible with the hypothesis that Vo is more sensitive than Po to the amount of Ca2+ bound to the contractile proteins, and that Vo reaches a maximal value with an amount of Ca2+ bound to the contractile proteins at which Po has obtained only about 50% of its maximal value.
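
    Hill's equation as quoted in this abstract, (Po-P) b = (P+a) V, can be rearranged to give the shortening velocity at a given force, V = b (Po-P) / (P+a), so the unloaded velocity is Vo = b Po / a. The sketch below evaluates this rearrangement; the function name is hypothetical, and plugging in the abstract's mean parameter values is purely illustrative, since the reported calculated Vo need not equal the value obtained from the mean parameters.

```python
def hill_velocity(force, p0, a, b):
    """Shortening velocity from Hill's equation (p0 - P) * b = (P + a) * V.

    force: load P (mN/mm2), between 0 and p0.
    p0:    isometric force Po (mN/mm2).
    a, b:  Hill constants (a in mN/mm2, b in micron/s).
    """
    if force + a == 0:
        raise ValueError("force + a must be nonzero")
    return b * (p0 - force) / (force + a)


# Illustrative evaluation with the mean values quoted in the abstract.
p0, a, b = 119.3, 9.52, 1.00
v_unloaded = hill_velocity(0.0, p0, a, b)   # equals b * p0 / a
v_isometric = hill_velocity(p0, p0, a, b)   # zero velocity at isometric force
print(v_unloaded, v_isometric)
```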

  7. An improved and validated RNA HLA class I SBT approach for obtaining full length coding sequences.

    PubMed

    Gerritsen, K E H; Olieslagers, T I; Groeneweg, M; Voorter, C E M; Tilanus, M G J

    2014-11-01

    The functional relevance of human leukocyte antigen (HLA) class I allele polymorphism beyond exons 2 and 3 is difficult to address because more than 70% of the HLA class I alleles are defined by exons 2 and 3 sequences only. For routine application on clinical samples we improved and validated the HLA sequence-based typing (SBT) approach based on RNA templates, using either a single locus-specific or two overlapping group-specific polymerase chain reaction (PCR) amplifications, with three forward and three reverse sequencing reactions for full length sequencing. Locus-specific HLA typing with RNA SBT of a reference panel, representing the major antigen groups, showed identical results compared to DNA SBT typing. Alleles encountered with unknown exons in the IMGT/HLA database and three samples, two with Null and one with a Low expressed allele, have been addressed by the group-specific RNA SBT approach to obtain full length coding sequences. This RNA SBT approach has proven its value in our routine full length definition of alleles. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  8. Memory for tonal pitches: a music-length effect hypothesis.

    PubMed

    Akiva-Kabiri, Lilach; Vecchi, Tomaso; Granot, Roni; Basso, Demis; Schön, Daniele

    2009-07-01

    One of the most studied effects of verbal working memory (WM) is the influence of the length of the words that compose the list to be remembered. This work aims to investigate the nature of musical WM by replicating the word length effect in the musical domain. Length and rate of presentation were manipulated in a recognition task of tone sequences. Results showed significant effects for both factors (length and presentation rate) as well as their interaction, suggesting the existence of different strategies (e.g., chunking and rehearsal) for the immediate memory of musical information, depending upon the length of the sequences.

  9. On the normalization of the minimum free energy of RNAs by sequence length.

    PubMed

    Trotta, Edoardo

    2014-01-01

    The minimum free energy (MFE) of ribonucleic acids (RNAs) increases at an apparent linear rate with sequence length. Simple indices, obtained by dividing the MFE by the number of nucleotides, have been used for a direct comparison of the folding stability of RNAs of various sizes. Although this normalization procedure has been used in several studies, the relationship between normalized MFE and length has not yet been investigated in detail. Here, we demonstrate that the variation of MFE with sequence length is not linear and is significantly biased by the mathematical formula used for the normalization procedure. For this reason, the normalized MFEs strongly decrease as hyperbolic functions of length and produce unreliable results when applied for the comparison of sequences with different sizes. We also propose a simple modification of the normalization formula that corrects the bias enabling the use of the normalized MFE for RNAs longer than 40 nt. Using the new corrected normalized index, we analyzed the folding free energies of different human RNA families showing that most of them present an average MFE density more negative than expected for a typical genomic sequence. Furthermore, we found that a well-defined and restricted range of MFE density characterizes each RNA family, suggesting the use of our corrected normalized index to improve RNA prediction algorithms. Finally, in coding and functional human RNAs the MFE density appears scarcely correlated with sequence length, consistent with a negligible role of thermodynamic stability demands in determining RNA size.

  10. On the Normalization of the Minimum Free Energy of RNAs by Sequence Length

    PubMed Central

    Trotta, Edoardo

    2014-01-01

    The minimum free energy (MFE) of ribonucleic acids (RNAs) increases at an apparent linear rate with sequence length. Simple indices, obtained by dividing the MFE by the number of nucleotides, have been used for a direct comparison of the folding stability of RNAs of various sizes. Although this normalization procedure has been used in several studies, the relationship between normalized MFE and length has not yet been investigated in detail. Here, we demonstrate that the variation of MFE with sequence length is not linear and is significantly biased by the mathematical formula used for the normalization procedure. For this reason, the normalized MFEs strongly decrease as hyperbolic functions of length and produce unreliable results when applied for the comparison of sequences with different sizes. We also propose a simple modification of the normalization formula that corrects the bias enabling the use of the normalized MFE for RNAs longer than 40 nt. Using the new corrected normalized index, we analyzed the folding free energies of different human RNA families showing that most of them present an average MFE density more negative than expected for a typical genomic sequence. Furthermore, we found that a well-defined and restricted range of MFE density characterizes each RNA family, suggesting the use of our corrected normalized index to improve RNA prediction algorithms. Finally, in coding and functional human RNAs the MFE density appears scarcely correlated with sequence length, consistent with a negligible role of thermodynamic stability demands in determining RNA size. PMID:25405875

  11. What is a melody? On the relationship between pitch and brightness of timbre.

    PubMed

    Cousineau, Marion; Carcagno, Samuele; Demany, Laurent; Pressnitzer, Daniel

    2013-01-01

    Previous studies showed that the perceptual processing of sound sequences is more efficient when the sounds vary in pitch than when they vary in loudness. We show here that sequences of sounds varying in brightness of timbre are processed with the same efficiency as pitch sequences. The sounds used consisted of two simultaneous pure tones one octave apart, and the listeners' task was to make same/different judgments on pairs of sequences varying in length (one, two, or four sounds). In one condition, brightness of timbre was varied within the sequences by changing the relative level of the two pure tones. In other conditions, pitch was varied by changing fundamental frequency, or loudness was varied by changing the overall level. In all conditions, only two possible sounds could be used in a given sequence, and these two sounds were equally discriminable. When sequence length increased from one to four, discrimination performance decreased substantially for loudness sequences, but to a smaller extent for brightness sequences and pitch sequences. In the latter two conditions, sequence length had a similar effect on performance. These results suggest that the processes dedicated to pitch and brightness analysis, when probed with a sequence-discrimination task, share unexpected similarities.

  12. Poly A tail length analysis of in vitro transcribed mRNA by LC-MS.

    PubMed

    Beverly, Michael; Hagen, Caitlin; Slack, Olga

    2018-02-01

    The 3'-polyadenosine (poly A) tail of in vitro transcribed (IVT) mRNA was studied using liquid chromatography coupled to mass spectrometry (LC-MS). Poly A tails were cleaved from the mRNA using ribonuclease T1 followed by isolation with dT magnetic beads. Extracted tails were then analyzed by LC-MS, which provided tail length information at single-nucleotide resolution. A 2100-nt mRNA with plasmid-encoded poly A tail lengths of either 27, 64, 100, or 117 nucleotides was used for these studies, as enzymatically added poly A tails showed significant length heterogeneity. The number of As observed in the tails closely matched Sanger sequencing results of the DNA template, and even minor plasmid populations with sequence variations were detected. When the plasmid sequence contained a discrete number of poly As in the tail, analysis revealed a distribution that included tails longer than the encoded tail lengths. These observations were consistent with transcriptional slippage of T7 RNAP taking place within a poly A sequence. The type of RNAP did not alter the observed tail distribution, and comparison of T3, T7, and SP6 showed all three RNAPs produced equivalent tail length distributions. The addition of a sequence at the 3' end of the poly A tail did, however, produce narrower tail length distributions, which supports a previously described model of slippage where the 3' end can be locked in place by having a G or C after the poly nucleotide region.

  13. Identification of full-length proviral DNA of porcine endogenous retrovirus from Chinese Wuzhishan miniature pigs inbred.

    PubMed

    Ma, Yuyuan; Lv, Maomin; Xu, Shu; Wu, Jianmin; Tian, Kegong; Zhang, Jingang

    2010-07-01

    The existence of porcine endogenous retrovirus (PERV) hinders the use of pigs in clinical xenotransplantation to alleviate the shortage of human transplants. Chinese miniature pigs are potential organ donors for xenotransplantation in China. However, so far, an adequate level of information on the molecular characteristics of PERV from Chinese miniature pigs has not been available. We describe here the cloning and characterization of full-length proviral DNA of PERV from Chinese Wuzhishan miniature pigs inbred (WZSP). Full-length nucleotide sequences of PERV-WZSP and other PERVs were aligned, and a phylogenetic tree was constructed from deduced amino-acid sequences of env. The results demonstrated that the full-length proviral DNA of PERV-WZSP belongs to gammaretrovirus and shares high similarity with other PERVs. Sequence analysis also suggested that different patterns of LTR existed in the same porcine germ line and that a partial PERV-C sequence may recombine with a PERV-A sequence in the LTR. (c) 2008 Elsevier Ltd. All rights reserved.

  14. Multiple symbol partially coherent detection of MPSK

    NASA Technical Reports Server (NTRS)

    Simon, M. K.; Divsalar, D.

    1992-01-01

    It is shown that by using the known (or estimated) value of carrier tracking loop signal to noise ratio (SNR) in the decision metric, it is possible to improve the error probability performance of a partially coherent multiple phase-shift-keying (MPSK) system relative to that corresponding to the commonly used ideal coherent decision rule. Using a maximum-likelihood approach, an optimum decision metric is derived and shown to take the form of a weighted sum of the ideal coherent decision metric (i.e., correlation) and the noncoherent decision metric which is optimum for differential detection of MPSK. The performance of a receiver based on this optimum decision rule is derived and shown to provide continued improvement with increasing length of observation interval (data symbol sequence length). Unfortunately, increasing the observation length does not eliminate the error floor associated with the finite loop SNR. Nevertheless, in the limit of infinite observation length, the average error probability performance approaches the algebraic sum of the error floor and the performance of ideal coherent detection, i.e., at any error probability above the error floor, there is no degradation due to the partial coherence. It is shown that this limiting behavior is virtually achievable with practical size observation lengths. Furthermore, the performance is quite insensitive to mismatch between the estimate of loop SNR (e.g., obtained from measurement) fed to the decision metric and its true value. These results may be of use in low-cost Earth-orbiting or deep-space missions employing coded modulations.

  15. Sequencing and analysis of 10967 full-length cDNA clones from Xenopus laevis and Xenopus tropicalis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morin, R D; Chang, E; Petrescu, A

    2005-10-31

    Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection initiative. Here we present an analysis of 10967 clones (8049 from X. laevis and 2918 from X. tropicalis). The clone set contains 2013 orthologs between X. laevis and X. tropicalis as well as 1795 paralog pairs within X. laevis. 1199 are in-paralogs, believed to have resulted from an allotetraploidization event approximately 30 million years ago, and the remaining 546 are likely out-paralogs that have resulted from more ancient gene duplications, prior to the divergence between the two species. We do not detect any evidence for positive selection by the Yang and Nielsen maximum likelihood method of approximating dN/dS. However, dN/dS for X. laevis in-paralogs is elevated relative to X. tropicalis orthologs. This difference is highly significant, and indicates an overall relaxation of selective pressures on duplicated gene pairs. Within both groups of paralogs, we found evidence of subfunctionalization, manifested as differential expression of paralogous genes among tissues, as measured by EST information from public resources. We have observed, as expected, a higher instance of subfunctionalization in out-paralogs relative to in-paralogs.

  16. The effects of age and step length on joint kinematics and kinetics of large out-and-back steps.

    PubMed

    Schulz, Brian W; Ashton-Miller, James A; Alexander, Neil B

    2008-06-01

    Maximum step length (MSL) is a clinical test that has been shown to correlate with age, various measures of fall risk, and knee and hip joint extension speed, strength, and power capacities, but little is known about the kinematics and kinetics of the large out-and-back step utilized. Body motions and ground reaction forces were recorded for 11 unimpaired younger and 10 older women while attaining maximum step length. Joint kinematics and kinetics were calculated using inverse dynamics. The effects of age group and step length on the biomechanics of these large out-and-back steps were determined. Maximum step length was 40% greater in the younger than in the older women (P<0.0001). Peak knee and hip, but not ankle, angle, velocity, moment, and power were generally greater for younger women and longer steps. After controlling for age group, step length generally explained significant additional variance in hip and torso kinematics and kinetics (incremental R2=0.09-0.37). The young reached their peak knee extension moment immediately after landing of the step out, while the old reached their peak knee extension moment just before the return step liftoff (P=0.03). Maximum step length is strongly associated with hip kinematics and kinetics. Delays in peak knee extension moment that appear to be unrelated to step length, may indicate a reduced ability of older women to rapidly apply force to the ground with the stepping leg and thus arrest the momentum of a fall.

  17. The effects of age and step length on joint kinematics and kinetics of large out-and-back steps

    PubMed Central

    Schulz, Brian W.; Ashton-Miller, James A.; Alexander, Neil B.

    2008-01-01

    Background Maximum Step Length is a clinical test that has been shown to correlate with age, various measures of fall risk, and knee and hip joint extension speed, strength, and power capacities, but little is known about the kinematics and kinetics of the large out-and-back step utilized. Methods Body motions and ground reaction forces were recorded for 11 unimpaired younger and 10 older women while attaining Maximum Step Length. Joint kinematics and kinetics were calculated using inverse dynamics. The effects of age group and step length on the biomechanics of these large out-and-back steps were determined. Findings Maximum Step Length was 40% greater in the younger than in the older women (p<0.0001). Peak knee and hip, but not ankle, angle, velocity, moment, and power were generally greater for younger women and longer steps. After controlling for age group, step length generally explained significant additional variance in hip and torso kinematics and kinetics (incremental R2=0.09–0.37). The young reached their peak knee extension moment immediately after landing of the step out, while the old reached their peak knee extension moment just before the return step lift off (p=0.03). Interpretation Maximum Step Length is strongly associated with hip kinematics and kinetics. Delays in peak knee extension moment that appear to be unrelated to step length, may indicate a reduced ability of older women to rapidly apply force to the ground with the stepping leg and thus arrest the momentum of a fall. PMID:18308435

  18. Complete plastid genome sequence of Daucus carota: Implications for biotechnology and phylogeny of angiosperms

    PubMed Central

    Ruhlman, Tracey; Lee, Seung-Bum; Jansen, Robert K; Hostetler, Jessica B; Tallon, Luke J; Town, Christopher D; Daniell, Henry

    2006-01-01

    Background Carrot (Daucus carota) is a major food crop in the US and worldwide. Its capacity for storage and its lifecycle as a biennial make it an attractive species for the introduction of foreign genes, especially for oral delivery of vaccines and other therapeutic proteins. Until recently efforts to express recombinant proteins in carrot have had limited success in terms of protein accumulation in the edible tap roots. Plastid genetic engineering offers the potential to overcome this limitation, as demonstrated by the accumulation of BADH in chromoplasts of carrot taproots to confer exceedingly high levels of salt resistance. The complete plastid genome of carrot provides essential information required for genetic engineering. Additionally, the sequence data add to the rapidly growing database of plastid genomes for assessing phylogenetic relationships among angiosperms. Results The complete carrot plastid genome is 155,911 bp in length, with 115 unique genes and 21 duplicated genes within the IR. There are four ribosomal RNAs, 30 distinct tRNA genes and 18 intron-containing genes. Repeat analysis reveals 12 direct and 2 inverted repeats ≥ 30 bp with a sequence identity ≥ 90%. Phylogenetic analyses of nucleotide sequences for 61 protein-coding genes using both maximum parsimony (MP) and maximum likelihood (ML) were performed for 29 angiosperms. Phylogenies from both methods provide strong support for the monophyly of several major angiosperm clades, including monocots, eudicots, rosids, asterids, eurosids II, euasterids I, and euasterids II. Conclusion The carrot plastid genome contains a number of dispersed direct and inverted repeats scattered throughout coding and non-coding regions. This is the first sequenced plastid genome of the family Apiaceae and only the second published genome sequence of the species-rich euasterid II clade. Both MP and ML trees provide very strong support (100% bootstrap) for the sister relationship of Daucus with Panax in the euasterid II clade. These results provide the best taxon sampling of complete chloroplast genomes and the strongest support yet for the sister relationship of Caryophyllales to the asterids. The availability of the complete plastid genome sequence should facilitate improved transformation efficiency and foreign gene expression in carrot through utilization of endogenous flanking sequences and regulatory elements. PMID:16945140

  19. The first mitochondrial genome for the butterfly family Riodinidae (Abisara fylloides) and its systematic implications.

    PubMed

    Zhao, Fang; Huang, Dun-Yuan; Sun, Xiao-Yan; Shi, Qing-Hui; Hao, Jia-Sheng; Zhang, Lan-Lan; Yang, Qun

    2013-10-01

    The Riodinidae is one of the lepidopteran butterfly families. This study describes the complete mitochondrial genome of the butterfly species Abisara fylloides, the first mitochondrial genome of the Riodinidae family. The results show that the entire mitochondrial genome of A. fylloides is 15 301 bp in length, and contains 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and a 423 bp A+T-rich region. The gene content, orientation and order are identical to the majority of other lepidopteran insects. Phylogenetic reconstruction was conducted using the concatenated 13 protein-coding gene (PCG) sequences of 19 available butterfly species covering all the five butterfly families (Papilionidae, Nymphalidae, Pieridae, Lycaenidae and Riodinidae). Both maximum likelihood and Bayesian inference analyses highly supported the monophyly of Lycaenidae+Riodinidae, which stood as the sister of Nymphalidae. In addition, we propose that the riodinids be categorized into the family Lycaenidae as a subfamilial taxon.

  20. Phylogeny of Morella rubra and Its Relatives (Myricaceae) and Genetic Resources of Chinese Bayberry Using RAD Sequencing

    PubMed Central

    Liu, Luxian; Jin, Xinjie; Chen, Nan; Li, Xian; Li, Pan; Fu, Chengxin

    2015-01-01

    Phylogenetic relationships among Chinese species of Morella (Myricaceae) are unresolved. Here, we use restriction site-associated DNA sequencing (RAD-seq) to identify candidate loci that will help in determining phylogenetic relationships among Morella rubra, M. adenophora, M. nana and M. esculenta. Three methods for inferring phylogeny, maximum parsimony (MP), maximum likelihood (ML) and Bayesian concordance, were applied to data sets including as many as 4253 RAD loci with 8360 parsimony informative variable sites. All three methods significantly favored the topology of (((M. rubra, M. adenophora), M. nana), M. esculenta). Two species from North America (M. cerifera and M. pensylvanica) were placed as sister to the four Chinese species. According to BEAST analysis, we deduced speciation of M. rubra to be at about the Miocene-Pliocene boundary (5.28 Ma). Intraspecific divergence in M. rubra occurred in the late Pliocene (3.39 Ma). From pooled data, we assembled 29378, 21902 and 23552 de novo contigs with an average length of 229, 234 and 234 bp for M. rubra, M. nana and M. esculenta respectively. The contigs were used to investigate functional classification of RAD tags in a BLASTX search. Additionally, we identified 3808 unlinked SNP sites across the four populations of M. rubra and discovered genes associated with fruit ripening and senescence, fruit quality and disease/defense metabolism based on KEGG database. PMID:26431030

  1. Length and sequence variability in mitochondrial control region of the milkfish, Chanos chanos.

    PubMed

    Ravago, Rachel G; Monje, Virginia D; Juinio-Meñez, Marie Antonette

    2002-01-01

    Extensive length variability was observed in the mitochondrial control region of the milkfish, Chanos chanos. The nucleotide sequence of the control region and flanking regions was determined. Length variability and heteroplasmy was due to the presence of varying numbers of a 41-bp tandemly repeated sequence and a 48-bp insertion/deletion (indel). The structure and organization of the milkfish control region is similar to that of other teleost fish and vertebrates. However, extensive variation in the copy number of tandem repeats (4-20 copies) and the presence of a relatively large (48-bp) indel, are apparently uncommon in teleost fish control region sequences reported to date. High sequence variability of control region peripheral domains indicates the potential utility of selected regions as markers for population-level studies.

  2. Construction and Evaluation of Normalized cDNA Libraries Enriched with Full-Length Sequences for Rapid Discovery of New Genes from Sisal (Agave sisalana Perr.) Different Developmental Stages

    PubMed Central

    Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng

    2012-01-01

    To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from Agave sisalana multiple tissues to increase efficiency of normalization and maximize the number of independent genes by SMART™ method and the duplex-specific nuclease (DSN). This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones of libraries revealed 3320 unigenes with an average insert length about 1.2 kb, indicating that the non-redundancy of libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during the leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing. PMID:23202944

  3. Telomerecat: A ploidy-agnostic method for estimating telomere length from whole genome sequencing data.

    PubMed

    Farmery, James H R; Smith, Mike L; Lynch, Andy G

    2018-01-22

    Telomere length is a risk factor in disease and the dynamics of telomere length are crucial to our understanding of cell replication and vitality. The proliferation of whole genome sequencing represents an unprecedented opportunity to glean new insights into telomere biology on a previously unimaginable scale. To this end, a number of approaches for estimating telomere length from whole-genome sequencing data have been proposed. Here we present Telomerecat, a novel approach to the estimation of telomere length. Previous methods have been dependent on the number of telomeres present in a cell being known, which may be problematic when analysing aneuploid cancer data and non-human samples. Telomerecat is designed to be agnostic to the number of telomeres present, making it suited for the purpose of estimating telomere length in cancer studies. Telomerecat also accounts for interstitial telomeric reads and presents a novel approach to dealing with sequencing errors. We show that Telomerecat performs well at telomere length estimation when compared to leading experimental and computational methods. Furthermore, we show that it detects expected patterns in longitudinal data, repeated measurements, and cross-species comparisons. We also apply the method to cancer cell data, uncovering an interesting relationship with the underlying telomerase genotype.

  4. [Sequencing and analysis of complete genome of rabies viruses isolated from Chinese Ferret-Badger and dog in Zhejiang province].

    PubMed

    Lei, Yong-Liang; Wang, Xiao-Guang; Tao, Xiao-Yan; Li, Hao; Meng, Sheng-Li; Chen, Xiu-Ying; Liu, Fu-Ming; Ye, Bi-Feng; Tang, Qing

    2010-01-01

    By sequencing the full-length genomes of four rabies viruses isolated from Chinese ferret-badger and dog, we analyzed the genetic variation of rabies viruses at the molecular level, obtained information on the prevalence and variation of rabies viruses in Zhejiang, and enriched the genome database of rabies virus street strains isolated in China. Rabies viruses were isolated in suckling mice, overlapping fragments were amplified by RT-PCR, and full-length genomes were assembled; nucleotide and deduced protein similarities were determined, and phylogenetic analyses were performed with strains from Chinese ferret-badger, dog, sika deer, vole, and the vaccine strain used. The four full-length genomes were sequenced completely and had the same genetic structure, with a length of 11,923 nt or 11,925 nt, including a 58-nt leader, 1353-nt NP, 894-nt PP, 609-nt MP, 1575-nt GP, and 6386-nt LP, intergenic regions (IGRs) of 2, 5, and 5 nt, a 423-nt pseudogene-like sequence (psi), and a 70-nt trailer. By BLAST and multiple sequence alignment, the four full-length genomes were consistent with the properties of Rhabdoviridae Lyssavirus. The nucleotide and amino acid sequences among Chinese strains had the highest similarity, especially among animals of the same species. For the four full-length genomes, similarity at the amino acid level was markedly higher than at the nucleotide level, so most of the nucleotide mutations in these four genomes were synonymous. Compared with the reference rabies viruses, the five protein-coding regions showed no change in length and no recombination, only a few point mutations. The five proteins therefore appeared to be stable. The variation sites and types of the four genomes were similar to those of the reference vaccine or street strains. The four strains were genotype 1 according to the multiple sequence alignment and phylogenetic analyses, and showed distinct regional characteristics of China. Therefore, these four rabies viruses are likely to be street viruses already existing in the natural world.

  5. Both positive and negative regulatory elements mediate expression of a photoregulated CAB gene from Nicotiana plumbaginifolia.

    PubMed Central

    Castresana, C; Garcia-Luque, I; Alonso, E; Malik, V S; Cashmore, A R

    1988-01-01

    We have analyzed promoter regulatory elements from a photoregulated CAB gene (Cab-E) isolated from Nicotiana plumbaginifolia. These studies have been performed by introducing chimeric gene constructs into tobacco cells via Agrobacterium tumefaciens-mediated transformation. Expression studies on the regenerated transgenic plants have allowed us to characterize three positive and one negative cis-acting elements that influence photoregulated expression of the Cab-E gene. Within the upstream sequences we have identified two positive regulatory elements (PRE1 and PRE2) which confer maximum levels of photoregulated expression. These sequences contain multiple repeated elements related to the sequence-ACCGGCCCACTT-. We have also identified within the upstream region a negative regulatory element (NRE) extremely rich in AT sequences, which reduces the level of gene expression in the light. We have defined a light regulatory element (LRE) within the promoter region extending from -396 to -186 bp which confers photoregulated expression when fused to a constitutive nopaline synthase ('nos') promoter. Within this region there is a 132-bp element, extending from -368 to -234 bp, which on deletion from the Cab-E promoter reduces gene expression from high levels to undetectable levels. Finally, we have demonstrated for a full length Cab-E promoter conferring high levels of photoregulated expression, that sequences proximal to the Cab-E TATA box are not replaceable by corresponding sequences from a 'nos' promoter. This contrasts with the apparent equivalence of these Cab-E and 'nos' TATA box-proximal sequences in truncated promoters conferring low levels of photoregulated expression. PMID:2901343

  6. Smaller predator-prey body size ratios in longer food chains.

    PubMed Central

    Jennings, Simon; Warr, Karema J

    2003-01-01

    Maximum food-chain length has been correlated with resource availability, ecosystem size, environmental stability and colonization history. Some of these correlations may result from environmental effects on predator-prey body size ratios. We investigate relationships between maximum food-chain length, predator-prey mass ratios, primary production and environmental stability in marine food webs with a natural history of community assembly. Our analyses provide empirical evidence that smaller mean predator-prey body size ratios are characteristic of more stable environments and that food chains are longer when mean predator-prey body size ratios are small. We conclude that environmental effects on predator-prey body size ratios contribute to observed differences in maximum food-chain length. PMID:12965034

  7. Fundamental Bounds for Sequence Reconstruction from Nanopore Sequencers.

    PubMed

    Magner, Abram; Duda, Jarosław; Szpankowski, Wojciech; Grama, Ananth

    2016-06-01

    Nanopore sequencers are emerging as promising new platforms for high-throughput sequencing. As with other technologies, sequencer errors pose a major challenge for their effective use. In this paper, we present a novel information theoretic analysis of the impact of insertion-deletion (indel) errors in nanopore sequencers. In particular, we consider the following problems: (i) for given indel error characteristics and rate, what is the probability of accurate reconstruction as a function of sequence length; (ii) using replicated extrusion (the process of passing a DNA strand through the nanopore), what is the number of replicas needed to accurately reconstruct the true sequence with high probability? Our results provide a number of important insights: (i) the probability of accurate reconstruction of a sequence from a single sample in the presence of indel errors tends quickly (i.e., exponentially) to zero as the length of the sequence increases; and (ii) replicated extrusion is an effective technique for accurate reconstruction. We show that for typical distributions of indel errors, the required number of replicas is a slow function (polylogarithmic) of sequence length - implying that through replicated extrusion, we can sequence large reads using nanopore sequencers. Moreover, we show that in certain cases, the required number of replicas can be related to information-theoretic parameters of the indel error distributions.
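
    The exponential decay with sequence length described in this abstract can be illustrated with a toy model. The sketch below is not the paper's information-theoretic analysis: it assumes, purely for illustration, that each base independently suffers an indel with a fixed probability, and it computes the chance that a single read contains no indel at all. The function name and the 1% error rate are hypothetical.

```python
def indel_free_probability(length: int, indel_rate: float) -> float:
    """Chance that a single read of `length` bases contains no indel.

    Toy assumption (not from the paper): every base is hit by an indel
    independently with probability `indel_rate`.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    if not 0.0 <= indel_rate <= 1.0:
        raise ValueError("indel_rate must lie in [0, 1]")
    # Independent per-base survival multiplies, giving exponential decay in length.
    return (1.0 - indel_rate) ** length


if __name__ == "__main__":
    # Hypothetical 1% per-base indel rate, chosen only for illustration.
    for n in (100, 1_000, 10_000):
        print(n, indel_free_probability(n, 0.01))
```

    Under this simplification the probability shrinks geometrically with read length, which is the qualitative behavior the abstract reports for single-sample reconstruction.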

  8. Operating characteristics of the implicit learning system supporting serial interception sequence learning.

    PubMed

    Sanchez, Daniel J; Reber, Paul J

    2012-04-01

    The memory system that supports implicit perceptual-motor sequence learning relies on brain regions that operate separately from the explicit, medial temporal lobe memory system. The implicit learning system therefore likely has distinct operating characteristics and information processing constraints. To attempt to identify the limits of the implicit sequence learning mechanism, participants performed the serial interception sequence learning (SISL) task with covertly embedded repeating sequences that were much longer than most previous studies: ranging from 30 to 60 (Experiment 1) and 60 to 90 (Experiment 2) items in length. Robust sequence-specific learning was observed for sequences up to 80 items in length, extending the known capacity of implicit sequence learning. In Experiment 3, 12-item repeating sequences were embedded among increasing amounts of irrelevant nonrepeating sequences (from 20 to 80% of training trials). Despite high levels of irrelevant trials, learning occurred across conditions. A comparison of learning rates across all three experiments found a surprising degree of constancy in the rate of learning regardless of sequence length or embedded noise. Sequence learning appears to be constant with the logarithm of the number of sequence repetitions practiced during training. The consistency in learning rate across experiments and conditions implies that the mechanisms supporting implicit sequence learning are not capacity-constrained by very long sequences nor adversely affected by high rates of irrelevant sequences during training.

  9. Influence of crank length and crank width on maximal hand cycling power and cadence.

    PubMed

    Krämer, Christian; Hilker, Lutz; Böhm, Harald

    2009-07-01

    The effect of different crank lengths and crank widths on maximal hand cycling power, cadence and handle speed were determined. Crank lengths and crank widths were adapted to anthropometric data of the participants as the ratio to forward reach (FR) and shoulder breadth (SB), respectively. 25 able-bodied subjects performed maximal inertial load hand cycle ergometry using crank lengths of 19, 22.5 and 26% of FR and 72, 85 and 98% of SB. Maximum power ranged from 754 (246) W for the crank geometry short wide (crank length x crank width) to 873 (293) W for the combination long middle. Every crank length differed significantly (P < 0.05) from each other, whereas no significant effect of crank width to maximum power output was revealed. Optimal cadence decreased significantly (P < 0.001) with increasing crank length from 124.8 (0.9) rpm for the short to 107.5 (1.6) rpm for the long cranks, whereas optimal handle speed increased significantly (P < 0.001) with increasing crank length from 1.81 (0.01) m/s for the short to 2.13 (0.03) m/s for the long cranks. Crank width did neither influence optimal cadence nor optimal handle speed significantly. From the results of this study, for maximum hand cycling power, a crank length to FR ratio of 26% for a crank width to SB ratio of 85% is recommended.
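
    As a consistency check on the figures reported in this abstract, the cadence and handle speed can be related through circular motion. The sketch below assumes, as a simplification not stated in the abstract, that handle speed is the tangential speed of the handle at the crank radius, so that v = 2πr·(cadence/60). The function name is hypothetical.

```python
import math


def crank_radius_m(handle_speed_m_s: float, cadence_rpm: float) -> float:
    """Crank radius implied by a handle speed and cadence.

    Assumes (for illustration only) that the handle moves on a circle of
    radius r at tangential speed v = 2 * pi * r * cadence / 60.
    """
    if cadence_rpm <= 0:
        raise ValueError("cadence_rpm must be positive")
    return handle_speed_m_s * 60.0 / (2.0 * math.pi * cadence_rpm)


# Optimal values reported in the abstract for the short and long cranks.
short_radius = crank_radius_m(1.81, 124.8)
long_radius = crank_radius_m(2.13, 107.5)

if __name__ == "__main__":
    print(f"short crank radius: {short_radius:.3f} m")
    print(f"long crank radius:  {long_radius:.3f} m")
    print(f"radius ratio:       {short_radius / long_radius:.3f}")
    print(f"19% / 26% of FR:    {19 / 26:.3f}")
```

    Under this assumption the implied radii are roughly 0.14 m and 0.19 m, and their ratio is close to 19/26, the ratio of the shortest to the longest crank length expressed as a percentage of forward reach.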

  10. Construction and EST sequencing of full-length, drought stress cDNA libraries for common beans (Phaseolus vulgaris L.)

    PubMed Central

    2011-01-01

    Background Common bean is an important legume crop with only a moderate number of short expressed sequence tags (ESTs) made with traditional methods. The goal of this research was to use full-length cDNA technology to develop ESTs that would overlap with the beginning of open reading frames and therefore be useful for gene annotation of genomic sequences. The library was also constructed to represent genes expressed under drought, low soil phosphorus and high soil aluminum toxicity. We also undertook comparisons of the full-length cDNA library to two previous non-full clone EST sets for common bean. Results Two full-length cDNA libraries were constructed: one for the drought tolerant Mesoamerican genotype BAT477 and the other one for the acid-soil tolerant Andean genotype G19833 which has been selected for genome sequencing. Plants were grown in three soil types using deep rooting cylinders subjected to drought and non-drought stress and tissues were collected from both roots and above ground parts. A total of 20,000 clones were selected robotically, half from each library. Then, nearly 10,000 clones from the G19833 library were sequenced with an average read length of 850 nucleotides. A total of 4,219 unigenes were identified consisting of 2,981 contigs and 1,238 singletons. These were functionally annotated with gene ontology terms and placed into KEGG pathways. Compared to other EST sequencing efforts in common bean, about half of the sequences were novel or represented the 5' ends of known genes. Conclusions The present full-length cDNA libraries add to the technological toolbox available for common bean and our sequencing of these clones substantially increases the number of unique EST sequences available for the common bean genome. All of this should be useful for both functional gene annotation, analysis of splice site variants and intron/exon boundary determination by comparison to soybean genes or with common bean whole-genome sequences. 
    In addition, the library has a large number of transcription factors and will be of interest for the discovery and validation of drought or abiotic stress related genes in common bean. PMID:22118559

  11. On avoided words, absent words, and their application to biological sequence analysis.

    PubMed

    Almirantis, Yannis; Charalampopoulos, Panagiotis; Gao, Jia; Iliopoulos, Costas S; Mohamed, Manal; Pissis, Solon P; Polychronopoulos, Dimitris

    2017-01-01

    The deviation of the observed frequency of a word w from its expected frequency in a given sequence x is used to determine whether or not the word is avoided. This concept is particularly useful in DNA linguistic analysis. The value of the deviation of w, denoted by [Formula: see text], effectively characterises the extent of a word by its edge contrast in the context in which it occurs. A word w of length [Formula: see text] is a [Formula: see text]-avoided word in x if [Formula: see text], for a given threshold [Formula: see text]. Notice that such a word may be completely absent from x. Hence, computing all such words naïvely can be a very time-consuming procedure, in particular for large k. In this article, we propose an [Formula: see text]-time and [Formula: see text]-space algorithm to compute all [Formula: see text]-avoided words of length k in a given sequence of length n over a fixed-sized alphabet. We also present a time-optimal [Formula: see text]-time algorithm to compute all [Formula: see text]-avoided words (of any length) in a sequence of length n over an integer alphabet of size [Formula: see text]. In addition, we provide a tight asymptotic upper bound for the number of [Formula: see text]-avoided words over an integer alphabet and the expected length of the longest one. We make available an implementation of our algorithm. Experimental results, using both real and synthetic data, show the efficiency and applicability of our implementation in biological sequence analysis. The systematic search for avoided words is particularly useful for biological sequence analysis. We present a linear-time and linear-space algorithm for the computation of avoided words of length k in a given sequence x. We suggest a modification to this algorithm so that it computes all avoided words of x, irrespective of their length, within the same time complexity. We also present combinatorial results with regard to avoided words and absent words.

  12. The Effect of Suspension-Line Length on Viking Parachute Inflation Loads

    NASA Technical Reports Server (NTRS)

    Talay, Theodore A.; Poole, Lamont R.; Whitlock, Charles H.

    1971-01-01

    Analytical calculations have considered the effect on maximum load of increasing the suspension-line length on the Viking parachute. Results indicate that unfurling time is increased to 1.85 seconds from 1.45 seconds, and that maximum loads are increased approximately 5 percent with an uncertainty of -4 percent to +3 percent.

  13. Rapid Sequencing of Complete env Genes from Primary HIV-1 Samples.

    PubMed

    Laird Smith, Melissa; Murrell, Ben; Eren, Kemal; Ignacio, Caroline; Landais, Elise; Weaver, Steven; Phung, Pham; Ludka, Colleen; Hepler, Lance; Caballero, Gemma; Pollner, Tristan; Guo, Yan; Richman, Douglas; Poignard, Pascal; Paxinos, Ellen E; Kosakovsky Pond, Sergei L; Smith, Davey M

    2016-07-01

    The ability to study rapidly evolving viral populations has been constrained by the read length of next-generation sequencing approaches and the sampling depth of single-genome amplification methods. Here, we develop and characterize a method using Pacific Biosciences' Single Molecule, Real-Time (SMRT®) sequencing technology to sequence multiple, intact full-length human immunodeficiency virus-1 env genes amplified from viral RNA populations circulating in blood, and provide computational tools for analyzing and visualizing these data.

  14. What is a melody? On the relationship between pitch and brightness of timbre

    PubMed Central

    Cousineau, Marion; Carcagno, Samuele; Demany, Laurent; Pressnitzer, Daniel

    2014-01-01

    Previous studies showed that the perceptual processing of sound sequences is more efficient when the sounds vary in pitch than when they vary in loudness. We show here that sequences of sounds varying in brightness of timbre are processed with the same efficiency as pitch sequences. The sounds used consisted of two simultaneous pure tones one octave apart, and the listeners’ task was to make same/different judgments on pairs of sequences varying in length (one, two, or four sounds). In one condition, brightness of timbre was varied within the sequences by changing the relative level of the two pure tones. In other conditions, pitch was varied by changing fundamental frequency, or loudness was varied by changing the overall level. In all conditions, only two possible sounds could be used in a given sequence, and these two sounds were equally discriminable. When sequence length increased from one to four, discrimination performance decreased substantially for loudness sequences, but to a smaller extent for brightness sequences and pitch sequences. In the latter two conditions, sequence length had a similar effect on performance. These results suggest that the processes dedicated to pitch and brightness analysis, when probed with a sequence-discrimination task, share unexpected similarities. PMID:24478638

  15. Species Profiles: Life Histories and Environmental Requirements of Coastal Fishes and Invertebrates (Pacific Southwest). Pile Perch, Striped Seaperch, and Rubberlip Seaperch

    DTIC Science & Technology

    1989-07-01

    The largest of the teeth on vomer or palatines . surfperches, reaching a maximum length Branchiostegals 5-6; gill membranes of 47 cm TL (Eschmeyer et...Fritzsche 1982). dorsal surface. Fins dusky (Tarp 1952). Maximum length 44 cm total length (TL) (Eschmeyer et al. 1983). LIFE HISTORY Embiotoca lateralis...developed, shore. From 1958 to 1961, sport fused pharyngeal tooth plates that fishermen caught an estimated 5,000 enable the fish to crush hard-shelled

  16. A maximum pseudo-profile likelihood estimator for the Cox model under length-biased sampling

    PubMed Central

    Huang, Chiung-Yu; Qin, Jing; Follmann, Dean A.

    2012-01-01

    This paper considers semiparametric estimation of the Cox proportional hazards model for right-censored and length-biased data arising from prevalent sampling. To exploit the special structure of length-biased sampling, we propose a maximum pseudo-profile likelihood estimator, which can handle time-dependent covariates and is consistent under covariate-dependent censoring. Simulation studies show that the proposed estimator is more efficient than its competitors. A data analysis illustrates the methods and theory. PMID:23843659

  17. A novel biomarker for marine environmental pollution of pi-class glutathione S-transferase from Mytilus coruscus.

    PubMed

    Liu, Huihui; He, Jianyu; Zhao, Rongtao; Chi, Changfeng; Bao, Yongbo

    2015-08-01

    Glutathione S-transferases (GSTs) are a superfamily of phase II detoxification enzymes that play crucial roles in innate immunity. In this study, a pi-class GST homolog was identified from Mytilus coruscus (named McGST1, KC525103). The full-length cDNA sequence of McGST1 was 621 bp with a 5' untranslated region (UTR) of 70 bp and a 3'-UTR of 201 bp. The deduced amino acid sequence was 206 residues in length with a theoretical pI/MW of 5.60/23.72 kDa, containing the conserved G-site and diversiform H-site. BLASTn analysis and phylogenetic relationships strongly suggested that this cDNA sequence was a member of the pi-class GST family. Secondary structure prediction showed a preserved N-terminus and a C-terminus composed of α-helices. Quantitative real-time RT-PCR showed that McGST1 was constitutively expressed, in increasing order in mantle, muscle, gill, hemocyte, gonad and hepatopancreas. Stimulation by bacterial infection, heavy metals and 180 CST up-regulated McGST1 mRNA expression in hepatopancreas in a time-dependent manner. Maximum expression appeared 6 h after injection of pathogenic bacteria, 10-fold higher for Vibrio alginolyticus and 16-fold higher for Vibrio harveyi than in the control. McGST1 mRNA peaked at different times under exposure to copper (10-fold at day 15), cadmium (9-fold at day 10) and 180 CST (10-fold at day 15). These results suggested that McGST1 played a significant role in antioxidation and might be used as an indicator and biomarker for detection of marine environmental pollution. Copyright © 2015 Elsevier Inc. All rights reserved.

  18. Length and sequence dependence in the association of Huntingtin protein with lipid membranes

    NASA Astrophysics Data System (ADS)

    Jawahery, Sudi; Nagarajan, Anu; Matysiak, Silvina

    2013-03-01

    There is a fundamental gap in our understanding of how aggregates of mutant Huntingtin protein (htt) with overextended polyglutamine (polyQ) sequences gain the toxic properties that cause Huntington's disease (HD). Experimental studies have shown that the most important step associated with toxicity is the binding of mutant htt aggregates to lipid membranes. Studies have also shown that flanking amino acid sequences around the polyQ sequence directly affect interactions with the lipid bilayer, and that polyQ sequences of greater than 35 glutamine repeats in htt are a characteristic of HD. The key steps that determine how flanking sequences and polyQ length affect the structure of lipid bilayers remain unknown. In this study, we use atomistic molecular dynamics simulations to study the interactions between lipid membranes of varying compositions and polyQ peptides of varying lengths and flanking sequences. We find that overextended polyQ interactions do cause deformation in model membranes, and that the flanking sequences do play a role in intensifying this deformation by altering the shape of the affected regions.

  19. Apparatus and method for classifying fuel pellets for nuclear reactor

    DOEpatents

    Wilks, Robert S.; Sternheim, Eliezer; Breakey, Gerald A.; Sturges, Jr., Robert H.; Taleff, Alexander; Castner, Raymond P.

    1984-01-01

    Control for the operation of a mechanical handling and gauging system for nuclear fuel pellets. The pellets are inspected for diameters, lengths, surface flaws and weights in successive stations. The control includes a computer for commanding the operation of the system and its electronics and for storing and processing the complex data derived at the required high rate. In measuring the diameter, the computer enables the measurement of a calibration pellet, stores that calibration data and computes and stores diameter-correction factors and their addresses along a pellet. To each diameter measurement a correction factor is applied at the appropriate address. The computer commands verification that all critical parts of the system and control are set for inspection and that each pellet is positioned for inspection. During each cycle of inspection, the measurement operation proceeds normally irrespective of whether or not a pellet is present in each station. If a pellet is not positioned in a station, a measurement is recorded, but the recorded measurement indicates maloperation. In measuring diameter and length, a light pattern including successive shadows of slices, transverse for diameter or longitudinal for length, is projected on a photodiode array. The light pattern is scanned electronically by a train of pulses. The pulses are counted during the scan of the lighted diodes. For evaluation of diameter, the maximum diameter count and the number of slices for which the diameter exceeds a predetermined minimum are determined. For acceptance, the maximum must be less than a maximum level and the minimum must exceed a set number. For evaluation of length, the maximum length is determined. For acceptance, the length must be within maximum and minimum limits.
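
    The acceptance rules in this record can be illustrated with a minimal sketch. All names (`PelletLimits`, `accept_diameter`, `accept_length`) and all numeric limits are hypothetical and do not come from the patent. The sketch also assumes that "the minimum must exceed a set number" means that the count of slices above the predetermined minimum diameter must exceed a set number; the record does not state this explicitly.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PelletLimits:
    """Hypothetical acceptance limits, expressed in pulse counts."""
    max_diameter_count: int          # largest slice must stay below this
    min_diameter_count: int          # "predetermined minimum" per slice
    min_slices_over_minimum: int     # assumed meaning of "a set number"
    min_length_count: int
    max_length_count: int


def accept_diameter(slice_counts: Sequence[int], limits: PelletLimits) -> bool:
    """Apply the diameter rule to the per-slice pulse counts of one pellet."""
    if not slice_counts:
        # No shadow recorded: treat as a missing pellet, never as a pass.
        return False
    largest = max(slice_counts)
    slices_over = sum(1 for c in slice_counts if c > limits.min_diameter_count)
    return (largest < limits.max_diameter_count
            and slices_over > limits.min_slices_over_minimum)


def accept_length(length_counts: Sequence[int], limits: PelletLimits) -> bool:
    """Apply the length rule: the maximum length must lie within the limits."""
    if not length_counts:
        return False
    longest = max(length_counts)
    return limits.min_length_count <= longest <= limits.max_length_count


# Illustrative values only.
limits = PelletLimits(
    max_diameter_count=105,
    min_diameter_count=95,
    min_slices_over_minimum=3,
    min_length_count=190,
    max_length_count=210,
)
good_diameter = accept_diameter([98, 100, 101, 99, 100], limits)
chipped_diameter = accept_diameter([98, 90, 91, 92, 100], limits)
good_length = accept_length([198, 200, 199], limits)
```

    An empty measurement list is rejected here as a stand-in for the record's statement that a missing pellet yields a measurement indicating maloperation.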

  20. Maximum striking velocities in strikes with steel rods - the influence of rod length, rod mass and volunteer parameters.

    PubMed

    Trinh, T X; Heinke, S; Rode, C; Schenkl, S; Hubig, M; Mall, G; Muggenthaler, Holger

    2018-03-01

    In blunt force trauma to the head caused by attacks with blunt instruments, contact forces can be estimated based on the conservation of momentum if impact velocities are known. The aims of this work were to measure maximum striking velocities and to examine the influence of rod parameters such as rod mass and length as well as volunteer parameters such as sex, age, body height, body mass, body mass index and the average amount of physical exercise. Steel rods with masses of 500, 1000 and 1500 g as well as lengths of 40, 65 and 90 cm were tested as examples of blunt instruments. Twenty-nine men and 22 women participated in this study. Each volunteer performed several vertical strikes with the steel rods onto a passive immobile target. Maximum striking velocities were measured by means of a Qualisys motion capture system using high-speed cameras and infrared light. Male volunteers achieved maximum striking velocities between 14.0 and 35.5 m/s whereas female volunteers achieved values between 10.4 and 28.3 m/s. Results show that maximum striking velocities increased with decreasing rod mass and, less consistently, with increasing rod length. Statistically significant influences were found for the volunteers' sex and average amount of physical exercise.

  1. Energy balance of stellar coronae. III - Effect of stellar mass and radius

    NASA Technical Reports Server (NTRS)

    Hammer, R.

    1984-01-01

    A homologous transformation is derived which permits the application of the numerical coronal models of Hammer from a star with solar mass and radius to other stars. This scaling requires a few approximations concerning the lower boundary conditions and the temperature dependence of the conductivity and emissivity. These approximations are discussed and found to be surprisingly mild. Therefore, the scaling of the coronal models to other stars is rather accurate; it is found to be particularly accurate for main-sequence stars. The transformation is used to derive an equation that gives the maximum temperature of open coronal regions as a function of stellar mass and radius, the coronal heating flux, and the characteristic damping length over which the corona is heated.

  2. Comparison of three-dimensional visualization techniques for depicting the scala vestibuli and scala tympani of the cochlea by using high-resolution MR imaging.

    PubMed

    Hans, P; Grant, A J; Laitt, R D; Ramsden, R T; Kassner, A; Jackson, A

    1999-08-01

    Cochlear implantation requires introduction of a stimulating electrode array into the scala vestibuli or scala tympani. Although these structures can be separately identified on many high-resolution scans, it is often difficult to ascertain whether these channels are patent throughout their length. The aim of this study was to determine whether an optimized combination of an imaging protocol and a visualization technique allows routine 3D rendering of the scala vestibuli and scala tympani. A submillimeter T2 fast spin-echo imaging sequence was designed to optimize the performance of 3D visualization methods. The spatial resolution was determined experimentally using primary images and 3D surface and volume renderings from eight healthy subjects. These data were used to develop the imaging sequence and to compare the quality and signal-to-noise dependency of four data visualization algorithms: maximum intensity projection, ray casting with transparent voxels, ray casting with opaque voxels, and isosurface rendering. The ability of these methods to produce 3D renderings of the scala tympani and scala vestibuli was also examined. The imaging technique was used in five patients with sensorineural deafness. Visualization techniques produced optimal results in combination with an isotropic volume imaging sequence. Clinicians preferred the isosurface-rendered images to other 3D visualizations. Both isosurface and ray casting displayed the scala vestibuli and scala tympani throughout their length. Abnormalities were shown in three patients, and in one of these, a focal occlusion of the scala tympani was confirmed at surgery. Three-dimensional images of the scala vestibuli and scala tympani can be routinely produced. The combination of an MR sequence optimized for use with isosurface rendering or ray-casting algorithms can produce 3D images with greater spatial resolution and anatomic detail than has been possible previously.

  3. Rapid and accurate taxonomic classification of insect (class Insecta) cytochrome c oxidase subunit 1 (COI) DNA barcode sequences using a naïve Bayesian classifier

    PubMed Central

    Porter, Teresita M; Gibson, Joel F; Shokralla, Shadi; Baird, Donald J; Golding, G Brian; Hajibabaei, Mehrdad

    2014-01-01

    Current methods to identify unknown insect (class Insecta) cytochrome c oxidase (COI barcode) sequences often rely on thresholds of distances that can be difficult to define, sequence similarity cut-offs, or monophyly. Some of the most commonly used metagenomic classification methods do not provide a measure of confidence for the taxonomic assignments they provide. The aim of this study was to use a naïve Bayesian classifier (Wang et al. Applied and Environmental Microbiology, 2007; 73: 5261) to automate taxonomic assignments for large batches of insect COI sequences such as data obtained from high-throughput environmental sequencing. This method provides rank-flexible taxonomic assignments with an associated bootstrap support value, and it is faster than the BLAST-based methods commonly used in environmental sequence surveys. We have developed and rigorously tested the performance of three different training sets using leave-one-out cross-validation, two field data sets, and targeted testing of Lepidoptera, Diptera and Mantodea sequences obtained from the Barcode of Life Data system. We found that type I error rates, incorrect taxonomic assignments with a high bootstrap support, were already relatively low but could be lowered further by ensuring that all query taxa are actually present in the reference database. Choosing bootstrap support cut-offs according to query length and summarizing taxonomic assignments to more inclusive ranks can also help to reduce error while retaining the maximum number of assignments. Additionally, we highlight gaps in the taxonomic and geographic representation of insects in public sequence databases that will require further work by taxonomists to improve the quality of assignments generated using any method.

  4. TypeLoader: A fast and efficient automated workflow for the annotation and submission of novel full-length HLA alleles.

    PubMed

    Surendranath, V; Albrecht, V; Hayhurst, J D; Schöne, B; Robinson, J; Marsh, S G E; Schmidt, A H; Lange, V

    2017-07-01

    Recent years have seen a rapid increase in the discovery of novel allelic variants of the human leukocyte antigen (HLA) genes. Commonly, only the exons encoding the peptide binding domains of novel HLA alleles are submitted. As a result, the IPD-IMGT/HLA Database lacks sequence information outside those regions for the majority of known alleles. This has implications for the application of the new sequencing technologies, which deliver sequence data often covering the complete gene. As these technologies simplify the characterization of the complete gene regions, it is desirable for novel alleles to be submitted as full-length sequences to the database. However, the manual annotation of full-length alleles and the generation of specific formats required by the sequence repositories are error-prone and time-consuming. We have developed TypeLoader to address both these facets. With only the full-length sequence as a starting point, TypeLoader performs automatic sequence annotation and subsequently handles all steps involved in preparing the specific formats for submission with very little manual intervention. TypeLoader is routinely used at the DKMS Life Science Lab and has aided in the successful submission of more than 900 novel HLA alleles as full-length sequences to the European Nucleotide Archive repository and the IPD-IMGT/HLA Database with a 95% reduction in the time spent on annotation and submission when compared with handling these processes manually. TypeLoader is implemented as a web application and can be easily installed and used on a standalone Linux desktop system or within a Linux client/server architecture. TypeLoader is downloadable from http://www.github.com/DKMS-LSL/typeloader. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  5. Single-cell full-length total RNA sequencing uncovers dynamics of recursive splicing and enhancer RNAs.

    PubMed

    Hayashi, Tetsutaro; Ozaki, Haruka; Sasagawa, Yohei; Umeda, Mana; Danno, Hiroki; Nikaido, Itoshi

    2018-02-12

    Total RNA sequencing has been used to reveal poly(A) and non-poly(A) RNA expression, RNA processing and enhancer activity. To date, no method for full-length total RNA sequencing of single cells has been developed despite the potential of this technology for single-cell biology. Here we describe random displacement amplification sequencing (RamDA-seq), the first full-length total RNA-sequencing method for single cells. Compared with other methods, RamDA-seq shows high sensitivity to non-poly(A) RNA and near-complete full-length transcript coverage. Using RamDA-seq with differentiation time course samples of mouse embryonic stem cells, we reveal hundreds of dynamically regulated non-poly(A) transcripts, including histone transcripts and long noncoding RNA Neat1. Moreover, RamDA-seq profiles recursive splicing in >300-kb introns. RamDA-seq also detects enhancer RNAs and their cell type-specific activity in single cells. Taken together, we demonstrate that RamDA-seq could help investigate the dynamics of gene expression, RNA-processing events and transcriptional regulation in single cells.

  6. Intramural activation and repolarization sequences in canine ventricles. Experimental and simulation studies.

    PubMed

    Taccardi, Bruno; Punske, Bonnie B; Sachse, Frank; Tricoche, Xavier; Colli-Franzone, Piero; Pavarino, Luca F; Zabawa, Christine

    2005-10-01

    There are no published data showing the three-dimensional sequence of repolarization and the associated potential fields in the ventricles. Knowledge of the sequence of repolarization has medical relevance because high spatial dispersion of recovery times and action potential durations favors cardiac arrhythmias. In this study we describe measured and simulated 3-D excitation and recovery sequences and activation-recovery intervals (ARIs) (measured) or action potential durations (APDs) (simulated) in the ventricular walls. We recorded from 600 to 1400 unipolar electrograms from canine ventricular walls during atrial and ventricular pacing at 350-450 ms cycle length. Measured excitation and recovery times and ARIs were displayed as 2-D maps in transmural planes or 3-D maps in the volume explored, using specially developed software. Excitation and recovery sequences and APD distributions were also simulated in parallelepipedal slabs using anisotropic monodomain or bidomain models based on the Luo-Rudy version 1 model with homogeneous membrane properties. Simulations showed that in the presence of homogeneous membrane properties, the sequence of repolarization was similar but not identical to the excitation sequence. In a transmural plane perpendicular to epicardial fiber direction, both activation and recovery pathways starting from an epicardial pacing site returned toward the epicardium at a few cm distance from the pacing site. However, APDs were not constant, but had a dispersion of approximately 14 ms in the simulated domain. The maximum APD value was near the pacing site and two minima appeared along a line perpendicular to fiber directions, passing through the pacing site. Electrical measurements in dog ventricles showed that, for short cycle lengths, both excitation and recovery pathways, starting from an epicardial pacing site, returned toward the epicardium. For slower pacing rates, pathways of recovery departed from the pathway of excitation. Highest ARI values were observed near the pacing site in part of the experiments. In addition, maps of activation-recovery intervals showed mid-myocardial clusters with activation-recovery intervals that were slightly longer than ARIs closer to the epi- or endocardium, suggesting the presence of M cells in those areas. Transmural dispersion of measured ARIs was on the order of 20-25 ms. Potential distributions during recovery were less affected by myocardial anisotropy than were excitation potentials.

  7. Coseismic Surface Cracks Produced By the Mw8.1 Pisagua Earthquake Sequence

    NASA Astrophysics Data System (ADS)

    Allmendinger, R. W.; Scott, C. P.; Gonzalez, G.; Loveless, J. P.

    2014-12-01

    The April 1, 2014 Mw8.1 Pisagua earthquake filled a relatively small part of the Iquique Gap, a segment of the Nazca-South America plate boundary that had not experienced a great earthquake since 1877. The slip maximum for the event occurred south of the hypocenter offshore of the village of Pisagua. To document the permanent surface deformation, we measured more than 3,700 co- or postseismic cracks, spanning 220 km of coast length, during three field excursions 2 weeks, 6 weeks, and 3 months after the main shock. Thanks to the hyperarid climate of the region, many fresh cracks are still visible 3.5 months after the main event, but eolian processes and sloughing of the side-walls are rapidly obscuring these fragile features. The distribution of crack strikes is noisy for several reasons: (1) the vast majority of new cracks reactivated pre-existing cracks, in many cases with less than ideal orientations; (2) both the April 1 main shock and the April 2 Mw7.7 aftershock 70 km to the south probably produced cracks; (3) several smaller crustal aftershocks occurred on EW reverse faults and may have enhanced cracking on EW scarps; and (4) cracking is locally enhanced along sharp topographic features. Nonetheless, there is a tendency for NNE striking cracks S of the slip maximum and NNW cracks to the north. We measured crack aperture and calculated strain in transects of 500-1000 m length at 3 localities along the earthquake rupture length. Those close to the slip maximum have permanent coseismic extensional strains on the order of 1e-4, and even a site 60 km S of the Mw7.7 event has crack strain of 5e-5. These strains are not homogeneous, but diminish eastward. These data indicate that surface cracking caused by any one event utilizes the most suitably oriented pre-existing weaknesses. Presumably, over time earthquakes with similar slip characteristics will add constructively in the geological record to produce a crack population characteristic of the long term average earthquake in the region.

  8. Quantifying precision of in situ length and weight measurements of fish

    USGS Publications Warehouse

    Gutreuter, S.; Krzoska, D.J.

    1994-01-01

    We estimated and compared errors in field-made (in situ) measurements of lengths and weights of fish. We made three measurements of length and weight on each of 33 common carp Cyprinus carpio, and on each of a total of 34 bluegills Lepomis macrochirus and black crappies Pomoxis nigromaculatus. Maximum total lengths of all fish were measured to the nearest 1 mm on a conventional measuring board. The bluegills and black crappies (85–282 mm maximum total length) were weighed to the nearest 1 g on a 1,000-g spring-loaded scale. The common carp (415–600 mm maximum total length) were weighed to the nearest 0.05 kg on a 20-kg spring-loaded scale. We present a statistical model for comparison of coefficients of variation of length (Cl) and weight (Cw). Expected Cl was near zero and constant across mean length, indicating that length can be measured with good precision in the field. Expected Cw decreased with increasing mean length, and was larger than expected Cl by 5.8 to over 100 times for the bluegills and black crappies, and by 3 to over 20 times for the common carp. Unrecognized in situ weighing errors bias the apparent content of unique information in weight, which is the information not explained by either length or measurement error. We recommend procedures to circumvent effects of weighing errors, including elimination of unnecessary weighing from routine monitoring programs. In situ weighing must be conducted with greater care than is common if the content of unique and nontrivial information in weight is to be correctly identified.
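
    The comparison of coefficients of variation in this record can be made concrete with a small sketch. It computes a plain per-fish coefficient of variation (sample standard deviation divided by the mean) from three repeated measurements, which is an assumption made for illustration and not the statistical model the authors present. The function name and the measurement values are invented.

```python
from statistics import mean, stdev
from typing import Sequence


def coefficient_of_variation(measurements: Sequence[float]) -> float:
    """Sample standard deviation divided by the mean of repeated measurements."""
    if len(measurements) < 2:
        raise ValueError("need at least two repeated measurements")
    m = mean(measurements)
    if m == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return stdev(measurements) / m


# Invented repeated measurements of a single fish (three of each, as in the record).
lengths_mm = [250.0, 251.0, 250.0]
weights_g = [300.0, 310.0, 295.0]

cl = coefficient_of_variation(lengths_mm)   # precision of length
cw = coefficient_of_variation(weights_g)    # precision of weight
ratio = cw / cl                             # how much noisier weight is than length
```

    With these made-up numbers the weight measurements are roughly an order of magnitude less precise than the length measurements, which is the kind of contrast the record reports.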

  9. Rapid Sequencing of Complete env Genes from Primary HIV-1 Samples

    PubMed Central

    Eren, Kemal; Ignacio, Caroline; Landais, Elise; Weaver, Steven; Phung, Pham; Ludka, Colleen; Hepler, Lance; Caballero, Gemma; Pollner, Tristan; Guo, Yan; Richman, Douglas; Poignard, Pascal; Paxinos, Ellen E.; Kosakovsky Pond, Sergei L.

    2016-01-01

    The ability to study rapidly evolving viral populations has been constrained by the read length of next-generation sequencing approaches and the sampling depth of single-genome amplification methods. Here, we develop and characterize a method using Pacific Biosciences’ Single Molecule, Real-Time (SMRT®) sequencing technology to sequence multiple, intact full-length human immunodeficiency virus-1 env genes amplified from viral RNA populations circulating in blood, and provide computational tools for analyzing and visualizing these data. PMID:29492273

  10. Feature-based respiratory motion tracking in native fluoroscopic sequences for dynamic roadmaps during minimally invasive procedures in the thorax and abdomen

    NASA Astrophysics Data System (ADS)

    Wagner, Martin G.; Laeseke, Paul F.; Schubert, Tilman; Slagowski, Jordan M.; Speidel, Michael A.; Mistretta, Charles A.

    2017-03-01

    Fluoroscopic image guidance for minimally invasive procedures in the thorax and abdomen suffers from respiratory and cardiac motion, which can cause severe subtraction artifacts and inaccurate image guidance. This work proposes novel techniques for respiratory motion tracking in native fluoroscopic images as well as a model based estimation of vessel deformation. This would allow compensation for respiratory motion during the procedure and therefore simplify the workflow for minimally invasive procedures such as liver embolization. The method first establishes dynamic motion models for both the contrast-enhanced vasculature and curvilinear background features based on a native (non-contrast) and a contrast-enhanced image sequence acquired prior to device manipulation, under free breathing conditions. The model of vascular motion is generated by applying the diffeomorphic demons algorithm to an automatic segmentation of the subtraction sequence. The model of curvilinear background features is based on feature tracking in the native sequence. The two models establish the relationship between the respiratory state, which is inferred from curvilinear background features, and the vascular morphology during that same respiratory state. During subsequent fluoroscopy, curvilinear feature detection is applied to determine the appropriate vessel mask to display. The result is a dynamic motion-compensated vessel mask superimposed on the fluoroscopic image. Quantitative evaluation of the proposed methods was performed using a digital 4D CT-phantom (XCAT), which provides realistic human anatomy including sophisticated respiratory and cardiac motion models. Four groups of datasets were generated, where different parameters (cycle length, maximum diaphragm motion and maximum chest expansion) were modified within each image sequence. Each group contains 4 datasets consisting of the initial native and contrast enhanced sequences as well as a sequence, where the respiratory motion is tracked. The respiratory motion tracking error was between 1.00 % and 1.09 %. The estimated dynamic vessel masks yielded a Sørensen-Dice coefficient between 0.94 and 0.96. Finally, the accuracy of the vessel contours was measured in terms of the 99th percentile of the error, which ranged between 0.64 and 0.96 mm. The presented results show that the approach is feasible for respiratory motion tracking and compensation and could therefore considerably improve the workflow of minimally invasive procedures in the thorax and abdomen.

  11. How important is interannual variability in the climatic interpretation of moraine sequences?

    NASA Astrophysics Data System (ADS)

    Leonard, E. M.; Laabs, B. J. C.; Plummer, M. A.

    2017-12-01

    Mountain glaciers respond to both long-term climate and interannual forcing. Anderson et al. (2014) pointed out that kilometer-scale fluctuations in glacier length may result from interannual variability in temperature and precipitation given a "steady" climate with no long-term trends in mean or variability of temperature and precipitation. They cautioned that use of outermost moraines from the Last Glacial Maximum (LGM) as indicators of LGM climate will, because of the role of interannual forcing, result in overestimation of the magnitude of long-term temperature depression and/or precipitation enhancement. Here we assess the implications of these ideas, by examining the effect of interannual variability on glacier length and inferred magnitude of LGM climate change from present under both an assumed steady LGM climate and an LGM climate with low-magnitude, long-period variation in summer temperature and annual precipitation. We employ both the original 1-stage linear glacier model (Roe and O'Neal, 2009) used by Anderson et al. (2014) and a newer 3-stage linear model (Roe and Baker, 2014). We apply the models to two reconstructed LGM glaciers in the Colorado Sangre de Cristo Mountains. Three-stage-model results indicate that, absent long-term variations through a 7500-year-long LGM, interannual variability would result in overestimation of mean LGM temperature depression from the outermost moraine of 0.2-0.6°C. If small long-term cyclic variations of temperature (±0.5°C) and precipitation (±5%) are introduced, the overestimation of LGM temperature depression reduces to less than 0.4°C, and if slightly greater long-term variation (±1.0°C and ±10% precipitation) is introduced, the magnitude of overestimation is 0.3°C or less. Interannual variability may produce a moraine sequence that differs from the sequence that would be expected were glacier length forced only by long-term climate. 
With small amplitude (±0.5°C and ±5% precipitation) long-term variation, the moraine sequence expected if forced by a combination of interannual variability and long-term climate differs from that expected based on long-term climate forcing alone in 38% of model runs. With the larger amplitude long-term forcing (±1.0°C and ±10% precipitation) this difference occurs in 20% of model runs.

  12. Arc Length Coding by Interference of Theta Frequency Oscillations May Underlie Context-Dependent Hippocampal Unit Data and Episodic Memory Function

    ERIC Educational Resources Information Center

    Hasselmo, Michael E.

    2007-01-01

    Many memory models focus on encoding of sequences by excitatory recurrent synapses in region CA3 of the hippocampus. However, data and modeling suggest an alternate mechanism for encoding of sequences in which interference between theta frequency oscillations encodes the position within a sequence based on spatial arc length or time. Arc length…

  13. A space-efficient algorithm for local similarities.

    PubMed

    Huang, X Q; Hardison, R C; Miller, W

    1990-10-01

    Existing dynamic-programming algorithms for identifying similar regions of two sequences require time and space proportional to the product of the sequence lengths. Often this space requirement is more limiting than the time requirement. We describe a dynamic-programming local-similarity algorithm that needs only space proportional to the sum of the sequence lengths. The method can also find repeats within a single long sequence. To illustrate the algorithm's potential, we discuss comparison of a 73,360 nucleotide sequence containing the human beta-like globin gene cluster and a corresponding 44,594 nucleotide sequence for rabbit, a problem well beyond the capabilities of other dynamic-programming software.
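
    The space saving described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it computes only the best local-similarity score (Smith-Waterman style recurrence) while keeping two rows of the dynamic-programming table, and it does not recover the aligned regions. The function name and the scoring values (match, mismatch, linear gap penalty) are assumptions chosen for illustration.

```python
def local_similarity_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best local-alignment score of a and b using only two DP rows.

    Hypothetical scoring scheme (not from the paper): +1 match,
    -1 mismatch, -1 per gap position (linear gap penalty).
    """
    if len(b) > len(a):
        a, b = b, a  # keep the shorter sequence along the row
    prev = [0] * (len(b) + 1)  # DP row for the previous character of a
    best = 0
    for ch_a in a:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr  # discard the older row: memory stays O(min(len(a), len(b)))
    return best
```

    The full table would need space proportional to the product of the two lengths; here only two rows of the shorter sequence's length are alive at any time. Recovering the alignment itself in linear space requires additional machinery that this sketch omits.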

  14. An EMA analysis of the effect of increasing word length on consonant production in apraxia of speech: a case study.

    PubMed

    Bartle, Carly J; Goozée, Justine V; Murdoch, Bruce E

    2007-03-01

    The effect of increasing word length on the articulatory dynamics (i.e. duration, distance, maximum acceleration, maximum deceleration, and maximum velocity) of consonant production in acquired apraxia of speech was investigated using electromagnetic articulography (EMA). Tongue-tip and tongue-back movement of one apraxic patient was recorded using the AG-200 EMA system during word-initial consonant productions in one, two, and three syllable words. Significantly deviant articulatory parameters were recorded for each of the target consonants during one, two, and three syllables words. Word length effects were most evident during the release phase of target consonant productions. The results are discussed with respect to theories of speech motor control as they relate to AOS.

  15. A high HIV-1 strain variability in London, UK, revealed by full-genome analysis: Results from the ICONIC project

    PubMed Central

    Frampton, Dan; Gallo Cassarino, Tiziano; Raffle, Jade; Hubb, Jonathan; Ferns, R. Bridget; Waters, Laura; Tong, C. Y. William; Kozlakidis, Zisis; Hayward, Andrew; Kellam, Paul; Pillay, Deenan; Clark, Duncan; Nastouli, Eleni; Leigh Brown, Andrew J.

    2018-01-01

    Background & methods The ICONIC project has developed an automated high-throughput pipeline to generate HIV nearly full-length genomes (NFLG, i.e. from gag to nef) from next-generation sequencing (NGS) data. The pipeline was applied to 420 HIV samples collected at University College London Hospitals NHS Trust and Barts Health NHS Trust (London) and sequenced using an Illumina MiSeq at the Wellcome Trust Sanger Institute (Cambridge). Consensus genomes were generated and subtyped using COMET, and unique recombinants were studied with jpHMM and SimPlot. Maximum-likelihood phylogenetic trees were constructed using RAxML to identify transmission networks using the Cluster Picker. Results The pipeline generated sequences of at least 1Kb of length (median = 7.46Kb, IQR = 4.01Kb) for 375 out of the 420 samples (89%), with 174 (46.4%) being NFLG. A total of 365 sequences (169 of them NFLG) corresponded to unique subjects and were included in the down-stream analyses. The most frequent HIV subtypes were B (n = 149, 40.8%) and C (n = 77, 21.1%) and the circulating recombinant form CRF02_AG (n = 32, 8.8%). We found 14 different CRFs (n = 66, 18.1%) and multiple URFs (n = 32, 8.8%) that involved recombination between 12 different subtypes/CRFs. The most frequent URFs were B/CRF01_AE (4 cases) and A1/D, B/C, and B/CRF02_AG (3 cases each). Most URFs (19/26, 73%) lacked breakpoints in the PR+RT pol region, rendering them undetectable if only that was sequenced. Twelve (37.5%) of the URFs could have emerged within the UK, whereas the rest were probably imported from sub-Saharan Africa, South East Asia and South America. For 2 URFs we found highly similar pol sequences circulating in the UK. We detected 31 phylogenetic clusters using the full dataset: 25 pairs (mostly subtypes B and C), 4 triplets and 2 quadruplets. Some of these were not consistent across different genes due to inter- and intra-subtype recombination. Clusters involved 70 sequences, 19.2% of the dataset. 
Conclusions The initial analysis of genome sequences detected substantial hidden variability in the London HIV epidemic. Analysing full genome sequences, as opposed to only PR+RT, identified previously undetected recombinants. It provided a more reliable description of CRFs (that would be otherwise misclassified) and transmission clusters. PMID:29389981

  16. Scaling exponents for ordered maxima

    DOE PAGES

    Ben-Naim, E.; Krapivsky, P. L.; Lemons, N. W.

    2015-12-22

    We study extreme value statistics of multiple sequences of random variables. For each sequence with N variables, independently drawn from the same distribution, the running maximum is defined as the largest variable to date. We compare the running maxima of m independent sequences and investigate the probability S_N that the maxima are perfectly ordered, that is, the running maximum of the first sequence is always larger than that of the second sequence, which is always larger than the running maximum of the third sequence, and so on. The probability S_N is universal: it does not depend on the distribution from which the random variables are drawn. For two sequences, S_N ~ N^(-1/2), and in general, the decay is algebraic, S_N ~ N^(-σ_m), for large N. We analytically obtain the exponent σ_3 ≅ 1.302931 as root of a transcendental equation. Moreover, the exponents σ_m grow with m, and we show that σ_m ~ m for large m.
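
    The ordering probability defined in this abstract can be estimated numerically. The following is a minimal Monte Carlo sketch, not the authors' analytical method; the function names, the uniform distribution, the trial count, and the seed are all assumptions made for illustration.

```python
import random


def maxima_perfectly_ordered(sequences):
    """True if, at every step, the running maximum of sequence k is
    strictly larger than the running maximum of sequence k + 1."""
    running = [float("-inf")] * len(sequences)
    for step in zip(*sequences):
        running = [max(r, x) for r, x in zip(running, step)]
        if any(running[k] <= running[k + 1] for k in range(len(running) - 1)):
            return False
    return True


def estimate_ordering_probability(m, n, trials=20000, seed=0):
    """Monte Carlo estimate of the probability that the running maxima of
    m independent sequences of n uniform random variables stay ordered."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seqs = [[rng.random() for _ in range(n)] for _ in range(m)]
        hits += maxima_perfectly_ordered(seqs)
    return hits / trials
```

    Calling estimate_ordering_probability(2, n) for increasing n and plotting the result on log-log axes is one way to compare a simulation against the algebraic decay stated above. Because the abstract reports that the probability is distribution-independent, the uniform distribution is used here only for convenience.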

  17. Influence of alkyl chain length compatibility on microemulsion structure and solubilization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bansal, V.K.; O'Connell, J.P.; Shah, D.O.

    1980-06-01

    The water solubilization capacity of water/oil microemulsions is studied as a function of alkyl chain length of oil (C8 to C16), surfactant (C14 and C18 fatty acid soaps), and alcohol (C4 to C7). Sodium stearate and sodium myristate were used as surfactants. For n-butanol microemulsions the maximum amount of water solubilized in the microemulsion decreased continuously with increasing oil chain length; for n-heptanol it increased continuously. For n-pentanol and n-hexanol systems, water solubilization reached a maximum when the oil chain length plus alcohol chain length was equal to that of the surfactant. The electric resistance and dielectric constant of the microemulsions also are measured as a function of alkyl chain length of the oil. 48 references.

  18. LenVarDB: database of length-variant protein domains.

    PubMed

    Mutt, Eshita; Mathew, Oommen K; Sowdhamini, Ramanathan

    2014-01-01

    Protein domains are functionally and structurally independent modules, which add to the functional variety of proteins. This array of functional diversity has been enabled by evolutionary changes, such as amino acid substitutions or insertions or deletions, occurring in these protein domains. Length variations (indels) can introduce changes at structural, functional and interaction levels. LenVarDB (freely available at http://caps.ncbs.res.in/lenvardb/) traces these length variations, starting from structure-based sequence alignments in our Protein Alignments organized as Structural Superfamilies (PASS2) database, across 731 structural classification of proteins (SCOP)-based protein domain superfamilies connected to 2 730 625 sequence homologues. Alignment of sequence homologues corresponding to a structural domain is available, starting from a structure-based sequence alignment of the superfamily. Orientation of the length-variant (indel) regions in protein domains can be visualized by mapping them on the structure and on the alignment. Knowledge about location of length variations within protein domains and their visual representation will be useful in predicting changes within structurally or functionally relevant sites, which may ultimately regulate protein function. Non-technical summary: Evolutionary changes bring about natural changes to proteins that may be found in many organisms. Such changes could be reflected as amino acid substitutions or insertions-deletions (indels) in protein sequences. LenVarDB is a database that provides an early overview of observed length variations that were set among 731 protein families and after examining >2 million sequences. Indels are followed up to observe if they are close to the active site such that they can affect the activity of proteins. Inclusion of such information can aid the design of bioengineering experiments.

  19. Morphological and molecular differentiation of Parastrigea (Trematoda: Strigeidae) from Mexico, with the description of a new species.

    PubMed

    Hernández-Mena, David Iván; García-Prieto, Luís; García-Varela, Martín

    2014-04-01

    Parastrigea plataleae n. sp. (Digenea: Strigeidae) is described from the intestine of the roseate spoonbill Platalea ajaja (Threskiornithidae) from four localities on the Pacific coast of Mexico. The new species is mainly distinguished from the other 18 described species of Parastrigea based on the ratio of its hindbody length to forebody length. A principal component analysis (PCA) of 16 morphometric traits for 15 specimens of P. plataleae n. sp., five of Parastrigea cincta and 11 of Parastrigea diovadena previously recorded in Mexico, clearly shows three clusters, which correspond to the three species. DNA sequences of the internal transcribed spacers (ITSs) of ribosomal DNA and the mitochondrial gene cytochrome c oxidase subunit I (cox 1) were used to corroborate this morphological distinction. The genetic divergence estimated among P. plataleae n. sp., P. cincta and P. diovadena ranged from 0.5 to 1.48% for ITSs and from 9.31 to 11.47% for cox 1. Maximum parsimony (MP) and maximum likelihood (ML) analyses were performed on the combined datasets (ITSs+cox 1) and on each dataset alone. All of the phylogenetic analyses indicated that the specimens from the roseate spoonbill represent a clade with strong bootstrap support. The morphological evidence and the genetic divergence in combination with the reciprocal monophyly in all of the phylogenetic trees support the hypothesis that the digeneans found in the intestines of roseate spoonbills represent a new species. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  20. Indexing a sequence for mapping reads with a single mismatch.

    PubMed

    Crochemore, Maxime; Langiu, Alessio; Rahman, M Sohel

    2014-05-28

    Mapping reads against a genome sequence is an interesting and useful problem in computational molecular biology and bioinformatics. In this paper, we focus on the problem of indexing a sequence for mapping reads with a single mismatch. We first focus on a simpler problem where the length of the pattern is given beforehand during the data structure construction. This version of the problem is interesting in its own right in the context of the next generation sequencing. In the sequel, we show how to solve the more general problem. In both cases, our algorithm can construct an efficient data structure in O(n log^(1+ε) n) time and space and can answer subsequent queries in O(m log log n + K) time. Here, n is the length of the sequence, m is the length of the read, 0 < ε < 1, and K is the optimal output size.

  1. Design of the hairpin ribozyme for targeting specific RNA sequences.

    PubMed

    Hampel, A; DeYoung, M B; Galasinski, S; Siwkowski, A

    1997-01-01

    The following steps should be taken when designing the hairpin ribozyme to cleave a specific target sequence: 1. Select a target sequence containing BN*GUC where B is C, G, or U. 2. Select the target sequence in areas least likely to have extensive interfering structure. 3. Design the conventional hairpin ribozyme as shown in Fig. 1, such that it can form a 4 bp helix 2 and helix 1 lengths up to 10 bp. 4. Synthesize this ribozyme from single-stranded DNA templates with a double-stranded T7 promoter. 5. Prepare a series of short substrates capable of forming a range of helix 1 lengths of 5-10 bp. 6. Identify these by direct RNA sequencing. 7. Assay the extent of cleavage of each substrate to identify the optimal length of helix 1. 8. Prepare the hairpin tetraloop ribozyme to determine if catalytic efficiency can be improved.
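
    Step 1 of this procedure (selecting candidate target sites) can be sketched in a few lines. This is an illustrative helper, not part of the published protocol: the function name is hypothetical, the motif is read as B (C, G, or U), then any nucleotide N, then GUC, and the asterisk is assumed to mark the cleavage position between N and G.

```python
import re

# B = C, G, or U; N = any nucleotide; followed by GUC.
# A lookahead is used so that overlapping sites are all reported.
TARGET_SITE = re.compile(r"(?=([CGU][ACGU]GUC))")


def find_target_sites(rna):
    """Return (site_start, cleavage_index, motif) for each BN*GUC match.

    cleavage_index is the 0-based index of the G that follows the assumed
    cleavage point. DNA-style input (T instead of U) is converted to RNA.
    """
    rna = rna.upper().replace("T", "U")
    return [
        (m.start(1), m.start(1) + 2, m.group(1))
        for m in TARGET_SITE.finditer(rna)
    ]
```

    For example, find_target_sites("AACAGUCAA") reports one site starting at index 2 with the motif CAGUC. The later steps (avoiding structured regions, testing helix 1 lengths) are experimental and are not modeled here.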

  2. Taxonomic Characterization of Honey Bee (Apis mellifera) Pollen Foraging Based on Non-Overlapping Paired-End Sequencing of Nuclear Ribosomal Loci.

    PubMed

    Cornman, R Scott; Otto, Clint R V; Iwanowicz, Deborah; Pettis, Jeffery S

    2015-01-01

    Identifying plant taxa that honey bees (Apis mellifera) forage upon is of great apicultural interest, but traditional methods are labor intensive and may lack resolution. Here we evaluate a high-throughput genetic barcoding approach to characterize trap-collected pollen from multiple North Dakota apiaries across multiple years. We used the Illumina MiSeq platform to generate sequence scaffolds from non-overlapping 300-bp paired-end sequencing reads of the ribosomal internal transcribed spacers (ITS). Full-length sequence scaffolds represented ~530 bp of ITS sequence after adapter trimming, drawn from the 5' of ITS1 and the 3' of ITS2, while skipping the uninformative 5.8S region. Operational taxonomic units (OTUs) were picked from scaffolds clustered at 97% identity, searched by BLAST against the nt database, and given taxonomic assignments using the paired-read lowest common ancestor approach. Taxonomic assignments and quantitative patterns were consistent with known plant distributions, phenology, and observational reports of pollen foraging, but revealed an unexpected contribution from non-crop graminoids and wetland plants. The mean number of plant species assignments per sample was 23.0 (+/- 5.5) and the mean species diversity (effective number of equally abundant species) was 3.3 (+/- 1.2). Bray-Curtis similarities showed good agreement among samples from the same apiary and sampling date. Rarefaction plots indicated that fewer than 50,000 reads are typically needed to characterize pollen samples of this complexity. Our results show that a pre-compiled, curated reference database is not essential for genus-level assignments, but species-level assignments are hindered by database gaps, reference length variation, and probable errors in the taxonomic assignment, requiring post-hoc evaluation. 
Although the effective per-sample yield achieved using custom MiSeq amplicon primers was less than the machine maximum, primarily due to lower "read2" quality, further protocol optimization and/or a modest reduction in multiplex scale should offset this difficulty. As small quantities of pollen are sufficient for amplification, our approach might be extendable to other questions or species for which large pollen samples are not available.

  3. Taxonomic Characterization of Honey Bee (Apis mellifera) Pollen Foraging Based on Non-Overlapping Paired-End Sequencing of Nuclear Ribosomal Loci

    PubMed Central

    Cornman, R. Scott; Otto, Clint R. V.; Iwanowicz, Deborah; Pettis, Jeffery S.

    2015-01-01

    Identifying plant taxa that honey bees (Apis mellifera) forage upon is of great apicultural interest, but traditional methods are labor intensive and may lack resolution. Here we evaluate a high-throughput genetic barcoding approach to characterize trap-collected pollen from multiple North Dakota apiaries across multiple years. We used the Illumina MiSeq platform to generate sequence scaffolds from non-overlapping 300-bp paired-end sequencing reads of the ribosomal internal transcribed spacers (ITS). Full-length sequence scaffolds represented ~530 bp of ITS sequence after adapter trimming, drawn from the 5’ of ITS1 and the 3’ of ITS2, while skipping the uninformative 5.8S region. Operational taxonomic units (OTUs) were picked from scaffolds clustered at 97% identity, searched by BLAST against the nt database, and given taxonomic assignments using the paired-read lowest common ancestor approach. Taxonomic assignments and quantitative patterns were consistent with known plant distributions, phenology, and observational reports of pollen foraging, but revealed an unexpected contribution from non-crop graminoids and wetland plants. The mean number of plant species assignments per sample was 23.0 (+/- 5.5) and the mean species diversity (effective number of equally abundant species) was 3.3 (+/- 1.2). Bray-Curtis similarities showed good agreement among samples from the same apiary and sampling date. Rarefaction plots indicated that fewer than 50,000 reads are typically needed to characterize pollen samples of this complexity. Our results show that a pre-compiled, curated reference database is not essential for genus-level assignments, but species-level assignments are hindered by database gaps, reference length variation, and probable errors in the taxonomic assignment, requiring post-hoc evaluation. 
Although the effective per-sample yield achieved using custom MiSeq amplicon primers was less than the machine maximum, primarily due to lower “read2” quality, further protocol optimization and/or a modest reduction in multiplex scale should offset this difficulty. As small quantities of pollen are sufficient for amplification, our approach might be extendable to other questions or species for which large pollen samples are not available. PMID:26700168

  4. Taxonomic characterization of honey bee (Apis mellifera) pollen foraging based on non-overlapping paired-end sequencing of nuclear ribosomal loci

    USGS Publications Warehouse

    Cornman, Robert S.; Otto, Clint R.; Iwanowicz, Deborah; Pettis, Jeffery S

    2015-01-01

    Identifying plant taxa that honey bees (Apis mellifera) forage upon is of great apicultural interest, but traditional methods are labor intensive and may lack resolution. Here we evaluate a high-throughput genetic barcoding approach to characterize trap-collected pollen from multiple North Dakota apiaries across multiple years. We used the Illumina MiSeq platform to generate sequence scaffolds from non-overlapping 300-bp paired-end sequencing reads of the ribosomal internal transcribed spacers (ITS). Full-length sequence scaffolds represented ~530 bp of ITS sequence after adapter trimming, drawn from the 5’ of ITS1 and the 3’ of ITS2, while skipping the uninformative 5.8S region. Operational taxonomic units (OTUs) were picked from scaffolds clustered at 97% identity, searched by BLAST against the nt database, and given taxonomic assignments using the paired-read lowest common ancestor approach. Taxonomic assignments and quantitative patterns were consistent with known plant distributions, phenology, and observational reports of pollen foraging, but revealed an unexpected contribution from non-crop graminoids and wetland plants. The mean number of plant species assignments per sample was 23.0 (+/- 5.5) and the mean species diversity (effective number of equally abundant species) was 3.3 (+/- 1.2). Bray-Curtis similarities showed good agreement among samples from the same apiary and sampling date. Rarefaction plots indicated that fewer than 50,000 reads are typically needed to characterize pollen samples of this complexity. Our results show that a pre-compiled, curated reference database is not essential for genus-level assignments, but species-level assignments are hindered by database gaps, reference length variation, and probable errors in the taxonomic assignment, requiring post-hoc evaluation. 
Although the effective per-sample yield achieved using custom MiSeq amplicon primers was less than the machine maximum, primarily due to lower “read2” quality, further protocol optimization and/or a modest reduction in multiplex scale should offset this difficulty. As small quantities of pollen are sufficient for amplification, our approach might be extendable to other questions or species for which large pollen samples are not available.

  5. 8 January 2013 Mw=5.7 North Aegean Sea Earthquake Sequence

    NASA Astrophysics Data System (ADS)

    Kürçer, Akın; Yalçın, Hilal; Gülen, Levent; Kalafat, Doǧan

    2014-05-01

    The deformation of the North Aegean Sea is mainly controlled by the westernmost segments of the North Anatolian Fault Zone (NAFZ). On January 8, 2013, a moderate earthquake (Mw = 5.7) occurred in the North Aegean Sea, which may be considered to be a part of the westernmost splay of the NAFZ. A series of aftershocks with magnitudes varying from 1.9 to 5.0 occurred within four months following the mainshock. In this study, a total of 23 earthquake moment tensor solutions that belong to the 2013 earthquake sequence have been obtained by using KOERI and AFAD seismic data. The most widely used Gephart & Forsyth (1984) and Michael (1987) methods have been used to carry out stress tensor inversions. Based on the earthquake moment tensor solutions, distribution of epicenters and seismotectonic setting, the source of this earthquake sequence is a N75°E trending pure dextral strike-slip fault. The temporal and spatial distribution of earthquakes indicates that the rupture unilaterally propagated from SW to NE. The length of the fault has been calculated as approximately 12 km using the aftershock distribution and the empirical equations suggested by Wells and Coppersmith (1994). The stress tensor analysis indicates that the dominant faulting type in the region is strike-slip and the direction of the regional compressive stress is WNW-ESE. The 1968 Aghios earthquake (Ms = 7.3; Ambraseys and Jackson, 1998) and 2013 North Aegean Sea earthquake sequences clearly show that the regional stress has been transferred from SW to NE in this region. The last historical earthquake, the Bozcaada earthquake (M = 7.05), occurred to the northeast of the 2013 earthquake sequence in 1672. The elapsed time (342 years) and regional stress transfer point out that the 1672 earthquake segment is probably a seismic gap.
According to the empirical equations, the surface rupture length of the 1672 Earthquake segment was about 47 km, with a maximum displacement of 170 cm and average displacement of 107 cm. These values indicate that the 1672 earthquake segment is a potential earthquake hazard for this region.

  6. Characterization of full-length MHC class II sequences in Indonesian and Vietnamese cynomolgus macaques.

    PubMed

    Creager, Hannah M; Becker, Ericka A; Sandman, Kelly K; Karl, Julie A; Lank, Simon M; Bimber, Benjamin N; Wiseman, Roger W; Hughes, Austin L; O'Connor, Shelby L; O'Connor, David H

    2011-09-01

    In recent years, the use of cynomolgus macaques in biomedical research has increased greatly. However, with the exception of the Mauritian population, knowledge of the MHC class II genetics of the species remains limited. Here, using cDNA cloning and Sanger sequencing, we identified 127 full-length MHC class II alleles in a group of 12 Indonesian and 12 Vietnamese cynomolgus macaques. Forty two of these were completely novel to cynomolgus macaques while 61 extended the sequence of previously identified alleles from partial to full length. This more than doubles the number of full-length cynomolgus macaque MHC class II alleles available in GenBank, significantly expanding the allele library for the species and laying the groundwork for future evolutionary and functional studies.

  7. Isolation and sequence characterization of DNA-A genome of a new begomovirus strain associated with severe leaf curling symptoms of Jatropha curcas L.

    PubMed

    Chauhan, Sushma; Rahman, Hifzur; Mastan, Shaik G; Pamidimarri, D V N Sudheer; Reddy, Muppala P

    2018-07-20

    Begomoviruses, which belong to the family Geminiviridae, are associated with several disease symptoms, such as mosaic and leaf curling, in Jatropha curcas. The molecular characterization of these viral strains will help in developing management strategies to control the disease. In this study, J. curcas plants that were infected with begomovirus and showed acute leaf curling symptoms were identified. The DNA-A segment from the pathogenic viral strain was isolated and sequenced. The sequenced genome was assembled and characterized in detail. The full-length DNA-A sequence was covered by primer walking. The genome sequence showed the general organization of DNA-A from begomovirus by the distribution of ORFs in both viral and anti-viral strands. The genome size ranged from 2844 to 2852 bp. Three strains with minor nucleotide variations were identified, and a phylogenetic analysis was performed by comparing the DNA-A segments from other reported begomovirus isolates. The maximum sequence similarity was observed with Euphorbia yellow mosaic virus (FN435995). In the phylogenetic tree, no clustering was observed with previously reported begomovirus strains isolated from the J. curcas host. The strains isolated in this study belong to a new begomoviral strain that elicits symptoms of leaf curling in J. curcas. The results indicate that the probable origin of the strains is Jatropha mosaic virus infecting J. gossypiifolia. The strains isolated in this study are referred to as Jatropha curcas leaf curl India virus (JCLCIV) based on the major symptoms exhibited by the host J. curcas. Copyright © 2018 Elsevier B.V. All rights reserved.

  8. RAPD and Internal Transcribed Spacer Sequence Analyses Reveal Zea nicaraguensis as a Section Luxuriantes Species Close to Zea luxurians

    PubMed Central

    Wang, Pei; Lu, Yanli; Zheng, Mingmin; Rong, Tingzhao; Tang, Qilin

    2011-01-01

    Genetic relationship of a newly discovered teosinte from Nicaragua, Zea nicaraguensis with waterlogging tolerance, was determined based on randomly amplified polymorphic DNA (RAPD) markers and the internal transcribed spacer (ITS) sequences of nuclear ribosomal DNA using 14 accessions from Zea species. RAPD analysis showed that a total of 5,303 fragments were produced by 136 random decamer primers, of which 84.86% bands were polymorphic. RAPD-based UPGMA analysis demonstrated that the genus Zea can be divided into section Luxuriantes including Zea diploperennis, Zea luxurians, Zea perennis and Zea nicaraguensis, and section Zea including Zea mays ssp. mexicana, Zea mays ssp. parviglumis, Zea mays ssp. huehuetenangensis and Zea mays ssp. mays. ITS sequence analysis showed the lengths of the entire ITS region of the 14 taxa in Zea varied from 597 to 605 bp. The average GC content was 67.8%. In addition to the insertion/deletions, 78 variable sites were recorded in the total ITS region with 47 in ITS1, 5 in 5.8S, and 26 in ITS2. Sequences of these taxa were analyzed with neighbor-joining (NJ) and maximum parsimony (MP) methods to construct the phylogenetic trees, selecting Tripsacum dactyloides L. as the outgroup. The phylogenetic relationships of Zea species inferred from the ITS sequences are highly concordant with the RAPD evidence that resolved two major subgenus clades. Both RAPD and ITS sequence analyses indicate that Zea nicaraguensis is more closely related to Zea luxurians than the other teosintes and cultivated maize, which should be regarded as a section Luxuriantes species. PMID:21525982

  9. Multifractal analysis of 2001 Mw 7.7 Bhuj earthquake sequence in Gujarat, Western India

    NASA Astrophysics Data System (ADS)

    Aggarwal, Sandeep Kumar; Pastén, Denisse; Khan, Prosanta Kumar

    2017-12-01

    The 2001 Mw 7.7 Bhuj mainshock seismic sequence in the Kachchh area, occurring during 2001 to 2012, has been analyzed using mono-fractal and multi-fractal dimension spectrum analysis techniques. This region was characterized by frequent moderate shocks of Mw ≥ 5.0 for more than a decade after the 2001 Bhuj earthquake. The present study is therefore important for precursory analysis using this sequence. The selected long sequence has been investigated for the first time for a completeness magnitude of Mc 3.0 using the maximum curvature method. Multi-fractal Dq spectrum (Dq ∼ q) analysis was carried out using an effective window length of 200 earthquakes with a moving window of 20 events, so that consecutive windows overlap by 180 events. The robustness of the analysis has been tested by applying a magnitude completeness correction term of 0.2 to Mc 3.0, giving Mc 3.2, and we have tested the error in the calculation of Dq for each magnitude threshold. In addition, the stability of the analysis has been investigated down to the minimum magnitude of Mw ≥ 2.6 in the sequence. The analysis shows that the multi-fractal dimension spectrum Dq decreases with increasing clustering of events in time before a moderate-magnitude earthquake in the sequence, which alternatively accounts for non-randomness in the spatial distribution of epicenters and its self-organized criticality. Similar behavior is ubiquitous elsewhere around the globe, and warns of the proximity of a damaging seismic event in an area.
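    The windowing scheme in this abstract (200-event windows advanced by 20 events, so that consecutive windows share 180 events) can be illustrated with a minimal sketch. The function name `sliding_windows` and its parameters are hypothetical and are not taken from the paper; the sketch only enumerates window index ranges and does not compute Dq.

```python
def sliding_windows(n_events, window=200, step=20):
    """Return (start, stop) index pairs for overlapping event windows.

    Assumption: events are already sorted in time and indexed 0..n_events-1.
    Each window holds `window` events; consecutive windows are shifted by
    `step` events, so they overlap by `window - step` events.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [(start, start + window)
            for start in range(0, n_events - window + 1, step)]


# Example: a catalogue of 260 events yields four windows.
windows = sliding_windows(260)
```

    With the default values, each pair of neighbouring windows shares 200 - 20 = 180 events, which matches the overlap quoted in the abstract.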

  10. New Sequences with Low Correlation and Large Family Size

    NASA Astrophysics Data System (ADS)

    Zeng, Fanxin

    In direct-sequence code-division multiple-access (DS-CDMA) communication systems and direct-sequence ultra wideband (DS-UWB) radios, sequences with low correlation and large family size are important for reducing multiple access interference (MAI) and accepting more active users, respectively. In this paper, a new collection of families of sequences of length p^n − 1, which includes three constructions, is proposed. The maximum number of cyclically distinct families without GMW sequences in each construction is φ(p^n − 1)/n · φ(p^m − 1)/m, where p is a prime number, n is an even number, and n = 2m, and these sequences can be binary or polyphase depending upon choice of the parameter p. In Construction I, there are p^n distinct sequences within each family and the new sequences have at most d + 2 nontrivial periodic correlation values {−p^m − 1, −1, p^m − 1, 2p^m − 1, …, dp^m − 1}. In Construction II, the new sequences have large family size p^(2n) and possibly take the nontrivial correlation values in {−p^m − 1, −1, p^m − 1, 2p^m − 1, …, (3d − 4)p^m − 1}. In Construction III, the new sequences possess the largest family size p^((d−1)n) and have at most 2d correlation levels {−p^m − 1, −1, p^m − 1, 2p^m − 1, …, (2d − 2)p^m − 1}. Three constructions are near-optimal with respect to the Welch bound because the values of their Welch-Ratios are moderate, WR_??_d, WR_??_3d-4 and WR_??_2d-2, respectively. Each family in Constructions I, II and III contains a GMW sequence. In addition, Helleseth sequences and Niho sequences are special cases in Constructions I and III, and their restriction conditions to the integers m and n, p^m ≠ 2 (mod 3) and n ≡ 0 (mod 4), respectively, are removed in our sequences. Our sequences in Construction III include the sequences with Niho type decimation 3·2^m − 2, too. Finally, some open questions are pointed out and an example that illustrates the performance of these sequences is given.

  11. Assessing the performance of the Oxford Nanopore Technologies MinION

    PubMed Central

    Laver, T.; Harrison, J.; O’Neill, P.A.; Moore, K.; Farbos, A.; Paszkiewicz, K.; Studholme, D.J.

    2015-01-01

    The Oxford Nanopore Technologies (ONT) MinION is a new sequencing technology that potentially offers read lengths of tens of kilobases (kb) limited only by the length of DNA molecules presented to it. The device has a low capital cost, is by far the most portable DNA sequencer available, and can produce data in real-time. It has numerous prospective applications including improving genome sequence assemblies and resolution of repeat-rich regions. Before such a technology is widely adopted, it is important to assess its performance and limitations in respect of throughput and accuracy. In this study we assessed the performance of the MinION by re-sequencing three bacterial genomes, with very different nucleotide compositions ranging from 28.6% to 70.7%; the high G + C strain was underrepresented in the sequencing reads. We estimate the error rate of the MinION (after base calling) to be 38.2%. Mean and median read lengths were 2 kb and 1 kb respectively, while the longest single read was 98 kb. The whole length of a 5 kb rRNA operon was covered by a single read. As the first nanopore-based single molecule sequencer available to researchers, the MinION is an exciting prospect; however, the current error rate limits its ability to compete with existing sequencing technologies, though we do show that MinION sequence reads can enhance contiguity of de novo assembly when used in conjunction with Illumina MiSeq data. PMID:26753127

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stillman, A.E.; Wilke, N.; Li, D.

    Our goal was to determine the feasibility of using an intravascular MR contrast agent to improve 3D MRA. Three-dimensional TOF MRA was performed in nine patients both prior to and following the administration of an ultrasmall particle superparamagnetic iron oxide contrast agent (AMI 227). The lengths of both renal arteries were measured from the maximum intensity projection (MIP) images as well as the individual partitions. Seven of these patients also were studied by a 3D coronary artery MRA sequence. Signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) measurements of the right coronary artery were determined both prior to and following the administration of AMI 227. Statistical analysis of both renal artery lengths and right coronary SNR and CNR was performed using a one tailed paired t test comparing pre- and postcontrast images. The renal artery lengths significantly increased (right renal artery: 30%, p = 0.001; left renal artery: 25%, p < 0.008) when measured from the individual axial slice partitions. No significant increase in length was observed on the MIP images following contrast. In the right coronary artery, the SNR increased by an average of 80% (p = 0.008) and CNR increased by an average of 109% (p = 0.007). Increased background signal and superimposed venous structures reduced the measurable lengths of the renal arteries from the MIP images. These studies support the hypothesis that 3D MRA in the body will benefit from the use of intravascular contrast agents. Nevertheless, conventional MIP processing is unable to reveal the full advantage of the contrast improvement. 14 refs., 6 figs., 2 tabs.

  13. Optimal choice of word length when comparing two Markov sequences using a χ²-statistic.

    PubMed

    Bai, Xin; Tang, Kujin; Ren, Jie; Waterman, Michael; Sun, Fengzhu

    2017-10-03

    Alignment-free sequence comparison using counts of word patterns (grams, k-tuples) has become an active research topic due to the large amount of sequence data from the new sequencing technologies. Genome sequences are frequently modelled by Markov chains and the likelihood ratio test or the corresponding approximate χ²-statistic has been suggested to compare two sequences. However, it is not known how to best choose the word length k in such studies. We develop an optimal strategy to choose k by maximizing the statistical power of detecting differences between two sequences. Let the orders of the Markov chains for the two sequences be r1 and r2, respectively. We show through both simulations and theoretical studies that the optimal k = max(r1, r2) + 1 for both long sequences and next generation sequencing (NGS) read data. The orders of the Markov chains may be unknown and several methods have been developed to estimate the orders of Markov chains based on both long sequences and NGS reads. We study the power loss of the statistics when the estimated orders are used. It is shown that the power loss is minimal for some of the estimators of the orders of Markov chains. Our studies provide guidelines on choosing the optimal word length for the comparison of Markov sequences.
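    The word-length rule stated in this abstract, k = max(r1, r2) + 1, can be combined with simple word counting as in the sketch below. The function names are hypothetical, the Markov orders are assumed to be known or already estimated, and the test statistic itself is not reproduced here because the abstract does not give its form.

```python
from collections import Counter


def optimal_word_length(r1, r2):
    """Word length suggested in the abstract: k = max(r1, r2) + 1."""
    return max(r1, r2) + 1


def count_words(seq, k):
    """Count all overlapping words (k-tuples) of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Example: assumed Markov orders 1 and 2 give k = 3.
k = optimal_word_length(1, 2)
counts_a = count_words("ACGACGTT", k)
counts_b = count_words("ACGTTTGA", k)
```

    The two count tables are the input that any word-count-based comparison statistic would then use.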

  14. Rate-determining Step of Flap Endonuclease 1 (FEN1) Reflects a Kinetic Bias against Long Flaps and Trinucleotide Repeat Sequences.

    PubMed

    Tarantino, Mary E; Bilotti, Katharina; Huang, Ji; Delaney, Sarah

    2015-08-21

    Flap endonuclease 1 (FEN1) is a structure-specific nuclease responsible for removing 5'-flaps formed during Okazaki fragment maturation and long patch base excision repair. In this work, we use rapid quench flow techniques to examine the rates of 5'-flap removal on DNA substrates of varying length and sequence. Of particular interest are flaps containing trinucleotide repeats (TNR), which have been proposed to affect FEN1 activity and cause genetic instability. We report that FEN1 processes substrates containing flaps of 30 nucleotides or fewer at comparable single-turnover rates. However, for flaps longer than 30 nucleotides, FEN1 kinetically discriminates substrates based on flap length and flap sequence. In particular, FEN1 removes flaps containing TNR sequences at a rate slower than mixed sequence flaps of the same length. Furthermore, multiple-turnover kinetic analysis reveals that the rate-determining step of FEN1 switches as a function of flap length from product release to chemistry (or a step prior to chemistry). These results provide a kinetic perspective on the role of FEN1 in DNA replication and repair and contribute to our understanding of FEN1 in mediating genetic instability of TNR sequences. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.

  15. Integrating De Novo Transcriptome Assembly and Cloning to Obtain Chicken Ovocleidin-17 Full-Length cDNA

    PubMed Central

    Ning, ZhongHua; Hincke, Maxwell T.; Yang, Ning; Hou, ZhuoCheng

    2014-01-01

    Efficiently obtaining full-length cDNA for a target gene is the key step for functional studies and probing genetic variations. However, almost all sequenced domestic animal genomes are not ‘finished’. Many functionally important genes are located in these gapped regions. It can be difficult to obtain full-length cDNA for which only partial amino acid/EST sequences exist. In this study we report a general pipeline to obtain full-length cDNA, and illustrate this approach for one important gene (Ovocleidin-17, OC-17) that is associated with chicken eggshell biomineralization. Chicken OC-17 is one of the best candidates to control and regulate the deposition of calcium carbonate in the calcified eggshell layer. OC-17 protein has been purified, sequenced, and has had its three-dimensional structure solved. However, researchers still cannot conduct OC-17 mRNA related studies because the mRNA sequence is unknown and the gene is absent from the current chicken genome. We used RNA-Seq to obtain the entire transcriptome of the adult hen uterus, and then conducted de novo transcriptome assembling with bioinformatics analysis to obtain candidate OC-17 transcripts. Based on this sequence, we used RACE and PCR cloning methods to successfully obtain the full-length OC-17 cDNA. Temporal and spatial OC-17 mRNA expression analyses were also performed to demonstrate that OC-17 is predominantly expressed in the adult hen uterus during the laying cycle and barely at immature developmental stages. Differential uterine expression of OC-17 was observed in hens laying eggs with weak versus strong eggshell, confirming its important role in the regulation of eggshell mineralization and providing a new tool for genetic selection for eggshell quality parameters. This study is the first one to report the full-length OC-17 cDNA sequence, and builds a foundation for OC-17 mRNA related studies. 
    We provide a general method for biologists experiencing difficulty in obtaining candidate gene full-length cDNA sequences. PMID:24676480

  16. Integrating de novo transcriptome assembly and cloning to obtain chicken Ovocleidin-17 full-length cDNA.

    PubMed

    Zhang, Quan; Liu, Long; Zhu, Feng; Ning, ZhongHua; Hincke, Maxwell T; Yang, Ning; Hou, ZhuoCheng

    2014-01-01

    Efficiently obtaining full-length cDNA for a target gene is the key step for functional studies and probing genetic variations. However, almost all sequenced domestic animal genomes are not 'finished'. Many functionally important genes are located in these gapped regions. It can be difficult to obtain full-length cDNA for which only partial amino acid/EST sequences exist. In this study we report a general pipeline to obtain full-length cDNA, and illustrate this approach for one important gene (Ovocleidin-17, OC-17) that is associated with chicken eggshell biomineralization. Chicken OC-17 is one of the best candidates to control and regulate the deposition of calcium carbonate in the calcified eggshell layer. OC-17 protein has been purified, sequenced, and has had its three-dimensional structure solved. However, researchers still cannot conduct OC-17 mRNA related studies because the mRNA sequence is unknown and the gene is absent from the current chicken genome. We used RNA-Seq to obtain the entire transcriptome of the adult hen uterus, and then conducted de novo transcriptome assembling with bioinformatics analysis to obtain candidate OC-17 transcripts. Based on this sequence, we used RACE and PCR cloning methods to successfully obtain the full-length OC-17 cDNA. Temporal and spatial OC-17 mRNA expression analyses were also performed to demonstrate that OC-17 is predominantly expressed in the adult hen uterus during the laying cycle and barely at immature developmental stages. Differential uterine expression of OC-17 was observed in hens laying eggs with weak versus strong eggshell, confirming its important role in the regulation of eggshell mineralization and providing a new tool for genetic selection for eggshell quality parameters. This study is the first one to report the full-length OC-17 cDNA sequence, and builds a foundation for OC-17 mRNA related studies. 
    We provide a general method for biologists experiencing difficulty in obtaining candidate gene full-length cDNA sequences.

  17. The kinetics of pre-mRNA splicing in the Drosophila genome and the influence of gene architecture

    PubMed Central

    Pai, Athma A; Henriques, Telmo; McCue, Kayla; Burkholder, Adam; Adelman, Karen

    2017-01-01

    Production of most eukaryotic mRNAs requires splicing of introns from pre-mRNA. The splicing reaction requires definition of splice sites, which are initially recognized in either intron-spanning (‘intron definition’) or exon-spanning (‘exon definition’) pairs. To understand how exon and intron length and splice site recognition mode impact splicing, we measured splicing rates genome-wide in Drosophila, using metabolic labeling/RNA sequencing and new mathematical models to estimate rates. We found that the modal intron length range of 60–70 nt represents a local maximum of splicing rates, but that much longer exon-defined introns are spliced even faster and more accurately. We observed unexpectedly low variation in splicing rates across introns in the same gene, suggesting the presence of gene-level influences, and we identified multiple gene level variables associated with splicing rate. Together our data suggest that developmental and stress response genes may have preferentially evolved exon definition in order to enhance the rate or accuracy of splicing. PMID:29280736

  18. The kinetics of pre-mRNA splicing in the Drosophila genome and the influence of gene architecture

    DOE PAGES

    Pai, Athma A.; Henriques, Telmo; McCue, Kayla; ...

    2017-12-27

    Production of most eukaryotic mRNAs requires splicing of introns from pre-mRNA. The splicing reaction requires definition of splice sites, which are initially recognized in either intron-spanning (‘intron definition’) or exon-spanning (‘exon definition’) pairs. To understand how exon and intron length and splice site recognition mode impact splicing, we measured splicing rates genome-wide in Drosophila, using metabolic labeling/RNA sequencing and new mathematical models to estimate rates. We found that the modal intron length range of 60–70 nt represents a local maximum of splicing rates, but that much longer exon-defined introns are spliced even faster and more accurately. We observed unexpectedly low variation in splicing rates across introns in the same gene, suggesting the presence of gene-level influences, and we identified multiple gene level variables associated with splicing rate. Together our data suggest that developmental and stress response genes may have preferentially evolved exon definition in order to enhance the rate or accuracy of splicing.

  19. Physical Properties of Umbral Dots Observed in Sunspots: A Hinode Observation

    NASA Astrophysics Data System (ADS)

    Yadav, Rahul; Mathew, Shibu K.

    2018-04-01

    Umbral dots (UDs) are small-scale bright features observed in the umbral part of sunspots and pores. It is well established that they are manifestations of magnetoconvection phenomena inside umbrae. We study the physical properties of UDs in different sunspots and their dependence on decay rate and filling factor. We have selected high-resolution, G-band continuum filtergrams of seven sunspots from Hinode to study their physical properties. We have also used Michelson Doppler Imager (MDI) continuum images to estimate the decay rate of selected sunspots. An identification and tracking algorithm was developed to identify the UDs in time sequences. The statistical analysis of UDs exhibits an averaged maximum intensity and effective diameter of 0.26 I_{QS} and 270 km. Furthermore, the lifetime, horizontal speed, trajectory length, and displacement length (birth-death distance) of UDs are 8.19 minutes, 0.5 km s-1, 284 km, and 155 km, respectively. We also find a positive correlation between intensity-diameter, intensity-lifetime, and diameter-lifetime of UDs. However, UD properties do not show any significant relation with the decay rate or filling factor.

  20. The kinetics of pre-mRNA splicing in the Drosophila genome and the influence of gene architecture

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pai, Athma A.; Henriques, Telmo; McCue, Kayla

    Production of most eukaryotic mRNAs requires splicing of introns from pre-mRNA. The splicing reaction requires definition of splice sites, which are initially recognized in either intron-spanning (‘intron definition’) or exon-spanning (‘exon definition’) pairs. To understand how exon and intron length and splice site recognition mode impact splicing, we measured splicing rates genome-wide in Drosophila, using metabolic labeling/RNA sequencing and new mathematical models to estimate rates. We found that the modal intron length range of 60–70 nt represents a local maximum of splicing rates, but that much longer exon-defined introns are spliced even faster and more accurately. We observed unexpectedly low variation in splicing rates across introns in the same gene, suggesting the presence of gene-level influences, and we identified multiple gene level variables associated with splicing rate. Together our data suggest that developmental and stress response genes may have preferentially evolved exon definition in order to enhance the rate or accuracy of splicing.

  1. Continuous Mass Measurement on Conveyor Belt

    NASA Astrophysics Data System (ADS)

    Tomobe, Yuki; Tasaki, Ryosuke; Yamazaki, Takanori; Ohnishi, Hideo; Kobayashi, Masaaki; Kurosu, Shigeru

    The continuous mass measurement of packages on a conveyor belt will become increasingly important. In such mass measurement, the sequence of products is generally random. An interesting possibility for raising the throughput of the conveyor line without increasing the conveyor belt speed is offered by the use of two or three conveyor belt scales (called a multi-stage conveyor belt scale). A multi-stage conveyor belt scale can be built that adjusts the conveyor belt length to the product length. The conveyor belt scale usually has maximum capacities of less than 80 kg and 140 cm, and achieves measuring rates of more than 150 packages per minute. The output signals from the conveyor belt scale are always contaminated with noise due to vibrations of the conveyor and of the product being measured in motion. In this paper, the digital filter employed is of the finite impulse response (FIR) type, designed with the dynamics of the conveyor system taken into account. The experimental results on the conveyor belt scale suggest that the filtering algorithms are effective enough for practical applications to some extent.
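    For readers unfamiliar with FIR filtering, the general operation named in this abstract can be sketched as a direct-form convolution, y[n] = sum over k of h[k] * x[n - k]. The coefficients below are a plain moving average chosen only for illustration; the filter actually designed in the paper is not described in the abstract and is not reproduced here.

```python
def fir_filter(x, h):
    """Apply a causal FIR filter with coefficients h to the samples x.

    Samples before the start of the record are treated as zero
    (an assumption of this sketch), so the first len(h) - 1 outputs
    are start-up transients.
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, coeff in enumerate(h):
            if n - k >= 0:
                acc += coeff * x[n - k]
        y.append(acc)
    return y


# Example: a 4-tap moving average smooths a noisy load-cell reading.
taps = [0.25, 0.25, 0.25, 0.25]
smoothed = fir_filter([10.0, 10.4, 9.6, 10.2, 9.8, 10.0], taps)
```

    A longer filter suppresses more vibration noise but needs more samples per package, which is the kind of trade-off a conveyor-specific design would have to weigh.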

  2. Airfoil System for Cruising Flight

    NASA Technical Reports Server (NTRS)

    Shams, Qamar A. (Inventor); Liu, Tianshu (Inventor)

    2014-01-01

    An airfoil system includes an airfoil body and at least one flexible strip. The airfoil body has a top surface and a bottom surface, a chord length, a span, and a maximum thickness. Each flexible strip is attached along at least one edge thereof to either the top or bottom surface of the airfoil body. The flexible strip has a spanwise length that is a function of the airfoil body's span, a chordwise width that is a function of the airfoil body's chord length, and a thickness that is a function of the airfoil body's maximum thickness.

  3. Long-range correlations and charge transport properties of DNA sequences

    NASA Astrophysics Data System (ADS)

    Liu, Xiao-liang; Ren, Yi; Xie, Qiong-tao; Deng, Chao-sheng; Xu, Hui

    2010-04-01

    By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that λ-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5

  4. The correlation function for density perturbations in an expanding universe. I - Linear theory

    NASA Technical Reports Server (NTRS)

    Mcclelland, J.; Silk, J.

    1977-01-01

    The evolution of the two-point correlation function for adiabatic density perturbations in the early universe is studied. Analytical solutions are obtained for the evolution of linearized spherically symmetric adiabatic density perturbations and the two-point correlation function for these perturbations in the radiation-dominated portion of the early universe. The results are then extended to the regime after decoupling. It is found that: (1) adiabatic spherically symmetric perturbations comparable in scale with the maximum Jeans length would survive the radiation-dominated regime; (2) irregular fluctuations are smoothed out up to the scale of the maximum Jeans length in the radiation era, but regular fluctuations might survive on smaller scales; (3) in general, the only surviving structures for irregularly shaped adiabatic density perturbations of arbitrary but finite scale in the radiation regime are the size of or larger than the maximum Jeans length in that regime; (4) infinite plane waves with a wavelength smaller than the maximum Jeans length but larger than the critical dissipative damping scale could survive the radiation regime; and (5) black holes would also survive the radiation regime and might accrete sufficient mass after decoupling to nucleate the formation of galaxies.

  5. Universal sequence map (USM) of arbitrary discrete sequences

    PubMed Central

    2002-01-01

    Background: For over a decade the idea of representing biological sequences in a continuous coordinate space has maintained its appeal but not been fully realized. The basic idea is that any sequence of symbols may define trajectories in the continuous space conserving all its statistical properties. Ideally, such a representation would allow scale independent sequence analysis – without the context of fixed memory length. A simple example would be the ability to infer the homology between two sequences solely by comparing the coordinates of any two homologous units. Results: We have successfully identified such an iterative function for bijective mapping ψ of discrete sequences into objects of continuous state space that enable scale-independent sequence analysis. The technique, named Universal Sequence Mapping (USM), is applicable to sequences with an arbitrary length and arbitrary number of unique units and generates a representation where map distance estimates sequence similarity. The novel USM procedure is based on earlier work by these and other authors on the properties of Chaos Game Representation (CGR). The latter enables the representation of 4 unit type sequences (like DNA) as an order free Markov Chain transition table. The properties of USM are illustrated with test data and can be verified for other data by using the accompanying web-based tool: http://bioinformatics.musc.edu/~jonas/usm/. Conclusions: USM is shown to enable a statistical mechanics approach to sequence analysis. The scale independent representation frees sequence analysis from the need to assume a memory length in the investigation of syntactic rules. PMID:11895567

  6. Weight compensation characteristics of Armeo®Spring exoskeleton: implications for clinical practice and research.

    PubMed

    Perry, Bonnie E; Evans, Emily K; Stokic, Dobrivoje S

    2017-02-17

    Armeo®Spring exoskeleton is widely used for upper extremity rehabilitation; however, weight compensation provided by the device appears insufficiently characterized to fully utilize it in clinical and research settings. Weight compensation was quantified by measuring static force in the sagittal plane with a load cell attached to the elbow joint of Armeo®Spring. All upper spring settings were examined in 5° increments at the minimum, maximum, and two intermediate upper and lower module length settings, while keeping the lower spring at minimum. The same measurements were made for minimum upper spring setting and maximum lower spring setting at minimum and maximum module lengths. Weight compensation was plotted against upper module angles, and slope was analyzed for each condition. The Armeo®Spring design prompted defining the slack angle and exoskeleton balance angle, which, depending on spring and length settings, divide the operating range into different unloading and loading regions. Higher spring tensions and shorter module lengths provided greater unloading (≤6.32 kg of support). Weight compensation slope decreased faster with shorter length settings (minimum length = -0.082 ± 0.002 kg/°; maximum length = -0.046 ± 0.001 kg/°) independent of spring settings. Understanding the impact of different settings on the Armeo®Spring weight compensation should help define best clinical practice and improve fidelity of research.

  7. Self-assembly assisted polymerization (SAAP): approaching long multi-block copolymers with an ordered chain sequence and controllable block length.

    PubMed

    Wu, Chi; Xie, Zuowei; Zhang, Guangzhao; Zi, Guofu; Tu, Yingfeng; Yang, Yali; Cai, Ping; Nie, Ting

    2002-12-07

    A combination of polymer physics and synthetic chemistry has enabled us to develop self-assembly assisted polymerization (SAAP), leading to the preparation of long multi-block copolymers with an ordered chain sequence and controllable block lengths.

  8. Expression of a polyubiquitin promoter isolated from Gladiolus.

    PubMed

    Joung, Young Hee; Kamo, Kathryn

    2006-10-01

    A polyubiquitin promoter (GUBQ1) including its 5'UTR and intron was isolated from the floral monocot Gladiolus because high levels of expression could not be obtained using publicly available promoters isolated from either cereals or dicots. Sequencing of the promoter revealed highly conserved 5' and 3' intron splicing sites for the 1.234 kb intron. The coding sequence of the first two ubiquitin genes showed the highest homology (87 and 86%, respectively) to the ubiquitin genes of Nicotiana tabacum and Oryza sativa RUBQ2. Transient expression following gene gun bombardment showed that relative levels of GUS activity with the GUBQ1 promoter were comparable to the CaMV 35S promoter in gladiolus, tobacco, rose, rice, and the floral monocot freesia. The highest levels of GUS expression with GUBQ1 were attained with Gladiolus. The full-length GUBQ1 promoter including 5'UTR and intron were necessary for maximum GUS expression in Gladiolus. The relative GUS activity for the promoter only was 9%, and the activity for the promoter with 5'UTR and 399 bp of the full-length 1.234 kb intron was 41%. Arabidopsis plants transformed with uidA under GUBQ1 showed moderate GUS expression throughout young leaves and in the vasculature of older leaves. The highest levels of transient GUS expression in Gladiolus have been achieved using the GUBQ1 promoter. This promoter should be useful for genetic engineering of disease resistance in Gladiolus, rose, and freesia, where high levels of gene expression are important.

  9. Genome Survey Sequencing for the Characterization of the Genetic Background of Rosa roxburghii Tratt and Leaf Ascorbate Metabolism Genes.

    PubMed

    Lu, Min; An, Huaming; Li, Liangliang

    2016-01-01

    Rosa roxburghii Tratt is an important commercial horticultural crop in China that is recognized for its nutritional and medicinal values. In spite of its economic significance, genomic information on this rose species is currently unavailable. In the present research, a genome survey of R. roxburghii was carried out using next-generation sequencing (NGS) technologies. A total of 30.29 Gb of sequence data was obtained by HiSeq 2500 sequencing, and the estimated genome size of R. roxburghii was 480.97 Mb, with a guanine plus cytosine (GC) content calculated to be 38.63%. All of these reads were assembled, yielding a total of 627,554 contigs with an N50 length of 1.484 kb and, furthermore, 335,902 scaffolds with a total length of 409.36 Mb. Transposable element (TE) sequences of 90.84 Mb, which comprised 29.20% of the genome, and 167,859 simple sequence repeats (SSRs) were identified from the scaffolds. Among these, the mono- (66.30%), di- (25.67%), and tri- (6.64%) nucleotide repeats contributed nearly 99% of the SSRs, and sequence motifs AG/CT (28.81%) and GAA/TTC (14.76%) were the most abundant among the dinucleotide and trinucleotide repeat motifs, respectively. Genome analysis predicted a total of 22,721 genes with an average length of 2311.52 bp, an average exon length of 228.15 bp, and an average intron length of 401.18 bp. Eleven genes putatively involved in ascorbate metabolism were identified and their expression in R. roxburghii leaves was validated by quantitative real-time PCR (qRT-PCR). This is the first report of genome-wide characterization of this rose species.
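    The N50 length quoted in this abstract is a standard assembly statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length. A minimal sketch follows, with a hypothetical function name and toy contig lengths that are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example with five contigs (lengths in bp).
toy_n50 = n50([2000, 3000, 4000, 5000, 6000])
```

    In the toy example the two longest contigs (6000 + 5000 bp) cover more than half of the 20,000 bp total, so the N50 is 5000 bp.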

  10. Effect of sampling rate and record length on the determination of stability and control derivatives

    NASA Technical Reports Server (NTRS)

    Brenner, M. J.; Iliff, K. W.; Whitman, R. K.

    1978-01-01

    Flight data from five aircraft were used to assess the effects of sampling rate and record length reductions on estimates of stability and control derivatives produced by a maximum likelihood estimation method. Derivatives could be extracted from flight data with the maximum likelihood estimation method even if there were considerable reductions in sampling rate and/or record length. Small amplitude pulse maneuvers showed greater degradation of the derivative estimates than large amplitude pulse maneuvers when these reductions were made. Reducing the sampling rate was found to be more desirable than reducing the record length as a method of lessening the total computation time required without greatly degrading the quality of the estimates.

  11. Facies analysis and sequence stratigraphic framework of upper Campanian strata (Neslen and Mount Garfield formations, Bluecastle Tongue of the Castlegate sandstone, and Mancos shale), Eastern Book cliffs, Colorado and Utah

    USGS Publications Warehouse

    Kirschbaum, Mark A.; Hettinger, Robert D.

    2004-01-01

    Facies and sequence-stratigraphic analysis identifies six high-resolution sequences within upper Campanian strata across about 120 miles of the Book Cliffs in western Colorado and eastern Utah. The six sequences are named after prominent sandstone units and include, in ascending order, upper Sego sequence, Neslen sequence, Corcoran sequence, Buck Canyon/lower Cozzette sequence, upper Cozzette sequence, and Cozzette/Rollins sequence. A seventh sequence, the Bluecastle sequence, is present in the extreme western part of the study area. Facies analysis documents deepening- and shallowing-upward successions, parasequence stacking patterns, downlap in subsurface cross sections, facies dislocations, basinward shifts in facies, and truncation of strata. All six sequences display major incision into shoreface deposits of the Sego Sandstone and sandstones of the Corcoran and Cozzette Members of the Mount Garfield Formation. The incised surfaces represent sequence-boundary unconformities that allowed bypass of sediment to lowstand shorelines that are either attached to the older highstand shorelines or are detached from the older highstand shorelines and located southeast of the main study area. The sequence boundary unconformities represent valley incisions that were cut during successive lowstands of relative sea level. The overlying valley-fill deposits generally consist of tidally influenced strata deposited during an overall base level rise. Transgressive surfaces can be traced or projected over, or locally into, estuarine deposits above and landward of their associated shoreface deposits. Maximum flooding surfaces can be traced or projected landward from offshore strata into, or above, coastal-plain deposits. With the exception of the Cozzette/Rollins sequence, the majority of coal-bearing coastal-plain strata was deposited before maximum flooding and is therefore within the transgressive systems tracts. Maximum flooding was followed by strong progradation of parasequences and low preservation potential of coastal-plain strata within the highstand systems tract. The large incised valleys, lack of transgressive retrogradational parasequences, strong progradational nature of highstand parasequences, and low preservation of coastal-plain strata in the highstand systems tracts argue for relatively low accommodation space during deposition of the Sego, Corcoran, and Cozzette sequences. The Buck Canyon/Cozzette and Cozzette/Rollins sequences contrast with other sequences in that the preservation of retrogradational parasequences and the development of large estuaries coincident with maximum flooding indicate a relative increase in accommodation space during deposition of these strata. Following maximum flooding, the Buck Canyon/Cozzette sequence follows the pattern of the other sequences, but the Cozzette/Rollins sequence exhibits a contrasting offlapping pattern with development of offshore clinoforms that downlap and eventually parallel its maximum flooding surface. This highstand systems tract preserves a thick coal-bearing section where the Rollins Sandstone Member of the Mount Garfield Formation parasequences prograde out of the study area, stepping up as much as 800 ft stratigraphically over a distance of about 90 miles. This progradational stacking pattern indicates a higher accommodation space and increased sedimentation rate compared to the previous sequences.

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Heng, E-mail: hengli@mdanderson.org; Zhu, X. Ronald; Zhang, Xiaodong

    Purpose: To develop and validate a novel delivery strategy for reducing the respiratory motion–induced dose uncertainty of spot-scanning proton therapy. Methods and Materials: The spot delivery sequence was optimized to reduce dose uncertainty. The effectiveness of the delivery sequence optimization was evaluated using measurements and patient simulation. One hundred ninety-one 2-dimensional measurements using different delivery sequences of a single-layer uniform pattern were obtained with a detector array on a 1-dimensional moving platform. Intensity modulated proton therapy plans were generated for 10 lung cancer patients, and dose uncertainties for different delivery sequences were evaluated by simulation. Results: Without delivery sequence optimization, the maximum absolute dose error can be up to 97.2% in a single measurement, whereas the optimized delivery sequence results in a maximum absolute dose error of ≤11.8%. In patient simulation, the optimized delivery sequence reduces the mean of fractional maximum absolute dose error compared with the regular delivery sequence by 3.3% to 10.6% (32.5-68.0% relative reduction) for different patients. Conclusions: Optimizing the delivery sequence can reduce dose uncertainty due to respiratory motion in spot-scanning proton therapy, assuming the 4-dimensional CT is a true representation of the patients' breathing patterns.

  13. Use of the LUS in sequence allele designations to facilitate probabilistic genotyping of NGS-based STR typing results.

    PubMed

    Just, Rebecca S; Irwin, Jodi A

    2018-05-01

    Some of the expected advantages of next generation sequencing (NGS) for short tandem repeat (STR) typing include enhanced mixture detection and genotype resolution via sequence variation among non-homologous alleles of the same length. However, at the same time that NGS methods for forensic DNA typing have advanced in recent years, many caseworking laboratories have implemented or are transitioning to probabilistic genotyping to assist the interpretation of complex autosomal STR typing results. Current probabilistic software programs are designed for length-based data, and were not intended to accommodate sequence strings as the product input. Yet to leverage the benefits of NGS for enhanced genotyping and mixture deconvolution, the sequence variation among same-length products must be utilized in some form. Here, we propose use of the longest uninterrupted stretch (LUS) in allele designations as a simple method to represent sequence variation within the STR repeat regions and facilitate, in the near term, probabilistic interpretation of NGS-based typing results. An examination of published population data indicated that a reference LUS region is straightforward to define for most autosomal STR loci, and that using repeat unit plus LUS length as the allele designator can represent greater than 80% of the alleles detected by sequencing. A proof of concept study performed using freely available probabilistic software demonstrated that the LUS length can be used in allele designations when a program does not require alleles to be integers, and that utilizing sequence information improves interpretation of both single-source and mixed contributor STR typing results as compared to using repeat unit information alone. The LUS concept for allele designation maintains the repeat-based allele nomenclature that will permit backward compatibility to extant STR databases, and the LUS lengths themselves will be concordant regardless of the NGS assay or analysis tools employed. Further, these biologically based, easy-to-derive designations uphold clear relationships between parent alleles and their stutter products, enabling analysis in fully continuous probabilistic programs that model stutter while avoiding the algorithmic complexities that come with string based searches. Though using repeat unit plus LUS length as the allele designator does not capture variation that occurs outside of the core repeat regions, this straightforward approach would permit the large majority of known STR sequence variation to be used for mixture deconvolution and, in turn, result in more informative mixture statistics in the near term. Ultimately, the method could bridge the gap from current length-based probabilistic systems to facilitate broader adoption of NGS by forensic DNA testing laboratories. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
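The longest uninterrupted stretch described in this record can be illustrated with a short sketch. It assumes the LUS is the largest number of back-to-back copies of a single repeat motif within the repeat region; the function name, the `AGAT` motif, and the toy sequence are hypothetical and are not taken from the paper, which defines a reference LUS region per locus.

```python
def longest_uninterrupted_stretch(sequence, motif):
    """Return the largest count of consecutive, back-to-back copies of
    `motif` found anywhere in `sequence` (0 if the motif never occurs)."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    k = len(motif)
    best = 0
    for start in range(len(sequence) - k + 1):
        run = 0
        pos = start
        # Count how many whole copies of the motif follow one another.
        while sequence[pos:pos + k] == motif:
            run += 1
            pos += k
        best = max(best, run)
    return best


# Toy repeat region: three AGAT units, one interrupting AGAC, five AGAT units.
toy_region = "AGAT" * 3 + "AGAC" + "AGAT" * 5
total_units = len(toy_region) // 4          # length-based allele: 9 units
lus = longest_uninterrupted_stretch(toy_region, "AGAT")
print(total_units, lus)
```

Two alleles with the same total number of repeat units can differ in LUS, which is how the designation carries sequence information while keeping the length-based repeat count.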

  14. Polypeptide having or assisting in carbohydrate material degrading activity and uses thereof

    DOEpatents

    Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Los, Alrik Pieter

    2016-02-16

    The invention relates to a polypeptide which comprises the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 76% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 76% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.

  15. Polypeptide having beta-glucosidase activity and uses thereof

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schoonneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; De Jong, Rene Marcel

    The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.

  16. Polypeptide having swollenin activity and uses thereof

    DOEpatents

    Schoonneveld-Bergmans, Margot Elizabeth Francoise; Heijne, Wilbert Herman Marie; Vlasie, Monica D; Damveld, Robbertus Antonius

    2015-11-04

    The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.

  17. Polypeptide having beta-glucosidase activity and uses thereof

    DOEpatents

    Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; De Jong, Rene Marcel; Damveld, Robbertus Antonius

    2015-09-01

    The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 70% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 70% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.

  18. Polypeptide having cellobiohydrolase activity and uses thereof

    DOEpatents

    Sagt, Cornelis Maria Jacobus; Schooneveld-Bergmans, Margot Elisabeth Francoise; Roubos, Johannes Andries; Los, Alrik Pieter

    2015-09-15

    The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 93% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 93% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.

  19. Polypeptide having acetyl xylan esterase activity and uses thereof

    DOEpatents

    Schoonneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Los, Alrik Pieter

    2015-10-20

    The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 82% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 82% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.

  20. Polypeptide having carbohydrate degrading activity and uses thereof

    DOEpatents

    Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Vlasie, Monica Diana; Damveld, Robbertus Antonius

    2015-08-18

    The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.

  1. Final progress report, Construction of a genome-wide highly characterized clone resource for genome sequencing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nierman, William C.

    At TIGR, human Bacterial Artificial Chromosome (BAC) end sequencing and trimming were performed with an overall sequencing success rate of 65%. CalTech human BAC libraries A, B, C and D as well as Roswell Park Cancer Institute's library RPCI-11 were used. To date, we have generated >300,000 end sequences from >186,000 human BAC clones with an average read length of ~460 bp, for a total of 141 Mb covering ~4.7% of the genome. Over sixty percent of the clones have BAC end sequences (BESs) from both ends, representing over five-fold coverage of the genome by the paired-end clones. The average phred Q20 length is ~400 bp. This high accuracy makes our BESs match the human finished sequences with an average identity of 99% and a match length of 450 bp, and a frequency of one match per 12.8 kb of contig sequence. Our sample tracking has ensured a clone tracking accuracy of >90%, which gives researchers high confidence in (1) retrieving the right clone from the BAC libraries based on the sequence matches; and (2) building a minimum tiling path of sequence-ready clones across the genome and genome assembly scaffolds.
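The totals quoted in this record can be cross-checked with simple arithmetic. The sketch below treats the ">300,000" bound and the approximate read length as exact values, which is an assumption, so the results are rough consistency checks only; the function names are illustrative.

```python
def total_megabases(n_reads, mean_read_bp):
    """Total sequenced bases, in megabases, for n_reads of a given mean length."""
    return n_reads * mean_read_bp / 1_000_000


def implied_genome_mb(total_mb, covered_fraction):
    """Genome size implied by a total yield covering a given fraction of it."""
    return total_mb / covered_fraction


# Figures as quoted in the abstract (bounds taken as exact, an assumption).
lower_bound_mb = total_megabases(300_000, 460)
genome_mb = implied_genome_mb(141, 0.047)

print(lower_bound_mb)        # lower bound on yield, to compare with the reported 141 Mb
print(round(genome_mb))      # genome size implied by 141 Mb at 4.7% coverage
```

The lower bound of 138 Mb is consistent with the reported 141 Mb given that more than 300,000 end sequences were generated, and 141 Mb at 4.7% corresponds to a genome of roughly 3,000 Mb.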

  2. Predicted stem-loop structures and variation in nucleotide sequence of 3' noncoding regions among animal calicivirus genomes.

    PubMed

    Seal, B S; Neill, J D; Ridpath, J F

    1994-07-01

    Caliciviruses are nonenveloped with a polyadenylated genome of approximately 7.6 kb and a single capsid protein. The "RNA Fold" computer program was used to analyze 3'-terminal noncoding sequences of five feline calicivirus (FCV), rabbit hemorrhagic disease virus (RHDV), and two San Miguel sea lion virus (SMSV) isolates. The FCV 3'-terminal sequences are 40-46 nucleotides in length and 72-91% similar. The FCV sequences were predicted to contain two possible duplex structures and one stem-loop structure with free energies of -2.1 to -18.2 kcal/mole. The RHDV genomic 3'-terminal RNA sequences are 54 nucleotides in length and share 49% sequence similarity to homologous regions of the FCV genome. The RHDV sequence was predicted to form two duplex structures in the 3'-terminal noncoding region with a single stem-loop structure, resembling that of FCV. In contrast, the SMSV 1 and 4 genomic 3'-terminal noncoding sequences were 185 and 182 nucleotides in length, respectively. Ten possible duplex structures were predicted with an average structural free energy of -35 kcal/mole. Sequence similarity between the two SMSV isolates was 75%. Furthermore, extensive cloverleaflike structures are predicted in the 3' noncoding region of the SMSV genome, in contrast to the predicted single stem-loop structures of FCV or RHDV.

  3. Development of flow in a square mini-channel: Effect of flow oscillation

    NASA Astrophysics Data System (ADS)

    Lobo, Oswald Jason; Chatterjee, Dhiman

    2018-04-01

    In this research paper, we present a numerical prediction of steady and fully oscillatory flows in a square mini-channel connected between two plenums. Flow separation occurs at the contraction of the plenum into the channel which causes an asymmetry in the development of flow in the entrance region. The entrance length and recirculation length are found, for both steady and fully oscillatory flows. It is shown that the maximum entrance length decreases with an increase in the oscillating frequency while the maximum recirculation length and recirculation area increase with an increase in oscillating frequency. The phase of a velocity signal is shown to be a strong function of its location. The phase difference between the velocities with respect to the different points along the centerline and those at the middle of the channel show a significant dependence on the driving frequency. There is a significant variation in the phase angles of the velocity signals computed between a point near the wall and that at the centerline. This phase difference decreases along the channel length and does not change beyond the entrance length. This feature can then be used to determine the maximum entrance length, which is otherwise problematic to ascertain in the case of fully oscillatory flows. The entrance length, thus obtained, is compared with that obtained from the velocity profile consideration and shows good similarity. The phase difference between pressure and velocity is also brought out in this work.

  4. Characterization of copper and nichrome wires for safety fuse

    NASA Astrophysics Data System (ADS)

    Murdani, E.

    2016-11-01

    A fuse is an important component of an electrical circuit that limits the current through the circuit to protect electrical equipment. Safety fuses are made of a conductor such as copper or nichrome wire. The aim of this research was to determine the maximum current that can flow in copper and nichrome conductor wires. In the experiment, the length (0.2 cm to 20 cm) and diameter (0.1, 0.2, 0.3, 0.4 and 0.5 mm) of the copper and nichrome wires were varied, and the maximum current was taken as the current at which the wire melted or broke. The experiment yields the dependence of the maximum current on the length and diameter of the wires. All data are plotted as a standard curve. The standard curve provides a basis for choosing a replacement fuse wire according to the maximum current requirement, including the wire type (copper or nichrome) and wire dimensions (length and diameter).

  5. Influence of spatial and temporal spot distribution on the ocular surface quality and maximum ablation depth after photoablation with a 1050 Hz excimer laser system.

    PubMed

    Mrochen, Michael; Schelling, Urs; Wuellner, Christian; Donitzky, Christof

    2009-02-01

    To investigate the effect of temporal and spatial distributions of laser spots (scan sequences) on the corneal surface quality after ablation and the maximum ablation of a given refractive correction after photoablation with a high-repetition-rate scanning-spot laser. IROC AG, Zurich, Switzerland, and WaveLight AG, Erlangen, Germany. Bovine corneas and poly(methyl methacrylate) (PMMA) plates were photoablated using a 1050 Hz excimer laser prototype for corneal laser surgery. Four temporal and spatial spot distributions (scan sequences) with different temporal overlapping factors were created for 3 myopic, 3 hyperopic, and 3 phototherapeutic keratectomy ablation profiles. Surface quality and maximum ablation depth were measured using a surface profiling system. The surface quality factor increased (rough surfaces) as the amount of temporal overlapping in the scan sequence and the amount of correction increased. The rise in surface quality factor was less for bovine corneas than for PMMA. The scan sequence might cause systematic substructures at the surface of the ablated material depending on the overlapping factor. The maximum ablation varied within the scan sequence. The temporal and spatial distribution of the laser spots (scan sequence) during a corneal laser procedure affected the surface quality and maximum ablation depth of the ablation profile. Corneal laser surgery could theoretically benefit from smaller spot sizes and higher repetition rates. The temporal and spatial spot distributions are relevant to achieving these aims.

  6. Large-Scale Collection and Analysis of Full-Length cDNAs from Brachypodium distachyon and Integration with Pooideae Sequence Resources

    PubMed Central

    Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Takahashi, Fuminori; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo

    2013-01-01

    A comprehensive collection of full-length cDNAs is essential for correct structural gene annotation and functional analyses of genes. We constructed a mixed full-length cDNA library from 21 different tissues of Brachypodium distachyon Bd21, and obtained 78,163 high quality expressed sequence tags (ESTs) from both ends of ca. 40,000 clones (including 16,079 contigs). We updated gene structure annotations of Brachypodium genes based on full-length cDNA sequences in comparison with the latest publicly available annotations. About 10,000 non-redundant gene models were supported by full-length cDNAs; ca. 6,000 showed some transcription unit modifications. We also found ca. 580 novel gene models, including 362 newly identified in Bd21. Using the updated transcription start sites, we searched a total of 580 plant cis-motifs in the −3 kb promoter regions and determined a genome-wide Brachypodium promoter architecture. Furthermore, we integrated the Brachypodium full-length cDNAs and updated gene structures with available sequence resources in wheat and barley in a web-accessible database, the RIKEN Brachypodium FL cDNA database. The database represents a “one-stop” information resource for all genomic information in the Pooideae, facilitating functional analysis of genes in this model grass plant and seamless knowledge transfer to the Triticeae crops. PMID:24130698

  7. Comparing K-mer based methods for improved classification of 16S sequences.

    PubMed

    Vinje, Hilde; Liland, Kristian Hovde; Almøy, Trygve; Snipen, Lars

    2015-07-01

    The need for precise and stable taxonomic classification is highly relevant in modern microbiology. Parallel to the explosion in the amount of sequence data accessible, there has also been a shift in focus for classification methods. Previously, alignment-based methods were the most applicable tools. Now, methods based on counting K-mers by sliding windows are the most interesting classification approach with respect to both speed and accuracy. Here, we present a systematic comparison of five different K-mer based classification methods for the 16S rRNA gene. The methods differ from each other both in data usage and modelling strategies. We have based our study on the commonly known and well-used naïve Bayes classifier from the RDP project, and four other methods were implemented and tested on two different data sets, on full-length sequences as well as fragments of typical read-length. The differences in classification error between the methods were small, but they were stable across both data sets tested. The Preprocessed nearest-neighbour (PLSNN) method performed best for full-length 16S rRNA sequences, significantly better than the naïve Bayes RDP method. On fragmented sequences the naïve Bayes Multinomial method performed best, significantly better than all other methods. For both data sets explored, and on both full-length and fragmented sequences, all five methods reached an error plateau. We conclude that no K-mer based method is universally best for classifying both full-length sequences and fragments (reads). All methods approach an error plateau, indicating that improved training data is needed to improve classification from here. Classification errors occur most frequently for genera with few sequences present. For improving the taxonomy and testing new classification methods, a better, more universal and more robust training data set is crucial.
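The K-mer counting by sliding window that these classifiers build on can be sketched in a few lines. This is only the feature-extraction step under simple assumptions (overlapping windows advanced one base at a time, no handling of ambiguous bases); it is not any of the five classifiers compared in the study, and the function name and toy sequence are illustrative.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping K-mer in `sequence` using a sliding window
    of width k that advances one position at a time."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Toy fragment (hypothetical, not a real 16S read).
counts = kmer_counts("ACGTACG", 3)
for kmer, n in sorted(counts.items()):
    print(kmer, n)
```

A sequence of length n yields n - k + 1 windows, so the resulting count vector has a fixed dimension of at most 4^k entries regardless of read length, which is what makes the approach alignment-free and fast.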

  8. The complete chloroplast genome sequence of Mahonia bealei (Berberidaceae) reveals a significant expansion of the inverted repeat and phylogenetic relationship with other angiosperms.

    PubMed

    Ma, Ji; Yang, Bingxian; Zhu, Wei; Sun, Lianli; Tian, Jingkui; Wang, Xumin

    2013-10-10

    Mahonia bealei (Berberidaceae) is a frequently-used traditional Chinese medicinal plant with efficient anti-inflammatory ability. This plant is one of the sources of berberine, a new cholesterol-lowering drug with anti-diabetic activity. We have sequenced the complete nucleotide sequence of the chloroplast (cp) genome of M. bealei. The complete cp genome of M. bealei is 164,792 bp in length, and has a typical structure with large (LSC 73,052 bp) and small (SSC 18,591 bp) single-copy regions separated by a pair of inverted repeats (IRs 36,501 bp) of large size. The Mahonia cp genome contains 111 unique genes and 39 genes are duplicated in the IR regions. The gene order and content of M. bealei are almost unrearranged, which is consistent with the hypothesis that large IRs stabilize the cp genome and reduce gene loss-and-gain probabilities during the evolutionary process. A large IR expansion of over 12 kb has occurred in M. bealei, and 15 genes (rps19, rpl22, rps3, rpl16, rpl14, rps8, infA, rpl36, rps11, petD, petB, psbH, psbN, psbT and psbB) have expanded to have an additional copy in the IRs. The IR expansion rearrangement occurred via a double-strand DNA break and subsequent repair, which is different from the ordinary gene conversion mechanism. Repeat analysis identified 39 direct/inverted repeats 30 bp or longer with a sequence identity ≥ 90%. Analysis also revealed 75 simple sequence repeat (SSR) loci and almost all are composed of A or T, contributing to a distinct bias in base composition. Comparison of protein-coding sequences with ESTs reveals 9 putative RNA edits and 5 of them resulted in non-synonymous modifications in rpoC1, rps2, rps19 and ycf1. Phylogenetic analysis using maximum parsimony (MP) and maximum likelihood (ML) was performed on a dataset composed of 65 protein-coding genes from 25 taxa, which yielded a tree topology identical to previous plastid-based trees and provides strong support for the sister relationship between Ranunculaceae and Berberidaceae. Molecular dating analyses suggest that Ranunculaceae and Berberidaceae diverged between 90 and 84 mya, which is congruent with the fossil records and with recent estimates of the divergence time of these two taxa. © 2013.

  9. The complete chloroplast genome sequence of Gossypium hirsutum: organization and phylogenetic relationships to other angiosperms

    PubMed Central

    Lee, Seung-Bum; Kaittanis, Charalambos; Jansen, Robert K; Hostetler, Jessica B; Tallon, Luke J; Town, Christopher D; Daniell, Henry

    2006-01-01

    Background: Cotton (Gossypium hirsutum) is the most important fiber crop grown in 90 countries. In 2004–2005, US farmers planted 79% of the 5.7-million hectares of nuclear transgenic cotton. Unfortunately, genetically modified cotton has the potential to hybridize with other cultivated and wild relatives, resulting in geographical restrictions to cultivation. However, chloroplast genetic engineering offers the possibility of containment because of maternal inheritance of transgenes. The complete chloroplast genome of cotton provides essential information required for genetic engineering. In addition, the sequence data were used to assess phylogenetic relationships among the major clades of rosids using cotton and 25 other completely sequenced angiosperm chloroplast genomes. Results: The complete cotton chloroplast genome is 160,301 bp in length, with 112 unique genes and 19 duplicated genes within the IR, containing a total of 131 genes. There are four ribosomal RNAs, 30 distinct tRNA genes and 17 intron-containing genes. The gene order in cotton is identical to that of tobacco but lacks rpl22 and infA. There are 30 direct and 24 inverted repeats 30 bp or longer with a sequence identity ≥ 90%. Most of the direct repeats are within intergenic spacer regions and introns, and a 72 bp-long direct repeat is within the psaA and psaB genes. Comparison of protein coding sequences with expressed sequence tags (ESTs) revealed nucleotide substitutions resulting in amino acid changes in ndhC, rpl23, rpl20, rps3 and clpP. Phylogenetic analysis of a data set including 61 protein-coding genes using both maximum likelihood and maximum parsimony was performed for 28 taxa, including cotton and five other angiosperm chloroplast genomes that were not included in any previous phylogenies. Conclusion: The cotton chloroplast genome lacks rpl22 and infA and contains a number of dispersed direct and inverted repeats. RNA editing resulted in amino acid changes with significant impact on their hydropathy. Phylogenetic analysis provides strong support for the position of cotton in the Malvales in the eurosids II clade sister to Arabidopsis in the Brassicales. Furthermore, there is strong support for the placement of the Myrtales sister to the eurosid I clade, although expanded taxon sampling is needed to further test this relationship. PMID:16553962

  10. Cosmological horizons, uncertainty principle, and maximum length quantum mechanics

    NASA Astrophysics Data System (ADS)

    Perivolaropoulos, L.

    2017-05-01

    The cosmological particle horizon is the maximum measurable length in the Universe. The existence of such a maximum observable length scale implies a modification of the quantum uncertainty principle. Thus due to nonlocality of quantum mechanics, the global properties of the Universe could produce a signature on the behavior of local quantum systems. A generalized uncertainty principle (GUP) that is consistent with the existence of such a maximum observable length scale $l_{max}$ is $\Delta x \Delta p \geq \frac{\hbar}{2}\frac{1}{1-\alpha \Delta x^{2}}$, where $\alpha = l_{max}^{-2} \simeq (H_0/c)^{2}$ ($H_0$ is the Hubble parameter and $c$ is the speed of light). In addition to the existence of a maximum measurable length $l_{max} = 1/\sqrt{\alpha}$, this form of GUP implies also the existence of a minimum measurable momentum $p_{min} = \frac{3\sqrt{3}}{4}\hbar\sqrt{\alpha}$. Using appropriate representation of the position and momentum quantum operators we show that the spectrum of the one-dimensional harmonic oscillator becomes $\bar{E}_n = 2n + 1 + \lambda_n \bar{\alpha}$, where $\bar{E}_n \equiv 2E_n/\hbar\omega$ is the dimensionless properly normalized $n$th energy level, $\bar{\alpha}$ is a dimensionless parameter with $\bar{\alpha} \equiv \alpha\hbar/m\omega$ and $\lambda_n \sim n^{2}$ for $n \gg 1$ (we show the full form of $\lambda_n$ in the text). For a typical vibrating diatomic molecule and $l_{max} = c/H_0$ we find $\bar{\alpha} \sim 10^{-77}$ and therefore for such a system, this effect is beyond the reach of current experiments. However, this effect could be more important in the early Universe and could produce signatures in the primordial perturbation spectrum induced by quantum fluctuations of the inflaton field.
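The minimum momentum quoted in this abstract can be checked numerically. The sketch below is not taken from the paper: it assumes units with ħ = 1 and an arbitrary illustrative value of α, and the function names are hypothetical. It scans Δx over (0, l_max), evaluates the lower bound on Δp implied by the GUP, and compares the smallest value with the closed form (3√3/4) ħ √α.

```python
import math

def momentum_bound(dx, alpha, hbar=1.0):
    """Lower bound on dp from dx*dp >= (hbar/2) / (1 - alpha*dx^2), for 0 < dx < 1/sqrt(alpha)."""
    return hbar / (2.0 * dx * (1.0 - alpha * dx * dx))

def minimum_momentum(alpha, hbar=1.0, samples=200_000):
    """Scan dx on a uniform grid inside (0, l_max); return the smallest bound and its location."""
    l_max = 1.0 / math.sqrt(alpha)
    best_dp, best_dx = math.inf, None
    for i in range(1, samples):
        dx = l_max * i / samples
        dp = momentum_bound(dx, alpha, hbar)
        if dp < best_dp:
            best_dp, best_dx = dp, dx
    return best_dp, best_dx

alpha = 0.25  # assumed illustrative value (units with hbar = 1), not a physical estimate
p_min_numeric, dx_at_min = minimum_momentum(alpha)
p_min_closed = 3.0 * math.sqrt(3.0) / 4.0 * math.sqrt(alpha)
```

Under these assumptions the scan places the minimum at Δx = 1/√(3α), which is where the derivative of Δx(1 − αΔx²) vanishes, and the two values of the minimum momentum agree to within the grid resolution.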

  11. Employment of Near Full-Length Ribosome Gene TA-Cloning and Primer-Blast to Detect Multiple Species in a Natural Complex Microbial Community Using Species-Specific Primers Designed with Their Genome Sequences.

    PubMed

    Zhang, Huimin; He, Hongkui; Yu, Xiujuan; Xu, Zhaohui; Zhang, Zhizhou

    2016-11-01

    It remains an unsolved problem to quantify a natural microbial community by rapidly and conveniently measuring multiple species with functional significance. Most widely used high throughput next-generation sequencing methods can only generate information mainly for genus-level taxonomic identification and quantification, and detection of multiple species in a complex microbial community is still heavily dependent on approaches based on near full-length ribosome RNA gene or genome sequence information. In this study, we used near full-length rRNA gene library sequencing plus Primer-Blast to design species-specific primers based on whole microbial genome sequences. The primers were intended to be specific at the species level within relevant microbial communities, i.e., a defined genomics background. The primers were tested with samples collected from the Daqu (also called fermentation starters) and pit mud of a traditional Chinese liquor production plant. Sixteen pairs of primers were found to be suitable for identification of individual species. Among them, seven pairs were chosen to measure the abundance of microbial species through quantitative PCR. The combination of near full-length ribosome RNA gene library sequencing and Primer-Blast may represent a broadly useful protocol to quantify multiple species in complex microbial population samples with species-specific primers.

  12. The influence of sequence context and length on the kinetics of DNA duplex formation from complementary hairpins possessing (CNG) repeats.

    PubMed

    Paiva, Anthony M; Sheardy, Richard D

    2005-04-20

    The formation of unusual structures during DNA replication has been invoked for gene expansion in genomes possessing triplet repeat sequences, CNG, where N = A, C, G, or T. In particular, it has been suggested that the daughter strand of the leading strand partially dissociates from the parent strand and forms a hairpin. The equilibrium between the fully duplexed parent:daughter species and the parent:hairpin species is dependent upon their relative stabilities and the rates of reannealing of the daughter strand back to the parent. These stabilities and rates are ultimately influenced by the sequence context of the DNA and its length. Previous work has demonstrated that longer strands are more stable than shorter strands and that the identity of N also influences the thermal stability [Paiva, A. M.; Sheardy, R. D. Biochemistry 2004, 43, 14218-14227]. Here, we show that the rate of duplex formation from complementary hairpins is also sequence context and length dependent. In particular, longer duplexes have higher activation energies than shorter duplexes of the same sequence context. Further, [(CCG):(GGC)] duplexes have lower activation energies than corresponding [(CAG):(GTC)] duplexes of the same length. Hence, hairpins formed from long CNG sequences are more thermodynamically stable and have slower kinetics for reannealing to their complement than shorter analogues. Gene expansion can now be explained in terms of thermodynamics and kinetics.

  13. The association between the maximum step length test and the walking efficiency in children with cerebral palsy.

    PubMed

    Kimoto, Minoru; Okada, Kyoji; Sakamoto, Hitoshi; Kondou, Takanori

    2017-05-01

    [Purpose] Improving walking efficiency could be useful for reducing fatigue and extending the possible period of walking in children with cerebral palsy (CP). For this purpose, the current study compared conventional parameters of gross motor performance, step length, and cadence in the evaluation of walking efficiency in children with CP. [Subjects and Methods] Thirty-one children with CP (21 boys, 10 girls; mean age, 12.3 ± 2.7 years) participated. Parameters of gross motor performance, including the maximum step length (MSL), maximum side step length, step number, lateral step up number, and single leg standing time, were measured in both dominant and non-dominant sides. Spatio-temporal parameters of walking, including speed, step length, and cadence, were calculated. Total heart beat index (THBI), a parameter of walking efficiency, was also calculated from heartbeats and walking distance in 10 minutes of walking. To analyze the relationships between these parameters and the THBI, the coefficients of determination were calculated using stepwise analysis. [Results] The MSL of the dominant side best accounted for the THBI (R² = 0.759). [Conclusion] The MSL of the dominant side was the best explanatory parameter for walking efficiency in children with CP.

  14. Maximum Likelihood Estimations and EM Algorithms with Length-biased Data

    PubMed Central

    Qin, Jing; Ning, Jing; Liu, Hao; Shen, Yu

    2012-01-01

    SUMMARY Length-biased sampling has been well recognized in economics, industrial reliability, etiology applications, epidemiological, genetic and cancer screening studies. Length-biased right-censored data have a unique data structure different from traditional survival data. The nonparametric and semiparametric estimations and inference methods for traditional survival data are not directly applicable for length-biased right-censored data. We propose new expectation-maximization algorithms for estimations based on full likelihoods involving infinite dimensional parameters under three settings for length-biased data: estimating nonparametric distribution function, estimating nonparametric hazard function under an increasing failure rate constraint, and jointly estimating baseline hazards function and the covariate coefficients under the Cox proportional hazards model. Extensive empirical simulation studies show that the maximum likelihood estimators perform well with moderate sample sizes and lead to more efficient estimators compared to the estimating equation approaches. The proposed estimates are also more robust to various right-censoring mechanisms. We prove the strong consistency properties of the estimators, and establish the asymptotic normality of the semi-parametric maximum likelihood estimators under the Cox model using modern empirical processes theory. We apply the proposed methods to a prevalent cohort medical study. Supplemental materials are available online. PMID:22323840

  15. Wind tunnel investigation of three axisymmetric cowls of different lengths at Mach numbers from 0.60 to 0.92

    NASA Technical Reports Server (NTRS)

    Re, Richard J.; Abeyounis, William K.

    1993-01-01

    Pressure distributions on three inlets having different cowl lengths were obtained in the Langley 16-Foot Transonic Tunnel. The cowl diameter ratio (highlight diameter to maximum diameter) was 0.85 and the cowl length ratios (cowl length to maximum diameter) were 0.337, 0.439, and 0.547. The cowls had identical nondimensionalized (with respect to cowl length) external geometry and identical internal geometry. The internal contraction ratio (highlight area to throat area) was 1.250. The inlets had longitudinal rows of static pressure orifices on the top and bottom (external) surfaces and on the contraction (internal) and diffuser surfaces. The afterbody was cylindrical in shape, and its diameter was equal to the maximum diameter of the cowl. Depending on the cowl configuration and free-stream Mach number, the mass-flow ratio varied between 0.27 and 0.87 during the tests. Angle of attack varied from 0 to 4.1 deg at selected Mach numbers and mass-flow ratios, and the Reynolds number varied with the Mach number from 3.2 × 10^6 to 4.2 × 10^6 per foot.

  16. A retrotransposable element from the mosquito Anopheles gambiae.

    PubMed Central

    Besansky, N J

    1990-01-01

    A family of middle repetitive elements from the African malaria vector Anopheles gambiae is described. Approximately 100 copies of the element, designated T1Ag, are dispersed in the genome. Full-length elements are 4.6 kilobase pairs in length, but truncation of the 5' end is common. Nucleotide sequences of one full-length, two 5'-truncated, and two 5' ends of T1Ag elements were determined and aligned to define a consensus sequence. Sequence analysis revealed two long, overlapping open reading frames followed by a polyadenylation signal, AATAAA, and a tail consisting of tandem repetitions of the motif TGAAA. No direct or inverted long terminal repeats (LTRs) were detected. The first open reading frame, 442 amino acids in length, includes a domain resembling that of nucleic acid-binding proteins. The second open reading frame, 975 amino acids long, resembles the reverse transcriptases of a category of retrotransposable elements without LTRs, variously termed class II retrotransposons, class III elements or non-LTR retrotransposons. Similarity at the sequence and structural levels places T1Ag in this category. PMID:1689457

  17. Using specific length amplified fragment sequencing to construct the high-density genetic map for Vitis (Vitis vinifera L. × Vitis amurensis Rupr.).

    PubMed

    Guo, Yinshan; Shi, Guangli; Liu, Zhendong; Zhao, Yuhui; Yang, Xiaoxu; Zhu, Junchi; Li, Kun; Guo, Xiuwu

    2015-01-01

    In this study, 149 F1 plants from the interspecific cross between 'Red Globe' (Vitis vinifera L.) and 'Shuangyou' (Vitis amurensis Rupr.) and the parents were used to construct a molecular genetic linkage map by using the specific length amplified fragment sequencing technique. DNA sequencing generated 41.282 Gb data consisting of 206,411,693 paired-end reads. The average sequencing depths were 68.35 for 'Red Globe,' 63.65 for 'Shuangyou,' and 8.01 for each progeny. In all, 115,629 high-quality specific length amplified fragments were detected, of which 42,279 were polymorphic. The genetic map was constructed using 7,199 of these polymorphic markers. These polymorphic markers were assigned to 19 linkage groups; the total length of the map was 1929.13 cM, with an average distance of 0.28 cM between adjacent markers. To our knowledge, the genetic maps constructed in this study contain the largest number of molecular markers. These high-density genetic maps might form the basis for the fine quantitative trait loci mapping and molecular-assisted breeding of grape.

  18. Correlation between Reynolds number and eccentricity effect in stenosed artery models.

    PubMed

    Javadzadegan, Ashkan; Shimizu, Yasutomo; Behnia, Masud; Ohta, Makoto

    2013-01-01

    Flow recirculation and shear strain are physiological processes within coronary arteries which are associated with pathogenic biological pathways. Quite apart from coronary stenosis severity, lesion eccentricity can cause flow recirculation and affect shear strain levels within human coronary arteries. The aim of this study is to analyse the effect of lesion eccentricity on the transient flow behaviour in a model of a coronary artery and also to investigate the correlation between Reynolds number (Re) and the eccentricity effect on flow behaviour. A transient particle image velocimetry (PIV) experiment was implemented in two silicone based models with 70% diameter stenosis, one with eccentric stenosis and one with concentric stenosis. At different times throughout the flow cycle, the eccentric model was always associated with a greater recirculation zone length, maximum shear strain rate and maximum axial velocity; however, the highest and lowest impacts of eccentricity were on the recirculation zone length and maximum shear strain rate, respectively. Analysis of the results revealed a negative correlation between the Reynolds number (Re) and the eccentricity effect on maximum axial velocity, maximum shear strain rate and recirculation zone length. As Re increases, the eccentricity effect on the flow behaviour becomes negligible.

  19. Successful Recovery of Nuclear Protein-Coding Genes from Small Insects in Museums Using Illumina Sequencing.

    PubMed

    Kanda, Kojun; Pflug, James M; Sproul, John S; Dasenko, Mark A; Maddison, David R

    2015-01-01

    In this paper we explore high-throughput Illumina sequencing of nuclear protein-coding, ribosomal, and mitochondrial genes in small, dried insects stored in natural history collections. We sequenced one tenebrionid beetle and 12 carabid beetles ranging in size from 3.7 to 9.7 mm in length that have been stored in various museums for 4 to 84 years. Although we chose a number of old, small specimens for which we expected low sequence recovery, we successfully recovered at least some low-copy nuclear protein-coding genes from all specimens. For example, in one 56-year-old beetle, 4.4 mm in length, our de novo assembly recovered about 63% of approximately 41,900 nucleotides in a target suite of 67 nuclear protein-coding gene fragments, and 70% using a reference-based assembly. Even in the least successfully sequenced carabid specimen, reference-based assembly yielded fragments that were at least 50% of the target length for 34 of 67 nuclear protein-coding gene fragments. Exploration of alternative references for reference-based assembly revealed few signs of bias created by the reference. For all specimens we recovered almost complete copies of ribosomal and mitochondrial genes. We verified the general accuracy of the sequences through comparisons with sequences obtained from PCR and Sanger sequencing, including of conspecific, fresh specimens, and through phylogenetic analysis that tested the placement of sequences in predicted regions. A few possible inaccuracies in the sequences were detected, but these rarely affected the phylogenetic placement of the samples. 
Although our sample sizes are low, an exploratory regression study suggests that the dominant factor in predicting success at recovering nuclear protein-coding genes is a high number of Illumina reads, with success at PCR of COI and killing by immersion in ethanol being secondary factors; in analyses of only high-read samples, the primary significant explanatory variable was body length, with small beetles being more successfully sequenced.

  20. Successful Recovery of Nuclear Protein-Coding Genes from Small Insects in Museums Using Illumina Sequencing

    PubMed Central

    Dasenko, Mark A.

    2015-01-01

    In this paper we explore high-throughput Illumina sequencing of nuclear protein-coding, ribosomal, and mitochondrial genes in small, dried insects stored in natural history collections. We sequenced one tenebrionid beetle and 12 carabid beetles ranging in size from 3.7 to 9.7 mm in length that have been stored in various museums for 4 to 84 years. Although we chose a number of old, small specimens for which we expected low sequence recovery, we successfully recovered at least some low-copy nuclear protein-coding genes from all specimens. For example, in one 56-year-old beetle, 4.4 mm in length, our de novo assembly recovered about 63% of approximately 41,900 nucleotides in a target suite of 67 nuclear protein-coding gene fragments, and 70% using a reference-based assembly. Even in the least successfully sequenced carabid specimen, reference-based assembly yielded fragments that were at least 50% of the target length for 34 of 67 nuclear protein-coding gene fragments. Exploration of alternative references for reference-based assembly revealed few signs of bias created by the reference. For all specimens we recovered almost complete copies of ribosomal and mitochondrial genes. We verified the general accuracy of the sequences through comparisons with sequences obtained from PCR and Sanger sequencing, including of conspecific, fresh specimens, and through phylogenetic analysis that tested the placement of sequences in predicted regions. A few possible inaccuracies in the sequences were detected, but these rarely affected the phylogenetic placement of the samples. 
Although our sample sizes are low, an exploratory regression study suggests that the dominant factor in predicting success at recovering nuclear protein-coding genes is a high number of Illumina reads, with success at PCR of COI and killing by immersion in ethanol being secondary factors; in analyses of only high-read samples, the primary significant explanatory variable was body length, with small beetles being more successfully sequenced. PMID:26716693

  1. Characterization of four species of Trichuris (Nematoda: Enoplida) by their second internal transcribed spacer ribosomal DNA sequence.

    PubMed

    Oliveros, R; Cutillas, C; De Rojas, M; Arias, P

    2000-12-01

    Adult worms of Trichuris ovis and T. globulosa were collected from Ovis aries (sheep) and Capra hircus (goats). T. suis was isolated from Sus scrofa domestica (swine) and T. leporis was isolated from Lepus europaeus (rabbits) in Spain. Genomic DNA was isolated and a ribosomal internal transcribed spacer (ITS2) was amplified and sequenced using polymerase-chain-reaction (PCR) techniques. The ITS2 of T. ovis and T. globulosa was 407 nucleotides in length and had a GC content of about 62%. Furthermore, the ITS2 of T. suis and T. leporis was 534 and 418 nucleotides in length and had a GC content of about 64.8% and 62.4%, respectively. There was evidence of slight variation in the sequence within individuals of all species analyzed, indicating intraindividual variation in the sequence of different copies of the ribosomal DNA. Furthermore, low-level intraspecific variation was detected. Sequence analyses of ITS2 products of T. ovis and T. globulosa demonstrated no sequence difference between them. Nevertheless, differences were detected between the ITS2 sequences of T. suis, T. leporis, and T. ovis, indicating that Trichuris species can reliably be differentiated by their ITS2 sequences and PCR-linked restriction-fragment-length polymorphism (RFLP).

  2. Telomeres shorten more slowly in slow-aging wild animals than in fast-aging ones.

    PubMed

    Dantzer, Ben; Fletcher, Quinn E

    2015-11-01

    Research on the physiological causes of senescence aims to identify common physiological mechanisms that explain age-related declines in fitness across taxonomic groups. Telomeres are repetitive nucleotide sequences found on the ends of eukaryotic chromosomes. Past research indicates that telomere attrition is strongly correlated with inter-specific rates of aging, though these studies cannot distinguish whether telomere attrition is a cause or consequence of the aging process. We extend previous research on this topic by incorporating recent studies to test the hypothesis that telomeres shorten more slowly with age in slow-aging animals than in fast-aging ones. We assembled all studies that have quantified cross-sectional (i.e. between-individual) telomere rates of change (TROC) over the lifespans of wild animals. This included 22 estimates reflecting absolute TROC (TROCabs, bp/yr, primarily measured using the terminal restriction fragment length method), and 10 estimates reflecting relative TROC (TROCrel, relative telomere length/yr, measured using qPCR), from five classes (Aves, Mammalia, Bivalvia, Reptilia, and Actinopterygii). In 14 bird species, we correlated between-individual (i.e. cross-sectional) TROCabs estimates with both maximum lifespan and a phylogenetically-corrected principal component axis (pcPC1) that reflected the slow-fast axis of life-history variation. Bird species characterized by faster life-histories and shorter maximum lifespans had faster TROCabs. In nine studies, both between-individual and within-individual TROC estimates were available (n=8 for TROCabs, n=1 for TROCrel). Within-individual TROC estimates were generally greater than between-individual TROC estimates, which is indicative of selective disappearance of individuals with shorter telomeres. However, the difference between within- and between-individual TROC estimates was only significant in two out of nine studies.
The relationship between within-individual TROCabs and maximum lifespan did not differ from the relationship of between-individual TROCabs and maximum lifespan. Overall, our results provide additional support for the hypothesis that TROC is correlated with inter-specific rates of aging and complement the intra-specific research that also finds relationships between telomere attrition and components of fitness. Copyright © 2015 Elsevier Inc. All rights reserved.
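As a rough illustration of what a cross-sectional absolute TROC (bp/yr) represents, the sketch below fits an ordinary least-squares line of telomere length against age across individuals and reports the slope. This is an assumption about how such an estimate could be obtained, not the method of any study summarized in the abstract; the function name and the toy data are hypothetical.

```python
def telomere_rate_of_change(ages_years, telomere_lengths_bp):
    """Least-squares slope of telomere length (bp) on age (years), in bp/yr.

    A negative slope means telomeres are shorter in older individuals.
    """
    n = len(ages_years)
    if n < 2 or n != len(telomere_lengths_bp):
        raise ValueError("need two equally long lists with at least two samples")
    mean_age = sum(ages_years) / n
    mean_len = sum(telomere_lengths_bp) / n
    sxx = sum((a - mean_age) ** 2 for a in ages_years)
    if sxx == 0:
        raise ValueError("all ages are identical; slope is undefined")
    sxy = sum((a - mean_age) * (t - mean_len)
              for a, t in zip(ages_years, telomere_lengths_bp))
    return sxy / sxx

# Hypothetical toy data: one telomere measurement per individual of known age.
ages = [1, 2, 3, 4, 5]
lengths = [10000, 9900, 9800, 9700, 9600]
troc_abs = telomere_rate_of_change(ages, lengths)
```

A within-individual estimate would instead use repeated measurements of the same animals, which is why the two kinds of estimate can differ when individuals with short telomeres disappear from the population.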

  3. A Study on the Potential Cost Savings Associated with Implementing Airline Pilot Training Curricula into the Future P-8 MMA Fleet Replacement Squadron

    DTIC Science & Technology

    2006-06-01

    winglets : 35.81m Length: 38.56m Height: 12.83m Fuselage length: 38.02m Tailplane: 14.35m Maximum taxi weight: 83,778kg Maximum fuel...visual and aerodynamic handling deficiencies (by today’s standards) and are only capable of partially qualifying a VP-30 Cat I or Cat III pilot in

  4. Bioinformatic analysis of phage AB3, a phiKMV-like virus infecting Acinetobacter baumannii.

    PubMed

    Zhang, J; Liu, X; Li, X-J

    2015-01-16

    The phages of Acinetobacter baumannii have drawn increasing attention because of the multi-drug resistance of A. baumannii. The aim of this study was to sequence Acinetobacter baumannii phage AB3 and conduct bioinformatic analysis to lay a foundation for genome remodeling and phage therapy. We isolated and sequenced A. baumannii phage AB3 and attempted to annotate and analyze its genome. The results showed that the genome is a double-stranded DNA with a total length of 31,185 base pairs (bp) and 97 open reading frames greater than 100 bp. The genome includes 28 predicted genes, of which 24 are homologous to phage AB1. The entire coding sequence is located on the negative strand, representing 90.8% of the total length. The G+C mol% was 39.18%, without areas of high G+C content over 200 bp in length. No GC island, tRNA gene, or repeated sequence was identified. Gene lengths were 120-3099 bp, with an average of 1011 bp. Six genes were found to be greater than 2000 bp in length. Genomic alignment and phylogenetic analysis of the RNA polymerase gene showed that similar to phage AB1, phage AB3 is a phiKMV-like virus in the T7 phage family.
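The summary statistics in this abstract (G+C percentage and coding fraction) are simple ratios. The sketch below shows one way they could be computed; the function names are hypothetical, and the coding-fraction check assumes the 28 predicted genes do not overlap, which the abstract does not state.

```python
def gc_percent(sequence):
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a DNA sequence."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted

def coding_percent(genome_length_bp, gene_lengths_bp):
    """Percentage of the genome covered by genes, assuming the genes do not overlap."""
    return 100.0 * sum(gene_lengths_bp) / genome_length_bp

# Figures from the abstract: 28 genes averaging 1011 bp in a 31,185 bp genome.
coding = coding_percent(31185, [1011] * 28)
toy_gc = gc_percent("ATGCGC")  # toy sequence, not phage AB3 data
```

With the abstract's figures, 28 × 1011 bp = 28,308 bp, which is about 90.8% of 31,185 bp and matches the reported coding fraction.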

  5. Quasispecies Analyses of the HIV-1 Near-full-length Genome With Illumina MiSeq

    PubMed Central

    Ode, Hirotaka; Matsuda, Masakazu; Matsuoka, Kazuhiro; Hachiya, Atsuko; Hattori, Junko; Kito, Yumiko; Yokomaku, Yoshiyuki; Iwatani, Yasumasa; Sugiura, Wataru

    2015-01-01

    Human immunodeficiency virus type-1 (HIV-1) exhibits high between-host genetic diversity and within-host heterogeneity, recognized as quasispecies. Because HIV-1 quasispecies fluctuate in terms of multiple factors, such as antiretroviral exposure and host immunity, analyzing the HIV-1 genome is critical for selecting effective antiretroviral therapy and understanding within-host viral coevolution mechanisms. Here, to obtain HIV-1 genome sequence information that includes minority variants, we sought to develop a method for evaluating quasispecies throughout the HIV-1 near-full-length genome using the Illumina MiSeq benchtop deep sequencer. To ensure the reliability of minority mutation detection, we applied an analysis method of sequence read mapping onto a consensus sequence derived from de novo assembly followed by iterative mapping and subsequent unique error correction. Deep sequencing analyses of an HIV-1 clone showed that the analysis method reduced erroneous base prevalence below 1% in each sequence position and discarded only < 1% of all collected nucleotides, maximizing the usage of the collected genome sequences. Further, we designed primer sets to amplify the HIV-1 near-full-length genome from clinical plasma samples. Deep sequencing of 92 samples in combination with the primer sets and our analysis method provided sufficient coverage to identify >1%-frequency sequences throughout the genome. When we evaluated sequences of pol genes from 18 treatment-naïve patients' samples, the deep sequencing results were in agreement with Sanger sequencing and identified numerous additional minority mutations. The results suggest that our deep sequencing method would be suitable for identifying within-host viral population dynamics throughout the genome. PMID:26617593

  6. Divergence of the phytochrome gene family predates angiosperm evolution and suggests that Selaginella and Equisetum arose prior to Psilotum.

    PubMed

    Kolukisaoglu, H U; Marx, S; Wiegmann, C; Hanelt, S; Schneider-Poetsch, H A

    1995-09-01

    Thirty-two partial phytochrome sequences from algae, mosses, ferns, gymnosperms, and angiosperms (11 of them newly released ones from our laboratory) were analyzed by distance and character-state approaches (PHYLIP, TREECON, PAUP). In addition, 12 full-length sequences were analyzed. Despite low bootstrap values at individual internal nodes, the inferred trees (neighbor-joining, Fitch, maximum parsimony) generally showed similar branching orders consistent with other molecular data. Lower plants formed two distinct groups. One basal group consisted of Selaginella, Equisetum, and mosses; the other consisted of a monophyletic cluster of frond-bearing pteridophytes. Psilotum was a member of the latter group and hence perhaps was not, as sometimes suggested, a close relative of the first vascular plants. The results further suggest that phytochrome gene duplication giving rise to a- and b- and later to c-types may have taken place within seedfern genomes. Distance matrices dated the separation of mono- and dicotyledons back to about 260 million years before the present (Myr B.P.) and the separation of Metasequoia and Picea to a fossil record-compatible value of 230 Myr B.P. The Ephedra sequence clustered with the c- or a-type and Metasequoia and Picea sequences clustered with the b-type lineage. The "paleoherb" Nymphaea branched off from the c-type lineage prior to the divergence of mono- and dicotyledons on the a- and b-type branches. Sequences of Piper (another "paleoherb") created problems in that they branched off from different phytochrome lineages at nodes contradicting distance from the inferred trees' origin.

  7. Completion of full length genome sequence of novel avian paramyxovirus strain APMV/Shimane67 isolated from migratory wild geese in Japan.

    PubMed

    Yamamoto, Eiji; Ito, Toshihiro; Ito, Hiroshi

    2016-11-01

    The nucleotide sequences of nucleocapsid protein (N); phosphoprotein (P); matrix protein (M); hemagglutinin-neuraminidase (HN); and large polymerase protein (L) genes, 3'-end leader, 5'-end trailer and intergenic regions of the avian paramyxovirus (APMV) strain goose/Shimane/67/2000 (APMV/Shimane67) were determined. Together with previously reported data on the fusion protein (F) gene sequence [46], the determination of the genome sequence of APMV/Shimane67 has been completed in this study. The genome of APMV/Shimane67 is 16,146 nucleotides in length and contains six genes in the order of 3'-N-P-M-F-HN-L-5'. The features of the APMV/Shimane67 genome (e.g., nucleotide length of whole genome and each of the six genes, and predicted amino acid length of each of the six genes) were distinct from those of other APMV serotypes. Phylogenetic analysis indicated that although APMV/Shimane67 was grouped with APMV-1, -9 and -12, the evolutionary distance between APMV/Shimane67 and these viruses was longer than that observed between intra-serotype viruses. These results show that the genome sequence of APMV/Shimane67 contains specific characteristics and is distinguishable from other types of APMV.

  8. Reverse Transcription Errors and RNA-DNA Differences at Short Tandem Repeats.

    PubMed

    Fungtammasan, Arkarachai; Tomaszkiewicz, Marta; Campos-Sánchez, Rebeca; Eckert, Kristin A; DeGiorgio, Michael; Makova, Kateryna D

    2016-10-01

    Transcript variation has important implications for organismal function in health and disease. Most transcriptome studies focus on assessing variation in gene expression levels and isoform representation. Variation at the level of transcript sequence is caused by RNA editing and transcription errors, and leads to nongenetically encoded transcript variants, or RNA-DNA differences (RDDs). Such variation has been understudied, in part because its detection is obscured by reverse transcription (RT) and sequencing errors. It has only been evaluated for intertranscript base substitution differences. Here, we investigated transcript sequence variation for short tandem repeats (STRs). We developed the first maximum-likelihood estimator (MLE) to infer RT error and RDD rates, taking next generation sequencing error rates into account. Using the MLE, we empirically evaluated RT error and RDD rates for STRs in a large-scale DNA and RNA replicated sequencing experiment conducted in a primate species. The RT error rates increased exponentially with STR length and were biased toward expansions. The RDD rates were approximately 1 order of magnitude lower than the RT error rates. The RT error rates estimated with the MLE from a primate data set were concordant with those estimated with an independent method, barcoded RNA sequencing, from a Caenorhabditis elegans data set. Our results have important implications for medical genomics, as STR allelic variation is associated with >40 diseases. STR nonallelic transcript variation can also contribute to disease phenotype. The MLE and empirical rates presented here can be used to evaluate the probability of disease-associated transcripts arising due to RDD. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  9. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    PubMed

    Fu, Jianmin; Liu, Huimin; Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  10. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis

    PubMed Central

    Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros ‘Jinzaoshi’ were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. ‘Jinzaoshi’, support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales. PMID:27442423

  11. Molecular identification of Trichuris vulpis and Trichuris suis isolated from different hosts.

    PubMed

    Cutillas, Cristina; de Rojas, Manuel; Ariza, Concepción; Ubeda, José Manuel; Guevara, Diego

    2007-01-01

    Trichuris suis was isolated from the cecum of two different hosts (Sus scrofa domestica -- swine and Sus scrofa scrofa -- wild boar) and Trichuris vulpis from dogs in Sevilla, Spain. Genomic DNA was isolated and internal transcribed spacers (ITS)1-5.8S-ITS2 segment from the ribosomal DNA (rDNA) was amplified and sequenced using polymerase chain reaction techniques. The sequence of T. suis from both hosts was 1,396 bp in length while that of T. vulpis was 1,044 bp. ITS1 of both populations isolated of T. suis was 661 nucleotides in length, while the ITS2 was 534 nucleotides in length. Furthermore, the ITS1 of T. vulpis was 410 nucleotides in length, while the ITS2 was 433 nucleotides in length. One hundred fifty-four nucleotides were observed along the 5.8S gene of T. suis and T. vulpis. Intraindividual and intraspecific variations were detected in the rDNA of both species. The presence of microsatellites was observed in all the individuals assayed. Sequence analysis of the ITSs and the 5.8S gene has demonstrated no sequence differences between T. suis isolated from both hosts (S. scrofa domestica -- swine and S. scrofa scrofa -- wild boar). Nevertheless, clear differences were detected between the ITS1 and ITS2 of T. suis and T. vulpis. Furthermore, a comparative molecular analysis between both species and the previously published ITS1-5.8S-ITS2 sequence data of Trichuris ovis, Trichuris leporis, Trichuris muris, Trichuris arvicolae, and Trichuris skrjabini was carried out. A common homology zone was detected in the ITS1 sequence of all species of trichurids.

  12. Molecular cloning and sequence analysis of full-length growth hormone cDNAs from six important economic fishes.

    PubMed

    Zhang, Jing-Nan; Song, Ping; Hu, Jia-Rui; Mo, Sai-Jun; Peng, Mao-Yu; Zhou, Wei; Zou, Ji-Xing; Hu, Yin-Chang

    2005-01-01

    In this study, full-length cDNAs of the GH (growth hormone) gene were isolated from six economically important fishes: Siniperca kneri, Epinephelus coioides, Monopterus albus, Silurus asotus, Misgurnus anguillicaudatus and Carassius auratus gibelio Bloch. Except for E. coioides GH, this is the first time these GH sequences have been cloned. The lengths of the cDNAs are, respectively, 953 bp, 1,023 bp, 825 bp, 1,082 bp, 1,154 bp and 1,180 bp. Each sequence includes an ORF of about 600 bp that encodes a protein of about 200 amino acids: the S. kneri, E. coioides and M. albus GHs have 204 amino acids, the S. asotus GH has 200, and the M. anguillicaudatus and C. auratus gibelio GHs have 210. A detailed sequence analysis of the six GHs together with many other fish sequences was then performed. All six sequences showed high homology to other sequences, especially to sequences within the same order, and many conserved residues were identified, most of them localized in five domains. The phylogenetic trees (MP and NJ) of many fish GH ORF sequences (including the six new ones), with Amia calva as the outgroup, were generally resolved and largely congruent with the morphology-based tree, although some incongruities were observed, suggesting that the GH ORF deserves more attention in teleostean phylogeny.

  13. Comprehensive analysis of the T-cell receptor beta chain gene in rhesus monkey by high throughput sequencing

    PubMed Central

    Li, Zhoufang; Liu, Guangjie; Tong, Yin; Zhang, Meng; Xu, Ying; Qin, Li; Wang, Zhanhui; Chen, Xiaoping; He, Jiankui

    2015-01-01

    Profiling immune repertoires by high throughput sequencing enhances our understanding of immune system complexity and immune-related diseases in humans. Previously, cloning and Sanger sequencing identified limited numbers of T cell receptor (TCR) nucleotide sequences in rhesus monkeys, thus their full immune repertoire is unknown. We applied multiplex PCR and Illumina high throughput sequencing to study the TCRβ of rhesus monkeys. We identified 1.26 million TCRβ sequences corresponding to 643,570 unique TCRβ sequences and 270,557 unique complementarity-determining region 3 (CDR3) gene sequences. Precise measurements of CDR3 length distribution, CDR3 amino acid distribution, length distribution of N nucleotide of junctional region, and TCRV and TCRJ gene usage preferences were performed. A comprehensive profile of rhesus monkey immune repertoire might aid human infectious disease studies using rhesus monkeys. PMID:25961410

  14. [Complete genome sequencing and analyses of rabies viruses isolated from wild animals (Chinese Ferret-Badger) in Zhejiang province].

    PubMed

    Lei, Yong-Liang; Wang, Xiao-Guang; Liu, Fu-Ming; Chen, Xiu-Ying; Ye, Bi-Feng; Mei, Jian-Hua; Lan, Jin-Quan; Tang, Qing

    2009-08-01

    By sequencing the full-length genomes of rabies viruses from two Chinese ferret-badgers, we analyzed the genetic variation of rabies viruses at the molecular level to obtain information on the prevalence and variation of rabies viruses in Zhejiang and to enrich the genome database of rabies virus street strains isolated from Chinese wildlife. Overlapping fragments were amplified by RT-PCR and full-length genomes were assembled to analyze the nucleotide and deduced protein similarities and the phylogeny of the N genes from Chinese ferret-badger, sika deer, vole, and dog. Vaccine strains were then determined. The two full-length genomes were completely sequenced and found to have the same genetic structure, with 11,923 nt including a 58-nt leader, 1,353-nt NP, 894-nt PP, 609-nt MP, 1,575-nt GP, 6,386-nt LP, intergenic regions (IGRs) of 2, 5, and 5 nt, a 423-nt pseudogene-like sequence (Psi), and a 70-nt trailer. BLAST and multiple sequence alignment showed that the two full-length genomes were consistent with the properties of Rhabdoviridae Lyssavirus. The nucleotide and amino acid sequences among Chinese strains had the highest similarity, especially among animals of the same species. For the two full-length genomes, similarity at the amino acid level was dramatically higher than at the nucleotide level, so the nucleotide mutations in these two genomes were most probably synonymous. Compared with the reference rabies viruses, the five protein-coding regions showed no changes in length and no recombination, only a few point mutations. The five proteins thus appeared to be stable. The variation sites and types in the two ferret-badger genomes were similar to those of the reference vaccine or street strains. According to the multiple sequence and phylogenetic analyses, the two strains were genotype 1 and possessed the distinct geographic characteristics of China. All the evidence suggested that these two ferret-badger rabies viruses were likely street viruses already circulating in wildlife.

  15. Bose gases near resonance: Renormalized interactions in a condensate

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhou, Fei, E-mail: feizhou@phas.ubc.ca; Mashayekhi, Mohammad S.

    2013-01-15

    Bose gases at large scattering lengths or beyond the usual dilute limit for a long time have been one of the most challenging problems in many-body physics. In this article, we investigate the fundamental properties of a near-resonance Bose gas and illustrate that three-dimensional Bose gases become nearly fermionized near resonance when the chemical potential as a function of scattering lengths reaches a maximum and the atomic condensates lose metastability. The instability and accompanying maximum are shown to be a precursor of the sign change of g₂, the renormalized two-body interaction between condensed atoms. g₂ changes from effectively repulsive to attractive when approaching resonance from the molecular side, even though the scattering length is still positive. This occurs when dimers, under the influence of condensates, emerge at zero energy in the atomic gases at a finite positive scattering length. We carry out our studies of Bose gases via applying a self-consistent renormalization group equation which is further subject to a boundary condition. We also comment on the relation between the approach here and the diagrammatic calculation in an early article [D. Borzov, M.S. Mashayekhi, S. Zhang, J.-L. Song, F. Zhou, Phys. Rev. A 85 (2012) 023620]. Highlights: ► A Bose gas becomes nearly fermionized when its chemical potential approaches a maximum near resonance. ► At the maximum, an onset instability sets in at a positive scattering length. ► Condensates strongly influence the renormalization flow of few-body running coupling constants. ► The effective two-body interaction constant changes its sign at a positive scattering length.

  16. Incorporation of Tyrosine and Glutamine Residues into the Soluble Guanylate Cyclase Heme Distal Pocket Alters NO and O2 Binding

    PubMed Central

    Derbyshire, Emily R.; Deng, Sarah; Marletta, Michael A.

    2010-01-01

    Nitric oxide (NO) is the physiologically relevant activator of the mammalian hemoprotein soluble guanylate cyclase (sGC). The heme cofactor of α1β1 sGC has a high affinity for NO but has never been observed to form a complex with oxygen. Introduction of a key tyrosine residue in the sGC heme binding domain β1(1–385) is sufficient to produce an oxygen-binding protein, but this mutation in the full-length enzyme did not alter oxygen affinity. To evaluate ligand binding specificity in full-length sGC we mutated several conserved distal heme pocket residues (β1 Val-5, Phe-74, Ile-145, and Ile-149) to introduce a hydrogen bond donor in proximity to the heme ligand. We found that the NO coordination state, NO dissociation, and enzyme activation were significantly affected by the presence of a tyrosine in the distal heme pocket; however, the stability of the reduced porphyrin and the protein's affinity for oxygen were unaltered. Recently, an atypical sGC from Drosophila, Gyc-88E, was shown to form a stable complex with oxygen. Sequence analysis of this protein identified two residues in the predicted heme pocket (tyrosine and glutamine) that may function to stabilize oxygen binding in the atypical cyclase. The introduction of these residues into the rat β1 distal heme pocket (Ile-145 → Tyr and Ile-149 → Gln) resulted in an sGC construct that oxidized via an intermediate with an absorbance maximum at 417 nm. This absorbance maximum is consistent with globin FeII-O2 complexes and is likely the first observation of a FeII-O2 complex in the full-length α1β1 protein. Additionally, these data suggest that atypical sGCs stabilize O2 binding by a hydrogen bonding network involving tyrosine and glutamine. PMID:20231286

  17. Analysis of expressed sequence tags generated from full-length enriched cDNA libraries of melon

    PubMed Central

    2011-01-01

    Background: Melon (Cucumis melo), an economically important vegetable crop, belongs to the Cucurbitaceae family which includes several other important crops such as watermelon, cucumber, and pumpkin. It has served as a model system for sex determination and vascular biology studies. However, genomic resources currently available for melon are limited. Results: We constructed eleven full-length enriched and four standard cDNA libraries from fruits, flowers, leaves, roots, cotyledons, and calluses of four different melon genotypes, and generated 71,577 and 22,179 ESTs from full-length enriched and standard cDNA libraries, respectively. These ESTs, together with ~35,000 ESTs available in public domains, were assembled into 24,444 unigenes, which were extensively annotated by comparing their sequences to different protein and functional domain databases, assigning them Gene Ontology (GO) terms, and mapping them onto metabolic pathways. Comparative analysis of melon unigenes and other plant genomes revealed that 75% to 85% of melon unigenes had homologs in other dicot plants, while approximately 70% had homologs in monocot plants. The analysis also identified 6,972 gene families that were conserved across dicot and monocot plants, and 181, 1,192, and 220 gene families specific to fleshy fruit-bearing plants, the Cucurbitaceae family, and melon, respectively. Digital expression analysis identified a total of 175 tissue-specific genes, which provides a valuable gene sequence resource for future genomics and functional studies. Furthermore, we identified 4,068 simple sequence repeats (SSRs) and 3,073 single nucleotide polymorphisms (SNPs) in the melon EST collection. Finally, we obtained a total of 1,382 melon full-length transcripts through the analysis of full-length enriched cDNA clones that were sequenced from both ends. Analysis of these full-length transcripts indicated that sizes of melon 5' and 3' UTRs were similar to those of tomato, but longer than many other dicot plants. Codon usages of melon full-length transcripts were largely similar to those of Arabidopsis coding sequences. Conclusion: The collection of melon ESTs generated from full-length enriched and standard cDNA libraries is expected to play significant roles in annotating the melon genome. The ESTs and associated analysis results will be useful resources for gene discovery, functional analysis, marker-assisted breeding of melon and closely related species, comparative genomic studies and for gaining insights into gene expression patterns. PMID:21599934

  18. Carbohydrate degrading polypeptide and uses thereof

    DOEpatents

    Sagt, Cornelis Maria Jacobus; Schooneveld-Bergmans, Margot Elisabeth Francoise; Roubos, Johannes Andries; Los, Alrik Pieter

    2015-10-20

    The invention relates to a polypeptide having carbohydrate material degrading activity which comprises the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1 or SEQ ID NO: 4, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional protein and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.

  19. Rapidly rotating polytropes in general relativity

    NASA Technical Reports Server (NTRS)

    Cook, Gregory B.; Shapiro, Stuart L.; Teukolsky, Saul A.

    1994-01-01

    We construct an extensive set of equilibrium sequences of rotating polytropes in general relativity. We determine a number of important physical parameters of such stars, including maximum mass and maximum spin rate. The stability of the configurations against quasi-radial perturbations is diagnosed. Two classes of evolutionary sequences of fixed rest mass and entropy are explored: normal sequences which behave very much like Newtonian evolutionary sequences, and supramassive sequences which exist solely because of relativistic effects. Dissipation leading to loss of angular momentum causes a star to evolve in a quasi-stationary fashion along an evolutionary sequence. Supramassive sequences evolve towards eventual catastrophic collapse to a black hole. Prior to collapse, the star must spin up as it loses angular momentum, an effect which may provide an observational precursor to gravitational collapse to a black hole.

  20. Length Variation and Heteroplasmy Are Frequent in Mitochondrial DNA from Parthenogenetic and Bisexual Lizards (Genus Cnemidophorus)

    PubMed Central

    Densmore, Llewellyn D.; Wright, John W.; Brown, Wesley M.

    1985-01-01

    Samples of mtDNA isolated from each of 92 lizards representing all color pattern classes of Cnemidophorus tesselatus and two populations of C. tigris marmoratus were digested with the restriction endonucleases MboI, TaqI, RsaI and MspI. The mtDNA fragment sizes were compared after radioactive labeling and gel electrophoresis. Three features were notable in the comparisons: (1) there was little variation due to gain or loss of cleavage sites, (2) two fragments varied noticeably in length among the samples, one by a variable amount up to a maximum difference of ∼370 base pairs (bp) and the other by a discrete amount of 35 bp, (3) these two fragments occasionally varied within, as well as between, samples. Two regions that corresponded in size to these variants were identified by restriction endonuclease cleavage mapping. One of these is adjacent to the D-loop. Heteroplasmy, heretofore rarely observed, occurred frequently in these same two regions. Variability in the copy number of a tandemly repeated 64-bp sequence appears to be one component of the variation, but others (e.g., base substitutions or small additions/deletions) must also be involved. The frequent occurrence of these length variations suggests either that they can be generated rapidly or that they were inherited from a highly polymorphic ancestor. The former interpretation is favored. PMID:2993100

  1. Convolutional encoding of self-dual codes

    NASA Technical Reports Server (NTRS)

    Solomon, G.

    1994-01-01

    There exist almost complete convolutional encodings of self-dual codes, i.e., block codes of rate 1/2 with weights w, w = 0 mod 4. The codes are of length 8m with the convolutional portion of length 8m-2 and the nonsystematic information of length 4m-1. The last two bits are parity checks on the two (4m-1) length parity sequences. The final information bit complements one of the extended parity sequences of length 4m. Solomon and van Tilborg have developed algorithms to generate these for the Quadratic Residue (QR) Codes of lengths 48 and beyond. For these codes and reasonable constraint lengths, there are sequential decodings for both hard and soft decisions. There are also possible Viterbi-type decodings that may be simple, as in a convolutional encoding/decoding of the extended Golay Code. In addition, the previously found constraint length K = 9 for the QR (48, 24;12) Code is lowered here to K = 8.
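    The defining properties mentioned in this abstract (rate 1/2, self-duality, and all weights divisible by 4) can be checked by brute force on the extended Golay code that the abstract cites as an example. The sketch below is not Solomon's convolutional encoding; it simply builds the (24, 12) extended Golay code from the cyclic (23, 12) Golay code, assuming the standard generator polynomial x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1, and enumerates all 4096 codewords. All function and variable names are illustrative.

```python
from collections import Counter
from itertools import product

# Exponents of the assumed generator polynomial
# g(x) = x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1 of the cyclic (23, 12) Golay code.
G_EXPONENTS = (0, 2, 4, 5, 6, 10, 11)
N, K = 23, 12


def generator_rows():
    """Rows x^s * g(x), s = 0..11, each extended by an overall parity bit."""
    rows = []
    for shift in range(K):
        row = [0] * N
        for e in G_EXPONENTS:
            row[e + shift] = 1
        row.append(sum(row) % 2)  # parity bit brings the length to 24
        rows.append(row)
    return rows


def codewords(rows):
    """Yield every GF(2) linear combination of the generator rows."""
    for message in product((0, 1), repeat=len(rows)):
        word = [0] * len(rows[0])
        for bit, row in zip(message, rows):
            if bit:
                word = [w ^ r for w, r in zip(word, row)]
        yield word


def weight_distribution(rows):
    """Count codewords by Hamming weight."""
    return Counter(sum(word) for word in codewords(rows))


def is_self_orthogonal(rows):
    """True if every pair of rows (including a row with itself) has even overlap."""
    return all(
        sum(a & b for a, b in zip(r1, r2)) % 2 == 0
        for r1 in rows
        for r2 in rows
    )


ROWS = generator_rows()
WEIGHTS = weight_distribution(ROWS)

if __name__ == "__main__":
    print(sorted(WEIGHTS.items()))
    print("self-orthogonal:", is_self_orthogonal(ROWS))
```

    A self-orthogonal code of dimension n/2 is self-dual, so the orthogonality check together with K = 12 and length 24 establishes self-duality; the enumerated weights are all multiples of 4, with the well-known distribution 1, 759, 2576, 759, 1 at weights 0, 8, 12, 16, 24.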

  2. Putative Monofunctional Type I Polyketide Synthase Units: A Dinoflagellate-Specific Feature?

    PubMed Central

    Eichholz, Karsten; Beszteri, Bánk; John, Uwe

    2012-01-01

    Marine dinoflagellates (alveolata) are microalgae of which some cause harmful algal blooms and produce a broad variety of most likely polyketide synthesis derived phycotoxins. Recently, novel polyketide synthase (PKS) transcripts have been described from the Florida red tide dinoflagellate Karenia brevis (gymnodiniales) which are evolutionarily related to Type I PKS but were apparently expressed as monofunctional proteins, a feature typical of Type II PKS. Here, we investigated expression units of PKS I-like sequences in Alexandrium ostenfeldii (gonyaulacales) and Heterocapsa triquetra (peridiniales) at the transcript and protein level. The five full length transcripts we obtained were all characterized by polyadenylation, a 3′ UTR and the dinoflagellate specific spliced leader sequence at the 5′ end. Each of the five transcripts encoded a single ketoacylsynthase (KS) domain showing high similarity to K. brevis KS sequences. The monofunctional structure was also confirmed using dinoflagellate specific KS antibodies in Western Blots. In a maximum likelihood phylogenetic analysis of KS domains from diverse PKSs, dinoflagellate KSs formed a clade placed well within the protist Type I PKS clade between apicomplexa, haptophytes and chlorophytes. These findings indicate that the atypical PKS I structure, i.e., expression as putative monofunctional units, might be a dinoflagellate specific feature. In addition, the sequenced transcripts harbored a previously unknown, apparently dinoflagellate specific conserved N-terminal domain. We discuss the implications of this novel region with regard to the putative monofunctional organization of Type I PKS in dinoflagellates. PMID:23139807

  3. Phylogeographic Analysis of Mitochondrial DNA in Northern Asian Populations

    PubMed Central

    Derenko, Miroslava; Malyarchuk, Boris; Grzybowski, Tomasz; Denisova, Galina; Dambueva, Irina; Perkova, Maria; Dorzhu, Choduraa; Luzina, Faina; Lee, Hong Kyu; Vanecek, Tomas; Villems, Richard; Zakharov, Ilia

    2007-01-01

    To elucidate the human colonization process of northern Asia and human dispersals to the Americas, a diverse subset of 71 mitochondrial DNA (mtDNA) lineages was chosen for complete genome sequencing from the collection of 1,432 control-region sequences sampled from 18 autochthonous populations of northern, central, eastern, and southwestern Asia. On the basis of complete mtDNA sequencing, we have revised the classification of haplogroups A, D2, G1, M7, and I; identified six new subhaplogroups (I4, N1e, G1c, M7d, M7e, and J1b2a); and fully characterized haplogroups N1a and G1b, which were previously described only by the first hypervariable segment (HVS1) sequencing and coding-region restriction-fragment–length polymorphism analysis. Our findings indicate that the southern Siberian mtDNA pool harbors several lineages associated with the Late Upper Paleolithic and/or early Neolithic dispersals from both eastern Asia and southwestern Asia/southern Caucasus. Moreover, the phylogeography of the D2 lineages suggests that southern Siberia is likely to be a geographical source for the last postglacial maximum spread of this subhaplogroup to northern Siberia and that the expansion of the D2b branch occurred in Beringia ∼7,000 years ago. In general, a detailed analysis of mtDNA gene pools of northern Asians provides the additional evidence to rule out the existence of a northern Asian route for the initial human colonization of Asia. PMID:17924343

  4. Phylogeographic analysis of mitochondrial DNA in northern Asian populations.

    PubMed

    Derenko, Miroslava; Malyarchuk, Boris; Grzybowski, Tomasz; Denisova, Galina; Dambueva, Irina; Perkova, Maria; Dorzhu, Choduraa; Luzina, Faina; Lee, Hong Kyu; Vanecek, Tomas; Villems, Richard; Zakharov, Ilia

    2007-11-01

    To elucidate the human colonization process of northern Asia and human dispersals to the Americas, a diverse subset of 71 mitochondrial DNA (mtDNA) lineages was chosen for complete genome sequencing from the collection of 1,432 control-region sequences sampled from 18 autochthonous populations of northern, central, eastern, and southwestern Asia. On the basis of complete mtDNA sequencing, we have revised the classification of haplogroups A, D2, G1, M7, and I; identified six new subhaplogroups (I4, N1e, G1c, M7d, M7e, and J1b2a); and fully characterized haplogroups N1a and G1b, which were previously described only by the first hypervariable segment (HVS1) sequencing and coding-region restriction-fragment-length polymorphism analysis. Our findings indicate that the southern Siberian mtDNA pool harbors several lineages associated with the Late Upper Paleolithic and/or early Neolithic dispersals from both eastern Asia and southwestern Asia/southern Caucasus. Moreover, the phylogeography of the D2 lineages suggests that southern Siberia is likely to be a geographical source for the last postglacial maximum spread of this subhaplogroup to northern Siberia and that the expansion of the D2b branch occurred in Beringia ~7,000 years ago. In general, a detailed analysis of mtDNA gene pools of northern Asians provides the additional evidence to rule out the existence of a northern Asian route for the initial human colonization of Asia.

  5. Genetic characterization of human herpesvirus type 1: Full-length genome sequence of strain obtained from an encephalitis case from India.

    PubMed

    Bondre, Vijay P; Sankararaman, Vasudha; Andhare, Vijaysinh; Tupekar, Manisha; Sapkal, Gajanan N

    2016-11-01

    Human herpes simplex virus 1 (HSV-1) is the most common cause of sporadic encephalitis in humans that contributes to >10 per cent of the encephalitis cases occurring worldwide. Availability of limited full genome sequences from a small number of isolates resulted in poor understanding of host and viral factors responsible for variable clinical outcome. In this study genetic relationship, extent and source of recombination using full-length genome sequence derived from a newly isolated HSV-1 isolate was studied in comparison with those sampled from patients with varied clinical outcome. Full genome sequence of HSV-1 isolated from cerebrospinal fluid (CSF) of a patient with acute encephalitis syndrome (AES) by inoculation in baby hamster kidney-21 (BHK-21) cells was determined using next-generation sequencing (NGS) technology. Phylogenetic analysis of the newly generated sequence in comparison with 33 additional full-length genomes defined genetic relationship with worldwide distributed strains. The bootscan and similarity plot analysis defined recombination crossovers and similarities between newly isolated Indian HSV-1 with six Asian and a total of 34 worldwide isolated strains. Mapping of 376,332 reads amplified from HSV-1 DNA by NGS generated full-length genome of 151,024 bp from newly isolated Indian HSV-1. Phylogenetic analysis classified worldwide distributed strains into three major evolutionary lineages correlating to their geographic distribution. Lineage 1 containing strains were isolated from America and Europe; lineage 2 contained all the strains from Asian countries along with the North American KOS and RE strains whereas the South African isolates were distributed into two groups under lineage 3. Recombination analysis confirmed events of recombination in Indian HSV-1 genome resulting from mixing of different strains evolved in Asian countries. Our results showed that the full-length genome sequence generated from an Indian HSV-1 isolate shared close genetic relationship with the American KOS and Chinese CR38 strains which belonged to the Asian genetic lineage. Recombination analysis of Indian isolate demonstrated multiple recombination crossover points throughout the genome. This full-length genome sequence amplified from the Indian isolate would be helpful to study HSV evolution, genetic basis of differential pathogenesis, host-virus interactions and viral factors contributing towards differential clinical outcome in human infections.

  6. Genetic characterization of human herpesvirus type 1: Full-length genome sequence of strain obtained from an encephalitis case from India

    PubMed Central

    Bondre, Vijay P.; Sankararaman, Vasudha; Andhare, Vijaysinh; Tupekar, Manisha; Sapkal, Gajanan N.

    2016-01-01

    Background & objectives: Human herpes simplex virus 1 (HSV-1) is the most common cause of sporadic encephalitis in humans that contributes to >10 per cent of the encephalitis cases occurring worldwide. Availability of limited full genome sequences from a small number of isolates resulted in poor understanding of host and viral factors responsible for variable clinical outcome. In this study genetic relationship, extent and source of recombination using full-length genome sequence derived from a newly isolated HSV-1 isolate was studied in comparison with those sampled from patients with varied clinical outcome. Methods: Full genome sequence of HSV-1 isolated from cerebrospinal fluid (CSF) of a patient with acute encephalitis syndrome (AES) by inoculation in baby hamster kidney-21 (BHK-21) cells was determined using next-generation sequencing (NGS) technology. Phylogenetic analysis of the newly generated sequence in comparison with 33 additional full-length genomes defined genetic relationship with worldwide distributed strains. The bootscan and similarity plot analysis defined recombination crossovers and similarities between newly isolated Indian HSV-1 with six Asian and a total of 34 worldwide isolated strains. Results: Mapping of 376,332 reads amplified from HSV-1 DNA by NGS generated full-length genome of 151,024 bp from newly isolated Indian HSV-1. Phylogenetic analysis classified worldwide distributed strains into three major evolutionary lineages correlating to their geographic distribution. Lineage 1 containing strains were isolated from America and Europe; lineage 2 contained all the strains from Asian countries along with the North American KOS and RE strains whereas the South African isolates were distributed into two groups under lineage 3. Recombination analysis confirmed events of recombination in Indian HSV-1 genome resulting from mixing of different strains evolved in Asian countries. Interpretation & conclusions: Our results showed that the full-length genome sequence generated from an Indian HSV-1 isolate shared close genetic relationship with the American KOS and Chinese CR38 strains which belonged to the Asian genetic lineage. Recombination analysis of Indian isolate demonstrated multiple recombination crossover points throughout the genome. This full-length genome sequence amplified from the Indian isolate would be helpful to study HSV evolution, genetic basis of differential pathogenesis, host-virus interactions and viral factors contributing towards differential clinical outcome in human infections. PMID:28361829

  7. Genome-wide comparisons of phylogenetic similarities between partial genomic regions and the full-length genome in Hepatitis E virus genotyping.

    PubMed

    Wang, Shuai; Wei, Wei; Luo, Xuenong; Cai, Xuepeng

    2014-01-01

    Besides the complete genome, different partial genomic sequences of Hepatitis E virus (HEV) have been used in genotyping studies, making it difficult to compare the results based on them. No commonly agreed partial region for HEV genotyping has been determined. In this study, we used a statistical method to evaluate the phylogenetic performance of each partial genomic sequence genome-wide, by comparing evolutionary distances between genomic regions and the full-length genomes of 101 HEV isolates, to identify short genomic regions that can reproduce HEV genotype assignments based on full-length genomes. Several genomic regions, especially one genomic region at the 3'-terminal of the papain-like cysteine protease domain, were detected to have relatively high phylogenetic correlations with the full-length genome. Phylogenetic analyses confirmed the identical performances between these regions and the full-length genome in genotyping, in which the HEV isolates involved could be divided into reasonable genotypes. This analysis may be of value in developing a partial sequence-based consensus classification of HEV species.

  8. Morphometric and molecular differentiation between quetzal subspecies of Pharomachrus mocinno (Trogoniformes: Trogonidae).

    PubMed

    Solórzano, Sofía; Oyama, Ken

    2010-03-01

    The resplendent Quetzal (Pharomachrus mocinno) is an endemic Mesoamerican bird species of conservation concern. Within this species, the subspecies P. m. costaricensis and P. m. mocinno have been recognized by apparent morphometric differences; however, there are presently insufficient data for confirmation. We analyzed eight morphometric attributes of the body from 41 quetzals: body length, tarsus and wing chord, as well as the length, width and depth of the bill, body weight; and in the case of the males, the length of the long upper-tail cover feathers. We used multivariate analyses to discriminate morphometric differences between subspecies and contrasted each morphometric attribute between and within subspecies with the paired non-parametric Wilcoxon test. In order to review the intraspecific taxonomic status of this bird, we added phylogenetic analysis, and genetic divergence and differentiation based on nucleotide variations in four sequences of mtDNA. The nucleotide variation was estimated in the control region, subunit NDH6, and tRNAGlu and tRNAPhe in 26 quetzals from eight localities distributed in five countries. We estimated the genetic divergence and differentiation between subspecies according to a mutation-drift equilibrium model. We obtained the best nucleotide substitution model following the procedure implemented in the Modeltest program. We constructed the phylogenetic relationships between subspecies by maximum parsimony and maximum likelihood using PAUP, as well as with Bayesian statistics. The multivariate analyses showed two different morphometric groups, and individuals clustered according to the subspecies to which they belong. The paired comparisons between subspecies showed strong differences in most of the attributes analyzed. Along the four mtDNA sequences, we identified 32 nucleotide positions that have a particular nucleotide according to the quetzal subspecies. The genetic divergence and differentiation were strong and clearly showed two groups within P. mocinno that corresponded to the quetzal subspecies. The model selected for our data was TVM+G. The three phylogenetic methods used here recovered two clear monophyletic clades corresponding to each subspecies, and evidenced a significant and true partition of the P. mocinno species into two different genetic, morphometric and ecologic groups. Additionally, according to our calculations, gene flow between the subspecies has been interrupted for at least three million years. Thus we propose that P. mocinno be divided into two independent species: P. mocinno (Northern species, from Mexico to Nicaragua) and P. costaricensis (Southern species, Costa Rica and Panama). This new taxonomic classification of the quetzal subspecies should allow better conservation outcomes because the kind and magnitude of the threats can be evaluated more precisely.

  9. Rapidly rotating neutron stars in general relativity: Realistic equations of state

    NASA Technical Reports Server (NTRS)

    Cook, Gregory B.; Shapiro, Stuart L.; Teukolsky, Saul A.

    1994-01-01

    We construct equilibrium sequences of rotating neutron stars in general relativity. We compare results for 14 nuclear matter equations of state. We determine a number of important physical parameters for such stars, including the maximum mass and maximum spin rate. The stability of the configurations to quasi-radial perturbations is assessed. We employ a numerical scheme particularly well suited to handle rapid rotation and large departures from spherical symmetry. We provide an extensive tabulation of models for future reference. Two classes of evolutionary sequences of fixed baryon rest mass and entropy are explored: normal sequences, which behave very much like Newtonian sequences, and supramassive sequences, which exist for neutron stars solely because of general relativistic effects. Adiabatic dissipation of energy and angular momentum causes a star to evolve in quasi-stationary fashion along an evolutionary sequence. Supramassive sequences have masses exceeding the maximum mass of a nonrotating neutron star. A supramassive star evolves toward eventual catastrophic collapse to a black hole. Prior to collapse, the star actually spins up as it loses angular momentum, an effect that may provide an observable precursor to gravitational collapse to a black hole.

  10. Experimental investigation of inlet-combustor isolators for a dual-mode scramjet at a Mach number of 4

    NASA Technical Reports Server (NTRS)

    Emami, Saied; Trexler, Carl A.; Auslender, Aaron H.; Weidner, John P.

    1995-01-01

    This report details experimentally derived operational characteristics of numerous two-dimensional planar inlet-combustor isolator configurations at a Mach number of 4. Variations in geometry included (1) inlet cowl length; (2) inlet cowl rotation angle; (3) isolator length; and (4) utilization of a rearward-facing isolator step. To obtain inlet-isolator maximum pressure-rise data relevant to ramjet-engine combustion operation, configurations were mechanically back pressured. Results demonstrated that the combined inlet-isolator maximum back-pressure capability increases as a function of isolator length and contraction ratio, and that the initiation of unstart is nearly independent of inlet cowl length, inlet cowl contraction ratio, and mass capture. Additionally, data are presented quantifying the initiation of inlet unstarts and the corresponding unstart pressure levels.

  11. Maximum permissible voltage of YBCO coated conductors

    NASA Astrophysics Data System (ADS)

    Wen, J.; Lin, B.; Sheng, J.; Xu, J.; Jin, Z.; Hong, Z.; Wang, D.; Zhou, H.; Shen, X.; Shen, C.

    2014-06-01

    A superconducting fault current limiter (SFCL) can reduce short-circuit currents in an electrical power system. One of the most important things in developing an SFCL is to find the maximum permissible voltage of each limiting element. The maximum permissible voltage is defined as the maximum voltage per unit length at which the YBCO coated conductors (CC) do not suffer from critical current (Ic) degradation or burnout. In this research, the duration of the quenching process is varied and the voltage is raised until Ic degradation or burnout happens. The YBCO coated conductors tested in the experiment are from American Superconductor (AMSC) and Shanghai Jiao Tong University (SJTU). As the quenching duration increases, the maximum permissible voltage of the CC decreases. When the quenching duration is 100 ms, the maximum permissible voltages of the SJTU CC, 12 mm AMSC CC and 4 mm AMSC CC are 0.72 V/cm, 0.52 V/cm and 1.2 V/cm, respectively. Based on the results for these samples, the whole length of CCs used in the design of an SFCL can be determined.
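
    The last step of the abstract above (going from a per-length voltage limit to a total conductor length) can be sketched as a simple division. This is only an illustration: the 1000 V drop across the limiter is a hypothetical figure, not a value from the study, and the sketch assumes the voltage is distributed uniformly along the tape, which a real design would need to verify.

```python
# Maximum permissible voltages at 100 ms quench duration, as quoted in the abstract (V/cm).
MAX_PERMISSIBLE_V_PER_CM = {
    "SJTU CC": 0.72,
    "12 mm AMSC CC": 0.52,
    "4 mm AMSC CC": 1.2,
}


def min_conductor_length_cm(voltage_v, max_v_per_cm):
    """Shortest tape length that keeps the voltage per unit length within the limit.

    Assumes a uniform voltage distribution along the conductor (a simplification).
    """
    if voltage_v < 0 or max_v_per_cm <= 0:
        raise ValueError("voltage must be non-negative and the limit positive")
    return voltage_v / max_v_per_cm


if __name__ == "__main__":
    hypothetical_voltage_v = 1000.0  # assumed voltage across the limiter, for illustration only
    for name, limit in MAX_PERMISSIBLE_V_PER_CM.items():
        length_m = min_conductor_length_cm(hypothetical_voltage_v, limit) / 100.0
        print(f"{name}: at least {length_m:.1f} m")
```

    A lower permissible voltage per unit length therefore translates directly into a longer conductor for the same voltage drop.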

  12. An integrated PCR colony hybridization approach to screen cDNA libraries for full-length coding sequences.

    PubMed

    Pollier, Jacob; González-Guzmán, Miguel; Ardiles-Diaz, Wilson; Geelen, Danny; Goossens, Alain

    2011-01-01

    cDNA-Amplified Fragment Length Polymorphism (cDNA-AFLP) is a commonly used technique for genome-wide expression analysis that does not require prior sequence knowledge. Typically, quantitative expression data and sequence information are obtained for a large number of differentially expressed gene tags. However, most of the gene tags do not correspond to full-length (FL) coding sequences, which is a prerequisite for subsequent functional analysis. A medium-throughput screening strategy, based on integration of polymerase chain reaction (PCR) and colony hybridization, was developed that allows in parallel screening of a cDNA library for FL clones corresponding to incomplete cDNAs. The method was applied to screen for the FL open reading frames of a selection of 163 cDNA-AFLP tags from three different medicinal plants, leading to the identification of 109 (67%) FL clones. Furthermore, the protocol allows for the use of multiple probes in a single hybridization event, thus significantly increasing the throughput when screening for rare transcripts. The presented strategy offers an efficient method for the conversion of incomplete expressed sequence tags (ESTs), such as cDNA-AFLP tags, to FL-coding sequences.

  13. Molecular characterization of partial fusion gene and C-terminus extension length of haemagglutinin-neuraminidase gene of recently isolated Newcastle disease virus isolates in Malaysia

    PubMed Central

    2010-01-01

    Background: Newcastle disease (ND), caused by Newcastle disease virus (NDV), is a highly contagious disease of birds and has been one of the major causes of economic losses in the poultry industry. Despite routine vaccination programs, sporadic cases have occasionally occurred in the country and remain a constant threat to commercial poultry. Hence, the present study aimed to characterize NDV isolates obtained from clinical cases in various locations of Malaysia between 2004 and 2007 based on sequence and phylogenetic analysis of the partial F gene and the C-terminus extension length of the HN gene. Results: The coding region of the fusion (F) gene and the carboxyl terminal region of the haemagglutinin-neuraminidase (HN) gene, including extensions, of eleven NDV isolates were amplified by reverse transcriptase PCR and directly sequenced. All the isolates were shown to have a non-synonymous to synonymous base substitution rate ranging from 0.081 to 0.264, demonstrating the presence of negative selection. Analysis based on the F gene showed that the characterized isolates possess three different types of protease cleavage site motifs, namely 112RRQKRF117, 112RRRKRF117 and 112GRQGRL117, and appear to show maximum identities with isolates in the region such as cockatoo/14698/90 (Indonesia), Ch/2000 (China) and the local isolate AF2240, indicating the high similarity of isolates circulating in the South East Asian countries. Meanwhile, one of the isolates resembles commonly used lentogenic vaccine strains. On further characterization of the HN gene, Malaysian isolates had C-terminus extensions of 0, 6 and 11 amino acids. Analysis of the phylogenetic tree revealed the existence of three genetic groups, namely genotypes II, VII and VIII. Conclusions: The study concluded that three NDV genotypes and varied carboxyl terminus extension lengths occur among the Malaysian isolates implicated in sporadic cases. PMID:20691110

  14. Molecular characterization of partial fusion gene and C-terminus extension length of haemagglutinin-neuraminidase gene of recently isolated Newcastle disease virus isolates in Malaysia.

    PubMed

    Berhanu, Ayalew; Ideris, Aini; Omar, Abdul R; Bejo, Mohd Hair

    2010-08-08

    Newcastle disease (ND), caused by Newcastle disease virus (NDV), is a highly contagious disease of birds and has been one of the major causes of economic losses in the poultry industry. Despite routine vaccination programs, sporadic cases have occasionally occurred in the country and remain a constant threat to commercial poultry. Hence, the present study aimed to characterize NDV isolates obtained from clinical cases in various locations of Malaysia between 2004 and 2007 based on sequence and phylogenetic analysis of the partial F gene and the C-terminus extension length of the HN gene. The coding region of the fusion (F) gene and the carboxyl terminal region of the haemagglutinin-neuraminidase (HN) gene, including extensions, of eleven NDV isolates were amplified by reverse transcriptase PCR and directly sequenced. All the isolates were shown to have a non-synonymous to synonymous base substitution rate ranging from 0.081 to 0.264, demonstrating the presence of negative selection. Analysis based on the F gene showed that the characterized isolates possess three different types of protease cleavage site motifs, namely 112RRQKRF117, 112RRRKRF117 and 112GRQGRL117, and appear to show maximum identities with isolates in the region such as cockatoo/14698/90 (Indonesia), Ch/2000 (China) and the local isolate AF2240, indicating the high similarity of isolates circulating in the South East Asian countries. Meanwhile, one of the isolates resembles commonly used lentogenic vaccine strains. On further characterization of the HN gene, Malaysian isolates had C-terminus extensions of 0, 6 and 11 amino acids. Analysis of the phylogenetic tree revealed the existence of three genetic groups, namely genotypes II, VII and VIII. The study concluded that three NDV genotypes and varied carboxyl terminus extension lengths occur among the Malaysian isolates implicated in sporadic cases.

  15. Dissection of the Octoploid Strawberry Genome by Deep Sequencing of the Genomes of Fragaria Species

    PubMed Central

    Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N.

    2014-01-01

    Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis, by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ∼200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb, with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence, to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species. PMID:24282021
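
    The N50 length quoted in the abstract above is a standard assembly statistic: the contig length L such that contigs of length L or longer together contain at least half of all assembled bases. A minimal sketch of the calculation follows; the contig lengths are made up for illustration and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of the
    contig at which the running total first reaches half of the assembly size.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    example_contigs = [2, 3, 4, 5, 6, 10]  # hypothetical lengths, total 30
    print(n50(example_contigs))
```

    In this toy assembly the two longest contigs (10 and 6) already cover 16 of 30 bases, so the N50 is 6.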

  16. BAC end sequencing of Pacific white shrimp Litopenaeus vannamei: a glimpse into the genome of Penaeid shrimp

    NASA Astrophysics Data System (ADS)

    Zhao, Cui; Zhang, Xiaojun; Liu, Chengzhang; Huan, Pin; Li, Fuhua; Xiang, Jianhai; Huang, Chao

    2012-05-01

    Little is known about the genome of Pacific white shrimp (Litopenaeus vannamei). To address this, we conducted BAC (bacterial artificial chromosome) end sequencing of L. vannamei. We selected and sequenced 7 812 BAC clones from the BAC library LvHE from the two ends of the inserts by Sanger sequencing. After trimming and quality filtering, 11 279 BAC end sequences (BESs) including 4 609 paired-end BESs were obtained. The total length of the BESs was 4 340 753 bp, representing 0.18% of the L. vannamei haploid genome. The lengths of the BESs ranged from 100 bp to 660 bp with an average length of 385 bp. Analysis of the BESs indicated that the L. vannamei genome is AT-rich and that the primary repeat patterns were simple sequence repeats (SSRs) and low complexity sequences. Dinucleotide and hexanucleotide repeats were the most common SSR types in the BESs. The most abundant transposable element was gypsy, which may contribute to the generation of the large genome size of L. vannamei. We successfully annotated 4 519 BESs by BLAST searching, including genes involved in immunity and sex determination. Our results provide an important resource for functional gene studies, map construction and integration, and complete genome assembly for this species.
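
    The figures in the abstract above can be cross-checked with simple arithmetic: the average BES length is the total length divided by the number of BESs, and the haploid genome size implied by the coverage figure is the total length divided by the covered fraction. The sketch below uses only numbers quoted in the abstract; because 0.18% is a rounded value, the implied genome size is approximate.

```python
TOTAL_BES_LENGTH_BP = 4_340_753   # total length of all BESs, from the abstract
NUM_BES = 11_279                  # number of BESs after filtering, from the abstract
GENOME_FRACTION = 0.18 / 100      # "0.18% of the haploid genome" (rounded in the source)


def mean_length(total_bp, count):
    """Average sequence length in bp."""
    return total_bp / count


def implied_genome_size(total_bp, fraction):
    """Genome size in bp implied by covering `fraction` of it with `total_bp` bases."""
    return total_bp / fraction


if __name__ == "__main__":
    print(f"mean BES length: {mean_length(TOTAL_BES_LENGTH_BP, NUM_BES):.1f} bp")
    size_gb = implied_genome_size(TOTAL_BES_LENGTH_BP, GENOME_FRACTION) / 1e9
    print(f"implied haploid genome size: about {size_gb:.2f} Gb")
```

    The mean comes out just under 385 bp, consistent with the stated average, and the implied haploid genome size is roughly 2.4 Gb.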

  17. Testing deep reticulate evolution in Amaryllidaceae Tribe Hippeastreae (Asparagales) with ITS and chloroplast sequence data

    USDA-ARS's Scientific Manuscript database

    The phylogeny of Amaryllidaceae tribe Hippeastreae was inferred using chloroplast (3’ycf1, ndhF, trnL-F) and nuclear (ITS rDNA) sequence data under maximum parsimony and maximum likelihood frameworks. Network analyses were applied to resolve conflicting signals among data sets and putative scenarios...

  18. Nitrogen Removal over Nitrite by Aeration Control in Aerobic Granular Sludge Sequencing Batch Reactors

    PubMed Central

    Lochmatter, Samuel; Maillard, Julien; Holliger, Christof

    2014-01-01

    This study investigated the potential of aeration control for the achievement of N-removal over nitrite with aerobic granular sludge in sequencing batch reactors. N-removal over nitrite requires less COD, which is particularly interesting if COD is the limiting parameter for nutrient removal. The nutrient removal performances for COD, N and P have been analyzed as well as the concentration of nitrite-oxidizing bacteria in the granular sludge. Aeration phase length control combined with intermittent aeration or alternate high-low DO has proven to be an efficient way to reduce the nitrite-oxidizing bacteria population and hence achieve N-removal over nitrite. N-removal efficiencies of up to 95% were achieved for an influent wastewater with COD:N:P ratios of 20:2.5:1. The total N-removal rate was 0.18 kgN·m−3·d−1. With N-removal over nitrate the N-removal was only 74%. At 20 °C, the nitrite-oxidizing bacteria concentration decreased by over 95% in 60 days and it was possible to switch from N-removal over nitrite to N-removal over nitrate and back again. At 15 °C, the nitrite-oxidizing bacteria concentration also decreased, but to a lesser extent, and nitrite oxidation could not be completely suppressed. However, the combination of aeration phase length control and high-low DO also succeeded at 15 °C in maintaining the nitrite pathway, despite the fact that the maximum growth rate of nitrite-oxidizing bacteria at temperatures below 20 °C is in general higher than that of ammonium-oxidizing bacteria. PMID:25006970

  19. Influence of F0 and Sequence Length of Audio and Electroglottographic Signals on Perturbation Measures for Voice Assessment.

    PubMed

    Hohm, Julian; Döllinger, Michael; Bohr, Christopher; Kniesburges, Stefan; Ziethe, Anke

    2015-07-01

    Within the functional assessment of voice disorders, an objective analysis of measured parameters from audio, electroglottographic (EGG), or visual signals is desired. In a typical clinical situation, reliable objective analysis is not always possible due to missing standardization and unknown stability of the clinical parameters. The aim of this study was to investigate the robustness/stability of measured clinical parameters of the audio and EGG signals in a typical clinical setting to ensure a reliable objective analysis. In particular, the influence of F0 and of the sequence length on several definitions of jitter and shimmer will be analyzed. Seventy-four young healthy women produced a sustained vowel /a/ and an upward triad with abrupt changeovers. Different sequence lengths (100, 150, 500, and 1000 ms) of sustained phonation and triads (100 and 150 ms) were extracted from the audio and EGG signals. In total, six variations of jitter and four variations of shimmer parameters were analyzed. Jitter%, Jitter11p, and JitterPPQ of the audio signal as well as Jittermean, Shimmer, and Shimmer11p of the EGG signal are unaffected by both sequence length and F0. Influence of F0 and sequence length on several perturbation measures of the audio and EGG signals was identified. For an objective clinical voice assessment, unaffected definitions of jitter and shimmer should be preferred and applied to enable comparability between different recordings, examinations, and studies. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
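
    To make the perturbation measures in the abstract above concrete, the sketch below computes a commonly used "local" definition: the mean absolute difference between consecutive cycles divided by the mean value, in percent. Applied to cycle periods it gives a jitter percentage, and applied to cycle peak amplitudes a shimmer percentage. This is an assumption for illustration; the abstract does not spell out the six jitter and four shimmer variants the study compared, and the period values are invented. The helper `cycles_in_window` shows why F0 and sequence length interact: a window of fixed duration contains roughly F0 × duration cycles.

```python
def local_perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference relative to the mean value, in percent.

    Pass cycle periods for a jitter-like measure or cycle peak amplitudes for a
    shimmer-like measure. This is one common definition, not necessarily the
    one used in the study.
    """
    values = list(values)
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def cycles_in_window(f0_hz, window_ms):
    """Approximate number of glottal cycles in a window of the given duration."""
    return f0_hz * window_ms / 1000.0


if __name__ == "__main__":
    periods_ms = [5.0, 5.1, 4.9, 5.0]  # hypothetical cycle periods (about 200 Hz)
    print(f"local jitter: {local_perturbation_percent(periods_ms):.2f} %")
    for window_ms in (100, 150, 500, 1000):  # sequence lengths named in the abstract
        print(window_ms, "ms ->", cycles_in_window(200.0, window_ms), "cycles at 200 Hz")
```

    Short windows at low F0 contain few cycles, so any average over consecutive-cycle differences rests on few samples, which is one plausible reason some definitions are sensitive to sequence length.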

  20. Structural analysis of two length variants of the rDNA intergenic spacer from Eruca sativa.

    PubMed

    Lakshmikumaran, M; Negi, M S

    1994-03-01

    Restriction enzyme analysis of the rRNA genes of Eruca sativa indicated the presence of many length variants within a single plant and also between different cultivars which is unusual for most crucifers studied so far. Two length variants of the rDNA intergenic spacer (IGS) from a single individual E. sativa (cv. Itsa) plant were cloned and characterized. The complete nucleotide sequences of both the variants (3 kb and 4 kb) were determined. The intergenic spacer contains three families of tandemly repeated DNA sequences denoted as A, B and C. However, the long (4 kb) variant shows the presence of an additional repeat, denoted as D, which is a duplication of a 224 bp sequence just upstream of the putative transcription initiation site. Repeat units belonging to the three different families (A, B and C) were in the size range of 22 to 30 bp. Such short repeat elements are present in the IGS of most of the crucifers analysed so far. Sequence analysis of the variants (3 kb and 4 kb) revealed that the length heterogeneity of the spacer is located at three different regions and is due to the varying copy numbers of repeat units belonging to families A and B. Length variation of the spacer is also due to the presence of a large duplication (D repeats) in the 4 kb variant which is absent in the 3 kb variant. The putative transcription initiation site was identified by comparisons with the rDNA sequences from other plant species.

  1. DNA interactions with a Methylene Blue redox indicator depend on the DNA length and are sequence specific.

    PubMed

    Farjami, Elaheh; Clima, Lilia; Gothelf, Kurt V; Ferapontova, Elena E

    2010-06-01

    A DNA molecular beacon approach was used for the analysis of interactions between DNA and Methylene Blue (MB) as a redox indicator of a hybridization event. DNA hairpin structures of different length and guanine (G) content were immobilized onto gold electrodes in their folded states through the alkanethiol linker at the 5'-end. Binding of MB to the folded hairpin DNA was electrochemically studied and compared with binding to the duplex structure formed by hybridization of the hairpin DNA to a complementary DNA strand. Variation of the electrochemical signal from the DNA-MB complex was shown to depend primarily on the DNA length and sequence used: the G-C base pairs were the preferential sites of MB binding in the duplex. For short, 20-nt-long DNA sequences, the increased electrochemical response from MB bound to the duplex structure was consistent with the increased amount of bound and electrochemically readable MB molecules (i.e. MB molecules that are available for the electron transfer (ET) reaction with the electrode). With longer DNA sequences, the balance between the amounts of the electrochemically readable MB molecules bound to the hairpin DNA and to the hybrid was opposite: a part of the MB molecules bound to the long-sequence DNA duplex seems to be electrochemically mute due to the long ET distance. The increasing electrochemical response from MB bound to the short-length DNA hybrid contrasts with the decreasing signal from MB bound to the long-length DNA hybrid and allows an "off"-"on" genosensor development.

  2. Homogeneity of the 16S rDNA sequence among geographically disparate isolates of Taylorella equigenitalis

    PubMed Central

    Matsuda, M; Tazumi, A; Kagawa, S; Sekizuka, T; Murayama, O; Moore, JE; Millar, BC

    2006-01-01

    Background: At present, six accessible sequences of 16S rDNA from Taylorella equigenitalis (T. equigenitalis) are available, whose sequence differences occur at a few nucleotide positions. Thus it is important to determine these sequences from additional strains in other countries, if possible, in order to clarify any anomalies regarding 16S rDNA sequence heterogeneity. Here, we clone and sequence the approximate full-length 16S rDNA from additional strains of T. equigenitalis isolated in Japan, Australia and France and compare these sequences to the existing published sequences. Results: Clarification of any anomalies regarding 16S rDNA sequence heterogeneity of T. equigenitalis was carried out. Upon cloning, sequencing and comparison of the approximate full-length 16S rDNA from 17 strains of T. equigenitalis isolated in Japan, Australia and France, nucleotide sequence differences were demonstrated at six loci in the 1,469 nucleotide sequence. Moreover, 12 polymorphic sites occurred among 23 sequences of the 16S rDNA, including the six reference sequences. Conclusion: High sequence similarity (99.5% or more) was observed throughout, except from nucleotide positions 138 to 501 where substitutions and deletions were noted. PMID:16398935

  3. Classification of Kiwifruit Grades Based on Fruit Shape Using a Single Camera

    PubMed Central

    Fu, Longsheng; Sun, Shipeng; Li, Rui; Wang, Shaojin

    2016-01-01

    This study aims to demonstrate the feasibility for classifying kiwifruit into shape grades by adding a single camera to current Chinese sorting lines equipped with weight sensors. Image processing methods are employed to calculate fruit length, maximum diameter of the equatorial section, and projected area. A stepwise multiple linear regression method is applied to select significant variables for predicting minimum diameter of the equatorial section and volume and to establish corresponding estimation models. Results show that length, maximum diameter of the equatorial section and weight are selected to predict the minimum diameter of the equatorial section, with the coefficient of determination of only 0.82 when compared to manual measurements. Weight and length are then selected to estimate the volume, which is in good agreement with the measured one with the coefficient of determination of 0.98. Fruit classification based on the estimated minimum diameter of the equatorial section achieves a low success rate of 84.6%, which is significantly improved using a linear combination of the length/maximum diameter of the equatorial section and projected area/length ratios, reaching 98.3%. Thus, it is possible for Chinese kiwifruit sorting lines to reach international standards of grading kiwifruit on fruit shape classification by adding a single camera. PMID:27376292

  4. Variation in Size and Form between Left and Right Maxillary Central Incisor Teeth.

    PubMed

    Vadavadagi, Suneel V; Hombesh, M N; Choudhury, Gopal Krishna; Deshpande, Sumith; Anusha, C V; Murthy, D Kiran

    2015-02-01

    To compare the variation in size of left and right maxillary central incisors for male patients (using digital calipers of 0.01 mm accuracy). To compare the variation in size of left and right maxillary central incisors for female patients (using digital calipers of 0.01 mm accuracy). To find out the difference between the maxillary central incisors of men and women, and its clinical applicability if a difference exists. A total of 70 dental students of PMNM Dental College and Hospital were selected, of whom 40 were male and 30 female. Impressions were made for all subjects, using irreversible hydrocolloid (Algitex, manufacturer DPI, Batch-T-8804) and perforated stock metal trays. The mesiodistal crown width and cervical width were measured for each incisor and recorded separately for left and right teeth. The length was measured for each incisor and recorded separately for left and right maxillary central incisors using a digitec height caliper. The mean value of maximum crown length of the maxillary left central incisor in males was greater than that of the maxillary right central incisor. The mean value of maximum crown length for male patients on both the right and left sides was greater than the maximum crown length of female patients. When the dimensions of the teeth were compared between the two sexes, the male group showed larger values than the female group.

  5. Simple tools for assembling and searching high-density picolitre pyrophosphate sequence data.

    PubMed

    Parker, Nicolas J; Parker, Andrew G

    2008-04-18

    The advent of pyrophosphate sequencing makes large volumes of sequencing data available at a lower cost than previously possible. However, the short read lengths are difficult to assemble and the large dataset is difficult to handle. During the sequencing of a virus from the tsetse fly, Glossina pallidipes, we found the need for tools to search quickly a set of reads for near exact text matches. A set of tools is provided to search a large data set of pyrophosphate sequence reads under a "live" CD version of Linux on a standard PC that can be used by anyone without prior knowledge of Linux and without having to install a Linux setup on the computer. The tools permit short lengths of de novo assembly, checking of existing assembled sequences, selection and display of reads from the data set and gathering counts of sequences in the reads. Demonstrations are given of the use of the tools to help with checking an assembly against the fragment data set; investigating homopolymer lengths, repeat regions and polymorphisms; and resolving inserted bases caused by incomplete chain extension. The additional information contained in a pyrophosphate sequencing data set beyond a basic assembly is difficult to access due to a lack of tools. The set of simple tools presented here would allow anyone with basic computer skills and a standard PC to access this information.
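
    One of the checks mentioned in the abstract above, investigating homopolymer lengths, can be illustrated in a few lines. The sketch below is not the authors' tool set; it is a stand-alone example with an invented read, and the function name `homopolymer_runs` is hypothetical. It lists every run of a repeated base at or above a chosen length, which is where pyrophosphate reads are most prone to length errors.

```python
from itertools import groupby


def homopolymer_runs(read, min_len=3):
    """Return (start, base, length) for each run of identical bases >= min_len."""
    runs = []
    position = 0
    for base, group in groupby(read):
        run_length = len(list(group))
        if run_length >= min_len:
            runs.append((position, base, run_length))
        position += run_length
    return runs


if __name__ == "__main__":
    example_read = "ACGGGGTTTAC"  # made-up read for illustration
    for start, base, length in homopolymer_runs(example_read):
        print(f"{base} x {length} at position {start}")
```

    Comparing the run lengths reported for the same locus across overlapping reads is one simple way to spot an inserted or missing base in a homopolymer stretch.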

  6. cWINNOWER algorithm for finding fuzzy dna motifs

    NASA Technical Reports Server (NTRS)

    Liang, S.; Samanta, M. P.; Biegel, B. A.

    2004-01-01

    The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if a clique consisting of a sufficiently large number of mutated copies of the motif (i.e., the signals) is present in the DNA sequence. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum detectable clique size qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12,000 for (l, d) = (15, 4). Copyright Imperial College Press.
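
    The signal definition used in the abstract above (a pattern of length l differing from the motif in at most d positions) is a Hamming-distance condition. The sketch below only illustrates that definition by scanning a sequence for windows within d mismatches of a known motif. It is not the cWINNOWER algorithm, which has to find the motif without knowing it in advance; the sequence and motif are invented examples.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_signals(sequence, motif, d):
    """Start positions of all windows within d mismatches of the motif."""
    l = len(motif)
    return [
        i
        for i in range(len(sequence) - l + 1)
        if hamming(sequence[i:i + l], motif) <= d
    ]


if __name__ == "__main__":
    sequence = "AACGTTTACGAAA"  # made-up sequence
    print(find_signals(sequence, "ACGT", 1))  # an (l, d) = (4, 1) toy case
```

    By the triangle inequality, two signals derived from the same motif differ from each other in at most 2d positions, which is the property that clique-based methods exploit when the motif itself is unknown.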

  7. cWINNOWER Algorithm for Finding Fuzzy DNA Motifs

    NASA Technical Reports Server (NTRS)

    Liang, Shoudan

    2003-01-01

    The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if multiple mutated copies of the motif (i.e., the signals) are present in the DNA sequence in sufficient abundance. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum number of detectable motifs qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc, by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12000 for (l,d) = (15,4).

  8. Body growth and life history in wild mountain gorillas (Gorilla beringei beringei) from Volcanoes National Park, Rwanda.

    PubMed

    Galbany, Jordi; Abavandimwe, Didier; Vakiener, Meagan; Eckardt, Winnie; Mudakikwa, Antoine; Ndagijimana, Felix; Stoinski, Tara S; McFarlin, Shannon C

    2017-07-01

    Great apes show considerable diversity in socioecology and life history, but knowledge of their physical growth in natural settings is scarce. We characterized linear body size growth in wild mountain gorillas from Volcanoes National Park, Rwanda, a population distinguished by its extreme folivory and accelerated life histories. In 131 individuals (0.09-35.26 years), we used non-invasive parallel laser photogrammetry to measure body length, back width, arm length and two head dimensions. Nonparametric LOESS regression was used to characterize cross-sectional distance and velocity growth curves for males and females, and consider links with key life history milestones. Sex differences became evident between 8.5 and 10.0 years of age. Thereafter, female growth velocities declined, while males showed increased growth velocities until 10.0-14.5 years across dimensions. Body dimensions varied in growth; females and males reached 98% of maximum body length at 11.7 and 13.1 years, respectively. Females attained 95.3% of maximum body length by mean age at first birth. Neonates were 31% of maternal size, and doubled in size by mean weaning age. Males reached maximum body and arm length and back width before emigration, but experienced continued growth in head dimensions. While comparable data are scarce, our findings provide preliminary support for the prediction that mountain gorillas reach maximum body size at earlier ages compared to more frugivorous western gorillas. Data from other wild populations are needed to better understand comparative great ape development, and investigate links between trajectories of physical, behavioral, and reproductive maturation. © 2017 Wiley Periodicals, Inc.

  9. Draft Genome Sequence of Pseudomonas sp. Strain LFM046, a Producer of Medium-Chain-Length Polyhydroxyalkanoate

    PubMed Central

    Cardinali-Rezende, Juliana; Alexandrino, Paulo Moises Raduan; Nahat, Rafael Augusto Theodoro Pereira de Souza; Sant’Ana, Débora Parrine Vieira; Silva, Luiziana Ferreira; Gomez, José Gregório Cabrera

    2015-01-01

    Pseudomonas sp. LFM046 is a medium-chain-length polyhydroxyalkanoate (PHAMCL) producer capable of using various carbon sources (carbohydrates, organic acids, and vegetable oils) and was first isolated from sugarcane cultivation soil in Brazil. The genome sequence was found to be 5.97 Mb long with a G+C content of 66%. PMID:26294616

  10. Using specific length amplified fragment sequencing to construct the high-density genetic map for Vitis (Vitis vinifera L. × Vitis amurensis Rupr.)

    PubMed Central

    Guo, Yinshan; Shi, Guangli; Liu, Zhendong; Zhao, Yuhui; Yang, Xiaoxu; Zhu, Junchi; Li, Kun; Guo, Xiuwu

    2015-01-01

    In this study, 149 F1 plants from the interspecific cross between ‘Red Globe’ (Vitis vinifera L.) and ‘Shuangyou’ (Vitis amurensis Rupr.) and the parents were used to construct a molecular genetic linkage map by using the specific length amplified fragment sequencing technique. DNA sequencing generated 41.282 Gb data consisting of 206,411,693 paired-end reads. The average sequencing depths were 68.35 for ‘Red Globe,’ 63.65 for ‘Shuangyou,’ and 8.01 for each progeny. In all, 115,629 high-quality specific length amplified fragments were detected, of which 42,279 were polymorphic. The genetic map was constructed using 7,199 of these polymorphic markers. These polymorphic markers were assigned to 19 linkage groups; the total length of the map was 1929.13 cM, with an average distance of 0.28 cM between adjacent markers. To our knowledge, the genetic maps constructed in this study contain the largest number of molecular markers. These high-density genetic maps might form the basis for the fine quantitative trait loci mapping and molecular-assisted breeding of grape. PMID:26089826

  11. The complete chloroplast genome sequence of Citrus sinensis (L.) Osbeck var 'Ridge Pineapple': organization and phylogenetic relationships to other angiosperms

    PubMed Central

    Bausher, Michael G; Singh, Nameirakpam D; Lee, Seung-Bum; Jansen, Robert K; Daniell, Henry

    2006-01-01

    Background The production of Citrus, the largest fruit crop of international economic value, has recently been imperiled due to the introduction of the bacterial disease Citrus canker. No significant improvements have been made to combat this disease by plant breeding and nuclear transgenic approaches. Chloroplast genetic engineering has a number of advantages over nuclear transformation; it not only increases transgene expression but also facilitates transgene containment, which is one of the major impediments for development of transgenic trees. We have sequenced the Citrus chloroplast genome to facilitate genetic improvement of this crop and to assess phylogenetic relationships among major lineages of angiosperms. Results The complete chloroplast genome sequence of Citrus sinensis is 160,129 bp in length, and contains 133 genes (89 protein-coding, 4 rRNAs and 30 distinct tRNAs). Genome organization is very similar to the inferred ancestral angiosperm chloroplast genome. However, in Citrus the infA gene is absent. The inverted repeat region has expanded to duplicate rps19 and the first 84 amino acids of rpl22. The rpl22 gene in the IRb region has a nonsense mutation resulting in 9 stop codons. This was confirmed by PCR amplification and sequencing using primers that flank the IR/LSC boundaries. Repeat analysis identified 29 direct and inverted repeats 30 bp or longer with a sequence identity ≥ 90%. Comparison of protein-coding sequences with expressed sequence tags revealed six putative RNA edits, five of which resulted in non-synonymous modifications in petL, psbH, ycf2 and ndhA. Phylogenetic analyses using maximum parsimony (MP) and maximum likelihood (ML) methods of a dataset composed of 61 protein-coding genes for 30 taxa provide strong support for the monophyly of several major clades of angiosperms, including monocots, eudicots, rosids and asterids. 
The MP and ML trees are incongruent in three areas: the position of Amborella and Nymphaeales, relationship of the magnoliid genus Calycanthus, and the monophyly of the eurosid I clade. Both MP and ML trees provide strong support for the monophyly of eurosids II and for the placement of Citrus (Sapindales) sister to a clade including the Malvales/Brassicales. Conclusion This is the first complete chloroplast genome sequence for a member of the Rutaceae and Sapindales. Expansion of the inverted repeat region to include rps19 and part of rpl22 and presence of two truncated copies of rpl22 is unusual among sequenced chloroplast genomes. Availability of a complete Citrus chloroplast genome sequence provides valuable information on intergenic spacer regions and endogenous regulatory sequences for chloroplast genetic engineering. Phylogenetic analyses resolve relationships among several major clades of angiosperms and provide strong support for the monophyly of the eurosid II clade and the position of the Sapindales sister to the Brassicales/Malvales. PMID:17010212

  12. Gate length variation effect on performance of gate-first self-aligned In₀.₅₃Ga₀.₄₇As MOSFET.

    PubMed

    Mohd Razip Wee, Mohd F; Dehzangi, Arash; Bollaert, Sylvain; Wichmann, Nicolas; Majlis, Burhanuddin Y

    2013-01-01

    A multi-gate n-type In₀.₅₃Ga₀.₄₇As MOSFET is fabricated using a gate-first self-aligned method and air-bridge technology. Devices with different gate lengths were fabricated with an 8 nm thick Al₂O₃ oxide layer. In this letter, the impact of gate length variation on device parameters such as threshold voltage, high and low voltage transconductance, subthreshold swing and off current is investigated at room temperature. Scaling the gate length revealed a good enhancement in all investigated parameters, but a negative shift in threshold voltage was observed for shorter gate lengths. A high drain current of 1.13 A/mm and a maximum extrinsic transconductance of 678 mS/mm, with a field effect mobility of 364 cm²/V·s, are achieved for a gate length and width of 0.2 µm and 30 µm, respectively. The source/drain overlap length for the device is estimated at about 51 nm, with a leakage current on the order of 10⁻⁸ A. The results of RF measurements of cut-off and maximum oscillation frequency for devices with different gate lengths are compared.

  13. Gate Length Variation Effect on Performance of Gate-First Self-Aligned In0.53Ga0.47As MOSFET

    PubMed Central

    Mohd Razip Wee, Mohd F.; Dehzangi, Arash; Bollaert, Sylvain; Wichmann, Nicolas; Majlis, Burhanuddin Y.

    2013-01-01

    A multi-gate n-type In0.53Ga0.47As MOSFET is fabricated using a gate-first self-aligned method and air-bridge technology. Devices with different gate lengths were fabricated with an 8 nm thick Al2O3 oxide layer. In this letter, the impact of gate length variation on device parameters such as threshold voltage, high and low voltage transconductance, subthreshold swing and off current is investigated at room temperature. Scaling the gate length revealed a good enhancement in all investigated parameters, but a negative shift in threshold voltage was observed for shorter gate lengths. A high drain current of 1.13 A/mm and a maximum extrinsic transconductance of 678 mS/mm, with a field effect mobility of 364 cm²/V·s, are achieved for a gate length and width of 0.2 µm and 30 µm, respectively. The source/drain overlap length for the device is estimated at about 51 nm, with a leakage current on the order of 10⁻⁸ A. The results of RF measurements of cut-off and maximum oscillation frequency for devices with different gate lengths are compared. PMID:24367548

  14. End-to-end distance and contour length distribution functions of DNA helices

    NASA Astrophysics Data System (ADS)

    Zoli, Marco

    2018-06-01

    I present a computational method to evaluate the end-to-end and the contour length distribution functions of short DNA molecules described by a mesoscopic Hamiltonian. The method generates a large statistical ensemble of possible configurations for each dimer in the sequence, selects the global equilibrium twist conformation for the molecule, and determines the average base pair distances along the molecule backbone. Integrating over the base pair radial and angular fluctuations, I derive the room temperature distribution functions as a function of the sequence length. The obtained values for the most probable end-to-end distance and contour length distance, providing a measure of the global molecule size, are used to examine the DNA flexibility at short length scales. It is found that, even in molecules with fewer than ~60 base pairs, coiled configurations maintain a large statistical weight and, consistently, the persistence lengths may be much smaller than in kilo-base DNA.

  15. In Planta Synthesis of Designer-Length Tobacco Mosaic Virus-Based Nano-Rods That Can Be Used to Fabricate Nano-Wires.

    PubMed

    Saunders, Keith; Lomonossoff, George P

    2017-01-01

    We have utilized plant-based transient expression to produce tobacco mosaic virus (TMV)-based nano-rods of predetermined lengths. This is achieved by expressing RNAs containing the TMV origin of assembly sequence (OAS) and the sequence of the TMV coat protein either on the same RNA molecule or on two separate constructs. We show that the length of the resulting nano-rods is dependent upon the length of the RNA that possesses the OAS element. By expressing a version of the TMV coat protein that incorporates a metal-binding peptide at its C-terminus in the presence of RNA containing the OAS we have been able to produce nano-rods of predetermined length that are coated with cobalt-platinum. These nano-rods have the properties of defined-length nano-wires that make them ideal for many developing bionanotechnological processes.

  16. In Planta Synthesis of Designer-Length Tobacco Mosaic Virus-Based Nano-Rods That Can Be Used to Fabricate Nano-Wires

    PubMed Central

    Saunders, Keith; Lomonossoff, George P.

    2017-01-01

    We have utilized plant-based transient expression to produce tobacco mosaic virus (TMV)-based nano-rods of predetermined lengths. This is achieved by expressing RNAs containing the TMV origin of assembly sequence (OAS) and the sequence of the TMV coat protein either on the same RNA molecule or on two separate constructs. We show that the length of the resulting nano-rods is dependent upon the length of the RNA that possesses the OAS element. By expressing a version of the TMV coat protein that incorporates a metal-binding peptide at its C-terminus in the presence of RNA containing the OAS we have been able to produce nano-rods of predetermined length that are coated with cobalt-platinum. These nano-rods have the properties of defined-length nano-wires that make them ideal for many developing bionanotechnological processes. PMID:28878782

  17. Kinematics and Kinetics of Taekwon-do Side Kick

    PubMed Central

    Wąsik, Jacek

    2011-01-01

    The aim of the paper is to present an analysis of the influence of selected kinematic factors on the side kick technique. This issue is especially important in the traditional version of taekwon-do, in which a single strike may reveal the winner. Six taekwon-do (International Taekwon-do Federation) athletes were asked to participate in this case study. Generally accepted criteria of sports technique biomechanical analysis were adhered to. The athletes executed a side kick three times (in Taekwon-do terminology referred to as yop chagi) in a way which they use the kick in board breaking. The obtained data were used to determine the mean velocity changes in the function of relative extension length of the kicking leg. The maximum knee and foot velocities in the Cartesian coordinate system were determined. The leg lifting time and the duration of kick execution as well as the maximum force which the standing foot exerted on the ground were also determined. On the basis of the obtained values, mean values and standard deviations were calculated. The correlation dependence (r=0.72) shows that greater knee velocity affects the velocity which the foot develops as well as the fact that the total time of kick execution depends on the velocity which the knee (r = −0.59) and the foot (r = − 0.86) develop in the leg lifting phase. The average maximum speed was obtained at the length of the leg equal to 82% of the maximum length of the fully extended leg. This length can be considered the optimum value for achieving the maximum dynamics of the kick. PMID:23486086

  18. Intraspecific variability of Steinernema feltiae strains from Cemoro Lawang, eastern Java, Indonesia.

    PubMed

    Addis, T; Mulawarman, M; Waeyenberge, L; Moens, M; Viaene, N; Ehlers, R U

    2010-01-01

    Four strains of Steinernema feltiae from Eastern Java, Indonesia were characterized based on morphometric, morphological and molecular data. In addition, their virulence against last instar Tenebrio molitor and their heat tolerance were tested. Infective juveniles have a mean body length ranging from 749 to 792 µm. The maximum sequence difference among the four strains was 7 bp (8.8%) in the ITS and 2 bp (0.3%) in D2D3 regions of the rDNA. None of the strains is reproductively isolated, and all can reproduce with the European strain S. feltiae Owiplant. The lowest LC50 was observed for strain SCM (373) and the highest for S. feltiae strain Owiplant (458) IJs/40 T. molitor. All four strains showed relatively better mean heat tolerance when compared with S. feltiae Owiplant, both in adapted and non-adapted heat tolerance experiments.

  19. Towards predicting the encoding capability of MR fingerprinting sequences.

    PubMed

    Sommer, K; Amthor, T; Doneva, M; Koken, P; Meineke, J; Börnert, P

    2017-09-01

    Sequence optimization and appropriate sequence selection is still an unmet need in magnetic resonance fingerprinting (MRF). The main challenge in MRF sequence design is the lack of an appropriate measure of the sequence's encoding capability. To find such a measure, three different candidates for judging the encoding capability have been investigated: local and global dot-product-based measures judging dictionary entry similarity as well as a Monte Carlo method that evaluates the noise propagation properties of an MRF sequence. Consistency of these measures for different sequence lengths as well as the capability to predict actual sequence performance in both phantom and in vivo measurements was analyzed. While the dot-product-based measures yielded inconsistent results for different sequence lengths, the Monte Carlo method was in a good agreement with phantom experiments. In particular, the Monte Carlo method could accurately predict the performance of different flip angle patterns in actual measurements. The proposed Monte Carlo method provides an appropriate measure of MRF sequence encoding capability and may be used for sequence optimization. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. Effects of Step Length, Age, and Fall History on Hip and Knee Kinetics and Knee Co-contraction during the Maximum Step Length Test

    PubMed Central

    Schulz, Brian W.; Jongprasithporn, Manutchanok; Hart-Hughes, Stephanie J.; Bulat, Tatjana

    2017-01-01

    Background Maximum step length is a brief clinical test involving stepping out and back as far as possible with the arms folded across the chest. This test has been shown to predict fall risk, but the biomechanics of this test are not fully understood. Knee and hip kinetics (moments and powers) are greater for longer steps and for younger subjects, but younger subjects also step farther. Methods To separate effects of step length, age, and fall history on joint kinetics, 14 healthy younger subjects, 14 older non-fallers, and 11 older fallers (27(5), 72(5), 75(6) years respectively) all stepped to the same relative target distances of 20-80% of their height. Knee and hip kinetics and knee co-contraction were calculated. Findings Hip and knee kinetics and knee co-contraction all increased with step length, but older non-fallers and fallers utilized greater stepping hip and less stepping knee extensor kinetics. Fallers had greater stepping knee co-contraction than non-fallers. Stance knee co-contraction of non-fallers was similar to young for shorter steps and similar to fallers for longer steps. Interpretation Age had minimal effects and fall history had no effects on joint kinetics of steps to similar distances. Effects of age and fall history on knee co-contraction may contribute to age-related kinetic differences and shorter maximal step lengths of older non-fallers and fallers, but step length correlated with every variable tested. Thus, declines in maximum step length could indicate declines in hip and knee extensor kinetics and impaired performance on similar tasks like recovering from a trip. PMID:23978310

  1. Isolation and characterization of full-length putative alcohol dehydrogenase genes from Polygonum minus

    NASA Astrophysics Data System (ADS)

    Hamid, Nur Athirah Abd; Ismail, Ismanizan

    2013-11-01

    Polygonum minus, locally known as Kesum, is an aromatic herb with a high secondary metabolite content. Alcohol dehydrogenase (ADH) is an important enzyme that catalyzes the reversible oxidation of alcohols to aldehydes in the presence of NAD(P)(H) as a co-factor. The main focus of this research is to identify the ADH gene. Total RNA was extracted from leaves of P. minus treated with 150 μM jasmonic acid. The full-length cDNA sequence of ADH was isolated via rapid amplification of cDNA ends (RACE). Subsequently, in silico analysis was conducted on the full-length cDNA sequence and PCR was performed on genomic DNA to determine the exon and intron organization. Two ADH sequences, designated PmADH1 and PmADH2, were successfully isolated. Both sequences have an ORF of 801 bp that encodes 266 aa residues. Nucleotide sequence comparison of PmADH1 and PmADH2 indicated that the two sequences are highly similar in the ORF region but divergent in the 3' untranslated regions (UTR). The amino acid sequences differ at residue 107: PmADH1 contains a Gly (G) residue while PmADH2 contains a Cys (C) residue. The intron-exon organization of both sequences is also the same, with 3 introns and 4 exons. Based on in silico analysis, both sequences contain the "classical" short chain alcohol dehydrogenases/reductases ((c) SDRs) conserved domain. The results suggest that both sequences are members of the short chain alcohol dehydrogenase family.
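
    The ORF figures quoted in this abstract are mutually consistent if the 801 bp count includes the stop codon, which the abstract does not state explicitly. A minimal sketch of that arithmetic follows; the function name `orf_protein_length` is hypothetical.

```python
def orf_protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acid residues encoded by an ORF of orf_bp base pairs.

    Assumption: when includes_stop is True, the last codon of the ORF is a
    stop codon and contributes no residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    # 801 bp -> 267 codons -> 266 residues once the stop codon is excluded.
    print(orf_protein_length(801))
```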

  2. Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

    PubMed

    Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

    2017-07-01

    PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.

  3. Frequency and Voltage Dependence of the Dielectrophoretic Trapping of Short Lengths of DNA and dCTP in a Nanopipette

    PubMed Central

    Ying, Liming; White, Samuel S.; Bruckbauer, Andreas; Meadows, Lisa; Korchev, Yuri E.; Klenerman, David

    2004-01-01

    The study of the properties of DNA under high electric fields is of both fundamental and practical interest. We have exploited the high electric fields produced locally in the tip of a nanopipette to probe the motion of double- and single-stranded 40-mer DNA, a 1-kb single-stranded DNA, and a single-nucleotide triphosphate (dCTP) just inside and outside the pipette tip at different frequencies and amplitudes of applied voltages. We used dual laser excitation and dual color detection to simultaneously follow two fluorophore-labeled DNA sequences with millisecond time resolution, significantly faster than studies to date. A strong trapping effect was observed during the negative half cycle for all DNA samples and also the dCTP. This effect was maximum below 1 Hz and decreased with higher frequency. We assign this trapping to strong dielectrophoresis due to the high electric field and electric field gradient in the pipette tip. Dielectrophoresis in electrodeless tapered nanostructures has potential applications for controlled mixing and manipulation of short lengths of DNA and other biomolecules, opening new possibilities in miniaturized biological analysis. PMID:14747337

  4. De Novo Transcriptomic Analysis of Peripheral Blood Lymphocytes from the Chinese Goose: Gene Discovery and Immune System Pathway Description

    PubMed Central

    Tariq, Mansoor; Chen, Rong; Yuan, Hongyu; Liu, Yanjie; Wu, Yanan; Wang, Junya; Xia, Chun

    2015-01-01

    Background The Chinese goose is one of the most economically important poultry birds and is a natural reservoir for many avian viruses. However, the nature and regulation of the innate and adaptive immune systems of this waterfowl species are not completely understood due to limited information on the goose genome. Recently, transcriptome sequencing technology was applied in the genomic studies focused on novel gene discovery. Thus, this study described the transcriptome of the goose peripheral blood lymphocytes to identify immunity relevant genes. Principal Findings De novo transcriptome assembly of the goose peripheral blood lymphocytes was sequenced by Illumina-Solexa technology. In total, 211,198 unigenes were assembled from the 69.36 million cleaned reads. The average length, N50 size and the maximum length of the assembled unigenes were 687 bp, 1,298 bp and 18,992 bp, respectively. A total of 36,854 unigenes showed similarity by BLAST search against the NCBI non-redundant (Nr) protein database. For functional classification, 163,161 unigenes were comprised of three Gene Ontology (Go) categories and 67 subcategories. A total of 15,334 unigenes were annotated into 25 eukaryotic orthologous groups (KOGs) categories. Kyoto Encyclopedia of Genes and Genomes (KEGG) database annotated 39,585 unigenes into six biological functional groups and 308 pathways. Among the 2,757 unigenes that participated in the 15 immune system KEGG pathways, 125 of the most important immune relevant genes were summarized and analyzed by STRING analysis to identify gene interactions and relationships. Moreover, 10 genes were confirmed by PCR and analyzed. Of these 125 unigenes, 109 unigenes, approximately 87%, were not previously identified in the goose. 
Conclusion This de novo transcriptome analysis could provide important Chinese goose sequence information and highlights the value of new gene discovery, pathways investigation and immune system gene identification, and comparison with other avian species as useful tools to understand the goose immune system. PMID:25816068
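
    The assembly statistics quoted in this abstract (average length, N50 and maximum length) can be computed from a list of sequence lengths. The sketch below uses the common definition of N50: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. The abstract does not say which tool or exact definition the authors used, and the function name `n50` and the example lengths are illustrative only.

```python
def n50(lengths: list[int]) -> int:
    """N50 of a collection of sequence lengths (common definition)."""
    if not lengths:
        raise ValueError("at least one length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half the total.
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    toy = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # made-up lengths, not study data
    print("mean:", sum(toy) / len(toy))
    print("N50: ", n50(toy))
    print("max: ", max(toy))
```

    Because N50 weights long sequences more heavily than the mean does, an N50 well above the average length, as reported here (1,298 bp versus 687 bp), is typical of assemblies with many short sequences.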

  5. RT-PCR and sequence analysis of the full-length fusion protein of Canine Distemper Virus from domestic dogs.

    PubMed

    Romanutti, Carina; Gallo Calderón, Marina; Keller, Leticia; Mattion, Nora; La Torre, José

    2016-02-01

    During 2007-2014, 84 out of 236 (35.6%) samples from domestic dogs submitted to our laboratory for diagnostic purposes were positive for Canine Distemper Virus (CDV), as analyzed by RT-PCR amplification of a fragment of the nucleoprotein gene. Fifty-nine of them (70.2%) were from dogs that had been vaccinated against CDV. The full-length gene encoding the Fusion (F) protein of fifteen isolates was sequenced and compared with that of those of other CDVs, including wild-type and vaccine strains. Phylogenetic analysis using the F gene full-length sequences grouped all the Argentinean CDV strains in the SA2 clade. Sequence identity with the Onderstepoort vaccine strain was 89.0-90.6%, and the highest divergence was found in the 135 amino acids corresponding to the F protein signal-peptide, Fsp (64.4-66.7% identity). In contrast, this region was highly conserved among the local strains (94.1-100% identity). One extra putative N-glycosylation site was identified in the F gene of CDV Argentinean strains with respect to the vaccine strain. The present report is the first to analyze full-length F protein sequences of CDV strains circulating in Argentina, and contributes to the knowledge of molecular epidemiology of CDV, which may help in understanding future disease outbreaks. Copyright © 2015 Elsevier B.V. All rights reserved.

  6. Software for optimization of SNP and PCR-RFLP genotyping to discriminate many genomes with the fewest assays

    PubMed Central

    Gardner, Shea N; Wagner, Mark C

    2005-01-01

    Background Microbial forensics is important in tracking the source of a pathogen, whether the disease is a naturally occurring outbreak or part of a criminal investigation. Results A method and SPR Opt (SNP and PCR-RFLP Optimization) software to perform a comprehensive, whole-genome analysis to forensically discriminate multiple sequences is presented. Tools for the optimization of forensic typing using Single Nucleotide Polymorphism (SNP) and PCR-Restriction Fragment Length Polymorphism (PCR-RFLP) analyses across multiple isolate sequences of a species are described. The PCR-RFLP analysis includes prediction and selection of optimal primers and restriction enzymes to enable maximum isolate discrimination based on sequence information. SPR Opt calculates all SNP or PCR-RFLP variations present in the sequences, groups them into haplotypes according to their co-segregation across those sequences, and performs combinatoric analyses to determine which sets of haplotypes provide maximal discrimination among all the input sequences. Those set combinations requiring that membership in the fewest haplotypes be queried (i.e. the fewest assays be performed) are found. These analyses highlight variable regions based on existing sequence data. These markers may be heterogeneous among unsequenced isolates as well, and thus may be useful for characterizing the relationships among unsequenced as well as sequenced isolates. The predictions are multi-locus. Analyses of mumps and SARS viruses are summarized. Phylogenetic trees created based on SNPs, PCR-RFLPs, and full genomes are compared for SARS virus, illustrating that purported phylogenies based only on SNP or PCR-RFLP variations do not match those based on multiple sequence alignment of the full genomes. Conclusion This is the first software to optimize the selection of forensic markers to maximize information gained from the fewest assays, accepting whole or partial genome sequence data as input. 
As more sequence data becomes available for multiple strains and isolates of a species, automated, computational approaches such as those described here will be essential to make sense of large amounts of information, and to guide and optimize efforts in the laboratory. The software and source code for SPR Opt is publicly available and free for non-profit use at . PMID:15904493

  7. Length estimations of presumed upward connecting leaders in lightning flashes to flat water and flat ground

    NASA Astrophysics Data System (ADS)

    Stolzenburg, Maribeth; Marshall, Thomas C.; Karunarathne, Sumedhe; Orville, Richard E.

    2018-10-01

    Using video data recorded at 50,000 frames per second for nearby negative lightning flashes, estimates are derived for the length of positive upward connecting leaders (UCLs) that presumably formed prior to new ground attachments. Return strokes were 1.7 to 7.8 km distant, yielding image resolutions of 4.25 to 19.5 m. No UCLs are imaged in these data, indicating those features were too transient or too dim compared to other lightning processes that are imaged at these resolutions. Upper bound lengths for 17 presumed UCLs are determined from the height above flat ground or water of the successful stepped leader tip in the image immediately prior to (within 20 μs before) the return stroke. Better estimates of maximum UCL lengths are determined using the downward stepped leader tip's speed of advance and the estimated return stroke time within its first frame. For 17 strokes, the upper bound length of the possible UCL averages 31.6 m and ranges from 11.3 to 50.3 m. Among the close strokes (those with spatial resolution <8 m per pixel), the five which connected to water (salt water lagoon) have UCL upper bound estimates averaging significantly shorter (24.1 m) than the average for the three close strokes which connected to land (36.9 m). The better estimates of maximum UCL lengths for the eight close strokes average 20.2 m, with slightly shorter average of 18.3 m for the five that connected to water. All the better estimates of UCL maximum lengths are <38 m in this dataset.

  8. Operating length and velocity of human M. vastus lateralis fascicles during vertical jumping

    PubMed Central

    Nikolaidou, Maria Elissavet; Marzilger, Robert; Bohm, Sebastian; Mersmann, Falk

    2017-01-01

    Humans achieve greater jump height during a counter-movement jump (CMJ) than in a squat jump (SJ). However, the crucial difference is the mean mechanical power output during the propulsion phase, which could be determined by intrinsic neuro-muscular mechanisms for power production. We measured M. vastus lateralis (VL) fascicle length changes and activation patterns and assessed the force–length, force–velocity and power–velocity potentials during the jumps. Compared with the SJ, the VL fascicles operated on a more favourable portion of the force–length curve (7% greater force potential, i.e. fraction of VL maximum force according to the force–length relationship) and more disadvantageous portion of the force–velocity curve (11% lower force potential, i.e. fraction of VL maximum force according to the force–velocity relationship) in the CMJ, indicating a reciprocal effect of force–length and force–velocity potentials for force generation. The higher muscle activation (15%) could therefore explain the moderately greater jump height (5%) in the CMJ. The mean fascicle-shortening velocity in the CMJ was closer to the plateau of the power–velocity curve, which resulted in a greater (15%) power–velocity potential (i.e. fraction of VL maximum power according to the power–velocity relationship). Our findings provide evidence for a cumulative effect of three different mechanisms—i.e. greater force–length potential, greater power–velocity potential and greater muscle activity—for an advantaged power production in the CMJ contributing to the marked difference in mean mechanical power (56%) compared with SJ. PMID:28573027

  9. Identification of Medically Important Yeasts Using PCR-Based Detection of DNA Sequence Polymorphisms in the Internal Transcribed Spacer 2 Region of the rRNA Genes

    PubMed Central

    Chen, Y. C.; Eisner, J. D.; Kattar, M. M.; Rassoulian-Barrett, S. L.; LaFe, K.; Yarfitz, S. L.; Limaye, A. P.; Cookson, B. T.

    2000-01-01

    Identification of medically relevant yeasts can be time-consuming and inaccurate with current methods. We evaluated PCR-based detection of sequence polymorphisms in the internal transcribed spacer 2 (ITS2) region of the rRNA genes as a means of fungal identification. Clinical isolates (401), reference strains (6), and type strains (27), representing 34 species of yeasts were examined. The length of PCR-amplified ITS2 region DNA was determined with single-base precision in less than 30 min by using automated capillary electrophoresis. Unique, species-specific PCR products ranging from 237 to 429 bp were obtained from 92% of the clinical isolates. The remaining 8%, divided into groups with ITS2 regions which differed by ≤2 bp in mean length, all contained species-specific DNA sequences easily distinguishable by restriction enzyme analysis. These data, and the specificity of length polymorphisms for identifying yeasts, were confirmed by DNA sequence analysis of the ITS2 region from 93 isolates. Phenotypic and ITS2-based identification was concordant for 427 of 434 yeast isolates examined using sequence identity of ≥99%. Seven clinical isolates contained ITS2 sequences that did not agree with their phenotypic identification, and ITS2-based phylogenetic analyses indicate the possibility of new or clinically unusual species in the Rhodotorula and Candida genera. This work establishes an initial database, validated with over 400 clinical isolates, of ITS2 length and sequence polymorphisms for 34 species of yeasts. We conclude that size and restriction analysis of PCR-amplified ITS2 region DNA is a rapid and reliable method to identify clinically significant yeasts, including potentially new or emerging pathogenic species. PMID:10834993

  10. Discrimination of germline V genes at different sequencing lengths and mutational burdens: A new tool for identifying and evaluating the reliability of V gene assignment.

    PubMed

    Zhang, Bochao; Meng, Wenzhao; Prak, Eline T Luning; Hershberg, Uri

    2015-12-01

    Immune repertoires are collections of lymphocytes that express diverse antigen receptor gene rearrangements consisting of Variable (V), (Diversity (D) in the case of heavy chains) and Joining (J) gene segments. Clonally related cells typically share the same germline gene segments and have highly similar junctional sequences within their third complementarity determining regions. Identifying clonal relatedness of sequences is a key step in the analysis of immune repertoires. The V gene is the most important for clone identification because it has the longest sequence and the greatest number of sequence variants. However, accurate identification of a clone's germline V gene source is challenging because there is a high degree of similarity between different germline V genes. This difficulty is compounded in antibodies, which can undergo somatic hypermutation. Furthermore, high-throughput sequencing experiments often generate partial sequences and have significant error rates. To address these issues, we describe a novel method to estimate which germline V genes (or alleles) cannot be discriminated under different conditions (read lengths, sequencing errors or somatic hypermutation frequencies). Starting with any set of germline V genes, this method measures their similarity using different sequencing lengths and calculates their likelihood of unambiguous assignment under different levels of mutation. Hence, one can identify, under different experimental and biological conditions, the germline V genes (or alleles) that cannot be uniquely identified and bundle them together into groups of specific V genes with highly similar sequences. Copyright © 2015 Elsevier B.V. All rights reserved.

  11. A generalized global alignment algorithm.

    PubMed

    Huang, Xiaoqiu; Chao, Kun-Mao

    2003-01-22

    Homologous sequences are sometimes similar over some regions but different over other regions. Homologous sequences have a much lower global similarity if the different regions are much longer than the similar regions. We present a generalized global alignment algorithm for comparing sequences with intermittent similarities, an ordered list of similar regions separated by different regions. A generalized global alignment model is defined to handle sequences with intermittent similarities. A dynamic programming algorithm is designed to compute an optimal general alignment in time proportional to the product of sequence lengths and in space proportional to the sum of sequence lengths. The algorithm is implemented as a computer program named GAP3 (Global Alignment Program Version 3). The generalized global alignment model is validated by experimental results produced with GAP3 on both DNA and protein sequences. The GAP3 program extends the ability of standard global alignment programs to recognize homologous sequences of lower similarity. The GAP3 program is freely available for academic use at http://bioinformatics.iastate.edu/aat/align/align.html.
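
    The complexity claim in this abstract (time proportional to the product of the sequence lengths, space proportional to their sum) can be illustrated with an ordinary global alignment score computed row by row. The sketch below is a standard Needleman-Wunsch-style recurrence with a linear gap penalty, not the generalized model implemented in GAP3; the function name and the scoring values are assumptions chosen for illustration. It keeps only two rows of the dynamic-programming table, so it returns the optimal score but not the alignment itself.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of strings a and b.

    Runs in O(len(a) * len(b)) time and O(len(b)) extra space by keeping
    only the previous and current rows. Scoring values are illustrative.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b costs gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Best of: substitute/match, gap in b, gap in a.
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]


print(global_alignment_score("ACGT", "AGT"))
```

    Recovering the alignment itself in linear space requires an additional divide-and-conquer step (Hirschberg's technique), which is omitted here to keep the sketch short.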

  12. The length-force behavior and operating length range of squid muscle vary as a function of position in the mantle wall.

    PubMed

    Thompson, Joseph T; Shelton, Ryan M; Kier, William M

    2014-06-15

    Hollow cylindrical muscular organs are widespread in animals and are effective in providing support for locomotion and movement, yet are subject to significant non-uniformities in circumferential muscle strain. During contraction of the mantle of squid, the circular muscle fibers along the inner (lumen) surface of the mantle experience circumferential strains 1.3 to 1.6 times greater than fibers along the outer surface of the mantle. This transmural gradient of strain may require the circular muscle fibers near the inner and outer surfaces of the mantle to operate in different regions of the length-tension curve during a given mantle contraction cycle. We tested the hypothesis that circular muscle contractile properties vary transmurally in the mantle of the Atlantic longfin squid, Doryteuthis pealeii. We found that both the length-twitch force and length-tetanic force relationships of the obliquely striated, central mitochondria-poor (CMP) circular muscle fibers varied with radial position in the mantle wall. CMP circular fibers near the inner surface of the mantle produced higher force relative to maximum isometric tetanic force, P₀, at all points along the ascending limb of the length-tension curve than CMP circular fibers near the outer surface of the mantle. The mean ± s.d. maximum isometric tetanic stresses at L₀ (the preparation length that produced the maximum isometric tetanic force) of 212 ± 105 and 290 ± 166 kN m⁻² for the fibers from the outer and inner surfaces of the mantle, respectively, did not differ significantly (P=0.29). The mean twitch:tetanus ratios for the outer and inner preparations, 0.60 ± 0.085 and 0.58 ± 0.10, respectively, did not differ significantly (P=0.67). The circular fibers did not exhibit length-dependent changes in contraction kinetics when given a twitch stimulus. 
As the stimulation frequency increased, L₀ was approximately 1.06 times longer than LTW, the mean preparation length that yielded maximum isometric twitch force. Sonomicrometry experiments revealed that the CMP circular muscle fibers operated in vivo primarily along the ascending limb of the length-tension curve. The CMP fibers functioned routinely over muscle lengths at which force output ranged from only 85% to 40% of P₀, and during escape jets from 100% to 30% of P₀. Our work shows that the functional diversity of obliquely striated muscles is much greater than previously recognized. © 2014. Published by The Company of Biologists Ltd.

  13. A survey of the sorghum transcriptome using single-molecule long reads

    DOE PAGES

    Abdel-Ghany, Salah E.; Hamilton, Michael; Jacobi, Jennifer L.; ...

    2016-06-24

    Alternative splicing and alternative polyadenylation (APA) of pre-mRNAs greatly contribute to transcriptome diversity, coding capacity of a genome and gene regulatory mechanisms in eukaryotes. Second-generation sequencing technologies have been extensively used to analyse transcriptomes. However, a major limitation of short-read data is that it is difficult to accurately predict full-length splice isoforms. Here we sequenced the sorghum transcriptome using Pacific Biosciences single-molecule real-time long-read isoform sequencing and developed a pipeline called TAPIS (Transcriptome Analysis Pipeline for Isoform Sequencing) to identify full-length splice isoforms and APA sites. Our analysis reveals transcriptome-wide full-length isoforms at an unprecedented scale with over 11,000 novel splice isoforms. Additionally, we uncover APA of ∼11,000 expressed genes and more than 2,100 novel genes. Lastly, these results greatly enhance sorghum gene annotations and aid in studying gene regulation in this important bioenergy crop. The TAPIS pipeline will serve as a useful tool to analyse Iso-Seq data from any organism.

  14. A survey of the sorghum transcriptome using single-molecule long reads

    PubMed Central

    Abdel-Ghany, Salah E.; Hamilton, Michael; Jacobi, Jennifer L.; Ngam, Peter; Devitt, Nicholas; Schilkey, Faye; Ben-Hur, Asa; Reddy, Anireddy S. N.

    2016-01-01

    Alternative splicing and alternative polyadenylation (APA) of pre-mRNAs greatly contribute to transcriptome diversity, coding capacity of a genome and gene regulatory mechanisms in eukaryotes. Second-generation sequencing technologies have been extensively used to analyse transcriptomes. However, a major limitation of short-read data is that it is difficult to accurately predict full-length splice isoforms. Here we sequenced the sorghum transcriptome using Pacific Biosciences single-molecule real-time long-read isoform sequencing and developed a pipeline called TAPIS (Transcriptome Analysis Pipeline for Isoform Sequencing) to identify full-length splice isoforms and APA sites. Our analysis reveals transcriptome-wide full-length isoforms at an unprecedented scale with over 11,000 novel splice isoforms. Additionally, we uncover APA of ∼11,000 expressed genes and more than 2,100 novel genes. These results greatly enhance sorghum gene annotations and aid in studying gene regulation in this important bioenergy crop. The TAPIS pipeline will serve as a useful tool to analyse Iso-Seq data from any organism. PMID:27339290

  15. Production of a full-length infectious GFP-tagged cDNA clone of Beet mild yellowing virus for the study of plant-polerovirus interactions.

    PubMed

    Stevens, Mark; Viganó, Felicita

    2007-04-01

    The full-length cDNA of Beet mild yellowing virus (Broom's Barn isolate) was sequenced and cloned into the vector pLitmus 29 (pBMYV-BBfl). The sequence of BMYV-BBfl (5721 bases) shared 96% and 98% nucleotide identity with the other complete sequences of BMYV (BMYV-2ITB, France and BMYV-IPP, Germany respectively). Full-length capped RNA transcripts of pBMYV-BBfl were synthesised and found to be biologically active in Arabidopsis thaliana protoplasts following electroporation or PEG inoculation when the protoplasts were subsequently analysed using serological and molecular methods. The BMYV sequence was modified by inserting DNA that encoded the jellyfish green fluorescent protein (GFP) into the P5 gene close to its 3' end. A. thaliana protoplasts electroporated with these RNA transcripts were biologically active and up to 2% of transfected protoplasts showed GFP-specific fluorescence. The exploitation of these cDNA clones for the study of the biology of beet poleroviruses is discussed.

  16. Back-Face Strain for Monitoring Stable Crack Extension in Precracked Flexure Specimens

    NASA Technical Reports Server (NTRS)

    Salem, Jonathan A.; Ghosn, Louis J.

    2010-01-01

    Calibrations relating back-face strain to crack length in precracked flexure specimens were developed for different strain gage sizes. The functions were verified via experimental compliance measurements of notched and precracked ceramic beams. Good agreement between the functions and experiments occurred, and fracture toughness was calculated via several operational methods: maximum test load and optically measured precrack length; load at 2 percent crack extension and optical precrack length; maximum load and back-face strain crack length. All the methods gave very comparable results. The initiation toughness, K(sub Ii), was also estimated from the initial compliance and load. The results demonstrate that stability of precracked ceramic specimens tested in four-point flexure is a common occurrence, and that methods such as remotely-monitored load-point displacement are only adequate for detecting stable extension of relatively deep cracks.

  17. Mitochondrial genome of the sweet potato hornworm, Agrius convolvuli (Lepidoptera: Sphingidae), and comparison with other Lepidoptera species.

    PubMed

    Dai, Li-Shang; Li, Sheng; Yu, Hui-Min; Wei, Guo-Qing; Wang, Lei; Qian, Cen; Zhang, Cong-Fen; Li, Jun; Sun, Yu; Zhao, Yue; Zhu, Bao-Jian; Liu, Chao-Liang

    2017-02-01

    In the present study, we sequenced the complete mitochondrial genome (mitogenome) of Agrius convolvuli (Lepidoptera: Sphingidae) and compared it with previously sequenced mitogenomes of lepidopteran species. The mitogenome was a circular molecule, 15 349 base pairs (bp) long, containing 37 genes. The order and orientation of genes in the A. convolvuli mitogenome were similar to those in sequenced mitogenomes of other lepidopterans. All 13 protein-coding genes (PCGs) were initiated by ATN codons, except for the cytochrome c oxidase subunit 1 (cox1) gene, which seemed to be initiated by the codon CGA, as observed in other lepidopterans. Three of the 13 PCGs had the incomplete termination codon T, while the remainder terminated with TAA. Additionally, the codon distributions of the 13 PCGs revealed that Asn, Ile, Leu2, Lys, Phe, and Tyr were the most frequently used codon families. All transfer RNAs were folded into the expected cloverleaf structure except for tRNA-Ser (AGN), which lacked a stable dihydrouridine arm. The length of the adenine (A) + thymine (T)-rich region was 331 bp. This region included the motif ATAGA followed by a 19-bp poly-T stretch and a microsatellite-like (TA)8 element next to the motif ATTTA. Phylogenetic analyses (maximum likelihood and Bayesian methods) showed that A. convolvuli belongs to the family Sphingidae.

  18. The complete mitochondrial genome of Plodia interpunctella (Lepidoptera: Pyralidae) and comparison with other Pyraloidea insects.

    PubMed

    Liu, Qiu-Ning; Chai, Xin-Yue; Bian, Dan-Dan; Zhou, Chun-Lin; Tang, Bo-Ping

    2016-01-01

    The mitochondrial (mt) genome can provide important information for the understanding of phylogenetic relationships. The complete mt genome of Plodia interpunctella (Lepidoptera: Pyralidae) has been sequenced. The circular genome is 15 287 bp in size, encoding 13 protein-coding genes (PCGs), 2 rRNA genes, 22 tRNA genes, and a control region. The AT skew of this mt genome is slightly negative, and the nucleotide composition is biased toward A+T nucleotides (80.15%). All PCGs start with the typical ATN (ATA, ATC, ATG, and ATT) codons, except for the cox1 gene which may start with the CGA codon. Four of the 13 PCGs harbor the incomplete termination codon T or TA. All the tRNA genes are folded into the typical clover-leaf structure of mitochondrial tRNA, except for trnS1 (AGN) in which the DHU arm fails to form a stable stem-loop structure. The overlapping sequences are 35 bp in total and are found in seven different locations. A total of 240 bp of intergenic spacers are scattered in 16 regions. The control region of the mt genome is 327 bp in length and consisted of several features common to the sequenced lepidopteran insects. Phylogenetic analysis based on 13 PCGs using the Maximum Likelihood method shows that the placement of P. interpunctella was within the Pyralidae.
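
    The composition figures quoted in this abstract (A+T content and AT skew) are simple functions of base counts. The sketch below assumes the conventional definition AT skew = (A − T) / (A + T), which the abstract itself does not spell out; the function name and the toy sequence are hypothetical, and ambiguous bases such as N are simply ignored.

```python
from collections import Counter


def at_content_and_skew(seq):
    """Return (A+T fraction, AT skew) for a nucleotide sequence.

    AT skew is taken as (A - T) / (A + T); a negative value means
    T is more frequent than A. Only A, C, G and T are counted.
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    total = a + t + g + c
    at_content = (a + t) / total
    at_skew = (a - t) / (a + t)
    return at_content, at_skew


content, skew = at_content_and_skew("AATTTGC")
print(f"A+T = {content:.2%}, AT skew = {skew:+.3f}")
```

    A "slightly negative" AT skew, as reported for this genome, therefore corresponds to a small excess of T over A on the strand being counted.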

  19. Cloning of three heat shock protein genes (HSP70, HSP90α and HSP90β) and their expressions in response to thermal stress in loach (Misgurnus anguillicaudatus) fed with different levels of vitamin C.

    PubMed

    Yan, Jie; Liang, Xiao; Zhang, Yin; Li, Yang; Cao, Xiaojuan; Gao, Jian

    2017-07-01

    Heat shock protein 70 (HSP70) and 90 (HSP90) are the most broadly studied proteins in HSP families. They play key roles in cells as molecular chaperones, in response to stress conditions such as thermal stress. In this study, full-length cDNA sequences of HSP70, HSP90α and HSP90β from loach Misgurnus anguillicaudatus were cloned. The full-length cDNA of HSP70 in loach was 2332 bp encoding 644 amino acids, while HSP90α and HSP90β were 2586 bp and 2678 bp in length, encoding 729 and 727 amino acids, respectively. The deduced amino acid sequences of HSP70 in loach shared the highest identity with those of Megalobrama amblycephala and Cyprinus carpio. The deduced amino acid sequences of HSP90α and HSP90β in loach both shared the highest identity with those of M. amblycephala. Their mRNA tissue expression results showed that the maximum expressions of HSP70, HSP90α and HSP90β were respectively present in the intestine, brain and kidney of loach. Quantitative real-time PCR was employed to analyze the temporal expressions of HSP70, HSP90α and HSP90β in livers of loaches fed with different levels of vitamin C under thermal stress. Expression levels of the three HSP genes in loach fed the diet without vitamin C supplementation at 0 h of thermal stress were significantly lower than those at 2 h, 6 h, 12 h and 24 h of thermal stress. This indicates that expression of the three HSP genes was sensitive to thermal stress in loach. Expression of the three HSP genes in loaches fed 1000 mg/kg vitamin C was significantly lower than in the other vitamin C groups at many time points of thermal stress, suggesting that 1000 mg/kg dietary vitamin C might reduce the bodily damage caused by thermal stress. This study will be of value for further studies into thermal stress tolerance in loach. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. Large-scale analysis of full-length cDNAs from the tomato (Solanum lycopersicum) cultivar Micro-Tom, a reference system for the Solanaceae genomics.

    PubMed

    Aoki, Koh; Yano, Kentaro; Suzuki, Ayako; Kawamura, Shingo; Sakurai, Nozomu; Suda, Kunihiro; Kurabayashi, Atsushi; Suzuki, Tatsuya; Tsugane, Taneaki; Watanabe, Manabu; Ooga, Kazuhide; Torii, Maiko; Narita, Takanori; Shin-I, Tadasu; Kohara, Yuji; Yamamoto, Naoki; Takahashi, Hideki; Watanabe, Yuichiro; Egusa, Mayumi; Kodama, Motoichiro; Ichinose, Yuki; Kikuchi, Mari; Fukushima, Sumire; Okabe, Akiko; Arie, Tsutomu; Sato, Yuko; Yazawa, Katsumi; Satoh, Shinobu; Omura, Toshikazu; Ezura, Hiroshi; Shibata, Daisuke

    2010-03-30

    The Solanaceae family includes several economically important vegetable crops. The tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. Recently, a number of tomato resources have been developed in parallel with the ongoing tomato genome sequencing project. In particular, a miniature cultivar, Micro-Tom, is regarded as a model system in tomato genomics, and a number of genomics resources in the Micro-Tom-background, such as ESTs and mutagenized lines, have been established by an international alliance. To accelerate the progress in tomato genomics, we developed a collection of 13,227 fully sequenced Micro-Tom full-length cDNAs. By checking redundant sequences, coding sequences, and chimeric sequences, a set of 11,502 non-redundant full-length cDNAs (nrFLcDNAs) was generated. Analysis of untranslated regions demonstrated that tomato has longer 5'- and 3'-untranslated regions than most other plants except rice. Classification of functions of proteins predicted from the coding sequences demonstrated that nrFLcDNAs covered a broad range of functions. A comparison of nrFLcDNAs with genes of sixteen plants facilitated the identification of tomato genes that are not found in other plants, most of which did not have known protein domains. Mapping of the nrFLcDNAs onto currently available tomato genome sequences facilitated prediction of exon-intron structure. Introns of tomato genes were longer than those of Arabidopsis and rice. According to a comparison of exon sequences between the nrFLcDNAs and the tomato genome sequences, the frequency of nucleotide mismatch in exons between Micro-Tom and the genome-sequencing cultivar (Heinz 1706) was estimated to be 0.061%. The collection of Micro-Tom nrFLcDNAs generated in this study will serve as a valuable genomic tool for plant biologists to bridge the gap between basic and applied studies. 
The nrFLcDNA sequences will help annotation of the tomato whole-genome sequence and aid in tomato functional genomics and molecular breeding. Full-length cDNA sequences and their annotations are provided in the database KaFTom http://www.pgb.kazusa.or.jp/kaftom/ via the website of the National Bioresource Project Tomato http://tomato.nbrp.jp.

  1. Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)

    PubMed Central

    Andreassen, Rune; Lunner, Sigbjørn; Høyheim, Bjørn

    2009-01-01

    Background Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to helping distinguish between duplicate genome regions and determine correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full length of the cDNA insert has been determined has been small. Results High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs, polyadenylation signal variation and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). 
This suggests that the remaining cDNA libraries generated by SGP represent a valuable cCDS FLIc source. The conservation of 7-mers in 3'UTRs indicates that these motifs are functionally important. Identity between some of these 7-mers and miRNA target sequences suggests that they are miRNA targets in Salmo salar transcripts as well. PMID:19878547

  2. A survey and evaluations of histogram-based statistics in alignment-free sequence comparison.

    PubMed

    Luczak, Brian B; James, Benjamin T; Girgis, Hani Z

    2017-12-06

    Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover's distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover's distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. The source code of the benchmarking tool is available as Supplementary Materials. © The Author 2017. 
Published by Oxford University Press.
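
    The idea of comparing two sequences through their k-mer histograms, as surveyed in this abstract, can be sketched in a few lines. The two statistics below (Manhattan distance and cosine similarity) are chosen only as familiar examples; whether and how they appear among the statistics evaluated in the survey is not stated in the abstract, and the function names are hypothetical.

```python
import math
from collections import Counter


def kmer_histogram(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def manhattan_distance(h1, h2):
    """Sum of absolute count differences over all k-mers seen in either histogram."""
    return sum(abs(h1[w] - h2[w]) for w in set(h1) | set(h2))


def cosine_similarity(h1, h2):
    """Cosine of the angle between two histograms viewed as count vectors."""
    dot = sum(h1[w] * h2[w] for w in set(h1) & set(h2))
    norm1 = math.sqrt(sum(v * v for v in h1.values()))
    norm2 = math.sqrt(sum(v * v for v in h2.values()))
    return dot / (norm1 * norm2)


h_query = kmer_histogram("ACGTACGT", 2)
h_other = kmer_histogram("ACGTTCGT", 2)
print(manhattan_distance(h_query, h_other), cosine_similarity(h_query, h_other))
```

    Both statistics need only one pass over each sequence plus a pass over the histogram keys, which is the reason alignment-free measures are attractive as a fast pre-filter before quadratic-time alignment.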

  3. DNA Sequencing Using Capillary Electrophoresis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dr. Barry Karger

    2011-05-09

    The overall goal of this program was to develop capillary electrophoresis as the tool to be used to sequence for the first time the Human Genome. Our program was part of the Human Genome Project. In this work, we were highly successful and the replaceable polymer we developed, linear polyacrylamide, was used by the DOE sequencing lab in California to sequence a significant portion of the human genome using the MegaBase multiple capillary array electrophoresis instrument. In this final report, we summarize our efforts and success. We began our work by separating by capillary electrophoresis double strand oligonucleotides using cross-linked polyacrylamide gels in fused silica capillaries. This work showed the potential of the methodology. However, preparation of such cross-linked gel capillaries was difficult with poor reproducibility, and even more important, the columns were not very stable. We improved stability by using non-cross linked linear polyacrylamide. Here, the entangled linear chains could move when osmotic pressure (e.g. sample injection) was imposed on the polymer matrix. This relaxation of the polymer dissipated the stress in the column. Our next advance was to use significantly lower concentrations of the linear polyacrylamide so that the polymer could be automatically blown out after each run and replaced with fresh linear polymer solution. In this way, a new column was available for each analytical run. Finally, while testing many linear polymers, we selected linear polyacrylamide as the best matrix as it was the most hydrophilic polymer available. Under our DOE program, we demonstrated initially the success of the linear polyacrylamide to separate double strand DNA. We note that the method is used even today to assay purity of double stranded DNA fragments. Our focus, of course, was on the separation of single stranded DNA for sequencing purposes. In one paper, we demonstrated the success of our approach in sequencing up to 500 bases. 
Other application papers of sequencing up to this level were also published in the mid-1990s. A major interest of the sequencing community has always been read length. The longer the sequence read per run the more efficient the process as well as the ability to read repeat sequences. We therefore devoted a great deal of time to studying the factors influencing read length in capillary electrophoresis, including polymer type and molecular weight, capillary column temperature, applied electric field, etc. In our initial optimization, we were able to demonstrate, for the first time, the sequencing of over 1000 bases with 90% accuracy. The run required 80 minutes for separation. Sequencing of 1000 bases per column was next demonstrated on a multiple capillary instrument. Our studies revealed that linear polyacrylamide produced the longest read lengths because the hydrophilic single strand DNA had minimal interaction with the very hydrophilic linear polyacrylamide. Any interaction of the DNA with the polymer would lead to broader peaks and lower read length. Another important parameter was the molecular weight of the linear chains. High molecular weight (> 1 MDa) was important to allow the long single strand DNA to reptate through the entangled polymer matrix. In an important paper, we showed an inverse emulsion method to reproducibly prepare linear polyacrylamide polymer with an average molecular weight of 9 MDa. This approach was used in the polymer for sequencing the human genome. Another critical factor in the successful use of capillary electrophoresis for sequencing was the sample preparation method. In the Sanger sequencing reaction, high concentrations of salts and dideoxynucleotides remained. Since the sample was introduced to the capillary column by electrokinetic injection, these salt ions would be favorably injected into the column over the sequencing fragments, thus reducing the signal for longer fragments and hence reducing read length. 
In two papers, we examined the role of individual components of the sequencing reaction and then developed a protocol to reduce the deleterious salts. We demonstrated a robust method for achieving long read length DNA sequencing. Continuing our advances, we next demonstrated the achievement of over 1000 bases in less than one hour with a base calling accuracy of between 98 and 99%. In this work, we implemented energy transfer dyes, which allowed for cleaner differentiation of the 4 dye-labeled terminal nucleotides. In addition, we developed improved base calling software to help read sequence where the separation was only minimal, as occurs at long read lengths. Another critical parameter we studied was column temperature. We demonstrated that read lengths improved as the column temperature was increased from room temperature to 60 C or 70 C. The higher temperature relaxed the DNA chains under the influence of the high electric field.

  4. Minimap2: pairwise alignment for nucleotide sequences.

    PubMed

    Li, Heng

    2018-05-10

    Recent advances in sequencing technologies promise ultra-long reads of ∼100 kilobases (kb) on average, full-length mRNA or cDNA reads in high throughput and genomic contigs over 100 megabases (Mb) in length. Existing alignment programs cannot process such data at scale, or do so inefficiently, which presses for the development of new alignment algorithms. Minimap2 is a general-purpose alignment program to map DNA or long mRNA sequences against a large reference database. It works with accurate short reads of ≥100 bp in length, ≥1 kb genomic reads at error rate ∼15%, full-length noisy Direct RNA or cDNA reads, and assembly contigs or closely related full chromosomes of hundreds of megabases in length. Minimap2 does split-read alignment, employs concave gap cost for long insertions and deletions (INDELs) and introduces new heuristics to reduce spurious alignments. It is 3-4 times as fast as mainstream short-read mappers at comparable accuracy, and is ≥30 times faster than long-read genomic or cDNA mappers at higher accuracy, surpassing most aligners specialized in one type of alignment. Availability: https://github.com/lh3/minimap2. Contact: hengli@broadinstitute.org.
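
    Note on the concave gap cost mentioned in this abstract: one common way to obtain a concave penalty is a two-piece affine cost, which takes the cheaper of two affine penalties so that long gaps are charged less per base than short ones. The sketch below is a minimal illustration in Python. The function name and the parameter values are assumptions chosen for illustration; they are not taken from the abstract.

```python
def concave_gap_cost(length, open1=4, ext1=2, open2=24, ext2=1):
    """Two-piece affine gap cost (illustrative parameters, not from the abstract).

    The first piece (low open, high extension) is cheaper for short gaps;
    the second piece (high open, low extension) is cheaper for long gaps.
    """
    if length <= 0:
        return 0
    return min(open1 + length * ext1, open2 + length * ext2)


# Cost per gap length: the second piece takes over once the gap is long enough.
for gap_length in (1, 10, 20, 100):
    print(gap_length, concave_gap_cost(gap_length))
```

    With these hypothetical values the two pieces cross at a gap length of 20, which is why a 100-base gap costs far less than a single affine penalty would charge.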

  5. The optimum spanning catenary cable

    NASA Astrophysics Data System (ADS)

    Wang, C. Y.

    2015-03-01

    A heavy cable spans two points in space. There exists an optimum cable length such that the maximum tension is minimized. If the two end points are at the same level, the optimum length is 1.258 times the distance between the ends. The optimum lengths for end points of different heights are also found.
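
    The factor 1.258 can be reproduced under the standard catenary model (an assumption here, since the abstract does not give the derivation). For level supports a distance d apart and a catenary y = a cosh(x/a), the tension at a support is proportional to cosh(u)/u with u = d/(2a). It is smallest where u tanh(u) = 1, and the cable length is then d sinh(u)/u. A minimal Python sketch, with a hypothetical function name:

```python
import math


def optimum_length_ratio(tol=1e-12):
    """Cable length divided by span that minimizes the support tension (level supports)."""
    # Solve u * tanh(u) = 1 by bisection; the left side increases with u.
    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid * math.tanh(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    u = 0.5 * (lo + hi)
    # Arc length of the catenary over the span, divided by the span itself.
    return math.sinh(u) / u


print(round(optimum_length_ratio(), 3))
```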

  6. The complete chloroplast genome sequence of Hibiscus syriacus.

    PubMed

    Kwon, Hae-Yun; Kim, Joon-Hyeok; Kim, Sea-Hyun; Park, Ji-Min; Lee, Hyoshin

    2016-09-01

    The complete chloroplast genome sequence of Hibiscus syriacus L. is presented in this study. The genome is 161 019 bp in length, with a typical circular structure containing a pair of inverted repeats of 25 745 bp each, separated by a large single-copy region of 89 698 bp and a small single-copy region of 19 831 bp. The overall GC content is 36.8%. One hundred and fourteen genes were annotated, including 81 protein-coding genes, 4 ribosomal RNA genes and 29 transfer RNA genes.

  7. The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome

    PubMed Central

    Camargo, Anamaria A.; Samaia, Helena P. B.; Dias-Neto, Emmanuel; Simão, Daniel F.; Migotto, Italo A.; Briones, Marcelo R. S.; Costa, Fernando F.; Aparecida Nagai, Maria; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; Sonati, Maria de Fátima; Tajara, Eloiza H.; Valentini, Sandro R.; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Arnaldi, Liliane A. T.; de Assis, Angela M.; Bengtson, Mário Henrique; Bergamo, Nadia Aparecida; Bombonato, Vanessa; de Camargo, Maria E. R.; Canevari, Renata A.; Carraro, Dirce M.; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Corrêa, Rosana F. R.; Costa, Maria Cristina R.; Curcio, Cyntia; Hokama, Paula O. M.; Ferreira, Ari J. S.; Furuzawa, Gilberto K.; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Krieger, José E.; Leite, Luciana C. C.; Majumder, Paromita; Marins, Mozart; Marques, Everaldo R.; Melo, Analy S. A.; Melo, Monica; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana G.; Prevedel, Aline C.; Rahal, Paula; Rainho, Claudia A.; Reis, Eduardo M. 
R.; Ribeiro, Marcelo L.; da Rós, Nancy; de Sá, Renata G.; Sales, Magaly M.; Sant'anna, Simone Cristina; dos Santos, Mariana L.; da Silva, Aline M.; da Silva, Neusa P.; Silva, Wilson A.; da Silveira, Rosana A.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Soares, Fernando; Moreira, Eloisa S.; Nunes, Diana N.; Correa, Ricardo G.; Zalcberg, Heloisa; Carvalho, Alex F.; Reis, Luis F. L.; Brentani, Ricardo R.; Simpson, Andrew J. G.; de Souza, Sandro J.

    2001-01-01

    Open reading frame expressed sequences tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data that correspond to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription–PCR, represents a direct route to transcript finishing that may represent a useful alternative to full-length cDNA cloning. PMID:11593022

  8. The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome.

    PubMed

    Camargo, A A; Samaia, H P; Dias-Neto, E; Simão, D F; Migotto, I A; Briones, M R; Costa, F F; Nagai, M A; Verjovski-Almeida, S; Zago, M A; Andrade, L E; Carrer, H; El-Dorry, H F; Espreafico, E M; Habr-Gama, A; Giannella-Neto, D; Goldman, G H; Gruber, A; Hackel, C; Kimura, E T; Maciel, R M; Marie, S K; Martins, E A; Nobrega, M P; Paco-Larson, M L; Pardini, M I; Pereira, G G; Pesquero, J B; Rodrigues, V; Rogatto, S R; da Silva, I D; Sogayar, M C; Sonati, M F; Tajara, E H; Valentini, S R; Alberto, F L; Amaral, M E; Aneas, I; Arnaldi, L A; de Assis, A M; Bengtson, M H; Bergamo, N A; Bombonato, V; de Camargo, M E; Canevari, R A; Carraro, D M; Cerutti, J M; Correa, M L; Correa, R F; Costa, M C; Curcio, C; Hokama, P O; Ferreira, A J; Furuzawa, G K; Gushiken, T; Ho, P L; Kimura, E; Krieger, J E; Leite, L C; Majumder, P; Marins, M; Marques, E R; Melo, A S; Melo, M B; Mestriner, C A; Miracca, E C; Miranda, D C; Nascimento, A L; Nobrega, F G; Ojopi, E P; Pandolfi, J R; Pessoa, L G; Prevedel, A C; Rahal, P; Rainho, C A; Reis, E M; Ribeiro, M L; da Ros, N; de Sa, R G; Sales, M M; Sant'anna, S C; dos Santos, M L; da Silva, A M; da Silva, N P; Silva, W A; da Silveira, R A; Sousa, J F; Stecconi, D; Tsukumo, F; Valente, V; Soares, F; Moreira, E S; Nunes, D N; Correa, R G; Zalcberg, H; Carvalho, A F; Reis, L F; Brentani, R R; Simpson, A J; de Souza, S J; Melo, M

    2001-10-09

    Open reading frame expressed sequences tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data that correspond to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription-PCR, represents a direct route to transcript finishing that may represent a useful alternative to full-length cDNA cloning.

  9. Score distributions of gapped multiple sequence alignments down to the low-probability tail

    NASA Astrophysics Data System (ADS)

    Fieth, Pascal; Hartmann, Alexander K.

    2016-08-01

    Assessing the significance of alignment scores of optimally aligned DNA or amino acid sequences can be achieved via knowledge of the score distribution of random sequences. But this requires obtaining the distribution in the biologically relevant high-scoring region, where the probabilities are exponentially small. For gapless local alignments of infinitely long sequences this distribution is known analytically to follow a Gumbel distribution. Distributions for gapped local alignments and global alignments of finite lengths can only be obtained numerically. To obtain results for the small-probability region, specific statistical mechanics-based rare-event algorithms can be applied. In previous studies, this was achieved for pairwise alignments. They showed that, contrary to results from previous simple sampling studies, strong deviations from the Gumbel distribution occur in the case of finite sequence lengths. Here we extend the studies to multiple sequence alignments with gaps, which are much more relevant for practical applications in molecular biology. We study the distributions of scores over a large range of the support, reaching probabilities as small as 10^-160, for global and local (sum-of-pair scores) multiple alignments. We find that even after suitable rescaling, eliminating the sequence-length dependence, the distributions for multiple alignment differ from the pairwise alignment case. Furthermore, we also show that the previously discussed Gaussian correction to the Gumbel distribution needs to be refined, also for the case of pairwise alignments.
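
    For orientation, the Gumbel tail referred to in this abstract is usually written in the Karlin-Altschul form P(S >= s) ≈ 1 - exp(-K m n exp(-lambda s)) for sequences of lengths m and n. The Python sketch below evaluates that expression. The function name is hypothetical, and the values of lambda and K in the usage line are placeholders, since in practice they depend on the scoring scheme.

```python
import math


def gumbel_tail(score, m, n, lam, K):
    """P(S >= score) under the Gumbel (Karlin-Altschul) form for gapless local alignment."""
    expected_hits = K * m * n * math.exp(-lam * score)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-expected_hits)


# Placeholder parameters, chosen only to show the call.
print(gumbel_tail(score=50, m=300, n=1_000_000, lam=0.3, K=0.1))
```

    Using expm1 matters in exactly the regime the abstract studies: for very small probabilities, computing 1 - exp(-x) directly would round to zero.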

  10. Genome analysis and identification of gelatinase encoded gene in Enterobacter aerogenes

    NASA Astrophysics Data System (ADS)

    Shahimi, Safiyyah; Mutalib, Sahilah Abdul; Khalid, Rozida Abdul; Repin, Rul Aisyah Mat; Lamri, Mohd Fadly; Bakar, Mohd Faizal Abu; Isa, Mohd Noor Mat

    2016-11-01

    In this study, bioinformatic analysis of the genome sequence of E. aerogenes was carried out to identify the gene encoding gelatinase. Enterobacter aerogenes was isolated from hot spring water and is a gelatinase-producing bacterium specific to porcine and fish gelatin. This bacterium offers the possibility of producing enzymes that are specific to the gelatin of each of these two species. Enterobacter aerogenes was partially genome sequenced, giving a total sequence size of 5.0 mega base pairs (Mbp). The pre-processing pipeline yielded 87.6 Mbp of total reads and 68.8 Mbp of high-quality reads, a high-quality percentage of 78.58%. Genome assembly produced 120 contigs, 67.5% of them longer than 1 kilo base pair (kbp), with an N50 contig length of 124856 bp and a GC content of 55.17%. About 4705 protein-coding genes were identified by protein prediction analysis. The two candidate genes selected had the highest percentage identity to gelatinase enzymes available in the Swiss-Prot and NCBI online databases. They were NODE_9_length_26866_cov_148.013245_12, containing a 1029 base pair (bp) sequence with a 342 amino acid sequence, and NODE_24_length_155103_cov_177.082458_62, containing a 717 bp sequence with a 238 amino acid sequence. Two pairs of primers (forward and reverse) were therefore designed based on the open reading frames (ORFs) of the selected genes. Genome analysis of E. aerogenes thus identified genes encoding gelatinase.
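
    The N50 contig length quoted in this abstract follows a standard definition: sort the contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the assembly size. A minimal Python sketch, using made-up contig lengths rather than data from the study:

```python
def n50(lengths):
    """Length of the contig at which the cumulative sorted length reaches half the total."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50 requires at least one contig")


# Toy assembly (hypothetical lengths in bp).
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```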

  11. Design and implementation of low complexity wake-up receiver for underwater acoustic sensor networks

    NASA Astrophysics Data System (ADS)

    Yue, Ming

    This thesis designs a low-complexity dual Pseudorandom Noise (PN) scheme for identity (ID) detection and coarse frame synchronization. The two PN sequences for a node are identical and are separated by a specified length of gap which serves as the ID of different sensor nodes. The dual PN sequences are short in length but are capable of combating severe underwater acoustic (UWA) multipath fading channels that exhibit time varying impulse responses up to 100 taps. The receiver ID detection is implemented on a microcontroller MSP430F5529 by calculating the correlation between the two segments of the PN sequence with the specified separation gap. When the gap length is matched, the correlator outputs a peak which triggers the wake-up enable. The time index of the correlator peak is used as the coarse synchronization of the data frame. The correlator is implemented by an iterative algorithm that uses only one multiplication and two additions for each sample input regardless of the length of the PN sequence, thus achieving low computational complexity. The real-time processing requirement is also met via direct memory access (DMA) and two circular buffers to accelerate data transfer between the peripherals and the memory. The proposed dual PN detection scheme has been successfully tested by simulated fading channels and real-world measured channels. The results show that, in long multipath channels with more than 60 taps, the proposed scheme achieves high detection rate and low false alarm rate using maximal-length sequences as short as 31 bits to 127 bits, therefore it is suitable as a low-power wake-up receiver. The future research will integrate the wake-up receiver with Digital Signal Processors (DSP) for payload detection.
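
    The constant per-sample cost described in this abstract is what a running-sum correlator provides. The Python sketch below is one way such a detector could look and is not taken from the thesis: each new sample is multiplied by the sample one lag earlier (the lag being the PN length plus the gap), the product is added to a running total, and the product that has left the window is subtracted. The class and function names, and the toy code and gap, are assumptions.

```python
from collections import deque


class GapCorrelator:
    """Running correlation between a stream and its copy delayed by `lag` samples."""

    def __init__(self, window, lag):
        self.samples = deque([0.0] * lag, maxlen=lag)        # last `lag` inputs
        self.products = deque([0.0] * window, maxlen=window)  # last `window` products
        self.total = 0.0

    def push(self, x):
        new_product = x * self.samples[0]             # one multiplication: x[n] * x[n - lag]
        self.total += new_product - self.products[0]  # one addition, one subtraction
        self.samples.append(x)                        # oldest entries drop out automatically
        self.products.append(new_product)
        return self.total


def correlate_stream(signal, window, lag):
    """Feed a whole signal through the correlator and collect its output."""
    correlator = GapCorrelator(window, lag)
    return [correlator.push(x) for x in signal]


# Toy example: a 7-chip +/-1 code sent twice with a gap of 3 samples.
pn = [1, -1, 1, 1, -1, -1, -1]
gap = 3
signal = [0] * 5 + pn + [0] * gap + pn + [0] * 5
output = correlate_stream(signal, window=len(pn), lag=len(pn) + gap)
print(max(output), output.index(max(output)))
```

    When the lag matches the transmitted gap, the output peaks at the last sample of the second copy, and the position of that peak gives the coarse timing. A mismatched lag leaves the products uncorrelated and the peak stays low.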

  12. Systematic Evaluation of the Dependence of Deoxyribozyme Catalysis on Random Region Length

    PubMed Central

    Velez, Tania E.; Singh, Jaydeep; Xiao, Ying; Allen, Emily C.; Wong, On Yi; Chandra, Madhavaiah; Kwon, Sarah C.; Silverman, Scott K.

    2012-01-01

    Functional nucleic acids are DNA and RNA aptamers that bind targets, or they are deoxyribozymes and ribozymes that have catalytic activity. These functional DNA and RNA sequences can be identified from random-sequence pools by in vitro selection, which requires choosing the length of the random region. Shorter random regions allow more complete coverage of sequence space but may not permit the structural complexity necessary for binding or catalysis. In contrast, longer random regions are sampled incompletely but may allow adoption of more complicated structures that enable function. In this study, we systematically examined random region length (N20 through N60) for two particular deoxyribozyme catalytic activities, DNA cleavage and tyrosine-RNA nucleopeptide linkage formation. For both activities, we previously identified deoxyribozymes using only N40 regions. In the case of DNA cleavage, here we found that shorter N20 and N30 regions allowed robust catalytic function, either by DNA hydrolysis or by DNA deglycosylation and strand scission via β-elimination, whereas longer N50 and N60 regions did not lead to catalytically active DNA sequences. Follow-up selections with N20, N30, and N40 regions revealed an interesting interplay of metal ion cofactors and random region length. Separately, for Tyr-RNA linkage formation, N30 and N60 regions provided catalytically active sequences, whereas N20 was unsuccessful, and the N40 deoxyribozymes were functionally superior (in terms of rate and yield) to N30 and N60. Collectively, the results indicate that with future in vitro selection experiments for DNA and RNA catalysts, and by extension for aptamers, random region length should be an important experimental variable. PMID:23088677
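
    The coverage trade-off described in this abstract is simple arithmetic: an N-nucleotide random region has 4^N possible sequences, so a pool of fixed size covers a rapidly shrinking fraction as N grows. The Python sketch below makes this concrete. The pool size of 10^14 molecules is an assumed, illustrative figure and does not come from the abstract.

```python
def sequence_space(n):
    """Number of distinct DNA sequences of length n."""
    return 4 ** n


def max_coverage(n, pool_size):
    """Upper bound on the fraction of sequence space a pool of `pool_size` molecules can contain."""
    return min(1.0, pool_size / sequence_space(n))


POOL_SIZE = 1e14  # assumed number of molecules, for illustration only
for n in (20, 30, 40, 50, 60):
    print(f"N{n}: {sequence_space(n):.3e} sequences, coverage <= {max_coverage(n, POOL_SIZE):.3e}")
```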

  13. Novel full-length major histocompatibility complex class I allele discovery and haplotype definition in pig-tailed macaques.

    PubMed

    Semler, Matthew R; Wiseman, Roger W; Karl, Julie A; Graham, Michael E; Gieger, Samantha M; O'Connor, David H

    2018-06-01

    Pig-tailed macaques (Macaca nemestrina, Mane) are important models for human immunodeficiency virus (HIV) studies. Their infectability with minimally modified HIV makes them a uniquely valuable animal model to mimic human infection with HIV and progression to acquired immunodeficiency syndrome (AIDS). However, variation in the pig-tailed macaque major histocompatibility complex (MHC) and the impact of individual transcripts on the pathogenesis of HIV and other infectious diseases is understudied compared to that of rhesus and cynomolgus macaques. In this study, we used Pacific Biosciences single-molecule real-time circular consensus sequencing to describe full-length MHC class I (MHC-I) transcripts for 194 pig-tailed macaques from three breeding centers. We then used the full-length sequences to infer Mane-A and Mane-B haplotypes containing groups of MHC-I transcripts that co-segregate due to physical linkage. In total, we characterized full-length open reading frames (ORFs) for 313 Mane-A, Mane-B, and Mane-I sequences that defined 86 Mane-A and 106 Mane-B MHC-I haplotypes. Pacific Biosciences technology allows us to resolve these Mane-A and Mane-B haplotypes to the level of synonymous allelic variants. The newly defined haplotypes and transcript sequences containing full-length ORFs provide an important resource for infectious disease researchers as certain MHC haplotypes have been shown to provide exceptional control of simian immunodeficiency virus (SIV) replication and prevention of AIDS-like disease in nonhuman primates. The increased allelic resolution provided by Pacific Biosciences sequencing also benefits transplant research by allowing researchers to more specifically match haplotypes between donors and recipients to the level of nonsynonymous allelic variation, thus reducing the risk of graft-versus-host disease.

  14. Designing robust watermark barcodes for multiplex long-read sequencing.

    PubMed

    Ezpeleta, Joaquín; Krsticevic, Flavia J; Bulacio, Pilar; Tapia, Elizabeth

    2017-03-15

    To attain acceptable sample misassignment rates, current approaches to multiplex single-molecule real-time sequencing require upstream quality improvement, which is obtained from multiple passes over the sequenced insert and significantly reduces the effective read length. In order to fully exploit the raw read length on multiplex applications, robust barcodes capable of dealing with the full single-pass error rates are needed. We present a method for designing sequencing barcodes that can withstand a large number of insertion, deletion and substitution errors and are suitable for use in multiplex single-molecule real-time sequencing. The manuscript focuses on the design of barcodes for full-length single-pass reads, impaired by challenging error rates in the order of 11%. The proposed barcodes can multiplex hundreds or thousands of samples while achieving sample misassignment probabilities as low as 10^-7 under the above conditions, and are designed to be compatible with chemical constraints imposed by the sequencing process. Software tools for constructing watermark barcode sets and demultiplexing barcoded reads, together with example sets of barcodes and synthetic barcoded reads, are freely available at www.cifasis-conicet.gov.ar/ezpeleta/NS-watermark. Contact: ezpeleta@cifasis-conicet.gov.ar. © The Author 2016. Published by Oxford University Press.

  15. Prediction of enhancer-promoter interactions via natural language processing.

    PubMed

    Zeng, Wanwen; Wu, Mengmeng; Jiang, Rui

    2018-05-09

    Precise identification of three-dimensional genome organization, especially enhancer-promoter interactions (EPIs), is important to deciphering gene regulation, cell differentiation and disease mechanisms. Currently, it is a challenging task to distinguish true interactions from other nearby non-interacting ones since the power of traditional experimental methods is limited due to low resolution or low throughput. We propose a novel computational framework EP2vec to assay three-dimensional genomic interactions. We first extract sequence embedding features, defined as fixed-length vector representations learned from variable-length sequences using an unsupervised deep learning method in natural language processing. Then, we train a classifier to predict EPIs using the learned representations in a supervised way. Experimental results demonstrate that EP2vec obtains F1 scores ranging from 0.841 to 0.933 on different datasets, which outperforms existing methods. We prove the robustness of sequence embedding features by carrying out sensitivity analysis. Besides, we identify motifs that represent cell line-specific information through analysis of the learned sequence embedding features by adopting an attention mechanism. Last, we show that even superior performance, with F1 scores of 0.889 to 0.940, can be achieved by combining sequence embedding features and experimental features. EP2vec sheds light on feature extraction for DNA sequences of arbitrary lengths and provides a powerful approach for EPI identification.

  16. UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences.

    PubMed

    Du, Pu-Feng; Zhao, Wei; Miao, Yang-Yang; Wei, Le-Yi; Wang, Likun

    2017-11-14

    With the avalanche of biological sequences in public databases, one of the most challenging problems in computational biology is to predict their biological functions and cellular attributes. Most of the existing prediction algorithms can only handle fixed-length numerical vectors. Therefore, it is important to be able to represent biological sequences with various lengths using fixed-length numerical vectors. Although several algorithms, as well as software implementations, have been developed to address this problem, these existing programs can only provide a fixed number of representation modes. Every time a new sequence representation mode is developed, a new program will be needed. In this paper, we propose the UltraPse as a universal software platform for this problem. The function of the UltraPse is not only to generate various existing sequence representation modes, but also to simplify all future programming works in developing novel representation modes. The extensibility of UltraPse is particularly enhanced. It allows the users to define their own representation mode, their own physicochemical properties, or even their own types of biological sequences. Moreover, UltraPse is also the fastest software of its kind. The source code package, as well as the executables for both Linux and Windows platforms, can be downloaded from the GitHub repository.
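
    As a concrete instance of the problem this abstract describes, k-mer composition is one of the simplest ways to turn a sequence of any length into a fixed-length numerical vector. The Python sketch below is illustrative only. It does not use or reproduce the UltraPse interface, and the function name and defaults are assumptions.

```python
from itertools import product


def kmer_composition(seq, k=2, alphabet="ACGT"):
    """Relative frequency of every k-mer over `alphabet`, in a fixed order."""
    kmers = ["".join(letters) for letters in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows with ambiguous symbols such as N
            counts[word] += 1
            total += 1
    return [counts[word] / total if total else 0.0 for word in kmers]


# Sequences of different lengths map to vectors of the same length (4**k entries).
print(len(kmer_composition("ACGTACGTAC")), len(kmer_composition("ACGTTGCAACGTTTGACC")))
```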

  17. Characterizing DNA preservation in degraded specimens of Amara alpina (Carabidae: Coleoptera).

    PubMed

    Heintzman, Peter D; Elias, Scott A; Moore, Karen; Paszkiewicz, Konrad; Barnes, Ian

    2014-05-01

    DNA preserved in degraded beetle (Coleoptera) specimens, including those derived from dry-stored museum and ancient permafrost-preserved environments, could provide a valuable resource for researchers interested in species and population histories over timescales from decades to millennia. However, the potential of these samples as genetic resources is currently unassessed. Here, using Sanger and Illumina shotgun sequence data, we explored DNA preservation in specimens of the ground beetle Amara alpina, from both museum and ancient environments. Nearly all museum specimens had amplifiable DNA, with the maximum amplifiable fragment length decreasing with age. Amplification of DNA was only possible in 45% of ancient specimens. Preserved mitochondrial DNA fragments were significantly longer than those of nuclear DNA in both museum and ancient specimens. Metagenomic characterization of extracted DNA demonstrated that parasite-derived sequences, including Wolbachia and Spiroplasma, are recoverable from museum beetle specimens. Ancient DNA extracts contained beetle DNA in amounts comparable to museum specimens. Overall, our data demonstrate that there is great potential for both museum and ancient specimens of beetles in future genetic studies, and we see no reason why this would not be the case for other orders of insect. © 2013 John Wiley & Sons Ltd.

  18. Complete Mitochondrial Genome of the Red Fox (Vulpes vulpes) and Phylogenetic Analysis with Other Canid Species.

    PubMed

    Zhong, Hua-Ming; Zhang, Hong-Hai; Sha, Wei-Lai; Zhang, Cheng-De; Chen, Yu-Cai

    2010-04-01

    The whole mitochondrial genome sequence of the red fox (Vulpes vulpes) was determined. It had a total length of 16 723 bp. As in most mammalian mitochondrial genomes, it contained 13 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes and one control region. The base composition was 31.3% A, 26.1% C, 14.8% G and 27.8% T. The codon usage of red fox, arctic fox, gray wolf, domestic dog and coyote followed the same pattern except for an unusual ATT start codon, which initiates the NADH dehydrogenase subunit 3 gene in the red fox. A long tandem repeat rich in AC was found between conserved sequence blocks 1 and 2 in the control region. To confirm the phylogenetic relationships of the red fox to other canids, phylogenetic trees were reconstructed by neighbor-joining and maximum parsimony methods using 12 concatenated heavy-strand protein-coding genes. The results indicated that the arctic fox was the sister group of the red fox and that both belong to the red fox-like clade in the family Canidae, while gray wolf, domestic dog and coyote belong to the wolf-like clade. This was in accordance with existing phylogenetic results.

  19. Biomechanical optimization of implant diameter and length for immediate loading: a nonlinear finite element analysis.

    PubMed

    Kong, Liang; Gu, Zexu; Li, Tao; Wu, Junjie; Hu, Kaijin; Liu, Yanpu; Zhou, Hongzhi; Liu, Baolin

    2009-01-01

    A nonlinear finite element method was applied to examine the effects of implant diameter and length on the maximum von Mises stresses in the jaw, and to evaluate the maximum displacement of the implant-abutment complex in immediate-loading models. The implant diameter (D) ranged from 3.0 to 5.0 mm and implant length (L) ranged from 6.0 to 16.0 mm. The results showed that the maximum von Mises stress in cortical bone was decreased by 65.8% under a buccolingual load with an increase in D. In cancellous bone, it was decreased by 71.5% under an axial load with an increase in L. The maximum displacement in the implant-abutment complex decreased by 64.8% under a buccolingual load with an increase in D. The implant was found to be more sensitive to L than to D under axial loads, while D played a more important role in enhancing its stability under buccolingual loads. When D exceeded 4.0 mm and L exceeded 11.0 mm, both minimum stress and displacement were obtained. Therefore, these dimensions were the optimal biomechanical selections for immediate-loading implants in type B/2 bone.

  20. Application of viromics: a new approach to the understanding of viral infections in humans.

    PubMed

    Ramamurthy, Mageshbabu; Sankar, Sathish; Kannangai, Rajesh; Nandagopal, Balaji; Sridharan, Gopalan

    2017-12-01

    This review is focused at exploring the strengths of modern technology driven data compiled in the areas of virus gene sequencing, virus protein structures and their implication to viral diagnosis and therapy. The information for virome analysis (viromics) is generated by the study of viral genomes (entire nucleotide sequence) and viral genes (coding for protein). Presently, the study of viral infectious diseases in terms of etiopathogenesis and development of newer therapeutics is undergoing rapid changes. Currently, viromics relies on deep sequencing, next generation sequencing (NGS) data and public domain databases like GenBank and unique virus specific databases. Two commonly used NGS platforms: Illumina and Ion Torrent, recommend maximum fragment lengths of about 300 and 400 nucleotides for analysis respectively. Direct detection of viruses in clinical samples is now evolving using these methods. Presently, there are a considerable number of good treatment options for HBV/HIV/HCV. These viruses however show development of drug resistance. The drug susceptibility regions of the genomes are sequenced and the prediction of drug resistance is now possible from 3 public domains available on the web. This has been made possible through advances in the technology with the advent of high throughput sequencing and meta-analysis through sophisticated and easy to use software and the use of high speed computers for bioinformatics. More recently NGS technology has been improved with single-molecule real-time sequencing. Here complete long reads can be obtained with less error overcoming a limitation of the NGS which is inherently prone to software anomalies that arise in the hands of personnel without adequate training. The development in understanding the viruses in terms of their genome, pathobiology, transcriptomics and molecular epidemiology constitutes viromics. 
It could be stated that these developments will bring about radical changes and advancement especially in the field of antiviral therapy and diagnostic virology.

  1. Genome Sequencing and Analysis of Geographically Diverse Clinical Isolates of Herpes Simplex Virus 2

    PubMed Central

    Lamers, Susanna L.; Weiner, Brian; Ray, Stuart C.; Colgrove, Robert C.; Diaz, Fernando; Jing, Lichen; Wang, Kening; Saif, Sakina; Young, Sarah; Henn, Matthew; Laeyendecker, Oliver; Tobian, Aaron A. R.; Cohen, Jeffrey I.; Koelle, David M.; Quinn, Thomas C.; Knipe, David M.

    2015-01-01

    ABSTRACT Herpes simplex virus 2 (HSV-2), the principal causative agent of recurrent genital herpes, is a highly prevalent viral infection worldwide. Limited information is available on the amount of genomic DNA variation between HSV-2 strains because only two genomes have been determined, the HG52 laboratory strain and the newly sequenced SD90e low-passage-number clinical isolate strain, each from a different geographical area. In this study, we report the nearly complete genome sequences of 34 HSV-2 low-passage-number and laboratory strains, 14 of which were collected in Uganda, 1 in South Africa, 11 in the United States, and 8 in Japan. Our analyses of these genomes demonstrated remarkable sequence conservation, regardless of geographic origin, with the maximum nucleotide divergence between strains being 0.4% across the genome. In contrast, prior studies indicated that HSV-1 genomes exhibit more sequence diversity, as well as geographical clustering. Additionally, unlike HSV-1, little viral recombination between HSV-2 strains could be substantiated. These results are interpreted in light of HSV-2 evolution, epidemiology, and pathogenesis. Finally, the newly generated sequences more closely resemble the low-passage-number SD90e than HG52, supporting the use of the former as the new reference genome of HSV-2. IMPORTANCE Herpes simplex virus 2 (HSV-2) is a causative agent of genital and neonatal herpes. Therefore, knowledge of its DNA genome and genetic variability is central to preventing and treating genital herpes. However, only two full-length HSV-2 genomes have been reported. In this study, we sequenced 34 additional HSV-2 low-passage-number and laboratory viral genomes and initiated analysis of the genetic diversity of HSV-2 strains from around the world. The analysis of these genomes will facilitate research aimed at vaccine development, diagnosis, and the evaluation of clinical manifestations and transmission of HSV-2. 
This information will also contribute to our understanding of HSV evolution. PMID:26018166

  2. lakemorpho: Calculating lake morphometry metrics in R.

    PubMed

    Hollister, Jeffrey; Stachelek, Joseph

    2017-01-01

    Metrics describing the shape and size of lakes, known as lake morphometry metrics, are important for any limnological study. In cases where a lake has long been the subject of study these data are often already collected and are openly available. Many other lakes have these data collected, but access is challenging as it is often stored on individual computers (or worse, in filing cabinets) and is available only to the primary investigators. The vast majority of lakes fall into a third category in which the data are not available. This makes broad scale modelling of lake ecology a challenge as some of the key information about in-lake processes is unavailable. While this valuable in situ information may be difficult to obtain, several national datasets exist that may be used to model and estimate lake morphometry. In particular, digital elevation models and hydrography have been shown to be predictive of several lake morphometry metrics. The R package lakemorpho has been developed to utilize these data and estimate the following morphometry metrics: surface area, shoreline length, major axis length, minor axis length, major and minor axis length ratio, shoreline development, maximum depth, mean depth, volume, maximum lake length, mean lake width, maximum lake width, and fetch. In this software tool article we describe the motivation behind developing lakemorpho, discuss the implementation in R, and describe the use of lakemorpho with an example of a typical use case.
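
    One of the listed metrics, shoreline development, is commonly defined as the ratio of the shoreline length to the circumference of a circle with the same area as the lake. The sketch below illustrates that textbook definition only; it is written in Python rather than R, the function name is hypothetical, and it does not reproduce the lakemorpho API.

```python
import math


def shoreline_development(shoreline_length, surface_area):
    """Ratio of shoreline length to the circumference of an equal-area circle.

    Hypothetical helper illustrating the common textbook definition; units
    must be consistent (e.g. metres and square metres).
    """
    if surface_area <= 0:
        raise ValueError("surface_area must be positive")
    return shoreline_length / (2.0 * math.sqrt(math.pi * surface_area))


# A perfectly circular lake of radius 100 m has the minimum possible value.
circle = shoreline_development(2.0 * math.pi * 100.0, math.pi * 100.0 ** 2)

# A square lake with 1 km sides has a longer shoreline for its area.
square = shoreline_development(4000.0, 1000.0 ** 2)
```

    Under this definition a circular lake scores 1, and more convoluted shorelines score higher.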

  3. Tandem alternative polyadenylation events of genes in non-eosinophilic nasal polyp tissue identified by high-throughput sequencing analysis

    PubMed Central

    TIAN, PENG; LI, JIE; LIU, XIANG; LI, YUXI; CHEN, MEIHENG; MA, YUN; ZHENG, YI QING; FU, YONGGUI; ZOU, HUA

    2014-01-01

    Nasal polyps (NP) are highly associated with disorders of immune cells. Alternative polyadenylation (APA) produces mRNA isoforms with different lengths of 3′-untranslated region (UTR) and regulates gene expression. It has been proven that this APA-mediated regulation of 3′UTR length is an immune-associated phenomenon. The aim of this study was to investigate the genome-wide alternative tandem 3′UTR length switching events in non-eosinophilic nasal polyp tissue. Thirteen patients diagnosed as having non-eosinophilic nasal polyps were included in this study. Nasal polyp tissue and control mucosa were collected during surgery. The 3′ end library of cDNA was constructed. The recovered libraries were sequenced with second-generation sequencing technology, and the sequencing data were analyzed by an in-house bioinformatics pipeline. Tandem 3′UTR length switching between samples was detected by a test of linear trend alternative to independence. We found a significant alteration in the tandem 3′UTR length in 1,920 genes in nasal polyp samples. Functional annotation results showed that several gene ontology (GO) terms were enriched in the list of genes with switched APA sites, including regulation of transcription, macromolecule catabolic localization and mRNA processing. The results suggested that APA-mediated alternative 3′UTR regulation plays an important role in the post-transcriptional regulation of gene expression in non-eosinophilic nasal polyps. PMID:24715051

  4. An improved model for whole genome phylogenetic analysis by Fourier transform.

    PubMed

    Yin, Changchuan; Yau, Stephen S-T

    2015-10-07

    DNA sequence similarity comparison is one of the major steps in computational phylogenetic studies. The sequence comparison of closely related DNA sequences and genomes is usually performed by multiple sequence alignments (MSA). While the MSA method is accurate for some types of sequences, it may produce incorrect results when DNA sequences have undergone rearrangements, as in many bacterial and viral genomes. It is also limited by its computational complexity for comparing large volumes of data. Previously, we proposed an alignment-free method that exploits the full information contents of DNA sequences by Discrete Fourier Transform (DFT), but still with some limitations. Here, we present a significantly improved method for the similarity comparison of DNA sequences by DFT. In this method, we map DNA sequences into 2-dimensional (2D) numerical sequences and then apply DFT to transform the 2D numerical sequences into the frequency domain. In the 2D mapping, the nucleotide composition of a DNA sequence is a determinant factor and the 2D mapping reduces the nucleotide composition bias in the distance measure, thus improving the similarity measure of DNA sequences. To compare the DFT power spectra of DNA sequences with different lengths, we propose an improved even scaling algorithm to extend shorter DFT power spectra to the longest length of the underlying sequences. After the DFT power spectra are evenly scaled, the spectra are in the same dimensionality of the Fourier frequency space, then the Euclidean distances of full Fourier power spectra of the DNA sequences are used as the dissimilarity metrics. The improved DFT method, with increased computational performance by 2D numerical representation, can be applicable to any DNA sequences of different length ranges. We assess the accuracy of the improved DFT similarity measure in hierarchical clustering of different DNA sequences including simulated and real datasets. 
The method yields accurate and reliable phylogenetic trees and demonstrates that the improved DFT dissimilarity measure is an efficient and effective similarity measure of DNA sequences. Due to its high efficiency and accuracy, the proposed DFT similarity measure is successfully applied on phylogenetic analysis for individual genes and large whole bacterial genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.
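
    The pipeline described in this abstract (2D numerical mapping, DFT power spectrum, scaling to a common length, Euclidean distance) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the nucleotide-to-plane mapping is a hypothetical choice, plain linear interpolation stands in for the paper's even scaling algorithm, and a naive O(n^2) DFT is used to keep the example free of dependencies.

```python
import cmath
import math

# Hypothetical 2D mapping (an assumption, not taken from the abstract):
# each nucleotide becomes a point in the plane, stored as a complex number.
MAPPING = {
    "A": complex(0, -1),
    "T": complex(0, 1),
    "C": complex(-1, 0),
    "G": complex(1, 0),
}


def power_spectrum(seq):
    """Naive O(n^2) DFT power spectrum of the 2D-mapped sequence."""
    x = [MAPPING[base] for base in seq.upper()]
    n = len(x)
    spectrum = []
    for k in range(n):
        coeff = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def scale_spectrum(spectrum, length):
    """Stretch a spectrum to `length` points by linear interpolation.

    Stand-in for the paper's even scaling step, whose exact form is not
    given in the abstract.
    """
    n = len(spectrum)
    if n == length:
        return list(spectrum)
    if n == 1:
        return [spectrum[0]] * length
    scaled = []
    for k in range(length):
        pos = k * (n - 1) / (length - 1)
        low = int(math.floor(pos))
        high = min(low + 1, n - 1)
        scaled.append(spectrum[low] + (pos - low) * (spectrum[high] - spectrum[low]))
    return scaled


def dft_distance(seq_a, seq_b):
    """Euclidean distance between power spectra scaled to the longer length."""
    pa, pb = power_spectrum(seq_a), power_spectrum(seq_b)
    n = max(len(pa), len(pb))
    sa, sb = scale_spectrum(pa, n), scale_spectrum(pb, n)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(sa, sb)))
```

    A pairwise matrix of `dft_distance` values could then be passed to any standard hierarchical clustering routine; for genome-scale input an FFT would replace the naive transform.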

  5. Draft Genome Sequence of Pseudomonas sp. Strain LFM046, a Producer of Medium-Chain-Length Polyhydroxyalkanoate.

    PubMed

    Cardinali-Rezende, Juliana; Alexandrino, Paulo Moises Raduan; Nahat, Rafael Augusto Theodoro Pereira de Souza; Sant'Ana, Débora Parrine Vieira; Silva, Luiziana Ferreira; Gomez, José Gregório Cabrera; Taciro, Marilda Keico

    2015-08-20

    Pseudomonas sp. LFM046 is a medium-chain-length polyhydroxyalkanoate (PHAMCL) producer capable of using various carbon sources (carbohydrates, organic acids, and vegetable oils) and was first isolated from sugarcane cultivation soil in Brazil. The genome sequence was found to be 5.97 Mb long with a G+C content of 66%. Copyright © 2015 Cardinali-Rezende et al.

  6. Identification of gyrB and rpoB gene mutations and differentially expressed proteins between a novobiocin-resistant Aeromonas hydrophila catfish vaccine strain and its virulent parent strain

    USDA-ARS's Scientific Manuscript database

    Sequence comparison between the full-length 2412 bp DNA gyrase subunit B (gyrB) gene of a novobiocin resistant Aeromonas hydrophila AH11NOVO vaccine strain and that of its virulent parent strain AH11P revealed 10 missense mutations. Similarly, sequence comparison between the full-length 4092 bp RNA ...

  7. Turbulent Mixing in Exponential Transverse Jets

    DTIC Science & Technology

    1990-09-30

    parameter. The flame length of the jets is a direct measurement of the molecular scale mixing rate. ACCOMPLISHMENTS From observations of the trajectory...and cross-sectional size of the vortices, as well as the flame length, our measurements reveal the following: i) Under acceleration, the roll up and... flame lengths are a weak maximum when the acceleration parameter α is about unity. For large α, flame lengths slowly decline with increasing α, in

  8. Long length cuttings from no. 2 common hardwood lumber

    Treesearch

    Edwin L. Lucas

    1973-01-01

    Long length cuttings (up to 60 inches) are obtainable in abundance from No. 2 Common oak lumber. Cutting for the maximum area of clear one face (C1F) parts 18 to 60 inches in length, we found that 46 percent of all the cuttings were 36 inches long or longer. The recovery of the long length cuttings did not reduce the overall yield of parts produced from the lumber....

  9. The Repeat Sequences and Elevated Substitution Rates of the Chloroplast accD Gene in Cupressophytes

    PubMed Central

    Li, Jia; Su, Yingjuan; Wang, Ting

    2018-01-01

    The plastid accD gene encodes a subunit of the acetyl-CoA carboxylase (ACCase) enzyme. The accD gene is thought to have expanded in length in Cryptomeria japonica, Taiwania cryptomerioides, Cephalotaxus, Taxus chinensis, and Podocarpus lambertii, mainly because of tandemly repeated sequences. However, it is still unknown whether the accD gene length in other cupressophytes has expanded. Here, to investigate how widespread this phenomenon is, 18 cupressophyte accD sequences and their surrounding regions were sequenced and analyzed. Together with 39 GenBank sequence data, our taxon sampling covered all the extant gymnosperm orders. The repetitive elements and substitution rates of accD among 57 gymnosperm species were analyzed. The results show that (1) the reading frame length of the accD gene in 18 cupressophyte species has also expanded; (2) many repetitive elements were identified in the accD gene of cupressophyte lineages; (3) the synonymous and non-synonymous substitution rates of accD were accelerated in cupressophytes; and (4) accD was located at rearrangement endpoints. These results suggest that repetitive elements may mediate chloroplast genome rearrangement and accelerate the substitution rates. PMID:29731764

  10. Large-scale identification and characterization of alternative splicing variants of human gene transcripts using 56 419 completely sequenced and manually annotated full-length cDNAs

    PubMed Central

    Takeda, Jun-ichi; Suzuki, Yutaka; Nakao, Mitsuteru; Barrero, Roberto A.; Koyanagi, Kanako O.; Jin, Lihua; Motono, Chie; Hata, Hiroko; Isogai, Takao; Nagai, Keiichi; Otsuki, Tetsuji; Kuryshev, Vladimir; Shionyu, Masafumi; Yura, Kei; Go, Mitiko; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Wiemann, Stefan; Nomura, Nobuo; Sugano, Sumio; Gojobori, Takashi; Imanishi, Tadashi

    2006-01-01

    We report the first genome-wide identification and characterization of alternative splicing in human gene transcripts based on analysis of the full-length cDNAs. Applying both manual and computational analyses for 56 419 completely sequenced and precisely annotated full-length cDNAs selected for the H-Invitational human transcriptome annotation meetings, we identified 6877 alternative splicing genes with 18 297 different alternative splicing variants. A total of 37 670 exons were involved in these alternative splicing events. The encoded protein sequences were affected in 6005 of the 6877 genes. Notably, alternative splicing affected protein motifs in 3015 genes, subcellular localizations in 2982 genes and transmembrane domains in 1348 genes. We also identified interesting patterns of alternative splicing, in which two distinct genes seemed to be bridged, nested or having overlapping protein coding sequences (CDSs) of different reading frames (multiple CDS). In these cases, completely unrelated proteins are encoded by a single locus. Genome-wide annotations of alternative splicing, relying on full-length cDNAs, should lay firm groundwork for exploring in detail the diversification of protein function, which is mediated by the fast expanding universe of alternative splicing variants. PMID:16914452

  11. Determination of Shapes of Boattail Bodies of Revolution for Minimum Wave Drag

    NASA Technical Reports Server (NTRS)

    Adams, Mac C.

    1951-01-01

    By use of an approximate equation for the wave drag of slender bodies of revolution in a supersonic flow field, the optimum shapes of certain boattail bodies are determined for minimum wave drag. The properties of three specific families of bodies are determined, the first family consisting of bodies having a given length and base area and a contour passing through a prescribed point between the nose and base, the second family having fixed length, base area, and maximum area, and the third family having given length, volume, and base area. The method presented is easily generalized to determine minimum-wave-drag profile shapes which have contours that must pass through any prescribed number of points. According to linearized theory, the optimum profiles are found to have infinite slope at the nose but zero radius of curvature so that the bodies appear to have pointed noses, a zero slope at the body base, and no variation of wave drag with Mach number. For those bodies having a specified intermediate diameter (that is, location and magnitude given), the maximum body diameter is shown to be larger, in general, than the specified diameter. It is also shown that, for bodies having a specified maximum diameter, the location of the maximum diameter is not arbitrary but is determined from the ratio of base diameter to maximum diameter.

  12. Assessment of sex in a modern Turkish population using cranial anthropometric parameters.

    PubMed

    Ekizoglu, Oguzhan; Hocaoglu, Elif; Inci, Ercan; Can, Ismail Ozgur; Solmaz, Dilek; Aksoy, Sema; Buran, Cudi Ferat; Sayin, Ibrahim

    2016-07-01

    The utilization of radiological imaging methods in anthropometric studies is being expanded by the application of modern imaging methods, leading to a decrease in costs, a decrease in the time required for analysis and the ability to create three-dimensional images. This retrospective study investigated 400 patients within the 18-45-year age group (mean age: 30.7±11.2 years) using cranial computed tomography images. We measured 14 cranial anthropometric parameters (basion-bregma height, basion-prosthion length, maximum cranial length and cranial base lengths, maximum cranial breadth, bizygomatic diameter, upper facial breadth, bimastoid diameter, orbital breadth, orbital length, biorbital breadth, interorbital breadth, foramen magnum breadth and foramen magnum length). The intra- and inter-observer repeatability and consistency were good. From the results of logistic regression analysis using morphometric measurements, the most conspicuous measurements in terms of dimorphism were maximum cranial length, bizygomatic diameter, basion-bregma height, and cranial base length. The most dimorphic structure was the bizygomatic diameter with an accuracy rate of 83% in females and 77% in males. In this study, 87.5% of females and 87.0% of males were classified accurately by this model including four parameters with a sensitivity of 91.5% and specificity of 85.0%. In conclusion, CT cranial morphometric analysis may be reliable for the assessment of sex in the Turkish population and is recommended for comparison of data of modern populations with those of former populations. Additionally, cranial morphometric data that we obtained from the modern Turkish population may reveal population-specific data, which may help current criminal investigations and identification of disaster victims. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  13. Accurate typing of short tandem repeats from genome-wide sequencing data and its applications.

    PubMed

    Fungtammasan, Arkarachai; Ananda, Guruprasad; Hile, Suzanne E; Su, Marcia Shu-Wei; Sun, Chen; Harris, Robert; Medvedev, Paul; Eckert, Kristin; Makova, Kateryna D

    2015-05-01

    Short tandem repeats (STRs) are implicated in dozens of human genetic diseases and contribute significantly to genome variation and instability. Yet profiling STRs from short-read sequencing data is challenging because of their high sequencing error rates. Here, we developed STR-FM, short tandem repeat profiling using flank-based mapping, a computational pipeline that can detect the full spectrum of STR alleles from short-read data, can adapt to emerging read-mapping algorithms, and can be applied to heterogeneous genetic samples (e.g., tumors, viruses, and genomes of organelles). We used STR-FM to study STR error rates and patterns in publicly available human and in-house generated ultradeep plasmid sequencing data sets. We discovered that STRs sequenced with a PCR-free protocol have up to ninefold fewer errors than those sequenced with a PCR-containing protocol. We constructed an error correction model for genotyping STRs that can distinguish heterozygous alleles containing STRs with consecutive repeat numbers. Applying our model and pipeline to Illumina sequencing data with 100-bp reads, we could confidently genotype several disease-related long trinucleotide STRs. Utilizing this pipeline, for the first time we determined the genome-wide STR germline mutation rate from a deeply sequenced human pedigree. Additionally, we built a tool that recommends minimal sequencing depth for accurate STR genotyping, depending on repeat length and sequencing read length. The required read depth increases with STR length and is lower for a PCR-free protocol. This suite of tools addresses the pressing challenges surrounding STR genotyping, and thus is of wide interest to researchers investigating disease-related STRs and STR evolution. © 2015 Fungtammasan et al.; Published by Cold Spring Harbor Laboratory Press.

  14. Complete mitochondrial genome sequences of the northern spotted owl (Strix occidentalis caurina) and the barred owl (Strix varia; Aves: Strigiformes: Strigidae) confirm the presence of a duplicated control region

    PubMed Central

    Henderson, James B.; Sellas, Anna B.; Fuchs, Jérôme; Bowie, Rauri C.K.; Dumbacher, John P.

    2017-01-01

    We report here the successful assembly of the complete mitochondrial genomes of the northern spotted owl (Strix occidentalis caurina) and the barred owl (S. varia). We utilized sequence data from two sequencing methodologies, Illumina paired-end sequence data with insert lengths ranging from approximately 250 nucleotides (nt) to 9,600 nt and read lengths from 100–375 nt and Sanger-derived sequences. We employed multiple assemblers and alignment methods to generate the final assemblies. The circular genomes of S. o. caurina and S. varia are comprised of 19,948 nt and 18,975 nt, respectively. Both code for two rRNAs, twenty-two tRNAs, and thirteen polypeptides. They both have duplicated control region sequences with complex repeat structures. We were not able to assemble the control regions solely using Illumina paired-end sequence data. By fully spanning the control regions, Sanger-derived sequences enabled accurate and complete assembly of these mitochondrial genomes. These are the first complete mitochondrial genome sequences of owls (Aves: Strigiformes) possessing duplicated control regions. We searched the nuclear genome of S. o. caurina for copies of mitochondrial genes and found at least nine separate stretches of nuclear copies of gene sequences originating in the mitochondrial genome (Numts). The Numts ranged from 226–19,522 nt in length and included copies of all mitochondrial genes except tRNAPro, ND6, and tRNAGlu. Strix occidentalis caurina and S. varia exhibited an average of 10.74% (8.68% uncorrected p-distance) divergence across the non-tRNA mitochondrial genes. PMID:29038757

  15. Conservation of Shannon's redundancy for proteins. [information theory applied to amino acid sequences

    NASA Technical Reports Server (NTRS)

    Gatlin, L. L.

    1974-01-01

    Concepts of information theory are applied to examine various proteins in terms of their redundancy in natural originators such as animals and plants. The Monte Carlo method is used to derive information parameters for random protein sequences. Real protein sequence parameters are compared with the standard parameters of protein sequences having a specific length. The tendency of a chain to contain some amino acids more frequently than others and the tendency of a chain to contain certain amino acid pairs more frequently than other pairs are used as randomness measures of individual protein sequences. Non-periodic proteins are generally found to have random Shannon redundancies except in cases of constraints due to short chain length and genetic codes. Redundant characteristics of highly periodic proteins are discussed. A degree of periodicity parameter is derived.

  16. One chromosome, one contig: complete microbial genomes from long-read sequencing and assembly.

    PubMed

    Koren, Sergey; Phillippy, Adam M

    2015-02-01

    Like a jigsaw puzzle with large pieces, a genome sequenced with long reads is easier to assemble. However, recent sequencing technologies have favored lowering per-base cost at the expense of read length. This has dramatically reduced sequencing cost, but resulted in fragmented assemblies, which negatively affect downstream analyses and hinder the creation of finished (gapless, high-quality) genomes. In contrast, emerging long-read sequencing technologies can now produce reads tens of kilobases in length, enabling the automated finishing of microbial genomes for under $1000. This promises to improve the quality of reference databases and facilitate new studies of chromosomal structure and variation. We present an overview of these new technologies and the methods used to assemble long reads into complete genomes. Copyright © 2014 The Authors. Published by Elsevier Ltd.. All rights reserved.

  17. Understanding Adherence and Prescription Patterns Using Large-Scale Claims Data.

    PubMed

    Bjarnadóttir, Margrét V; Malik, Sana; Onukwugha, Eberechukwu; Gooden, Tanisha; Plaisant, Catherine

    2016-02-01

    Advanced computing capabilities and novel visual analytics tools now allow us to move beyond the traditional cross-sectional summaries to analyze longitudinal prescription patterns and the impact of study design decisions. For example, design decisions regarding gaps and overlaps in prescription fill data are necessary for measuring adherence using prescription claims data. However, little is known regarding the impact of these decisions on measures of medication possession (e.g., medication possession ratio). The goal of the study was to demonstrate the use of visualization tools for pattern discovery, hypothesis generation, and study design. We utilized EventFlow, a novel discrete event sequence visualization software, to investigate patterns of prescription fills, including gaps and overlaps, utilizing large-scale healthcare claims data. The study analyzes data of individuals who had at least two prescriptions for one of five hypertension medication classes: ACE inhibitors, angiotensin II receptor blockers, beta blockers, calcium channel blockers, and diuretics. We focused on those members initiating therapy with diuretics (19.2%) who may have concurrently or subsequently taken drugs in other classes as well. We identified longitudinal patterns in prescription fills for antihypertensive medications, investigated the implications of decisions regarding gap length and overlaps, and examined the impact on the average cost and adherence of the initial treatment episode. A total of 790,609 individuals are included in the study sample, 19.2% (N = 151,566) of whom started on diuretics first during the study period. The average age was 52.4 years and 53.1% of the population was female. When the allowable gap was zero, 34% of the population had continuous coverage and the average length of continuous coverage was 2 months. 
In contrast, when the allowable gap was 30 days, 69% of the population showed a single continuous prescription period with an average length of 5 months. The average prescription cost of the period of continuous coverage ranged from US$3.44 (when the maximum gap was 0 day) to US$9.08 (when the maximum gap was 30 days). Results were less impactful when considering overlaps. This proof-of-concept study illustrates the use of visual analytics tools in characterizing longitudinal medication possession. We find that prescription patterns and associated prescription costs are more influenced by allowable gap lengths than by definitions and treatment of overlap. Research using medication gaps and overlaps to define medication possession in prescription claims data should pay particular attention to the definition and use of gap lengths.
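
    The effect of the allowable gap on the length of the initial continuous coverage episode can be illustrated with a short sketch. It assumes each fill is a (fill date, days of supply) pair, that a gap is measured from the end of the current supply to the next fill, and that overlapping supply is not carried forward; the function name and the sample fills are hypothetical and are not taken from the study or from EventFlow.

```python
from datetime import date, timedelta


def initial_episode_days(fills, allowable_gap_days):
    """Length in days of the first continuous coverage episode.

    `fills` is a list of (fill_date, days_supply) pairs. The episode ends at
    the first gap longer than `allowable_gap_days`. Overlapping supply is
    ignored rather than carried forward (an assumption of this sketch).
    """
    ordered = sorted(fills)
    if not ordered:
        return 0
    start, first_supply = ordered[0]
    covered_until = start + timedelta(days=first_supply)
    for fill_date, days_supply in ordered[1:]:
        gap = (fill_date - covered_until).days
        if gap > allowable_gap_days:
            break
        covered_until = max(covered_until, fill_date + timedelta(days=days_supply))
    return (covered_until - start).days


# Synthetic example: three 30-day fills with a 10-day gap after the first.
sample_fills = [
    (date(2020, 1, 1), 30),
    (date(2020, 2, 10), 30),
    (date(2020, 3, 11), 30),
]
strict = initial_episode_days(sample_fills, allowable_gap_days=0)    # 30
lenient = initial_episode_days(sample_fills, allowable_gap_days=30)  # 100
```

    In this sketch the tolerated gap days count toward the episode length, which is one of the design decisions the abstract argues should be stated explicitly.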

  18. Subgrouping Automata: automatic sequence subgrouping using phylogenetic tree-based optimum subgrouping algorithm.

    PubMed

    Seo, Joo-Hyun; Park, Jihyang; Kim, Eun-Mi; Kim, Juhan; Joo, Keehyoung; Lee, Jooyoung; Kim, Byung-Gee

    2014-02-01

    Sequence subgrouping for a given sequence set can enable various informative tasks such as the functional discrimination of sequence subsets and the functional inference of unknown sequences. Because an identity threshold for sequence subgrouping may vary according to the given sequence set, it is highly desirable to construct a robust subgrouping algorithm which automatically identifies an optimal identity threshold and generates subgroups for a given sequence set. To this end, an automatic sequence subgrouping method, named 'Subgrouping Automata', was constructed. First, a tree analysis module analyzes the structure of the tree and calculates all possible subgroups at each node. A sequence similarity analysis module calculates the average sequence similarity for all subgroups at each node. A representative sequence generation module finds a representative sequence using profile analysis and self-scoring for each subgroup. For all nodes, average sequence similarities are calculated and 'Subgrouping Automata' searches for the node showing the statistically maximum increase in sequence similarity using Student's t-value. The node showing the maximum t-value, which gives the most significant difference in average sequence similarity between two adjacent nodes, is determined to be the optimum subgrouping node in the phylogenetic tree. Further analysis showed that the optimum subgrouping node from SA prevents under-subgrouping and over-subgrouping. Copyright © 2013. Published by Elsevier Ltd.

  19. The primary structure of the Saccharomyces cerevisiae gene for 3-phosphoglycerate kinase.

    PubMed Central

    Hitzeman, R A; Hagie, F E; Hayflick, J S; Chen, C Y; Seeburg, P H; Derynck, R

    1982-01-01

    The DNA sequence of the gene for the yeast glycolytic enzyme, 3-phosphoglycerate kinase (PGK), has been obtained by sequencing part of a 3.1 kbp HindIII fragment obtained from the yeast genome. The structural gene sequence corresponds to a reading frame of 1251 bp coding for 416 amino acids with no intervening DNA sequences. The amino acid sequence is approximately 65 percent homologous with human and horse PGK protein sequences and is in general agreement with the published protein sequence for yeast PGK. As for other highly expressed structural genes in yeast, the coding sequence is highly codon biased with 95 percent of the amino acids coded for by a select 25 codons (out of 61 possible). Besides structural DNA sequence, 291 bp of 5'-flanking sequence and 286 bp of 3'-flanking sequence were determined. Transcription starts 36 nucleotides upstream from the translational start and stops 86-93 nucleotides downstream from the translational stop. These results suggest a non-polyadenylated mRNA length of 1373 to 1380 nucleotides, which is consistent with the observed length of 1500 nucleotides for polyadenylated PGK mRNA. A sequence TATATATAAA is found at 145 nucleotides upstream from the translational start. This sequence resembles the TATAAA box that is possibly associated with RNA polymerase II binding. PMID:6296791
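
    The mRNA length quoted in this abstract follows directly from the reported figures, as the short check below shows. All numbers are taken from the abstract; the variable names are illustrative, and reading the three bases beyond 416 codons as a stop codon and the remainder of the 1500 nucleotides as a poly(A) tail are interpretations rather than statements made by the authors.

```python
# Figures reported in the abstract.
CODING_BP = 1251            # reading frame length
AMINO_ACIDS = 416
FIVE_PRIME_NT = 36          # transcription start, upstream of the start codon
THREE_PRIME_NT = (86, 93)   # transcription stop, downstream of the stop codon
POLYADENYLATED_NT = 1500    # observed polyadenylated mRNA length

# 416 codons account for 1248 bp; the remaining 3 bp are presumably the stop codon.
extra_bp = CODING_BP - 3 * AMINO_ACIDS

# Non-polyadenylated transcript: 5' leader + reading frame + 3' trailer.
mrna_range = tuple(FIVE_PRIME_NT + CODING_BP + tail for tail in THREE_PRIME_NT)

# Difference from the observed length, read here as an implied poly(A) tail.
implied_tail = (POLYADENYLATED_NT - mrna_range[1], POLYADENYLATED_NT - mrna_range[0])

print(extra_bp, mrna_range, implied_tail)
```

    The computed range matches the 1373 to 1380 nucleotides stated in the abstract.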

  20. On the Trend of the Annual Mean, Maximum, and Minimum Temperature and the Diurnal Temperature Range in the Armagh Observatory, Northern Ireland, Dataset, 1844-2012

    NASA Technical Reports Server (NTRS)

    Wilson, Robert M.

    2013-01-01

    Examined are the annual averages, 10-year moving averages, decadal averages, and sunspot cycle (SC) length averages of the mean, maximum, and minimum surface air temperatures and the diurnal temperature range (DTR) for the Armagh Observatory, Northern Ireland, during the interval 1844-2012. Strong upward trends are apparent in the Armagh surface-air temperatures (ASAT), while a strong downward trend is apparent in the DTR, especially when the ASAT data are averaged by decade or over individual SC lengths. The long-term decrease in the decadal- and SC-averaged annual DTR occurs because the annual minimum temperatures have risen more quickly than the annual maximum temperatures. Estimates are given for the Armagh annual mean, maximum, and minimum temperatures and the DTR for the current decade (2010-2019) and SC24.

  1. Trends and Variability of the Outdoor Skating Season in Canada during 1951-2005

    NASA Astrophysics Data System (ADS)

    Damyanov, Nikolay Nikolaev

    Climate change affects a range of human activities, including one of Canada's prime sources of entertainment: ice skating. Whether done recreationally or as hockey, its outdoor component is heavily dependent on weather and climate. Based on information obtained from public works officials from various Canadian cities, I have established a meteorological criterion for the initiation of an outdoor skating season (OSS) as the last day in a sequence of the first three consecutive fall/winter days with a maximum temperature below -5 °C. In addition, I derive a proxy of the OSS length, defined as the total number of days with a maximum temperature below -5 °C after the OSS start date and before the start of March. Using these filters, I have extracted the start dates and the lengths of the OSS for each year during the fifty-five year period 1951-2005 from a comprehensive daily temperature dataset (Vincent et al., 2002). For each station, I created time series of both the OSS start dates and OSS lengths, and calculated the magnitude, sign and statistical significance of the slopes of the best-fit lines to each time series. In order to establish a relationship of the OSS with large-scale climate patterns, I grouped stations into six climatic regions. Depending on location, I then tested each region for correlation with the Pacific North-American teleconnection pattern (PNA) or the North Atlantic Oscillation (NAO), using a composite analysis method. Lastly, I removed the signal due to these climate fluctuations from the OSS start date and length trends in order to determine how much of the variability was caused by these interannual climate oscillations. The results of the study indicate that most stations in British Columbia and southwest Alberta, as well as those in the southern Ontario/Quebec region have witnessed a progressively later onset of the OSS over time. 
The Prairies, northwest Canada, and some Maritime locales show the opposite trend, although the magnitudes of the slopes are smaller. Significance tests on the regression lines show that most of these trends are not significant at the 95% level. However, OSS start dates in western Canada are very well correlated with PNA patterns by happening later on the average whenever PNA is positive and more warm air is channeled towards the west coast; the OSS start dates in eastern Canada show a similar connection with the NAO. The OSS lengths exhibit different trends: five of the six regions show a decrease in OSS length with the only region having experienced a lengthening of the OSS being the Maritimes. The statistical significance of the OSS length slopes is much higher than that of OSS start slopes, and the correlation with the PNA or NAO is similar in both cases. After carrying out the last procedure (removal of the PNA and NAO signals from the OSS start date and length series), I found an increase in the new slopes and their significance for more than half of my geographic regions' OSS start date and length trends.
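
    The two OSS definitions given in this abstract translate into a short filter over a daily maximum-temperature series. The sketch below is one possible reading, not the author's code: function names and the sample data are hypothetical, the input is assumed to be a date-sorted list of (date, maximum temperature in °C) pairs for a single fall/winter season, and "after the OSS start date" is interpreted as strictly after.

```python
from datetime import date, timedelta

THRESHOLD_C = -5.0  # criterion stated in the abstract


def oss_start(records):
    """Last day of the first run of three consecutive days below the threshold.

    `records` is a date-sorted list of (day, tmax_celsius) pairs.
    Returns None if no such run exists.
    """
    run = 0
    previous_day = None
    for day, tmax in records:
        adjacent = previous_day is not None and day - previous_day == timedelta(days=1)
        if tmax < THRESHOLD_C:
            run = run + 1 if (adjacent and run > 0) else 1
            if run == 3:
                return day
        else:
            run = 0
        previous_day = day
    return None


def oss_length(records, march_first):
    """Number of cold days strictly after the start date and before 1 March."""
    start = oss_start(records)
    if start is None:
        return 0
    return sum(
        1 for day, tmax in records
        if start < day < march_first and tmax < THRESHOLD_C
    )


# Synthetic ten-day series beginning 1 December 2000 (illustrative only).
temps = [-2.0, -6.0, -7.0, -3.0, -6.0, -6.0, -8.0, -9.0, 0.0, -10.0]
season = [(date(2000, 12, 1) + timedelta(days=i), t) for i, t in enumerate(temps)]

start_day = oss_start(season)                    # 7 December 2000
length = oss_length(season, date(2001, 3, 1))    # 2 cold days follow the start
```

    Repeating this for each station and year yields the start-date and length time series to which the best-fit lines described in the abstract are fitted.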

  2. GASP: Gapped Ancestral Sequence Prediction for proteins

    PubMed Central

    Edwards, Richard J; Shields, Denis C

    2004-01-01

    Background: The prediction of ancestral protein sequences from multiple sequence alignments is useful for many bioinformatics analyses. Predicting ancestral sequences is not a simple procedure and relies on accurate alignments and phylogenies. Several algorithms exist based on Maximum Parsimony or Maximum Likelihood methods but many current implementations are unable to process residues with gaps, which may represent insertion/deletion (indel) events or sequence fragments. Results: Here we present a new algorithm, GASP (Gapped Ancestral Sequence Prediction), for predicting ancestral sequences from phylogenetic trees and the corresponding multiple sequence alignments. Alignments may be of any size and contain gaps. GASP first assigns the positions of gaps in the phylogeny before using a likelihood-based approach centred on amino acid substitution matrices to assign ancestral amino acids. Important outgroup information is used by first working down from the tips of the tree to the root, using descendant data only to assign probabilities, and then working back up from the root to the tips using descendant and outgroup data to make predictions. GASP was tested on a number of simulated datasets based on real phylogenies. Prediction accuracy for ungapped data was similar to three alternative algorithms tested, with GASP performing better in some cases and worse in others. Adding simple insertions and deletions to the simulated data did not have a detrimental effect on GASP accuracy. Conclusions: GASP (Gapped Ancestral Sequence Prediction) will predict ancestral sequences from multiple protein alignments of any size. Although not as accurate in all cases as some of the more sophisticated maximum likelihood approaches, it can process a wide range of input phylogenies and will predict ancestral sequences for gapped and ungapped residues alike. PMID:15350199

  3. Full-length genome sequences of five hepatitis C virus isolates representing subtypes 3g, 3h, 3i and 3k, and a unique genotype 3 variant.

    PubMed

    Lu, Ling; Li, Chunhua; Yuan, Jie; Lu, Teng; Okamoto, Hiroaki; Murphy, Donald G

    2013-03-01

    We characterized the full-length genomes of five distinct hepatitis C virus (HCV)-3 isolates. These represent the first complete genomes for subtypes 3g and 3h, the second such genomes for 3k and 3i, and of one novel variant presently not assigned to a subtype. Each genome was determined from 18-25 overlapping fragments. They had lengths of 9579-9660 nt and each contained a single ORF encoding 3020-3025 aa. They were isolated from five patients residing in Canada; four were of Asian origin and one was of Somali origin. Phylogenetic analysis using 64 partial NS5B sequences differentiated 10 assigned subtypes, 3a-3i and 3k, and two additional lineages within genotype 3. From the data of this study, HCV-3 full-length sequences are now available for six of the assigned subtypes and one unassigned. Our findings should add insights to HCV evolutionary studies and clinical applications.

  4. Metagenomic and near full-length 16S rRNA sequence data in support of the phylogenetic analysis of the rumen bacterial community in steers

    USDA-ARS's Scientific Manuscript database

    Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplificatio...

  5. Sequence investigation of 34 forensic autosomal STRs with massively parallel sequencing.

    PubMed

    Zhang, Suhua; Niu, Yong; Bian, Yingnan; Dong, Rixia; Liu, Xiling; Bao, Yun; Jin, Chao; Zheng, Hancheng; Li, Chengtao

    2018-05-01

    STRs vary not only in the length of the repeat units and the number of repeats but also in the region with which they conform to an incremental repeat pattern. Massively parallel sequencing (MPS) offers new possibilities in the analysis of STRs since it can simultaneously sequence multiple targets in a single reaction and capture potential internal sequence variations. Here, we sequenced 34 STRs applied in the forensic community of China with a custom-designed panel. MPS performance was evaluated by sequencing reads analysis, concordance study and sensitivity testing. High coverage sequencing data were obtained to determine the constitute ratios and heterozygous balance. No actual inconsistent genotypes were observed between capillary electrophoresis (CE) and MPS, demonstrating the reliability of the panel and the MPS technology. With the sequencing data from the 200 investigated individuals, 346 and 418 alleles were obtained via CE and MPS technologies at the 34 STRs, indicating MPS technology provides higher discrimination than CE detection. The whole study demonstrated that STR genotyping with the custom panel and MPS technology has the potential not only to reveal length and sequence variations but also to satisfy the demands of high throughput and high multiplexing with acceptable sensitivity.

  6. Spatio-Temporal Structure, Path Characteristics, and Perceptual Grouping in Immediate Serial Spatial Recall

    PubMed Central

    De Lillo, Carlo; Kirby, Melissa; Poole, Daniel

    2016-01-01

    Immediate serial spatial recall measures the ability to retain sequences of locations in short-term memory and is considered the spatial equivalent of digit span. It is tested by requiring participants to reproduce sequences of movements performed by an experimenter or displayed on a monitor. Different organizational factors dramatically affect serial spatial recall but they are often confounded or underspecified. Untangling them is crucial for the characterization of working-memory models and for establishing the contribution of structure and memory capacity to spatial span. We report five experiments assessing the relative role and independence of factors that have been reported in the literature. Experiment 1 disentangled the effects of spatial clustering and path-length by manipulating the distance of items displayed on a touchscreen monitor. Long-path sequences segregated by spatial clusters were compared with short-path sequences not segregated by clusters. Recall was more accurate for sequences segregated by clusters independently from path-length. Experiment 2 featured conditions where temporal pauses were introduced between or within cluster boundaries during the presentation of sequences with the same paths. Thus, the temporal structure of the sequences was either consistent or inconsistent with a hierarchical representation based on segmentation by spatial clusters but the effect of structure could not be confounded with effects of path-characteristics. Pauses at cluster boundaries yielded more accurate recall, as predicted by a hierarchical model. In Experiment 3, the systematic manipulation of sequence structure, path-length, and presence of path-crossings of sequences showed that structure explained most of the variance, followed by the presence/absence of path-crossings, and path-length. 
Experiments 4 and 5 replicated the results of the previous experiments in immersive virtual reality navigation tasks where the viewpoint of the observer changed dynamically during encoding and recall. This suggested that the effects of structure in spatial span are not dependent on perceptual grouping processes induced by the aerial view of the stimulus array typically afforded by spatial recall tasks. These results demonstrate the independence of coding strategies based on structure from effects of path characteristics and perceptual grouping in immediate serial spatial recall. PMID:27891101

  7. Cell Wall and Membrane-Associated Exo-β-d-Glucanases from Developing Maize Seedlings

    PubMed Central

    Kim, Jong-Bum; Olek, Anna T.; Carpita, Nicholas C.

    2000-01-01

    A β-d-glucan exohydrolase was purified from the cell walls of developing maize (Zea mays L.) shoots. The cell wall enzyme preferentially hydrolyzes the non-reducing terminal glucosyl residue from (1→3)-β-d-glucans, but also hydrolyzes (1→2)-, (1→6)-, and (1→4)-β-d-glucosyl units in decreasing order of activity. Polyclonal antisera raised against the purified exo-β-d-glucanase (ExGase) were used to select partial-length cDNA clones, and the complete sequence of 622 amino acid residues was deduced from the nucleotide sequences of the cDNA and a full-length genomic clone. Northern gel-blot analysis revealed what appeared to be a single transcript, but three distinct polypeptides were detected in immunogel-blot analyses of the ExGases extracted from growing coleoptiles. Two polypeptides appear in the cell wall, where one polypeptide is constitutive, and the second appears at the time of the maximum rate of elongation and reaches peak activity after elongation has ceased. The appearance of the second polypeptide coincides with the disappearance of the mixed-linkage (1→3),(1→4)-β-d-glucan, whose accumulation is associated with cell elongation in grasses. The third polypeptide of the ExGase is an extrinsic protein associated with the exterior surface of the plasma membrane. Although the activity of the membrane-associated ExGase is highest against (1→3)-β-d-glucans, the activity against (1→4)-β-d-glucan linkages is severely attenuated and, therefore, the enzyme is unlikely to be involved with turnover of the (1→3),(1→4)-β-d-glucan. We propose three potential functions for this novel ExGase at the membrane-wall interface. PMID:10859178

  8. In Vitro Magnetic Resonance Imaging Evaluation of Fragmented, Open-Coil, Percutaneous Peripheral Nerve Stimulation Leads.

    PubMed

    Shellock, Frank G; Zare, Armaan; Ilfeld, Brian M; Chae, John; Strother, Robert B

    2018-04-01

    Percutaneous peripheral nerve stimulation (PNS) is an FDA-cleared pain treatment. Occasionally, fragments of the lead (MicroLead, SPR Therapeutics, LLC, Cleveland, OH, USA) may be retained following lead removal. Since the lead is metallic, there are associated magnetic resonance imaging (MRI) risks. Therefore, the objective of this investigation was to evaluate MRI-related issues (i.e., magnetic field interactions, heating, and artifacts) for various lead fragments. Testing was conducted using standardized techniques on lead fragments of different lengths (i.e., 50, 75, and 100% of maximum possible fragment length of 12.7 cm) to determine MRI-related problems. Magnetic field interactions (i.e., translational attraction and torque) and artifacts were tested for the longest lead fragment at 3 Tesla. MRI-related heating was evaluated at 1.5 Tesla/64 MHz and 3 Tesla/128 MHz with each lead fragment placed in a gelled-saline filled phantom. Temperatures were recorded on the lead fragments while using relatively high RF power levels. Artifacts were evaluated using T1-weighted, spin echo, and gradient echo (GRE) pulse sequences. The longest lead fragment produced only minor magnetic field interactions. For the lead fragments evaluated, physiologically inconsequential MRI-related heating occurred at 1.5 Tesla/64 MHz while under certain 3 Tesla/128 MHz conditions, excessive temperature elevations may occur. Artifacts extended approximately 7 mm from the lead fragment on the GRE pulse sequence, suggesting that anatomy located at a position greater than this distance may be visualized on MRI. MRI may be performed safely in patients with retained lead fragments at 1.5 Tesla using the specific conditions of this study (i.e., MR Conditional). Due to possible excessive temperature rises at 3 Tesla, performing MRI at that field strength is currently inadvisable. © 2017 International Neuromodulation Society.

  9. Radiofrequency ablation of bone with cooled probes and impedance control energy delivery in a pig model: MR imaging features.

    PubMed

    Cantwell, Colin P; Flavin, Robert; Deane, Richard; Sheehan, Katherine; Dervan, Peter; O'Byrne, John; Eustace, Stephen

    2007-08-01

    To determine the coronal marrow ablation length and detect cortical thinning after radiofrequency ablation (RFA) of bone in a pig model. Twelve pigs underwent RFA with a 1- or 2-cm single internally cooled electrode placed at the mid-diaphyseal point of their long bones at 1, 7, or 28 days before euthanasia. Twelve minutes of impedance control radiofrequency energy was delivered at maximum output from a 200-W generator. Pigs were imaged with axial and coronal turbo spin-echo (SE) T1- and T2-weighted frequency-selective fat suppression sequences by using spectral presaturation with inversion recovery (SPIR). A radiologist blinded to the timing of the treatment and the results of other imaging sequences measured the coronal ablation zone length and cortical thickness. The pigs were euthanized, and the ablated bone underwent histologic examination. At SPIR imaging, the zone of marrow ablation was defined as an area of low signal intensity surrounded by a high-signal-intensity band. At T1-weighted imaging, the zone of marrow ablation was defined as a heterogeneously isointense area surrounded by a low-signal-intensity band. The mean (+/-standard deviation) coronal marrow ablation zone measurement with SPIR imaging at 28 days was 47 mm +/- 9 (range, 34-73 mm) for the 1-cm electrode and 51 mm +/- 7 (range, 33-67 mm) for the 2-cm electrode. Two humeral fractures occurred at 21 and 28 days after therapy. Thinning of the cortex adjacent to the electrode insertion site was identified in the humeral group only. The change in the marrow signal intensity with impedance-controlled RFA is larger than that reported for temperature-controlled protocols. RFA leads to bone weakening.

  10. Distribution and Evolution of Yersinia Leucine-Rich Repeat Proteins

    PubMed Central

    Hu, Yueming; Huang, He; Hui, Xinjie; Cheng, Xi; White, Aaron P.

    2016-01-01

    Leucine-rich repeat (LRR) proteins are widely distributed in bacteria, playing important roles in various protein-protein interaction processes. In Yersinia, the well-characterized type III secreted effector YopM also belongs to the LRR protein family and is encoded by virulence plasmids. However, little has been known about other LRR members encoded by Yersinia genomes or their evolution. In this study, the Yersinia LRR proteins were comprehensively screened, categorized, and compared. The LRR proteins encoded by chromosomes (LRR1 proteins) appeared to be more similar to each other and different from those encoded by plasmids (LRR2 proteins) with regard to repeat-unit length, amino acid composition profile, and gene expression regulation circuits. LRR1 proteins were also different from LRR2 proteins in that the LRR1 proteins contained an E3 ligase domain (NEL domain) in the C-terminal region or an NEL domain-encoding nucleotide relic in flanking genomic sequences. The LRR1 protein-encoding genes (LRR1 genes) varied dramatically and were categorized into 4 subgroups (a to d), with the LRR1a to -c genes evolving from the same ancestor and LRR1d genes evolving from another ancestor. The consensus and ancestor repeat-unit sequences were inferred for different LRR1 protein subgroups by use of a maximum parsimony modeling strategy. Structural modeling disclosed very similar repeat-unit structures between LRR1 and LRR2 proteins despite the different unit lengths and amino acid compositions. Structural constraints may serve as the driving force to explain the observed mutations in the LRR regions. This study suggests that there may be functional variation and lays the foundation for future experiments investigating the functions of the chromosomally encoded LRR proteins of Yersinia. PMID:27217422

  11. Maximum height and minimum time vertical jumping.

    PubMed

    Domire, Zachary J; Challis, John H

    2015-08-20

    The performance criterion in maximum vertical jumping has typically been assumed to simply raise the center of mass as high as possible. In many sporting activities minimizing movement time during the jump is likely also critical to successful performance. The purpose of this study was to examine maximum height jumps performed while minimizing jump time. A direct dynamics model was used to examine squat jump performance, with dual performance criteria: maximize jump height and minimize jump time. The muscle model had activation dynamics, force-length, force-velocity properties, and a series elastic component representing the tendon. The simulations were run in two modes. In Mode 1 the model was placed in a fixed initial position. In Mode 2 the simulation model selected the initial squat configuration as well as the sequence of muscle activations. The inclusion of time as a factor in Mode 1 simulations resulted in a small decrease in jump height and moderate time savings. The improvement in time was mostly accomplished by taking off from a less extended position. In Mode 2 simulations, more substantial time savings could be achieved by beginning the jump in a more upright posture. However, when time was weighted more heavily in these simulations, there was a more substantial reduction in jump height. Future work is needed to examine the implications for countermovement jumping and to examine the possibility of minimizing movement time as part of the control scheme even when the task is to jump maximally. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. Giardia telomeric sequence d(TAGGG)4 forms two intramolecular G-quadruplexes in K+ solution: effect of loop length and sequence on the folding topology.

    PubMed

    Hu, Lanying; Lim, Kah Wai; Bouaziz, Serge; Phan, Anh Tuân

    2009-11-25

    Recently, it has been shown that in K(+) solution the human telomeric sequence d[TAGGG(TTAGGG)(3)] forms a (3 + 1) intramolecular G-quadruplex, while the Bombyx mori telomeric sequence d[TAGG(TTAGG)(3)], which differs from the human counterpart only by one G deletion in each repeat, forms a chair-type intramolecular G-quadruplex, indicating an effect of G-tract length on the folding topology of G-quadruplexes. To explore the effect of loop length and sequence on the folding topology of G-quadruplexes, here we examine the structure of the four-repeat Giardia telomeric sequence d[TAGGG(TAGGG)(3)], which differs from the human counterpart only by one T deletion within the non-G linker in each repeat. We show by NMR that this sequence forms two different intramolecular G-quadruplexes in K(+) solution. The first one is a novel basket-type antiparallel-stranded G-quadruplex containing two G-tetrads, a G x (A-G) triad, and two A x T base pairs; the three loops are consecutively edgewise-diagonal-edgewise. The second one is a propeller-type parallel-stranded G-quadruplex involving three G-tetrads; the three loops are all double-chain-reversal. Recurrence of several structural elements in the observed structures suggests a "cut and paste" principle for the design and prediction of G-quadruplex topologies, for which different elements could be extracted from one G-quadruplex and inserted into another.

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hachisu, Izumi; Kato, Mariko, E-mail: hachisu@ea.c.u-tokyo.ac.jp, E-mail: mariko@educ.cc.keio.ac.jp

    We identified a general course of classical nova outbursts in the B – V versus U – B color-color diagram. It is reported that novae show spectra similar to those of A-F supergiants near optical light maximum. However, they do not follow the supergiant sequence in the color-color diagram, neither the blackbody nor the main-sequence sequence. Instead, we found that novae evolve along a new sequence in the pre-maximum and near-maximum phases, which we call 'the nova-giant sequence'. This sequence is parallel to but Δ(U – B) ≈ –0.2 mag bluer than the supergiant sequence. This is because the mass of a nova envelope is much (∼10⁻⁴ times) less than that of a normal supergiant. After optical maximum, its color quickly evolves back blueward along the same nova-giant sequence and reaches the point of free-free emission (B – V = –0.03, U – B = –0.97), which coincides with the intersection of the blackbody sequence and the nova-giant sequence, and remains there for a while. Then the color evolves leftward (blueward in B – V but almost constant in U – B), owing mainly to the development of strong emission lines. This is the general course of nova outbursts in the color-color diagram, which was deduced from eight well-observed novae in various speed classes. For a nova with unknown extinction, we can determine a reliable value of the color excess by matching the observed track of the target nova with this general course. This is a new and convenient method for obtaining the color excesses of classical novae. Using this method, we redetermined the color excesses of 20 well-observed novae. The obtained color excesses are in reasonable agreement with the previous results, which in turn support the idea of our general track of nova outbursts. Additionally, we estimated the absolute V magnitudes of about 30 novae using a method for time-stretching nova light curves to analyze the distance-reddening relations of the novae.

  14. Method and apparatus for biological sequence comparison

    DOEpatents

    Marr, T.G.; Chang, W.I.

    1997-12-23

    A method and apparatus are disclosed for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence. 5 figs.

  15. Method and apparatus for biological sequence comparison

    DOEpatents

    Marr, Thomas G.; Chang, William I-Wei

    1997-01-01

    A method and apparatus for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence.
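
    The patent abstracts above describe a filter that divides the subject sequence into overlapping blocks, each large enough to contain a minimum-length alignment. The sketch below shows one simple way to build such blocks. The block size (twice the minimum length) and the step (one minimum length) are assumptions chosen for illustration, and the name `overlapping_blocks` is hypothetical; neither is stated in the abstracts.

```python
def overlapping_blocks(query, min_len):
    """Yield (offset, block) pairs covering the query with overlap.

    Assumed layout: blocks of length 2 * min_len that advance by min_len,
    so every window of length min_len lies entirely inside some block.
    """
    if min_len <= 0:
        raise ValueError("min_len must be positive")
    last_start = max(len(query) - min_len, 0)
    for offset in range(0, last_start + 1, min_len):
        # The final block may be shorter than 2 * min_len.
        yield offset, query[offset:offset + 2 * min_len]


# Example: a 10-letter query with a minimum alignment length of 3.
blocks = list(overlapping_blocks("ABCDEFGHIJ", 3))
```

    With this layout, a window starting at position s falls inside the block that starts at (s // min_len) * min_len, which is why no minimum-length region is split across block boundaries.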

  16. Long-read sequencing of the coffee bean transcriptome reveals the diversity of full-length transcripts

    PubMed Central

    Cheng, Bing; Furtado, Agnelo

    2017-01-01

    Polyploidization contributes to the complexity of gene expression, resulting in numerous related but different transcripts. This study explored the transcriptome diversity and complexity of the tetraploid Arabica coffee (Coffea arabica) bean. Long-read sequencing (LRS) by PacBio Isoform sequencing (Iso-seq) was used to obtain full-length transcripts without the difficulty and uncertainty of assembly required for reads from short-read technologies. The tetraploid transcriptome was annotated and compared with data from the sub-genome progenitors. Caffeine and sucrose genes were targeted for case analysis. An isoform-level tetraploid coffee bean reference transcriptome with 95 995 distinct transcripts (average 3236 bp) was obtained. A total of 88 715 sequences (92.42%) were annotated with BLASTx against NCBI non-redundant plant proteins, including 34 719 high-quality annotations. Further BLASTn analysis against NCBI non-redundant nucleotide sequences, Coffea canephora coding sequences with UTR, C. arabica ESTs, and Rfam left 1213 sequences without hits, which are potential novel genes in coffee. Longer UTRs were captured, especially in the 5′ UTRs, facilitating the identification of upstream open reading frames. The LRS also revealed more and longer transcript variants in key caffeine and sucrose metabolism genes from this polyploid genome. Long sequences (>10 kb) were poorly annotated. LRS technology shows the limitations of previous studies. It provides an important tool to produce a reference transcriptome including more of the diversity of full-length transcripts to help understand the biology and support the genetic improvement of polyploid species such as coffee. PMID:29048540

  17. Thermoanaerobacterium calidifontis sp. nov., a novel anaerobic, thermophilic, ethanol-producing bacterium from hot springs in China.

    PubMed

    Shang, Shu-mei; Qian, Long; Zhang, Xu; Li, Kun-zhi; Chagan, Irbis

    2013-06-01

    A novel thermophilic Gram-stain-positive strain, Rx1, was isolated from hot springs in Baoshan of Yunnan Province, China. The strain was characterized as a hemicellulose-decomposing, obligately anaerobic bacterium that is rod-shaped (diameter: 0.5-0.7 μm; length: 2.0-6.7 μm), spore-forming, and motile. Its growth temperature range is 38-68 °C (optimum 50-55 °C) and pH range is 4.5-8.0 (optimum 7.0). The maximum tolerance concentration of NaCl was 3 %. Rx1 converted thiosulfate to elemental sulfur and reduced sulfite to hydrogen sulfide. The bacterium grew by utilizing xylan and starch, as well as a wide range of monosaccharides and polysaccharides, including glucose and xylose. The main products of fermentation were ethanol, lactate, acetate, CO2, and H2. The maximum xylanase activity in the culture supernatant after 30 h of incubation at 55 °C was 16.2 U/ml. Rx1 DNA G + C content was 36 mol %. 16S rRNA gene sequence analysis indicated that strain Rx1 belonged to the genus Thermoanaerobacterium of the family 'Thermoanaerobacteriaceae' (Firmicutes), with Thermoanaerobacterium aciditolerans 761-119 (99.2 % 16S rRNA gene sequence similarity) being its closest relative. DNA-DNA hybridization between Rx1 and T. aciditolerans 761-119 showed 36 % relatedness. Based on its physiological and biochemical tests and DNA-DNA hybridization analyses, the isolate is considered to represent a novel species in the genus Thermoanaerobacterium, for which the name Thermoanaerobacterium calidifontis sp. nov. is proposed, with type strain Rx1 (=JCM 18270 = CCTCC M 2011109).

  18. The Complete Mitochondrial Genome of Coptotermes ‘suzhouensis’ (syn. Coptotermes formosanus) (Isoptera: Rhinotermitidae) and Molecular Phylogeny Analysis

    PubMed Central

    Li, Juan; Zhu, Jin-long; Lou, Shi-di; Wang, Ping; Zhang, You-sen; Wang, Lin; Yin, Ruo-chun; Zhang, Ping-ping

    2018-01-01

    Coptotermes suzhouensis (Isoptera: Rhinotermitidae) is a significant subterranean termite pest of wooden structures and is widely distributed in southeastern China. The complete mitochondrial DNA sequence of C. suzhouensis was analyzed in this study. The mitogenome was a circular molecule of 15,764 bp in length, which contained 13 protein-coding genes (PCGs), 22 transfer RNA genes, two ribosomal RNA genes, and an A+T-rich region with a gene arrangement typical of Isoptera mitogenomes. All PCGs were initiated by ATN codons and terminated by complete termination codons (TAA), except COX2, ND5, and Cytb, which ended with an incomplete termination codon T. All tRNAs displayed a typical clover-leaf structure, except for tRNASer(AGN), which did not contain the stem-loop structure in the DHU arm. The A+T content (69.23%) of the A+T-rich region (949 bp) was higher than that of the entire mitogenome (65.60%), and two different sets of repeat units (A+B) were distributed in this region. Comparison of complete mitogenome sequences with those of Coptotermes formosanus indicated that the two taxa have very high genetic similarity. Forty-one representative termite species were used to construct phylogenetic trees by maximum likelihood, maximum parsimony, and Bayesian inference methods. The phylogenetic analyses also strongly supported (BPP, MLBP, and MPBP = 100%) that all C. suzhouensis and C. formosanus samples gathered into one clade with genetic distances between 0.000 and 0.002. This study provides molecular evidence for a more robust phylogenetic position of C. suzhouensis and infers that C. suzhouensis is a synonym of C. formosanus. PMID:29718488
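
    The A+T content figures quoted in this abstract are simple base-composition percentages. As a minimal sketch, the helper below computes that percentage for a nucleotide string. The function name `at_content` is hypothetical, and counting every character (including any ambiguous bases) in the denominator is an assumption rather than the authors' stated procedure.

```python
def at_content(seq):
    """Return the percentage of A and T bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    at = seq.count("A") + seq.count("T")
    # Assumption: all characters, including ambiguity codes, count in the total.
    return 100.0 * at / len(seq)
```

    Calling the same function on a sub-region and on the whole sequence gives the kind of comparison the abstract reports between the A+T-rich region and the full mitogenome.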

  19. Emergence of Cryptosporidium hominis Monkey Genotype II and Novel Subtype Family Ik in the Squirrel Monkey (Saimiri sciureus) in China.

    PubMed

    Liu, Xuehan; Xie, Na; Li, Wei; Zhou, Ziyao; Zhong, Zhijun; Shen, Liuhong; Cao, Suizhong; Yu, Xingming; Hu, Yanchuan; Chen, Weigang; Peng, Gangneng

    2015-01-01

    A single Cryptosporidium isolate from a squirrel monkey with no clinical symptoms was obtained from a zoo in Ya'an city, China, and was genotyped by PCR amplification and DNA sequencing of the small-subunit ribosomal RNA (SSU rRNA), 70-kDa heat shock protein (HSP70), Cryptosporidium oocyst wall protein, and actin genes. This multilocus genetic characterization determined that the isolate was Cryptosporidium hominis, but carried 2, 10, and 6 nucleotide differences in the SSU rRNA, HSP70, and actin loci, respectively, which is comparable to the variations at these loci between C. hominis and the previously reported monkey genotype (2, 3, and 3 nucleotide differences). Phylogenetic studies, based on neighbor-joining and maximum likelihood methods, showed that the isolate identified in the current study had a distinctly discordant taxonomic status, distinct from known C. hominis and also from the monkey genotype, with respect to the three loci. Restriction fragment length polymorphisms of the SSU rRNA gene obtained from this study were similar to those of known C. hominis but clearly differentiated from the monkey genotype. Further subtyping was performed by sequence analysis of the gene encoding the 60-kDa glycoprotein (gp60). Maximum homology of only 88.3% to C. hominis subtype IdA10G4 was observed for the current isolate, and phylogenetic analysis demonstrated that this particular isolate belonged to a novel C. hominis subtype family, IkA7G4. This study is the first to report C. hominis infection in the squirrel monkey and, based on the observed genetic characteristics, confirms a new C. hominis genotype, monkey genotype II. Thus, these results provide novel insights into genotypic variation in C. hominis.

  20. Assessing the genetic diversity of Cu resistance in mine tailings through high-throughput recovery of full-length copA genes

    PubMed Central

    Li, Xiaofang; Zhu, Yong-Guan; Shaban, Babak; Bruxner, Timothy J. C.; Bond, Philip L.; Huang, Longbin

    2015-01-01

    Characterizing the genetic diversity of microbial copper (Cu) resistance at the community level remains challenging, mainly due to the polymorphism of the core functional gene copA. In this study, a local BLASTN method using a copA database built in this study was developed to recover full-length putative copA sequences from an assembled tailings metagenome; these sequences were then screened for potentially functioning CopA using conserved metal-binding motifs, inferred by evolutionary trace analysis of CopA sequences from known Cu resistant microorganisms. In total, 99 putative copA sequences were recovered from the tailings metagenome, out of which 70 were found with high potential to be functioning in Cu resistance. Phylogenetic analysis of selected copA sequences detected in the tailings metagenome showed that topology of the copA phylogeny is largely congruent with that of the 16S-based phylogeny of the tailings microbial community obtained in our previous study, indicating that the development of copA diversity in the tailings might be mainly through vertical descent with few lateral gene transfer events. The method established here can be used to explore copA (and potentially other metal resistance genes) diversity in any metagenome and has the potential to exhaust the full-length gene sequences for downstream analyses. PMID:26286020

  1. Characterization of genetic sequence variation of 58 STR loci in four major population groups.

    PubMed

    Novroski, Nicole M M; King, Jonathan L; Churchill, Jennifer D; Seah, Lay Hong; Budowle, Bruce

    2016-11-01

    Massively parallel sequencing (MPS) can identify sequence variation within short tandem repeat (STR) alleles as well as their nominal allele lengths that traditionally have been obtained by capillary electrophoresis. Using the MiSeq FGx Forensic Genomics System (Illumina), STRait Razor, and in-house Excel workbooks, genetic variation was characterized within STR repeat and flanking regions of 27 autosomal, 7 X-chromosome and 24 Y-chromosome STR markers in 777 unrelated individuals from four population groups. Seven hundred and forty six autosomal, 227 X-chromosome, and 324 Y-chromosome STR alleles were identified by sequence compared with 357 autosomal, 107 X-chromosome, and 189 Y-chromosome STR alleles that were identified by length. Within the observed sequence variation, 227 autosomal, 156 X-chromosome, and 112 Y-chromosome novel alleles were identified and described. One hundred and seventy six autosomal, 123 X-chromosome, and 93 Y-chromosome sequence variants resided within STR repeat regions, and 86 autosomal, 39 X-chromosome, and 20 Y-chromosome variants were located in STR flanking regions. Three markers, D18S51, DXS10135, and DYS385a-b had 1, 4, and 1 alleles, respectively, which contained both a novel repeat region variant and a flanking sequence variant in the same nucleotide sequence. There were 50 markers that demonstrated a relative increase in diversity with the variant sequence alleles compared with those of traditional nominal length alleles. These population data illustrate the genetic variation that exists in the commonly used STR markers in the selected population samples and provide allele frequencies for statistical calculations related to STR profiling with MPS data.

  2. A better sequence-read simulator program for metagenomics.

    PubMed

    Johnson, Stephen; Trost, Brett; Long, Jeffrey R; Pittet, Vanessa; Kusalik, Anthony

    2014-01-01

    There are many programs available for generating simulated whole-genome shotgun sequence reads. The data generated by many of these programs follow predefined models, which limits their use to the authors' original intentions. For example, many models assume that read lengths follow a uniform or normal distribution. Other programs generate models from actual sequencing data, but are limited to reads from single-genome studies. To our knowledge, there are no programs that allow a user to generate simulated data following non-parametric read-length distributions and quality profiles based on empirically-derived information from metagenomics sequencing data. We present BEAR (Better Emulation for Artificial Reads), a program that uses a machine-learning approach to generate reads with lengths and quality values that closely match empirically-derived distributions. BEAR can emulate reads from various sequencing platforms, including Illumina, 454, and Ion Torrent. BEAR requires minimal user input, as it automatically determines appropriate parameter settings from user-supplied data. BEAR also uses a unique method for deriving run-specific error rates, and extracts useful statistics from the metagenomic data itself, such as quality-error models. Many existing simulators are specific to a particular sequencing technology; however, BEAR is not restricted in this way. Because of its flexibility, BEAR is particularly useful for emulating the behaviour of technologies like Ion Torrent, for which no dedicated sequencing simulators are currently available. BEAR is also the first metagenomic sequencing simulator program that automates the process of generating abundances, which can be an arduous task. BEAR is useful for evaluating data processing tools in genomics. It has many advantages over existing comparable software, such as generating more realistic reads and being independent of sequencing technology, and has features particularly useful for metagenomics work.
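
    The idea of drawing read lengths from a non-parametric, empirically derived distribution can be illustrated with a minimal Python sketch. This is not BEAR's method (the abstract describes a machine-learning approach whose details are not given here); it only resamples observed lengths with replacement. The function name and the sample data are hypothetical.

```python
import random
from collections import Counter

def sample_read_lengths(observed_lengths, n, seed=0):
    """Draw n read lengths from the empirical distribution of observed_lengths.

    Hypothetical helper for illustration: weighted resampling with
    replacement, so every simulated length is one that was actually observed.
    """
    rng = random.Random(seed)
    counts = Counter(observed_lengths)
    lengths = list(counts)
    weights = [counts[length] for length in lengths]
    return rng.choices(lengths, weights=weights, k=n)

# Made-up observed read lengths (not data from the paper).
observed = [98, 100, 100, 101, 150, 150, 150, 249, 250, 250]
simulated = sample_read_lengths(observed, n=1000)
```

    Unlike a uniform or normal model, this kind of resampling reproduces whatever shape the observed lengths have, including multiple peaks.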

  3. Harmonic Series Meets Fibonacci Sequence

    ERIC Educational Resources Information Center

    Chen, Hongwei; Kennedy, Chris

    2012-01-01

    The terms of a conditionally convergent series may be rearranged to converge to any prescribed real value. What if the harmonic series is grouped into Fibonacci length blocks? Or the harmonic series is arranged in alternating Fibonacci length blocks? Or rearranged and alternated into separate blocks of even and odd terms of Fibonacci length?
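
    The first question can be explored numerically. The sketch below is an illustration only, not taken from the article: it groups the harmonic terms 1/n into consecutive blocks of Fibonacci length (1, 1, 2, 3, 5, ...) and returns the block sums. Block k covers the indices from F(k+1) to F(k+2) - 1, so its sum is close to ln(F(k+2)/F(k+1)), which suggests the block sums settle near the natural logarithm of the golden ratio. The function name is hypothetical.

```python
import math

def fibonacci_block_sums(num_blocks):
    """Sum consecutive harmonic terms 1/n in blocks of length 1, 1, 2, 3, 5, ..."""
    sums = []
    a, b = 1, 1  # current and next Fibonacci block lengths
    n = 1        # index of the next harmonic term to consume
    for _ in range(num_blocks):
        sums.append(math.fsum(1.0 / k for k in range(n, n + a)))
        n += a
        a, b = b, a + b
    return sums

block_sums = fibonacci_block_sums(20)
golden_ratio = (1 + math.sqrt(5)) / 2
# Compare the last block sum with ln(golden ratio).
gap = abs(block_sums[-1] - math.log(golden_ratio))
```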

  4. Flameless Combustion Workshop

    DTIC Science & Technology

    2005-09-20

    Flame volume, and flame length during the HiTAC condition were further studied numerically and systematically. A simple HiTAC flame volume can be...oxygen concentration (stoichiometric ratio) is included, was derived to describe the local influence of buoyancy force along the chemical flame length. It...and low oxygen concentration oxidizer condition. Furthermore, the maximum entrainments along the flame length are estimated. 6. NO emission formed by

  5. Common position of indels that cause deviations from canonical genome organization in different measles virus strains.

    PubMed

    Ivancic-Jelecki, Jelena; Slovic, Anamarija; Šantak, Maja; Tešović, Goran; Forcic, Dubravko

    2016-07-29

    The canonical genome organization of measles virus (MV) is characterized by a total size of 15 894 nucleotides (nts) and a defined length of every genomic region, both coding and non-coding. Only rarely have reports of strains possessing non-canonical genomic properties (possessing indels, with or without a change of total genome length) been published. The observed mutations are mutually compensatory in the sense that the total genome length remains polyhexameric. Although programmed and highly precise pseudo-templated nucleotide additions during transcription are inherent to polymerases of all viruses belonging to the family Paramyxoviridae, a similar mechanism that would serve to non-randomly correct genome length, if an indel has occurred during replication, has so far not been described in the context of a complete virus genome. We compiled all complete MV genomic sequences (64 in total) available in open access sequence databases. Multiple sequence comparisons and phylogenetic analyses were performed with the aim of exploring whether non-recombinant, evolutionarily unlinked measles strains that show deviations from canonical genome organization possess a common genetic characteristic. In 11 MV sequences we detected deviations from canonical genome organization due to short indels located within homopolymeric stretches or next to them. In nine out of 11 identified non-canonical MV sequences, a common feature was observed: one mutation, either an insertion or a deletion, was located in a 28-nt-long region in the F gene 5' untranslated region (positions 5051-5078 in genomic cDNA of canonical strains). This segment is composed of five tandemly linked homopolymeric stretches; its consensus sequence is G6-7C7-8A6-7G1-3C5-6. Although none of the mononucleotide repeats within this segment has a fixed length, the total number of nts in canonical strains is always 28. These nine non-canonical strains, as well as the tenth (not mutated in the 5051-5078 segment), can be grouped in three clusters, based on their passage histories/epidemiological data/genetic similarities. There are no indications that the three clusters are evolutionarily linked, other than the fact that they all belong to clade D. A common narrow genomic region was found to be mutated in different, unrelated, wild-type strains, suggesting that this region might have a function in non-random genome length corrections occurring during MV replication.
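
    Two numerical constraints in this abstract can be checked with a short Python sketch. It assumes that "polyhexameric" means a genome length divisible by six, and it reads the consensus G6-7C7-8A6-7G1-3C5-6 as five stretches with the stated length ranges. The function names are hypothetical, and the enumeration only lists length combinations that add up to 28; it says nothing about which of them occur in real strains.

```python
from itertools import product

CANONICAL_GENOME_LENGTH = 15894  # nts, as stated in the abstract

def is_polyhexameric(length):
    """Assumed meaning: the genome length is a multiple of six."""
    return length % 6 == 0

# Length ranges of the G, C, A, G, C stretches in G6-7 C7-8 A6-7 G1-3 C5-6.
STRETCH_RANGES = [(6, 7), (7, 8), (6, 7), (1, 3), (5, 6)]

def stretch_combinations(total=28):
    """List every combination of stretch lengths whose sum equals total."""
    options = [range(lo, hi + 1) for lo, hi in STRETCH_RANGES]
    return [combo for combo in product(*options) if sum(combo) == total]

combos = stretch_combinations()
```

    The ranges alone allow totals from 25 to 31 nts, so a fixed total of 28 is a real constraint: a change in one stretch has to be offset by a change in another.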

  6. K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics.

    PubMed

    Lin, Jie; Adjeroh, Donald A; Jiang, Bing-Hua; Jiang, Yue

    2018-05-15

    Alignment-free sequence comparison methods can compute the pairwise similarity between a huge number of sequences much faster than sequence-alignment based methods. We propose a new non-parametric alignment-free sequence comparison method, called K2, based on the Kendall statistics. Compared with other state-of-the-art alignment-free comparison methods, K2 demonstrates competitive performance in generating the phylogenetic tree, in evaluating functionally related regulatory sequences, and in computing the edit distance (similarity/dissimilarity) between sequences. Furthermore, the K2 approach is much faster than the other methods. An improved method, K2*, is also proposed, which is able to determine the appropriate algorithmic parameter (length) automatically, without first considering different values. Comparative analysis with the state-of-the-art alignment-free sequence similarity methods demonstrates the superiority of the proposed approaches, especially with increasing sequence length or increasing dataset sizes. The K2 and K2* approaches are implemented as an R package that is freely available for open access (http://community.wvu.edu/daadjeroh/projects/K2/K2_1.0.tar.gz). Supplementary data are available at Bioinformatics online.
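
    The general idea of a rank-based, alignment-free comparison can be sketched in Python. This is not the K2 statistic, whose definition is not given in the abstract; it is a simple stand-in that counts k-mers in two sequences and compares the two count vectors with Kendall's tau-a. All function names and the default k are assumptions made for illustration.

```python
from itertools import combinations, product

def kmer_counts(seq, k):
    """Count every DNA k-mer (in fixed alphabetical order) occurring in seq."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words with ambiguous bases such as N
            counts[word] += 1
    return [counts[kmer] for kmer in kmers]

def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs divided by all pairs."""
    score = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        score += (s > 0) - (s < 0)
    n_pairs = len(x) * (len(x) - 1) // 2
    return score / n_pairs

def kmer_similarity(seq_a, seq_b, k=2):
    """Rank correlation between the k-mer count profiles of two sequences."""
    return kendall_tau_a(kmer_counts(seq_a, k), kmer_counts(seq_b, k))
```

    Because no alignment is computed, the cost depends on the sequence lengths and the number of k-mers rather than on a pairwise dynamic-programming table. Tau-a does not correct for ties, so even identical sequences score below 1 when many k-mers share the same count.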

  7. WebPrInSeS: automated full-length clone sequence identification and verification using high-throughput sequencing data.

    PubMed

    Massouras, Andreas; Decouttere, Frederik; Hens, Korneel; Deplancke, Bart

    2010-07-01

    High-throughput sequencing (HTS) is revolutionizing our ability to obtain cheap, fast and reliable sequence information. Many experimental approaches are expected to benefit from the incorporation of such sequencing features in their pipeline. Consequently, software tools that facilitate such an incorporation should be of great interest. In this context, we developed WebPrInSeS, a web server tool allowing automated full-length clone sequence identification and verification using HTS data. WebPrInSeS encompasses two separate software applications. The first is WebPrInSeS-C which performs automated sequence verification of user-defined open-reading frame (ORF) clone libraries. The second is WebPrInSeS-E, which identifies positive hits in cDNA or ORF-based library screening experiments such as yeast one- or two-hybrid assays. Both tools perform de novo assembly using HTS data from any of the three major sequencing platforms. Thus, WebPrInSeS provides a highly integrated, cost-effective and efficient way to sequence-verify or identify clones of interest. WebPrInSeS is available at http://webprinses.epfl.ch/ and is open to all users.

  8. WebPrInSeS: automated full-length clone sequence identification and verification using high-throughput sequencing data

    PubMed Central

    Massouras, Andreas; Decouttere, Frederik; Hens, Korneel; Deplancke, Bart

    2010-01-01

    High-throughput sequencing (HTS) is revolutionizing our ability to obtain cheap, fast and reliable sequence information. Many experimental approaches are expected to benefit from the incorporation of such sequencing features in their pipeline. Consequently, software tools that facilitate such an incorporation should be of great interest. In this context, we developed WebPrInSeS, a web server tool allowing automated full-length clone sequence identification and verification using HTS data. WebPrInSeS encompasses two separate software applications. The first is WebPrInSeS-C which performs automated sequence verification of user-defined open-reading frame (ORF) clone libraries. The second is WebPrInSeS-E, which identifies positive hits in cDNA or ORF-based library screening experiments such as yeast one- or two-hybrid assays. Both tools perform de novo assembly using HTS data from any of the three major sequencing platforms. Thus, WebPrInSeS provides a highly integrated, cost-effective and efficient way to sequence-verify or identify clones of interest. WebPrInSeS is available at http://webprinses.epfl.ch/ and is open to all users. PMID:20501601

  9. ATP hydrolysis provides functions that promote rejection of pairings between different copies of long repeated sequences

    PubMed Central

    Danilowicz, Claudia; Hermans, Laura; Coljee, Vincent; Prévost, Chantal

    2017-01-01

    During DNA recombination and repair, RecA family proteins must promote rapid joining of homologous DNA. Repeated sequences with >100 base pair lengths occupy more than 1% of bacterial genomes; however, commitment to strand exchange was believed to occur after testing ∼20–30 bp. If that were true, pairings between different copies of long repeated sequences would usually become irreversible. Our experiments reveal that in the presence of ATP hydrolysis even 75 bp sequence-matched strand exchange products remain quite reversible. Experiments also indicate that when ATP hydrolysis is present, flanking heterologous dsDNA regions increase the reversibility of sequence matched strand exchange products with lengths up to ∼75 bp. Results of molecular dynamics simulations provide insight into how ATP hydrolysis destabilizes strand exchange products. These results inspired a model that shows how pairings between long repeated sequences could be efficiently rejected even though most homologous pairings form irreversible products. PMID:28854739

  10. The Complete Sequence of a Human Parainfluenzavirus 4 Genome

    PubMed Central

    Yea, Carmen; Cheung, Rose; Collins, Carol; Adachi, Dena; Nishikawa, John; Tellier, Raymond

    2009-01-01

    Although the human parainfluenza virus 4 (HPIV4) has been known for a long time, its genome, alone among the human paramyxoviruses, has not been completely sequenced to date. In this study we obtained the first complete genomic sequence of HPIV4 from a clinical isolate named SKPIV4 obtained at the Hospital for Sick Children in Toronto (Ontario, Canada). The coding regions for the N, P/V, M, F and HN proteins show very high identities (95% to 97%) with previously available partial sequences for HPIV4B. The sequence for the L protein and the non-coding regions represent new information. A surprising feature of the genome is its length, more than 17 kb, making it the longest genome within the genus Rubulavirus, although the length is well within the known range of 15 kb to 19 kb for the subfamily Paramyxovirinae. The availability of a complete genomic sequence will facilitate investigations on a respiratory virus that is still not completely characterized. PMID:21994536

  11. Sequence Data for Clostridium autoethanogenum using Three Generations of Sequencing Technologies

    DOE PAGES

    Utturkar, Sagar M.; Klingeman, Dawn Marie; Bruno-Barcena, José M.; ...

    2015-04-14

    During the past decade, DNA sequencing output has been mostly dominated by the second generation sequencing platforms, which are characterized by low cost, high throughput and shorter read lengths, for example, Illumina. The emergence and development of so-called third generation sequencing platforms such as PacBio has permitted exceptionally long reads (over 20 kb) to be generated. Due to read length increases, algorithm improvements and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high quality sequence datasets which span three generations of sequencing technologies, containing six types of data from four NGS platforms and originating from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community to evaluate upcoming NGS platforms, enabling comparison of existing and novel bioinformatics approaches and will encourage interest in the development of innovative experimental and computational methods for NGS data.

  12. Power law tails in phylogenetic systems.

    PubMed

    Qin, Chongli; Colwell, Lucy J

    2018-01-23

    Covariance analysis of protein sequence alignments uses coevolving pairs of sequence positions to predict features of protein structure and function. However, current methods ignore the phylogenetic relationships between sequences, potentially corrupting the identification of covarying positions. Here, we use random matrix theory to demonstrate the existence of a power law tail that distinguishes the spectrum of covariance caused by phylogeny from that caused by structural interactions. The power law is essentially independent of the phylogenetic tree topology, depending on just two parameters: the sequence length and the average branch length. We demonstrate that these power law tails are ubiquitous in the large protein sequence alignments used to predict contacts in 3D structure, as predicted by our theory. This suggests that to decouple phylogenetic effects from the interactions between sequence distal sites that control biological function, it is necessary to remove or down-weight the eigenvectors of the covariance matrix with largest eigenvalues. We confirm that truncating these eigenvectors improves contact prediction.

  13. On the Importance of Cycle Minimum in Sunspot Cycle Prediction

    NASA Technical Reports Server (NTRS)

    Wilson, Robert M.; Hathaway, David H.; Reichmann, Edwin J.

    1996-01-01

    The characteristics of the minima between sunspot cycles are found to provide important information for predicting the amplitude and timing of the following cycle. For example, the time of the occurrence of sunspot minimum sets the length of the previous cycle, which is correlated by the amplitude-period effect to the amplitude of the next cycle, with cycles of shorter (longer) than average length usually being followed by cycles of larger (smaller) than average size (true for 16 of 21 sunspot cycles). Likewise, the size of the minimum at cycle onset is correlated with the size of the cycle's maximum amplitude, with cycles of larger (smaller) than average size minima usually being associated with larger (smaller) than average size maxima (true for 16 of 22 sunspot cycles). Also, it was found that the size of the previous cycle's minimum and maximum relates to the size of the following cycle's minimum and maximum with an even-odd cycle number dependency. The latter effect suggests that cycle 23 will have a minimum and maximum amplitude probably larger than average in size (in particular, minimum smoothed sunspot number Rm = 12.3 +/- 7.5 and maximum smoothed sunspot number RM = 198.8 +/- 36.5, at the 95-percent level of confidence), further suggesting (by the Waldmeier effect) that it will have a faster than average rise to maximum (fast-rising cycles have ascent durations of about 41 +/- 7 months). Thus, if, as expected, onset for cycle 23 will be December 1996 +/- 3 months, based on smoothed sunspot number, then the length of cycle 22 will be about 123 +/- 3 months, inferring that it is a short-period cycle and that cycle 23 maximum amplitude probably will be larger than average in size (from the amplitude-period effect), having an RM of about 133 +/- 39 (based on the usual +/- 30 percent spread that has been seen between observed and predicted values), with maximum amplitude occurrence likely sometime between July 1999 and October 2000.

  14. Smooth muscle fatigue due to repeated urinary bladder neurostimulation: an in vivo study.

    PubMed

    Bross, S; Schumacher, S; Scheepe, J R; Seif, C; Jünemann, K P; Alken, P

    1999-01-01

    The presented study investigates the influence of different pause lengths between two consecutive stimulations of the S3 roots on intravesical pressure during bladder neurostimulation. In eight male foxhounds (aged 7-18 months), laminectomy and placement of a modified Brindley electrode were performed. In four series with different pause lengths between two consecutive stimulations (1, 3, 5, and 15 min), the maximum intravesical pressure was measured during stimulation. The changes in intravesical pressure were registered in these four series, each series with six stimulations. A 15-min interval elapsed before the commencement of each series. In the series with a pause length of 15 min, the consecutive stimulations did not result in significant changes in maximum intravesical pressure. In the 5-min series, a significant decrease in intravesical pressure was not observed after the third stimulation. In the 3-min series, a significant decrease was seen at almost every stimulation (average decrease of 3.8% per stimulation) and in the 1-min series, a significant decrease was also observed at almost every stimulation (average decrease of 5.9% per stimulation). The results of repeated bladder neurostimulation demonstrate that the maximum intravesical pressure is dependent on the pause length between two consecutive stimulations. The detrusor muscle showed reversible and short-lived signs of fatigue. This implies the importance of a minimum 5-min interval between two subsequent stimulations. A pause length <5 min leads to a falsification of the results and thus to lower validity of the investigation.

  15. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera which gives a decrease in signal.

  16. Whole Genome Sequencing of Greater Amberjack (Seriola dumerili) for SNP Identification on Aligned Scaffolds and Genome Structural Variation Analysis Using Parallel Resequencing

    PubMed Central

    Aokic, Jun-ya; Kawase, Junya; Hamada, Kazuhisa; Fujimoto, Hiroshi; Yamamoto, Ikki; Usuki, Hironori

    2018-01-01

    Greater amberjack (Seriola dumerili) is distributed in tropical and temperate waters worldwide and is an important aquaculture fish. We carried out de novo sequencing of the greater amberjack genome to construct a reference genome sequence to identify single nucleotide polymorphisms (SNPs) for breeding amberjack by marker-assisted or gene-assisted selection as well as to identify functional genes for biological traits. We obtained 200 times coverage and constructed a high-quality genome assembly using next generation sequencing technology. The assembled sequences were aligned onto a yellowtail (Seriola quinqueradiata) radiation hybrid (RH) physical map by sequence homology. A total of 215 of the longest amberjack sequences, with a total length of 622.8 Mbp (92% of the total length of the genome scaffolds), were lined up on the yellowtail RH map. We resequenced the whole genomes of 20 greater amberjacks and mapped the resulting sequences onto the reference genome sequence. About 186,000 nonredundant SNPs were successfully ordered on the reference genome. Further, we found differences in the genome structural variations between two greater amberjack populations using BreakDancer. We also analyzed the greater amberjack transcriptome and mapped the annotated sequences onto the reference genome sequence. PMID:29785397

  17. No evidence for the use of DIR, D–D fusions, chromosome 15 open reading frames or VH replacement in the peripheral repertoire was found on application of an improved algorithm, JointML, to 6329 human immunoglobulin H rearrangements

    PubMed Central

    Ohm-Laursen, Line; Nielsen, Morten; Larsen, Stine R; Barington, Torben

    2006-01-01

    Antibody diversity is created by imprecise joining of the variability (V), diversity (D) and joining (J) gene segments of the heavy and light chain loci. Analysis of rearrangements is complicated by somatic hypermutations and uncertainty concerning the sources of gene segments and the precise way in which they recombine. It has been suggested that D genes with irregular recombination signal sequences (DIR) and chromosome 15 open reading frames (OR15) can replace conventional D genes, that two D genes or inverted D genes may be used and that the repertoire can be further diversified by heavy chain V gene (VH) replacement. Safe conclusions require large, well-defined sequence samples and algorithms minimizing stochastic assignment of segments. Two computer programs were developed for analysis of heavy chain joints. JointHMM is a profile hidden Markov model, while JointML is a maximum-likelihood-based method taking the lengths of the joint and the mutational status of the VH gene into account. The programs were applied to a set of 6329 clonally unrelated rearrangements. A conventional D gene was found in 80% of unmutated sequences and 64% of mutated sequences, while D-gene assignment was kept below 5% in artificial (randomly permutated) rearrangements. No evidence for the use of DIR, OR15, multiple D genes or VH replacements was found, while inverted D genes were used in less than 1‰ of the sequences. JointML was shown to have a higher predictive performance for D-gene assignment in mutated and unmutated sequences than four other publicly available programs. An online version 1·0 of JointML is available at http://www.cbs.dtu.dk/services/VDJsolver. PMID:17005006

  18. The Effects of Angular Orientation on Flame Spread over Thin Materials

    DTIC Science & Technology

    1999-12-01

    Notation 7 5 Upward Spread With Burnout 8 6a Observed Flame Lengths on Napkins, Increments 2.5 cm 9 6b Observed Flame Lengths on Pet Film, Increments...Frequency of Extinguishment During Flame Spread 21 15 Flame Spread Velocity 21 VI 16 Flame Length Measured Parallel to the Surface 22 17 Comparison of... flame length (Lf) were measured from a video recording of the test. Despite erratic burn fronts with discontinuous flaming regions, the maximum

  19. Evaluation of Methods for de novo Genome assembly from High-throughput Sequencing Reads Reveals Dependencies that Affect the Quality of the Results

    USDA-ARS's Scientific Manuscript database

    Recent developments in high-throughput sequencing technology have made low-cost sequencing an attractive approach for many genome analysis tasks. Increasing read lengths, improving quality and the production of increasingly larger numbers of usable sequences per instrument-run continue to make whole...

  20. Synthesis of DNA

    DOEpatents

    Mariella, Jr., Raymond P.

    2008-11-18

    A method of synthesizing a desired double-stranded DNA of a predetermined length and of a predetermined sequence. Preselected sequence segments that will complete the desired double-stranded DNA are determined. Preselected segment sequences of DNA that will be used to complete the desired double-stranded DNA are provided. The preselected segment sequences of DNA are assembled to produce the desired double-stranded DNA.

  1. Preliminary analysis of length and GC content variation in the ribosomal first internal transcribed spacer (ITS1) of marine animals.

    PubMed

    Chow, S; Ueno, Y; Toyokawa, M; Oohara, I; Takeyama, H

    2009-01-01

    Length and guanine-cytosine (GC) content of the ribosomal first internal transcribed spacer (ITS1) were compared across a wide variety of marine animal species, and its phylogenetic utility was investigated. From a total of 773 individuals representing 599 species, we only failed to amplify the ITS1 sequence from 87 individuals by polymerase chain reaction with universal ITS1 primers. No species was found to have an ITS1 region shorter than 100 bp. In general, the ITS1 sequences of vertebrates were longer (318 to 2,318 bp) and richer in GC content (56.8% to 78%) than those of invertebrates (117 to 1,613 bp and 35.8% to 71.3%, respectively). Specifically, gelatinous animals (Cnidaria and Ctenophora) were observed to have short ITS1 sequences (118 to 422 bp) with lower GC content (35.8% to 61.7%) than the other animal taxa. Mollusca and Crustacea were diverse groups with respect to ITS1 length, ranging from 108 to 1,118 and 182 to 1,613 bp, respectively. No universal relationship between length and GC content was observed. Our data indicated that ITS1 has a limited utility for phylogenetic analysis as obtaining confident sequence alignment was often impossible between different genera of the same family and even between congeneric species.
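
    The two quantities compared in this study, sequence length and GC content, are simple to compute for any ITS1 sequence. The Python sketch below is illustrative only; the function names are hypothetical, and the choice to exclude ambiguous bases (such as N) from the GC denominator is an assumption, not something stated in the abstract.

```python
def gc_content(seq):
    """Return the percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = gc + sum(seq.count(base) for base in "AT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total

def summarize_its1(seq):
    """Report the length in bp and the GC percentage of one sequence."""
    return {"length_bp": len(seq), "gc_percent": gc_content(seq)}
```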

  2. Grasp and index finger reach zone during one-handed smartphone rear interaction: effects of task type, phone width and hand length.

    PubMed

    Lee, Songil; Kyung, Gyouhyung; Lee, Jungyong; Moon, Seung Ki; Park, Kyoung Jong

    2016-11-01

    Recently, some smartphones have introduced index finger interaction functions on the rear surface. The current study investigated the effects of task type, phone width, and hand length on grasp, index finger reach zone, discomfort, and muscle activation during such interaction. We considered five interaction tasks (neutral, comfortable, maximum, vertical, and horizontal strokes), two device widths (60 and 90 mm) and three hand lengths. Horizontal (vertical) strokes deviated from the horizontal axis in the range from -10.8° to -13.5° (81.6-88.4°). Maximum strokes appeared to be excessive as these caused 43.8% greater discomfort than did neutral strokes. The 90-mm width also appeared to be excessive as it resulted in 12.3% increased discomfort relative to the 60-mm width. The small-hand group reported 11.9-18.2% higher discomfort ratings, and the percent maximum voluntary exertion of their flexor digitorum superficialis muscle, pertaining to index finger flexion, was also 6.4% higher. These findings should be considered to make smartphone rear interaction more comfortable. Practitioner Summary: Among neutral, comfortable, maximum, horizontal, and vertical index finger strokes on smartphone rear surfaces, maximum vs. neutral strokes caused 43.8% greater discomfort. Horizontal (vertical) strokes deviated from the horizontal (vertical) axis. Discomfort increased by 12.3% with 90-mm- vs. 60-mm-wide devices. Rear interaction regions of five commercialised smartphones should be lowered 20 to 30 mm for more comfortable rear interaction.

  3. Use of extremely short Förster resonance energy transfer probes in real-time polymerase chain reaction

    PubMed Central

    Kutyavin, Igor V.

    2013-01-01

    Described in the article is a new approach for the sequence-specific detection of nucleic acids in real-time polymerase chain reaction (PCR) using fluorescently labeled oligonucleotide probes. The method is based on the production of PCR amplicons, which fold into dumbbell-like secondary structures carrying a specially designed ‘probe-luring’ sequence at their 5′ ends. Hybridization of this sequence to a complementary ‘anchoring’ tail introduced at the 3′ end of a fluorescent probe enables the probe to bind to its target during PCR, and the subsequent probe cleavage results in the fluorescence signal. As shown in the study, this amplicon-endorsed and guided formation of the probe-target duplex allows the use of extremely short oligonucleotide probes, up to tetranucleotides in length. In particular, the short length of the fluorescent probes makes possible the development of a ‘universal’ probe inventory that is relatively small in size but represents all possible sequence variations. The unparalleled cost-effectiveness of the inventory approach is discussed. Despite the short length of the probes, this new method, named Angler real-time PCR, remains highly sequence specific, and the results of the study indicate that it can be effectively used for quantitative PCR and the detection of polymorphic variations. PMID:24013564

  4. Autogenic dynamics of debris-flow fans

    NASA Astrophysics Data System (ADS)

    van den Berg, Wilco; de Haas, Tjalling; Braat, Lisanne; Kleinhans, Maarten

    2015-04-01

    Alluvial fans develop their semi-conical shape by cyclic avulsion of their geomorphologically active sector from a fixed fan apex. These cyclic avulsions have been attributed to both allogenic and autogenic forcings and processes. Autogenic dynamics have been extensively studied on fluvial fans through physical scale experiments, and are governed by cyclic alternations of aggradation by unconfined sheet flow, fanhead incision leading to channelized flow, channel backfilling and avulsion. On debris-flow fans, however, autogenic dynamics have not yet been directly observed. We experimentally created debris-flow fans under constant extrinsic forcings, and show that autogenic dynamics are a fundamental intrinsic process on debris-flow fans. We found that autogenic cycles on debris-flow fans are driven by sequences of backfilling, avulsion and channelization, similar to the cycles on fluvial fans. However, the processes that govern these sequences are unique for debris-flow fans, and differ fundamentally from the processes that govern autogenic dynamics on fluvial fans. We experimentally observed that backfilling commenced after the debris flows reached their maximum possible extent. The next debris flows then progressively became shorter, driven by feedbacks on fan morphology and flow-dynamics. The progressively decreasing debris-flow length caused in-channel sedimentation, which led to increasing channel overflow and wider debris flows. This reduced the impulse of the liquefied flow body to the flow front, which then further reduced flow velocity and runout length, and induced further in-channel sedimentation. This commenced a positive feedback wherein debris flows became increasingly short and wide, until the channel was completely filled and the apex cross-profile was plano-convex. At this point, there was no preferential transport direction by channelization, and the debris flows progressively avulsed towards the steepest, preferential, flow path. 
Simultaneously, the debris flows started to channelize, forced by increasingly effective concentration of the flow impulse to the flow front, which caused more effective lateral levee formation and an increasingly well-defined channel. This process continued until the debris flows reached their maximum possible extent and the cycle was reverted. Channelization occurred in the absence of erosion, in contrast with fluvial fans. Backfilling and channelization cycles were gradual and symmetric, requiring multiple debris flows to be completed. These results add debris-flow fans to the spectrum of fan-shaped aqueous systems that are affected by autogenic dynamics, now ranging from low-gradient river systems to steep-gradient mass-flow fans.

  5. Identifying the role of initial wave parameters on tsunami focusing

    NASA Astrophysics Data System (ADS)

    Aydın, Baran

    2018-04-01

    Unexpected local tsunami amplification, which is referred to as tsunami focusing, is attributed to two different mechanisms: bathymetric features of the ocean bottom such as underwater ridges and dipolar shape of the initial wave itself. In this study, we characterize the latter; that is, we explore how amplitude and location of the focusing point vary with certain geometric parameters of the initial wave such as its steepness and crest length. Our results reveal two important features of tsunami focusing: for mild waves maximum wave amplitude increases significantly with transverse length of wave crest, while location of the focusing point is almost invariant. For steep waves, on the other hand, increasing crest length dislocates focusing point significantly, while it causes a rather small increase in wave maximum.

  6. Conservation and variability of West Nile virus proteins.

    PubMed

    Koo, Qi Ying; Khan, Asif M; Jung, Keun-Ok; Ramdas, Shweta; Miotto, Olivo; Tan, Tin Wee; Brusic, Vladimir; Salmon, Jerome; August, J Thomas

    2009-01-01

    West Nile virus (WNV) has emerged globally as an increasingly important pathogen for humans and domestic animals. Studies of the evolutionary diversity of the virus over its known history will help to elucidate conserved sites, and characterize their correspondence to other pathogens and their relevance to the immune system. We describe a large-scale analysis of the entire WNV proteome, aimed at identifying and characterizing evolutionarily conserved amino acid sequences. This study, which used 2,746 WNV protein sequences collected from the NCBI GenPept database, focused on analysis of peptides of length 9 amino acids or more, which are immunologically relevant as potential T-cell epitopes. Entropy-based analysis of the diversity of WNV sequences revealed the presence of numerous evolutionarily stable nonamer positions across the proteome (entropy value of ≤ 1). The representation (frequency) of nonamers variant to the predominant peptide at these stable positions was, generally, low (≤ 10% of the WNV sequences analyzed). Eighty-eight fragments of length 9-29 amino acids, representing approximately 34% of the WNV polyprotein length, were identified to be identical and evolutionarily stable in all analyzed WNV sequences. Of the 88 completely conserved sequences, 67 are also present in other flaviviruses, and several have been associated with the functional and structural properties of viral proteins. Immunoinformatic analysis revealed that the majority (78/88) of conserved sequences are potentially immunogenic, while 44 contained experimentally confirmed human T-cell epitopes. This study identified a comprehensive catalogue of completely conserved WNV sequences, many of which are shared by other flaviviruses, and the majority are potential epitopes. 
The complete conservation of these immunologically relevant sequences through the entire recorded WNV history suggests they will be valuable as components of peptide-specific vaccines or other therapeutic applications, for sequence-specific diagnosis of a wide-range of Flavivirus infections, and for studies of homologous sequences among other flaviviruses.
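
    The entropy-based screen for stable nonamer positions can be illustrated with Shannon entropy over the 9-mers that start at each alignment position. This is a sketch under stated assumptions: the abstract does not give the exact estimator, so plain Shannon entropy in bits over already aligned, equal-length sequences is assumed, and `nonamer_entropy` is a hypothetical name.

```python
import math
from collections import Counter


def nonamer_entropy(aligned, k=9):
    """Shannon entropy (bits) of the k-mers observed at each start position.

    Assumption: `aligned` holds sequences of equal length from one alignment.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")
    entropies = []
    for start in range(length - k + 1):
        counts = Counter(s[start:start + k] for s in aligned)
        total = sum(counts.values())
        # H = -sum(p * log2 p); written as 0.0 - sum(...) so a conserved site gives 0.0
        h = 0.0 - sum((c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return entropies
```

    A position with entropy 0 carries a single nonamer in every sequence (completely conserved), while the abstract's threshold of entropy ≤ 1 marks positions treated as evolutionarily stable.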

  7. Sequencing, Analysis, and Annotation of Expressed Sequence Tags for Camelus dromedarius

    PubMed Central

    Al-Swailem, Abdulaziz M.; Shehata, Maher M.; Abu-Duhier, Faisel M.; Al-Yamani, Essam J.; Al-Busadah, Khalid A.; Al-Arawi, Mohammed S.; Al-Khider, Ali Y.; Al-Muhaimeed, Abdullah N.; Al-Qahtani, Fahad H.; Manee, Manee M.; Al-Shomrani, Badr M.; Al-Qhtani, Saad M.; Al-Harthi, Amer S.; Akdemir, Kadir C.; Otu, Hasan H.

    2010-01-01

    Despite its economical, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and ∼40% hits extending to the start codons of full length cDNAs suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. Accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with possibility to add sequences from public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies enabling a starting point for whole genome sequencing of the organism. PMID:20502665

  8. Contribution of Zinc Solubilizing Bacteria in Growth Promotion and Zinc Content of Wheat.

    PubMed

    Kamran, Sana; Shahid, Izzah; Baig, Deeba N; Rizwan, Muhammad; Malik, Kauser A; Mehnaz, Samina

    2017-01-01

    Zinc is an imperative micronutrient required for optimum plant growth. Zinc solubilizing bacteria are potential alternatives for zinc supplementation and convert applied inorganic zinc to available forms. This study was conducted to screen zinc solubilizing rhizobacteria isolated from wheat and sugarcane, and to analyze their effect on wheat growth and development. Fourteen exo-polysaccharides producing bacterial isolates of wheat were identified and characterized biochemically as well as on the basis of 16S rRNA gene sequences. Along with these, 10 identified sugarcane isolates were also screened for zinc solubilizing ability on five different insoluble zinc sources. Out of 24, five strains, i.e., EPS 1 (Pseudomonas fragi), EPS 6 (Pantoea dispersa), EPS 13 (Pantoea agglomerans), PBS 2 (E. cloacae) and LHRW1 (Rhizobium sp.) were selected (based on their zinc solubilizing and PGP activities) for pot scale plant experiments. ZnCO3 was used as zinc source and wheat seedlings were inoculated with these five strains, individually, to assess their effect on plant growth and development. The effect on plants was analyzed based on growth parameters and quantifying zinc content of shoot, root and grains using atomic absorption spectroscopy. Plant experiment was performed in two sets. For first set of plant experiments (harvested after 1 month), maximum shoot and root dry weights and shoot lengths were noted for the plants inoculated with Rhizobium sp. (LHRW1) while E. cloacae (PBS 2) increased both shoot and root lengths. Highest zinc content was found in shoots of E. cloacae (PBS 2) and in roots of P. agglomerans (EPS 13) followed by zinc supplemented control. For second set of plant experiment, when plants were harvested after three months, Pantoea dispersa (EPS 6), P. agglomerans (EPS 13) and E. cloacae (PBS 2) significantly increased shoot dry weights. 
However, significant increase in root dry weights and maximum zinc content was recorded for Pseudomonas fragi (EPS 1) inoculated plants, isolated from wheat rhizosphere. While maximum zinc content for roots was quantified in the control plants indicating the plant's inability to transport zinc to grains, supporting accelerated bioavailability of zinc to plant grains with zinc solubilizing rhizobacteria.

  9. The influence of muscle length on the fatigue-related reduction in joint range of motion of the human dorsiflexors.

    PubMed

    Cheng, Arthur J; Davidson, Andrew W; Rice, Charles L

    2010-06-01

    The fatigue-related reduction in joint range of motion (ROM) during dynamic contraction tasks may be related to muscle length-dependent alterations in torque and contractile kinetics, but this has not been systematically explored previously. Twelve young men performed a repetitive voluntary muscle shortening contraction task of the dorsiflexors at a contraction load of 30% of maximum voluntary isometric contraction (MVC) torque, until total 40 degrees ROM had decreased by 50% at task failure (POST) to 20 degrees ROM. At both a short (5 degrees dorsiflexion) and long muscle length (35 degrees plantar flexion joint angle relative to a 0 degrees neutral ankle joint position), voluntary activation, MVC torque, and evoked tibialis anterior contractile properties of a 52.8 Hz high-frequency isometric tetanus [peak evoked torque, maximum rate of torque development (MRTD), maximum rate of relaxation (MRR)] were evaluated at baseline (PRE), at POST, and up to 10 min of recovery. At POST, we measured similar fatigue-related reductions in torque (voluntary and evoked) and slowing of contractile kinetics (MRTD and MRR) at both the short and long muscle lengths. Thus, the fatigue-related reduction in ROM could not be explained by length-dependent fatigue. Although torque (voluntary and evoked) at both muscle lengths was depressed and remained blunted throughout the recovery period, this was not related to the rapid recovery of ROM at 0.5 min after task failure. The reduction in ROM, however, was strongly related to the reduction in joint angular velocity (R² = 0.80) during the fatiguing task, although additional factors cannot yet be overlooked.

  10. Approximate sample sizes required to estimate length distributions

    USGS Publications Warehouse

    Miranda, L.E.

    2007-01-01

    The sample sizes required to estimate fish length were determined by bootstrapping from reference length distributions. Depending on population characteristics and species-specific maximum lengths, 1-cm length-frequency histograms required 375-1,200 fish to estimate within 10% with 80% confidence, 2.5-cm histograms required 150-425 fish, proportional stock density required 75-140 fish, and mean length required 75-160 fish. In general, smaller species, smaller populations, populations with higher mortality, and simpler length statistics required fewer samples. Indices that require low sample sizes may be suitable for monitoring population status, and when large changes in length are evident, additional sampling effort may be allocated to more precisely define length status with more informative estimators. © Copyright by the American Fisheries Society 2007.
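
    The bootstrapping idea can be sketched for the simplest statistic, mean length: resample the reference distribution at a candidate sample size and count how often the resampled mean falls within 10% of the reference mean. This is an illustrative sketch only; the paper's exact resampling scheme is not given in the abstract, and the names `fraction_within` and `min_sample_size`, the replicate count, and the seed are assumptions.

```python
import random
import statistics


def fraction_within(reference, n, tol=0.10, reps=1000, rng=None):
    """Fraction of bootstrap samples of size n whose mean is within tol of the reference mean."""
    rng = rng or random.Random(0)
    true_mean = statistics.fmean(reference)
    hits = 0
    for _ in range(reps):
        sample = rng.choices(reference, k=n)  # sampling with replacement
        if abs(statistics.fmean(sample) - true_mean) <= tol * true_mean:
            hits += 1
    return hits / reps


def min_sample_size(reference, candidates, tol=0.10, conf=0.80, reps=1000, seed=0):
    """Smallest candidate n that meets the tolerance with the requested confidence, else None."""
    rng = random.Random(seed)
    for n in sorted(candidates):
        if fraction_within(reference, n, tol, reps, rng) >= conf:
            return n
    return None
```

    The same loop could wrap a histogram-based statistic or proportional stock density in place of the mean; the abstract indicates that such richer statistics need more fish.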

  11. Young Children's Understandings of Length Measurement: Evaluating a Learning Trajectory

    ERIC Educational Resources Information Center

    Szilagyi, Janka; Clements, Douglas H.; Sarama, Julie

    2013-01-01

    This study investigated the development of length measurement ideas in students from prekindergarten through 2nd grade. The main purpose was to evaluate and elaborate the developmental progression, or levels of thinking, of a hypothesized learning trajectory for length measurement to ensure that the sequence of levels of thinking is consistent…

  12. Recognition of maximum flooding events in mixed siliciclastic-carbonate systems: Key to global chronostratigraphic correlation

    USGS Publications Warehouse

    Mancini, E.A.; Tew, B.H.

    1997-01-01

    The maximum flooding event within a depositional sequence is an important datum for correlation because it represents a virtually synchronous horizon. This event is typically recognized by a distinctive physical surface and/or a significant change in microfossil assemblages (relative fossil abundance peaks) in siliciclastic deposits from shoreline to continental slope environments in a passive margin setting. Recognition of maximum flooding events in mixed siliciclastic-carbonate sediments is more complicated because the entire section usually represents deposition in continental shelf environments with varying rates of biologic and carbonate productivity versus siliciclastic influx. Hence, this event cannot be consistently identified simply by relative fossil abundance peaks. Factors such as siliciclastic input, carbonate productivity, sediment accumulation rates, and paleoenvironmental conditions dramatically affect the relative abundances of microfossils. Failure to recognize these complications can lead to a sequence stratigraphic interpretation that substantially overestimates the number of depositional sequences of 1 to 10 m.y. duration.

  13. The Main Sequence of Explosive Solar Active Regions: Comparison of Emerging and Mature Active Regions

    NASA Technical Reports Server (NTRS)

    Falconer, David; Moore, Ron

    2011-01-01

    For mature active regions, an active region's magnetic flux content determines the maximum free energy the active region can have. Most large flares and CMEs occur in active regions that are near their free-energy limit. Active-region flare power radiated in the GOES 1-8 band increases steeply as the free-energy limit is approached. We infer that the free-energy limit is set by the rate of release of an active region's free magnetic energy by flares, CMEs and coronal heating balancing the maximum rate the Sun can put free energy into the active region's magnetic field. This balance of maximum power results in explosive active regions residing in a "main sequence" in active-region (flux content, free energy content) phase space, which sequence is analogous to the main sequence of hydrogen-burning stars in (mass, luminosity) phase space.

  14. Molecular phylogenetic relationships among Lemnaceae and Araceae using the chloroplast trnL-trnF intergenic spacer.

    PubMed

    Rothwell, Gar W; Van Atta, Michelle R; Ballard, Harvey E; Stockey, Ruth A

    2004-02-01

    We test competing hypotheses of relationships among Aroids (Araceae) and duckweeds (Lemnaceae) using sequences of the trnL-trnF spacer region of the chloroplast genome. Included in the analysis were 22 aroid genera including Pistia and five genera of Lemnaceae including the recently segregated genus Landoltia. Aponogeton was used as an outgroup to root the tree. A data set of 522 aligned nucleotides yielded maximum parsimony and maximum likelihood trees similar to those previously derived from restriction site data. Pistia and the Lemnaceae are placed in two separate and well-supported clades, suggesting at least two independent origins of the floating aquatic growth form within the aroid clade. Within the Lemnaceae there is only partial support for the paradigm of sequential morphological reduction, given that Wolffia is sister to Wolffiella+Lemna. As in the results of the restriction site analysis, pantropical Pistia is placed with Colocasia and Typhonium of southeastern Asia, indicative of Old World affinities. Branch lengths leading to duckweed terminal taxa are much longer relative to other ingroup taxa (including Pistia), evidently as a result of higher rates of nucleotide substitutions and insertion/deletion events. Morphological reduction within the duckweeds roughly correlates with accelerated chloroplast genome evolution.

  15. SHORT-TERM SOLAR FLARE PREDICTION USING MULTIRESOLUTION PREDICTORS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yu Daren; Huang Xin; Hu Qinghua

    2010-01-20

    Multiresolution predictors of solar flares are constructed by a wavelet transform and sequential feature extraction method. Three predictors (the maximum horizontal gradient, the length of neutral line, and the number of singular points) are extracted from Solar and Heliospheric Observatory/Michelson Doppler Imager longitudinal magnetograms. A maximal overlap discrete wavelet transform is used to decompose the sequence of predictors into four frequency bands. In each band, four sequential features (the maximum, the mean, the standard deviation, and the root mean square) are extracted. The multiresolution predictors in the low-frequency band reflect trends in the evolution of newly emerging fluxes. The multiresolution predictors in the high-frequency band reflect the changing rates in emerging flux regions. The variation of emerging fluxes is decoupled by wavelet transform in different frequency bands. The information amount of these multiresolution predictors is evaluated by the information gain ratio. It is found that the multiresolution predictors in the lowest and highest frequency bands contain the most information. Based on these predictors, a C4.5 decision tree algorithm is used to build the short-term solar flare prediction model. It is found that the performance of the short-term solar flare prediction model based on the multiresolution predictors is greatly improved.

  16. Comparing the November 2002 Denali and November 2001 Kunlun earthquakes

    USGS Publications Warehouse

    Bufe, C.G.

    2004-01-01

    Major strike-slip earthquakes recently occurred in Alaska on the central Denali fault (M 7.9) on 3 November 2002, and in Tibet on the central Kunlun fault (M 7.8) on 14 November 2001. Both earthquakes generated large surface waves with Ms [U.S. Geological Survey (USGS)] of 8.5 (Denali) and 8.0 (Kunlun). Each event occurred on an east-west-trending strike-slip fault situated near the northern boundary of an intense deformation zone that is characterized by lateral extrusion and rotation of crustal blocks. Each earthquake produced east-directed nearly unilateral ruptures that propagated 300 to 400 km. Maximum lateral surface offsets and maximum moment release occurred well beyond 100 km from the rupture initiation, with the events exhibiting by far the largest separations of USGS hypocenter and Harvard Moment Tensor Centroid (CMT) for strike-slip earthquakes in the 27-year CMT catalog. In each sequence, the largest aftershock was more than two orders of magnitude smaller than the mainshock. Regional moment release had been accelerating prior to the main shocks. The close proximity in space and time of the 1964 Prince William Sound and 2002 Denali earthquakes, relative to their rupture lengths and estimated return times, suggests that these events may be part of a recurrent cluster in the vicinity of a complex plate boundary.

  17. Not all (possibly) “random” sequences are created equal

    PubMed Central

    Pincus, Steve; Kalman, Rudolf E.

    1997-01-01

    The need to assess the randomness of a single sequence, especially a finite sequence, is ubiquitous, yet is unaddressed by axiomatic probability theory. Here, we assess randomness via approximate entropy (ApEn), a computable measure of sequential irregularity, applicable to single sequences of both (even very short) finite and infinite length. We indicate the novelty and facility of the multidimensional viewpoint taken by ApEn, in contrast to classical measures. Furthermore and notably, for finite length, finite state sequences, one can identify maximally irregular sequences, and then apply ApEn to quantify the extent to which given sequences differ from maximal irregularity, via a set of deficit (defm) functions. The utility of these defm functions which we show allows one to considerably refine the notions of probabilistic independence and normality, is featured in several studies, including (i) digits of e, π, √2, and √3, both in base 2 and in base 10, and (ii) sequences given by fractional parts of multiples of irrationals. We prove companion analytic results, which also feature in a discussion of the role and validity of the almost sure properties from axiomatic probability theory insofar as they apply to specified sequences and sets of sequences (in the physical world). We conclude by relating the present results and perspective to both previous and subsequent studies. PMID:11038612
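
    Approximate entropy has a compact standard definition, ApEn(m, r) = Φ^m(r) − Φ^(m+1)(r), where Φ^m averages the log of the fraction of length-m blocks lying within tolerance r of each block. The sketch below follows that definition with self-matches included; the default values of `m` and `r` and the function names are illustrative assumptions, and the deficit (defm) functions from the abstract are not implemented.

```python
import math


def _phi(u, m, r):
    """Average log-fraction of length-m blocks within max-norm distance r of each block."""
    n = len(u) - m + 1
    blocks = [u[i:i + m] for i in range(n)]
    total = 0.0
    for a in blocks:
        # Self-match is counted, so `matches` is at least 1 and the log is defined.
        matches = sum(
            1 for b in blocks if max(abs(x - y) for x, y in zip(a, b)) <= r
        )
        total += math.log(matches / n)
    return total / n


def apen(u, m=2, r=0.2):
    """Approximate entropy ApEn(m, r) of a finite numeric sequence u."""
    if len(u) < m + 2:
        raise ValueError("sequence too short for the chosen block length m")
    return _phi(u, m, r) - _phi(u, m + 1, r)
```

    For a binary sequence with r below 1, blocks match only when identical, so values near 0 indicate a highly regular sequence and values near ln 2 indicate near-maximal irregularity.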

  18. Unified Deep Learning Architecture for Modeling Biology Sequence.

    PubMed

    Wu, Hongjie; Cao, Chengyuan; Xia, Xiaoyan; Lu, Qiang

    2017-10-09

    Prediction of the spatial structure or function of biological macromolecules based on their sequence remains an important challenge in bioinformatics. When modeling biological sequences using traditional sequencing models, characteristics, such as long-range interactions between basic units, the complicated and variable output of labeled structures, and the variable length of biological sequences, usually lead to different solutions on a case-by-case basis. This study proposed the use of bidirectional recurrent neural networks based on long short-term memory or a gated recurrent unit to capture long-range interactions by designing the optional reshape operator to adapt to the diversity of the output labels and implementing a training algorithm to support the training of sequence models capable of processing variable-length sequences. Additionally, the merge and pooling operators enhanced the ability to capture short-range interactions between basic units of biological sequences. The proposed deep-learning model and its training algorithm might be capable of solving currently known biological sequence-modeling problems through the use of a unified framework. We validated our model on one of the most difficult biological sequence-modeling problems currently known, with our results indicating the ability of the model to obtain predictions of protein residue interactions that exceeded the accuracy of current popular approaches by 10% based on multiple benchmarks.

  19. Maximum kinetic energy considerations in proton stereotactic radiosurgery.

    PubMed

    Sengbusch, Evan R; Mackie, Thomas R

    2011-04-12

    The purpose of this study was to determine the maximum proton kinetic energy required to treat a given percentage of patients eligible for stereotactic radiosurgery (SRS) with coplanar arc-based proton therapy, contingent upon the number and location of gantry angles used. Treatment plans from 100 consecutive patients treated with SRS at the University of Wisconsin Carbone Cancer Center between June of 2007 and March of 2010 were analyzed. For each target volume within each patient, in-house software was used to place proton pencil beam spots over the distal surface of the target volume from 51 equally-spaced gantry angles of up to 360°. For each beam spot, the radiological path length from the surface of the patient to the distal boundary of the target was then calculated along a ray from the gantry location to the location of the beam spot. This data was used to generate a maximum proton energy requirement for each patient as a function of the arc length that would be spanned by the gantry angles used in a given treatment. If only a single treatment angle is required, 100% of the patients included in the study could be treated by a proton beam with a maximum kinetic energy of 118 MeV. As the length of the treatment arc is increased to 90°, 180°, 270°, and 360°, the maximum energy requirement increases to 127, 145, 156, and 179 MeV, respectively. A very high percentage of SRS patients could be treated at relatively low proton energies if the gantry angles used in the treatment plan do not span a large treatment arc. Maximum proton kinetic energy requirements increase linearly with size of the treatment arc.

  20. Shotgun Protein Sequencing with Meta-contig Assembly

    PubMed Central

    Guthals, Adrian; Clauser, Karl R.; Bandeira, Nuno

    2012-01-01

    Full-length de novo sequencing from tandem mass (MS/MS) spectra of unknown proteins such as antibodies or proteins from organisms with unsequenced genomes remains a challenging open problem. Conventional algorithms designed to individually sequence each MS/MS spectrum are limited by incomplete peptide fragmentation or low signal to noise ratios and tend to result in short de novo sequences at low sequencing accuracy. Our shotgun protein sequencing (SPS) approach was developed to ameliorate these limitations by first finding groups of unidentified spectra from the same peptides (contigs) and then deriving a consensus de novo sequence for each assembled set of spectra (contig sequences). But whereas SPS enables much more accurate reconstruction of de novo sequences longer than can be recovered from individual MS/MS spectra, it still requires error-tolerant matching to homologous proteins to group smaller contig sequences into full-length protein sequences, thus limiting its effectiveness on sequences from poorly annotated proteins. Using low and high resolution CID and high resolution HCD MS/MS spectra, we address this limitation with a Meta-SPS algorithm designed to overlap and further assemble SPS contigs into Meta-SPS de novo contig sequences extending as long as 100 amino acids at over 97% accuracy without requiring any knowledge of homologous protein sequences. We demonstrate Meta-SPS using distinct MS/MS data sets obtained with separate enzymatic digestions and discuss how the remaining de novo sequencing limitations relate to MS/MS acquisition settings. PMID:22798278

  1. Shotgun protein sequencing with meta-contig assembly.

    PubMed

    Guthals, Adrian; Clauser, Karl R; Bandeira, Nuno

    2012-10-01

    Full-length de novo sequencing from tandem mass (MS/MS) spectra of unknown proteins such as antibodies or proteins from organisms with unsequenced genomes remains a challenging open problem. Conventional algorithms designed to individually sequence each MS/MS spectrum are limited by incomplete peptide fragmentation or low signal to noise ratios and tend to result in short de novo sequences at low sequencing accuracy. Our shotgun protein sequencing (SPS) approach was developed to ameliorate these limitations by first finding groups of unidentified spectra from the same peptides (contigs) and then deriving a consensus de novo sequence for each assembled set of spectra (contig sequences). But whereas SPS enables much more accurate reconstruction of de novo sequences longer than can be recovered from individual MS/MS spectra, it still requires error-tolerant matching to homologous proteins to group smaller contig sequences into full-length protein sequences, thus limiting its effectiveness on sequences from poorly annotated proteins. Using low and high resolution CID and high resolution HCD MS/MS spectra, we address this limitation with a Meta-SPS algorithm designed to overlap and further assemble SPS contigs into Meta-SPS de novo contig sequences extending as long as 100 amino acids at over 97% accuracy without requiring any knowledge of homologous protein sequences. We demonstrate Meta-SPS using distinct MS/MS data sets obtained with separate enzymatic digestions and discuss how the remaining de novo sequencing limitations relate to MS/MS acquisition settings.

  2. An optimized protocol for generation and analysis of Ion Proton sequencing reads for RNA-Seq.

    PubMed

    Yuan, Yongxian; Xu, Huaiqian; Leung, Ross Ka-Kit

    2016-05-26

    Previous studies compared running cost, time and other performance measures of popular sequencing platforms. However, comprehensive assessment of library construction and analysis protocols for the Proton sequencing platform remains unexplored. Unlike Illumina sequencing platforms, Proton reads are heterogeneous in length and quality. When sequencing data from different platforms are combined, this can result in reads of varying length. Whether the performance of the commonly used software for handling such data is satisfactory is unknown. Using universal human reference RNA as the initial material, RNaseIII and chemical fragmentation methods in library construction gave similar results in the number of genes and junctions discovered and in the accuracy of estimated expression levels. In contrast, sequencing quality, read length and the choice of software affected mapping rate to a much larger extent. The unspliced aligner TMAP attained the highest mapping rate (97.27% to genome, 86.46% to transcriptome), though 47.83% of mapped reads were clipped. Long reads could paradoxically reduce mapping in junctions. With a reference annotation guide, the mapping rate of TopHat2 significantly increased from 75.79 to 92.09%, especially for long (>150 bp) reads. Sailfish, a k-mer based gene expression quantifier, attained results highly consistent with those of the TaqMan array and the highest sensitivity. We provide, for the first time, reference statistics of library preparation methods, gene detection and quantification, and junction discovery for RNA-Seq by the Ion Proton platform. Chemical fragmentation performed as well as the enzyme-based method. The optimal Ion Proton sequencing options and analysis software have been evaluated.

  3. Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies

    PubMed Central

    2014-01-01

    Background: The size and complexity of conifer genomes have, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination. Results: We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole genome shotgun approach relying primarily on next-generation sequence data generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly were used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp in length. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome. Conclusions: In addition to its value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrates a novel approach to sequencing the large and complex genomes of this important group of plants that can now be widely applied. PMID:24647006
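
    The N50 scaffold size quoted above is a standard assembly statistic: the largest length L such that scaffolds of length L or more together hold at least half of the assembled bases. A minimal sketch of that calculation follows; the function name and the toy lengths are illustrative assumptions and are not taken from the loblolly pine assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the largest length L such that scaffolds of length >= L
    together cover at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one scaffold length")


# Made-up scaffold lengths, for illustration only.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

    Sorting in descending order and accumulating until half of the total is reached keeps the sketch at O(n log n), which is adequate even for assemblies with millions of scaffolds.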

  4. Biological sequence compression algorithms.

    PubMed

    Matsumoto, T; Sadakane, K; Imai, H

    2000-01-01

    Today, more and more DNA sequences are becoming available. The information about DNA sequences is stored in molecular biology databases. The size and importance of these databases will keep growing, so this information must be stored and communicated efficiently. Furthermore, sequence compression can be used to define similarities between biological sequences. Standard compression algorithms such as gzip or compress cannot compress DNA sequences, but only expand them in size. On the other hand, CTW (Context Tree Weighting Method) can compress DNA sequences to less than two bits per symbol. These algorithms do not use special structures of biological sequences. Two characteristic structures of DNA sequences are known. One is called palindromes or reverse complements and the other is approximate repeats. Several specific algorithms for DNA sequences that use these structures can compress them to less than two bits per symbol. In this paper, we improve the CTW so that the characteristic structures of DNA sequences can be exploited. Before encoding the next symbol, the algorithm searches for an approximate repeat and palindrome using hashing and dynamic programming. If there is a palindrome or an approximate repeat of sufficient length, our algorithm represents it with a length and a distance. With this preprocessing, a new program achieves a slightly higher compression ratio than existing DNA-oriented compression algorithms. We also describe a new compression algorithm for protein sequences.
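
    The "palindromes or reverse complements" mentioned above are stretches of DNA that read the same as their own reverse complement, and the two-bits-per-symbol figure is simply the cost of a fixed code over the four-letter alphabet. The sketch below illustrates both ideas only; it is not the CTW-based compressor of the paper, and all names are illustrative assumptions.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


def is_rc_palindrome(seq):
    """True if the sequence equals its own reverse complement."""
    return seq == reverse_complement(seq)


def baseline_bits(seq):
    """Size of a fixed two-bits-per-base encoding (A, C, G, T only)."""
    return 2 * len(seq)


print(reverse_complement("AAGC"))
print(is_rc_palindrome("GAATTC"))
print(baseline_bits("GAATTC"))
```

    A DNA-specific compressor has to beat `baseline_bits`; replacing a long reverse-complement match by a (length, distance) pair is one way to do so.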

  5. Experimental Pressure Distributions on Axisymmetric Cowls at Mach Numbers From 0.60 to 0.92

    NASA Technical Reports Server (NTRS)

    Re, Richard J.

    2006-01-01

    Pressure distributions on four nacelle cowl models of the same length and highlight area but different geometries external to the highlight are compared. The diameter ratio (ratio of highlight diameter to maximum diameter) of the four cowls was 0.854 and the length ratio (ratio of cowl length to maximum diameter) was 0.439. The cowls had the same internal geometry from the highlight to the throat with a contraction ratio (ratio of highlight area to throat area) of 1.250. Data for two other cowls which had a diameter ratio of 0.880, a length ratio of 0.400 and a contraction ratio of 1.250 are also included. All the cowls had rows of static pressure orifices on the top and bottom surfaces. Mass-flow ratio was varied between 0.27 and 0.93. Some data were obtained at angles of attack between -2.1° and 4.1°. The test was conducted in the Langley 16-Foot Transonic Tunnel.

  6. Influence of Turbulent Flow and Fractal Scaling on Effective Permeability of Fracture Network

    NASA Astrophysics Data System (ADS)

    Zhu, J.

    2017-12-01

    A new approach is developed to calculate hydraulic gradient dependent effective permeability of a fractal fracture network where both laminar and turbulent flows may occur in individual fractures. A critical fracture length is used to distinguish flow characteristics in individual fractures. The developed new solutions can be used for the case of a general scaling relationship, an extension to the linear scaling. We examine the impact on the effective permeability of the network of fractal fracture network characteristics, which include the fractal scaling coefficient and exponent, fractal dimension, ratio of minimum over maximum fracture lengths. Results demonstrate that the developed solution can explain more variations of the effective permeability in relation to the fractal dimensions estimated from the field observations. At high hydraulic gradient the effective permeability decreases with the fractal scaling exponent, but increases with the fractal scaling exponent at low gradient. The effective permeability increases with the scaling coefficient, fractal dimension, fracture length ratio and maximum fracture length.

  7. Energy Spectra of Higher Reynolds Number Turbulence by the DNS with up to 12288^3 Grid Points

    NASA Astrophysics Data System (ADS)

    Ishihara, Takashi; Kaneda, Yukio; Morishita, Koji; Yokokawa, Mitsuo; Uno, Atsuya

    2014-11-01

    Large-scale direct numerical simulations (DNS) of forced incompressible turbulence in a periodic box with up to 12288^3 grid points have been performed using the K computer. The maximum Taylor-microscale Reynolds number Rλ and the maximum Reynolds number Re based on the integral length scale are over 2000 and 10^5, respectively. Our previous DNS with Rλ up to 1100 showed that the energy spectrum has a slope steeper than -5/3 (the Kolmogorov scaling law) by factor 0.1 in the wavenumber range kη < 0.03. Here η is the Kolmogorov length scale. Our present DNS at higher resolutions show that the energy spectra with different Reynolds numbers (Rλ > 1000) are well normalized not by the integral length scale but by the Kolmogorov length scale, in the wavenumber range of the steeper slope. This result indicates that the steeper slope is not an inherent feature of the inertial subrange, and is affected by viscosity.

  8. How Does Sequence Structure Affect the Judgment of Time? Exploring a Weighted Sum of Segments Model

    ERIC Educational Resources Information Center

    Matthews, William J.

    2013-01-01

    This paper examines the judgment of segmented temporal intervals, using short tone sequences as a convenient test case. In four experiments, we investigate how the relative lengths, arrangement, and pitches of the tones in a sequence affect judgments of sequence duration, and ask whether the data can be described by a simple weighted sum of…

  9. Analysis of interface crack branching

    NASA Technical Reports Server (NTRS)

    Ballarini, R.; Mukai, D. J.; Miller, G. R.

    1989-01-01

    A solution is presented for the problem of a finite length crack branching off the interface between two bonded dissimilar isotropic materials. Results are presented in terms of the ratio of the energy release rate of a branched interface crack to the energy release rate of a straight interface crack with the same total length. It is found that this ratio reaches a maximum when the interface crack branches into the softer material. Longer branches tend to have smaller maximum energy release rate ratio angles indicating that all else being equal, a branch crack will tend to turn back parallel to the interface as it grows.

  10. Near Full-Length Identification of a Novel HIV-1 CRF01_AE/B/C Recombinant in Northern Myanmar.

    PubMed

    Zhou, Yan-Heng; Chen, Xin; Liang, Yue-Bo; Pang, Wei; Qin, Wei-Hong; Zhang, Chiyu; Zheng, Yong-Tang

    2015-08-01

    The Myanmar-China border appears to be the "hot spot" region for the occurrence of HIV-1 recombination. The majority of the previous analyses of HIV-1 recombination were based on partial genomic sequences, which obviously cannot reflect the reality of the genetic diversity of HIV-1 in this area well. Here, we present a near full-length characterization of a novel HIV-1 CRF01_AE/B/C recombinant isolated from a long-distance truck driver in Northern Myanmar. It is the first description of a near full-length genomic sequence in Myanmar since 2003, and might be one of the most complicated HIV-1 chimeras ever detected in Myanmar, containing four CRF01_AE, six B segments, and five C segments separated by 14 breakpoints throughout its genome. The discovery and characterization of this new CRF01_AE/B/C recombinant indicate that intersubtype recombination is ongoing in Myanmar, continuously generating new forms of HIV-1. More work based on near full-length sequence analyses is urgently needed to better understand the genetic diversity of HIV-1 in these regions.

  11. Characterization, genetic diversity, and evolutionary link of Cucumber mosaic virus strain New Delhi from India.

    PubMed

    Koundal, Vikas; Haq, Qazi Mohd Rizwanul; Praveen, Shelly

    2011-02-01

    The genome of Cucumber mosaic virus New Delhi strain (CMV-ND) from India, obtained from tomato, was completely sequenced and compared with full genome sequences of 14 known CMV strains from subgroups I and II, for their genetic diversity. Sequence analysis suggests CMV-ND shares maximum sequence identity at the nucleotide level with a CMV strain from Taiwan. Among all 15 strains of CMV, the encoded protein 2b is least conserved, whereas the coat protein (CP) is most conserved. Sequence identity values and phylogram results indicate that CMV-ND belongs to subgroup I. Based on the recombination detection program result, it appears that CMV is prone to recombination, and different RNA components of CMV-ND have evolved differently. Recombinational analysis of all 15 CMV strains detected maximum recombination breakpoints in RNA2; CP showed the least recombination sites.

  12. Comparison of the efficiency of rat papillary muscles during afterloaded isotonic contractions and contractions with sinusoidal length changes.

    PubMed

    Mellors, L J; Gibbs, C L; Barclay, C J

    2001-05-01

    The results of previous studies suggest that the maximum mechanical efficiency of rat papillary muscles is lower during a contraction protocol involving sinusoidal length changes than during one involving afterloaded isotonic contractions. The aim of this study was to compare directly the efficiency of isolated rat papillary muscle preparations in isotonic and sinusoidal contraction protocols. Experiments were performed in vitro (27 degrees C) using left ventricular papillary muscles from adult rats. Each preparation performed three contraction protocols: (i) low-frequency afterloaded isotonic contractions (10 twitches at 0.2 Hz), (ii) sinusoidal length change contractions with phasic stimulation (40 twitches at 2 Hz) and (iii) high-frequency afterloaded isotonic contractions (40 twitches at 2 Hz). The first two protocols resembled those used in previous studies and the third combined the characteristics of the first two. The parameters for each protocol were adjusted to those that gave maximum efficiency. For the afterloaded isotonic protocols, the afterload was set to 0.3 of the maximum developed force. The sinusoidal length change protocol incorporated a cycle amplitude of +/-5% resting length and a stimulus phase of -10 degrees. Measurements of force output, muscle length change and muscle temperature change were used to calculate the work and heat produced during and after each protocol. Net mechanical efficiency was defined as the proportion of the energy (enthalpy) liberated by the muscle that appeared as work. The efficiency in the low-frequency, isotonic contraction protocol was 21.1+/-1.4% (mean +/- s.e.m., N=6) and that in the sinusoidal protocol was 13.2+/-0.7%, consistent with previous results. This difference was not due to the higher frequency or greater number of twitches because efficiency in the high-frequency, isotonic protocol was 21.5+/-1.0%. 
    Although these results apparently confirm that efficiency is protocol-dependent, additional experiments designed to measure work output unambiguously indicated that the method used to calculate work output in isotonic contractions overestimated actual work output. When net work output, which excludes work done by parallel elastic elements, rather than total work output was used to determine efficiency in afterloaded isotonic contractions, efficiency was similar to that for sinusoidal contractions. The maximum net mechanical efficiency of rat papillary muscles performing afterloaded isotonic or sinusoidal length change contractions was between 10 and 15%.

  13. The influence of viral coding sequences on pestivirus IRES activity reveals further parallels with translation initiation in prokaryotes.

    PubMed Central

    Fletcher, Simon P; Ali, Iraj K; Kaminski, Ann; Digard, Paul; Jackson, Richard J

    2002-01-01

    Classical swine fever virus (CSFV) is a member of the pestivirus family, which shares many features in common with hepatitis C virus (HCV). It is shown here that CSFV has an exceptionally efficient cis-acting internal ribosome entry segment (IRES), which, like that of HCV, is strongly influenced by the sequences immediately downstream of the initiation codon, and is optimal with viral coding sequences in this position. Constructs that retained 17 or more codons of viral coding sequence exhibited full IRES activity, but with only 12 codons, activity was approximately 66% of maximum in vitro (though close to maximum in transfected BHK cells), whereas with just 3 codons or fewer, the activity was only approximately 15% of maximum. The minimal coding region elements required for high activity were exchanged between HCV and CSFV. Although maximum activity was observed in each case with the homologous combination of coding region and 5' UTR, the heterologous combinations were sufficiently active to rule out a highly specific functional interplay between the 5' UTR and coding sequences. On the other hand, inversion of the coding sequences resulted in low IRES activity, particularly with the HCV coding sequences. RNA structure probing showed that the efficiency of internal initiation of these chimeric constructs correlated most closely with the degree of single-strandedness of the region around and immediately downstream of the initiation codon. The low activity IRESs could not be rescued by addition of supplementary eIF4A (the initiation factor with ATP-dependent RNA helicase activity). The extreme sensitivity to secondary structure around the initiation codon is likely to be due to the fact that the eIF4F complex (which has eIF4A as one of its subunits) is not required for and does not participate in initiation on these IRESs. PMID:12515388

  14. libFLASM: a software library for fixed-length approximate string matching.

    PubMed

    Ayad, Lorraine A K; Pissis, Solon P P; Retha, Ahmad

    2016-11-10

    Approximate string matching is the problem of finding all factors of a given text that are at a distance at most k from a given pattern. Fixed-length approximate string matching is the problem of finding all factors of a text of length n that are at a distance at most k from any factor of length ℓ of a pattern of length m. There exist bit-vector techniques to solve the fixed-length approximate string matching problem in time [Formula: see text] and space [Formula: see text] under the edit and Hamming distance models, where w is the size of the computer word; as such these techniques are independent of the distance threshold k or the alphabet size. Fixed-length approximate string matching is a generalisation of approximate string matching and, hence, has numerous direct applications in computational molecular biology and elsewhere. We present and make available libFLASM, a free open-source C++ software library for solving fixed-length approximate string matching under both the edit and the Hamming distance models. Moreover, we describe how fixed-length approximate string matching is applied to solve real problems by incorporating libFLASM into established applications for multiple circular sequence alignment as well as single and structured motif extraction. Specifically, we describe how it can be used to improve the accuracy of multiple circular sequence alignment in terms of the inferred likelihood-based phylogenies; and we also describe how it is used to efficiently find motifs in molecular sequences representing regulatory or functional regions. The comparison of the performance of the library with other algorithms shows that it is competitive, especially with increasing distance thresholds. Fixed-length approximate string matching is a generalisation of the classic approximate string matching problem. We present libFLASM, a free open-source C++ software library for solving fixed-length approximate string matching. The extensive experimental results presented here suggest that other applications could benefit from using libFLASM, and thus further maintenance and development of libFLASM is desirable.
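
    To make the problem statement above concrete, here is a naive reference sketch of fixed-length approximate string matching under the Hamming distance. It is written in Python for brevity, compares every length-ℓ factor of the text with every length-ℓ factor of the pattern, and is neither the bit-vector technique nor the API of libFLASM; the function names are illustrative assumptions.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def fixed_length_matches(text, pattern, ell, k):
    """Return all (text_pos, pattern_pos) pairs whose length-ell factors
    are within Hamming distance k of each other.

    Naive O(n * m * ell) scan, intended only as a readable reference.
    """
    hits = []
    for i in range(len(text) - ell + 1):
        for j in range(len(pattern) - ell + 1):
            if hamming(text[i:i + ell], pattern[j:j + ell]) <= k:
                hits.append((i, j))
    return hits


# Toy input: factors of length 3, at most one mismatch allowed.
print(fixed_length_matches("ACGTAC", "CGTT", ell=3, k=1))
```

    A brute-force scan like this is mainly useful for checking the output of a faster implementation on small inputs.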

  15. Immunoglobulin from Antarctic fish species of Rajidae family.

    PubMed

    Coscia, Maria Rosaria; Cocca, Ennio; Giacomelli, Stefano; Cuccaro, Fausta; Oreste, Umberto

    2012-03-01

    Immunoglobulins (Ig) of Chondroichthyes have been extensively studied in sharks; in contrast, in skates investigations on Ig remain scarce and fragmentary despite the high occurrence of skates in all of the major oceans of the world. To focus on Rajidae Igμ, the most abundant heavy chain isotype, we have chosen the Antarctic species Bathyraja eatonii, Bathyraja albomaculata, Bathyraja brachyurops, and Amblyraja georgiana which live at high latitudes in the Southern Ocean, and at very low temperatures. We prepared mRNA from the spleen of individuals of each species and performed RT-PCR experiments using two oligonucleotides designed on the alignment of various elasmobranch Igμ heavy chain sequences available in GenBank. The PCR products, about 1400-nt long, were cloned and sequenced. Nucleotide sequence identities calculated for the constant region domains ranged from 88.5% to 97.5% between species, and from 91.1% to 99.7% within species. In a distance tree, including also Raja erinacea sequences, two major branches were obtained, one containing Arhynchobatinae sequences, the other one Rajinae sequences. Four presumptive D gene segments were identified in the region of the VH/D/JH recombination; two different D segments were often found in the same sequence. Moreover, 5-15 genomic fragments of different lengths, carrying the gene locus encoding Igμ chain were revealed by Southern blotting analysis. B. eatonii amino acid sequences were analyzed for the positional diversity by Shannon entropy analysis, showing CH4 as the most conserved domain, and CH3 as the most variable one. B. eatonii CDR3 region length varied between 11 and 15 amino acid residues; the mean length (13.4 aa) was greater than that of Leucoraja eglanteria sequences (7.7 aa). An alignment of representative sequences of Antarctic species and R. erinacea showed that more cysteine residues not involved in the intradomain disulfide bridges were present in Antarctic species. 
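
    The positional diversity analysis mentioned above scores each alignment column by its Shannon entropy: a fully conserved position scores 0 bits and the score rises as residues become more mixed. The sketch below assumes equal-length aligned sequences and uses made-up toy data; it is not the authors' pipeline, and the function names are illustrative.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def positional_entropy(aligned):
    """Entropy of every column of a list of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*aligned)]


# Toy alignment (hypothetical residues): two conserved columns, one variable.
toy_alignment = ["ACD", "ACE", "ACD", "ACE"]
print(positional_entropy(toy_alignment))
```

    Averaging the per-column values over a domain gives a single number per domain, which is one way to compare conservation between domains such as CH3 and CH4.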

  16. Analysis of full-length sequences of two Citrus yellow mosaic badnavirus isolates infecting Citrus jambhiri (Rough Lemon) and Citrus sinensis L. Osbeck (Sweet Orange) from a nursery in India.

    PubMed

    Anthony Johnson, A M; Borah, B K; Sai Gopal, D V R; Dasgupta, I

    2012-12-01

    Citrus yellow mosaic badnavirus (CMBV), a member of the family Caulimoviridae, genus Badnavirus, is the causative agent of mosaic disease among Citrus species in southern India. Despite its reported prevalence in several citrus species, complete functional genomic information on full-length genomes from all the CMBV isolates infecting citrus species is not available in publicly accessible databases. CMBV isolates from Rough Lemon and Sweet Orange collected from a nursery were cloned and sequenced. The analysis revealed high sequence homology of the two CMBV isolates with previously reported CMBV sequences, implying that they represent new variants. Based on computational analysis of the predicted secondary structures, the possible functions of some CMBV proteins have been analyzed.

  17. Identification of a new genotype H wild-type mumps virus strain and its molecular relatedness to other virulent and attenuated strains.

    PubMed

    Amexis, Georgios; Rubin, Steven; Chatterjee, Nando; Carbone, Kathryn; Chumakov, Kostantin

    2003-06-01

    A single clinical isolate of mumps virus designated 88-1961 was obtained from a patient hospitalized with a clinical history of upper respiratory tract infection, parotitis, severe headache, fever and lymphadenopathy. We have sequenced the full-length genome of 88-1961 and compared it against all available full-length sequences of mumps virus. Based upon the nucleotide sequence of its SH gene, 88-1961 was identified as a genotype H mumps strain. The overall extent of nucleotide and amino acid differences between each individual gene and protein of 88-1961 and the full-length mumps samples showed that the missense to silent ratios were unevenly distributed. Upon evaluation of the consensus sequence of 88-1961, four positions were found to be clearly heterogeneous at the nucleotide level (NP 315C/T, NP 318C/T, F 271A/C, and HN 855C/T). Sequence analysis revealed that the amino acid sequences for the NP, M, and the L protein were the most conserved, whereas the SH protein exhibited the highest variability among the compared mumps genotypes A, B, and G. No identifying molecular patterns in the non-coding (intergenic) or coding regions of 88-1961 were found when we compared it against relatively virulent (Urabe AM9 B, Glouc1/UK96, 87-1004 and 87-1005) and non-virulent mumps strains (Jeryl Lynn and all Urabe Am9 A substrains).

  18. Bioinformatics analysis and detection of gelatinase encoded gene in Lysinibacillus sphaericus

    NASA Astrophysics Data System (ADS)

    Repin, Rul Aisyah Mat; Mutalib, Sahilah Abdul; Shahimi, Safiyyah; Khalid, Rozida Mohd.; Ayob, Mohd. Khan; Bakar, Mohd. Faizal Abu; Isa, Mohd Noor Mat

    2016-11-01

    In this study, we performed bioinformatics analysis of the genome sequence of Lysinibacillus sphaericus (L. sphaericus) to identify genes encoding gelatinase. L. sphaericus was isolated from soil and is a bacterium whose gelatinase is species-specific to porcine and bovine gelatin. This bacterium offers the possibility of producing enzymes specific to each of these two meat species. The main focus of this research is to identify the gelatinase-encoding gene of L. sphaericus using bioinformatics analysis of the partially sequenced genome. Three candidate genes were identified: gelatinase candidate gene 1 (P1), NODE_71_length_93919_cov_158.931839_21, which is 1563 base pairs (bp) in size and encodes a sequence of 520 amino acids; gelatinase candidate gene 2 (P2), NODE_23_length_52851_cov_190.061386_17, which is 1776 bp in size and encodes 591 amino acids; and gelatinase candidate gene 3 (P3), NODE_106_length_32943_cov_169.147919_8, which is 1701 bp in size and encodes 566 amino acids. Three pairs of oligonucleotide primers, named F1, R1, F2, R2, F3 and R3, were designed to target short sequences of cDNA by PCR. The amplicons were reliably 1563 bp in size for candidate gene P1 and 1701 bp in size for candidate gene P3. Thus, the bioinformatics analysis of L. sphaericus identified genes encoding gelatinase.
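
    The gene sizes and protein lengths reported above are mutually consistent if each open reading frame ends in one stop codon, since the residue count is then bp / 3 - 1. The sketch below checks that arithmetic and splits the contig-style identifiers into fields. The interpretation of the name fields (node number, contig length, coverage, gene index) is an assumption based on their appearance, and all function names are illustrative.

```python
import re

# Assumed layout: NODE_<node>_length_<contig bp>_cov_<coverage>_<gene index>
NAME_RE = re.compile(r"NODE_(\d+)_length_(\d+)_cov_([\d.]+)_(\d+)")


def parse_candidate_name(name):
    """Split a contig-style gene identifier into its numeric fields."""
    match = NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognised identifier: {name!r}")
    node, length, coverage, gene = match.groups()
    return {
        "node": int(node),
        "contig_length": int(length),
        "coverage": float(coverage),
        "gene_index": int(gene),
    }


def residues_from_orf(bp):
    """Amino acids encoded by an ORF of `bp` bases that includes one stop codon."""
    if bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return bp // 3 - 1


# Gene sizes quoted in the abstract for P1, P2 and P3.
for label, bp in [("P1", 1563), ("P2", 1776), ("P3", 1701)]:
    print(label, bp, residues_from_orf(bp))
```

    Running the loop reproduces the 520, 591 and 566 residues stated in the abstract, which is a quick sanity check on the reported figures.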

  19. Signal sequence and keyword trap in silico for selection of full-length human cDNAs encoding secretion or membrane proteins from oligo-capped cDNA libraries.

    PubMed

    Otsuki, Tetsuji; Ota, Toshio; Nishikawa, Tetsuo; Hayashi, Koji; Suzuki, Yutaka; Yamamoto, Jun-ichi; Wakamatsu, Ai; Kimura, Kouichi; Sakamoto, Katsuhiko; Hatano, Naoto; Kawai, Yuri; Ishii, Shizuko; Saito, Kaoru; Kojima, Shin-ichi; Sugiyama, Tomoyasu; Ono, Tetsuyoshi; Okano, Kazunori; Yoshikawa, Yoko; Aotsuka, Satoshi; Sasaki, Naokazu; Hattori, Atsushi; Okumura, Koji; Nagai, Keiichi; Sugano, Sumio; Isogai, Takao

    2005-01-01

    We have developed an in silico method of selection of human full-length cDNAs encoding secretion or membrane proteins from oligo-capped cDNA libraries. Fullness rates were increased to about 80% by combination of the oligo-capping method and ATGpr, software for prediction of translation start point and the coding potential. Then, using 5'-end single-pass sequences, cDNAs having the signal sequence were selected by PSORT ('signal sequence trap'). We also applied 'secretion or membrane protein-related keyword trap' based on the result of BLAST search against the SWISS-PROT database for the cDNAs which could not be selected by PSORT. Using the above procedures, 789 cDNAs were primarily selected and subjected to full-length sequencing, and 334 of these cDNAs were finally selected as novel. Most of the cDNAs (295 cDNAs: 88.3%) were predicted to encode secretion or membrane proteins. In particular, 165 (80.5%) of the 205 cDNAs selected by PSORT were predicted to have signal sequences, while 70 (54.2%) of the 129 cDNAs selected by 'keyword trap' preserved the secretion or membrane protein-related keywords. Many important cDNAs were obtained, including transporters, receptors, and ligands, involved in significant cellular functions. Thus, an efficient method of selecting secretion or membrane protein-encoding cDNAs was developed by combining the above four procedures.

  20. Experimental and analytical study of high velocity impact on Kevlar/Epoxy composite plates

    NASA Astrophysics Data System (ADS)

    Sikarwar, Rahul S.; Velmurugan, Raman; Madhu, Velmuri

    2012-12-01

    In the present study, impact behavior of Kevlar/Epoxy composite plates has been carried out experimentally by considering different thicknesses and lay-up sequences and compared with analytical results. The effect of thickness, lay-up sequence on energy absorbing capacity has been studied for high velocity impact. Four lay-up sequences and four thickness values have been considered. Initial velocities and residual velocities are measured experimentally to calculate the energy absorbing capacity of laminates. Residual velocity of projectile and energy absorbed by laminates are calculated analytically. The results obtained from analytical study are found to be in good agreement with experimental results. It is observed from the study that 0/90 lay-up sequence is most effective for impact resistance. Delamination area is maximum on the back side of the plate for all thickness values and lay-up sequences. The delamination area on the back is maximum for 0/90/45/-45 laminates compared to other lay-up sequences.

  1. Toward Genomics-Based Breeding in C3 Cool-Season Perennial Grasses.

    PubMed

    Talukder, Shyamal K; Saha, Malay C

    2017-01-01

    Most important food and feed crops in the world belong to the C3 grass family. The future of food security is highly reliant on achieving genetic gains of those grasses. Conventional breeding methods have already reached a plateau for improving major crops. Genomics tools and resources have opened an avenue to explore genome-wide variability and make use of the variation for enhancing genetic gains in breeding programs. Major C3 annual cereal breeding programs are well equipped with genomic tools; however, genomic research of C3 cool-season perennial grasses is lagging behind. In this review, we discuss the currently available genomics tools and approaches useful for C3 cool-season perennial grass breeding. Along with a general review, we emphasize the discussion focusing on forage grasses that were considered orphan and have little or no genetic information available. Transcriptome sequencing and genotype-by-sequencing technology for genome-wide marker detection using next-generation sequencing (NGS) are very promising as genomics tools. Most C3 cool-season perennial grass members have no prior genetic information; thus NGS technology will enhance collinear study with other C3 model grasses like Brachypodium and rice. Transcriptomics data can be used for identification of functional genes and molecular markers, i.e., polymorphism markers and simple sequence repeats (SSRs). Genome-wide association study with NGS-based markers will facilitate marker identification for marker-assisted selection. With limited genetic information, genomic selection holds great promise to breeders for attaining maximum genetic gain of the cool-season C3 perennial grasses. Application of all these tools can ensure better genetic gains, reduce length of selection cycles, and facilitate cultivar development to meet the future demand for food and fodder.

  2. Diffusion weighted whole body imaging with background body signal suppression (DWIBS): technical improvement using free breathing, STIR and high resolution 3D display.

    PubMed

    Takahara, Taro; Imai, Yutaka; Yamashita, Tomohiro; Yasuda, Seiei; Nasu, Seiji; Van Cauteren, Marc

    2004-01-01

    To examine a new way of body diffusion weighted imaging (DWI) using the short TI inversion recovery-echo planar imaging (STIR-EPI) sequence and free breathing scanning (diffusion weighted whole body imaging with background body signal suppression; DWIBS) to obtain three-dimensional displays. 1) Apparent contrast-to-noise ratios (AppCNR) between lymph nodes and surrounding fat tissue were compared in three types of DWI with and without breath-holding, with variable lengths of scan time and slice thickness. 2) The STIR-EPI sequence and spin echo-echo planar imaging (SE-EPI) sequence with chemical shift selective (CHESS) pulse were compared in terms of their degree of fat suppression. 3) Eleven patients with neck, chest, and abdominal malignancy were scanned with DWIBS for evaluation of feasibility. Whole body imaging was done in a later stage of the study using the peripheral vascular coil. The AppCNR of 8 mm slice thickness images reconstructed from 4 mm slice thickness source images obtained in a free breathing scan of 430 sec was much better than that of 9 mm slice thickness breath-hold scans obtained in 25 sec. High resolution multi-planar reformat (MPR) and maximum intensity projection (MIP) images could be made from the data set of 4 mm slice thickness images. Fat suppression was much better in the STIR-EPI sequence than in SE-EPI with CHESS pulse. The feasibility of DWIBS was shown in clinical scans of 11 patients. Whole body images were successfully obtained with adequate fat suppression. Three-dimensional DWIBS can be obtained with this technique, which may allow us to screen for malignancies in the whole body.

  3. Massive Collection of Full-Length Complementary DNA Clones and Microarray Analyses: Keys to Rice Transcriptome Analysis

    NASA Astrophysics Data System (ADS)

    Kikuchi, Shoshi

    2009-02-01

    Completion of the high-precision genome sequence analysis of rice led to the collection of about 35,000 full-length cDNA clones and the determination of their complete sequences. Mapping of these full-length cDNA sequences has given us information on (1) the number of genes expressed in the rice genome; (2) the start and end positions and exon-intron structures of rice genes; (3) alternative transcripts; (4) possible encoded proteins; (5) non-protein-coding (np) RNAs; (6) the density of gene localization on the chromosome; (7) setting the parameters of gene prediction programs; and (8) the construction of a microarray system that monitors global gene expression. Manual curation for rice gene annotation by using mapping information on full-length cDNA and EST assemblies has revealed about 32,000 expressed genes in the rice genome. Analysis of major gene families, such as those encoding membrane transport proteins (pumps, ion channels, and secondary transporters), along with the evolution from bacteria to higher animals and plants, reveals how gene numbers have increased through adaptation to circumstances. Family-based gene annotation also gives us a new way of comparing organisms. Massive amounts of data on gene expression under many kinds of physiological conditions are being accumulated in rice oligoarrays (22K and 44K) based on full-length cDNA sequences. Cluster analyses of genes that have the same promoter cis-elements, that have similar expression profiles, or that encode enzymes in the same metabolic pathways or signal transduction cascades give us clues to understanding the networks of gene expression in rice. As a tool for that purpose, we recently developed "RiCES", a tool for searching for cis-elements in the promoter regions of clustered genes.

  4. Length and sequence heterogeneity in 5S rDNA of Populus deltoides.

    PubMed

    Negi, Madan S; Rajagopal, Jyothi; Chauhan, Neeti; Cronn, Richard; Lakshmikumaran, Malathi

    2002-12-01

    The 5S rRNA genes and their associated non-transcribed spacer (NTS) regions are present as repeat units arranged in tandem arrays in plant genomes. Length heterogeneity in 5S rDNA repeats was previously identified in Populus deltoides and was also observed in the present study. Primers were designed to amplify the 5S rDNA NTS variants from the P. deltoides genome. The PCR-amplified products from the two accessions of P. deltoides (G3 and G48) suggested the presence of length heterogeneity of 5S rDNA units within and among accessions, and the size of the spacers ranged from 385 to 434 bp. Sequence analysis of the non-transcribed spacer (NTS) revealed two distinct classes of 5S rDNA within both accessions: class 1, which contained GAA trinucleotide microsatellite repeats, and class 2, which lacked the repeats. The class 1 spacer shows length variation owing to the microsatellite, with two clones exhibiting 10 GAA repeat units and one clone exhibiting 16 such repeat units. However, distance analysis shows that class 1 spacer sequences are highly similar inter se, yielding nucleotide diversity (pi) estimates that are less than 0.15% of those obtained for class 2 spacers (pi = 0.0183 vs. 0.1433, respectively). The presence of microsatellite in the NTS region leading to variation in spacer length is reported and discussed for the first time in P. deltoides.

  5. Crustal dynamics project session 4 validation and intercomparison experiments 1979-1980 report

    NASA Technical Reports Server (NTRS)

    Liebrecht, P.; Kolenkiewicz, R.; Ryan, J.; Hothem, L.

    1983-01-01

    As part of the Crustal Dynamics Project, an experiment was performed to verify the ability of Satellite Laser Ranging (SLR), Very Long Baseline Interferometry (VLBI) and Doppler Satellite Positioning System (Doppler) techniques to estimate the baseline distances between several locations. The Goddard Space Flight Center (GSFC) lasers were in operation at all five sites available to them. The ten baselines involved were analyzed using monthly orbits and various methods of selecting data. The standard deviation of the monthly SLR baseline lengths was at the 7 cm level. The GSFC VLBI (Mark III) data was obtained during three separate experiments: November 1979 at Haystack and Owens Valley, and April and July 1980 at Haystack, Owens Valley, and Fort Davis. Repeatability of the VLBI in determining baseline lengths was calculated to be at the 2 cm level. Jet Propulsion Laboratory (JPL) VLBI (Mark II) data was acquired on the Owens Valley to Goldstone baseline on ten occasions between August 1979 and November 1980. The repeatability of these baseline length determinations was calculated to be at the 5 cm level. National Geodetic Survey (NGS) Doppler data was acquired at all five sites in January 1980. Repeatability of the Doppler-determined baseline lengths was calculated at approximately 30 cm. An intercomparison between baseline distances and associated parameters was made utilizing SLR, VLBI, and Doppler results on all available baselines. The VLBI and SLR length determinations were compared on four baselines with a resultant mean difference of -1 cm and a maximum difference of 12 cm. The SLR and Doppler length determinations were compared on ten baselines with a resultant mean difference of about 30 cm and a maximum difference of about 60 cm. The VLBI and Doppler lengths from seven baselines showed a resultant mean difference of about 30 cm and maximum difference of about 1 meter. The intercomparison of baseline orientation parameters was consistent with past analysis.

  6. The GAAS metagenomic tool and its estimations of viral and microbial average genome size in four major biomes.

    PubMed

    Angly, Florent E; Willner, Dana; Prieto-Davó, Alejandra; Edwards, Robert A; Schmieder, Robert; Vega-Thurber, Rebecca; Antonopoulos, Dionysios A; Barott, Katie; Cottrell, Matthew T; Desnues, Christelle; Dinsdale, Elizabeth A; Furlan, Mike; Haynes, Matthew; Henn, Matthew R; Hu, Yongfei; Kirchman, David L; McDole, Tracey; McPherson, John D; Meyer, Folker; Miller, R Michael; Mundt, Egbert; Naviaux, Robert K; Rodriguez-Mueller, Beltran; Stevens, Rick; Wegley, Linda; Zhang, Lixin; Zhu, Baoli; Rohwer, Forest

    2009-12-01

    Metagenomic studies characterize both the composition and diversity of uncultured viral and microbial communities. BLAST-based comparisons have typically been used for such analyses; however, sampling biases, high percentages of unknown sequences, and the use of arbitrary thresholds to find significant similarities can decrease the accuracy and validity of estimates. Here, we present Genome relative Abundance and Average Size (GAAS), a complete software package that provides improved estimates of community composition and average genome length for metagenomes in both textual and graphical formats. GAAS implements a novel methodology to control for sampling bias via length normalization, to adjust for multiple BLAST similarities by similarity weighting, and to select significant similarities using relative alignment lengths. In benchmark tests, the GAAS method was robust to both high percentages of unknown sequences and to variations in metagenomic sequence read lengths. Re-analysis of the Sargasso Sea virome using GAAS indicated that standard methodologies for metagenomic analysis may dramatically underestimate the abundance and importance of organisms with small genomes in environmental systems. Using GAAS, we conducted a meta-analysis of microbial and viral average genome lengths in over 150 metagenomes from four biomes to determine whether genome lengths vary consistently between and within biomes, and between microbial and viral communities from the same environment. Significant differences between biomes and within aquatic sub-biomes (oceans, hypersaline systems, freshwater, and microbialites) suggested that average genome length is a fundamental property of environments driven by factors at the sub-biome level. The behavior of paired viral and microbial metagenomes from the same environment indicated that microbial and viral average genome sizes are independent of each other, but indicative of community responses to stressors and environmental conditions.
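
    The idea of length normalization named in this abstract can be illustrated with a toy calculation. The sketch below is not the GAAS implementation: it only assumes that the number of reads hitting a genome scales with both its abundance and its length, so hit counts are divided by genome length before being turned into relative abundances. The function names, genome names, and numbers are hypothetical.

```python
def relative_abundance(hits, genome_lengths):
    """Length-normalized relative abundance per genome (illustrative only).

    hits: mapping of genome name -> number of similarity hits (hypothetical).
    genome_lengths: mapping of genome name -> genome length in bp.
    """
    # Longer genomes attract more reads, so divide each count by genome length.
    normalized = {name: hits[name] / genome_lengths[name] for name in hits}
    total = sum(normalized.values())
    return {name: value / total for name, value in normalized.items()}


def average_genome_length(hits, genome_lengths):
    """Abundance-weighted mean genome length of the toy community."""
    abundance = relative_abundance(hits, genome_lengths)
    return sum(abundance[name] * genome_lengths[name] for name in abundance)


# Toy community: the large genome collects 10x more hits only because it is 10x longer.
toy_hits = {"large_genome": 100, "small_genome": 10}
toy_lengths = {"large_genome": 50_000, "small_genome": 5_000}

toy_abundance = relative_abundance(toy_hits, toy_lengths)
toy_average = average_genome_length(toy_hits, toy_lengths)
```

    In this toy case the raw hit counts would put the large genome at about 91% of the community, while the length-normalized estimate gives both genomes equal abundance, which is the kind of small-genome underestimation the abstract describes.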

  7. Identification of a Conserved Non-Protein-Coding Genomic Element that Plays an Essential Role in Alphabaculovirus Pathogenesis

    PubMed Central

    Kikhno, Irina

    2014-01-01

    Highly homologous sequences 154–157 bp in length grouped under the name of “conserved non-protein-coding element” (CNE) were revealed in all of the sequenced genomes of baculoviruses belonging to the genus Alphabaculovirus. A CNE alignment led to the detection of a set of highly conserved nucleotide clusters that occupy strictly conserved positions in the CNE sequence. The significant length of the CNE and conservation of both its length and cluster architecture were identified as a combination of characteristics that make this CNE different from known viral non-coding functional sequences. The essential role of the CNE in the Alphabaculovirus life cycle was demonstrated through the use of a CNE-knockout Autographa californica multiple nucleopolyhedrovirus (AcMNPV) bacmid. It was shown that the essential function of the CNE was not mediated by the presumed expression activities of the protein- and non-protein-coding genes that overlap the AcMNPV CNE. On the basis of the presented data, the AcMNPV CNE was categorized as a complex-structured, polyfunctional genomic element involved in an essential DNA transaction that is associated with an undefined function of the baculovirus genome. PMID:24740153

  8. Sequence-length variation of mtDNA HVS-I C-stretch in Chinese ethnic groups.

    PubMed

    Chen, Feng; Dang, Yong-hui; Yan, Chun-xia; Liu, Yan-ling; Deng, Ya-jun; Fulton, David J R; Chen, Teng

    2009-10-01

    The purpose of this study was to investigate mitochondrial DNA (mtDNA) hypervariable segment-I (HVS-I) C-stretch variations and explore the significance of these variations in forensic and population genetics studies. The C-stretch sequence variation was studied in 919 unrelated individuals from 8 Chinese ethnic groups using both direct and clone sequencing approaches. Thirty-eight C-stretch haplotypes were identified, and some novel and population specific haplotypes were also detected. The C-stretch genetic diversity (GD) values were relatively high, and probability (P) values were low. Additionally, C-stretch length heteroplasmy was observed in approximately 9% of individuals studied. There was a significant correlation (r=-0.961, P<0.01) between the expansion of the cytosine sequence length in the C-stretch of HVS-I and a reduction in the number of upstream adenines. These results indicate that the C-stretch could be a useful genetic marker in forensic identification of Chinese populations. The results from the Fst and dA genetic distance matrix, neighbor-joining tree, and principal component map also suggest that C-stretch could be used as a reliable genetic marker in population genetics.

  9. A phylogenetic comparison of urease-positive thermophilic Campylobacter (UPTC) and urease-negative (UN) C. lari.

    PubMed

    Hirayama, Junichi; Tazumi, Akihiro; Hayashi, Kyohei; Tasaki, Erina; Kuribayashi, Takashi; Moore, John E; Millar, Beverley C; Matsuda, Motoo

    2011-06-01

    In the present study, the reliability of full-length gene sequence information for several genes including 16S rRNA was examined, for the discrimination of the two representative Campylobacter lari taxa, namely urease-negative (UN) C. lari and urease-positive thermophilic Campylobacter (UPTC). As previously described, 16S rRNA gene sequences are not reliable for the molecular discrimination of UN C. lari from UPTC organisms employing both the unweighted pair group method using arithmetic means analysis (UPGMA) and neighbor joining (NJ) methods. In addition, three composite full-length gene sequences (ciaB, flaC and vacJ) out of seven gene loci examined were reliable for discrimination employing dendrograms constructed by the UPGMA method. In addition, all the dendrograms of the NJ phylogenetic trees constructed based on the nine gene information were not reliable for the discrimination. Three composite full-length gene sequences (ciaB, flaC and vacJ) were reliable for the molecular discrimination between UN C. lari and UPTC organisms employing the UPGMA method, as well as among four thermophilic Campylobacter species. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. Molecular Phylogeny of the Bamboo Sharks (Chiloscyllium spp.)

    PubMed Central

    Masstor, Noor Haslina; Samat, Abdullah; Nor, Shukor Md; Md-Zain, Badrul Munir

    2014-01-01

    Chiloscyllium, commonly called bamboo shark, can be found inhabiting the waters of the Indo-West Pacific around East Asian countries such as Malaysia, Myanmar, Thailand, Singapore, and Indonesia. The International Union for Conservation of Nature (IUCN) Red List has categorized them as Near Threatened because their populations are declining due to overexploitation. A molecular study was carried out to portray the systematic relationships within Chiloscyllium species using 12S rRNA and cytochrome b gene sequences. Maximum parsimony and Bayesian methods were used to reconstruct their phylogenetic trees. A total of 381 bp were successfully aligned in the 12S rRNA region, with 41 sites being parsimony-informative. In the cytochrome b region, a total of 1120 bp sites were aligned, with 352 parsimony-informative characters. All analyses yield phylogenetic trees on which C. indicum has close relationships with C. plagiosum. C. punctatum is sister taxon to both C. indicum and C. plagiosum while C. griseum and C. hasseltii formed their own clade as sister taxa. These Chiloscyllium classifications can be supported by some morphological characters (lateral dermal ridges on the body, coloring patterns, and appearance of hypobranchials and basibranchial plate) that can clearly be used to differentiate each species. PMID:25013766

  11. Systematic Characterization and Comparative Analysis of the Rabbit Immunoglobulin Repertoire

    PubMed Central

    Lavinder, Jason J.; Hoi, Kam Hon; Reddy, Sai T.; Wine, Yariv; Georgiou, George

    2014-01-01

    Rabbits have been used extensively as a model system for the elucidation of the mechanism of immunoglobulin diversification and for the production of antibodies. We employed Next Generation Sequencing to analyze Ig germline V and J gene usage, CDR3 length and amino acid composition, and gene conversion frequencies within the functional (transcribed) IgG repertoire of the New Zealand white rabbit (Oryctolagus cuniculus). Several previously unannotated rabbit heavy chain variable (VH) and light chain variable (VL) germline elements were deduced bioinformatically using multidimensional scaling and k-means clustering methods. We estimated the gene conversion frequency in the rabbit at 23% of IgG sequences with a mean gene conversion tract length of 59±36 bp. Sequencing and gene conversion analysis of the chicken, human, and mouse repertoires revealed that gene conversion occurs much more extensively in the chicken (frequency 70%, tract length 79±57 bp), was observed to a small, yet statistically significant extent in humans, but was virtually absent in mice. PMID:24978027

  12. Cell culture compositions

    DOEpatents

    Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yiao, Jian

    2014-03-18

    The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6 (SEQ ID NO:1 encodes the full length endoglucanase; SEQ ID NO:4 encodes the mature form), and the corresponding endoglucanase VI amino acid sequence ("EGVI"; SEQ ID NO:3 is the signal sequence; SEQ ID NO:2 is the mature sequence). The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.

  13. Effect of base sequence on the DNA cross-linking properties of pyrrolobenzodiazepine (PBD) dimers

    PubMed Central

    Rahman, Khondaker M.; James, Colin H.; Thurston, David E.

    2011-01-01

    Pyrrolo[2,1-c][1,4]benzodiazepine (PBD) dimers are synthetic sequence-selective DNA minor-groove cross-linking agents that possess two electrophilic imine moieties (or their equivalent) capable of forming covalent aminal linkages with guanine C2-NH2 functionalities. The PBD dimer SJG-136, which has a C8–O–(CH2)3–O–C8′ central linker joining the two PBD moieties, is currently undergoing phase II clinical trials and current research is focused on developing analogues of SJG-136 with different linker lengths and substitution patterns. Using a reversed-phase ion pair HPLC/MS method to evaluate interaction with oligonucleotides of varying length and sequence, we recently reported (JACS, 2009, 131, 13 756) that SJG-136 can form three different types of adducts: inter- and intrastrand cross-linked adducts, and mono-alkylated adducts. These studies have now been extended to include PBD dimers with a longer central linker (C8–O–(CH2)5–O–C8′), demonstrating that the type and distribution of adducts appear to depend on (i) the length of the C8/C8′-linker connecting the two PBD units, (ii) the positioning of the two reactive guanine bases on the same or opposite strands, and (iii) their separation (i.e. the number of base pairs, usually ATs, between them). Based on these data, a set of rules is emerging that can be used to predict the DNA–interaction behaviour of a PBD dimer of particular C8–C8′ linker length towards a given DNA sequence. These observations suggest that it may be possible to design PBD dimers to target specific DNA sequences. PMID:21427082

  14. Stick balancing with reflex delay in case of parametric forcing

    NASA Astrophysics Data System (ADS)

    Insperger, Tamas

    2011-04-01

    The effect of parametric forcing on a PD control of an inverted pendulum is analyzed in the presence of feedback delay. The stability of the time-periodic and time-delayed system is determined numerically using the first-order semi-discretization method in the 5-dimensional parameter space of the pendulum's length, the forcing frequency, the forcing amplitude, the proportional and the differential gains. It is shown that the critical length of the pendulum (that can just be balanced against the time-delay) can significantly be decreased by parametric forcing even if the maximum forcing acceleration is limited. The numerical analysis showed that the critical stick length of about 30 cm, corresponding to the unforced system with reflex delay 0.1 s, can be decreased to 18 cm while keeping the maximum acceleration below the gravitational acceleration.

  15. Resequencing of the common marmoset genome improves genome assemblies and gene-coding sequence analysis.

    PubMed

    Sato, Kengo; Kuroki, Yoko; Kumita, Wakako; Fujiyama, Asao; Toyoda, Atsushi; Kawai, Jun; Iriki, Atsushi; Sasaki, Erika; Okano, Hideyuki; Sakakibara, Yasubumi

    2015-11-20

    The first draft of the common marmoset (Callithrix jacchus) genome was published by the Marmoset Genome Sequencing and Analysis Consortium. The draft was based on whole-genome shotgun sequencing, and the current assembly version is Callithrix_jacches-3.2.1, but there still exist 187,214 undetermined gap regions and supercontigs and relatively short contigs that are unmapped to chromosomes in the draft genome. We performed resequencing and assembly of the genome of common marmoset by deep sequencing with high-throughput sequencing technology. Several different sequence runs using Illumina sequencing platforms were executed, and 181 Gbp of high-quality bases including mate-pairs with long insert lengths of 3, 8, 20, and 40 Kbp were obtained, that is, approximately 60× coverage. The resequencing significantly improved the MGSAC draft genome sequence. The N50 of the contigs, which is a statistical measure used to evaluate assembly quality, doubled. As a result, 51% of the contigs (total length: 299 Mbp) that were unmapped to chromosomes in the MGSAC draft were merged with chromosomal contigs, and the improved genome sequence helped to detect 5,288 new genes that are homologous to human cDNAs and the gaps in 5,187 transcripts of the Ensembl gene annotations were completely filled.
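
    The N50 statistic mentioned in this record can be illustrated with a short sketch. It uses the common definition (the contig length at which the contigs of that length or longer together cover at least half of the total assembly length); the function name and the toy contig lengths are illustrative and are not taken from the marmoset assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    N50 is the length L such that contigs of length >= L together
    account for at least half of the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy assembly: total length 54, so half is 27; 10 + 9 + 8 = 27 reaches it.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_contigs)
```

    Merging short contigs into longer ones moves more of the total length into fewer, larger pieces, which is why a doubled N50 is reported as an improvement in assembly contiguity.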

  16. Effects of wind velocity and slope on flame properties

    Treesearch

    David R. Weise; Gregory S. Biging

    1996-01-01

    Abstract: The combined effects of wind velocity and percent slope on flame length and angle were measured in an open-topped, tilting wind tunnel by burning fuel beds composed of vertical birch sticks and aspen excelsior. Mean flame length ranged from 0.08 to 1.69 m; 0.25 m was the maximum observed flame length for most backing fires. Flame angle ranged from -46° to 50°...

  17. De Novo Genome and Transcriptome Assembly of the Canadian Beaver (Castor canadensis).

    PubMed

    Lok, Si; Paton, Tara A; Wang, Zhuozhi; Kaur, Gaganjot; Walker, Susan; Yuen, Ryan K C; Sung, Wilson W L; Whitney, Joseph; Buchanan, Janet A; Trost, Brett; Singh, Naina; Apresto, Beverly; Chen, Nan; Coole, Matthew; Dawson, Travis J; Ho, Karen; Hu, Zhizhou; Pullenayegum, Sanjeev; Samler, Kozue; Shipstone, Arun; Tsoi, Fiona; Wang, Ting; Pereira, Sergio L; Rostami, Pirooz; Ryan, Carol Ann; Tong, Amy Hin Yan; Ng, Karen; Sundaravadanam, Yogi; Simpson, Jared T; Lim, Burton K; Engstrom, Mark D; Dutton, Christopher J; Kerr, Kevin C R; Franke, Maria; Rapley, William; Wintle, Richard F; Scherer, Stephen W

    2017-02-09

    The Canadian beaver (Castor canadensis) is the largest indigenous rodent in North America. We report a draft annotated assembly of the beaver genome, the first for a large rodent and the first mammalian genome assembled directly from uncorrected and moderate coverage (< 30 ×) long reads generated by single-molecule sequencing. The genome size is 2.7 Gb estimated by k-mer analysis. We assembled the beaver genome using the new Canu assembler optimized for noisy reads. The resulting assembly was refined using Pilon supported by short reads (80 ×) and checked for accuracy by congruency against an independent short read assembly. We scaffolded the assembly using the exon-gene models derived from 9805 full-length open reading frames (FL-ORFs) constructed from the beaver leukocyte and muscle transcriptomes. The final assembly comprised 22,515 contigs with an N50 of 278,680 bp and an N50-scaffold of 317,558 bp. Maximum contig and scaffold lengths were 3.3 and 4.2 Mb, respectively, with a combined scaffold length representing 92% of the estimated genome size. The completeness and accuracy of the scaffold assembly was demonstrated by the precise exon placement for 91.1% of the 9805 assembled FL-ORFs and 83.1% of the BUSCO (Benchmarking Universal Single-Copy Orthologs) gene set used to assess the quality of genome assemblies. Well-represented were genes involved in dentition and enamel deposition, defining characteristics of rodents with which the beaver is well-endowed. The study provides insights for genome assembly and an important genomics resource for Castoridae and rodent evolutionary biology. Copyright © 2017 Lok et al.

  18. De Novo Genome and Transcriptome Assembly of the Canadian Beaver (Castor canadensis)

    PubMed Central

    Lok, Si; Paton, Tara A.; Wang, Zhuozhi; Kaur, Gaganjot; Walker, Susan; Yuen, Ryan K. C.; Sung, Wilson W. L.; Whitney, Joseph; Buchanan, Janet A.; Trost, Brett; Singh, Naina; Apresto, Beverly; Chen, Nan; Coole, Matthew; Dawson, Travis J.; Ho, Karen; Hu, Zhizhou; Pullenayegum, Sanjeev; Samler, Kozue; Shipstone, Arun; Tsoi, Fiona; Wang, Ting; Pereira, Sergio L.; Rostami, Pirooz; Ryan, Carol Ann; Tong, Amy Hin Yan; Ng, Karen; Sundaravadanam, Yogi; Simpson, Jared T.; Lim, Burton K.; Engstrom, Mark D.; Dutton, Christopher J.; Kerr, Kevin C. R.; Franke, Maria; Rapley, William; Wintle, Richard F.; Scherer, Stephen W.

    2017-01-01

    The Canadian beaver (Castor canadensis) is the largest indigenous rodent in North America. We report a draft annotated assembly of the beaver genome, the first for a large rodent and the first mammalian genome assembled directly from uncorrected and moderate coverage (< 30 ×) long reads generated by single-molecule sequencing. The genome size is 2.7 Gb estimated by k-mer analysis. We assembled the beaver genome using the new Canu assembler optimized for noisy reads. The resulting assembly was refined using Pilon supported by short reads (80 ×) and checked for accuracy by congruency against an independent short read assembly. We scaffolded the assembly using the exon–gene models derived from 9805 full-length open reading frames (FL-ORFs) constructed from the beaver leukocyte and muscle transcriptomes. The final assembly comprised 22,515 contigs with an N50 of 278,680 bp and an N50-scaffold of 317,558 bp. Maximum contig and scaffold lengths were 3.3 and 4.2 Mb, respectively, with a combined scaffold length representing 92% of the estimated genome size. The completeness and accuracy of the scaffold assembly was demonstrated by the precise exon placement for 91.1% of the 9805 assembled FL-ORFs and 83.1% of the BUSCO (Benchmarking Universal Single-Copy Orthologs) gene set used to assess the quality of genome assemblies. Well-represented were genes involved in dentition and enamel deposition, defining characteristics of rodents with which the beaver is well-endowed. The study provides insights for genome assembly and an important genomics resource for Castoridae and rodent evolutionary biology. PMID:28087693

  19. Improving the Performance of Two-Stage Gas Guns By Adding a Diaphragm in the Pump Tube

    NASA Technical Reports Server (NTRS)

    Bogdanoff, D. W.; Miller, Robert J.

    1995-01-01

    Herein, we study the technique of improving the gun performance by installing a diaphragm in the pump tube of the gun. A CFD study is carried out for the 0.28 in. gun in the Hypervelocity Free Flight Radiation (HFF RAD) range at the NASA Ames Research Center. The normal, full-length pump tube is studied as well as two pump tubes of reduced length (approximately 75% and approximately 33% of the normal length). Significant improvements in performance are calculated to be gained for the reduced length pump tubes upon the addition of the diaphragm. These improvements are identified as reductions in maximum pressures in the pump tube and at the projectile base of approximately 20%, while maintaining the projectile muzzle velocity or as increases in muzzle velocity of approximately 0.5 km/sec while not increasing the maximum pressures in the gun. Also, it is found that both guns with reduced pump tube length (with diaphragms) could maintain the performance of gun with the full length pump tube without diaphragms, whereas the guns with reduced pump tube lengths without diaphragms could not. A five-shot experimental investigation of the pump tube diaphragm technique is carried out for the gun with a pump tube length of 75% normal. The CFD predictions of increased muzzle velocity are borne out by the experimental data. Modest, but useful muzzle velocity increases (2.5 - 6%) are obtained upon the installation of a diaphragm, compared to a benchmark shot without a diaphragm.

  20. Polycrystalline diamond RF MOSFET with MoO3 gate dielectric

    NASA Astrophysics Data System (ADS)

    Ren, Zeyang; Zhang, Jinfeng; Zhang, Jincheng; Zhang, Chunfu; Chen, Dazheng; Quan, Rudai; Yang, Jiayin; Lin, Zhiyu; Hao, Yue

    2017-12-01

    We report the radio frequency characteristics of the diamond metal-oxide-semiconductor field effect transistor with MoO3 gate dielectric for the first time. The device with 2-μm gate length was fabricated on high quality polycrystalline diamond. The maximum drain current of 150 mA/mm at VGS = -5 V and the maximum transconductance of 27 mS/mm were achieved. The extrinsic cutoff frequency of 1.2 GHz and the maximum oscillation frequency of 1.9 GHz have been measured. The moderate frequency characteristics are attributed to the moderate transconductance limited by the series resistance along the channel. We expect that the frequency characteristics of the device can be improved by increasing the magnitude of gm, or fundamentally decreasing the gate-controlled channel resistance and series resistance along the channel, and down-scaling the gate length.

  1. Molecular cloning, sequence analysis and phylogeny of first caudata g-type lysozyme in axolotl (Ambystoma mexicanum).

    PubMed

    Yu, Haining; Gao, Jiuxiang; Lu, Yiling; Guang, Huijuan; Cai, Shasha; Zhang, Songyan; Wang, Yipeng

    2013-11-01

    Lysozymes are key proteins that play important roles in innate immune defense in many animal phyla by breaking down the bacterial cell-walls. In this study, we report the molecular cloning, sequence analysis and phylogeny of the first caudate amphibian g-lysozyme: a full-length spleen cDNA library from axolotl (Ambystoma mexicanum). A goose-type (g-lysozyme) EST was identified and the full-length cDNA was obtained using RACE-PCR. The axolotl g-lysozyme sequence represents an open reading frame for a putative signal peptide and the mature protein composed of 184 amino acids. The calculated molecular mass and the theoretical isoelectric point (pI) of this mature protein are 21523.0 Da and 4.37, respectively. Expression of g-lysozyme mRNA is predominantly found in skin, with lower levels in spleen, liver, muscle, and lung. Phylogenetic analysis revealed that caudate amphibian g-lysozyme had a distinct evolutionary pattern, being juxtaposed not only with anuran amphibians but also with fish, birds and mammals. Although the first complete cDNA sequence for caudate amphibian g-lysozyme is reported in the present study, clones encoding axolotl's other functional immune molecules in the full-length cDNA library will have to be further sequenced to gain insight into the fundamental aspects of antibacterial mechanisms in caudates.

  2. Structure of the highly repeated, long interspersed DNA family (LINE or L1Rn) of the rat.

    PubMed Central

    D'Ambrosio, E; Waitzkin, S D; Witney, F R; Salemme, A; Furano, A V

    1986-01-01

    We present the DNA sequence of a 6.7-kilobase member of the rat long interspersed repeated DNA family (LINE or L1Rn). This member (LINE 3) is flanked by a perfect 14-base-pair (bp) direct repeat and is a full-length, or close-to-full-length, member of this family. LINE 3 contains an approximately 100-bp A-rich right end, a number of long (greater than 400-bp) open reading frames, and a ca. 200-bp G + C-rich (ca. 60%) cluster near each terminus. Comparison of the LINE 3 sequence with the sequence of about one-half of another member, which we also present, as well as restriction enzyme analysis of the genomic copies of this family, indicates that in length and overall structure LINE 3 is quite typical of the 40,000 or so other genomic members of this family which would account for as much as 10% of the rat genome. Therefore, the rat LINE family is relatively homogeneous, which contrasts with the heterogeneous LINE families in primates and mice. Transcripts corresponding to the entire LINE sequence are abundant in the nuclear RNA of rat liver. The characteristics of the rat LINE family are discussed with respect to the possible function and evolution of this family of DNA sequences. PMID:3023845

  3. Perceived empty duration between sounds of different lengths: Possible relation with repetition and rhythmic grouping.

    PubMed

    Kuroda, Tsuyoshi; Tomimatsu, Erika; Grondin, Simon; Miyazaki, Makoto

    2016-11-01

    We investigated how perceived duration of empty time intervals would be modulated by the length of sounds marking those intervals. Three sounds were successively presented in Experiment 1. Each sound was short (S) or long (L), and the temporal position of the middle sound's onset was varied. The lengthening of each sound resulted in delayed perception of the onset; thus, the middle sound's onset had to be presented earlier in the SLS than in the LSL sequence so that participants perceived the three sounds as presented at equal interonset intervals. In Experiment 2, a short sound and a long sound were alternated repeatedly, and the relative duration of the SL interval to the LS interval was varied. This repeated sequence was perceived as consisting of equal interonset intervals when the onsets of all sounds were aligned at physically equal intervals. If the same onset delay as in the preceding experiment had occurred, participants should have perceived equality between the interonset intervals in the repeated sequence when the SL interval was physically shortened relative to the LS interval. The effects of sound length seemed to be canceled out when the presentation of intervals was repeated. Finally, the perceived duration of the interonset intervals in the repeated sequence was not influenced by whether the participant's native language was French or Japanese, or by how the repeated sequence was perceptually segmented into rhythmic groups.

  4. Dynamic Energy Landscapes of Riboswitches Help Interpret Conformational Rearrangements and Function

    PubMed Central

    Quarta, Giulio; Sin, Ken; Schlick, Tamar

    2012-01-01

    Riboswitches are RNAs that modulate gene expression by ligand-induced conformational changes. However, the way in which sequence dictates alternative folding pathways of gene regulation remains unclear. In this study, we compute energy landscapes, which describe the accessible secondary structures for a range of sequence lengths, to analyze the transcriptional process as a given sequence elongates to full length. In line with experimental evidence, we find that most riboswitch landscapes can be characterized by three broad classes as a function of sequence length in terms of the distribution and barrier type of the conformational clusters: low-barrier landscape with an ensemble of different conformations in equilibrium before encountering a substrate; barrier-free landscape in which a direct, dominant “downhill” pathway to the minimum free energy structure is apparent; and a barrier-dominated landscape with two isolated conformational states, each associated with a different biological function. Sharing concepts with the “new view” of protein folding energy landscapes, we term the three sequence ranges above as the sensing, downhill folding, and functional windows, respectively. We find that these energy landscape patterns are conserved in various riboswitch classes, though the order of the windows may vary. In fact, the order of the three windows suggests either kinetic or thermodynamic control of ligand binding. These findings help understand riboswitch structure/function relationships and open new avenues to riboswitch design. PMID:22359488

  5. Modeling participation duration, with application to the North American Breeding Bird Survey

    USGS Publications Warehouse

    Link, William; Sauer, John

    2014-01-01

    We consider “participation histories,” binary sequences consisting of alternating finite sequences of 1s and 0s, ending with an infinite sequence of 0s. Our work is motivated by a study of observer tenure in the North American Breeding Bird Survey (BBS). In our analysis, j indexes an observer’s years of service and Xj is an indicator of participation in the survey; 0s interspersed among 1s correspond to years when observers did not participate, but subsequently returned to service. Of interest is the observer’s duration D = max {j: Xj = 1}. Because observed records X = (X1, X2, ..., Xn)′ are of finite length, all that we can directly infer about duration is that D ⩾ max {j ⩽ n: Xj = 1}; model-based analysis is required for inference about D. We propose models in which lengths of 0s and 1s sequences have distributions determined by the index j at which they begin; 0s sequences are infinite with positive probability, an estimable parameter. We found that BBS observers’ lengths of service vary greatly, with 25.3% participating for only a single year, 49.5% serving for 4 or fewer years, and an average duration of 8.7 years, producing an average of 7.7 counts.
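
    The bound on duration described in this abstract can be illustrated with a small sketch. It is not the authors' model; it only computes, for a finite observed record, the last year of participation (the directly inferable lower bound on D) and the alternating run lengths of 1s and 0s. The function names and the example record are hypothetical.

```python
from itertools import groupby


def observed_duration_bound(record):
    """Return max{j <= n : X_j = 1} using 1-based years, or 0 if the
    observer never participated in the observed window."""
    last = 0
    for j, xj in enumerate(record, start=1):
        if xj == 1:
            last = j
    return last


def run_lengths(record):
    """Split a binary record into (value, length) runs of 1s and 0s."""
    return [(value, len(list(group))) for value, group in groupby(record)]


# Hypothetical six-year record: two years of service, a gap, one more year.
record = [1, 1, 0, 1, 0, 0]
bound = observed_duration_bound(record)   # lower bound on D
runs = run_lengths(record)                # alternating runs of 1s and 0s
```

    The trailing run of 0s is the ambiguous part: from the finite record alone one cannot tell whether the observer has left for good or will return, which is why the abstract calls for model-based inference about D.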

  6. Differences in a ribosomal DNA sequence of Strongylus species allows identification of single eggs.

    PubMed

    Campbell, A J; Gasser, R B; Chilton, N B

    1995-03-01

    In the current study, molecular techniques were evaluated for the species identification of individual strongyle eggs. Adult worms of Strongylus edentatus, S. equinus and S. vulgaris were collected at necropsy from horses from Australia and the U.S.A. Genomic DNA was isolated and a ribosomal transcribed spacer (ITS-2) amplified and sequenced using polymerase chain reaction (PCR) techniques. The length of the ITS-2 sequence of S. edentatus, S. equinus and S. vulgaris ranged between 217 and 235 nucleotides. Extensive sequence analysis demonstrated a low degree (0-0.9%) of intraspecific variation in the ITS-2 for the Strongylus species examined, whereas the levels of interspecific differences (13-29%) were significantly greater. Interspecific differences in the ITS-2 sequences allowed unequivocal species identification of single worms and eggs using PCR-linked restriction fragment length polymorphism. These results demonstrate the potential of the ribosomal spacers as genetic markers for species identification of single strongyle eggs from horse faeces.

  7. Toward a Better Compression for DNA Sequences Using Huffman Encoding

    PubMed Central

    Almarri, Badar; Al Yami, Sultan; Huang, Chun-Hsi

    2017-01-01

    Due to the significant amount of DNA data being generated by next-generation sequencing machines for genomes of lengths ranging from megabases to gigabases, there is an increasing need to compress such data so that they occupy less space and transmit faster. Different implementations of Huffman encoding incorporating the characteristics of DNA sequences prove to better compress DNA data. These implementations center on the concepts of selecting frequent repeats so as to force a skewed Huffman tree, as well as the construction of multiple Huffman trees when encoding. The implementations demonstrate improvements on the compression ratios for five genomes with lengths ranging from 5 to 50 Mbp, compared with the standard Huffman tree algorithm. The research hence suggests an improvement on all such DNA sequence compression algorithms that use the conventional Huffman encoding. Accompanying software is publicly available (AL-Okaily, 2016). PMID:27960065

  8. Toward a Better Compression for DNA Sequences Using Huffman Encoding.

    PubMed

    Al-Okaily, Anas; Almarri, Badar; Al Yami, Sultan; Huang, Chun-Hsi

    2017-04-01

    Due to the significant amount of DNA data being generated by next-generation sequencing machines for genomes of lengths ranging from megabases to gigabases, there is an increasing need to compress such data so that they occupy less space and transmit faster. Different implementations of Huffman encoding incorporating the characteristics of DNA sequences prove to better compress DNA data. These implementations center on the concepts of selecting frequent repeats so as to force a skewed Huffman tree, as well as the construction of multiple Huffman trees when encoding. The implementations demonstrate improvements on the compression ratios for five genomes with lengths ranging from 5 to 50 Mbp, compared with the standard Huffman tree algorithm. The research hence suggests an improvement on all such DNA sequence compression algorithms that use the conventional Huffman encoding. Accompanying software is publicly available (AL-Okaily, 2016).
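
    For orientation, the conventional Huffman encoding that these abstracts use as a baseline can be sketched as follows. This is a minimal single-nucleotide version, not the repeat-based or multi-tree variants the authors describe; the function names and the toy sequence are assumptions made for illustration.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Build a prefix code {symbol: bitstring} from symbol frequencies."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, partial code table for the subtree).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        # Merging two subtrees prepends one bit to every code below them.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(text, code):
    return "".join(code[s] for s in text)


def decode(bits, code):
    inverse = {bitstring: sym for sym, bitstring in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:   # prefix-free, so the first match is correct
            out.append(inverse[current])
            current = ""
    return "".join(out)


# Toy sequence with a skewed base composition.
dna = "AAAAAAAACCCGGT"
code = huffman_code(dna)
bits = encode(dna, code)
```

    With only four symbols, a fixed 2-bit code is already close to optimal unless the composition is skewed, which is consistent with the abstracts' emphasis on frequent repeats rather than single bases as the units to encode.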

  9. A comparative study of working memory: immediate serial spatial recall in baboons (Papio papio) and humans.

    PubMed

    Fagot, Joël; De Lillo, Carlo

    2011-12-01

    Two experiments assessed if non-human primates can be meaningfully compared to humans in a non-verbal test of serial recall. A procedure was used that was derived from variations of the Corsi test, designed to test the effects of sequence structure and movement path length in humans. Two baboons were tested in Experiment 1. The monkeys showed several attributes of human serial recall. These included an easier recall of sequences with fewer items and of sequences characterized by a shorter path length when the number of items was kept constant. However, the accuracy and speed of processing did not indicate that the monkeys were able to benefit from the spatiotemporal structure of sequences. Humans tested in Experiment 2 showed a quantitatively longer memory span, and, in contrast with monkeys, benefitted from sequence structure. The results are discussed in relation to differences in how human and non-human primates segment complex visual patterns.

  10. Choice-specific sequences in parietal cortex during a virtual-navigation decision task

    PubMed Central

    Harvey, Christopher D.; Coen, Philip; Tank, David W.

    2012-01-01

    The posterior parietal cortex (PPC) plays an important role in many cognitive behaviors; however, the neural circuit dynamics underlying PPC function are not well understood. Here we optically imaged the spatial and temporal activity patterns of neuronal populations in mice performing a PPC-dependent task that combined a perceptual decision and memory-guided navigation in a virtual environment. Individual neurons had transient activation staggered relative to one another in time, forming a sequence of neuronal activation spanning the entire length of a task trial. Distinct sequences of neurons were triggered on trials with opposite behavioral choices and defined divergent, choice-specific trajectories through a state space of neuronal population activity. Cells participating in the different sequences and at distinct time points in the task were anatomically intermixed over microcircuit length scales (< 100 micrometers). During working memory decision tasks the PPC may therefore perform computations through sequence-based circuit dynamics, rather than long-lived stable states, implemented using anatomically intermingled microcircuits. PMID:22419153

  11. An evolution based biosensor receptor DNA sequence generation algorithm.

    PubMed

    Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M; Lee, Jaewan; Zang, Yupeng

    2010-01-01

    A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements.

  12. The Effects of Within-Sequence Acoustic Similarity on the Short-Term Retention of Consonants and Words

    ERIC Educational Resources Information Center

    Marcer, D.; And Others

    1977-01-01

    Compares the rates of forgetting of five-item sequences of acoustically similar and dissimilar consonants and words in the absence of proactive and retroactive interference in order to test whether within-sequence similarity rather than stimulus length would have a greater influence on retention. (Author/RK)

  13. Dr. Sanger's Apprentice: A Computer-Aided Instruction to Protein Sequencing.

    ERIC Educational Resources Information Center

    Schmidt, Thomas G.; Place, Allen R.

    1985-01-01

    Modeled after the program "Mastermind," this program teaches students the art of protein sequencing. The program (written in Turbo Pascal for the IBM PC, requiring 128K, a graphics adapter, and an 8087 mathematics coprocessor) generates a polypeptide whose sequence and length can be user-defined (for practice) or computer-generated (for…

  14. Quantitative analysis and prediction of G-quadruplex forming sequences in double-stranded DNA

    PubMed Central

    Kim, Minji; Kreig, Alex; Lee, Chun-Ying; Rube, H. Tomas; Calvert, Jacob; Song, Jun S.; Myong, Sua

    2016-01-01

    G-quadruplex (GQ) is a four-stranded DNA structure that can be formed in guanine-rich sequences. GQ structures have been proposed to regulate diverse biological processes including transcription, replication, translation and telomere maintenance. Recent studies have demonstrated the existence of GQ DNA in live mammalian cells and a significant number of potential GQ forming sequences in the human genome. We present a systematic and quantitative analysis of GQ folding propensity on a large set of 438 GQ forming sequences in double-stranded DNA by integrating fluorescence measurement, single-molecule imaging and computational modeling. We find that short minimum loop length and the thymine base are two main factors that lead to high GQ folding propensity. Linear and Gaussian process regression models further validate that the GQ folding potential can be predicted with high accuracy based on the loop length distribution and the nucleotide content of the loop sequences. Our study provides important new parameters that can inform the evaluation and classification of putative GQ sequences in the human genome. PMID:27095201

  15. High resolution identity testing of inactivated poliovirus vaccines

    PubMed Central

    Mee, Edward T.; Minor, Philip D.; Martin, Javier

    2015-01-01

    Background: Definitive identification of poliovirus strains in vaccines is essential for quality control, particularly where multiple wild-type and Sabin strains are produced in the same facility. Sequence-based identification provides the ultimate in identity testing and would offer several advantages over serological methods. Methods: We employed random RT-PCR and high throughput sequencing to recover full-length genome sequences from monovalent and trivalent poliovirus vaccine products at various stages of the manufacturing process. Results: All expected strains were detected in previously characterised products and the method permitted identification of strains comprising as little as 0.1% of sequence reads. Highly similar Mahoney and Sabin 1 strains were readily discriminated on the basis of specific variant positions. Analysis of a product known to contain incorrect strains demonstrated that the method correctly identified the contaminants. Conclusion: Random RT-PCR and shotgun sequencing provided high resolution identification of vaccine components. In addition to the recovery of full-length genome sequences, the method could also be easily adapted to the characterisation of minor variant frequencies and distinction of closely related products on the basis of distinguishing consensus and low frequency polymorphisms. PMID:26049003

  16. Nucleotide sequence of an exceptionally long 5.8S ribosomal RNA from Crithidia fasciculata.

    PubMed Central

    Schnare, M N; Gray, M W

    1982-01-01

    In Crithidia fasciculata, a trypanosomatid protozoan, the large ribosomal subunit contains five small RNA species (e, f, g, i, j) in addition to 5S rRNA [Gray, M.W. (1981) Mol. Cell. Biol. 1, 347-357]. The complete primary sequence of species i is shown here to be pAACGUGUmCGCGAUGGAUGACUUGGCUUCCUAUCUCGUUGA ... AGAmACGCAGUAAAGUGCGAUAAGUGGUApsiCAAUUGmCAGAAUCAUUCAAUUACCGAAUCUUUGAACGAAACGG ... CGCAUGGGAGAAGCUCUUUUGAGUCAUCCCCGUGCAUGCCAUAUUCUCCAmGUGUCGAA(C)OH. This sequence establishes that species i is a 5.8S rRNA, despite its exceptional length (171-172 nucleotides). The extra nucleotides in C. fasciculata 5.8S rRNA are located in a region whose primary sequence and length are highly variable among 5.8S rRNAs, but which is capable of forming a stable hairpin loop structure (the "G+C-rich hairpin"). The sequence of C. fasciculata 5.8S rRNA is no more closely related to that of another protozoan, Acanthamoeba castellanii, than it is to representative 5.8S rRNA sequences from the other eukaryotic kingdoms, emphasizing the deep phylogenetic divisions that seem to exist within the Kingdom Protista. PMID:7079176

  17. Identifying the Basal Angiosperm Node in Chloroplast Genome Phylogenies: Sampling One's Way Out of the Felsenstein Zone

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leebens-Mack, Jim; Raubeson, Linda A.; Cui, Liying

    2005-05-27

    While there has been strong support for Amborella and Nymphaeales (water lilies) as branching from basal-most nodes in the angiosperm phylogeny, this hypothesis has recently been challenged by phylogenetic analyses of 61 protein-coding genes extracted from the chloroplast genome sequences of Amborella, Nymphaea and 12 other available land plant chloroplast genomes. These character-rich analyses placed the monocots, represented by three grasses (Poaceae), as sister to all other extant angiosperm lineages. We have extracted protein-coding regions from draft sequences for six additional chloroplast genomes to test whether this surprising result could be an artifact of long-branch attraction due to limited taxon sampling. The added taxa include three monocots (Acorus, Yucca and Typha), a water lily (Nuphar), a ranunculid (Ranunculus), and a gymnosperm (Ginkgo). Phylogenetic analyses of the expanded DNA and protein datasets together with microstructural characters (indels) provided unambiguous support for Amborella and the Nymphaeales as branching from the basal-most nodes in the angiosperm phylogeny. However, their relative positions proved to be dependent on method of analysis, with parsimony favoring Amborella as sister to all other angiosperms, and maximum likelihood and neighbor-joining methods favoring an Amborella + Nymphaeales clade as sister. The maximum likelihood phylogeny supported the latter hypothesis, but the likelihood for the former hypothesis was not significantly different. Parametric bootstrap analysis, single gene phylogenies, estimated divergence dates and conflicting indel characters all help to illuminate the nature of the conflict in resolution of the most basal nodes in the angiosperm phylogeny. Molecular dating analyses provided median age estimates of 161 mya for the most recent common ancestor of all extant angiosperms and 145 mya for the most recent common ancestor of monocots, magnoliids and eudicots. Whereas long sequences reduce variance in branch lengths and molecular dating estimates, the impact of improved taxon sampling on the rooting of the angiosperm phylogeny together with the results of parametric bootstrap analyses demonstrate how long-branch attraction can mislead genome-scale phylogenetic analyses.

  18. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

    PubMed Central

    Marques, M Carmen; Alonso-Cantabrana, Hugo; Forment, Javier; Arribas, Raquel; Alamar, Santiago; Conejero, Vicente; Perez-Amador, Miguel A

    2009-01-01

    Background: Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results: We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. Conclusion: The new EST collection denotes an important step towards the identification of all genes in the citrus genome. Furthermore, public availability of the cDNA clones generated in this study, and not only their sequence, enables testing of the biological function of the genes represented in the collection. Expression of the citrus SEP3 homologue, CitrSEP, in Arabidopsis results in early flowering, along with other phenotypes resembling the over-expression of the Arabidopsis SEPALLATA genes. Our findings suggest that the members of the SEP gene family play similar roles in these quite distant plant species. PMID:19747386

  19. Trough Coating Solar Cells Without Spillover

    NASA Technical Reports Server (NTRS)

    Heaps, J. D.

    1986-01-01

    Problem with trough coating of silicon on ceramic - spillover of molten silicon - overcome by combination of redesigned heaters and tiltable trough. Modifications make it possible to coat virtually any length of ceramic with film of solar-cell-grade silicon. Previously, maximum length coated before spillover occurred was 2 inches (5.1 cm).

  20. Using Maximum Entropy to Find Patterns in Genomes

    NASA Astrophysics Data System (ADS)

    Liu, Sophia; Hockenberry, Adam; Lancichinetti, Andrea; Jewett, Michael; Amaral, Luis

    The existence of over- and under-represented sequence motifs in genomes provides evidence of selective evolutionary pressures on biological mechanisms such as transcription, translation, ligand-substrate binding, and host immunity. To accurately identify motifs and other genome-scale patterns of interest, it is essential to be able to generate accurate null models that are appropriate for the sequences under study. There are currently no tools available that allow users to create random coding sequences with specified amino acid composition and GC content. Using the principle of maximum entropy, we developed a method that generates unbiased random sequences with pre-specified amino acid and GC content. Our method is the simplest way to obtain maximally unbiased random sequences that are subject to GC usage and primary amino acid sequence constraints. This approach can also easily be expanded to create unbiased random sequences that incorporate more complicated constraints such as individual nucleotide usage or even di-nucleotide frequencies. The ability to generate correctly specified null models will allow researchers to accurately identify sequence motifs, which will lead to a better understanding of biological processes. National Institute of General Medical Science, Northwestern University Presidential Fellowship, National Science Foundation, David and Lucile Packard Foundation, Camille Dreyfus Teacher Scholar Award.
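
    One way to picture the maximum-entropy idea in this abstract is the sketch below. It is an assumption-laden illustration, not the authors' tool: it fixes the amino acid sequence exactly, matches the GC content only in expectation, and uses a small hypothetical subset of the standard codon table. Under a constraint on mean GC content, the maximum-entropy distribution over synonymous codons is proportional to exp(lambda * GC count), and lambda is found here by bisection.

```python
import math
import random

# Illustrative subset of the standard genetic code (six amino acids only).
SYNONYMOUS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
}


def gc_count(codon):
    return sum(base in "GC" for base in codon)


def codon_weights(aa, lam):
    """Unnormalised maximum-entropy weights exp(lam * GC) per codon."""
    return [math.exp(lam * gc_count(c)) for c in SYNONYMOUS[aa]]


def expected_gc(protein, lam):
    """Expected GC fraction of a coding sequence sampled at this lam."""
    total = 0.0
    for aa in protein:
        weights = codon_weights(aa, lam)
        z = sum(weights)
        total += sum(w * gc_count(c) for w, c in zip(weights, SYNONYMOUS[aa])) / z
    return total / (3 * len(protein))


def solve_lambda(protein, target_gc, lo=-20.0, hi=20.0, iterations=100):
    """Bisection on lam; expected_gc is non-decreasing in lam."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if expected_gc(protein, mid) < target_gc:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def sample_coding_sequence(protein, target_gc, rng):
    lam = solve_lambda(protein, target_gc)
    return "".join(
        rng.choices(SYNONYMOUS[aa], weights=codon_weights(aa, lam))[0]
        for aa in protein
    )


protein = "MKFGAL"   # hypothetical peptide; target must be attainable for it
lam = solve_lambda(protein, 0.45)
seq = sample_coding_sequence(protein, 0.45, random.Random(0))
```

    Individual samples will scatter around the target GC fraction; enforcing it exactly, or adding di-nucleotide constraints as the abstract mentions, would need a different sampler.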
