Sample records for sequencing marssim final

  1. Press Oil Final Release Survey

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Whicker, Jeffrey Jay; Ruedig, Elizabeth

    There are forty-eight 55 gallon barrels filled with hydraulic oil that are candidates for release and recycle. This oil needs to be characterized prior to release. Principles of sampling as provided in MARSAME/MARSSIM approaches were used as guidance for sampling.

  2. Multi-Agency Radiation Survey and Site Investigation Manual (MARSSIM)

    EPA Pesticide Factsheets

    The Multi-Agency Radiation Survey and Site Investigation Manual (MARSSIM) provides detailed guidance on how to demonstrate that a site is in compliance with a radiation dose- or risk-based regulation.

  3. TECHNICAL BASIS DOCUMENT OF MARSSIM FIELD CALIBRATION FOR QUANTIFICATION OF CS-137 VOLUMETRICALLY CONTAMINATED SOILS IN THE BC CONTROLLED AREA USING A 4 BY 4 BY 16 INCH SODIUM IODIDE DETECTOR

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    PAPPIN JL

    The purpose of this paper is to provide the Technical Basis and Documentation for Field Calibrations of radiation measurement equipment for use in the MARSSIM Scoping Surveys of the BC Controlled Area (BCCA). The BC Controlled Area is bounded on the north by (but does not include) the BC Cribs & Trenches and is bounded on the south by Army Loop Road. Parts of the BC Controlled Area are posted as a Contamination Area and the remainder is posted as a Soil Contamination Area. The area is approximately 13 square miles and divided into three zones (Zone A, Zone B, and Zone C). A map from reference 1 which shows the 3 zones is attached. The MARSSIM Scoping Surveys are intended to better identify the boundaries of the three zones based on the volumetric (pCi/g) contamination levels in the soil. The MARSSIM Field Calibration, reference 2, of radiation survey instrumentation will determine the Minimum Detectable Concentration (MDC) and an algorithm for converting counts to pCi/g. The instrumentation and corresponding results are not intended for occupational radiation protection decisions or for the release of property per DOE Order 5400.5.
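
    As a rough illustration of the two quantities named in this abstract, the sketch below computes an MDC with the widely used MARSSIM-style expression (3 + 4.65 * sqrt(B)) and converts counts to pCi/g with a single linear calibration factor. The record does not give the report's actual algorithm, so the linear conversion, the function names, and the parameter `cal_factor_cpm_per_pci_g` are assumptions made for illustration only. The expression also assumes equal counting times for the background and the measurement.

```python
import math


def mdc_pci_per_g(background_counts, count_time_min, cal_factor_cpm_per_pci_g):
    """Minimum detectable concentration in pCi/g (illustrative sketch).

    background_counts: counts expected from background during the count time.
    count_time_min: counting time in minutes.
    cal_factor_cpm_per_pci_g: hypothetical field calibration factor,
        net counts per minute per pCi/g (an assumption, not from the report).
    """
    detection_limit_counts = 3.0 + 4.65 * math.sqrt(background_counts)
    return detection_limit_counts / (count_time_min * cal_factor_cpm_per_pci_g)


def counts_to_pci_per_g(gross_counts, background_counts, count_time_min,
                        cal_factor_cpm_per_pci_g):
    """Convert gross counts to pCi/g, assuming a linear detector response."""
    net_cpm = (gross_counts - background_counts) / count_time_min
    return net_cpm / cal_factor_cpm_per_pci_g
```

    A real field calibration would derive the calibration factor from measurements of soils with known Cs-137 concentration; the single-factor form here is only the simplest possible stand-in.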

  4. Final Status Survey Report for Corrective Action Unit 117 - Pluto Disassembly Facility, Building 2201, Nevada National Security Site, Nevada

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jeremy Gwin and Douglas Frenette

    This document contains the process knowledge, radiological data and subsequent statistical methodology and analysis to support approval for the radiological release of Corrective Action Unit (CAU) 117 – Pluto Disassembly Facility, Building 2201 located in Area 26 of the Nevada National Security Site (NNSS). Preparations for release of the building began in 2009 and followed the methodology described in the Multi-Agency Radiation Survey and Site Investigation Manual (MARSSIM). MARSSIM is the DOE approved process for release of Real Property (buildings and landmasses) to a set of established criteria or authorized limits. The pre-approved authorized limits for surface contamination values and corresponding assumptions were established by DOE O 5400.5. The release criteria coincide with the acceptance criteria of the U10C landfill permit. The U10C landfill is the proposed location to dispose of the radiologically non-impacted, or “clean,” building rubble following demolition. However, other disposition options that include the building and/or waste remaining at the NNSS may be considered providing that the same release limits apply. The Final Status Survey was designed following MARSSIM guidance by reviewing historical documentation and radiological survey data. Following this review a formal radiological characterization survey was performed in two phases. The characterization revealed multiple areas of residual radioactivity above the release criteria. These locations were remediated (decontaminated) and then the surface activity was verified to be less than the release criteria. Once remediation efforts had been successfully completed, a Final Status Survey Plan (10-015, “Final Status Survey Plan for Corrective Action Unit 117 – Pluto Disassembly Facility, Building 2201”) was developed and implemented to complete the final step in the MARSSIM process, the Final Status Survey.
The Final Status Survey Plan consisted of categorizing each individual room into one of three categories: Class 1, Class 2 or Class 3 (a fourth category is a “Non-Impacted Class” which in the case of Building 2201 only pertained to exterior surfaces of the building.) The majority of the rooms were determined to fall in the less restrictive Class 3 category, however, Rooms 102, 104, 106, and 107 were identified as containing Class 1 and 2 areas. Building 2201 was divided into “survey units” and surveyed following the requirements of the Final Status Survey Plan for each particular class. As each survey unit was completed and documented, the survey results were evaluated. Each sample (static measurement) with units of counts per minute (cpm) was corrected for the appropriate background and converted to a value with units of dpm/100 cm². With a surface contamination value in the appropriate units, it was compared to the surface contamination limits, or in this case the derived concentration guideline level (DCGLw). The appropriate statistical test (sign test) was then performed. If the survey unit was statistically determined to be below the DCGLw, then the survey unit passed and the null hypothesis (that the survey unit is above limits) was rejected. If the survey unit was equal to or below the critical value in the sign test, the null hypothesis was not rejected. This process was performed for all survey units within Building 2201. A total of thirty-three “Class 1,” four “Class 2,” and one “Class 3” survey units were developed, surveyed, and evaluated. All survey units successfully passed the statistical test. Building 2201 meets the release criteria commensurate with the Waste Acceptance Criteria (for radiological purposes) of the U10C landfill permit residing within NNSS boundaries. Based on the thorough statistical sampling and scanning of the building’s interior, Building 2201 may be considered radiologically “clean,” or free of contamination.
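
    The evaluation steps described in this abstract (background correction, conversion to dpm/100 cm2, comparison with the DCGLw, and the sign test) can be sketched as follows. This is a minimal illustration, not the report's procedure: the detector efficiency and probe area parameters are assumptions, and the critical value is computed from the binomial distribution rather than read from a published table.

```python
from math import comb


def cpm_to_dpm_per_100cm2(gross_cpm, background_cpm, efficiency, probe_area_cm2):
    """Background-correct a static count and convert it to dpm/100 cm2.

    efficiency (counts per disintegration) and probe_area_cm2 are
    hypothetical instrument parameters, not values from the report.
    """
    net_cpm = gross_cpm - background_cpm
    return net_cpm / (efficiency * probe_area_cm2 / 100.0)


def sign_test_critical_value(n, alpha):
    """Smallest k such that P(S+ > k) <= alpha for S+ ~ Binomial(n, 0.5)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be between 0 and 1")
    tail = 0.0  # P(S+ > k), starting at k = n where the tail is empty
    for k in range(n, -1, -1):
        next_tail = tail + comb(n, k) / 2 ** n  # P(S+ > k - 1)
        if next_tail > alpha:
            return k
        tail = next_tail
    return 0


def survey_unit_passes(measurements, dcgl_w, alpha=0.05):
    """Sign test with null hypothesis 'the survey unit exceeds the DCGLw'.

    Returns True when the null hypothesis is rejected (the unit passes).
    Measurements exactly equal to the DCGLw are discarded.
    """
    diffs = [dcgl_w - x for x in measurements if x != dcgl_w]
    s_plus = sum(1 for d in diffs if d > 0)  # results below the DCGLw
    return s_plus > sign_test_critical_value(len(diffs), alpha)
```

    Consistent with the abstract, a test statistic equal to or below the critical value leaves the null hypothesis in place, so `survey_unit_passes` returns False in that case.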

  5. MARSAME Manual and Resources

    EPA Pesticide Factsheets

    MARSAME provides technical information on survey approaches to determine proper disposition of materials and equipment (M&E). MARSAME is a supplement to the Multi-Agency Radiation Survey and Site Investigation Manual (MARSSIM).

  6. REVIEW OF MULTI-AGENCY RADIATION SURVEY & SITE INVESTIGATION MANUAL (MARSSIM) SUPPLEMENT: MULTI-AGENCY RADIATION SURVEY AND ASSESSMENT OF MATERIALS AND EQUIPMENT (MARSAME)

    EPA Science Inventory

    Radioactive materials have been produced, processed, used, and transported amongst thousands of sites throughout the United States. Owners and operators of these sites would like to determine if materials or equipment on these sites are contaminated with radioactive materials, i...

  7. Method for Implementing Subsurface Solid Derived Concentration Guideline Levels (DCGL) - 12331

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lively, J.W.

    2012-07-01

    The U.S. Nuclear Regulatory Commission (NRC) and other federal agencies currently approve the Multi-Agency Radiation Survey and Site Investigation Manual (MARSSIM) as guidance for licensees who are conducting final radiological status surveys in support of decommissioning. MARSSIM provides a method to demonstrate compliance with the applicable regulation by comparing residual radioactivity in surface soils with derived concentration guideline levels (DCGLs), but specifically discounts its applicability to subsurface soils. Many sites and facilities undergoing decommissioning contain subsurface soils that are potentially impacted by radiological constituents. In the absence of specific guidance designed to address the derivation of subsurface soil DCGLs and compliance demonstration, decommissioning facilities have attempted to apply DCGLs and final status survey techniques designed specifically for surface soils to subsurface soils. The decision to apply surface soil limits and surface soil compliance metrics to subsurface soils typically results in significant over-excavation with associated cost escalation. MACTEC, Inc. has developed the overarching concepts and principles found in recent NRC decommissioning guidance in NUREG 1757 to establish a functional method to derive dose-based subsurface soil DCGLs. The subsurface soil method developed by MACTEC also establishes a rigorous set of criterion-based data evaluation metrics (with analogs to the MARSSIM methodology) that can be used to demonstrate compliance with the developed subsurface soil DCGLs. The method establishes a continuum of volume factors that relate the size and depth of a volume of subsurface soil having elevated concentrations of residual radioactivity with its ability to produce dose.
The method integrates the subsurface soil sampling regime with the derivation of the subsurface soil DCGL such that a self-regulating optimization is naturally sought by both the responsible party and regulator. This paper describes the concepts and basis used by MACTEC to develop the dose-based subsurface soil DCGL method. The paper will show how MACTEC's method can be used to demonstrate that higher concentrations of residual radioactivity in subsurface soils (as compared with surface soils) can meet the NRC's dose-based regulations. MACTEC's method has been used successfully to obtain the NRC's radiological release at a site with known radiological impacts to subsurface soils exceeding the surface soil DCGL, saving both time and cost. Having considered the current NRC guidance for consideration of residual radioactivity in subsurface soils during decommissioning, MACTEC has developed a technically based approach to the derivation of and demonstration of compliance with subsurface soil DCGLs for radionuclides. In fact, the process uses the already accepted concepts and metrics approved for surface soils as the foundation for deriving scaling factors used to calculate subsurface soil DCGLs that are at least equally protective of the decommissioning annual dose standard. Each of the elements identified for consideration in the current NRC guidance is addressed in this proposed method. Additionally, there is considerable conservatism built into the assumptions and techniques used to arrive at subsurface soil scaling factors and DCGLs. The degree of conservatism embodied in the approach used is such that risk managers and decision makers approving and using subsurface soil DCGLs derived in accordance with this method can be confident that the future exposures will be well below permissible and safe levels. The technical basis for the method can be applied to a broad variety of sites with residual radioactivity in subsurface soils. 
Given the costly nature of soil surveys, excavation, and disposal of soils as low-level radioactive waste, MACTEC's method for deriving and demonstrating compliance with subsurface soil DCGLs offers the possibility of significant cost savings over the traditional approach of applying surface soil DCGLs to subsurface soils. Furthermore, while yet untested, MACTEC believes that the concepts and methods embodied in this approach could readily be applied to other types of contamination found in subsurface soils. (author)

  8. USE OF THE AERIAL MEASUREMENT SYSTEM HELICOPTER EMERGENCY RESPONSE ACQUISITION SYSTEMS WITH GEOGRAPHIC INFORMATION SYSTEM FOR RADIOACTIVE SOIL REMEDIATION - 11504

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    BROCK CT

    2011-02-15

    The Aerial Measurement System (AMS) Helicopter Emergency Response Acquisition System provides a thorough and economical means to identify and characterize the contaminants for large area radiological surveys. The helicopter system can provide a 100-percent survey of an area that qualifies as a scoping survey under the Multi-Agency Radiation Survey and Site Investigation Manual (MARSSIM) methodology. If the sensitivity is adequate when compared to the clean up values, it may also be used for the characterization survey. The data from the helicopter survey can be displayed and manipulated to provide invaluable data during remediation activities.

  9. Multiresolution analysis of characteristic length scales with high-resolution topographic data

    NASA Astrophysics Data System (ADS)

    Sangireddy, Harish; Stark, Colin P.; Passalacqua, Paola

    2017-07-01

    Characteristic length scales (CLS) define landscape structure and delimit geomorphic processes. Here we use multiresolution analysis (MRA) to estimate such scales from high-resolution topographic data. MRA employs progressive terrain defocusing, via convolution of the terrain data with Gaussian kernels of increasing standard deviation, and calculation at each smoothing resolution of (i) the probability distributions of curvature and topographic index (defined as the ratio of slope to area in log scale) and (ii) characteristic spatial patterns of divergent and convergent topography identified by analyzing the curvature of the terrain. The MRA is first explored using synthetic 1-D and 2-D signals whose CLS are known. It is then validated against a set of MARSSIM (a landscape evolution model) steady state landscapes whose CLS were tuned by varying hillslope diffusivity and simulated noise amplitude. The known CLS match the scales at which the distributions of topographic index and curvature show scaling breaks, indicating that the MRA can identify CLS in landscapes based on the scaling behavior of topographic attributes. Finally, the MRA is deployed to measure the CLS of five natural landscapes using meter resolution digital terrain model data. CLS are inferred from the scaling breaks of the topographic index and curvature distributions and equated with (i) small-scale roughness features and (ii) the hillslope length scale.
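
    The progressive defocusing step described in this abstract can be illustrated on a synthetic 1-D profile. The sketch below smooths a profile with Gaussian kernels of increasing standard deviation and summarizes the curvature at each scale. It is only an illustration of the idea: the second difference stands in for curvature, the spread (standard deviation) stands in for the full probability distribution, edge values are clamped, and the topographic index is not computed. None of the function names come from the paper.

```python
import math


def gaussian_kernel(sigma, truncate=4.0):
    """Normalized 1-D Gaussian kernel truncated at `truncate` sigmas."""
    radius = max(1, int(truncate * sigma + 0.5))
    weights = [math.exp(-0.5 * (i / sigma) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(profile, sigma):
    """Convolve a 1-D profile with a Gaussian kernel, clamping at the edges."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(profile)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)
            acc += w * profile[idx]
        smoothed.append(acc)
    return smoothed


def second_difference(profile, dx=1.0):
    """Discrete second derivative, used here as a simple curvature proxy."""
    return [(profile[i - 1] - 2.0 * profile[i] + profile[i + 1]) / dx ** 2
            for i in range(1, len(profile) - 1)]


def curvature_spread_by_scale(profile, sigmas, dx=1.0):
    """Standard deviation of the curvature proxy at each smoothing scale."""
    spread = {}
    for sigma in sigmas:
        curv = second_difference(smooth(profile, sigma), dx)
        mean = sum(curv) / len(curv)
        spread[sigma] = math.sqrt(sum((c - mean) ** 2 for c in curv) / len(curv))
    return spread
```

    On a profile that mixes a long-wavelength and a short-wavelength component, the curvature spread drops as the smoothing scale passes each wavelength, which is the kind of scale-dependent change the abstract refers to as a scaling break.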

  10. A multi-resolution analysis of lidar-DTMs to identify geomorphic processes from characteristic topographic length scales

    NASA Astrophysics Data System (ADS)

    Sangireddy, H.; Passalacqua, P.; Stark, C. P.

    2013-12-01

    Characteristic length scales are often present in topography, and they reflect the driving geomorphic processes. The wide availability of high resolution lidar Digital Terrain Models (DTMs) allows us to measure such characteristic scales, but new methods of topographic analysis are needed in order to do so. Here, we explore how transitions in probability distributions (pdfs) of topographic variables such as (log(area/slope)), defined as topoindex by Beven and Kirkby [1979], can be measured by Multi-Resolution Analysis (MRA) of lidar DTMs [Stark and Stark, 2001; Sangireddy et al., 2012] and used to infer dominant geomorphic processes such as non-linear diffusion and critical shear. We show this correlation between dominant geomorphic processes and characteristic length scales by comparing results from a landscape evolution model to natural landscapes. The landscape evolution model MARSSIM [Howard, 1994] includes components for modeling rock weathering, mass wasting by non-linear creep, detachment-limited channel erosion, and bedload sediment transport. We use MARSSIM to simulate steady state landscapes for a range of hillslope diffusivity and critical shear stresses. Using the MRA approach, we estimate modal values and inter-quartile ranges of slope, curvature, and topoindex as a function of resolution. We also construct pdfs at each resolution and identify and extract characteristic scale breaks. Following the approach of Tucker et al. [2001], we measure the average length to channel from ridges, within the GeoNet framework developed by Passalacqua et al. [2010] and compute pdfs for hillslope lengths at each scale defined in the MRA. We compare the hillslope diffusivity used in MARSSIM against inter-quartile ranges of topoindex and hillslope length scales, and observe power law relationships between the compared variables for simulated landscapes at steady state.
We plot similar measures for natural landscapes and are able to qualitatively infer the dominant geomorphic processes. Also, we explore the variability in hillslope length scales as a function of hillslope diffusivity coefficients and critical shear stress in natural landscapes and show that we can infer signatures of dominant geomorphic processes by analyzing characteristic topographic length scales present in topography. References: Beven, K., & Kirkby, M. J. (1979). A physically based variable contributing area model of basin hydrology. Hydrol. Sci. Bull., 24, 43-69. Howard, A. D. (1994). A detachment-limited model of drainage basin evolution. Water Resources Research, 30(7), 2261-2285. Passalacqua, P., Do Trung, T., Foufoula Georgiou, E., Sapiro, G., & Dietrich, W. E. (2010). A geometric framework for channel network extraction from lidar: Nonlinear diffusion and geodesic paths. Journal of Geophysical Research: Earth Surface (2003-2012), 115(F1). Sangireddy, H., Passalacqua, P., & Stark, C. P. (2012). Multi-resolution estimation of lidar-DTM surface flow metrics to identify characteristic topographic length scales, EP13C-0859: AGU Fall meeting 2012. Stark, C. P., & Stark, G. J. (2001). A channelization model of landscape evolution. American Journal of Science, 301(4-5), 486-512. Tucker, G. E., Catani, F., Rinaldo, A., & Bras, R. L. (2001). Statistical analysis of drainage density from digital terrain data. Geomorphology, 36(3), 187-202.

  11. Sediment Transport and Landscape Evolution on Comet 67P/Churyumov-Gerasimenko

    NASA Astrophysics Data System (ADS)

    Birch, S.; Umurhan, O. M.; Hayes, A.; Tang, Y.; Moore, J. M.; White, O. L.

    2017-12-01

    New observations from ESA's Rosetta orbiter of comet 67P/Churyumov-Gerasimenko (67P) have revolutionized our understanding of these primitive bodies and the processes that act to modify their surfaces. Centimeter to meter scale images of the surface of 67P have revealed a diverse sedimentary world, where the dominant landforms consist of vertical, consolidated cliffs and pits interspersed, and in the northern hemisphere buried, by smooth, decameter thick sedimentary deposits. Sublimation erosion, in the form of jets, from exposed cliff faces acts to break off parts of the weakened bedrock material, which then accumulate as mass wasting deposits at the cliff bases. The large boulders within these deposits may also contribute to the jets, as volatiles in exposed faces of the boulders, previously hidden from the Sun, can sublimate away. During a jet event, the less volatile material that does not escape the comet falls back and drapes the rocky surface as smooth deposits. This is particularly evident in the northern hemisphere of 67P and within gravitational lows, where the underlying consolidated material appears to outcrop from underneath a vast cover of sedimentary deposits. These sedimentary materials, having a low thermal inertia, counteract the erosive process and allow the surface of 67P to retain a relatively primitive form to the current day. To understand this process quantitatively, and constrain over what timescale(s) the surface of 67P evolves, we utilized high-resolution photoclinometry digital terrain models (14 cm/pixel), and the MARSSIM landscape evolution model, adapted for a low, and variable gravity environment. Perfectly suited to model sublimation erosion and mass-wasting, MARSSIM also allows us to track the re-condensation of non-volatile materials to accurately account for the important feedback played by the sedimentary deposits.
These simulations will allow for us to constrain the rates of landscape evolution on 67P, to compare directly to observations of dynamic changes on the nucleus. Through this work, we will also be able to assess the question of whether 67P is primitive or not, using reasonable assumptions as to the volatility and strength of the bedrock materials.

  12. Sampling and Analysis Plan for Assessment of LANL-Derived Residual Radionuclides in Soils within Tract A-16-e for Land Conveyance

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gillis, Jessica; Ruedig, Elizabeth

    2016-08-25

    Public Law 105-119 directs the U.S. Department of Energy (DOE) to convey or transfer parcels of land to the Incorporated County of Los Alamos or their designees and to the Department of Interior, Bureau of Indian Affairs, in trust for the Pueblo de San Ildefonso. Los Alamos National Security is tasked to support DOE in conveyance and/or transfer of identified land parcels no later than September 2022. Under DOE Order 458.1, Radiation Protection of the Public and the Environment (O458.1, 2013), and Los Alamos National Laboratory (LANL) implementing Policy 412 (P412, 2014a), real property with the potential to contain residual radioactive material must meet the criteria for clearance and release to the public. This Sampling and Analysis Plan (SAP) investigates Tract A-16-e and proposes 50 project-specific soil samples for use in radiological clearance decisions consistent with LANL Procedure ENV-ES-TP-238 (2015a) and guidance in the Multi-Agency Radiation Survey and Site Investigation Manual (MARSSIM, 2000).

  13. Integrating the Clearance in NPP Residual Material Management

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Garcia-Bermejo, R.; Lamela, B.

    Previous Experiences in decommissioning projects are being used to optimize the residual material management in NPP, metallic scrap usually. The approach is based in the availability of a materials Clearance MARSSIM-based methodology developed and licensed in Spain. A typical project includes the integration of segregation, decontamination, clearance, quality control and quality assurance activities. The design is based in the clearance methodology features translating them into standard operational procedures. In terms of ecological taxes and final disposal costs, significant amounts of money could be saved with this type of approaches. The last clearance project managed a total amount of 405 tons scrap metal and a similar amount of other residual materials occupying a volume of 1500 m³. After less than a year of field works 251 tons were finally recycled in a non-licensed smelting facility. The balance was disposed as LILW. In the planning phase the estimated cost savings were 4.5 Meuro. However, today a VLLW option is available in European countries so, the estimated cost savings are reduced to 1.2 Meuro. In conclusion: the application of materials clearance in NPP decommissioning lessons learnt to the NPP residual material management is an interesting management option. This practice is currently going on in Spanish NPP and, in a preliminary view, is consistent with the new MARSAME Draft. An interesting parameter is the cost of 1 m³ of recyclable scrap. The above estimates are very project specific because in the segregation process other residual materials were involved. If the effect of this other materials is removed the estimated Unit Cost were in this project around 1700 euro/m³, this figure is clearly below the above VLLW disposal cost of 2600 euro.
In a future project it appears feasible to descend to 839 euro/m³ and if it became routine values and is used in big Decommissioning projects, around 600 euro/m³ or below possibly could be achieved. A rough economical analysis permits to estimate a saving around 2000 US$ to 13000 US$ per cubic meter of steel scrap according the variability of materials and disposal costs. Many learnt lessons of this practice were used as a feed back in the planning of characterization activities for decommissioning a Spanish NPP and today are considered as a significant reference in our Decommissioning engineering approaches.

  14. Soil Segregation Methods for Reducing Transportation and Disposal Costs - 13544

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Frothingham, David; Andrews, Shawn; Barker, Michelle

    2013-07-01

    At Formerly Utilized Sites Remedial Action Program (FUSRAP) sites where the selected alternative for contaminated soil is excavation and off-site disposal, the most significant budget items of the remedial action are the costs for transportation and disposal of soil at an off-site facility. At these sites, the objective is to excavate and dispose of only those soils that exceed derived concentration guideline levels. In situ soil segregation using gross gamma detectors to guide the excavation is often challenging at sites where the soil contamination is overlain by clean soil or where the contaminated soil is located in isolated, subsurface pockets. In addition, data gaps are often identified during the alternative evaluation and selection process, resulting in increased uncertainty in the extent of subsurface contamination. In response, the U.S. Army Corps of Engineers, Buffalo District is implementing ex situ soil segregation methods. At the remediated Painesville Site, soils were excavated and fed through a conveyor-belt system, which automatically segregated them into above- and below-cleanup criteria discharge piles utilizing gamma spectroscopy. At the Linde Site and the Shallow Land Disposal Area (SLDA) Site, which are both in the remediation phase, soils are initially segregated during the excavation process using gross gamma detectors and then transported to a pad for confirmatory manual surveying and sampling. At the Linde Site, the ex situ soils are analyzed on the basis of a site-specific method, to establish compliance with beneficial reuse criteria that were developed for the Linde remediation. At the SLDA Site, the ex situ soils are surveyed and sampled based on Multi-Agency Radiation Survey and Site Investigation Manual (MARSSIM) final status survey guidance to demonstrate compliance with the derived concentration guideline levels.
At all three sites, the ex situ soils that meet the site-specific DCGLs are retained on-site and used as backfill material. This paper describes the ex situ soil segregation methods, the considerations of each method, and the estimated cost savings from minimizing the volume of soil requiring transportation and off-site disposal. (authors)

  15. A Simulation of DNA Sequencing Utilizing 3M Post-It[R] Notes

    ERIC Educational Resources Information Center

    Christensen, Doug

    2009-01-01

    An inexpensive, equipment-free approach to teaching the technical aspects of DNA sequencing. The activity described requires an instructor familiar with DNA sequencing technology but provides a straightforward method of teaching the technical aspects of sequencing in the absence of expensive sequencing equipment. The final sequence…

  16. Reduce costs with multimission sequencing and a multimission operations system

    NASA Technical Reports Server (NTRS)

    Bliss, D. A.; Morales, L. C.

    2003-01-01

    The paper will then propose extending this multi-mission philosophy to skeleton timeline development, science sequencing, and spacecraft sequencing. Finally, the paper will investigate a multi-mission approach to MOS development.

  17. Sampling and Analysis Plan for Verification Sampling of LANL-Derived Residual Radionuclides in Soils within Tract A-18-2 for Land Conveyance

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ruedig, Elizabeth

    Public Law 105-119 directs the U.S. Department of Energy (DOE) to convey or transfer parcels of land to the Incorporated County of Los Alamos or their designees and to the Department of Interior, Bureau of Indian Affairs, in trust for the Pueblo de San Ildefonso. Los Alamos National Security is tasked to support DOE in conveyance and/or transfer of identified land parcels no later than September 2022. Under DOE Order 458.1, Radiation Protection of the Public and the Environment (O458.1, 2013) and Los Alamos National Laboratory (LANL or the Laboratory) implementing Policy 412 (P412, 2014), real property with the potential to contain residual radioactive material must meet the criteria for clearance and release to the public. This Sampling and Analysis Plan (SAP) is a second investigation of Tract A-18-2 for the purpose of verifying the previous sampling results (LANL 2017). This sample plan requires 18 project-specific soil samples for use in radiological clearance decisions consistent with LANL Procedure ENV-ES-TP-238 (2015a) and guidance in the Multi-Agency Radiation Survey and Site Investigation Manual (MARSSIM, 2000). The sampling work will be conducted by LANL, and samples will be evaluated by a LANL-contracted independent lab. However, there will be federal review (verification) of all steps of the sampling process.

  18. Science sequence design

    NASA Technical Reports Server (NTRS)

    Koskela, P. E.; Bollman, W. E.; Freeman, J. E.; Helton, M. R.; Reichert, R. J.; Travers, E. S.; Zawacki, S. J.

    1973-01-01

    The activities of the following members of the Navigation Team are recorded: the Science Sequence Design Group, responsible for preparing the final science sequence designs; the Advanced Sequence Planning Group, responsible for sequence planning; and the Science Recommendation Team (SRT) representatives, responsible for conducting the necessary sequence design interfaces with the teams during the mission. The interface task included science support in both advance planning and daily operations. Science sequences designed during the mission are also discussed.

  19. The optimal design of stepped wedge trials with equal allocation to sequences and a comparison to other trial designs.

    PubMed

    Thompson, Jennifer A; Fielding, Katherine; Hargreaves, James; Copas, Andrew

    2017-12-01

    Background/Aims We sought to optimise the design of stepped wedge trials with an equal allocation of clusters to sequences and explored sample size comparisons with alternative trial designs. Methods We developed a new expression for the design effect for a stepped wedge trial, assuming that observations are equally correlated within clusters and an equal number of observations in each period between sequences switching to the intervention. We minimised the design effect with respect to (1) the fraction of observations before the first and after the final sequence switches (the periods with all clusters in the control or intervention condition, respectively) and (2) the number of sequences. We compared the design effect of this optimised stepped wedge trial to the design effects of a parallel cluster-randomised trial, a cluster-randomised trial with baseline observations, and a hybrid trial design (a mixture of cluster-randomised trial and stepped wedge trial) with the same total cluster size for all designs. Results We found that a stepped wedge trial with an equal allocation to sequences is optimised by obtaining all observations after the first sequence switches and before the final sequence switches to the intervention; this means that the first sequence remains in the control condition and the last sequence remains in the intervention condition for the duration of the trial. With this design, the optimal number of sequences is [Formula: see text], where [Formula: see text] is the cluster-mean correlation, [Formula: see text] is the intracluster correlation coefficient, and m is the total cluster size. The optimal number of sequences is small when the intracluster correlation coefficient and cluster size are small and large when the intracluster correlation coefficient or cluster size is large. A cluster-randomised trial remains more efficient than the optimised stepped wedge trial when the intracluster correlation coefficient or cluster size is small. 
A cluster-randomised trial with baseline observations always requires a larger sample size than the optimised stepped wedge trial. The hybrid design can always give an equally or more efficient design, but will be at most 5% more efficient. We provide a strategy for selecting a design if the optimal number of sequences is unfeasible. For a non-optimal number of sequences, the sample size may be reduced by allowing a proportion of observations before the first or after the final sequence has switched. Conclusion The standard stepped wedge trial is inefficient. To reduce sample sizes when a hybrid design is unfeasible, stepped wedge trial designs should have no observations before the first sequence switches or after the final sequence switches.

  20. Development of Audio and Visual Media to Accompany Sequenced Instructional Programs in Physical Education for the Handicapped. Final Report. July 31, 1972.

    ERIC Educational Resources Information Center

    Avance, Lyonel D.; Carr, Dorothy B.

    Presented is the final report of a project to develop and field test audio and visual media to accompany developmentally sequenced activities appropriate for a physical education program for handicapped children from preschool through high school. Brief sections cover the following: the purposes and accomplishments of the project; the population…

  1. The communicative functions of final rises in Finnish intonation.

    PubMed

    Ogden, Richard; Routarinne, Sara

    2005-01-01

    This paper considers the communicative function of final rises in Finnish conversational talk between pairs of teenage girls. Final rises are fairly common, occurring approximately twice a minute, predominantly on declaratives and in narrative sequences. We briefly consider the interplay between voice quality (known to be a marker of transition relevance) and rising intonation in Finnish. We argue that in narrative sequences, rising terminals manage two main interactional tasks: they provide a place for a coparticipant to mark recipiency, and they project more talk by the current speaker. Using a methodology which combines phonetic observation with conversation analysis, we demonstrate participants' orientation to these functions.

  2. Development of a Novel Technology for Label Free DNA Sequencing

    DTIC Science & Technology

    2012-05-21

    of the C-H bond stretch vibrations in the planes of the corresponding DNA bases, and in the higher-frequency side, sequence-identifier region is...composed of the N-H bond stretch vibrations in the planes of the corresponding DNA bases. In addition, the sequence-identifier dividing region almost...regions are localized at the corresponding DNA bases and exhibit a definable dependence on the sequence form of the codons under study. Final

  3. Simulator evaluation of the final approach spacing tool

    NASA Technical Reports Server (NTRS)

    Davis, Thomas J.; Erzberger, Heinz; Green, Steven M.

    1990-01-01

    The design and simulator evaluation of an automation tool for assisting terminal radar approach controllers in sequencing and spacing traffic onto the final approach course is described. The automation tool, referred to as the Final Approach Spacing Tool (FAST), displays speed and heading advisories for arrivals as well as sequencing information on the controller's radar display. The main functional elements of FAST are a scheduler that schedules and sequences the traffic, a 4-D trajectory synthesizer that generates the advisories, and a graphical interface that displays the information to the controller. FAST was implemented on a high performance workstation. It can be operated as a stand-alone in the Terminal Radar Approach Control (TRACON) Facility or as an element of a system integrated with automation tools in the Air Route Traffic Control Center (ARTCC). FAST was evaluated by experienced TRACON controllers in a real-time air traffic control simulation. Simulation results show that FAST significantly reduced controller workload and demonstrated a potential for an increase in landing rate.

  4. Advances in high throughput DNA sequence data compression.

    PubMed

    Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz

    2016-06-01

    Advances in high throughput sequencing technologies and reduction in cost of sequencing have led to exponential growth in high throughput DNA sequence data. This growth has posed challenges such as storage, retrieval, and transmission of sequencing data. Data compression is used to cope with these challenges. Various methods have been developed to compress genomic and sequencing data. In this article, we present a comprehensive review of compression methods for genome and reads compression. Algorithms are categorized as referential or reference free. Experimental results and comparative analysis of various methods for data compression are presented. Finally, key challenges and research directions in DNA sequence data compression are highlighted.

  5. The complete and fully assembled genome sequence of Aeromonas salmonicida subsp. pectinolytica and its comparative analysis with other Aeromonas species: investigation of the mobilome in environmental and pathogenic strains.

    PubMed

    Pfeiffer, Friedhelm; Zamora-Lagos, Maria-Antonia; Blettinger, Martin; Yeroslaviz, Assa; Dahl, Andreas; Gruber, Stephan; Habermann, Bianca H

    2018-01-05

    Due to the predominant usage of short-read sequencing to date, most bacterial genome sequences reported in the last years remain at the draft level. This precludes certain types of analyses, such as the in-depth analysis of genome plasticity. Here we report the finalized genome sequence of the environmental strain Aeromonas salmonicida subsp. pectinolytica 34mel, for which only a draft genome with 253 contigs is currently available. Successful completion of the transposon-rich genome critically depended on the PacBio long read sequencing technology. Using finalized genome sequences of A. salmonicida subsp. pectinolytica and other Aeromonads, we report the detailed analysis of the transposon composition of these bacterial species. Mobilome evolution is exemplified by a complex transposon, which has shifted from pathogenicity-related to environmental-related gene content in A. salmonicida subsp. pectinolytica 34mel. Obtaining the complete, circular genome of A. salmonicida subsp. pectinolytica allowed us to perform an in-depth analysis of its mobilome. We demonstrate the mobilome-dependent evolution of this strain's genetic profile from pathogenic to environmental.

  6. 75 FR 79937 - Regulatory Flexibility Agenda

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-12-20

    ... INVESTMENT MANAGEMENT--Final Rule Stage Regulation Sequence Title Identifier Number Number 617 Temporary Rule Regarding Principal Trades With Certain Advisory Clients 3235-AJ96 DIVISION OF INVESTMENT MANAGEMENT... Sequence Title Identifier Number Number 626 Confirmation of Transactions in Open-End Management Investment...

  7. Evaluation and Selection of Best Priority Sequencing Rule in Job Shop Scheduling using Hybrid MCDM Technique

    NASA Astrophysics Data System (ADS)

    Kiran Kumar, Kalla; Nagaraju, Dega; Gayathri, S.; Narayanan, S.

    2017-05-01

    Priority sequencing rules provide guidance on the order in which jobs are to be processed at a workstation. Applying different priority rules in job shop scheduling gives different scheduling orders. More experimentation is needed before a final choice of the best priority sequencing rule can be made. Hence, a comprehensive method of selecting the right rule is essential from a managerial decision-making perspective. This paper considers seven different priority sequencing rules in job shop scheduling. For evaluation and selection of the best priority sequencing rule, a set of eight criteria is considered. The aim of this work is to demonstrate a methodology for evaluating and selecting the best priority sequencing rule by using a hybrid multi-criteria decision making (MCDM) technique, i.e., the analytical hierarchy process (AHP) combined with the technique for order preference by similarity to ideal solution (TOPSIS). The criteria weights are calculated using AHP, whereas the relative closeness values of all priority sequencing rules are computed with TOPSIS from data acquired on the shop floor of a manufacturing firm. Finally, from the findings of this work, the priority sequencing rules are ranked from most important to least important. The comprehensive methodology presented in this paper helps the management of a workstation choose the best priority sequencing rule among the available alternatives for processing jobs with maximum benefit.
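
    The TOPSIS step named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `topsis`, the toy decision matrix, and the equal weights are all hypothetical (in the paper the weights come from AHP, which is not reproduced here). It assumes vector normalisation and Euclidean distances, the textbook form of TOPSIS.

```python
import math


def topsis(matrix, weights, benefit):
    """Return the TOPSIS relative closeness of each alternative (row).

    matrix  -- rows are alternatives, columns are criteria (hypothetical data)
    weights -- one weight per criterion (assumed given, e.g. from AHP)
    benefit -- True where larger is better, False where smaller is better
    Assumes no all-zero column and at least two distinct alternatives.
    """
    n_crit = len(weights)
    # Vector-normalise each column, then apply the criterion weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Ideal and anti-ideal points, taken column by column.
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal solution
        d_neg = math.dist(row, anti)   # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Toy example: three rules scored on one benefit and one cost criterion.
toy_matrix = [[1, 9], [5, 5], [9, 1]]
scores = topsis(toy_matrix, weights=[0.5, 0.5], benefit=[True, False])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(scores, ranking)
```

    A closeness of 1 means the alternative coincides with the ideal point and 0 means it coincides with the anti-ideal point; ranking by descending closeness gives the preference order.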

  8. FASMA: a service to format and analyze sequences in multiple alignments.

    PubMed

    Costantini, Susan; Colonna, Giovanni; Facchiano, Angelo M

    2007-12-01

    Multiple sequence alignments are successfully applied in many studies for understanding the structural and functional relations among single nucleic acids and protein sequences as well as whole families. Because of the rapid growth of sequence databases, multiple sequence alignments can often be very large and difficult to visualize and analyze. We offer a new service aimed to visualize and analyze the multiple alignments obtained with different external algorithms, with new features useful for the comparison of the aligned sequences as well as for the creation of a final image of the alignment. The service is named FASMA and is available at http://bioinformatica.isa.cnr.it/FASMA/.

  9. Sequence verification as quality-control step for production of cDNA microarrays.

    PubMed

    Taylor, E; Cogdell, D; Coombes, K; Hu, L; Ramdas, L; Tabor, A; Hamilton, S; Zhang, W

    2001-07-01

    To generate cDNA arrays in our core laboratory, we amplified about 2300 PCR products from a human, sequence-verified cDNA clone library. As a quality-control step, we sequenced the PCR products immediately before printing. The sequence information was used to search the GenBank database to confirm the identities. Although these clones were previously sequence verified by the company, we found that only 79% of the clones matched the original database after handling. Our experience strongly indicates the necessity to sequence verify the clones at the final stage before printing on microarray slides and to modify the gene list accordingly.

  10. Getting a cue before getting a clue: Event-related potentials to inference in visual narrative comprehension

    PubMed Central

    Cohn, Neil; Kutas, Marta

    2015-01-01

    Inference has long been emphasized in the comprehension of verbal and visual narratives. Here, we measured event-related brain potentials to visual sequences designed to elicit inferential processing. In Impoverished sequences, an expressionless “onlooker” watches an undepicted event (e.g., person throws a ball for a dog, then watches the dog chase it) just prior to a surprising finale (e.g., someone else returns the ball), which should lead to an inference (i.e., the different person retrieved the ball). Implied sequences alter this narrative structure by adding visual cues to the critical panel such as a surprised facial expression to the onlooker implying they saw an unexpected, albeit undepicted, event. In contrast, Expected sequences show a predictable, but then confounded, event (i.e., dog retrieves ball, then different person returns it), and Explicit sequences depict the unexpected event (i.e., different person retrieves then returns ball). At the critical penultimate panel, sequences representing depicted events (Explicit, Expected) elicited a larger posterior positivity (P600) than the relatively passive events of an onlooker (Impoverished, Implied), though Implied sequences were slightly more positive than Impoverished sequences. At the subsequent and final panel, a posterior positivity (P600) was greater to images in Impoverished sequences than those in Explicit and Implied sequences, which did not differ. In addition, both sequence types requiring inference (Implied, Impoverished) elicited a larger frontal negativity than those explicitly depicting events (Expected, Explicit). These results show that neural processing differs for visual narratives omitting events versus those depicting events, and that the presence of subtle visual cues can modulate such effects presumably by altering narrative structure. PMID:26320706

  11. Design and evaluation of an air traffic control Final Approach Spacing Tool

    NASA Technical Reports Server (NTRS)

    Davis, Thomas J.; Erzberger, Heinz; Green, Steven M.; Nedell, William

    1991-01-01

    This paper describes the design and simulator evaluation of an automation tool for assisting terminal radar approach controllers in sequencing and spacing traffic onto the final approach course. The automation tool, referred to as the Final Approach Spacing Tool (FAST), displays speed and heading advisories for arriving aircraft as well as sequencing information on the controller's radar display. The main functional elements of FAST are a scheduler that schedules and sequences the traffic, a four-dimensional trajectory synthesizer that generates the advisories, and a graphical interface that displays the information to the controller. FAST has been implemented on a high-performance workstation. It can be operated as a stand-alone in the terminal radar approach control facility or as an element of a system integrated with automation tools in the air route traffic control center. FAST was evaluated by experienced air traffic controllers in a real-time air traffic control simulation. Simulation results summarized in the paper show that the automation tools significantly reduced controller workload and demonstrated a potential for an increase in landing rate.

  12. Complete Genome Sequence of Pelosinus fermentans JBW45, a Member of a Remarkably Competitive Group of Negativicutes in the Firmicutes Phylum

    DOE PAGES

    De León, Kara B.; Utturkar, Sagar M.; Camilleri, Laura B.; ...

    2015-09-24

    The genome of Pelosinus fermentans JBW45, isolated from a chromium-contaminated site in Hanford, Washington, USA, has been completed with PacBio sequencing. Finally, nine copies of the rRNA gene operon and multiple transposase genes with identical sequences resulted in breaks in the original draft genome and may suggest genomic instability of JBW45.

  13. Ohmic resistance in a multi-anode MxCs

    EPA Pesticide Factsheets

    A-3txf_sequence summary.xksx: Abundance of contigs or unique sequences for each biofilm samples from anodes in the MEC reactor. Hodon Waterloo final_fasta_working.docx: Raw sequences with their identification numbers. RNA S1_MEC.docx: Representative sequences with their ID number and taxonomy. This dataset is associated with the following publication: Santodomingo, J., H. Ryu, B. Dhar, and H. Lee. Ohmic resistance affects microbial community and electrochemical kinetics in a multi-anode microbial electrochemical cell. JOURNAL OF POWER SOURCES. Elsevier Science Ltd, New York, NY, USA, 331: 315-321, (2016).

  14. Statistical physics of interacting neural networks

    NASA Astrophysics Data System (ADS)

    Kinzel, Wolfgang; Metzler, Richard; Kanter, Ido

    2001-12-01

    Recent results on the statistical physics of time series generation and prediction are presented. A neural network is trained on quasi-periodic and chaotic sequences and overlaps to the sequence generator as well as the prediction errors are calculated numerically. For each network there exists a sequence for which it completely fails to make predictions. Two interacting networks show a transition to perfect synchronization. A pool of interacting networks shows good coordination in the minority game, a model of competition in a closed market. Finally, as a demonstration, a perceptron predicts bit sequences produced by human beings.

  15. Knee cartilage extraction and bone-cartilage interface analysis from 3D MRI data sets

    NASA Astrophysics Data System (ADS)

    Tamez-Pena, Jose G.; Barbu-McInnis, Monica; Totterman, Saara

    2004-05-01

    This work presents a robust methodology for the analysis of the knee joint cartilage and the knee bone-cartilage interface from fused MRI sets. The proposed approach starts by fusing a set of two 3D MR images of the knee. Although the proposed method is not pulse sequence dependent, the first sequence should be programmed to achieve good contrast between bone and cartilage. The recommended second pulse sequence is one that maximizes the contrast between cartilage and surrounding soft tissues. Once both pulse sequences are fused, the proposed bone-cartilage analysis is done in four major steps. First, an unsupervised segmentation algorithm is used to extract the femur, the tibia, and the patella. Second, a knowledge-based feature extraction algorithm is used to extract the femoral, tibial, and patellar cartilages. Third, a trained user corrects cartilage misclassifications made by the automated cartilage extraction. Finally, the final segmentation is then revisited using an unsupervised MAP voxel relaxation algorithm. This final segmentation has the property that it includes the extracted bone tissue as well as all the cartilage tissue. This is an improvement over previous approaches where only the cartilage was segmented. Furthermore, this approach yields very reproducible segmentation results in a set of scan-rescan experiments. When these segmentations were coupled with a partial volume compensated surface extraction algorithm, the volume, area, and thickness measurements showed precisions around 2.6%.

  16. Derivational Suffixes as Cues to Stress Position in Reading Greek

    ERIC Educational Resources Information Center

    Grimani, Aikaterini; Protopapas, Athanassios

    2017-01-01

    Background: In languages with lexical stress, reading aloud must include stress assignment. Stress information sources across languages include word-final letter sequences. Here, we examine whether such sequences account for stress assignment in Greek and whether this is attributable to absolute rules involving accenting morphemes or to…

  17. 77 FR 25587 - Anchorage Regulations; Wells, ME

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-05-01

    ... anchorage areas were out of sequence and formed an hourglass shaped anchorage. This direct final rule corrects the sequence of the coordinates so that the anchorage area forms a box-like shaped anchorage...), U.S. Department of Transportation, West Building Ground Floor, Room W12-140, 1200 New Jersey Avenue...

  18. California mild CTV strains that break resistance in Trifoliate Orange

    USDA-ARS's Scientific Manuscript database

    This is the final report of a project to characterize California isolates of Citrus tristeza virus (CTV) that replicate in Poncirus trifoliata (trifoliate orange). Next Generation Sequencing (NGS) of viral small interfering RNAs (siRNAs) and assembly of full-length sequences of mild California CTV i...

  19. It’s More Than Stamp Collecting: How Genome Sequencing Can Unify Biological Research

    PubMed Central

    Richards, Stephen

    2015-01-01

    The availability of reference genome sequences, especially the human reference, has revolutionized the study of biology. However, whilst the genomes of some species have been fully sequenced, a wide range of biological problems still cannot be effectively studied for lack of genome sequence information. Here, I identify neglected areas of biology and describe how both targeted species sequencing and more broad taxonomic surveys of the tree of life can address important biological questions. I enumerate the significant benefits that would accrue from sequencing a broader range of taxa, as well as discuss the technical advances in sequencing and assembly methods that would allow for wide-ranging application of whole-genome analysis. Finally, I suggest that in addition to “Big Science” survey initiatives to sequence the tree of life, a modified infrastructure-funding paradigm would better support reference genome sequence generation for research communities most in need. PMID:26003218

  20. It's more than stamp collecting: how genome sequencing can unify biological research.

    PubMed

    Richards, Stephen

    2015-07-01

    The availability of reference genome sequences, especially the human reference, has revolutionized the study of biology. However, while the genomes of some species have been fully sequenced, a wide range of biological problems still cannot be effectively studied for lack of genome sequence information. Here, I identify neglected areas of biology and describe how both targeted species sequencing and more broad taxonomic surveys of the tree of life can address important biological questions. I enumerate the significant benefits that would accrue from sequencing a broader range of taxa, as well as discuss the technical advances in sequencing and assembly methods that would allow for wide-ranging application of whole-genome analysis. Finally, I suggest that in addition to 'big science' survey initiatives to sequence the tree of life, a modified infrastructure-funding paradigm would better support reference genome sequence generation for research communities most in need. Copyright © 2015 Elsevier Ltd. All rights reserved.

  1. Comparison of Metabolic Pathways in Escherichia coli by Using Genetic Algorithms.

    PubMed

    Ortegon, Patricia; Poot-Hernández, Augusto C; Perez-Rueda, Ernesto; Rodriguez-Vazquez, Katya

    2015-01-01

    In order to understand how cellular metabolism has taken its modern form, the conservation and variations between metabolic pathways were evaluated by using a genetic algorithm (GA). The GA approach considered information on the complete metabolism of the bacterium Escherichia coli K-12, as deposited in the KEGG database, and the enzymes belonging to a particular pathway were transformed into enzymatic step sequences by using the breadth-first search algorithm. These sequences represent contiguous enzymes linked to each other, based on their catalytic activities as they are encoded in the Enzyme Commission numbers. In a posterior step, these sequences were compared using a GA in an all-against-all (pairwise comparisons) approach. Individual reactions were chosen based on their measure of fitness to act as parents of offspring, which constitute the new generation. The sequences compared were used to construct a similarity matrix (of fitness values) that was then considered to be clustered by using a k-medoids algorithm. A total of 34 clusters of conserved reactions were obtained, and their sequences were finally aligned with a multiple-sequence alignment GA optimized to align all the reaction sequences included in each group or cluster. From these comparisons, maps associated with the metabolism of similar compounds also contained similar enzymatic step sequences, reinforcing the Patchwork Model for the evolution of metabolism in E. coli K-12, an observation that can be expanded to other organisms, for which there is metabolism information. Finally, our mapping of these reactions is discussed, with illustrations from a particular case.

  2. Comparison of Metabolic Pathways in Escherichia coli by Using Genetic Algorithms

    PubMed Central

    Ortegon, Patricia; Poot-Hernández, Augusto C.; Perez-Rueda, Ernesto; Rodriguez-Vazquez, Katya

    2015-01-01

    In order to understand how cellular metabolism has taken its modern form, the conservation and variations between metabolic pathways were evaluated by using a genetic algorithm (GA). The GA approach considered information on the complete metabolism of the bacterium Escherichia coli K-12, as deposited in the KEGG database, and the enzymes belonging to a particular pathway were transformed into enzymatic step sequences by using the breadth-first search algorithm. These sequences represent contiguous enzymes linked to each other, based on their catalytic activities as they are encoded in the Enzyme Commission numbers. In a posterior step, these sequences were compared using a GA in an all-against-all (pairwise comparisons) approach. Individual reactions were chosen based on their measure of fitness to act as parents of offspring, which constitute the new generation. The sequences compared were used to construct a similarity matrix (of fitness values) that was then considered to be clustered by using a k-medoids algorithm. A total of 34 clusters of conserved reactions were obtained, and their sequences were finally aligned with a multiple-sequence alignment GA optimized to align all the reaction sequences included in each group or cluster. From these comparisons, maps associated with the metabolism of similar compounds also contained similar enzymatic step sequences, reinforcing the Patchwork Model for the evolution of metabolism in E. coli K-12, an observation that can be expanded to other organisms, for which there is metabolism information. Finally, our mapping of these reactions is discussed, with illustrations from a particular case. PMID:25973143

  3. Construction of a scFv Library with Synthetic, Non-combinatorial CDR Diversity.

    PubMed

    Bai, Xuelian; Shim, Hyunbo

    2017-01-01

    Many large synthetic antibody libraries have been designed and constructed, and have successfully generated high-quality antibodies suitable for various demanding applications. While synthetic antibody libraries have many advantages such as optimized framework sequences and a broader sequence landscape than natural antibodies, their sequence diversities typically are generated by random combinatorial synthetic processes which cause the incorporation of many undesired CDR sequences. Here, we describe the construction of a synthetic scFv library using oligonucleotide mixtures that contain predefined, non-combinatorially synthesized CDR sequences. Each CDR is first inserted into a master scFv framework sequence and the resulting single-CDR libraries are subjected to a round of proofread panning. The proofread CDR sequences are assembled to produce the final scFv library with six diversified CDRs.

  4. Preference for locus of punishment in a response sequence.

    NASA Technical Reports Server (NTRS)

    Dardano, J. F.

    1972-01-01

    Study of differences in the aversiveness of response-dependent shock when scheduled on the first, middle or final response of a sequence of 70 responses of food-deprived pigeons, using a procedure to identify relative preferences. The preferred shock schedule and the strength of the preference were found to vary among the pigeons.

  5. Sequence-independent construction of ordered combinatorial libraries with predefined crossover points.

    PubMed

    Jézéquel, Laetitia; Loeper, Jacqueline; Pompon, Denis

    2008-11-01

    Combinatorial libraries coding for mosaic enzymes with predefined crossover points constitute useful tools to address and model structure-function relationships and for functional optimization of enzymes based on multivariate statistics. The presented method, called sequence-independent generation of a chimera-ordered library (SIGNAL), allows easy shuffling of any predefined amino acid segment between two or more proteins. This method is particularly well adapted to the exchange of protein structural modules. The procedure could also be well suited to generate ordered combinatorial libraries independent of sequence similarities in a robotized manner. Sequence segments to be recombined are first extracted by PCR from a single-stranded template coding for an enzyme of interest using a biotin-avidin-based method. This technique allows the reduction of parental template contamination in the final library. Specific PCR primers allow amplification of two complementary mosaic DNA fragments, overlapping in the region to be exchanged. Fragments are finally reassembled using a fusion PCR. The process is illustrated via the construction of a set of mosaic CYP2B enzymes using this highly modular approach.

  6. Academic performance in a pharmacotherapeutics course sequence taught synchronously on two campuses using distance education technology.

    PubMed

    Steinberg, Michael; Morin, Anna K

    2011-10-10

    To compare the academic performance of campus-based students in a pharmacotherapeutics course with that of students at a distant campus taught via synchronous teleconferencing. Examination scores and final course grades for campus-based and distant students completing the case-based pharmacotherapeutics course sequence over a 5-year period were collected and analyzed. The mean examination scores and final course grades were not significantly different between students on the 2 campuses. The use of synchronous distance education technology to teach students does not affect students' academic performance when used in an active-learning, case-based pharmacotherapeutics course.

  7. Comparative Analysis of Sequential Proximal Optimizing Technique Versus Kissing Balloon Inflation Technique in Provisional Bifurcation Stenting: Fractal Coronary Bifurcation Bench Test.

    PubMed

    Finet, Gérard; Derimay, François; Motreff, Pascal; Guerin, Patrice; Pilet, Paul; Ohayon, Jacques; Darremont, Olivier; Rioufol, Gilles

    2015-08-24

    This study used a fractal bifurcation bench model to compare 6 optimization sequences for coronary bifurcation provisional stenting, including 1 novel sequence without kissing balloon inflation (KBI), comprising initial proximal optimizing technique (POT) + side-branch inflation (SBI) + final POT, called "re-POT." In provisional bifurcation stenting, KBI fails to improve the rate of major adverse cardiac events. Proximal geometric deformation increases the rate of in-stent restenosis and target lesion revascularization. A bifurcation bench model was used to compare KBI alone, KBI after POT, KBI with asymmetric inflation pressure after POT, and 2 sequences without KBI: initial POT plus SBI, and initial POT plus SBI with final POT (called "re-POT"). For each protocol, 5 stents were tested using 2 different drug-eluting stent designs: that is, a total of 60 tests. Compared with the classic KBI-only sequence and those associating POT with modified KBI, the re-POT sequence gave significantly (p < 0.05) better geometric results: it reduced SB ostium stent-strut obstruction from 23.2 ± 6.0% to 5.6 ± 8.3%, provided perfect proximal stent apposition with almost perfect circularity (ellipticity index reduced from 1.23 ± 0.02 to 1.04 ± 0.01), reduced proximal area overstretch from 24.2 ± 7.6% to 8.0 ± 0.4%, and reduced global strut malapposition from 40 ± 6.2% to 2.6 ± 1.4%. In comparison with 5 other techniques, the re-POT sequence significantly optimized the final result of provisional coronary bifurcation stenting, maintaining circular geometry while significantly reducing SB ostium strut obstruction and global strut malapposition. These experimental findings confirm that provisional stenting may be optimized more effectively without KBI using re-POT. Copyright © 2015 American College of Cardiology Foundation. Published by Elsevier Inc. All rights reserved.

  8. Final-Approach-Spacing Subsystem For Air Traffic

    NASA Technical Reports Server (NTRS)

    Davis, Thomas J.; Erzberger, Heinz; Bergeron, Hugh

    1992-01-01

    Automation subsystem of computers, computer workstations, communication equipment, and radar helps air-traffic controllers in terminal radar approach-control (TRACON) facility manage sequence and spacing of arriving aircraft for both efficiency and safety. Called FAST (Final Approach Spacing Tool), subsystem enables controllers to choose among various levels of automation.

  9. Exploring Dance Movement Data Using Sequence Alignment Methods

    PubMed Central

    Chavoshi, Seyed Hossein; De Baets, Bernard; Neutens, Tijs; De Tré, Guy; Van de Weghe, Nico

    2015-01-01

    Despite the abundance of research on knowledge discovery from moving object databases, only a limited number of studies have examined the interaction between moving point objects in space over time. This paper describes a novel approach for measuring similarity in the interaction between moving objects. The proposed approach consists of three steps. First, we transform movement data into sequences of successive qualitative relations based on the Qualitative Trajectory Calculus (QTC). Second, sequence alignment methods are applied to measure the similarity between movement sequences. Finally, movement sequences are grouped based on similarity by means of an agglomerative hierarchical clustering method. The applicability of this approach is tested using movement data from samba and tango dancers. PMID:26181435

  10. GRIL: genome rearrangement and inversion locator.

    PubMed

    Darling, Aaron E; Mau, Bob; Blattner, Frederick R; Perna, Nicole T

    2004-01-01

    GRIL is a tool to automatically identify collinear regions in a set of bacterial-size genome sequences. GRIL uses three basic steps. First, regions of high sequence identity are located. Second, some of these regions are filtered based on user-specified criteria. Finally, the remaining regions of sequence identity are used to define significant collinear regions among the sequences. By locating collinear regions of sequence, GRIL provides a basis for multiple genome alignment using current alignment systems. GRIL also provides a basis for using current inversion distance tools to infer phylogeny. GRIL is implemented in C++ and runs on any x86-based Linux or Windows platform. It is available from http://asap.ahabs.wisc.edu/gril

  11. N-Terminal Amino Acid Sequence Determination of Proteins by N-Terminal Dimethyl Labeling: Pitfalls and Advantages When Compared with Edman Degradation Sequence Analysis.

    PubMed

    Chang, Elizabeth; Pourmal, Sergei; Zhou, Chun; Kumar, Rupesh; Teplova, Marianna; Pavletich, Nikola P; Marians, Kenneth J; Erdjument-Bromage, Hediye

    2016-07-01

    In recent history, alternative approaches to Edman sequencing have been investigated, and to this end, the Association of Biomolecular Resource Facilities (ABRF) Protein Sequencing Research Group (PSRG) initiated studies in 2014 and 2015, looking into bottom-up and top-down N-terminal (Nt) dimethyl derivatization of standard quantities of intact proteins with the aim to determine Nt sequence information. We have expanded this initiative and used low picomole amounts of myoglobin to determine the efficiency of Nt-dimethylation. Application of this approach on protein domains, generated by limited proteolysis of overexpressed proteins, confirms that it is a universal labeling technique and is very sensitive when compared with Edman sequencing. Finally, we compared Edman sequencing and Nt-dimethylation of the same polypeptide fragments; results confirm that there is agreement in the identity of the Nt amino acid sequence between these 2 methods.

  12. An approach for addressing hard-to-detect hot spots.

    PubMed

    Abelquist, Eric W; King, David A; Miller, Laurence F; Viars, James A

    2013-05-01

    The Multi-Agency Radiation Survey and Site Investigation Manual (MARSSIM) survey approach comprises systematic random sampling coupled with radiation scanning to assess acceptability of potential hot spots. Hot spot identification for some radionuclides may not be possible due to the very weak gamma or x-ray radiation they emit; these hard-to-detect nuclides are unlikely to be identified by field scans. Similarly, scanning technology is not yet available for chemical contamination. For both hard-to-detect nuclides and chemical contamination, hot spots are only identified via volumetric sampling. The remedial investigation and cleanup of sites under the Comprehensive Environmental Response, Compensation, and Liability Act typically includes the collection of samples over relatively large exposure units, and concentration limits are applied assuming the contamination is more or less uniformly distributed. However, data collected from contaminated sites demonstrate contamination is often highly localized. These highly localized areas, or hot spots, will only be identified if sample densities are high or if the environmental characterization program happens to sample directly from the hot spot footprint. This paper describes a Bayesian approach for addressing hard-to-detect nuclides and chemical hot spots. The approach begins by using available data (e.g., as collected using the standard approach) to predict the probability that an unacceptable hot spot is present somewhere in the exposure unit. This Bayesian approach may even be coupled with the graded sampling approach to optimize hot spot characterization. Once the investigator concludes that the presence of hot spots is likely, then the surveyor should use the data quality objectives process to generate an appropriate sample campaign that optimizes the identification of risk-relevant hot spots.
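
    The abstract does not give the model itself, so the following is only a minimal sketch of the kind of Bayesian update it alludes to, not the authors' method. It assumes a prior probability that a hot spot is present, independent samples that each hit an existing hot spot with a fixed probability, and no false positives. The function name and all parameters are hypothetical.

```python
def posterior_hot_spot(prior: float, hit_prob: float, n_clean_samples: int) -> float:
    """P(hot spot present | every one of n independent samples came back clean).

    Illustrative assumptions (not from the paper):
      prior     -- prior probability that a hot spot exists in the exposure unit
      hit_prob  -- chance that a single sample lands on the hot spot, if one exists
      a unit without a hot spot always yields clean samples (no false positives)
    """
    if not (0.0 <= prior <= 1.0 and 0.0 <= hit_prob <= 1.0):
        raise ValueError("prior and hit_prob must lie in [0, 1]")
    if n_clean_samples < 0:
        raise ValueError("n_clean_samples must be non-negative")

    # Likelihood of seeing only clean samples when a hot spot is present.
    miss_all = (1.0 - hit_prob) ** n_clean_samples
    # Total probability of the observation (hot spot present or absent).
    evidence = prior * miss_all + (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("observation is impossible under these assumptions")
    return prior * miss_all / evidence
```

    Under these assumptions, clean samples lower the probability of a hidden hot spot only slowly when `hit_prob` is small, which matches the abstract's point that localized contamination is easy to miss at low sample densities.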

  13. Modeling Surface Processes Occurring on Moons of the Outer Solar System

    NASA Astrophysics Data System (ADS)

    Umurhan, O. M.; White, O. L.; Moore, J. M.; Howard, A. D.; Schenk, P.

    2016-12-01

    A variety of processes, some with familiar terrestrial analogs, are known to take place on moon surfaces in the outer solar system. In this talk, we discuss the observed features of mass wasting and surface transport seen on both Jupiter's moon Callisto and one of Saturn's Trojan moons, Helene. We provide a number of numerical models using an upgraded version of MARSSIM in support of several hypotheses suggested on behalf of the observations made regarding these objects. Callisto exhibits rolling plains of low albedo materials surrounding relatively high jutting peaks harboring high albedo deposits. Our modeling supports the interpretation that Callisto's surface is a record of erosion driven by the sublimation of CO2 and H2O contained in the bedrock. Both solar insolation and surface re-radiation drive the sublimation, leaving behind debris which we interpret to be the observed darkened regolith; further, the high albedo peaks are water ice deposits on surface cold traps. On the other hand, the 45 km scale Helene, being a milligravity environment, exhibits mysterious looking streaks and grooves of very high albedo materials extending for several kilometers with a down-sloping grade of 7°-9°. Helene's cratered terrain also shows evidence of narrowed septa. The observed surface features suggest some type of advective processes are at play in this system. Our modeling lends support to the suggestion that Helene's surface materials behave as a Bingham plastic material: our flow modeling with such rheologies can reproduce the observed pattern of streakiness depending upon the smoothness of the underlying bedrock, the overall gradients observed, and the narrowed septa of inter-crater regions.

  14. Draft Genome Sequence of an Isolate of Colletotrichum fructicola, a Causal Agent of Mango Anthracnose.

    PubMed

    Li, Qili; Bu, Junyan; Yu, Zhihe; Tang, Lihua; Huang, Suiping; Guo, Tangxun; Mo, Jianyou; Hsiang, Tom

    2018-02-22

    Here, we present a draft genome sequence of isolate 15060 of Colletotrichum fructicola, a causal agent of mango anthracnose. The final assembly consists of 1,048 scaffolds totaling 56,493,063 bp (G+C content, 53.38%) and 15,180 predicted genes. Copyright © 2018 Li et al.

  15. [A Series of Motion Picture Documents in Communication Theory and the New Educational Media. Final Scripts.

    ERIC Educational Resources Information Center

    Wagner, Robert W.

    This publication contains four film scripts, each comprising from six to eleven short sequences. Each script has a complete shot list and transcript of the soundtrack, which contains narration, interviews, discussions, and synchronous sound from documentary situations. The six sequences in "The Information Explosion" cover the history of…

  16. Acetylcholinesterase 1 in populations of organophosphate resistant North American strains of the cattle tick, Rhipicephalus microplus (Acari: Ixodidae)

    USDA-ARS's Scientific Manuscript database

    In a collaboration with Purdue University researchers, we sequenced a 143,606 base pair Rhipicephalus microplus BAC library clone that contained the coding region for acetylcholinesterase 1 (AChE1). Sequencing was by Sanger protocols and the final assembly resulted in 15 contigs of varying length, e...

  17. Nullomers and High Order Nullomers in Genomic Sequences

    PubMed Central

    Vergni, Davide; Santoni, Daniele

    2016-01-01

    A nullomer is an oligomer that does not occur as a subsequence in a given DNA sequence, i.e. it is an absent word of that sequence. The importance of nullomers in several applications, from drug discovery to forensic practice, is now debated in the literature. Here, we investigated the nature of nullomers, whether their absence in genomes has just a statistical explanation or it is a peculiar feature of genomic sequences. We introduced an extension of the notion of nullomer, namely high order nullomers, which are nullomers whose mutated sequences are still nullomers. We studied different aspects of them: comparison with nullomers of random sequences, CpG distribution and mean helical rise. In agreement with previous results we found that the number of nullomers in the human genome is much larger than expected by chance. Nevertheless antithetical results were found when considering a random DNA sequence preserving dinucleotide frequencies. The analysis of CpG frequencies in nullomers and high order nullomers revealed, as expected, a high CpG content but it also highlighted a strong dependence of CpG frequencies on the dinucleotide position, suggesting that nullomers have their own peculiar structure and are not simply sequences whose CpG frequency is biased. Furthermore, phylogenetic trees were built on eleven species based on both the similarities between the dinucleotide frequencies and the number of nullomers two species share, showing that nullomers are fairly conserved among close species. Finally the study of mean helical rise of nullomers sequences revealed significantly high mean rise values, reinforcing the hypothesis that those sequences have some peculiar structural features. 
    The obtained results show that nullomers are the consequence of the peculiar structure of DNA (also including biased CpG frequency and CpG islands), so that the hypermutability model, even when taking CpG islands into account, does not seem sufficient to explain the nullomer phenomenon. Finally, high order nullomers could emphasize those features that already make simple nullomers useful in several applications. PMID:27906971
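
    The definition of a nullomer given in this abstract (an absent word of a sequence) can be illustrated with a short sketch. It is not the authors' code: the function name is hypothetical, only the given strand is scanned (no reverse complement), and brute-force enumeration is practical only for small k.

```python
from itertools import product


def nullomers(sequence: str, k: int, alphabet: str = "ACGT") -> list[str]:
    """Return every length-k word over `alphabet` that never occurs in `sequence`.

    "Occurs" means as a contiguous word (substring). Only the strand given
    is examined; this is an illustrative simplification.
    """
    seq = sequence.upper()
    # Collect every k-mer that is actually present in the sequence.
    present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    # Enumerate all 4^k candidate words and keep the absent ones.
    candidates = ("".join(word) for word in product(alphabet, repeat=k))
    return sorted(word for word in candidates if word not in present)
```

    For example, `nullomers("ACGT", 2)` lists the 13 dinucleotides other than AC, CG and GT.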

  18. Sequence verification of synthetic DNA by assembly of sequencing reads

    PubMed Central

    Wilson, Mandy L.; Cai, Yizhi; Hanlon, Regina; Taylor, Samantha; Chevreux, Bastien; Setubal, João C.; Tyler, Brett M.; Peccoud, Jean

    2013-01-01

    Gene synthesis attempts to assemble user-defined DNA sequences with base-level precision. Verifying the sequences of construction intermediates and the final product of a gene synthesis project is a critical part of the workflow, yet one that has received the least attention. Sequence validation is equally important for other kinds of curated clone collections. Ensuring that the physical sequence of a clone matches its published sequence is a common quality control step performed at least once over the course of a research project. GenoREAD is a web-based application that breaks the sequence verification process into two steps: the assembly of sequencing reads and the alignment of the resulting contig with a reference sequence. GenoREAD can determine if a clone matches its reference sequence. Its sophisticated reporting features help identify and troubleshoot problems that arise during the sequence verification process. GenoREAD has been experimentally validated on thousands of gene-sized constructs from an ORFeome project, and on longer sequences including whole plasmids and synthetic chromosomes. Comparing GenoREAD results with those from manual analysis of the sequencing data demonstrates that GenoREAD tends to be conservative in its diagnostic. GenoREAD is available at www.genoread.org. PMID:23042248

  19. Hierarchical Traces for Reduced NSM Memory Requirements

    NASA Astrophysics Data System (ADS)

    Dahl, Torbjørn S.

    This paper presents work on using hierarchical long term memory to reduce the memory requirements of nearest sequence memory (NSM) learning, a previously published, instance-based reinforcement learning algorithm. A hierarchical memory representation reduces the memory requirements by allowing traces to share common sub-sequences. We present moderated mechanisms for estimating discounted future rewards and for dealing with hidden state using hierarchical memory. We also present an experimental analysis of how the sub-sequence length affects the memory compression achieved and show that the reduced memory requirements do not affect the speed of learning. Finally, we analyse and discuss the persistence of the sub-sequences independent of specific trace instances.

  20. Determination of the sequences of protein-derived peptides and peptide mixtures by mass spectrometry

    PubMed Central

    Morris, Howard R.; Williams, Dudley H.; Ambler, Richard P.

    1971-01-01

    Micro-quantities of protein-derived peptides have been converted into N-acetylated permethyl derivatives, and their sequences determined by low-resolution mass spectrometry without prior knowledge of their amino acid compositions or lengths. A new strategy is suggested for the mass spectrometric sequencing of oligopeptides or proteins, involving gel filtration of protein hydrolysates and subsequent sequence analysis of peptide mixtures. Finally, results are given that demonstrate for the first time the use of mass spectrometry for the analysis of a protein-derived peptide mixture, again without prior knowledge of the protein or components within the mixture. PMID:5158904

  1. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

    PubMed

    Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong; Warnow, Tandy

    2015-05-01

    We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate--slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.

  2. OncoNEM: inferring tumor evolution from single-cell sequencing data.

    PubMed

    Ross, Edith M; Markowetz, Florian

    2016-04-15

    Single-cell sequencing promises a high-resolution view of genetic heterogeneity and clonal evolution in cancer. However, methods to infer tumor evolution from single-cell sequencing data lag behind methods developed for bulk-sequencing data. Here, we present OncoNEM, a probabilistic method for inferring intra-tumor evolutionary lineage trees from somatic single nucleotide variants of single cells. OncoNEM identifies homogeneous cellular subpopulations and infers their genotypes as well as a tree describing their evolutionary relationships. In simulation studies, we assess OncoNEM's robustness and benchmark its performance against competing methods. Finally, we show its applicability in case studies of muscle-invasive bladder cancer and essential thrombocythemia.

  3. Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications

    PubMed Central

    Yilmaz, Pelin; Kottmann, Renzo; Field, Dawn; Knight, Rob; Cole, James R; Amaral-Zettler, Linda; Gilbert, Jack A; Karsch-Mizrachi, Ilene; Johnston, Anjanette; Cochrane, Guy; Vaughan, Robert; Hunter, Christopher; Park, Joonhong; Morrison, Norman; Rocca-Serra, Philippe; Sterk, Peter; Arumugam, Manimozhiyan; Bailey, Mark; Baumgartner, Laura; Birren, Bruce W; Blaser, Martin J; Bonazzi, Vivien; Booth, Tim; Bork, Peer; Bushman, Frederic D; Buttigieg, Pier Luigi; Chain, Patrick S G; Charlson, Emily; Costello, Elizabeth K; Huot-Creasy, Heather; Dawyndt, Peter; DeSantis, Todd; Fierer, Noah; Fuhrman, Jed A; Gallery, Rachel E; Gevers, Dirk; Gibbs, Richard A; Gil, Inigo San; Gonzalez, Antonio; Gordon, Jeffrey I; Guralnick, Robert; Hankeln, Wolfgang; Highlander, Sarah; Hugenholtz, Philip; Jansson, Janet; Kau, Andrew L; Kelley, Scott T; Kennedy, Jerry; Knights, Dan; Koren, Omry; Kuczynski, Justin; Kyrpides, Nikos; Larsen, Robert; Lauber, Christian L; Legg, Teresa; Ley, Ruth E; Lozupone, Catherine A; Ludwig, Wolfgang; Lyons, Donna; Maguire, Eamonn; Methé, Barbara A; Meyer, Folker; Muegge, Brian; Nakielny, Sara; Nelson, Karen E; Nemergut, Diana; Neufeld, Josh D; Newbold, Lindsay K; Oliver, Anna E; Pace, Norman R; Palanisamy, Giriprakash; Peplies, Jörg; Petrosino, Joseph; Proctor, Lita; Pruesse, Elmar; Quast, Christian; Raes, Jeroen; Ratnasingham, Sujeevan; Ravel, Jacques; Relman, David A; Assunta-Sansone, Susanna; Schloss, Patrick D; Schriml, Lynn; Sinha, Rohini; Smith, Michelle I; Sodergren, Erica; Spor, Aymé; Stombaugh, Jesse; Tiedje, James M; Ward, Doyle V; Weinstock, George M; Wendel, Doug; White, Owen; Whiteley, Andrew; Wilke, Andreas; Wortman, Jennifer R; Yatsunenko, Tanya; Glöckner, Frank Oliver

    2012-01-01

    Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences—the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The ‘environmental packages’ apply to any genome sequence of known origin and can be used in combination with MIMARKS and other GSC checklists. Finally, to establish a unified standard for describing sequence data and to provide a single point of entry for the scientific community to access and learn about GSC checklists, we present the minimum information about any (x) sequence (MIxS). Adoption of MIxS will enhance our ability to analyze natural genetic diversity documented by massive DNA sequencing efforts from myriad ecosystems in our ever-changing biosphere. PMID:21552244

  4. Dynamic learning and context-dependence in sequential, attribute-based, stated-preference valuation questions

    Treesearch

    Thomas P. Holmes; Kevin J. Boyle

    2005-01-01

    A hybrid stated-preference model is presented that combines the referendum contingent valuation response format with an experimentally designed set of attributes. A sequence of valuation questions is asked to a random sample in a mailout mail-back format. Econometric analysis shows greater discrimination between alternatives in the final choice in the sequence, and the...

  5. A Sequenced Instructional Program in Physical Education for the Handicapped, Phase III. Producing and Disseminating Demonstration Packages. Final Report.

    ERIC Educational Resources Information Center

    Carr, Dorothy B.; Avance, Lyonel D.

    Presented is a sequenced instructional program in physical education which constitutes the third of a three-phase, 4-year project, funded by Title III, for handicapped children, preschool through high school levels, in the Los Angeles Unified School District. Described are the project setting and the following accomplishments: a curriculum guide…

  6. Inverse statistical physics of protein sequences: a key issues review.

    PubMed

    Cocco, Simona; Feinauer, Christoph; Figliuzzi, Matteo; Monasson, Rémi; Weigt, Martin

    2018-03-01

    In the course of evolution, proteins undergo important changes in their amino acid sequences, while their three-dimensional folded structure and their biological function remain remarkably conserved. Thanks to modern sequencing techniques, sequence data accumulate at unprecedented pace. This provides large sets of so-called homologous, i.e. evolutionarily related protein sequences, to which methods of inverse statistical physics can be applied. Using sequence data as the basis for the inference of Boltzmann distributions from samples of microscopic configurations or observables, it is possible to extract information about evolutionary constraints and thus protein function and structure. Here we give an overview over some biologically important questions, and how statistical-mechanics inspired modeling approaches can help to answer them. Finally, we discuss some open questions, which we expect to be addressed over the next years.

  7. Inverse statistical physics of protein sequences: a key issues review

    NASA Astrophysics Data System (ADS)

    Cocco, Simona; Feinauer, Christoph; Figliuzzi, Matteo; Monasson, Rémi; Weigt, Martin

    2018-03-01

    In the course of evolution, proteins undergo important changes in their amino acid sequences, while their three-dimensional folded structure and their biological function remain remarkably conserved. Thanks to modern sequencing techniques, sequence data accumulate at unprecedented pace. This provides large sets of so-called homologous, i.e. evolutionarily related protein sequences, to which methods of inverse statistical physics can be applied. Using sequence data as the basis for the inference of Boltzmann distributions from samples of microscopic configurations or observables, it is possible to extract information about evolutionary constraints and thus protein function and structure. Here we give an overview over some biologically important questions, and how statistical-mechanics inspired modeling approaches can help to answer them. Finally, we discuss some open questions, which we expect to be addressed over the next years.

  8. [Influence of PCR cycle number on microbial diversity analysis through next generation sequencing].

    PubMed

    An, Yunhe; Gao, Lijuan; Li, Junbo; Tian, Yanjie; Wang, Jinlong; Zheng, Xuejuan; Wu, Huijuan

    2016-08-25

    The use of high-throughput sequencing technology to study microbial diversity in complex samples has become one of the hottest issues in the field of microbial diversity research. In this study, DNA was extracted from soil and sheep rumen chyme samples. Then 25 ng of total DNA was used to amplify the 16S rRNA V3 region with 20, 25, or 30 PCR cycles, and the final sequencing library was constructed by mixing equal amounts of purified PCR products. Finally, the operational taxonomic unit (OTU) count, rarefaction curve, microbial number and species were compared through data analysis. It was found that, for the same amount of DNA template, a higher number of PCR cycles yielded more species but not the best proportions of community composition. Overall, with 25 PCR cycles, the number of species and the proportions of community composition were optimal in both soil and chyme samples.

  9. Error propagation in eigenimage filtering.

    PubMed

    Soltanian-Zadeh, H; Windham, J P; Jenkins, J M

    1990-01-01

    Mathematical derivation of error (noise) propagation in eigenimage filtering is presented. Based on the mathematical expressions, a method for decreasing the propagated noise given a sequence of images is suggested. The signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) of the final composite image are compared to the SNRs and CNRs of the images in the sequence. The consistency of the assumptions and accuracy of the mathematical expressions are investigated using sequences of simulated and real magnetic resonance (MR) images of an agarose phantom and a human brain.

  10. Identification of genes in anonymous DNA sequences. Final report: Report period, 15 April 1993--15 April 1994

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fields, C.A.

    1994-09-01

    This Report concludes the DOE Human Genome Program project, "Identification of Genes in Anonymous DNA Sequence." The central goals of this project have been (1) understanding the problem of identifying genes in anonymous sequences, and (2) development of tools, primarily the automated identification system gm, for identifying genes. The activities supported under the previous award are summarized here to provide a single complete report on the activities supported as part of the project from its inception to its completion.

  11. Approximate matching of regular expressions.

    PubMed

    Myers, E W; Miller, W

    1989-01-01

    Given a sequence A and regular expression R, the approximate regular expression matching problem is to find a sequence matching R whose optimal alignment with A is the highest scoring of all such sequences. This paper develops an algorithm to solve the problem in time O(MN), where M and N are the lengths of A and R. Thus, the time requirement is asymptotically no worse than for the simpler problem of aligning two fixed sequences. Our method is superior to an earlier algorithm by Wagner and Seiferas in several ways. First, it treats real-valued costs, in addition to integer costs, with no loss of asymptotic efficiency. Second, it requires only O(N) space to deliver just the score of the best alignment. Finally, its structure permits implementation techniques that make it extremely fast in practice. We extend the method to accommodate gap penalties, as required for typical applications in molecular biology, and further refine it to search for sub-strings of A that strongly align with a sequence in R, as required for typical data base searches. We also show how to deliver an optimal alignment between A and R in only O(N + log M) space using O(MN log M) time. Finally, an O(MN(M + N) + N^2 log N) time algorithm is presented for alignment scoring schemes where the cost of a gap is an arbitrary increasing function of its length.
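
    The "simpler problem of aligning two fixed sequences" that the abstract uses as its baseline can be sketched as follows. This is the textbook dynamic program, not the regular-expression algorithm of the paper. It keeps a single row of the table, so the score alone needs space linear in the second sequence, and it accepts real-valued scores. The function name and the default scoring values are assumptions for illustration.

```python
def alignment_score(a: str, b: str, match: float = 1.0,
                    mismatch: float = -1.0, gap: float = -1.0) -> float:
    """Best global alignment score of `a` and `b` with a linear gap cost.

    Runs in O(len(a) * len(b)) time but stores only one row of the
    dynamic-programming table, i.e. O(len(b)) space for the score alone.
    """
    # Row 0: aligning the empty prefix of `a` against each prefix of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of `b`.
        curr = [i * gap] + [0.0] * len(b)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                          prev[j] + gap,       # a[i-1] against a gap
                          curr[j - 1] + gap)   # b[j-1] against a gap
        prev = curr
    return prev[len(b)]
```

    Recovering the alignment itself, rather than just its score, needs either the full table or a divide-and-conquer scheme, which is why the abstract states separate bounds for delivering an optimal alignment.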

  12. PipeOnline 2.0: automated EST processing and functional data sorting.

    PubMed

    Ayoubi, Patricia; Jin, Xiaojing; Leite, Saul; Liu, Xianghui; Martajaja, Jeson; Abduraham, Abdurashid; Wan, Qiaolan; Yan, Wei; Misawa, Eduardo; Prade, Rolf A

    2002-11-01

    Expressed sequence tags (ESTs) are generated and deposited in the public domain, as redundant, unannotated, single-pass reactions, with virtually no biological content. PipeOnline automatically analyses and transforms large collections of raw DNA-sequence data from chromatograms or FASTA files by calling the quality of bases, screening and removing vector sequences, assembling and rewriting consensus sequences of redundant input files into a unigene EST data set and finally through translation, amino acid sequence similarity searches, annotation of public databases and functional data. PipeOnline generates an annotated database, retaining the processed unigene sequence, clone/file history, alignments with similar sequences, and proposed functional classification, if available. Functional annotation is automatic and based on a novel method that relies on homology of amino acid sequence multiplicity within GenBank records. Records are examined through a function ordered browser or keyword queries with automated export of results. PipeOnline offers customization for individual projects (MyPipeOnline), automated updating and alert service. PipeOnline is available at http://stress-genomics.org.

  13. Astronaut Story Musgrave during final stages of exercise in the WETF

    NASA Technical Reports Server (NTRS)

    1982-01-01

    Astronaut Story Musgrave, STS-6 mission specialist, checks a sequence list on his spacesuit during the final stages of suit-donning exercise in the weightless environment test facility (WETF). He is wearing the full extravehicular mobility unit (EMU), including helmet and gloves and is strapped in to the platform for movement into the water.

  14. Improving global CD uniformity by optimizing post-exposure bake and develop sequences

    NASA Astrophysics Data System (ADS)

    Osborne, Stephen P.; Mueller, Mark; Lem, Homer; Reyland, David; Baik, KiHo

    2003-12-01

    Improvements in the final uniformity of masks can be shrouded by error contributions from many sources. The final Global CD Uniformity (GCDU) of a mask is degraded by individual contributions of the writing tool, the Post Applied Bake (PAB), the Post Exposure Bake (PEB), the Develop sequence and the Etch step. Final global uniformity will improve by isolating and minimizing the variability of the PEB and Develop. We achieved this de-coupling of the PEB and Develop process from the whole process stream by using "dark loss" which is the loss of unexposed resist during the develop process. We confirmed a correspondence between Angstroms of dark loss and nanometer sized deviations in the chrome CD. A plate with a distinctive dark loss pattern was related to a nearly identical pattern in the chrome CD. This pattern was verified to have originated during the PEB process and displayed a [Δ(Final CD)/Δ(Dark Loss)] ratio of 6 for TOK REAP200 resist. Previous papers have reported a sensitive linkage between Angstroms of dark loss and nanometers in the final uniformity of the written plate. These initial studies reported using this method to improve the PAB of resists for greater uniformity of sensitivity and contrast. Similarly, this paper demonstrates an outstanding optimization of PEB and Develop processes.

  15. Scientific overview and historical context of the 1811-1812 new Madrid earthquake sequence

    USGS Publications Warehouse

    Hough, S.E.

    2004-01-01

    aftershock». These values are consistent with other lines of evidence, including scaling relationships. Finally, I show that accounts from the New Madrid sequence reveal evidence for remotely triggered earthquakes well outside the NMSZ. Remotely triggered earthquakes represent a potentially important new wrinkle in historic earthquake research, as their ground motions can sometimes be confused with mainshock ground motions.

  16. Automated sample-preparation technologies in genome sequencing projects.

    PubMed

    Hilbert, H; Lauber, J; Lubenow, H; Düsterhöft, A

    2000-01-01

    A robotic workstation system (BioRobot 9600, QIAGEN) and a 96-well UV spectrophotometer (Spectramax 250, Molecular Devices) were integrated into the process of high-throughput automated sequencing of double-stranded plasmid DNA templates. An automated 96-well miniprep kit protocol (QIAprep Turbo, QIAGEN) provided high-quality plasmid DNA from shotgun clones. The DNA prepared by this procedure was used to generate more than two megabases of final sequence data for two genomic projects (Arabidopsis thaliana and Schizosaccharomyces pombe), three thousand expressed sequence tags (ESTs) plus half a megabase of human full-length cDNA clones, and approximately 53,000 single reads for a whole genome shotgun project (Pseudomonas putida).

  17. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences

    PubMed Central

    Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong

    2015-01-01

    We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate--slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory. PMID:25549288

  18. TU-H-CAMPUS-JeP3-05: Adaptive Determination of Needle Sequence HDR Prostate Brachytherapy with Divergent Needle-By-Needle Delivery

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Borot de Battisti, M; Maenhout, M; Lagendijk, J J W

    Purpose: To develop a new method which adaptively determines the optimal needle insertion sequence for HDR prostate brachytherapy involving divergent needle-by-needle dose delivery by e.g. a robotic device. A needle insertion sequence is calculated at the beginning of the intervention and updated after each needle insertion with feedback on needle positioning errors. Methods: Needle positioning errors and anatomy changes may occur during HDR brachytherapy which can lead to errors in the delivered dose. A novel strategy was developed to calculate and update the needle sequence and the dose plan after each needle insertion with feedback on needle positioning errors. The dose plan optimization was performed by numerical simulations. The proposed needle sequence determination optimizes the final dose distribution based on the dose coverage impact of each needle. This impact is predicted stochastically by needle insertion simulations. HDR procedures were simulated with varying number of needle insertions (4 to 12) using 11 patient MR data-sets with PTV, prostate, urethra, bladder and rectum delineated. Needle positioning errors were modeled by random normally distributed angulation errors (standard deviation of 3 mm at the needle’s tip). The final dose parameters were compared in the situations where the needle with the largest vs. the smallest dose coverage impact was selected at each insertion. Results: Over all scenarios, the percentage of clinically acceptable final dose distribution improved when the needle selected had the largest dose coverage impact (91%) compared to the smallest (88%). The differences were larger for few (4 to 6) needle insertions (maximum difference scenario: 79% vs. 60%). The computation time of the needle sequence optimization was below 60s. Conclusion: A new adaptive needle sequence determination for HDR prostate brachytherapy was developed. Coupled to adaptive planning, the selection of the needle with the largest dose coverage impact increases chances of reaching the clinical constraints. M. Borot de Battisti is funded by Philips Medical Systems Nederland B.V.; M. Moerland is principal investigator on a contract funded by Philips Medical Systems Nederland B.V.; G. Hautvast and D. Binnekamp are fulltime employees of Philips Medical Systems Nederland B.V.

  19. Skilled memory in expert figure skaters.

    PubMed

    Deakin, J M; Allard, F

    1991-01-01

    The present studies extend skilled-memory theory to a domain involving the performance of motor sequences. Skilled figure skaters were better able than their less skilled counterparts to perform short skating sequences that were choreographed, rather than randomly constructed. Expert skaters encoded sequences for performance very differently from the way in which they encoded sequences that were verbally presented for verbal recall. Tasks interpolated between sequence and recall showed no significant influence on recall accuracy, implicating long-term memory in skating memory. There was little evidence for the use of retrieval structures when skaters learned the brief sequences used throughout these studies. Finally, expert skaters were able to judge the similarity of two skating elements faster than less skilled skaters, indicating a faster access to semantic memory for experts. The data indicate that skaters show many of the same skilled-memory characteristics as have been described in other skill domains involving memorization, such as digit span and memory for dinner orders.

  20. Sequence analysis of the canine mitochondrial DNA control region from shed hair samples in criminal investigations.

    PubMed

    Berger, C; Berger, B; Parson, W

    2012-01-01

    In recent years, evidence from domestic dogs has increasingly been analyzed by forensic DNA testing. Especially, canine hairs have proved most suitable and practical due to the high rate of hair transfer occurring between dogs and humans. Starting with the description of a contamination-free sample handling procedure, we give a detailed workflow for sequencing hypervariable segments (HVS) of the mtDNA control region from canine evidence. After the hair material is lysed and the DNA extracted by Phenol/Chloroform, the amplification and sequencing strategy comprises the HVS I and II of the canine control region and is optimized for DNA of medium-to-low quality and quantity. The sequencing procedure is based on the Sanger Big-dye deoxy-terminator method and the separation of the sequencing reaction products is performed on a conventional multicolor fluorescence detection capillary electrophoresis platform. Finally, software-aided base calling and sequence interpretation are addressed exemplarily.

  1. Predicting DNA hybridization kinetics from sequence

    NASA Astrophysics Data System (ADS)

    Zhang, Jinny X.; Fang, John Z.; Duan, Wei; Wu, Lucia R.; Zhang, Angela W.; Dalchau, Neil; Yordanov, Boyan; Petersen, Rasmus; Phillips, Andrew; Zhang, David Yu

    2018-01-01

    Hybridization is a key molecular process in biology and biotechnology, but so far there is no predictive model for accurately determining hybridization rate constants based on sequence information. Here, we report a weighted neighbour voting (WNV) prediction algorithm, in which the hybridization rate constant of an unknown sequence is predicted based on similarity reactions with known rate constants. To construct this algorithm we first performed 210 fluorescence kinetics experiments to observe the hybridization kinetics of 100 different DNA target and probe pairs (36 nt sub-sequences of the CYCS and VEGF genes) at temperatures ranging from 28 to 55 °C. Automated feature selection and weighting optimization resulted in a final six-feature WNV model, which can predict hybridization rate constants of new sequences to within a factor of 3 with ∼91% accuracy, based on leave-one-out cross-validation. Accurate prediction of hybridization kinetics allows the design of efficient probe sequences for genomics research.
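
    The idea of predicting a rate constant from similar reactions with known rate constants can be illustrated with a generic distance-weighted vote. This is only a minimal sketch and not the authors' WNV model: the feature vectors, the exponential weighting, the `bandwidth` parameter, and the function names are all assumptions made for illustration. The second helper checks the "within a factor of 3" criterion on log10 rate constants.

```python
import math


def wnv_predict(query, known, bandwidth=1.0):
    """Predict a log10 rate constant as a distance-weighted average.

    query: feature vector of the new sequence (hypothetical features).
    known: list of (feature_vector, log10_rate_constant) pairs.
    bandwidth: assumed scale controlling how fast weights decay with distance.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for features, log_rate in known:
        distance = math.dist(query, features)      # Euclidean distance
        weight = math.exp(-distance / bandwidth)   # closer neighbours vote more
        weighted_sum += weight * log_rate
        weight_total += weight
    return weighted_sum / weight_total


def within_factor(predicted_log, true_log, factor=3.0):
    """True if the prediction is within `factor` of the truth (log10 scale)."""
    return abs(predicted_log - true_log) <= math.log10(factor)
```

    With two known reactions at equal distance from the query, the sketch returns the midpoint of their log10 rate constants; a query that coincides with one known reaction is dominated by that reaction.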

  2. Alternation blindness in the representation of binary sequences.

    PubMed

    Yu, Ru Qi; Osherson, Daniel; Zhao, Jiaying

    2018-03-01

    Binary information is prevalent in the environment and contains 2 distinct outcomes. Binary sequences consist of a mixture of alternation and repetition. Understanding how people perceive such sequences would contribute to a general theory of information processing. In this study, we examined how people process alternation and repetition in binary sequences. Across 4 paradigms involving estimation, working memory, change detection, and visual search, we found that the number of alternations is underestimated compared with repetitions (Experiment 1). Moreover, recall for binary sequences deteriorates as the sequence alternates more (Experiment 2). Changes in bits are also harder to detect as the sequence alternates more (Experiment 3). Finally, visual targets superimposed on bits of a binary sequence take longer to process as alternation increases (Experiment 4). Overall, our results indicate that compared with repetition, alternation in a binary sequence is less salient in the sense of requiring more attention for successful encoding. The current study thus reveals the cognitive constraints in the representation of alternation and provides a new explanation for the overalternation bias in randomness perception. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
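
    To make the distinction between alternation and repetition concrete, the following minimal sketch counts both kinds of adjacent transition in a binary sequence. The function names are hypothetical, and the alternation rate shown here is one simple summary statistic assumed for illustration, not necessarily the measure used in the study.

```python
def count_transitions(bits):
    """Return (alternations, repetitions) between adjacent items.

    An alternation is a pair of neighbours that differ (e.g. 0 then 1);
    a repetition is a pair of neighbours that are equal (e.g. 1 then 1).
    """
    pairs = list(zip(bits, bits[1:]))
    alternations = sum(1 for a, b in pairs if a != b)
    repetitions = len(pairs) - alternations
    return alternations, repetitions


def alternation_rate(bits):
    """Fraction of adjacent pairs that alternate (0.0 for sequences shorter than 2)."""
    alternations, repetitions = count_transitions(bits)
    total = alternations + repetitions
    return alternations / total if total else 0.0
```

    For example, "0101" consists only of alternations, whereas "0011" contains one alternation and two repetitions.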

  3. Enhanced sequencing coverage with digital droplet multiple displacement amplification

    PubMed Central

    Sidore, Angus M.; Lan, Freeman; Lim, Shaun W.; Abate, Adam R.

    2016-01-01

    Sequencing small quantities of DNA is important for applications ranging from the assembly of uncultivable microbial genomes to the identification of cancer-associated mutations. To obtain sufficient quantities of DNA for sequencing, the small amount of starting material must be amplified significantly. However, existing methods often yield errors or non-uniform coverage, reducing sequencing data quality. Here, we describe digital droplet multiple displacement amplification, a method that enables massive amplification of low-input material while maintaining sequence accuracy and uniformity. The low-input material is compartmentalized as single molecules in millions of picoliter droplets. Because the molecules are isolated in compartments, they amplify to saturation without competing for resources; this yields uniform representation of all sequences in the final product and, in turn, enhances the quality of the sequence data. We demonstrate the ability to uniformly amplify the genomes of single Escherichia coli cells, comprising just 4.7 fg of starting DNA, and obtain sequencing coverage distributions that rival that of unamplified material. Digital droplet multiple displacement amplification provides a simple and effective method for amplifying minute amounts of DNA for accurate and uniform sequencing. PMID:26704978

  4. Contributions from associative and explicit sequence knowledge to the execution of discrete keying sequences.

    PubMed

    Verwey, Willem B

    2015-05-01

    Research has provided many indications that highly practiced 6-key sequences are carried out in a chunking mode in which key-specific stimuli past the first are largely ignored. When in such sequences a deviating stimulus occasionally occurs at an unpredictable location, participants fall back to responding to individual stimuli (Verwey & Abrahamse, 2012). The observation that in such a situation execution still benefits from prior practice has been attributed to the possibility to operate in an associative mode. To better understand the contribution to the execution of keying sequences of motor chunks, associative sequence knowledge and also of explicit sequence knowledge, the present study tested three alternative accounts for the earlier finding of an execution rate increase at the end of 6-key sequences performed in the associative mode. The results provide evidence that the earlier observed execution rate increase can be attributed to the use of explicit sequence knowledge. In the present experiment this benefit was limited to sequences that are executed at the moderately fast rates of the associative mode, and occurred at both the earlier and final elements of the sequences. Copyright © 2015 Elsevier B.V. All rights reserved.

  5. Solucion de Problemas y Procesos Cognoscitivos (Problem Solving and Cognitive Processes). Publication No. 41.

    ERIC Educational Resources Information Center

    Rimoldi, Horacio J. A.

    The study of problem solving is made through the analysis of the process that leads to the final answer. The type of information obtained through the study of the process is compared with the information obtained by studying the final answer. The experimental technique used makes it possible to identify the sequence of questions (tactics) that subjects ask…

  6. Dissemination and Implementation of a Financial Management Program for Adult/Young Farmers in Vocational Agriculture Programs in Missouri. Final Report.

    ERIC Educational Resources Information Center

    Denker, Robert; Stewart, Bob R.

    In addition to an eight-page narrative, this final report contains materials and products from phase 2 of a project to develop, disseminate, and implement a three-year sequenced individualized and group instructional program in financial management for adult/young farmers in vocational agriculture. The narrative section discusses the four project…

  7. Detection of a new bat gammaherpesvirus in the Philippines.

    PubMed

    Watanabe, Shumpei; Ueda, Naoya; Iha, Koichiro; Masangkay, Joseph S; Fujii, Hikaru; Alviola, Phillip; Mizutani, Tetsuya; Maeda, Ken; Yamane, Daisuke; Walid, Azab; Kato, Kentaro; Kyuwa, Shigeru; Tohya, Yukinobu; Yoshikawa, Yasuhiro; Akashi, Hiroomi

    2009-08-01

    A new bat herpesvirus was detected in the spleen of an insectivorous bat (Hipposideros diadema, family Hipposideridae) collected on Panay Island, the Philippines. PCR analyses were performed using COnsensus-DEgenerate Hybrid Oligonucleotide Primers (CODEHOPs) targeting the herpesvirus DNA polymerase (DPOL) gene. Although we obtained PCR products with CODEHOPs, direct sequencing using the primers was not possible because of high degree of degeneracy. Direct sequencing technology developed in our rapid determination system of viral RNA sequences (RDV) was applied in this study, and a partial DPOL nucleotide sequence was determined. In addition, a partial gB gene nucleotide sequence was also determined using the same strategy. We connected the partial gB and DPOL sequences with long-distance PCR, and a 3741-bp nucleotide fragment, including the 3' part of the gB gene and the 5' part of the DPOL gene, was finally determined. Phylogenetic analysis showed that the sequence was novel and most similar to those of the subfamily Gammaherpesvirinae.

  8. The application of the high throughput sequencing technology in the transposable elements.

    PubMed

    Liu, Zhen; Xu, Jian-hong

    2015-09-01

    High throughput sequencing technology has dramatically improved the efficiency of DNA sequencing, and decreased the costs to a great extent. Meanwhile, this technology usually has advantages of better specificity, higher sensitivity and accuracy. Therefore, it has been applied to the research on genetic variations, transcriptomics and epigenomics. Recently, this technology has been widely employed in the studies of transposable elements and has achieved fruitful results. In this review, we summarize the application of high throughput sequencing technology in the fields of transposable elements, including the estimation of transposon content, preference of target sites and distribution, insertion polymorphism and population frequency, identification of rare copies, transposon horizontal transfers as well as transposon tagging. We also briefly introduce the major common sequencing strategies and algorithms, their advantages and disadvantages, and the corresponding solutions. Finally, we envision the developing trends of high throughput sequencing technology, especially the third generation sequencing technology, and its application in transposon studies in the future, hopefully providing a comprehensive understanding and reference for related scientific researchers.

  9. Forgetting motor programmes: retrieval dynamics in procedural memory.

    PubMed

    Tempel, Tobias; Frings, Christian

    2014-01-01

    When motor sequences are stored in memory in a categorised manner, selective retrieval of some sequences can induce forgetting of the non-retrieved sequences. We show that such retrieval-induced forgetting (RIF) occurs not only in cued recall but also in a test assessing memory indirectly by providing novel test cues without involving recall of items. Participants learned several sequential finger movements (SFMs), each consisting of the movement of two fingers of either the left or the right hand. Subsequently, they performed retrieval practice on half of the sequences of one hand. A final task then required participants to enter letter dyads. A subset of these dyads corresponded to the previously learned sequences. RIF was present in the response times during the entering of the dyads. The finding of RIF in the slowed-down execution of motor programmes overlapping with initially trained motor sequences suggests that inhibition resolved interference between procedural representations of the acquired motor sequences of one hand during retrieval practice.

  10. A Phylogenomic Approach Based on PCR Target Enrichment and High Throughput Sequencing: Resolving the Diversity within the South American Species of Bartsia L. (Orobanchaceae)

    PubMed Central

    Tank, David C.

    2016-01-01

    Advances in high-throughput sequencing (HTS) have allowed researchers to obtain large amounts of biological sequence information at speeds and costs unimaginable only a decade ago. Phylogenetics, and the study of evolution in general, is quickly migrating towards using HTS to generate larger and more complex molecular datasets. In this paper, we present a method that utilizes microfluidic PCR and HTS to generate large amounts of sequence data suitable for phylogenetic analyses. The approach uses the Fluidigm Access Array System (Fluidigm, San Francisco, CA, USA) and two sets of PCR primers to simultaneously amplify 48 target regions across 48 samples, incorporating sample-specific barcodes and HTS adapters (2,304 unique amplicons per Access Array). The final product is a pooled set of amplicons ready to be sequenced, and thus, there is no need to construct separate, costly genomic libraries for each sample. Further, we present a bioinformatics pipeline to process the raw HTS reads to either generate consensus sequences (with or without ambiguities) for every locus in every sample or—more importantly—recover the separate alleles from heterozygous target regions in each sample. This is important because it adds allelic information that is well suited for coalescent-based phylogenetic analyses that are becoming very common in conservation and evolutionary biology. To test our approach and bioinformatics pipeline, we sequenced 576 samples across 96 target regions belonging to the South American clade of the genus Bartsia L. in the plant family Orobanchaceae. After sequencing cleanup and alignment, the experiment resulted in ~25,300bp across 486 samples for a set of 48 primer pairs targeting the plastome, and ~13,500bp for 363 samples for a set of primers targeting regions in the nuclear genome. Finally, we constructed a combined concatenated matrix from all 96 primer combinations, resulting in a combined aligned length of ~40,500bp for 349 samples. PMID:26828929

  11. Quantitation of next generation sequencing library preparation protocol efficiencies using droplet digital PCR assays - a systematic comparison of DNA library preparation kits for Illumina sequencing.

    PubMed

    Aigrain, Louise; Gu, Yong; Quail, Michael A

    2016-06-13

    The emergence of next-generation sequencing (NGS) technologies in the past decade has allowed the democratization of DNA sequencing both in terms of price per sequenced base and ease of producing DNA libraries. When it comes to preparing DNA sequencing libraries for Illumina, the current market leader, a plethora of kits are available and it can be difficult for users to determine which kit is the most appropriate and efficient for their applications; the main concerns being not only cost but also minimal bias, yield and time efficiency. We compared 9 commercially available library preparation kits in a systematic manner using the same DNA sample by probing the amount of DNA remaining after each protocol step using a new droplet digital PCR (ddPCR) assay. This method allows the precise quantification of fragments bearing either adaptors or P5/P7 sequences on both ends just after ligation or PCR enrichment. We also investigated the potential influence of DNA input and DNA fragment size on the final library preparation efficiency. The overall library preparation efficiencies show important variations between the different kits, with those combining several steps into a single one exhibiting final yields 4 to 7 times higher than the other kits. Detailed ddPCR data also reveal that the adaptor ligation yield itself varies by more than a factor of 10 between kits, certain ligation efficiencies being so low that they could impair the original library complexity and impoverish the sequencing results. When a PCR enrichment step is necessary, lower adaptor-ligated DNA inputs lead to greater amplification yields, hiding the latent disparity between kits. We describe a ddPCR assay that allows us to probe the efficiency of the most critical step in the library preparation, ligation, and to draw conclusions on which kits are more likely to preserve the sample heterogeneity and reduce the need for amplification.

  12. Data compression for sequencing data

    PubMed Central

    2013-01-01

    Post-Sanger sequencing methods produce tons of data, and there is a general agreement that the challenge to store and process them must be addressed with data compression. In this review we first answer the question “why compression” in a quantitative manner. Then we also answer the questions “what” and “how”, by sketching the fundamental compression ideas, describing the main sequencing data types and formats, and comparing the specialized compression algorithms and tools. Finally, we go back to the question “why compression” and give other, perhaps surprising answers, demonstrating the pervasiveness of data compression techniques in computational biology. PMID:24252160

  13. Memory and learning with rapid audiovisual sequences

    PubMed Central

    Keller, Arielle S.; Sekuler, Robert

    2015-01-01

    We examined short-term memory for sequences of visual stimuli embedded in varying multisensory contexts. In two experiments, subjects judged the structure of the visual sequences while disregarding concurrent, but task-irrelevant auditory sequences. Stimuli were eight-item sequences in which varying luminances and frequencies were presented concurrently and rapidly (at 8 Hz). Subjects judged whether the final four items in a visual sequence identically replicated the first four items. Luminances and frequencies in each sequence were either perceptually correlated (Congruent) or were unrelated to one another (Incongruent). Experiment 1 showed that, despite encouragement to ignore the auditory stream, subjects' categorization of visual sequences was strongly influenced by the accompanying auditory sequences. Moreover, this influence tracked the similarity between a stimulus's separate audio and visual sequences, demonstrating that task-irrelevant auditory sequences underwent a considerable degree of processing. Using a variant of Hebb's repetition design, Experiment 2 compared musically trained subjects and subjects who had little or no musical training on the same task as used in Experiment 1. Test sequences included some that intermittently and randomly recurred, which produced better performance than sequences that were generated anew for each trial. The auditory component of a recurring audiovisual sequence influenced musically trained subjects more than it did other subjects. This result demonstrates that stimulus-selective, task-irrelevant learning of sequences can occur even when such learning is an incidental by-product of the task being performed. PMID:26575193

  14. Memory and learning with rapid audiovisual sequences.

    PubMed

    Keller, Arielle S; Sekuler, Robert

    2015-01-01

    We examined short-term memory for sequences of visual stimuli embedded in varying multisensory contexts. In two experiments, subjects judged the structure of the visual sequences while disregarding concurrent, but task-irrelevant auditory sequences. Stimuli were eight-item sequences in which varying luminances and frequencies were presented concurrently and rapidly (at 8 Hz). Subjects judged whether the final four items in a visual sequence identically replicated the first four items. Luminances and frequencies in each sequence were either perceptually correlated (Congruent) or were unrelated to one another (Incongruent). Experiment 1 showed that, despite encouragement to ignore the auditory stream, subjects' categorization of visual sequences was strongly influenced by the accompanying auditory sequences. Moreover, this influence tracked the similarity between a stimulus's separate audio and visual sequences, demonstrating that task-irrelevant auditory sequences underwent a considerable degree of processing. Using a variant of Hebb's repetition design, Experiment 2 compared musically trained subjects and subjects who had little or no musical training on the same task as used in Experiment 1. Test sequences included some that intermittently and randomly recurred, which produced better performance than sequences that were generated anew for each trial. The auditory component of a recurring audiovisual sequence influenced musically trained subjects more than it did other subjects. This result demonstrates that stimulus-selective, task-irrelevant learning of sequences can occur even when such learning is an incidental by-product of the task being performed.

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jarocki, John Charles; Zage, David John; Fisher, Andrew N.

    LinkShop is a software tool for applying the method of Linkography to the analysis of time-sequence data. LinkShop provides command line, web, and application programming interfaces (API) for input and processing of time-sequence data, abstraction models, and ontologies. The software creates graph representations of the abstraction model, ontology, and derived linkograph. Finally, the tool allows the user to perform statistical measurements of the linkograph and refine the ontology through direct manipulation of the linkograph.

  16. Final Report for LDRD Project 02-ERD-069: Discovering the Unknown Mechanism(s) of Virulence in a BW, Class A Select Agent

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chain, P; Garcia, E

    2003-02-06

    The goal of this proposed effort was to assess the difficulty in identifying and characterizing virulence candidate genes in an organism for which very limited data exists. This was accomplished by first addressing the finishing phase of draft-sequenced F. tularensis genomes and conducting comparative analyses to determine the coding potential of each genome; to discover the differences in genome structure and content, and to identify potential genes whose products may be involved in the F. tularensis virulence process. The project was divided into three parts: (1) Genome finishing: This part involves determining the order and orientation of the consensus sequences of contigs obtained from Phrap assemblies of random draft genomic sequences. This tedious process consists of linking contig ends using information embedded in each sequence file that relates the sequence to the original cloned insert. Since inserts are sequenced from both ends, we can establish a link between these paired-ends in different contigs and thus order and orient contigs. Since these genomes carry numerous copies of insertion sequences, these repeated elements "confuse" the Phrap assembly program. It is thus necessary to break these contigs apart at the repeated sequences and individually join the proper flanking regions using paired-end information, or using results of comparisons against a similar genome. Larger repeated elements such as the small subunit ribosomal RNA operon require verification with PCR. Tandem repeats require manual intervention and typically rely on single nucleotide polymorphisms to be resolved. Remaining gaps require PCR reactions and sequencing. Once the genomes have been "closed", low quality regions are addressed by resequencing reactions. (2) Genome analysis: The final consensus sequences are processed by combining the results of three gene modelers: Glimmer, Critica and Generation. The final gene models are submitted to a battery of homology searches and domain prediction programs in order to annotate them (e.g. BLAST, Pfam, TIGRfam, COG, KEGG, InterPro, TMhmm, SignalP). The genome structure is also assessed in terms of G+C content, GC bias (GC skew), and locations of repeated regions (e.g. IS elements) and phage-like genes. (3) Comparative genomics: The results of the various genome analyses are compared between the finished (or almost finished) genomes. Here, we have compared the F. tularensis genomes from the extremely lethal strain Schu4 (subsp. tularensis), the vaccine strain LVS (subsp. holarctica), and strain UT01-4992 of the less virulent, opportunistic subsp. novicida. Regions present in the highly virulent strain that are absent from the other less virulent strains may provide insight into what factors are required for the high level of virulence.
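
    The G+C content and GC skew mentioned in the genome-analysis step are simple sequence statistics. The sketch below uses the common definition of GC skew, (G - C) / (G + C), computed over non-overlapping windows; the function names and the windowing scheme are assumptions for illustration and are not taken from the report.

```python
def gc_content(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq, window):
    """GC skew, (G - C) / (G + C), for each non-overlapping window.

    A trailing partial window is ignored; windows without G or C get 0.0.
    """
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews
```

    A change in the sign of the skew along a bacterial chromosome is often used as a rough indicator of the replication origin or terminus, which is one reason the statistic is computed alongside overall G+C content.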

  17. Functional Network Development During the First Year: Relative Sequence and Socioeconomic Correlations

    PubMed Central

    Gao, Wei; Alcauter, Sarael; Elton, Amanda; Hernandez-Castillo, Carlos R.; Smith, J. Keith; Ramirez, Juanita; Lin, Weili

    2015-01-01

    The first postnatal year is characterized by the most dramatic functional network development of the human lifespan. Yet, the relative sequence of the maturation of different networks and the impact of socioeconomic status (SES) on their development during this critical period remains poorly characterized. Leveraging a large, normally developing infant sample with multiple longitudinal resting-state functional magnetic resonance imaging scans during the first year (N = 65, scanned every 3 months), we aimed to delineate the relative maturation sequence of 9 key brain functional networks and examine their SES correlations. Our results revealed a maturation sequence from primary sensorimotor/auditory to visual to attention/default-mode, and finally to executive control networks. Network-specific critical growth periods were also identified. Finally, marginally significant positive SES–brain correlations were observed at 6 months of age for both the sensorimotor and default-mode networks, indicating interesting SES effects on functional brain maturation. To the best of our knowledge, this is the first study delineating detailed longitudinal growth trajectories of all major functional networks during the first year of life and their SES correlations. Insights from this study not only improve our understanding of early brain development, but may also inform the critical periods for SES expression during infancy. PMID:24812084

  18. Self-organized neural maps of human protein sequences.

    PubMed Central

    Ferrán, E. A.; Pflugfelder, B.; Ferrara, P.

    1994-01-01

    We have recently described a method based on artificial neural networks to cluster protein sequences into families. The network was trained with Kohonen's unsupervised learning algorithm using, as inputs, the matrix patterns derived from the dipeptide composition of the proteins. We present here a large-scale application of that method to classify the 1,758 human protein sequences stored in the SwissProt database (release 19.0), whose lengths are greater than 50 amino acids. In the final 2-dimensional topologically ordered map of 15 x 15 neurons, proteins belonging to known families were associated with the same neuron or with neighboring ones. Also, as an attempt to reduce the time-consuming learning procedure, we compared 2 learning protocols: one of 500 epochs (100 SUN CPU-hours [CPU-h]), and another one of 30 epochs (6.7 CPU-h). A further reduction of learning-computing time, by a factor of about 3.3, with similar protein clustering results, was achieved using a matrix of 11 x 11 components to represent the sequences. Although network training is time consuming, the classification of a new protein in the final ordered map is very fast (14.6 CPU-seconds). We also show a comparison between the artificial neural network approach and conventional methods of biosequence analysis. PMID:8019421
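
    The dipeptide composition used as network input can be computed by counting adjacent amino-acid pairs. This is a minimal sketch assuming the 20 standard amino acids (giving a 20 x 20 matrix) and normalization to frequencies; the abstract does not specify these details, and the names below are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed alphabet: 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def dipeptide_composition(protein):
    """Return a 20 x 20 matrix of dipeptide frequencies for a protein string.

    Entry [i][j] is the fraction of adjacent residue pairs made of amino
    acid i followed by amino acid j. Pairs with non-standard letters are skipped.
    """
    size = len(AMINO_ACIDS)
    counts = [[0] * size for _ in range(size)]
    total = 0
    for first, second in zip(protein, protein[1:]):
        if first in INDEX and second in INDEX:
            counts[INDEX[first]][INDEX[second]] += 1
            total += 1
    if total == 0:
        return [[0.0] * size for _ in range(size)]
    return [[n / total for n in row] for row in counts]
```

    Each protein then maps to a fixed-size pattern regardless of its length, which is what allows sequences of different lengths to be presented to the same self-organizing map.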

  19. Coupling detrended fluctuation analysis for multiple warehouse-out behavioral sequences

    NASA Astrophysics Data System (ADS)

    Yao, Can-Zhong; Lin, Ji-Nan; Zheng, Xu-Zhou

    2017-01-01

    Interaction patterns among different warehouses could make the warehouse-out behavioral sequences less predictable. We first apply a coupling detrended fluctuation analysis to the warehouse-out quantity, and find that the multivariate sequences exhibit significant coupling multifractal characteristics regardless of the types of steel products. Secondly, we track the sources of multifractality in the warehouse-out sequences by shuffling and surrogating the original ones, and we find that the fat-tail distribution contributes more to multifractal features than the long-term memory, regardless of the types of steel products. From the perspective of warehouse contribution, some warehouses steadily contribute more to multifractality than other warehouses. Finally, based on multiscale multifractal analysis, we propose a Hurst surface structure to investigate coupling multifractality, and show that multiple behavioral sequences exhibit significant coupling multifractal features that emerge, and are usually restricted, within relatively larger time-scale intervals.

  20. JANE: efficient mapping of prokaryotic ESTs and variable length sequence reads on related template genomes

    PubMed Central

    2009-01-01

    Background: ESTs or variable sequence reads can be available in prokaryotic studies well before a complete genome is known. Use cases include (i) transcriptome studies or (ii) single cell sequencing of bacteria. Without suitable software their further analysis and mapping would have to await finalization of the corresponding genome. Results: The tool JANE rapidly maps ESTs or variable sequence reads in prokaryotic sequencing and transcriptome efforts to related template genomes. It provides an easy-to-use graphics interface for information retrieval and a toolkit for EST or nucleotide sequence function prediction. Furthermore, we developed for rapid mapping an enhanced sequence alignment algorithm which reassembles and evaluates high scoring pairs provided from the BLAST algorithm. Rapid assembly on and replacement of the template genome by sequence reads or mapped ESTs is achieved. This is illustrated (i) by data from Staphylococci as well as from a Blattabacteria sequencing effort, (ii) mapping single cell sequencing reads is shown for poribacteria to sister phylum representative Rhodopirellula baltica SH1. The algorithm has been implemented in a web-server accessible at http://jane.bioapps.biozentrum.uni-wuerzburg.de. Conclusion: Rapid prokaryotic EST mapping or mapping of sequence reads is achieved applying JANE even without knowing the cognate genome sequence. PMID:19943962

  1. Now and Next-Generation Sequencing Techniques: Future of Sequence Analysis Using Cloud Computing

    PubMed Central

    Thakur, Radhe Shyam; Bandopadhyay, Rajib; Chaudhary, Bratati; Chatterjee, Sourav

    2012-01-01

    Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed “cloud computing”) has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows. PMID:23248640

  2. Now and next-generation sequencing techniques: future of sequence analysis using cloud computing.

    PubMed

    Thakur, Radhe Shyam; Bandopadhyay, Rajib; Chaudhary, Bratati; Chatterjee, Sourav

    2012-01-01

    Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed "cloud computing") has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows.

  3. Expansins expression is associated with grain size dynamics in wheat (Triticum aestivum L.)

    PubMed Central

    Lizana, X. Carolina; Riegel, Ricardo; Gomez, Leonardo D.; Herrera, Jaime; Isla, Adolfo; McQueen-Mason, Simon J.; Calderini, Daniel F.

    2010-01-01

    Grain weight is one of the most important components of cereal yield and quality. A clearer understanding of the physiological and molecular determinants of this complex trait would provide an insight into the potential benefits for plant breeding. In the present study, the dynamics of dry matter accumulation, water uptake, and grain size in parallel with the expression of expansins during grain growth in wheat were analysed. The stabilized water content of grains showed a strong association with final grain weight (r2=0.88, P <0.01). Grain length was found to be the trait that best correlated with final grain weight (r2=0.98, P <0.01) and volume (r2=0.94, P <0.01). The main events that defined final grain weight occurred during the first third of grain-filling when maternal tissues (the pericarp of grains) undergo considerable expansion. Eight expansin coding sequences were isolated from pericarp RNA and the temporal profiles of accumulation of these transcripts were monitored. Sequences showing high homology with TaExpA6 were notably abundant during early grain expansion and declined as maturity was reached. RNA in situ hybridization studies revealed that the transcript for TaExpA6 was principally found in the pericarp during early growth in grain development and, subsequently, in both the endosperm and pericarp. The signal in these images is likely to be the sum of the transcript levels of all three sequences with high similarity to the TaExpA6 gene. The early part of the expression profile of this putative expansin gene correlates well with the critical periods of early grain expansion, suggesting it as a possible factor in the final determination of grain size. PMID:20080826

  4. Combining high-throughput sequencing and targeted loci data to infer the phylogeny of the "Adenocalymma-Neojobertia" clade (Bignonieae, Bignoniaceae).

    PubMed

    Fonseca, Luiz Henrique M; Lohmann, Lúcia G

    2018-06-01

    Combining high-throughput sequencing data with amplicon sequences allows the reconstruction of robust phylogenies based on comprehensive sampling of characters and taxa. Here, we combine Next Generation Sequencing (NGS) and Sanger sequencing data to infer the phylogeny of the "Adenocalymma-Neojobertia" clade (Bignonieae, Bignoniaceae), a diverse lineage of Neotropical plants, using Maximum Likelihood and Bayesian approaches. We used NGS to obtain complete or nearly-complete plastomes of members of this clade, leading to a final dataset with 54 individuals, representing 44 members of ingroup and 10 outgroups. In addition, we obtained Sanger sequences of two plastid markers (ndhF and rpl32-trnL) for 44 individuals (43 ingroup and 1 outgroup) and the nuclear PepC for 64 individuals (63 ingroup and 1 outgroup). Our final dataset includes 87 individuals of members of the "Adenocalymma-Neojobertia" clade, representing 66 species (ca. 90% of the diversity), plus 11 outgroups. Plastid and nuclear datasets recovered congruent topologies and were combined. The combined analysis recovered a monophyletic "Adenocalymma-Neojobertia" clade and a paraphyletic Adenocalymma that also contained a monophyletic Neojobertia plus Pleonotoma albiflora. Relationships are strongly supported in all analyses, with most lineages within the "Adenocalymma-Neojobertia" clade receiving maximum posterior probabilities. Ancestral character state reconstructions using Bayesian approaches identified six morphological synapomorphies of clades namely, prophyll type, petiole and petiolule articulation, tendril ramification, inflorescence ramification, calyx shape, and fruit wings. Other characters such as habit, calyx cupular trichomes, corolla color, and corolla shape evolved multiple times. These characters are putatively related with the clade diversification and can be further explored in diversification studies. Copyright © 2018 Elsevier Inc. All rights reserved.

  5. Benchmarking of Methods for Genomic Taxonomy

    DOE PAGES

    Larsen, Mette V.; Cosentino, Salvatore; Lukjancenko, Oksana; ...

    2014-02-26

    One of the first issues that emerges when a prokaryotic organism of interest is encountered is the question of what it is; that is, which species it is. The 16S rRNA gene formed the basis of the first method for sequence-based taxonomy and has had a tremendous impact on the field of microbiology. Nevertheless, the method has been found to have a number of shortcomings. In this paper, we trained and benchmarked five methods for whole-genome sequence-based prokaryotic species identification on a common data set of complete genomes: (i) SpeciesFinder, which is based on the complete 16S rRNA gene; (ii) Reads2Type that searches for species-specific 50-mers in either the 16S rRNA gene or the gyrB gene (for the Enterobacteriaceae family); (iii) the ribosomal multilocus sequence typing (rMLST) method that samples up to 53 ribosomal genes; (iv) TaxonomyFinder, which is based on species-specific functional protein domain profiles; and finally (v) KmerFinder, which examines the number of cooccurring k-mers (substrings of k nucleotides in DNA sequence data). The performances of the methods were subsequently evaluated on three data sets of short sequence reads or draft genomes from public databases. In total, the evaluation sets constituted sequence data from more than 11,000 isolates covering 159 genera and 243 species. Our results indicate that methods that sample only chromosomal, core genes have difficulties in distinguishing closely related species which only recently diverged. Finally, the KmerFinder method had the overall highest accuracy and correctly identified from 93% to 97% of the isolates in the evaluation sets.
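
    The idea of counting co-occurring k-mers can be sketched as follows. This is a simplified illustration only: the function names, the use of plain set intersection, and the toy sequences are assumptions, and the actual scoring used by KmerFinder is not described in this abstract.

```python
# Hypothetical sketch: rank reference genomes by the number of k-mers
# they share with a query sequence.

def kmers(sequence, k):
    """Return the set of all substrings of length k in the sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def shared_kmer_count(query, reference, k):
    """Count distinct k-mers present in both query and reference."""
    return len(kmers(query, k) & kmers(reference, k))


def best_match(query, references, k):
    """Return the name of the reference sharing the most k-mers with the query.

    `references` maps a (hypothetical) species name to its sequence.
    """
    return max(
        references,
        key=lambda name: shared_kmer_count(query, references[name], k),
    )
```

    For example, with `references = {"species_a": "ACGTACGT", "species_b": "TTTTGGGG"}`, calling `best_match("CGTACG", references, 3)` selects `species_a`, because all four 3-mers of the query occur in that reference and none occur in the other.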

  6. Global Emerging Infection Surveillance and Response (GEIS)- Avian Influenza Pandemic Influenza (AI/PI) Program

    DTIC Science & Technology

    2014-10-01

    amplicon of Corona Virus RdP gene. Finally, one PCR amplicon of a Chikungunya virus gene from the VHF group was sequenced. These sequence data are...suggestive of STIs (discharge or genital ulcer) often go undiagnosed, and are treated empirically with broad spectrum antibiotics. The drug resistance... discharge are offered anonymous screening for gonorrhea and chlamydia (GC) and specimen taken for detection and isolation of Neisseria gonorrhoeae

  7. Massively Parallel Sequencing Detected a Mutation in the MFN2 Gene Missed by Sanger Sequencing Due to a Primer Mismatch on an SNP Site.

    PubMed

    Neupauerová, Jana; Grečmalová, Dagmar; Seeman, Pavel; Laššuthová, Petra

    2016-05-01

    We describe a patient with early onset severe axonal Charcot-Marie-Tooth disease (CMT2) with dominant inheritance, in whom Sanger sequencing failed to detect a mutation in the mitofusin 2 (MFN2) gene because of a single nucleotide polymorphism (rs2236057) under the PCR primer sequence. The severe early onset phenotype and the family history with severely affected mother (died after delivery) was very suggestive of CMT2A and this suspicion was finally confirmed by a MFN2 mutation. The mutation p.His361Tyr was later detected in the patient by massively parallel sequencing with a gene panel for hereditary neuropathies. According to this information, new primers for amplification and sequencing were designed which bind away from the polymorphic sites of the patient's DNA. Sanger sequencing with these new primers then confirmed the heterozygous mutation in the MFN2 gene in this patient. This case report shows that massively parallel sequencing may in some rare cases be more sensitive than Sanger sequencing and highlights the importance of accurate primer design which requires special attention. © 2016 John Wiley & Sons Ltd/University College London.

  8. Formerly Used Sites Remedial Action Program (FUSRAP) W. R. Grace Building 23 Remedial Action-Challenges and Successes - 12247

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barber, Brenda; Honerlah, Hans; O'Neill, Mike

    2012-07-01

    Monazite sand processing was conducted at the W. R. Grace Curtis Bay Facility (Baltimore, Maryland) from mid-May 1956 through the spring of 1957 under license to the Atomic Energy Commission (AEC), for the extraction of source material in the form of thorium, as well as rare earth elements. The processing was conducted in the southwest quadrant of a ca. 100 year old, five-story, building (Building 23) in the active manufacturing portion of the facility. Building components and equipment in the southwest quadrant of Building 23 exhibited residual radiological activity remaining from the monazite sand processing. U.S. Army Corps of Engineers (USACE) conducted a remedial investigation (RI) and feasibility study (FS) and prepared a Record of Decision (ROD) to address residual radioactivity on building components and equipment in the southwest quadrant of Building 23. The remedy selected for the southwest quadrant of Building 23, which was documented in the ROD (dated May 2005), was identified as 'Alternative 2: Decontamination With Removal to Industrial Use Levels'. The selected remedy provided for either decontaminating or removing areas of radioactivity to meet the RGs. Demonstration of compliance with the selected ARAR was performed using the Multi-Agency Radiation Survey and Site Investigation Manual (MARSSIM) and other appropriate guidance, as well as appropriate dose modeling codes where necessary. USACE-Baltimore District along with its private industry partner worked together under the terms of a 2008 Settlement Agreement to implement the remedial action (RA) for the southwest quadrant of Building 23. The RA was conducted in two phases: Phase 1 was completed to improve the building condition for support of subsequent remedial action and decrease scope uncertainty of the remedial action, and Phase 2 included decontamination and removal activities to meet the RGs and demonstration of compliance with the selected ARAR.
Challenges encountered during the RA include: coordination with stakeholders, coordination between multiple RA contractors, addressing unique structural challenges for Building 23, nonradiological hazards associated with the RA, weather issues, and complex final status survey (FSS) coordination. The challenges during the Phase 1 RA were handled successfully. The challenges for the Phase 2 RA, which is anticipated to be complete by late-summer of 2012, have been handled successfully so far. By fall of 2012, USACE is expecting to finalize a robust RA Closure Report, including the Final Status Survey Report, which summarizes the RA activities and documents compliance with the ROD. During the ongoing RA at Building 23, there have been and still are many challenges both technically and from a project management perspective, due in part to the nature and extent of impact at the site (residual radioactivity within an active processing building), dual oversight by the property owner and USACE, and site-specific challenges associated with a complex RA and multiple contractors. Currently, USACE and its industry partner are overseeing the completion of RA field activities. RA closure documentation for the remediation of Building 23 to address residual contamination in building materials will be reviewed/approved by USACE and its industry partner upon completion of the field activities. USACE and its industry partner are working well together, through the Settlement Agreement, to conduct a cost-efficient and effective remedial action to address the legacy issues at Building 23. This cooperative effort has set a firm foundation for achieving a successful RA at the RWDA using a 'forward think' approach, and it is a case study for other sites where an industry partner is involved. 
The collaborative effort led to implementation of an RA which is acceptable to the site owner, the regulators, and the public, thus allowing USACE to move this project forward successfully in the FUSRAP program. (authors)

  9. Student Mental Models of the Greenhouse Effect: Retention Months After Interventions

    NASA Astrophysics Data System (ADS)

    Harris, S. E.; Gold, A. U.

    2013-12-01

    Individual understanding of climate science, and the greenhouse effect in particular, is one factor important for societal decision-making. Ideally, learning opportunities about the greenhouse effect will not only move people toward expert-like ideas but will also have long-lasting effects for those individuals. We assessed university students' mental models of the greenhouse effect before and after specific learning experiences, on a final exam, then again a few months later. Our aim was to measure retention after students had not necessarily been thinking about, nor studying, the greenhouse effect recently. How sticky were the ideas learned? 164 students in an introductory science course participated in a sequence of two learning activities and assessments regarding the greenhouse effect. The first lesson involved the full class, then, for the second lesson, half the students completed a simulation-based activity and the other half completed a data-driven activity. We assessed student thinking through concept sketches, multiple choice and short answer questions. All students generated concept sketches four times, and completed a set of multiple choice (MCQs) and short answer questions twice. Later, 3-4 months after the course ended, 27 students ('retention students') completed an additional concept sketch and answered the questions again, as a retention assessment. These 27 students were nearly evenly split between the two contrasting second lessons in the sequence and included both high and low-achieving students. We then compared student sketches and scores to 'expert' answers. The general pattern over time showed a significant increase in student scores from before the lesson sequence to after, both on concept sketches and MCQs, then an additional increase in concept sketch score on the final exam (MCQs were not asked on the final exam). The scores for the retention students were not significantly different from the full class. 
Within the retention group, there was also no difference in scores based on which contrasting lesson a student did. Students in both of the contrasting lessons scored significantly higher on the retention test than on the initial pre-test. Their concept sketch scores on the retention test were slightly lower than their scores on the final exam (not significantly), but matched their post-lesson-sequence scores. Their MCQ scores were slightly higher on the retention test than on the post-lesson-sequence test (also not significantly). These results imply that students both learned and retained new ideas about the greenhouse effect for at least a few months after the end of the course and did not regress to their pre-lesson ideas. Further analysis should show which particular aspects of student mental models changed over the full temporal sequence.

  10. Sequencing of Dust Filter Production Process Using Design Structure Matrix (DSM)

    NASA Astrophysics Data System (ADS)

    Sari, R. M.; Matondang, A. R.; Syahputri, K.; Anizar; Siregar, I.; Rizkya, I.; Ursula, C.

    2018-01-01

    A metal casting company produces machinery spare parts for manufacturers. One of its products is a dust filter, which is used in most palm oil mills. Because it is so widely used, the company often has problems handling this product. One problem is a disordered production process, which stems from the job sequencing: important jobs that should be done first are carried out last, while less important jobs that could be completed later are carried out first. Design Structure Matrix (DSM) is used to analyse and determine priorities in the production process. DSM analysis sorts the production process through dependency sequencing. The resulting dependency sequence orders the processes according to their inter-process linkages, considering preceding and succeeding activities. Finally, it relates the activities to the coupled activities for metal smelting, refining, grinding, cutting container castings, metal expenditure of molds, metal casting, coating processes, and manufacture of sand molds.
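
    Dependency sequencing of the kind described here can be sketched as a topological sort over the matrix. The example below is a minimal illustration using Python's standard `graphlib` module (Python 3.9+); the function name, the dictionary representation of the matrix, and the activity names and dependencies are hypothetical and are not taken from the paper. Coupled (mutually dependent) activities are only detected, not resolved.

```python
from graphlib import CycleError, TopologicalSorter


def sequence_activities(dsm):
    """Order activities so that each one follows everything it depends on.

    `dsm` maps each activity to the set of activities it depends on,
    i.e. one row of a binary design structure matrix per activity.
    Returns a list of activities, or None if coupled activities
    (dependency cycles) prevent a strict ordering.
    """
    try:
        return list(TopologicalSorter(dsm).static_order())
    except CycleError:
        # Coupled activities would need to be grouped into a block;
        # that step is outside the scope of this sketch.
        return None
```

    For instance, with the hypothetical dependencies `{"casting": {"smelting", "mold_making"}, "grinding": {"casting"}, "smelting": set(), "mold_making": set()}`, the returned order places smelting and mold making before casting, and casting before grinding.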

  11. The complete nucleotide sequence and genome organization of a novel betaflexivirus infecting Citrullus lanatus.

    PubMed

    Xin, Min; Zhang, Peipei; Liu, Wenwen; Ren, Yingdang; Cao, Mengji; Wang, Xifeng

    2017-10-01

    The complete nucleotide sequence of a novel positive single-stranded (+ss) RNA virus, tentatively named watermelon virus A (WVA), was determined using a combination of three methods: RNA sequencing, small RNA sequencing, and Sanger sequencing. The full genome of WVA is comprised of 8,372 nucleotides (nt), excluding the poly (A) tail, and contains four open reading frames (ORFs). The largest ORF, ORF1 encodes a putative replication-associated polyprotein (RP) with three conserved domains. ORF2 and ORF4 encode a movement protein (MP) and coat protein (CP), respectively. The putative product encoded by ORF3, of an estimated molecular mass of 25 kDa, has no significant similarity with other proteins. Identity and phylogenetic analysis indicate that WVA is a new virus, closely related to members of the family Betaflexiviridae. However, the final taxonomic allocation of WVA within the family is yet to be determined.

  12. Protein Science by DNA Sequencing: How Advances in Molecular Biology Are Accelerating Biochemistry.

    PubMed

    Higgins, Sean A; Savage, David F

    2018-01-09

    A fundamental goal of protein biochemistry is to determine the sequence-function relationship, but the vastness of sequence space makes comprehensive evaluation of this landscape difficult. However, advances in DNA synthesis and sequencing now allow researchers to assess the functional impact of every single mutation in many proteins, but challenges remain in library construction and the development of general assays applicable to a diverse range of protein functions. This Perspective briefly outlines the technical innovations in DNA manipulation that allow massively parallel protein biochemistry and then summarizes the methods currently available for library construction and the functional assays of protein variants. Areas in need of future innovation are highlighted with a particular focus on assay development and the use of computational analysis with machine learning to effectively traverse the sequence-function landscape. Finally, applications in the fundamentals of protein biochemistry, disease prediction, and protein engineering are presented.

  13. Genotyping by Sequencing Using Specific Allelic Capture to Build a High-Density Genetic Map of Durum Wheat

    PubMed Central

    Holtz, Yan; Ardisson, Morgane; Ranwez, Vincent; Besnard, Alban; Leroy, Philippe; Poux, Gérard; Roumet, Pierre; Viader, Véronique; Santoni, Sylvain; David, Jacques

    2016-01-01

    Targeted sequence capture is a promising technology which helps reduce costs for sequencing and genotyping numerous genomic regions in large sets of individuals. Bait sequences are designed to capture specific alleles previously discovered in parents or reference populations. We studied a set of 135 RILs originating from a cross between an emmer cultivar (Dic2) and a recent durum elite cultivar (Silur). Six thousand sequence baits were designed to target Dic2 vs. Silur polymorphisms discovered in a previous RNAseq study. These baits were exposed to genomic DNA of the RIL population. Eighty percent of the targeted SNPs were recovered, 65% of which were of high quality and coverage. The final high density genetic map consisted of more than 3,000 markers, whose genetic and physical mapping were consistent with those obtained with large arrays. PMID:27171472

  14. IBS: an illustrator for the presentation and visualization of biological sequences.

    PubMed

    Liu, Wenzhong; Xie, Yubin; Ma, Jiyong; Luo, Xiaotong; Nie, Peng; Zuo, Zhixiang; Lahrmann, Urs; Zhao, Qi; Zheng, Yueyuan; Zhao, Yong; Xue, Yu; Ren, Jian

    2015-10-15

    Biological sequence diagrams are fundamental for visualizing various functional elements in protein or nucleotide sequences that enable a summarization and presentation of existing information as well as means of intuitive new discoveries. Here, we present a software package called illustrator of biological sequences (IBS) that can be used for representing the organization of either protein or nucleotide sequences in a convenient, efficient and precise manner. Multiple options are provided in IBS, and biological sequences can be manipulated, recolored or rescaled in a user-defined mode. Also, the final representational artwork can be directly exported into a publication-quality figure. The standalone package of IBS was implemented in JAVA, while the online service was implemented in HTML5 and JavaScript. Both the standalone package and online service are freely available at http://ibs.biocuckoo.org. Contact: renjian.sysu@gmail.com or xueyu@hust.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  15. IBS: an illustrator for the presentation and visualization of biological sequences

    PubMed Central

    Liu, Wenzhong; Xie, Yubin; Ma, Jiyong; Luo, Xiaotong; Nie, Peng; Zuo, Zhixiang; Lahrmann, Urs; Zhao, Qi; Zheng, Yueyuan; Zhao, Yong; Xue, Yu; Ren, Jian

    2015-01-01

    Summary: Biological sequence diagrams are fundamental for visualizing various functional elements in protein or nucleotide sequences that enable a summarization and presentation of existing information as well as means of intuitive new discoveries. Here, we present a software package called illustrator of biological sequences (IBS) that can be used for representing the organization of either protein or nucleotide sequences in a convenient, efficient and precise manner. Multiple options are provided in IBS, and biological sequences can be manipulated, recolored or rescaled in a user-defined mode. Also, the final representational artwork can be directly exported into a publication-quality figure. Availability and implementation: The standalone package of IBS was implemented in JAVA, while the online service was implemented in HTML5 and JavaScript. Both the standalone package and online service are freely available at http://ibs.biocuckoo.org. Contact: renjian.sysu@gmail.com or xueyu@hust.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26069263

  16. The effect of different control point sampling sequences on convergence of VMAT inverse planning

    NASA Astrophysics Data System (ADS)

    Pardo Montero, Juan; Fenwick, John D.

    2011-04-01

    A key component of some volumetric-modulated arc therapy (VMAT) optimization algorithms is the progressive addition of control points to the optimization. This idea was introduced in Otto's seminal VMAT paper, in which a coarse sampling of control points was used at the beginning of the optimization and new control points were progressively added one at a time. A different form of the methodology is also present in the RapidArc optimizer, which adds new control points in groups called 'multiresolution levels', each doubling the number of control points in the optimization. This progressive sampling accelerates convergence, improving the results obtained, and has similarities with the ordered subset algorithm used to accelerate iterative image reconstruction. In this work we have used a VMAT optimizer developed in-house to study the performance of optimization algorithms which use different control point sampling sequences, most of which fall into three different classes: doubling sequences, which add new control points in groups such that the number of control points in the optimization is (roughly) doubled; Otto-like progressive sampling which adds one control point at a time, and equi-length sequences which contain several multiresolution levels each with the same number of control points. Results are presented in this study for two clinical geometries, prostate and head-and-neck treatments. A dependence of the quality of the final solution on the number of starting control points has been observed, in agreement with previous works. We have found that some sequences, especially E20 and E30 (equi-length sequences with 20 and 30 multiresolution levels, respectively), generate better results than a 5 multiresolution level RapidArc-like sequence. 
The final value of the cost function is reduced by up to 20%, such reductions leading to small improvements in dosimetric parameters characterizing the treatments: slightly more homogeneous target doses and better sparing of the organs at risk.

  17. SVM-Based Prediction of Propeptide Cleavage Sites in Spider Toxins Identifies Toxin Innovation in an Australian Tarantula

    PubMed Central

    Wong, Emily S. W.; Hardy, Margaret C.; Wood, David; Bailey, Timothy; King, Glenn F.

    2013-01-01

    Spider neurotoxins are commonly used as pharmacological tools and are a popular source of novel compounds with therapeutic and agrochemical potential. Since venom peptides are inherently toxic, the host spider must employ strategies to avoid adverse effects prior to venom use. It is partly for this reason that most spider toxins encode a protective proregion that upon enzymatic cleavage is excised from the mature peptide. In order to identify the mature toxin sequence directly from toxin transcripts, without resorting to protein sequencing, the propeptide cleavage site in the toxin precursor must be predicted bioinformatically. We evaluated different machine learning strategies (support vector machines, hidden Markov model and decision tree) and developed an algorithm (SpiderP) for prediction of propeptide cleavage sites in spider toxins. Our strategy uses a support vector machine (SVM) framework that combines both local and global sequence information. Our method is superior or comparable to current tools for prediction of propeptide sequences in spider toxins. Evaluation of the SVM method on an independent test set of known toxin sequences yielded 96% sensitivity and 100% specificity. Furthermore, we sequenced five novel peptides (not used to train the final predictor) from the venom of the Australian tarantula Selenotypus plumipes to test the accuracy of the predictor and found 80% sensitivity and 99.6% 8-mer specificity. Finally, we used the predictor together with homology information to predict and characterize seven groups of novel toxins from the deeply sequenced venom gland transcriptome of S. plumipes, which revealed structural complexity and innovations in the evolution of the toxins. The precursor prediction tool (SpiderP) is freely available on ArachnoServer (http://www.arachnoserver.org/spiderP.html), a web portal to a comprehensive relational database of spider toxins. All training data, test data, and scripts used are available from the SpiderP website. PMID:23894279

  18. Final Technical Report for subcontract number B612144

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mayali, X.; Marcu, O.

    The original statement of work stipulated that the Subcontractor shall perform bacterial and algal cultivation and manipulation, microbe isolation, preparation of samples for sequencing and isotopic analysis, data analysis, and manuscript preparation. The Subcontractor shall work closely with Dr. Mayali and other LLNL scientists, and shall participate in monthly SFA meetings (either in person or by telephone). The Subcontractor shall deliver a final report at the conclusion of the work.

  19. Single-Cell Sequencing for Drug Discovery and Drug Development.

    PubMed

    Wu, Hongjin; Wang, Charles; Wu, Shixiu

    2017-01-01

    Next-generation sequencing (NGS), particularly single-cell sequencing, has revolutionized the scale and scope of genomic and biomedical research. Recent technological advances in NGS and single-cell studies have made deep whole-genome (DNA-seq), whole-epigenome and whole-transcriptome sequencing (RNA-seq) at the single-cell level feasible. NGS at the single-cell level expands our view of the genome, epigenome and transcriptome and allows the genome, epigenome and transcriptome of any organism to be explored without a priori assumptions, with unprecedented throughput and at single-nucleotide resolution. NGS is also a very powerful tool for drug discovery and drug development. In this review, we describe the current state of single-cell sequencing techniques, which can provide a new, more powerful and precise approach for analyzing the effects of drugs on treated cells and tissues. Our review discusses single-cell whole genome/exome sequencing (scWGS/scWES), single-cell transcriptome sequencing (scRNA-seq), single-cell bisulfite sequencing (scBS), and multiple omics of single-cell sequencing. We also highlight the advantages and challenges of each of these approaches. Finally, we describe, elaborate on and speculate about the potential applications of single-cell sequencing for drug discovery and drug development. Copyright © Bentham Science Publishers.

  20. Draft Sequences of the Radish (Raphanus sativus L.) Genome

    PubMed Central

    Kitashiba, Hiroyasu; Li, Feng; Hirakawa, Hideki; Kawanabe, Takahiro; Zou, Zhongwei; Hasegawa, Yoichi; Tonosaki, Kaoru; Shirasawa, Sachiko; Fukushima, Aki; Yokoi, Shuji; Takahata, Yoshihito; Kakizaki, Tomohiro; Ishida, Masahiko; Okamoto, Shunsuke; Sakamoto, Koji; Shirasawa, Kenta; Tabata, Satoshi; Nishio, Takeshi

    2014-01-01

    Radish (Raphanus sativus L., n = 9) is one of the major vegetables in Asia. Since the genomes of Brassica and related species including radish underwent genome rearrangement, it is quite difficult to perform functional analysis based on the reported genomic sequence of Brassica rapa. Therefore, we performed genome sequencing of radish. Short reads of genomic sequences of 191.1 Gb were obtained by next-generation sequencing (NGS) for a radish inbred line, and 76,592 scaffolds of ≥300 bp were constructed along with the bacterial artificial chromosome-end sequences. Finally, the whole draft genomic sequence of 402 Mb spanning 75.9% of the estimated genomic size and containing 61,572 predicted genes was obtained. Subsequently, 221 single nucleotide polymorphism markers and 768 PCR-RFLP markers were used together with the 746 markers produced in our previous study for the construction of a linkage map. The map was combined further with another radish linkage map constructed mainly with expressed sequence tag-simple sequence repeat markers into a high-density integrated map of 1,166 cM with 2,553 DNA markers. A total of 1,345 scaffolds were assigned to the linkage map, spanning 116.0 Mb. Bulked PCR products amplified by 2,880 primer pairs were sequenced by NGS, and SNPs in eight inbred lines were identified. PMID:24848699

  1. Prediction of multi-drug resistance transporters using a novel sequence analysis method [version 2; referees: 2 approved]

    DOE PAGES

    McDermott, Jason E.; Bruillard, Paul; Overall, Christopher C.; ...

    2015-03-09

    There are many examples of groups of proteins that have similar function, but the determinants of functional specificity may be hidden by lack of sequence similarity, or by large groups of similar sequences with different functions. Transporters are one such protein group in that the general function, transport, can be easily inferred from the sequence, but the substrate specificity can be impossible to predict from sequence with current methods. In this paper we describe a linguistic-based approach to identify functional patterns from groups of unaligned protein sequences and its application to predict multi-drug resistance transporters (MDRs) from bacteria. We first show that our method can recreate known patterns from PROSITE for several motifs from unaligned sequences. We then show that the method, MDRpred, can predict MDRs with greater accuracy and positive predictive value than a collection of currently available family-based models from the Pfam database. Finally, we apply MDRpred to a large collection of protein sequences from an environmental microbiome study to make novel predictions about drug resistance in a potential environmental reservoir.

  2. Skeleton-based human action recognition using multiple sequence alignment

    NASA Astrophysics Data System (ADS)

    Ding, Wenwen; Liu, Kai; Cheng, Fei; Zhang, Jin; Li, YunSong

    2015-05-01

    Human action recognition and analysis has been an active research topic in computer vision for many years. This paper presents a method to represent human actions based on trajectories consisting of 3D joint positions. The method first decomposes an action into a sequence of meaningful atomic actions (actionlets), and then labels the actionlets with letters of the English alphabet according to the Davies-Bouldin index value. An action can therefore be represented by a sequence of actionlet symbols, which preserves the temporal order of occurrence of the actionlets. Finally, we employ sequence comparison to classify multiple actions using a string matching algorithm (Needleman-Wunsch). The effectiveness of the proposed method is evaluated on datasets captured by commodity depth cameras. Experiments on three challenging 3D action datasets show promising results.
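
    The string-matching step in the record above can be illustrated with a minimal sketch of Needleman-Wunsch global alignment scoring. The scoring values (match +1, mismatch -1, gap -1), the nearest-template classification rule, and the function names are assumptions made for illustration; the abstract does not state the parameters the authors used.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of two symbol strings."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def classify(query, templates):
    """Pick the label whose template string aligns best with the query.

    `templates` maps a hypothetical action label to an actionlet string.
    """
    return max(templates, key=lambda label: needleman_wunsch_score(query, templates[label]))
```

    With hypothetical templates such as {"wave": "ABAB", "sit": "CCDD"}, the query string "ABAAB" would be assigned to "wave", because it aligns with that template using a single gap.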

  3. Compact quantum random number generator based on superluminescent light-emitting diodes

    NASA Astrophysics Data System (ADS)

    Wei, Shihai; Yang, Jie; Fan, Fan; Huang, Wei; Li, Dashuang; Xu, Bingjie

    2017-12-01

    By measuring the amplified spontaneous emission (ASE) noise of superluminescent light-emitting diodes, we propose and realize a practical quantum random number generator (QRNG). In the QRNG, after the detection and amplification of the ASE noise, the data acquisition and randomness extraction, which are integrated in a field programmable gate array (FPGA), are both implemented in real time, and the final random bit sequences are delivered to a host computer at a real-time generation rate of 1.2 Gbps. Further, to achieve compactness, all the components of the QRNG are integrated on three independent printed circuit boards, and the QRNG is packed in a small enclosure measuring 140 mm × 120 mm × 25 mm. The final random bit sequences pass all the NIST-STS and DIEHARD tests.

  4. Draft genome sequence of ramie, Boehmeria nivea (L.) Gaudich.

    PubMed

    Luan, Ming-Bao; Jian, Jian-Bo; Chen, Ping; Chen, Jun-Hui; Chen, Jian-Hua; Gao, Qiang; Gao, Gang; Zhou, Ju-Hong; Chen, Kun-Mei; Guang, Xuan-Min; Chen, Ji-Kang; Zhang, Qian-Qian; Wang, Xiao-Fei; Fang, Long; Sun, Zhi-Min; Bai, Ming-Zhou; Fang, Xiao-Dong; Zhao, Shan-Cen; Xiong, He-Ping; Yu, Chun-Ming; Zhu, Ai-Guo

    2018-05-01

    Ramie, Boehmeria nivea (L.) Gaudich, family Urticaceae, is a plant native to eastern Asia, and one of the world's oldest fibre crops. It is also used as animal feed and for the phytoremediation of heavy metal-contaminated farmlands. Thus, the genome sequence of ramie was determined to explore the molecular basis of its fibre quality, protein content and phytoremediation. To further understand the ramie genome, different paired-end and mate-pair libraries were combined to generate 134.31 Gb of raw DNA sequences using the Illumina whole-genome shotgun sequencing approach. The highly heterozygous B. nivea genome was assembled using the Platanus Genome Assembler, which is an effective tool for the assembly of highly heterozygous genome sequences. The final length of the draft genome of this species was approximately 341.9 Mb (contig N50 = 22.62 kb, scaffold N50 = 1,126.36 kb). Based on the ramie genome annotations, 30,237 protein-coding genes were predicted, and the repetitive element content was 46.3%. The completeness of the final assembly was evaluated by benchmarking universal single-copy orthologous genes (BUSCO); 90.5% of the 1,440 expected embryophytic genes were identified as complete, and 4.9% were identified as fragmented. Phylogenetic analysis based on single-copy gene families and one-to-one orthologous genes placed ramie with mulberry and cannabis, within the clade of urticalean rosids. Genome information of ramie will be a valuable resource for the conservation of endangered Boehmeria species and for future studies on the biogeography and characteristic evolution of members of Urticaceae. © 2018 John Wiley & Sons Ltd.
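
    The contig and scaffold N50 values quoted above follow a standard definition: sort the sequences from longest to shortest and report the length at which the running total first reaches half of the assembly size. A minimal sketch of that calculation follows; the function name and the example lengths are hypothetical and are not taken from the ramie assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the cumulative sum covers half
        # of the assembly is the N50.
        if running >= half_total:
            return length


# Toy lengths (arbitrary units), not real assembly data.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = n50(example_lengths)
```

    For the toy lengths the total is 54, and the three longest sequences (10, 9, 8) already sum to 27, so the N50 is 8.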

  5. Expression of Bacillus anthracis Protective Antigen in Bacillus megaterium

    DTIC Science & Technology

    2004-03-01

    was easily purified to homogeneity in a single step by ion exchange chromatography. N-terminal amino acid sequencing of the final product confirmed...and this material was purified in a single step by ion-exchange chromatography. N-terminal amino acid sequencing definitively proved that the rPA was...production of a poly-D-glutamic acid capsule, encoded by pXO2, is essential for immune evasion and cellular survival in the host [3,4]. The lethal effects

  6. New Erwinia-Like Organism Causing Cervical Lymphadenitis

    PubMed Central

    Shin, Sang Yop; Lee, Mi Young; Song, Jae-Hoon; Ko, Kwan Soo

    2008-01-01

    The first case of cervical lymphadenitis due to infection by a new Erwinia-like organism is reported. The organism was identified initially as Pantoea sp. by a Vitek 2-based assessment but was finally identified as a member of the genus Erwinia by 16S rRNA gene sequence analysis. The isolate displayed 98.9% 16S rRNA gene sequence similarity to that of E. tasmaniensis and showed phenotypic characteristics that were different from other Erwinia species. PMID:18614665

  7. Mycofier: a new machine learning-based classifier for fungal ITS sequences.

    PubMed

    Delgado-Serrano, Luisa; Restrepo, Silvia; Bustos, Jose Ricardo; Zambrano, Maria Mercedes; Anzola, Juan Manuel

    2016-08-11

    Taxonomic and phylogenetic classification based on sequence analysis of the ITS1 genomic region has become a crucial component of fungal ecology and diversity studies. Currently, there is no accurate alignment-free classification tool for fungal ITS1 sequences in large environmental surveys. This study describes the development of a machine learning-based classifier for the taxonomic assignment of fungal ITS1 sequences at the genus level. A fungal ITS1 sequence database was built using curated data, and training and test sets were generated from it. A Naïve Bayesian classifier was built using features from the primary sequence, with an accuracy of 87% in classification at the genus level. The final model was based on a Naïve Bayes algorithm using ITS1 sequences from 510 fungal genera. This classifier, denoted Mycofier, provides classification accuracy similar to that of BLASTN, but the database used for the classification contains curated data, and the tool, being independent of alignment, is more efficient and contributes to the field, given the lack of an accurate classification tool for large data sets of fungal ITS1 sequences. The software and source code for Mycofier are freely available (https://github.com/ldelgado-serrano/mycofier.git).
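
    To make the idea of an alignment-free Naïve Bayes sequence classifier concrete, the following is a minimal sketch that is not Mycofier's code. It assumes that the sequence features are overlapping k-mer counts and that Laplace smoothing is used; the abstract only says "features from the primary sequence". The class name, parameters, and toy sequences are hypothetical.

```python
import math
from collections import Counter, defaultdict


def kmers(seq, k=3):
    """Split a sequence into its overlapping k-mers."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


class KmerNaiveBayes:
    """Multinomial Naive Bayes over k-mer counts with Laplace smoothing."""

    def __init__(self, k=3, alpha=1.0):
        self.k = k
        self.alpha = alpha
        self.kmer_counts = defaultdict(Counter)  # label -> k-mer counts
        self.class_counts = Counter()            # label -> number of sequences
        self.vocab = set()

    def fit(self, sequences, labels):
        for seq, label in zip(sequences, labels):
            self.class_counts[label] += 1
            for km in kmers(seq, self.k):
                self.kmer_counts[label][km] += 1
                self.vocab.add(km)
        return self

    def predict(self, seq):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.kmer_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n / total)  # log prior
            for km in kmers(seq, self.k):
                if km in self.vocab:  # ignore k-mers never seen in training
                    score += math.log((counts[km] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data with invented sequences and genus labels.
model = KmerNaiveBayes(k=3).fit(
    ["ACGTACGTACGT", "ACGTACGAACGT", "TTGGCCTTGGCC", "TTGGCCTAGGCC"],
    ["GenusA", "GenusA", "GenusB", "GenusB"],
)
```

    Because scoring only needs k-mer lookups, classification time grows with the query length and the number of classes rather than with the size of a reference alignment, which is the efficiency argument the abstract makes for alignment-free tools.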

  8. Asymmetry of perceived key movement in chorale sequences: converging evidence from a probe-tone analysis.

    PubMed

    Cuddy, L L; Thompson, W F

    1992-01-01

    In a probe-tone experiment, two groups of listeners--one trained, the other untrained, in traditional music theory--rated the goodness of fit of each of the 12 notes of the chromatic scale to four-voice harmonic sequences. Sequences were 12 simplified excerpts from Bach chorales, 4 nonmodulating, and 8 modulating. Modulations occurred either one or two steps in either the clockwise or the counterclockwise direction on the cycle of fifths. A consistent pattern of probe-tone ratings was obtained for each sequence, with no significant differences between listener groups. Two methods of analysis (Fourier analysis and regression analysis) revealed a directional asymmetry in the perceived key movement conveyed by modulating sequences. For a given modulation distance, modulations in the counterclockwise direction effected a clearer shift in tonal organization toward the final key than did clockwise modulations. The nature of the directional asymmetry was consistent with results reported for identification and rating of key change in the sequences (Thompson & Cuddy, 1989a). Further, according to the multiple-regression analysis, probe-tone ratings did not merely reflect the distribution of tones in the sequence. Rather, ratings were sensitive to the temporal structure of the tonal organization in the sequence.

  9. Protein Sequence Classification with Improved Extreme Learning Machine Algorithms

    PubMed Central

    2014-01-01

    Precisely classifying a protein sequence against a large biological protein sequence database plays an important role in developing competitive pharmacological products. Conventional methods compare the unseen sequence with all the identified protein sequences and return the category index of the protein with the highest similarity score, which is usually time-consuming. Therefore, it is urgent and necessary to build an efficient protein sequence classification system. In this paper, we study the performance of protein sequence classification using single-hidden-layer feedforward networks (SLFNs). The recent efficient extreme learning machine (ELM) and its variants are utilized as the training algorithms. The optimal pruned ELM (OP-ELM) is employed for protein sequence classification for the first time in this paper. To further enhance the performance, an ensemble-based SLFN structure is constructed in which multiple SLFNs with the same number of hidden nodes and the same activation function are used as ensembles. For each ensemble, the same training algorithm is adopted. The final category index is derived using the majority voting method. Two approaches, namely, the basic ELM and the OP-ELM, are adopted for the ensemble-based SLFNs. The performance is analyzed and compared with several existing methods using datasets obtained from the Protein Information Resource center. The experimental results show the superiority of the proposed algorithms. PMID:24795876
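
    The majority-voting step that combines the ensemble outputs can be sketched as follows. This is an illustration only: the function name is hypothetical, and the tie-breaking rule (the tied label that appears first in the input wins) is an assumption, since the abstract does not say how ties are resolved.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent category index among ensemble predictions."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    top = max(counts.values())
    # Assumed tie-break: the first label in input order that reaches the top count.
    for label in predictions:
        if counts[label] == top:
            return label


# Category indices predicted by five hypothetical ensemble members.
final_category = majority_vote([2, 5, 2, 7, 2])
```

    Here three of the five members predict category 2, so the ensemble output is 2.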

  10. Gelada vocal sequences follow Menzerath's linguistic law.

    PubMed

    Gustison, Morgan L; Semple, Stuart; Ferrer-I-Cancho, Ramon; Bergman, Thore J

    2016-05-10

    Identifying universal principles underpinning diverse natural systems is a key goal of the life sciences. A powerful approach in addressing this goal has been to test whether patterns consistent with linguistic laws are found in nonhuman animals. Menzerath's law is a linguistic law that states that the larger the construct, the smaller the size of its constituents. Here, to our knowledge, we present the first evidence that Menzerath's law holds in the vocal communication of a nonhuman species. We show that, in vocal sequences of wild male geladas (Theropithecus gelada), construct size (sequence size in number of calls) is negatively correlated with constituent size (duration of calls). Call duration does not vary significantly with position in the sequence, but call sequence composition does change with sequence size and most call types are abbreviated in larger sequences. We also find that intercall intervals follow the same relationship with sequence size as do calls. Finally, we provide formal mathematical support for the idea that Menzerath's law reflects compression, the principle of minimizing the expected length of a code. Our findings suggest that a common principle underpins human and gelada vocal communication, highlighting the value of exploring the applicability of linguistic laws in vocal systems outside the realm of language.
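
    The negative relationship between construct size and constituent size described above can be checked with a rank correlation. The sketch below computes a Spearman coefficient in plain Python. The choice of Spearman correlation is an assumption, because the abstract does not name the statistic used, and the numbers are invented toy values, not gelada data.

```python
def ranks(values):
    """Assign 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented toy values: calls per sequence and mean call duration in seconds.
sequence_sizes = [1, 2, 3, 4, 5, 6]
mean_call_durations = [0.9, 0.8, 0.85, 0.6, 0.5, 0.4]
rho = spearman(sequence_sizes, mean_call_durations)
```

    A negative `rho` for real data would be consistent with Menzerath's law: longer sequences are built from shorter calls.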

  11. Studying long 16S rDNA sequences with ultrafast-metagenomic sequence classification using exact alignments (Kraken).

    PubMed

    Valenzuela-González, Fabiola; Martínez-Porchas, Marcel; Villalpando-Canchola, Enrique; Vargas-Albores, Francisco

    2016-03-01

    Ultrafast-metagenomic sequence classification using exact alignments (Kraken) is a novel approach to classify 16S rDNA sequences. The classifier is based on mapping short sequences to the lowest ancestor and performing alignments to form subtrees with specific weights in each taxon node. This study aimed to evaluate the classification performance of Kraken with long 16S rDNA random environmental sequences produced by cloning and then Sanger sequenced. A total of 480 clones were isolated and expanded, and 264 of these clones formed contigs (1352 ± 153 bp). The same sequences were analyzed using the Ribosomal Database Project (RDP) classifier. Deeper classification performance was achieved by Kraken than by the RDP: 73% of the contigs were classified up to the species or variety levels, whereas 67% of these contigs were classified no further than the genus level by the RDP. The results also demonstrated that unassembled sequences analyzed by Kraken provide similar or even deeper information. Moreover, sequences that did not form contigs, which are usually discarded by other programs, provided meaningful information when analyzed by Kraken. Finally, it appears that the assembly step for Sanger sequences can be eliminated when using Kraken. Kraken accumulates the information of both sequence senses, providing additional elements for the classification. In conclusion, the results demonstrate that Kraken is an excellent choice for the taxonomic assignment of sequences obtained by Sanger sequencing or by third-generation sequencing, the main goal of which is to generate longer sequences. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

    PubMed

    Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario

    2011-01-01

    Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high-centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.

  13. Final progress report, Construction of a genome-wide highly characterized clone resource for genome sequencing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nierman, William C.

    At TIGR, human Bacterial Artificial Chromosome (BAC) end sequencing and trimming were performed with an overall sequencing success rate of 65%. CalTech human BAC libraries A, B, C and D as well as Roswell Park Cancer Institute's library RPCI-11 were used. To date, we have generated >300,000 end sequences from >186,000 human BAC clones with an average read length of ~460 bp, for a total of 141 Mb covering ~4.7% of the genome. Over sixty percent of the clones have BAC end sequences (BESs) from both ends, representing over five-fold coverage of the genome by the paired-end clones. The average phred Q20 length is ~400 bp. This high accuracy makes our BESs match the human finished sequences with an average identity of 99% and a match length of 450 bp, and a frequency of one match per 12.8 kb of contig sequence. Our sample tracking has ensured a clone tracking accuracy of >90%, which gives researchers high confidence in (1) retrieving the right clone from the BAC libraries based on the sequence matches; and (2) building a minimum tiling path of sequence-ready clones across the genome and genome assembly scaffolds.

  14. Base-Calling Algorithm with Vocabulary (BCV) Method for Analyzing Population Sequencing Chromatograms

    PubMed Central

    Fantin, Yuri S.; Neverov, Alexey D.; Favorov, Alexander V.; Alvarez-Figueroa, Maria V.; Braslavskaya, Svetlana I.; Gordukova, Maria A.; Karandashova, Inga V.; Kuleshov, Konstantin V.; Myznikova, Anna I.; Polishchuk, Maya S.; Reshetov, Denis A.; Voiciehovskaya, Yana A.; Mironov, Andrei A.; Chulanov, Vladimir P.

    2013-01-01

    Sanger sequencing is a common method of reading DNA sequences. It is less expensive than high-throughput methods, and it is appropriate for numerous applications including molecular diagnostics. However, sequencing mixtures of similar DNA of pathogens with this method is challenging. This is important because most clinical samples contain such mixtures, rather than pure single strains. The traditional solution is to sequence selected clones of PCR products, a complicated, time-consuming, and expensive procedure. Here, we propose the base-calling with vocabulary (BCV) method that computationally deciphers Sanger chromatograms obtained from mixed DNA samples. The inputs to the BCV algorithm are a chromatogram and a dictionary of sequences that are similar to those we expect to obtain. We apply the base-calling function on a test dataset of chromatograms without ambiguous positions, as well as one with 3–14% sequence degeneracy. Furthermore, we use BCV to assemble a consensus sequence for an HIV genome fragment in a sample containing a mixture of viral DNA variants and to determine the positions of the indels. Finally, we detect drug-resistant Mycobacterium tuberculosis strains carrying frameshift mutations mixed with wild-type bacteria in the pncA gene, and roughly characterize bacterial communities in clinical samples by direct 16S rRNA sequencing. PMID:23382983

  15. A programmable method for massively parallel targeted sequencing

    PubMed Central

    Hopmans, Erik S.; Natsoulis, Georges; Bell, John M.; Grimes, Susan M.; Sieh, Weiva; Ji, Hanlee P.

    2014-01-01

    We have developed a targeted resequencing approach referred to as Oligonucleotide-Selective Sequencing. In this study, we report a series of significant improvements and novel applications of this method whereby the surface of a sequencing flow cell is modified in situ to capture specific genomic regions of interest from a sample and then sequenced. These improvements include a fully automated targeted sequencing platform through the use of a standard Illumina cBot fluidics station. Targeting optimization increased the yield of total on-target sequencing data 2-fold compared to the previous iteration, while simultaneously increasing the percentage of reads that could be mapped to the human genome. The described assays cover up to 1421 genes with a total coverage of 5.5 Megabases (Mb). We demonstrate a 10-fold abundance uniformity of greater than 90% in 1 log distance from the median and a targeting rate of up to 95%. We also sequenced continuous genomic loci up to 1.5 Mb while simultaneously genotyping SNPs and genes. Variants with low minor allele fraction were sensitively detected at levels of 5%. Finally, we determined the exact breakpoint sequence of cancer rearrangements. Overall, this approach has high performance for selective sequencing of genome targets, configuration flexibility and variant calling accuracy. PMID:24782526

  16. SN 1987A - The evolution from red to blue

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tuchman, Y.; Wheeler, J.C.

    1989-11-01

    Envelope models in thermal and dynamic equilibrium are used to explore the nature of the transition of SK -69 deg 202, the progenitor of SN 1987A, from the Hayashi track to its final blue position in the H-R diagram. Loci of possible thermal equilibrium solutions are presented as a function of Teff and M(C/O), the mass of the carbon/oxygen core interior to the helium burning shell. It is found that uniform helium enrichment of the envelope results in red-blue evolution but that the resulting blue solution is much hotter than SK -69 deg 202. Solutions in which the only change is to redistribute the portion of the envelope enriched in helium during main-sequence convective core contraction into a step function with Y of about 0.5 at a mass cut of about 10 solar masses give a natural transition from red to blue and a final value of Teff in agreement with observations. It is argued that SK -69 deg 202 probably fell on a post-Hayashi track sequence at moderate Teff. The possible connection of this sequence to the step distribution in the H-R diagram of the LMC. 19 refs.

  17. Automatic registration of panoramic image sequence and mobile laser scanning data using semantic features

    NASA Astrophysics Data System (ADS)

    Li, Jianping; Yang, Bisheng; Chen, Chi; Huang, Ronggang; Dong, Zhen; Xiao, Wen

    2018-02-01

    Inaccurate exterior orientation parameters (EoPs) between sensors obtained by pre-calibration lead to failure of registration between panoramic image sequences and mobile laser scanning data. To address this challenge, this paper proposes an automatic registration method based on semantic features extracted from panoramic images and point clouds. Firstly, accurate rotation parameters between the panoramic camera and the laser scanner are estimated using GPS and IMU aided structure from motion (SfM). The initial EoPs of panoramic images are obtained at the same time. Secondly, vehicles in panoramic images are extracted by the Faster-RCNN as candidate primitives to be matched with potential corresponding primitives in point clouds according to the initial EoPs. Finally, translation between the panoramic camera and the laser scanner is refined by maximizing the overlapping area of corresponding primitive pairs based on Particle Swarm Optimization (PSO), resulting in a finer registration between panoramic image sequences and point clouds. Two challenging urban scenes were used to assess the proposed method, and the final registration errors for both scenes were less than three pixels, which demonstrates a high level of automation, robustness and accuracy.

  18. Survivability of the Hardened Mobile Launcher When Attacked by a Hypothetical Rapidly Retargetable ICBM System.

    DTIC Science & Technology

    1986-03-01

    Aimpoints 22 Overview 22 Random Movement of the HML 23 Computing Burst Locations and the HML’s Final Location 23 Selecting the HML’s Speed. 29...described threat. The actual model used in this study is an MEASIC computer program written and run on an Apple Macintosh computer. It is described in...mechanics of the computer program that models the warheads’ flight time sequence, it will be helpful to explain some of the elements of the sequence

  19. Stereoselective Total Synthesis of Radiolabeled Artemisinin (Qinghaosu).

    DTIC Science & Technology

    Our previous total synthesis of (+)-artemisinin has been optimized from 18 to 11 steps. The final two steps in the sequence are: 1) alkylation of a...product (+)-artemisinin. The first step was repeated utilizing carbon-14 methyl iodide and the sequence completed as before to afford the desired...carbon-14-labeled (+)-artemisinin. The label resides in the methyl group pendant from the lactone ring (ring D), the position of attachment being C-9, the carbon atom being C-16. Keywords: Antimalarials.

  20. Cohabitational and marital histories of adults in Great Britain.

    PubMed

    Haskey, J

    1999-01-01

    This article presents findings on cohabitation, derived from cohabitation and marriage histories collected in a specially designed module of the ONS Omnibus Survey. It examines the sequence of types of partnerships, and how this sequence varies by birth cohort of respondents. Also compared is the relative stability of cohabiting unions and married partnerships. Finally, the reasons for converting a cohabiting union into a marriage are analysed, separately for men and women, and separately according to whether the marriage continued or ended.

  1. Deciphering mRNA Sequence Determinants of Protein Production Rate

    NASA Astrophysics Data System (ADS)

    Szavits-Nossan, Juraj; Ciandrini, Luca; Romano, M. Carmen

    2018-03-01

    One of the greatest challenges in biophysical models of translation is to identify coding sequence features that affect the rate of translation and therefore the overall protein production in the cell. We propose an analytic method to solve a translation model based on the inhomogeneous totally asymmetric simple exclusion process, which allows us to unveil simple design principles of nucleotide sequences determining protein production rates. Our solution shows an excellent agreement when compared to numerical genome-wide simulations of S. cerevisiae transcript sequences and predicts that the first 10 codons, which is the ribosome footprint length on the mRNA, together with the value of the initiation rate, are the main determinants of protein production rate under physiological conditions. Finally, we interpret the obtained analytic results based on the evolutionary role of the codons' choice for regulating translation rates and ribosome densities.

  2. Research on gait-based human identification

    NASA Astrophysics Data System (ADS)

    Li, Youguo

    Gait recognition refers to the automatic identification of an individual based on his or her style of walking. This paper proposes a gait recognition method based on a Continuous Hidden Markov Model with a Mixture of Gaussians (G-CHMM). First, we initialize a Gaussian mixture model for the training image sequence with the K-means algorithm, then train the HMM parameters using the Baum-Welch algorithm. The gait feature sequences are used to train a continuous HMM for every person, so that the 7 key frames and the resulting HMM represent each person's gait sequence. Finally, recognition is achieved with the forward algorithm. Experiments on the CASIA gait databases yield a comparatively high correct identification rate and comparatively strong robustness to variation in body angle.

  3. DHS Internship Final Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    House, Samantha

    2014-09-01

    This summer I worked on projects that involved RNA sequencing of pathogens after an infection of host cells. The goal of these projects was to continue developing pathogen enrichment strategies for transcriptomic analysis, and also to perform host-pathogen interaction studies.

  4. Design of a final approach spacing tool for TRACON air traffic control

    NASA Technical Reports Server (NTRS)

    Davis, Thomas J.; Erzberger, Heinz; Bergeron, Hugh

    1989-01-01

    This paper describes an automation tool that assists air traffic controllers in the Terminal Radar Approach Control (TRACON) Facilities in providing safe and efficient sequencing and spacing of arrival traffic. The automation tool, referred to as the Final Approach Spacing Tool (FAST), allows the controller to interactively choose various levels of automation and advisory information ranging from predicted time errors to speed and heading advisories for controlling time error. FAST also uses a timeline to display current scheduling and sequencing information for all aircraft in the TRACON airspace. FAST combines accurate predictive algorithms and state-of-the-art mouse and graphical interface technology to present advisory information to the controller. Furthermore, FAST exchanges various types of traffic information and communicates with automation tools being developed for the Air Route Traffic Control Center. Thus it is part of an integrated traffic management system for arrival traffic at major terminal areas.

  5. The ASLOTS concept: An interactive, adaptive decision support concept for Final Approach Spacing of Aircraft (FASA). FAA-NASA Joint University Program

    NASA Technical Reports Server (NTRS)

    Simpson, Robert W.

    1993-01-01

    This presentation outlines a concept for an adaptive, interactive decision support system to assist controllers at a busy airport in achieving efficient use of multiple runways. The concept is being implemented as a computer code called FASA (Final Approach Spacing for Aircraft), and will be tested and demonstrated in ATCSIM, a high fidelity simulation of terminal area airspace and airport surface operations. Objectives are: (1) to provide automated cues to assist controllers in the sequencing and spacing of landing and takeoff aircraft; (2) to provide the controller with a limited ability to modify the sequence and spacings between aircraft, and to insert takeoffs and missed approach aircraft in the landing flows; (3) to increase spacing accuracy using more complex and precise separation criteria while reducing controller workload; and (4) to achieve higher operational takeoff and landing rates on multiple runways in poor visibility.

  6. Illumina Synthetic Long Read Sequencing Allows Recovery of Missing Sequences even in the “Finished” C. elegans Genome

    PubMed Central

    Li, Runsheng; Hsieh, Chia-Ling; Young, Amanda; Zhang, Zhihong; Ren, Xiaoliang; Zhao, Zhongying

    2015-01-01

    Most next-generation sequencing platforms permit acquisition of high-throughput DNA sequences, but the relatively short read length limits their use in genome assembly or finishing. Illumina has recently released a technology called Synthetic Long-Read Sequencing that can produce reads of unusual length, i.e., predominantly around 10 Kb. However, a systematic assessment of their use in genome finishing and assembly is still lacking. We evaluate the promise and deficiency of the long reads in these aspects using an isogenic C. elegans genome with no gaps. First, the reads are highly accurate and capable of recovering most types of repetitive sequences. However, the presence of tandem repetitive sequences prevents pre-assembly of long reads in the relevant genomic region. Second, the reads are able to reliably detect missing but not extra sequences in the C. elegans genome. Third, the reads of smaller size are more capable of recovering repetitive sequences than those of bigger size. Fourth, at least 40 Kbp of missing genomic sequences are recovered in the C. elegans genome using the long reads. Finally, an N50 contig size of at least 86 Kbp can be achieved with 24× reads but with substantial mis-assembly errors, highlighting a need for novel assembly algorithms for the long reads. PMID:26039588
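
    The N50 contig size quoted in this abstract is a standard assembly statistic: the contig length L such that contigs of length L or longer contain at least half of all assembled bases. A minimal sketch of that calculation follows; the function name `n50` and the example lengths are illustrative assumptions, not data or code from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half the bases are covered.
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths in bp (illustrative only, not from the study).
example_lengths = [100_000, 86_000, 40_000, 20_000, 10_000, 5_000]
print(n50(example_lengths))
```

    Because N50 depends only on contig lengths, it says nothing about correctness, which is why the abstract reports mis-assembly errors separately.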

  7. Filovirus RefSeq Entries: Evaluation and Selection of Filovirus Type Variants, Type Sequences, and Names

    PubMed Central

    Kuhn, Jens H.; Andersen, Kristian G.; Bào, Yīmíng; Bavari, Sina; Becker, Stephan; Bennett, Richard S.; Bergman, Nicholas H.; Blinkova, Olga; Bradfute, Steven; Brister, J. Rodney; Bukreyev, Alexander; Chandran, Kartik; Chepurnov, Alexander A.; Davey, Robert A.; Dietzgen, Ralf G.; Doggett, Norman A.; Dolnik, Olga; Dye, John M.; Enterlein, Sven; Fenimore, Paul W.; Formenty, Pierre; Freiberg, Alexander N.; Garry, Robert F.; Garza, Nicole L.; Gire, Stephen K.; Gonzalez, Jean-Paul; Griffiths, Anthony; Happi, Christian T.; Hensley, Lisa E.; Herbert, Andrew S.; Hevey, Michael C.; Hoenen, Thomas; Honko, Anna N.; Ignatyev, Georgy M.; Jahrling, Peter B.; Johnson, Joshua C.; Johnson, Karl M.; Kindrachuk, Jason; Klenk, Hans-Dieter; Kobinger, Gary; Kochel, Tadeusz J.; Lackemeyer, Matthew G.; Lackner, Daniel F.; Leroy, Eric M.; Lever, Mark S.; Mühlberger, Elke; Netesov, Sergey V.; Olinger, Gene G.; Omilabu, Sunday A.; Palacios, Gustavo; Panchal, Rekha G.; Park, Daniel J.; Patterson, Jean L.; Paweska, Janusz T.; Peters, Clarence J.; Pettitt, James; Pitt, Louise; Radoshitzky, Sheli R.; Ryabchikova, Elena I.; Saphire, Erica Ollmann; Sabeti, Pardis C.; Sealfon, Rachel; Shestopalov, Aleksandr M.; Smither, Sophie J.; Sullivan, Nancy J.; Swanepoel, Robert; Takada, Ayato; Towner, Jonathan S.; van der Groen, Guido; Volchkov, Viktor E.; Volchkova, Valentina A.; Wahl-Jensen, Victoria; Warren, Travis K.; Warfield, Kelly L.; Weidmann, Manfred; Nichol, Stuart T.

    2014-01-01

    Sequence determination of complete or coding-complete genomes of viruses is becoming common practice for supporting the work of epidemiologists, ecologists, virologists, and taxonomists. Sequencing duration and costs are rapidly decreasing, sequencing hardware is under modification for use by non-experts, and software is constantly being improved to simplify sequence data management and analysis. Thus, analysis of virus disease outbreaks on the molecular level is now feasible, including characterization of the evolution of individual virus populations in single patients over time. The increasing accumulation of sequencing data creates a management problem for the curators of commonly used sequence databases and an entry retrieval problem for end users. Therefore, utilizing the data to their fullest potential will require setting nomenclature and annotation standards for virus isolates and associated genomic sequences. The National Center for Biotechnology Information’s (NCBI’s) RefSeq is a non-redundant, curated database for reference (or type) nucleotide sequence records that supplies source data to numerous other databases. Building on recently proposed templates for filovirus variant naming, we report consensus decisions from a majority of past and currently active filovirus experts on the eight filovirus type variants and isolates to be represented in RefSeq, their final designations, and their associated sequences. PMID:25256396

  8. Whistle sequences in wild killer whales (Orcinus orca).

    PubMed

    Riesch, Rüdiger; Ford, John K B; Thomsen, Frank

    2008-09-01

    Combining different stereotyped vocal signals into specific sequences increases the range of information that can be transferred between individuals. The temporal emission pattern and the behavioral context of vocal sequences have been described in detail for a variety of birds and mammals. Yet, in cetaceans, the study of vocal sequences is just in its infancy. Here, we provide a detailed analysis of sequences of stereotyped whistles in killer whales off Vancouver Island, British Columbia. A total of 1140 whistle transitions in 192 whistle sequences recorded from resident killer whales were analyzed using common spectrographic analysis techniques. In addition to the stereotyped whistles described by Riesch et al. [(2006). "Stability and group specificity of stereotyped whistles in resident killer whales, Orcinus orca, off British Columbia," Anim. Behav. 71, 79-91], we found a new and rare stereotyped whistle (W7) as well as two whistle elements, which are closely linked to whistle sequences: (1) stammers and (2) bridge elements. Furthermore, the frequency of occurrence of 12 different stereotyped whistle types within the sequences was not randomly distributed and the transition patterns between whistles were also nonrandom. Finally, whistle sequences were closely tied to close-range behavioral interactions (in particular among males). Hence, we conclude that whistle sequences in wild killer whales are complex signal series and propose that they are most likely emitted by single individuals.

  9. Identification of a Heterozygous SPG11 Mutation by Clinical Exome Sequencing in a Patient With Hereditary Spastic Paraplegia: A Case Report.

    PubMed

    Oh, Ja-Young; Do, Hyun Jung; Lee, Seungok; Jang, Ja-Hyun; Cho, Eun-Hae; Jang, Dae-Hyun

    2016-12-01

    Next-generation sequencing methods, such as whole-genome sequencing, whole-exome sequencing, and targeted panel sequencing, have been applied for diagnosis of many genetic diseases, and are in the process of replacing the traditional methods of genetic analysis. Clinical exome sequencing (CES), which provides not only sequence variation data but also clinical interpretation, aids in reaching a final conclusion with regard to genetic diagnosis. Sequencing of genes with clinical relevance rather than whole exome sequencing might be more suitable for the diagnosis of known hereditary disease with genetic heterogeneity. Here, we present the clinical usefulness of CES for the diagnosis of hereditary spastic paraplegia (HSP). We report the case of a patient who was strongly suspected of having HSP based on her clinical manifestations. HSP is one of the diseases with high genetic heterogeneity, with 72 different loci and 59 genes identified so far. Therefore, the traditional approach to diagnosing HSP with genetic analysis is very challenging and time-consuming. CES with the TruSight One Sequencing Panel, which enriches about 4,800 genes with clinical relevance, revealed compound heterozygous mutations in SPG11. One workflow and one procedure can provide the results of genetic analysis, and CES with enrichment of clinically relevant genes is a cost-effective and time-saving diagnostic tool for diseases with genetic heterogeneity, including HSP.

  10. Extrinsic Cognitive Load Impairs Spoken Word Recognition in High- and Low-Predictability Sentences.

    PubMed

    Hunter, Cynthia R; Pisoni, David B

    Listening effort (LE) induced by speech degradation reduces performance on concurrent cognitive tasks. However, a converse effect of extrinsic cognitive load on recognition of spoken words in sentences has not been shown. The aims of the present study were to (a) examine the impact of extrinsic cognitive load on spoken word recognition in a sentence recognition task and (b) determine whether cognitive load and/or LE needed to understand spectrally degraded speech would differentially affect word recognition in high- and low-predictability sentences. Downstream effects of speech degradation and sentence predictability on the cognitive load task were also examined. One hundred twenty young adults identified sentence-final spoken words in high- and low-predictability Speech Perception in Noise sentences. Cognitive load consisted of a preload of short (low-load) or long (high-load) sequences of digits, presented visually before each spoken sentence and reported either before or after identification of the sentence-final word. LE was varied by spectrally degrading sentences with four-, six-, or eight-channel noise vocoding. Level of spectral degradation and order of report (digits first or words first) were between-participants variables. Effects of cognitive load, sentence predictability, and speech degradation on accuracy of sentence-final word identification as well as recall of preload digit sequences were examined. In addition to anticipated main effects of sentence predictability and spectral degradation on word recognition, we found an effect of cognitive load, such that words were identified more accurately under low load than high load. However, load differentially affected word identification in high- and low-predictability sentences depending on the level of sentence degradation. 
Under severe spectral degradation (four-channel vocoding), the effect of cognitive load on word identification was present for high-predictability sentences but not for low-predictability sentences. Under mild spectral degradation (eight-channel vocoding), the effect of load was present for low-predictability sentences but not for high-predictability sentences. There were also reliable downstream effects of speech degradation and sentence predictability on recall of the preload digit sequences. Long digit sequences were more easily recalled following spoken sentences that were less spectrally degraded. When digits were reported after identification of sentence-final words, short digit sequences were recalled more accurately when the spoken sentences were predictable. Extrinsic cognitive load can impair recognition of spectrally degraded spoken words in a sentence recognition task. Cognitive load affected word identification in both high- and low-predictability sentences, suggesting that load may impact both context use and lower-level perceptual processes. Consistent with prior work, LE also had downstream effects on memory for visual digit sequences. Results support the proposal that extrinsic cognitive load and LE induced by signal degradation both draw on a central, limited pool of cognitive resources that is used to recognize spoken words in sentences under adverse listening conditions.

  11. NGS Catalog: A Database of Next Generation Sequencing Studies in Humans

    PubMed Central

    Xia, Junfeng; Wang, Qingguo; Jia, Peilin; Wang, Bing; Pao, William; Zhao, Zhongming

    2015-01-01

    Next generation sequencing (NGS) technologies have been rapidly applied in biomedical and biological research since their advent only a few years ago, and they are expected to advance at an unprecedented pace in the following years. To provide the research community with a comprehensive NGS resource, we have developed the database Next Generation Sequencing Catalog (NGS Catalog, http://bioinfo.mc.vanderbilt.edu/NGS/index.html), a continually updated database that collects, curates and manages available human NGS data obtained from published literature. NGS Catalog deposits publication information of NGS studies and their mutation characteristics (SNVs, small insertions/deletions, copy number variations, and structural variants), as well as mutated genes and gene fusions detected by NGS. Other functions include user data upload, NGS general analysis pipelines, and NGS software. NGS Catalog is particularly useful for investigators who are new to NGS but would like to take advantage of these powerful technologies for their own research. Finally, based on the data deposited in NGS Catalog, we summarized features and findings from whole exome sequencing, whole genome sequencing, and transcriptome sequencing studies for human diseases or traits. PMID:22517761

  12. Cloud-based adaptive exon prediction for DNA analysis.

    PubMed

    Putluri, Srinivasareddy; Zia Ur Rahman, Md; Fathima, Shaik Yasmeen

    2018-02-01

    Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of sensitive data. Under the conventional flow of gene information, gene sequence laboratories send out raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs can be minimised by use of a cloud service. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. True identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques are found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using the variable normalised least mean square algorithm and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of the various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database.
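
    The three-base periodicity property mentioned above can be illustrated with a plain Fourier measure: map each base to a 0/1 indicator sequence and sum the spectral power at frequency 1/3. The sketch below shows only that property; it is not the adaptive least-mean-square predictor the authors develop, and the function name `period3_power` and the test strings are illustrative assumptions.

```python
import cmath


def period3_power(seq):
    """Summed spectral power at frequency 1/3 over the four base indicators,
    normalised by sequence length."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    total = 0.0
    for base in "ACGT":
        # Fourier coefficient of the 0/1 indicator of this base at f = 1/3.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * n / 3)
            for n, symbol in enumerate(seq)
            if symbol == base
        )
        total += abs(coeff) ** 2
    return total / len(seq)


# A perfectly codon-periodic toy string versus a homopolymer (both synthetic).
print(period3_power("ATG" * 30), period3_power("A" * 90))
```

    In practice such a measure is evaluated over a sliding window, and windows with high power at frequency 1/3 are candidate exon regions.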

  13. Statistical properties of DNA sequences

    NASA Technical Reports Server (NTRS)

    Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

    1995-01-01

    We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
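
    As a rough illustration of detrended fluctuation analysis (DFA), the sketch below builds a "DNA walk", removes a least-squares linear trend inside non-overlapping boxes of length n, and estimates the exponent alpha from the slope of log F(n) against log n. The purine/pyrimidine step rule, the function names, and the box sizes are assumptions made for this example; the abstract does not specify them. An uncorrelated sequence should give alpha near 0.5, and long-range correlation shows up as a larger value.

```python
import math
import random


def dna_walk(seq):
    """Cumulative walk: +1 for a pyrimidine (C/T), -1 for a purine (A/G).
    This step rule is an assumption of the sketch."""
    position, walk = 0, []
    for base in seq.upper():
        position += 1 if base in "CT" else -1
        walk.append(position)
    return walk


def linear_residuals(ys):
    """Residuals of ys after subtracting its least-squares straight line."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return [y - (mean_y + slope * (x - mean_x)) for x, y in enumerate(ys)]


def dfa_fluctuation(walk, box):
    """Root-mean-square detrended fluctuation F(n) for box length n = box."""
    n_boxes = len(walk) // box
    squared = 0.0
    for b in range(n_boxes):
        segment = walk[b * box:(b + 1) * box]
        squared += sum(r * r for r in linear_residuals(segment))
    return math.sqrt(squared / (n_boxes * box))


def dfa_exponent(walk, boxes):
    """Least-squares slope of log F(n) versus log n."""
    xs = [math.log(b) for b in boxes]
    ys = [math.log(dfa_fluctuation(walk, b)) for b in boxes]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sum((x - mean_x) ** 2 for x in xs)


# Synthetic uncorrelated sequence (not GenBank data), fixed seed.
rng = random.Random(0)
random_seq = "".join(rng.choice("ACGT") for _ in range(4000))
print(dfa_exponent(dna_walk(random_seq), [8, 16, 32, 64, 128]))
```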

  14. An improved TCF sequence for biobleaching kenaf pulp: influence of the hexenuronic acid content and the use of xylanase.

    PubMed

    Andreu, Glòria; Vidal, Teresa

    2014-01-01

    Enzymatic delignification with laccase from Trametes villosa used in combination with chemical mediators (acetosyringone, acetovanillone and 1-hydroxybenzotriazole) to improve the totally chlorine-free (TCF) bleaching of kenaf pulp was studied. The best final pulp properties were obtained by using an LHBTQPo sequence developed by incorporating a laccase-mediator stage into an industrial bleaching sequence involving chelation and peroxide stages. The new sequence resulted in increased kenaf pulp delignification (90.4%) and brightness (77.2%ISO) relative to a conventional TCF chemical sequence (74.5% delignification and 74.5% brightness). Also, the sequence provided bleached kenaf fibers with high cellulose content (pulp viscosity of 890 g·mL(-1) vs 660 g·mL(-1)). Scanning electron micrographs revealed that xylanase altered fiber surfaces and facilitated reagent access as a result. However, the LHBTX (xylanase) stage removed 21% of hexenuronic acids in kenaf pulp. These recalcitrant compounds spent additional bleaching reagents and affected pulp properties after peroxide stage. Copyright © 2013 Elsevier Ltd. All rights reserved.

  15. Stock price forecasting based on time series analysis

    NASA Astrophysics Data System (ADS)

    Chi, Wan Le

    2018-05-01

    By using historical stock price data to set up a sequence model that explains the intrinsic relationships in the data, the future stock price can be forecasted. The models used are the auto-regressive model, the moving-average model and the autoregressive moving-average (ARMA) model. A unit root test was applied to the original data sequence to judge whether it was stationary. A non-stationary original sequence was first-order differenced, and the stationarity of the differenced sequence was then re-inspected. If it was still non-stationary, second-order differencing of the sequence was carried out. The autocorrelation diagram and partial correlation diagram were used to evaluate the parameters of the identified ARMA model, including the coefficients of the model and the model order. Finally, the model was used to fit and forecast the Shanghai Composite Index daily closing price. Results showed that the non-stationary original data series was stationary after the second-order difference. The forecast value of the Shanghai Composite Index daily closing price was close to the actual value, indicating that the ARMA model in the paper had a certain accuracy.
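
    The differencing and autocorrelation steps described in this abstract can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the function names and the toy series are hypothetical, and the unit root test itself (for example an augmented Dickey-Fuller test) and the ARMA fit are not implemented here.

```python
def difference(series, order=1):
    """Apply first differencing `order` times: x[t] - x[t-1]."""
    out = list(series)
    for _ in range(order):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    covariance_sum = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance_sum / variance_sum


# Toy series with a quadratic trend (illustrative, not index data):
# its second difference is constant, so the trend is fully removed.
toy_prices = [t * t for t in range(6)]
print(difference(toy_prices, 1))
print(difference(toy_prices, 2))
```

    In the workflow the abstract describes, the autocorrelation (and partial autocorrelation) of the differenced series would then guide the choice of model order.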

  16. Mobile Genome Express (MGE): A comprehensive automatic genetic analyses pipeline with a mobile device.

    PubMed

    Yoon, Jun-Hee; Kim, Thomas W; Mendez, Pedro; Jablons, David M; Kim, Il-Jin

    2017-01-01

    The development of next-generation sequencing (NGS) technology allows sequencing of whole exomes or genomes. However, data analysis is still the biggest bottleneck for its wide implementation. Most laboratories still depend on manual procedures for data handling and analyses, which translates into a delay and decreased efficiency in the delivery of NGS results to doctors and patients. Thus, there is high demand for developing an automatic and easy-to-use NGS data analysis system. We developed a comprehensive, automatic genetic analysis controller named Mobile Genome Express (MGE) that works on smartphones or other mobile devices. MGE can handle all the steps of genetic analysis, such as sample information submission, sequencing run quality check from the sequencer, secured data transfer and results review. We sequenced an Actrometrix control DNA containing multiple proven human mutations using a targeted sequencing panel, and the whole analysis was managed by MGE and its data reviewing program, called ELECTRO. All steps were processed automatically except for the final sequencing review procedure with ELECTRO to confirm mutations. The data analysis process was completed within several hours. We confirmed that the mutations we identified were consistent with our previous results obtained by using multi-step, manual pipelines.

  17. The twilight zone of cis element alignments.

    PubMed

    Sebastian, Alvaro; Contreras-Moreira, Bruno

    2013-02-01

    Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein-DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein-DNA interfaces is significantly more conserved than their sequence, and we measure how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments.

  18. The twilight zone of cis element alignments

    PubMed Central

    Sebastian, Alvaro; Contreras-Moreira, Bruno

    2013-01-01

    Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein–DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein–DNA interfaces is significantly more conserved than their sequence, and we measure how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments. PMID:23268451

  19. Hysteretic energy prediction method for mainshock-aftershock sequences

    NASA Astrophysics Data System (ADS)

    Zhai, Changhai; Ji, Duofa; Wen, Weiping; Li, Cuihua; Lei, Weidong; Xie, Lili

    2018-04-01

    Structures located in seismically active regions may be subjected to mainshock-aftershock (MSAS) sequences. Strong aftershocks significantly affect the hysteretic energy demand of structures. The hysteretic energy, $E_{H,seq}$, is normalized by mass $m$ and expressed in terms of the equivalent velocity, $V_{D,seq}$, to quantitatively investigate aftershock effects on the hysteretic energy of structures. The equivalent velocity, $V_{D,seq}$, is computed by analyzing the response time-history of an inelastic single-degree-of-freedom (SDOF) system with a varying vibration period subjected to 309 MSAS sequences. The present study selected two kinds of MSAS sequences, with one aftershock and two aftershocks, respectively. The aftershocks are scaled to maintain different relative intensities. The variation of the equivalent velocity, $V_{D,seq}$, is studied for consideration of the ductility values, site conditions, relative intensities, number of aftershocks, hysteretic models, and damping ratios. The MSAS sequence with one aftershock exhibited a 10% to 30% hysteretic energy increase, whereas the MSAS sequence with two aftershocks presented a 20% to 40% hysteretic energy increase. Finally, a hysteretic energy prediction equation is proposed as a function of the vibration period, ductility value, and damping ratio to estimate hysteretic energy for mainshock-aftershock sequences.

  20. Generation of 2A-linked multicistronic cassettes by recombinant PCR.

    PubMed

    Szymczak-Workman, Andrea L; Vignali, Kate M; Vignali, Dario A A

    2012-02-01

    The need for reliable, multicistronic vectors for multigene delivery is at the forefront of biomedical technology. It is now possible to express multiple proteins from a single open reading frame (ORF) using 2A peptide-linked multicistronic vectors. These small sequences, when cloned between genes, allow for efficient, stoichiometric production of discrete protein products within a single vector through a novel "cleavage" event within the 2A peptide sequence. Expression of more than two genes using conventional approaches has several limitations, most notably imbalanced protein expression and large size. The use of 2A peptide sequences alleviates these concerns. They are small (18-22 amino acids) and have divergent amino-terminal sequences, which minimizes the chance for homologous recombination and allows for multiple, different 2A peptide sequences to be used within a single vector. Importantly, separation of genes placed between 2A peptide sequences is nearly 100%, which allows for stoichiometric and concordant expression of the genes, regardless of the order of placement within the vector. This protocol describes the use of recombinant polymerase chain reaction (PCR) to connect multiple 2A-linked protein sequences. The final construct is subcloned into an expression vector.

  1. The effect of letter string length and report condition on letter recognition accuracy.

    PubMed

    Raghunandan, Avesh; Karmazinaite, Berta; Rossow, Andrea S

    Letter sequence recognition accuracy has been postulated to be limited primarily by low-level visual factors. The influence of high-level factors such as visual memory (load and decay) has been largely overlooked. This study provides insight into the role of these factors by investigating the interaction between letter sequence recognition accuracy, letter string length and report condition. Letter sequence recognition accuracy for trigrams and pentagrams was measured in 10 adult subjects for two report conditions. In the complete report condition, subjects reported all 3 or all 5 letters comprising trigrams and pentagrams, respectively. In the partial report condition, subjects reported only a single letter in the trigram or pentagram. Letters were presented for 100 ms and rendered in high contrast, using black lowercase Courier font that subtended 0.4° at the fixation distance of 0.57 m. Letter sequence recognition accuracy was consistently higher for trigrams compared to pentagrams, especially for letter positions away from fixation. While partial report increased recognition accuracy in both string length conditions, the effect was larger for pentagrams, and most evident for the final letter positions within trigrams and pentagrams. The effect of partial report on recognition accuracy for the final letter positions increased as eccentricity increased away from fixation, and was independent of the inner/outer position of a letter. Higher-level visual memory functions (memory load and decay) play a role in letter sequence recognition accuracy. There is also a suggestion of additional delays imposed on memory encoding by crowded letter elements. Copyright © 2016 Spanish General Council of Optometry. Published by Elsevier España, S.L.U. All rights reserved.

  2. Ab initio structure determination and refinement of a scorpion protein toxin.

    PubMed

    Smith, G D; Blessing, R H; Ealick, S E; Fontecilla-Camps, J C; Hauptman, H A; Housset, D; Langs, D A; Miller, R

    1997-09-01

    The structure of toxin II from the scorpion Androctonus australis Hector has been determined ab initio by direct methods using SnB at 0.96 Å resolution. For the purpose of this structure redetermination, undertaken as a test of the minimal function and the SnB program, the identity and sequence of the protein was withheld from part of the research team. A single solution obtained from 1619 random atom trials was clearly revealed by the bimodal distribution of the final value of the minimal function associated with each individual trial. Five peptide fragments were identified from a conservative analysis of the initial E-map, and following several refinement cycles with X-PLOR, a model was built of the complete structure. At the end of the X-PLOR refinement, the sequence was compared with the published sequence and 57 of the 64 residues had been correctly identified. Two errors in sequence resulted from side chains with similar size while the rest of the errors were a result of severe disorder or high thermal motion in the side chains. Given the amino-acid sequence, it is estimated that the initial E-map could have produced a model containing 99% of all main-chain and 81% of side-chain atoms. The structure refinement was completed with PROFFT, including the contributions of protein H atoms, and converged at a residual of 0.158 for 30609 data with F ≥ 2σ(F) in the resolution range 8.0-0.964 Å. The final model consisted of 518 non-H protein atoms (36 disordered), 407 H atoms, and 129 water molecules (43 with occupancies less than unity). This total of 647 non-H atoms represents the largest light-atom structure solved to date.

  3. Molecular genetic studies of DMT1 on 12q in French-Canadian restless legs syndrome patients and families.

    PubMed

    Xiong, Lan; Dion, Patrick; Montplaisir, Jacques; Levchenko, Anastasia; Thibodeau, Pascale; Karemera, Liliane; Rivière, Jean-Baptiste; St-Onge, Judith; Gaspar, Claudia; Dubé, Marie-Pierre; Desautels, Alex; Turecki, Gustavo; Rouleau, Guy A

    2007-10-05

    Converging evidence from clinical observations, brain imaging and pathological findings strongly indicates impaired brain iron regulation in restless legs syndrome (RLS). Animal models with mutations in the divalent metal transporter 1 (DMT1) gene, an important brain iron transporter, demonstrate an iron deficiency profile similar to that found in RLS brain. The human DMT1 gene, mapped to chromosome 12q near the RLS1 locus, qualifies as an excellent functional and possible positional candidate for RLS. DMT1 protein levels were assessed in lymphoblastoid cell lines from RLS patients and controls. Linkage analyses were carried out with markers flanking and within the DMT1 gene. Selected patient samples from RLS families with compatible linkage to the RLS1 locus on 12q were fully sequenced in both the coding regions and the long stretches of UTR sequences. Finally, selected sequence variants were further studied in case/control and family-based association tests. A clinical association of anemia and RLS was further confirmed in this study. There was no detectable difference in DMT1 protein levels between RLS patient lymphoblastoid cell lines and normal controls. Non-parametric linkage analyses failed to identify any significant linkage signals within the DMT1 gene region. Sequencing of selected patients did not detect any sequence variant(s) compatible with DMT1 harboring RLS causative mutation(s). Further studies did not find any association between ten SNPs, spanning the whole DMT1 gene region, and RLS affection status. Finally, two DMT1 intronic SNPs showed positive association with RLS in patients with a history of anemia, when compared to RLS patients without anemia. (c) 2007 Wiley-Liss, Inc.

  4. A Next Generation Sequencing custom gene panel as first line diagnostic tool for atypical cases of syndromic obesity: Application in a case of Alström syndrome.

    PubMed

    Maltese, Paolo E; Iarossi, Giancarlo; Ziccardi, Lucia; Colombo, Leonardo; Buzzonetti, Luca; Crinò, Antonino; Tezzele, Silvia; Bertelli, Matteo

    2018-02-01

    The obesity phenotype can manifest as an isolated trait or be accompanied by multisystem disorders as part of a syndromic picture. In both situations, the same molecular pathways may be involved to different degrees. This evidence is stronger in syndromic obesity, in which phenotypes of different syndromes may overlap. In these cases, genetic testing can unequivocally provide a final diagnosis. Here we describe a patient who met the diagnostic criteria for Alström syndrome only during adolescence. Genetic testing was requested at 25 years of age for a final confirmation of the diagnosis. The genetic diagnosis of Alström syndrome was obtained through a Next Generation Sequencing genetic test approach using a custom-designed gene panel of 47 genes associated with syndromic and non-syndromic obesity. Genetic analysis revealed a novel homozygous frameshift variant p.(Arg1550Lysfs*10) on exon 8 of the ALMS1 gene. This case shows the need for a revision of the diagnostic criteria guidelines, as a consequence of the recent advent of massive parallel sequencing technology. Indications for genetic testing reported in the currently accepted diagnostic criteria for Alström syndrome were drafted when sequencing was expensive and time consuming. Nowadays, Next Generation Sequencing testing could be considered as a first line diagnostic tool not only for Alström syndrome but, more generally, for all those atypical or not clearly distinguishable cases of syndromic obesity, thus avoiding delayed diagnosis and treatments. Early diagnosis permits a better follow-up and pre-symptomatic interventions. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  5. The CanOE strategy: integrating genomic and metabolic contexts across multiple prokaryote genomes to find candidate genes for orphan enzymes.

    PubMed

    Smith, Adam Alexander Thil; Belda, Eugeni; Viari, Alain; Medigue, Claudine; Vallenet, David

    2012-05-01

    Of all biochemically characterized metabolic reactions formalized by the IUBMB, over one out of four have yet to be associated with a nucleic or protein sequence, i.e. are sequence-orphan enzymatic activities. Few bioinformatics annotation tools are able to propose candidate genes for such activities by exploiting context-dependent rather than sequence-dependent data, and none are readily accessible and propose result integration across multiple genomes. Here, we present CanOE (Candidate genes for Orphan Enzymes), a four-step bioinformatics strategy that proposes ranked candidate genes for sequence-orphan enzymatic activities (or orphan enzymes for short). The first step locates "genomic metabolons", i.e. groups of co-localized genes coding proteins catalyzing reactions linked by shared metabolites, in one genome at a time. These metabolons can be particularly helpful for aiding bioanalysts to visualize relevant metabolic data. In the second step, they are used to generate candidate associations between un-annotated genes and gene-less reactions. The third step integrates these gene-reaction associations over several genomes using gene families, and summarizes the strength of family-reaction associations by several scores. In the final step, these scores are used to rank members of gene families which are proposed for metabolic reactions. These associations are of particular interest when the metabolic reaction is a sequence-orphan enzymatic activity. Our strategy found over 60,000 genomic metabolons in more than 1,000 prokaryote organisms from the MicroScope platform, generating candidate genes for many metabolic reactions, including more than 70 distinct orphan reactions. A computational validation of the approach is discussed. Finally, we present a case study on the anaerobic allantoin degradation pathway in Escherichia coli K-12.

  6. Large-Scale Biomonitoring of Remote and Threatened Ecosystems via High-Throughput Sequencing

    PubMed Central

    Gibson, Joel F.; Shokralla, Shadi; Curry, Colin; Baird, Donald J.; Monk, Wendy A.; King, Ian; Hajibabaei, Mehrdad

    2015-01-01

    Biodiversity metrics are critical for assessment and monitoring of ecosystems threatened by anthropogenic stressors. Existing sorting and identification methods are too expensive and labour-intensive to be scaled up to meet management needs. Alternately, a high-throughput DNA sequencing approach could be used to determine biodiversity metrics from bulk environmental samples collected as part of a large-scale biomonitoring program. Here we show that both morphological and DNA sequence-based analyses are suitable for recovery of individual taxonomic richness, estimation of proportional abundance, and calculation of biodiversity metrics using a set of 24 benthic samples collected in the Peace-Athabasca Delta region of Canada. The high-throughput sequencing approach was able to recover all metrics with a higher degree of taxonomic resolution than morphological analysis. The reduced cost and increased capacity of DNA sequence-based approaches will finally allow environmental monitoring programs to operate at the geographical and temporal scale required by industrial and regulatory end-users. PMID:26488407
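
    The metrics named in the abstract above (taxonomic richness, proportional abundance, and a summary biodiversity metric) can be illustrated with a minimal sketch. The abstract does not say which biodiversity index was used, so the Shannon index here is an assumption, and the taxon names and counts are invented for illustration only.

```python
import math
from collections import Counter


def biodiversity_metrics(taxa):
    """Summarise a list of taxon labels, one label per identified individual or read.

    Returns richness (number of distinct taxa), the proportional abundance of
    each taxon, and the Shannon index H = -sum(p * ln p). The choice of the
    Shannon index is an assumption; the abstract does not name a metric.
    """
    counts = Counter(taxa)
    total = sum(counts.values())
    proportions = {taxon: n / total for taxon, n in counts.items()}
    shannon = -sum(p * math.log(p) for p in proportions.values())
    return {"richness": len(counts), "proportions": proportions, "shannon": shannon}


# Hypothetical benthic sample: 10 identifications across 3 families.
sample = ["Chironomidae"] * 6 + ["Baetidae"] * 3 + ["Hydropsychidae"] * 1
metrics = biodiversity_metrics(sample)
print(metrics["richness"], round(metrics["shannon"], 3))
```

    The same function applies whether the labels come from morphological sorting or from sequence-based identification, which is what makes the two approaches directly comparable.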

  7. Cassini Imaging Science: First Results at Saturn

    NASA Astrophysics Data System (ADS)

    Porco, C. C.

    The Cassini Imaging Science experiment at Saturn will commence in early February, 2004 -- five months before Cassini's arrival at Saturn. Approach observations consist of repeated multi-spectral `movie' sequences of Saturn and its rings, image sequences designed to search for previously unseen satellites between the outer edge of the ring system and the orbit of Hyperion, images of known satellites for orbit refinement, observations of Phoebe during Cassini's closest approach to the satellite, and repeated multi-spectral `movie' sequences of Titan to detect and track clouds (for wind determination) and to sense the surface. During Saturn Orbit Insertion, the highest resolution images (~ 100 m) obtained during the whole orbital tour will be collected of the dark side of the rings. Finally, imaging sequences are planned for Cassini's first Titan flyby, on July 2, from a distance of ~ 350,000 km, yielding an image scale of ~ 2.1 km on the South polar region. The highlights of these observation sequences will be presented.

  8. Ancestry estimation and control of population stratification for sequence-based association studies.

    PubMed

    Wang, Chaolong; Zhan, Xiaowei; Bragg-Gresham, Jennifer; Kang, Hyun Min; Stambolian, Dwight; Chew, Emily Y; Branham, Kari E; Heckenlively, John; Fulton, Robert; Wilson, Richard K; Mardis, Elaine R; Lin, Xihong; Swaroop, Anand; Zöllner, Sebastian; Abecasis, Gonçalo R

    2014-04-01

    Estimating individual ancestry is important in genetic association studies where population structure leads to false positive signals, although assigning ancestry remains challenging with targeted sequence data. We propose a new method for the accurate estimation of individual genetic ancestry, based on direct analysis of off-target sequence reads, and implement our method in the publicly available LASER software. We validate the method using simulated and empirical data and show that the method can accurately infer worldwide continental ancestry when used with sequencing data sets with whole-genome shotgun coverage as low as 0.001×. For estimates of fine-scale ancestry within Europe, the method performs well with coverage of 0.1×. On an even finer scale, the method improves discrimination between exome-sequenced study participants originating from different provinces within Finland. Finally, we show that our method can be used to improve case-control matching in genetic association studies and to reduce the risk of spurious findings due to population structure.

  9. Accuracy of maxillary positioning after standard and inverted orthognathic sequencing.

    PubMed

    Ritto, Fabio G; Ritto, Thiago G; Ribeiro, Danilo Passeado; Medeiros, Paulo José; de Moraes, Márcio

    2014-05-01

    This study aimed to compare the accuracy of maxillary positioning after bimaxillary orthognathic surgery, using 2 sequences. A total of 80 cephalograms (40 preoperative and 40 postoperative) from 40 patients were analyzed. Group 1 included radiographs of patients submitted to conventional sequence, whereas group 2 patients were submitted to inverted sequence. The final position of the maxillary central incisor was obtained after vertical and horizontal measurements of the tracings, and it was compared with what had been planned. The null hypothesis, which stated that there would be no difference between the groups, was tested. After applying the Welch t test for comparison of mean differences between maxillary desired and achieved position, considering a statistical significance of 5% and a 2-tailed test, the null hypothesis was not rejected (P > .05). Thus, there was no difference in the accuracy of maxillary positioning between groups. Conventional and inverted sequencing proved to be reliable in positioning the maxilla after LeFort I osteotomy in bimaxillary orthognathic surgeries. Copyright © 2014 Elsevier Inc. All rights reserved.
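
    The Welch t test mentioned above compares two group means without assuming equal variances. A minimal sketch of the test statistic and the Welch-Satterthwaite degrees of freedom follows; the function name and the positioning-error values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t test on two samples."""
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical planned-versus-achieved differences (mm) for two sequencing groups.
conventional = [1.0, 2.0, 3.0, 4.0, 5.0]
inverted = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(conventional, inverted)
print(round(t_stat, 3), round(dof, 2))
```

    Converting t and df to a two-tailed P value requires the t distribution, which the standard library does not provide; a statistics package would normally be used for that step.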

  10. MetaCAA: A clustering-aided methodology for efficient assembly of metagenomic datasets.

    PubMed

    Reddy, Rachamalla Maheedhar; Mohammed, Monzoorul Haque; Mande, Sharmila S

    2014-01-01

    A key challenge in analyzing metagenomics data pertains to assembly of sequenced DNA fragments (i.e. reads) originating from various microbes in a given environmental sample. Several existing methodologies can assemble reads originating from a single genome. However, these methodologies cannot be applied for efficient assembly of metagenomic sequence datasets. In this study, we present MetaCAA - a clustering-aided methodology which helps in improving the quality of metagenomic sequence assembly. MetaCAA initially groups sequences constituting a given metagenome into smaller clusters. Subsequently, sequences in each cluster are independently assembled using CAP3, an existing single genome assembly program. Contigs formed in each of the clusters along with the unassembled reads are then subjected to another round of assembly for generating the final set of contigs. Validation using simulated and real-world metagenomic datasets indicates that MetaCAA aids in improving the overall quality of assembly. A software implementation of MetaCAA is available at https://metagenomics.atc.tcs.com/MetaCAA. Copyright © 2014 Elsevier Inc. All rights reserved.

  11. Single-Cell Semiconductor Sequencing

    PubMed Central

    Kohn, Andrea B.; Moroz, Tatiana P.; Barnes, Jeffrey P.; Netherton, Mandy; Moroz, Leonid L.

    2014-01-01

    RNA-seq or transcriptome analysis of individual cells and small-cell populations is essential for virtually any biomedical field. It is especially critical for developmental, aging, and cancer biology as well as neuroscience, where the enormous heterogeneity of cells presents a significant methodological and conceptual challenge. Here we present two methods that allow for fast and cost-efficient transcriptome sequencing from ultra-small amounts of tissue or even from individual cells using semiconductor sequencing technology (Ion Torrent, Life Technologies). The first method is a reduced representation sequencing which maximizes capture of RNAs and preserves transcripts’ directionality. The second, a template-switch protocol, is designed for small mammalian neurons. Both protocols, from cell/tissue isolation to final sequence data, take up to 4 days. The efficiency of these protocols has been validated with single hippocampal neurons and various invertebrate tissues including individually identified neurons within a simpler memory-forming circuit of Aplysia californica and early (1-, 2-, 4-, 8-cells) embryonic and developmental stages from basal metazoans. PMID:23929110

  12. BioPig: a Hadoop-based analytic toolkit for large-scale sequence data.

    PubMed

    Nordberg, Henrik; Bhatia, Karan; Wang, Kai; Wang, Zhong

    2013-12-01

    The recent revolution in sequencing technologies has led to an exponential growth of sequence data. As a result, most of the current bioinformatics tools become obsolete as they fail to scale with data. To tackle this 'data deluge', here we introduce the BioPig sequence analysis toolkit as one of the solutions that scale to data and computation. We built BioPig on the Apache's Hadoop MapReduce system and the Pig data flow language. Compared with traditional serial and MPI-based algorithms, BioPig has three major advantages: first, BioPig's programmability greatly reduces development time for parallel bioinformatics applications; second, testing BioPig with up to 500 Gb sequences demonstrates that it scales automatically with size of data; and finally, BioPig can be ported without modification on many Hadoop infrastructures, as tested with Magellan system at National Energy Research Scientific Computing Center and the Amazon Elastic Compute Cloud. In summary, BioPig represents a novel program framework with the potential to greatly accelerate data-intensive bioinformatics analysis.

  13. Predicting discovery rates of genomic features.

    PubMed

    Gravel, Simon

    2014-06-01

    Successful sequencing experiments require judicious sample selection. However, this selection must often be performed on the basis of limited preliminary data. Predicting the statistical properties of the final sample based on preliminary data can be challenging, because numerous uncertain model assumptions may be involved. Here, we ask whether we can predict "omics" variation across many samples by sequencing only a fraction of them. In the infinite-genome limit, we find that a pilot study sequencing 5% of a population is sufficient to predict the number of genetic variants in the entire population within 6% of the correct value, using an estimator agnostic to demography, selection, or population structure. To reach similar accuracy in a finite genome with millions of polymorphisms, the pilot study would require ∼15% of the population. We present computationally efficient jackknife and linear programming methods that exhibit substantially less bias than the state of the art when applied to simulated data and subsampled 1000 Genomes Project data. Extrapolating based on the National Heart, Lung, and Blood Institute Exome Sequencing Project data, we predict that 7.2% of sites in the capture region would be variable in a sample of 50,000 African Americans and 8.8% in a European sample of equal size. Finally, we show how the linear programming method can also predict discovery rates of various genomic features, such as the number of transcription factor binding sites across different cell types. Copyright © 2014 by the Genetics Society of America.
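
    To give a feel for jackknife-style extrapolation from a pilot sample, the sketch below uses the textbook first-order jackknife richness estimator, S = S_obs + f1 (n - 1) / n, where f1 is the number of variants seen in exactly one of the n sampled individuals. This is an assumption made for illustration and is not the estimator developed in the paper, which the abstract does not specify; the variant labels are invented.

```python
from collections import Counter


def first_order_jackknife(sample_variants):
    """Estimate the total number of variants in a population.

    sample_variants: one collection of variant identifiers per sampled individual.
    Uses the first-order jackknife: observed count plus f1 * (n - 1) / n,
    where f1 counts variants carried by exactly one sampled individual.
    """
    n = len(sample_variants)
    # Number of individuals in which each variant was observed.
    carriers = Counter(v for individual in sample_variants for v in set(individual))
    observed = len(carriers)
    singletons = sum(1 for c in carriers.values() if c == 1)
    return observed + singletons * (n - 1) / n


# Hypothetical pilot study of three individuals.
pilot = [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}]
print(first_order_jackknife(pilot))
```

    The intuition is that many singletons in the pilot signal many variants still unseen, so the correction grows with f1.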

  14. Identification of Changes along a Continuum of Speech Intonation is Impaired in Congenital Amusia.

    PubMed

    Hutchins, Sean; Gosselin, Nathalie; Peretz, Isabelle

    2010-01-01

    A small number of individuals have severe musical problems that have neuro-genetic underpinnings. This musical disorder is termed "congenital amusia," an umbrella term for lifelong musical disabilities that cannot be attributed to deafness, lack of exposure, or brain damage after birth. Amusics seem to lack the ability to detect fine pitch differences in tone sequences. However, differences between statements and questions, which vary in final pitch, are well perceived by most congenital amusic individuals. We hypothesized that the origin of this apparent domain-specificity of the disorder lies in the range of pitch variations, which are very coarse in speech as compared to music. Here, we tested this hypothesis by using a continuum of gradually increasing final pitch in both speech and tone sequences. To this aim, nine amusic cases and nine matched controls were presented with statements and questions that varied on a pitch continuum from falling to rising in 11 steps. The sentences were either naturally spoken or were tone sequence versions of these. The task was to categorize the sentences as statements or questions and the tone sequences as falling or rising. In each case, the observation of an S-shaped identification function indicates that amusics can accurately identify unambiguous examples of statements and questions but have problems with fine variations between these endpoints. Thus, the results indicate that a deficient pitch perception might compromise music, not because it is specialized for that domain but because music requirements are more fine-grained.

  15. Microbial oceanography: Killers of the winners

    NASA Astrophysics Data System (ADS)

    Kirchman, David L.

    2013-02-01

    Viruses that infect the SAR11 group of oceanic bacteria have finally been found and sequenced. Because SAR11 is ubiquitous, these viruses may be the most abundant in the oceans -- and perhaps in the entire biosphere. See Letter p.357

  16. Site selection for MSFC operational tests of solar heating and cooling systems

    NASA Technical Reports Server (NTRS)

    1978-01-01

    The criteria, methodology, and sequence aspects of the site selection process are presented. This report organized the logical thought process that should be applied to the site selection process, but final decisions are highly selective.

  17. ChromatoGate: A Tool for Detecting Base Mis-Calls in Multiple Sequence Alignments by Semi-Automatic Chromatogram Inspection

    PubMed Central

    Alachiotis, Nikolaos; Vogiatzi, Emmanouella; Pavlidis, Pavlos; Stamatakis, Alexandros

    2013-01-01

    Automated DNA sequencers generate chromatograms that contain raw sequencing data. They also generate data that translates the chromatograms into molecular sequences of A, C, G, T, or N (undetermined) characters. Since chromatogram translation programs frequently introduce errors, a manual inspection of the generated sequence data is required. As sequence numbers and lengths increase, visual inspection and manual correction of chromatograms and corresponding sequences on a per-peak and per-nucleotide basis becomes an error-prone, time-consuming, and tedious process. Here, we introduce ChromatoGate (CG), an open-source software that accelerates and partially automates the inspection of chromatograms and the detection of sequencing errors for bidirectional sequencing runs. To provide users full control over the error correction process, a fully automated error correction algorithm has not been implemented. Initially, the program scans a given multiple sequence alignment (MSA) for potential sequencing errors, assuming that each polymorphic site in the alignment may be attributed to a sequencing error with a certain probability. The guided MSA assembly procedure in ChromatoGate detects chromatogram peaks of all characters in an alignment that lead to polymorphic sites, given a user-defined threshold. The threshold value represents the sensitivity of the sequencing error detection mechanism. After this pre-filtering, the user only needs to inspect a small number of peaks in every chromatogram to correct sequencing errors. Finally, we show that correcting sequencing errors is important, because population genetic and phylogenetic inferences can be misled by MSAs with uncorrected mis-calls. Our experiments indicate that estimates of population mutation rates can be affected two- to three-fold by uncorrected errors. PMID:24688709
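
    The first step described above, scanning a multiple sequence alignment for polymorphic sites that might reflect sequencing errors, can be sketched as follows. This is a simplified illustration and not ChromatoGate's implementation: the function name is hypothetical, the probability threshold and chromatogram peak inspection are omitted, and treating `N` and gap characters as uninformative is an assumption.

```python
def polymorphic_sites(alignment, ignore="N-"):
    """Return 0-based column indices where aligned sequences disagree.

    alignment: list of equal-length strings over A, C, G, T, N (and '-').
    Characters listed in `ignore` do not count as evidence of polymorphism.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences in an alignment must have the same length")
    sites = []
    for col in range(length):
        bases = {seq[col] for seq in alignment if seq[col] not in ignore}
        if len(bases) > 1:
            sites.append(col)
    return sites


# Toy alignment: columns 2 and 4 are polymorphic; the N in column 1 is ignored.
msa = ["ACGTA", "ACGTA", "ACCTA", "ANGTT"]
print(polymorphic_sites(msa))
```

    In a workflow like the one described, each flagged column would then direct the user to the corresponding chromatogram peaks for manual inspection.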

  18. ChromatoGate: A Tool for Detecting Base Mis-Calls in Multiple Sequence Alignments by Semi-Automatic Chromatogram Inspection.

    PubMed

    Alachiotis, Nikolaos; Vogiatzi, Emmanouella; Pavlidis, Pavlos; Stamatakis, Alexandros

    2013-01-01

    Automated DNA sequencers generate chromatograms that contain raw sequencing data. They also generate data that translates the chromatograms into molecular sequences of A, C, G, T, or N (undetermined) characters. Since chromatogram translation programs frequently introduce errors, a manual inspection of the generated sequence data is required. As sequence numbers and lengths increase, visual inspection and manual correction of chromatograms and corresponding sequences on a per-peak and per-nucleotide basis becomes an error-prone, time-consuming, and tedious process. Here, we introduce ChromatoGate (CG), an open-source software that accelerates and partially automates the inspection of chromatograms and the detection of sequencing errors for bidirectional sequencing runs. To provide users full control over the error correction process, a fully automated error correction algorithm has not been implemented. Initially, the program scans a given multiple sequence alignment (MSA) for potential sequencing errors, assuming that each polymorphic site in the alignment may be attributed to a sequencing error with a certain probability. The guided MSA assembly procedure in ChromatoGate detects chromatogram peaks of all characters in an alignment that lead to polymorphic sites, given a user-defined threshold. The threshold value represents the sensitivity of the sequencing error detection mechanism. After this pre-filtering, the user only needs to inspect a small number of peaks in every chromatogram to correct sequencing errors. Finally, we show that correcting sequencing errors is important, because population genetic and phylogenetic inferences can be misled by MSAs with uncorrected mis-calls. Our experiments indicate that estimates of population mutation rates can be affected two- to three-fold by uncorrected errors.

  19. Sequence modelling and an extensible data model for genomic database

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Peter Wei-Der

    1992-01-01

    The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have the abstraction mechanism for modelling sequences and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object oriented data models into an extensible framework, which we called the "Extensible Object Model", to address the need of a modelling framework for incorporating the sequence data model with other types of data constructs and operators. This framework is based on the conceptual separation between constructors and constraints. We then used this modelling framework to integrate the constructs for the conceptual sequence model. The Extensible Object Model is also defined with a graphical representation, which is useful as a tool for database designers. Finally, we defined a query language to support this model and implemented the query processor to demonstrate the feasibility of the extensible framework and the usefulness of the conceptual sequence model.

  20. Sequence modelling and an extensible data model for genomic database

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Peter Wei-Der

    1992-01-01

    The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have the abstraction mechanism for modelling sequences and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object oriented data models into an extensible framework, which we called the "Extensible Object Model", to address the need of a modelling framework for incorporating the sequence data model with other types of data constructs and operators. This framework is based on the conceptual separation between constructors and constraints. We then used this modelling framework to integrate the constructs for the conceptual sequence model. The Extensible Object Model is also defined with a graphical representation, which is useful as a tool for database designers. Finally, we defined a query language to support this model and implemented the query processor to demonstrate the feasibility of the extensible framework and the usefulness of the conceptual sequence model.

  1. The fast changing landscape of sequencing technologies and their impact on microbial genome assemblies and annotation.

    PubMed

    Mavromatis, Konstantinos; Land, Miriam L; Brettin, Thomas S; Quest, Daniel J; Copeland, Alex; Clum, Alicia; Goodwin, Lynne; Woyke, Tanja; Lapidus, Alla; Klenk, Hans Peter; Cottingham, Robert W; Kyrpides, Nikos C

    2012-01-01

    The emergence of next generation sequencing (NGS) has provided the means for rapid and high throughput sequencing and data generation at low cost, while concomitantly creating a new set of challenges. The number of available assembled microbial genomes continues to grow rapidly and their quality reflects the quality of the sequencing technology used, but also of the analysis software employed for assembly and annotation. In this work, we have explored the quality of the microbial draft genomes across various sequencing technologies. We have compared the draft and finished assemblies of 133 microbial genomes sequenced at the Department of Energy-Joint Genome Institute and finished at the Los Alamos National Laboratory using a variety of combinations of sequencing technologies, reflecting the transition of the institute from Sanger-based sequencing platforms to NGS platforms. The quality of the public assemblies and of the associated gene annotations was evaluated using various metrics. Results obtained with the different sequencing technologies, as well as their effects on downstream processes, were analyzed. Our results demonstrate that the Illumina HiSeq 2000 sequencing system, the primary sequencing technology currently used for de novo genome sequencing and assembly at JGI, has various advantages in terms of total sequence throughput and cost, but it also introduces challenges for the downstream analyses. In all cases, assembly results, although of high quality on average, need to be viewed critically, and sources of error in them should be considered prior to analysis. These data follow the evolution of microbial sequencing and downstream processing at the JGI from draft genome sequences with large gaps corresponding to missing genes of significant biological role to assemblies with multiple small gaps (Illumina) and finally to assemblies that generate almost complete genomes (Illumina+PacBio).

  2. Acquisition of initial /s/-stop and stop-/s/ sequences in Greek.

    PubMed

    Syrika, Asimina; Nicolaidis, Katerina; Edwards, Jan; Beckman, Mary E

    2011-09-01

    Previous work on children's acquisition of complex sequences points to a tendency for affricates to be acquired before clusters, but there is no clear evidence of a difference in order of acquisition between clusters with /s/ that violate the Sonority Sequencing Principle (SSP), such as /s/ followed by stop in onset position, and other clusters that obey the SSP. One problem with studies that have compared the acquisition of SSP-obeying and SSP-violating clusters is that the component sounds in the two types of sequences were different. This paper examines the acquisition of initial /s/-stop and stop-/s/ sequences by sixty Greek children aged 2 through 5 years. Results showed greater accuracy for the /s/-stop relative to the stop-/s/ sequences, but no difference in accuracy between /ts/, which is usually analyzed as an affricate in Greek, and the other stop-/s/ sequences. Moreover, errors for the /s/-stop sequences and /ts/ primarily involved stop substitutions, whereas errors for /ps/ and /ks/ were more variable and often involved fricative substitutions, a pattern which may have a perceptual explanation. Finally, /ts/ showed a distinct temporal pattern relative to the stop-/s/ clusters /ps/ and /ks/, similar to what has been reported for productions of Greek adults.

  3. Gelada vocal sequences follow Menzerath’s linguistic law

    PubMed Central

    Gustison, Morgan L.; Semple, Stuart; Ferrer-i-Cancho, Ramon; Bergman, Thore J.

    2016-01-01

    Identifying universal principles underpinning diverse natural systems is a key goal of the life sciences. A powerful approach in addressing this goal has been to test whether patterns consistent with linguistic laws are found in nonhuman animals. Menzerath’s law is a linguistic law that states that, the larger the construct, the smaller the size of its constituents. Here, to our knowledge, we present the first evidence that Menzerath’s law holds in the vocal communication of a nonhuman species. We show that, in vocal sequences of wild male geladas (Theropithecus gelada), construct size (sequence size in number of calls) is negatively correlated with constituent size (duration of calls). Call duration does not vary significantly with position in the sequence, but call sequence composition does change with sequence size and most call types are abbreviated in larger sequences. We also find that intercall intervals follow the same relationship with sequence size as do calls. Finally, we provide formal mathematical support for the idea that Menzerath’s law reflects compression—the principle of minimizing the expected length of a code. Our findings suggest that a common principle underpins human and gelada vocal communication, highlighting the value of exploring the applicability of linguistic laws in vocal systems outside the realm of language. PMID:27091968

  4. Acquisition of initial /s/-stop and stop-/s/ sequences in Greek

    PubMed Central

    Syrika, Asimina; Nicolaidis, Katerina; Edwards, Jan; Beckman, Mary E.

    2010-01-01

    Previous work on children’s acquisition of complex sequences points to a tendency for affricates to be acquired before clusters, but there is no clear evidence of a difference in order of acquisition between clusters with /s/ that violate the Sonority Sequencing Principle (SSP), such as /s/ followed by stop in onset position, and other clusters that obey the SSP. One problem with studies that have compared the acquisition of SSP-obeying and SSP-violating clusters is that the component sounds in the two types of sequences were different. This paper examines the acquisition of initial /s/-stop and stop-/s/ sequences by sixty Greek children aged 2 through 5 years. Results showed greater accuracy for the /s/-stop relative to the stop-/s/ sequences, but no difference in accuracy between /ts/, which is usually analyzed as an affricate in Greek, and the other stop-/s/ sequences. Moreover, errors for the /s/-stop sequences and /ts/ primarily involved stop substitutions, whereas errors for /ps/ and /ks/ were more variable and often involved fricative substitutions, a pattern which may have a perceptual explanation. Finally, /ts/ showed a distinct temporal pattern relative to the stop-/s/ clusters /ps/ and /ks/, similarly to what has been reported for productions of Greek adults. PMID:22070044

  5. A weighted U-statistic for genetic association analyses of sequencing data.

    PubMed

    Wei, Changshuai; Li, Ming; He, Zihuai; Vsevolozhskaya, Olga; Schaid, Daniel J; Lu, Qing

    2014-12-01

    With advancements in next-generation sequencing technology, a massive amount of sequencing data is generated, which offers a great opportunity to comprehensively investigate the role of rare variants in the genetic etiology of complex diseases. Nevertheless, the high-dimensional sequencing data poses a great challenge for statistical analysis. The association analyses based on traditional statistical methods suffer substantial power loss because of the low frequency of genetic variants and the extremely high dimensionality of the data. We developed a Weighted U Sequencing test, referred to as WU-SEQ, for the high-dimensional association analysis of sequencing data. Based on a nonparametric U-statistic, WU-SEQ makes no assumption of the underlying disease model and phenotype distribution, and can be applied to a variety of phenotypes. Through simulation studies and an empirical study, we showed that WU-SEQ outperformed a commonly used sequence kernel association test (SKAT) method when the underlying assumptions were violated (e.g., the phenotype followed a heavy-tailed distribution). Even when the assumptions were satisfied, WU-SEQ still attained comparable performance to SKAT. Finally, we applied WU-SEQ to sequencing data from the Dallas Heart Study (DHS), and detected an association between ANGPTL 4 and very low density lipoprotein cholesterol. © 2014 WILEY PERIODICALS, INC.

  6. Metamorphic Proteins: Emergence of Dual Protein Folds from One Primary Sequence.

    PubMed

    Lella, Muralikrishna; Mahalakshmi, Radhakrishnan

    2017-06-20

    Every amino acid exhibits a different propensity for distinct structural conformations. Hence, decoding how the primary amino acid sequence undergoes the transition to a defined secondary structure and its final three-dimensional fold is presently considered predictable with reasonable certainty. However, protein sequences that defy the first principles of secondary structure prediction (they attain two different folds) have recently been discovered. Such proteins, aptly named metamorphic proteins, decrease the conformational constraint by increasing flexibility in the secondary structure and thereby result in efficient functionality. In this review, we discuss the major factors driving the conformational switch related both to protein sequence and to structure using illustrative examples. We discuss the concept of an evolutionary transition in sequence and structure, the functional impact of the tertiary fold, and the pressure of intrinsic and external factors that give rise to metamorphic proteins. We mainly focus on the major components of protein architecture, namely, the α-helix and β-sheet segments, which are involved in conformational switching within the same or highly similar sequences. These chameleonic sequences are widespread in both cytosolic and membrane proteins, and these folds are equally important for protein structure and function. We discuss the implications of metamorphic proteins and chameleonic peptide sequences in de novo peptide design.

  7. SMARTIV: combined sequence and structure de-novo motif discovery for in-vivo RNA binding data.

    PubMed

    Polishchuk, Maya; Paz, Inbal; Yakhini, Zohar; Mandel-Gutfreund, Yael

    2018-05-25

    Gene expression regulation is highly dependent on binding of RNA-binding proteins (RBPs) to their RNA targets. Growing evidence supports the notion that both RNA primary sequence and its local secondary structure play a role in specific Protein-RNA recognition and binding. Despite the great advance in high-throughput experimental methods for identifying sequence targets of RBPs, predicting the specific sequence and structure binding preferences of RBPs remains a major challenge. We present a novel webserver, SMARTIV, designed for discovering and visualizing combined RNA sequence and structure motifs from high-throughput RNA-binding data, generated from in-vivo experiments. The uniqueness of SMARTIV is that it predicts motifs from enriched k-mers that combine information from ranked RNA sequences and their predicted secondary structure, obtained using various folding methods. Consequently, SMARTIV generates Position Weight Matrices (PWMs) in a combined sequence and structure alphabet with assigned P-values. SMARTIV concisely represents the sequence and structure motif content as a single graphical logo, which is informative and easy for visual perception. SMARTIV was examined extensively on a variety of high-throughput binding experiments for RBPs from different families, generated from different technologies, showing consistent and accurate results. Finally, SMARTIV is a user-friendly webserver, highly efficient in run-time and freely accessible via http://smartiv.technion.ac.il/.

  8. Using single cell sequencing data to model the evolutionary history of a tumor.

    PubMed

    Kim, Kyung In; Simon, Richard

    2014-01-24

    The introduction of next-generation sequencing (NGS) technology has made it possible to detect genomic alterations within tumor cells on a large scale. However, most applications of NGS show the genetic content of mixtures of cells. Recently developed single cell sequencing technology can identify variation within a single cell. Characterization of multiple samples from a tumor using single cell sequencing can potentially provide information on the evolutionary history of that tumor. This may facilitate understanding how key mutations accumulate and evolve in lineages to form a heterogeneous tumor. We provide a computational method to infer an evolutionary mutation tree based on single cell sequencing data. Our approach differs from traditional phylogenetic tree approaches in that our mutation tree directly describes temporal order relationships among mutation sites. Our method also accommodates sequencing errors. Furthermore, we provide a method for estimating the proportion of time from the earliest mutation event of the sample to the most recent common ancestor of the sample of cells. Finally, we discuss current limitations on modeling with single cell sequencing data and possible improvements under those limitations. Inferring the temporal ordering of mutational sites using current single cell sequencing data is a challenge. Our proposed method may help elucidate relationships among key mutations and their role in tumor progression.

  9. An investigation of developmental changes in interpretation and construction of graphic AAC symbol sequences through systematic combination of input and output modalities.

    PubMed

    Trudeau, Natacha; Sutton, Ann; Morford, Jill P

    2014-09-01

    While research on spoken language has a long tradition of studying and contrasting language production and comprehension, the study of graphic symbol communication has focused more on production than comprehension. As a result, the relationships between the ability to construct and to interpret graphic symbol sequences are not well understood. This study explored the use of graphic symbol sequences in children without disabilities aged 3;0 to 6;11 (years; months) (n=111). Children took part in nine tasks that systematically varied input and output modalities (speech, action, and graphic symbols). Results show that in 3- and 4-year-olds, attributing meaning to a sequence of symbols was particularly difficult even when the children knew the meaning of each symbol in the sequence. Similarly, while even 3- and 4-year-olds could produce a graphic symbol sequence following a model, transposing a spoken sentence into a graphic sequence was more difficult for them. Representing an action with graphic symbols was difficult even for 5-year-olds. Finally, the ability to comprehend graphic-symbol sequences preceded the ability to produce them. These developmental patterns, as well as memory-related variables, should be taken into account in choosing intervention strategies with young children who use AAC.

  10. Helix Unwinding and Base Flipping Enable Human MTERF1 to Terminate Mitochondrial Transcription

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yakubovskaya, E.; Mejia, E; Byrnes, J

    2010-01-01

    Defects in mitochondrial gene expression are associated with aging and disease. Mterf proteins have been implicated in modulating transcription, replication and protein synthesis. We have solved the structure of a member of this family, the human mitochondrial transcriptional terminator MTERF1, bound to dsDNA containing the termination sequence. The structure indicates that upon sequence recognition MTERF1 unwinds the DNA molecule, promoting eversion of three nucleotides. Base flipping is critical for stable binding and transcriptional termination. Additional structural and biochemical results provide insight into the DNA binding mechanism and explain how MTERF1 recognizes its target sequence. Finally, we have demonstrated that the mitochondrial pathogenic G3249A and G3244A mutations interfere with key interactions for sequence recognition, eliminating termination. Our results provide insight into the role of mterf proteins and suggest a link between mitochondrial disease and the regulation of mitochondrial transcription.

  11. Hypothesis testing in students: Sequences, stages, and instructional strategies

    NASA Astrophysics Data System (ADS)

    Moshman, David; Thompson, Pat A.

    Six sequences in the development of hypothesis-testing conceptions are proposed, involving (a) interpretation of the hypothesis; (b) the distinction between using theories and testing theories; (c) the consideration of multiple possibilities; (d) the relation of theory and data; (e) the nature of verification and falsification; and (f) the relation of truth and falsity. An alternative account is then provided involving three global stages: concrete operations, formal operations, and a postformal metaconstructive stage. Relative advantages and difficulties of the stage and sequence conceptualizations are discussed. Finally, three families of teaching strategy are distinguished, which emphasize, respectively: (a) social transmission of knowledge; (b) carefully sequenced empirical experience by the student; and (c) self-regulated cognitive activity of the student. It is argued on the basis of Piaget's theory that the last of these plays a crucial role in the construction of such logical reasoning strategies as those involved in testing hypotheses.

  12. Marine Fungi: Their Ecology and Molecular Diversity

    NASA Astrophysics Data System (ADS)

    Richards, Thomas A.; Jones, Meredith D. M.; Leonard, Guy; Bass, David

    2012-01-01

    Fungi appear to be rare in marine environments. There are relatively few marine isolates in culture, and fungal small subunit ribosomal DNA (SSU rDNA) sequences are rarely recovered in marine clone library experiments (i.e., culture-independent sequence surveys of eukaryotic microbial diversity from environmental DNA samples). To explore the diversity of marine fungi, we took a broad selection of SSU rDNA data sets and calculated a summary phylogeny. Bringing these data together identified a diverse collection of marine fungi, including sequences branching close to chytrids (flagellated fungi), filamentous hypha-forming fungi, and multicellular fungi. However, the majority of the sequences branched with ascomycete and basidiomycete yeasts. We discuss evidence for 36 novel marine lineages, the majority and most divergent of which branch with the chytrids. We then investigate what these data mean for the evolutionary history of the Fungi and specifically marine-terrestrial transitions. Finally, we discuss the roles of fungi in marine ecosystems.

  13. Wavelet Fusion for Concealed Object Detection Using Passive Millimeter Wave Sequence Images

    NASA Astrophysics Data System (ADS)

    Chen, Y.; Pang, L.; Liu, H.; Xu, X.

    2018-04-01

    A passive millimeter wave (PMMW) imaging system can create interpretable imagery of objects concealed under clothing, which is a great advantage for security check systems. This paper addresses wavelet fusion for detecting concealed objects in PMMW sequence images. First, according to the image characteristics and storage methods of the PMMW real-time imager, the sum of squared differences (SSD) is used as the image-relatedness parameter to screen the sequence images. Secondly, the selected images are optimized using a wavelet fusion algorithm. Finally, the concealed objects are detected by mean filtering, threshold segmentation and edge detection. The experimental results show that this method improves the detection of concealed objects by selecting the most relevant images from PMMW sequence images and using wavelet fusion to enhance the information on the concealed objects. The method can be effectively applied to concealed object detection on the human body in millimeter wave video.
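
    The SSD screening step can be illustrated with a minimal sketch. The abstract does not say what each frame is compared against or how many frames are kept, so the reference frame, the `keep` parameter, and the function names `ssd` and `screen_frames` are assumptions made only for illustration; this is not the authors' implementation.

```python
from typing import List, Sequence

# A grayscale frame as rows of pixel intensities (assumed representation).
Image = Sequence[Sequence[float]]


def ssd(a: Image, b: Image) -> float:
    """Sum of squared differences between two equally sized frames."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def screen_frames(frames: List[Image], reference: Image, keep: int) -> List[int]:
    """Return indices (in original order) of the `keep` frames closest to `reference`.

    Assumption: a lower SSD to the reference frame means a more relevant frame.
    """
    ranked = sorted(range(len(frames)), key=lambda i: ssd(frames[i], reference))
    return sorted(ranked[:keep])


# Toy data: three 2x2 frames compared against an all-zero reference.
reference = [[0, 0], [0, 0]]
frames = [
    [[1, 1], [1, 1]],
    [[0, 0], [0, 1]],
    [[5, 5], [5, 5]],
]
selected = screen_frames(frames, reference, keep=2)
```

    The selected frames would then be passed to the fusion and detection stages, which are not sketched here.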

  14. Recovering complete and draft population genomes from metagenome datasets

    DOE PAGES

    Sangwan, Naseer; Xia, Fangfang; Gilbert, Jack A.

    2016-03-08

    Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost on application to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improves the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins i.e., sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on the genome-wide evolution.

  15. Conservative secondary structure motifs already present in early-stage folding (in silico) as found in serpines family.

    PubMed

    Brylinski, Michal; Konieczny, Leszek; Kononowicz, Andrzej; Roterman, Irena

    2008-03-21

    The well-known procedure implemented in ClustalW, oriented toward sequence comparison, was applied to structure comparison. The consensus sequence as well as the consensus structure has been defined for proteins belonging to the serpine family. The structure of the early-stage intermediate was the object of the similarity search. The high values of W(sequence) appeared to be in accordance with high values of W(structure), making structure comparison possible using common criteria for sequence and structure comparison. Since the early-stage structural form has been created according to a limited conformational sub-space which does not include the beta-structure (this structure is mediated by the C7eq structural form), it is particularly important to see that the C7eq structural form may be treated as the seed for the beta-structure present in the final native structure of the protein. The applicability of the ClustalW procedure to structure comparison unifies these two comparisons.

  16. Preference for locus of punishment in a response sequence

    PubMed Central

    Dardano, J. F.

    1972-01-01

    Food-deprived pigeons pecked a key under a schedule in which grain was made available after the seventieth peck. In each sequence of 70 responses, either the first, middle, or final response was followed by electric shock. Before the first response of each sequence, each response on a second key changed the color of the food key and the schedule of shock that was correlated with the food key color. Each pigeon preferred a schedule of shock, in that each of the three shock schedules did not occur equally often. The preferred shock schedule and the strength of the preference varied among the pigeons. The overall rate of responding by a pigeon under a given shock schedule was directly related to the pigeon's relative preference for that schedule, except when shock after the first response in the sequence was the most preferred schedule. PMID:16811588

  17. Recovering complete and draft population genomes from metagenome datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sangwan, Naseer; Xia, Fangfang; Gilbert, Jack A.

    Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost on application to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improves the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins i.e., sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on the genome-wide evolution.

  18. Evaluating the accuracy of SHAPE-directed RNA secondary structure predictions

    PubMed Central

    Sükösd, Zsuzsanna; Swenson, M. Shel; Kjems, Jørgen; Heitsch, Christine E.

    2013-01-01

    Recent advances in RNA structure determination include using data from high-throughput probing experiments to improve thermodynamic prediction accuracy. We evaluate the extent and nature of improvements in data-directed predictions for a diverse set of 16S/18S ribosomal sequences using a stochastic model of experimental SHAPE data. The average accuracy for 1000 data-directed predictions always improves over the original minimum free energy (MFE) structure. However, the amount of improvement varies with the sequence, exhibiting a correlation with MFE accuracy. Further analysis of this correlation shows that accurate MFE base pairs are typically preserved in a data-directed prediction, whereas inaccurate ones are not. Thus, the positive predictive value of common base pairs is consistently higher than the directed prediction accuracy. Finally, we confirm sequence dependencies in the directability of thermodynamic predictions and investigate the potential for greater accuracy improvements in the worst performing test sequence. PMID:23325843
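
    The accuracy measure mentioned above, the positive predictive value (PPV) of a set of predicted base pairs, can be sketched as follows. The base-pair sets are toy data invented for illustration and the function name `ppv` is an assumption; nothing here reproduces the study's data or software.

```python
from typing import Set, Tuple

# A base pair as (i, j) sequence positions with i < j (assumed representation).
BasePair = Tuple[int, int]


def ppv(predicted: Set[BasePair], native: Set[BasePair]) -> float:
    """Fraction of predicted base pairs that also occur in the native structure."""
    if not predicted:
        return 0.0
    return len(predicted & native) / len(predicted)


# Toy structures (hypothetical): native, MFE prediction, data-directed prediction.
native = {(1, 20), (2, 19), (3, 18), (5, 12)}
mfe = {(1, 20), (2, 19), (4, 17), (6, 11)}
directed = {(1, 20), (2, 19), (3, 18), (7, 10)}

# Base pairs shared by the MFE and the data-directed prediction.
common = mfe & directed

ppv_directed = ppv(directed, native)
ppv_common = ppv(common, native)
```

    In this toy case the common base pairs have a higher PPV than the directed prediction as a whole, which is the kind of relationship the abstract reports; the numbers themselves carry no empirical meaning.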

  19. Determination of the promoter region of mouse ribosomal RNA gene by an in vitro transcription system.

    PubMed Central

    Yamamoto, O; Takakusa, N; Mishima, Y; Kominami, R; Muramatsu, M

    1984-01-01

    Sequences required for a faithful and efficient transcription of a cloned mouse ribosomal RNA gene (rDNA) are determined by testing a series of deletion mutants in an in vitro transcription system utilizing two kinds of mouse cellular extract. Deletion of sequences upstream of -40 or downstream of +52 causes only slight reduction in promoter activity as compared with the "wild-type" template. For upstream deletion mutants, the removal of a sequence between -40 and -35 causes a significant decrease in the capacity to direct efficient initiation. This decrease becomes more pronounced when the deletion reaches -32 and the sequence A-T-C-T-T-T, conserved among mouse, rat, and human rDNAs, is lost. Residual template activity is further reduced as more upstream sequence is deleted and finally becomes undetectable when the deletion is extended from -22 down to -17, corresponding to the loss of the conserved sequence T-A-T-T-G. As for downstream deletion mutants, the removal of the sequence downstream of +23 causes some (and further deletions up to +11 cause a more) serious decrease in template activity in vitro. These deletions involve other conserved sequences downstream of the transcription start site. However, the removal of the original transcription start site does not abolish the transcription initiation completely, provided that the whole upstream sequence is intact. PMID:6320178

  20. The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing.

    PubMed

    Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P; Panitz, Frank; Bendixen, Christian; Nielsen, Rasmus; Willerslev, Eske

    2007-02-14

    The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources. We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequence 20 DNA Sequencing System (GS20, Roche/454 Life Sciences). Each DNA sequence is subsequently traced back to its individual source through 5' tag analysis. We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (mis-assignment rate < 0.4%). Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in a single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regards to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. However, our experiments demonstrate that 5' primer tagging is a useful method in which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products. The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial analyses, population genetics, and phylogenetics.
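
    Tracing each read back to its source by its 5' tag can be sketched as a simple demultiplexing step. The dinucleotide tags, the primer sequence, the specimen names, and the function name `demultiplex` below are hypothetical and chosen only to show the idea; the abstract does not give the actual tags or the software used.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

TAG_LENGTH = 2  # dinucleotide tags, as described in the abstract


def demultiplex(
    reads: List[str], tag_to_sample: Dict[str, str], primer: str
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Assign reads to samples by their 5' tag; return (assigned, unassigned).

    A read is assigned only if a known tag is followed directly by the primer.
    The tag and primer are trimmed from assigned reads.
    """
    assigned: Dict[str, List[str]] = defaultdict(list)
    unassigned: List[str] = []
    for read in reads:
        tag, rest = read[:TAG_LENGTH], read[TAG_LENGTH:]
        sample = tag_to_sample.get(tag)
        if sample is not None and rest.startswith(primer):
            assigned[sample].append(rest[len(primer):])
        else:
            unassigned.append(read)
    return dict(assigned), unassigned


# Hypothetical tags, primer and reads for illustration only.
tags = {"CA": "specimen_1", "TG": "specimen_2"}
primer = "ACGT"
reads = ["CAACGTGGGA", "TGACGTTTTC", "GGACGTAAAA", "CATTTTAAAA"]
assigned, unassigned = demultiplex(reads, tags, primer)
```

    Counting the reads per tag in `assigned` is also the simplest way to look for the kind of tag-dependent representation bias the abstract describes.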

  1. Phylogenetic characterization of a biogas plant microbial community integrating clone library 16S-rDNA sequences and metagenome sequence data obtained by 454-pyrosequencing.

    PubMed

    Kröber, Magdalena; Bekel, Thomas; Diaz, Naryttza N; Goesmann, Alexander; Jaenicke, Sebastian; Krause, Lutz; Miller, Dimitri; Runte, Kai J; Viehöver, Prisca; Pühler, Alfred; Schlüter, Andreas

    2009-06-01

    The phylogenetic structure of the microbial community residing in a fermentation sample from a production-scale biogas plant fed with maize silage, green rye and liquid manure was analysed by an integrated approach using clone library sequences and metagenome sequence data obtained by 454-pyrosequencing. Sequencing of 109 clones from a bacterial and an archaeal 16S-rDNA amplicon library revealed that the obtained nucleotide sequences are similar but not identical to 16S-rDNA database sequences derived from different anaerobic environments including digestors and bioreactors. Most of the bacterial 16S-rDNA sequences could be assigned to the phylum Firmicutes with the most abundant class Clostridia and to the class Bacteroidetes, whereas most archaeal 16S-rDNA sequences cluster close to the methanogen Methanoculleus bourgensis. Further sequences of the archaeal library most probably represent so far non-characterised species within the genus Methanoculleus. A similar result derived from phylogenetic analysis of mcrA clone sequences. The mcrA gene product encodes the alpha-subunit of methyl-coenzyme-M reductase involved in the final step of methanogenesis. BLASTn analysis applying stringent settings resulted in assignment of 16S-rDNA metagenome sequence reads to 62 16S-rDNA amplicon sequences thus enabling frequency of abundance estimations for 16S-rDNA clone library sequences. Ribosomal Database Project (RDP) Classifier processing of metagenome 16S-rDNA reads revealed abundance of the phyla Firmicutes, Bacteroidetes and Euryarchaeota and the orders Clostridiales, Bacteroidales and Methanomicrobiales. Moreover, a large fraction of 16S-rDNA metagenome reads could not be assigned to lower taxonomic ranks, demonstrating that numerous microorganisms in the analysed fermentation sample of the biogas plant are still unclassified or unknown.

  2. Classification of G-protein coupled receptors based on a rich generation of convolutional neural network, N-gram transformation and multiple sequence alignments.

    PubMed

    Li, Man; Ling, Cheng; Xu, Qi; Gao, Jingyang

    2018-02-01

    Sequence classification is crucial in predicting the function of newly discovered sequences. In recent years, the prediction of the incremental large-scale and diversity of sequences has heavily relied on the involvement of machine-learning algorithms. To improve prediction accuracy, these algorithms must confront the key challenge of extracting valuable features. In this work, we propose a feature-enhanced protein classification approach, considering the rich generation of multiple sequence alignment algorithms, N-gram probabilistic language model and the deep learning technique. The essence behind the proposed method is that if each group of sequences can be represented by one feature sequence, composed of homologous sites, there should be less loss when the sequence is rebuilt, when a more relevant sequence is added to the group. On the basis of this consideration, the prediction becomes whether a query sequence belonging to a group of sequences can be transferred to calculate the probability that the new feature sequence evolves from the original one. The proposed work focuses on the hierarchical classification of G-protein Coupled Receptors (GPCRs), which begins by extracting the feature sequences from the multiple sequence alignment results of the GPCRs sub-subfamilies. The N-gram model is then applied to construct the input vectors. Finally, these vectors are imported into a convolutional neural network to make a prediction. The experimental results elucidate that the proposed method provides significant performance improvements. The classification error rate of the proposed method is reduced by at least 4.67% (family level I) and 5.75% (family Level II), in comparison with the current state-of-the-art methods. The implementation program of the proposed work is freely available at: https://github.com/alanFchina/CNN .
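
    The step of turning a sequence into an N-gram input vector can be illustrated with a minimal sketch. It builds a plain relative-frequency vector over all possible N-grams, which is a simplification and not the probabilistic language model of the paper; the 20-letter amino acid alphabet and the function name `ngram_vector` are assumptions for illustration.

```python
from collections import Counter
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # standard 20-letter alphabet (assumed)


def ngram_vector(sequence: str, n: int = 2, alphabet: str = AMINO_ACIDS) -> List[float]:
    """Relative frequency of every possible n-gram over `alphabet` in `sequence`.

    N-grams containing characters outside the alphabet are ignored.
    """
    counts = Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))
    vocabulary = ["".join(p) for p in product(alphabet, repeat=n)]
    total = sum(counts[gram] for gram in vocabulary)
    return [counts[gram] / total if total else 0.0 for gram in vocabulary]


# Small two-letter alphabet so the vector is easy to inspect:
# vocabulary order is AA, AC, CA, CC.
toy = ngram_vector("ACAC", n=2, alphabet="AC")
```

    With the full alphabet and n = 2 the vector has 400 entries; vectors of this kind could serve as fixed-length input to a classifier such as a convolutional neural network.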

  3. Genome sequence determination and metagenomic characterization of a Dehalococcoides mixed culture grown on cis-1,2-dichloroethene.

    PubMed

    Yohda, Masafumi; Yagi, Osami; Takechi, Ayane; Kitajima, Mizuki; Matsuda, Hisashi; Miyamura, Naoaki; Aizawa, Tomoko; Nakajima, Mutsuyasu; Sunairi, Michio; Daiba, Akito; Miyajima, Takashi; Teruya, Morimi; Teruya, Kuniko; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Juan, Ayaka; Nakano, Kazuma; Aoyama, Misako; Terabayashi, Yasunobu; Satou, Kazuhito; Hirano, Takashi

    2015-07-01

    A Dehalococcoides-containing bacterial consortium that performed dechlorination of 0.20 mM cis-1,2-dichloroethene to ethene in 14 days was obtained from the sediment mud of the lotus field. To obtain detailed information of the consortium, the metagenome was analyzed using the short-read next-generation sequencer SOLiD 3. Matching the obtained sequence tags with the reference genome sequences indicated that the Dehalococcoides sp. in the consortium was highly homologous to Dehalococcoides mccartyi CBDB1 and BAV1. Sequence comparison with the reference sequence constructed from 16S rRNA gene sequences in a public database showed the presence of Sedimentibacter, Sulfurospirillum, Clostridium, Desulfovibrio, Parabacteroides, Alistipes, Eubacterium, Peptostreptococcus and Proteocatella in addition to Dehalococcoides sp. After further enrichment, the members of the consortium were narrowed down to almost three species. Finally, the full-length circular genome sequence of the Dehalococcoides sp. in the consortium, D. mccartyi IBARAKI, was determined by analyzing the metagenome with the single-molecule DNA sequencer PacBio RS. The accuracy of the sequence was confirmed by matching it to the tag sequences obtained by SOLiD 3. The genome is 1,451,062 nt and the number of CDS is 1566, which includes 3 rRNA genes and 47 tRNA genes. There exist twenty-eight RDase genes that are accompanied by the genes for anchor proteins. The genome exhibits significant sequence identity with other Dehalococcoides spp. throughout the genome, but there exists significant difference in the distribution of RDase genes. The combination of a short-read next-generation DNA sequencer and a long-read single-molecule DNA sequencer gives detailed information of a bacterial consortium. Copyright © 2014 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  4. CDSbank: taxonomy-aware extraction, selection, renaming and formatting of protein-coding DNA or amino acid sequences.

    PubMed

    Hazes, Bart

    2014-02-28

    Protein-coding DNA sequences and their corresponding amino acid sequences are routinely used to study relationships between sequence, structure, function, and evolution. The rapidly growing size of sequence databases increases the power of such comparative analyses but it makes it more challenging to prepare high quality sequence data sets with control over redundancy, quality, completeness, formatting, and labeling. Software tools for some individual steps in this process exist but manual intervention remains a common and time consuming necessity. CDSbank is a database that stores both the protein-coding DNA sequence (CDS) and amino acid sequence for each protein annotated in Genbank. CDSbank also stores Genbank feature annotation, a flag to indicate incomplete 5' and 3' ends, full taxonomic data, and a heuristic to rank the scientific interest of each species. This rich information allows fully automated data set preparation with a level of sophistication that aims to meet or exceed manual processing. Defaults ensure ease of use for typical scenarios while allowing great flexibility when needed. Access is via a free web server at http://hazeslab.med.ualberta.ca/CDSbank/. CDSbank presents a user-friendly web server to download, filter, format, and name large sequence data sets. Common usage scenarios can be accessed via pre-programmed default choices, while optional sections give full control over the processing pipeline. Particular strengths are: extract protein-coding DNA sequences just as easily as amino acid sequences, full access to taxonomy for labeling and filtering, awareness of incomplete sequences, and the ability to take one protein sequence and extract all synonymous CDS or identical protein sequences in other species. Finally, CDSbank can also create labeled property files to, for instance, annotate or re-label phylogenetic trees.

  5. Overcoming Sequence Misalignments with Weighted Structural Superposition

    PubMed Central

    Khazanov, Nickolay A.; Damm-Ganamet, Kelly L.; Quang, Daniel X.; Carlson, Heather A.

    2012-01-01

    An appropriate structural superposition identifies similarities and differences between homologous proteins that are not evident from sequence alignments alone. We have coupled our Gaussian-weighted RMSD (wRMSD) tool with a sequence aligner and seed extension (SE) algorithm to create a robust technique for overlaying structures and aligning sequences of homologous proteins (HwRMSD). HwRMSD overcomes errors in the initial sequence alignment that would normally propagate into a standard RMSD overlay. SE can generate a corrected sequence alignment from the improved structural superposition obtained by wRMSD. HwRMSD’s robust performance and its superiority over standard RMSD are demonstrated over a range of homologous proteins. Its better overlay results in corrected sequence alignments with good agreement to HOMSTRAD. Finally, HwRMSD is compared to established structural alignment methods: FATCAT, SSM, CE, and Dalilite. Most methods are comparable at placing residue pairs within 2 Å, but HwRMSD places many more residue pairs within 1 Å, providing a clear advantage. Such high accuracy is essential in drug design, where small distances can have a large impact on computational predictions. This level of accuracy is also needed to correct sequence alignments in an automated fashion, especially for omics-scale analysis. HwRMSD can align homologs with low sequence identity and large conformational differences, cases where both sequence-based and structural-based methods may fail. The HwRMSD pipeline overcomes the dependency of structural overlays on initial sequence pairing and removes the need to determine the best sequence-alignment method, substitution matrix, and gap parameters for each unique pair of homologs. PMID:22733542
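
    The idea of down-weighting poorly matched residue pairs can be illustrated with a minimal sketch. It assumes the two structures are already paired and overlaid, and it only computes Gaussian weights and a weighted RMSD; it does not perform the superposition, the sequence alignment, or the seed extension described in the abstract. The function name and the scale constant c are hypothetical choices for illustration, not values taken from HwRMSD.

```python
import math


def gaussian_weighted_rmsd(coords_a, coords_b, c=5.0):
    """Return (weighted RMSD, plain RMSD, weights) for paired 3D points.

    coords_a, coords_b: equal-length lists of (x, y, z) tuples that are
    assumed to be already paired and overlaid.
    c: hypothetical Gaussian scale in squared distance units (assumption).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")

    # Squared distance for each paired position.
    sq_dists = [
        sum((p - q) ** 2 for p, q in zip(a, b))
        for a, b in zip(coords_a, coords_b)
    ]

    # Pairs that are far apart receive weights close to zero.
    weights = [math.exp(-d2 / c) for d2 in sq_dists]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all weights underflowed to zero; structures too far apart")

    wrmsd = math.sqrt(sum(w * d2 for w, d2 in zip(weights, sq_dists)) / total)
    rmsd = math.sqrt(sum(sq_dists) / len(sq_dists))
    return wrmsd, rmsd, weights
```

    With one well-matched pair and one pair 3 units apart, the weighted value is dominated by the well-matched pair, whereas the plain RMSD is pulled up by the outlier.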

  6. Compressing DNA sequence databases with coil.

    PubMed

    White, W Timothy J; Hendy, Michael D

    2008-05-20

    Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression - an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression - the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.
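
    The general-purpose baseline mentioned in the abstract can be measured with a short sketch. It uses Python's standard zlib module (DEFLATE, the same Lempel-Ziv family as gzip) on a synthetic, uniformly random DNA string; it does not implement coil or edit-tree coding, and real EST data are not uniformly random, so the numbers are only indicative. The helper names are hypothetical.

```python
import random
import zlib


def compression_ratio(data: bytes, level: int = 9) -> float:
    """Uncompressed size divided by DEFLATE-compressed size."""
    return len(data) / len(zlib.compress(data, level))


def random_dna(n: int, seed: int = 0) -> bytes:
    """Synthetic stand-in for sequence data: n uniformly random bases."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(n)).encode("ascii")


if __name__ == "__main__":
    ratio = compression_ratio(random_dna(100_000))
    # Each base carries 2 bits of information but occupies 8 bits as text,
    # so no lossless scheme can do much better than 4:1 on this input.
    print(f"DEFLATE ratio on random DNA: {ratio:.2f}")
```

    On uniformly random bases the ratio cannot meaningfully exceed 4:1, which is what simple 2-bit packing would already give; gains beyond that have to come from exploiting similarity between sequences, which is the direction the abstract describes.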

  7. Compressing DNA sequence databases with coil

    PubMed Central

    White, W Timothy J; Hendy, Michael D

    2008-01-01

    Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work. PMID:18489794

  8. Exploring the sequence-structure protein landscape in the glycosyltransferase family

    PubMed Central

    Zhang, Ziding; Kochhar, Sunil; Grigorov, Martin

    2003-01-01

    To understand the molecular basis of glycosyltransferases’ (GTFs) catalytic mechanism, extensive structural information is required. Here, fold recognition methods were employed to assign 3D protein shapes (folds) to the currently known GTF sequences, available in public databases such as GenBank and Swissprot. First, GTF sequences were retrieved and classified into clusters, based on sequence similarity only. Intracluster sequence similarity was chosen sufficiently high to ensure that the same fold is found within a given cluster. Then, a representative sequence from each cluster was selected to compose a subset of GTF sequences. The members of this reduced set were processed by three different fold recognition methods: 3D-PSSM, FUGUE, and GeneFold. Finally, the results from different fold recognition methods were analyzed and compared to sequence-similarity search methods (i.e., BLAST and PSI-BLAST). It was established that the folds of about 70% of all currently known GTF sequences can be confidently assigned by fold recognition methods, a value which is higher than the fold identification rate based on sequence comparison alone (48% for BLAST and 64% for PSI-BLAST). The identified folds were submitted to 3D clustering, and we found that most of the GTF sequences adopt the typical GTF A or GTF B folds. Our results indicate a lack of evidence that new GTF folds (i.e., folds other than GTF A and B) exist. Based on cases where fold identification was not possible, we suggest several sequences as the most promising targets for a structural genomics initiative focused on the GTF protein family. PMID:14500887

  9. Encoding and choice in the task span paradigm.

    PubMed

    Reiman, Kaitlin M; Weaver, Starla M; Arrington, Catherine M

    2015-03-01

    Cognitive control during sequences of planned behaviors requires both plan-level processes such as generating, maintaining, and monitoring the plan, as well as task-level processes such as selecting, establishing and implementing specific task sets. The task span paradigm (Logan in J Exp Psychol Gen 133:218-236, 2004) combines two common cognitive control paradigms, task switching and working memory span, to investigate the integration of plan-level and task-level processes during control of sequential behavior. The current study expands past task span research to include measures of encoding processes and choice behavior with volitional sequence generation, using the standard task span as well as a novel voluntary task span paradigm. In two experiments, we consider how sequence complexity, defined separately for plan-level and task-level complexity, influences sequence encoding (Experiment 1), sequence choice (Experiment 2), sequence memory, and task performance of planned sequences of action. Results indicate that participants were sensitive to sequence complexity, but that different aspects of behavior are most strongly influenced by different types of complexity. Hierarchical complexity at the plan level best predicts voluntary sequence generation and memory; while switch frequency at the task level best predicts encoding of externally defined sequences and task performance. Furthermore, performance RTs were similar for externally and internally defined plans, whereas memory was improved for internally defined sequences. Finally, participants demonstrated a significant sequence choice bias in the voluntary task span. Consistent with past research on choice behavior, volitional selection of plans was markedly influenced by both the ease of memory and performance.

  10. 75 FR 51392 - Federal Management Regulation; Transportation Management

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-08-20

    ...; Docket Number 2010-0011, sequence 1] RIN 3090-AJ03 Federal Management Regulation; Transportation Management AGENCY: Office of Governmentwide Policy, General Services Administration (GSA). ACTION: Final rule. SUMMARY: The General Services Administration (GSA) is amending the Federal Management Regulation (FMR) by...

  11. Advances in the understanding and use of the genomic base of microbial secondary metabolite biosynthesis for the discovery of new natural products.

    PubMed

    McAlpine, James B

    2009-03-27

    Over the past decade major changes have occurred in the access to genome sequences that encode the enzymes responsible for the biosynthesis of secondary metabolites, knowledge of how those sequences translate into the final structure of the metabolite, and the ability to alter the sequence to obtain predicted products via both homologous and heterologous expression. Novel genera have been discovered leading to new chemotypes, but more surprisingly several instances have been uncovered where the apparently general rules of modular translation have not applied. Several new biosynthetic pathways have been unearthed, and our general knowledge grows rapidly. This review aims to highlight some of the more striking discoveries and advances of the decade.

  12. The age of the Keystone thrust: laser-fusion 40Ar/39Ar dating of foreland basin deposits, southern Spring Mountains, Nevada

    USGS Publications Warehouse

    Fleck, R.J.; Carr, M.D.

    1990-01-01

    Nonmarine sedimentary and volcaniclastic foreland-basin deposits in the Spring Mountains are cut by the Contact and Keystone thrusts. These synorogenic deposits, informally designated the Lavinia Wash sequence by Carr (1980), previously were assigned a Late Jurassic to Early Cretaceous(?) age. New 40Ar/39Ar laser-fusion and incremental-heating studies of a tuff bed in the Lavinia Wash sequence support a best estimate age of 99.0 ± 0.4 Ma, indicating that the Lavinia Wash sequence is actually late Early Cretaceous in age and establishing a maximum age for final emplacement of the Contact and Keystone thrust plates consistent with the remainder of the Mesozoic foreland thrust belt. -from Authors

  13. Integrated Advanced Microwave Sounding Unit-A (AMSU-A). Performance Verification Report: Final Comprehensive Performance Test Report, P/N 1331720-2TST, S/N 105/A1

    NASA Technical Reports Server (NTRS)

    Platt, R.

    1999-01-01

    This is the Performance Verification Report, Final Comprehensive Performance Test (CPT) Report, for the Integrated Advanced Microwave Sounding Unit-A (AMSU-A). This specification establishes the requirements for the CPT and Limited Performance Test (LPT) of the AMSU-1A, referred to herein as the unit. The sequence in which the several phases of this test procedure shall take place is shown.

  14. On some new normed sequence spaces

    NASA Astrophysics Data System (ADS)

    Pranajaya, G.; Herawati, E.

    2018-01-01

    The sequence spaces (c 0)Λ, c Λ, and (ℓ ∞)Λ were introduced and studied by Mursaleen and Noman [11]. In the present paper, where M is a generalization of an Orlicz function, we extend Mursaleen and Noman's spaces to [c 0(M)]Λ, [c(M)]Λ, and [ℓ ∞(M)]Λ, respectively, and investigate some topological properties of these spaces. Finally, we determine the necessary and sufficient conditions for an infinite matrix A to belong to the classes (c 0(M), c 0(M)), (c(M), c(M)), and (ℓ ∞(M), ℓ ∞(M)).

  15. A Robust Framework for Microbial Archaeology

    PubMed Central

    Warinner, Christina; Herbig, Alexander; Mann, Allison; Yates, James A. Fellows; Weiß, Clemens L.; Burbano, Hernán A.; Orlando, Ludovic; Krause, Johannes

    2017-01-01

    Microbial archaeology is flourishing in the era of high-throughput sequencing, revealing the agents behind devastating historical plagues, identifying the cryptic movements of pathogens in prehistory, and reconstructing the ancestral microbiota of humans. Here, we introduce the fundamental concepts and theoretical framework of the discipline, then discuss applied methodologies for pathogen identification and microbiome characterization from archaeological samples. We give special attention to the process of identifying, validating, and authenticating ancient microbes using high-throughput DNA sequencing data. Finally, we outline standards and precautions to guide future research in the field. PMID:28460196

  16. Plant centromeres: structure and control.

    PubMed

    Richards, E J; Dawe, R K

    1998-04-01

    Recent work has led to a better understanding of the molecular components of plant centromeres. Conservation of at least some centromere protein constituents between plant and non-plant systems has been demonstrated. The identity and organization of plant centromeric DNA sequences are also beginning to yield to analysis. While there is little primary DNA sequence conservation among the characterized plant centromeres and their non-plant counterparts, some parallels in centromere genomic organisation can be seen across species. Finally, the emerging idea that centromere activity is controlled epigenetically finds support in an examination of the plant centromere literature.

  17. Langevin synchronization in a time-dependent, harmonic basin: An exact solution in 1D

    NASA Astrophysics Data System (ADS)

    Cadilhe, A.; Voter, Arthur F.

    2018-02-01

    The trajectories of two particles undergoing Langevin dynamics while sharing a common noise sequence can merge into a single (master) trajectory. Here, we present an exact solution for a particle undergoing Langevin dynamics in a harmonic, time-dependent potential, thus extending the idea of synchronization to nonequilibrium systems. We calculate the synchronization level, i.e., the mismatch between two trajectories sharing a common noise sequence, in the underdamped, critically damped, and overdamped regimes. Finally, we provide asymptotic expansions in various limiting cases and compare to the time independent case.
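
    The merging of trajectories under a shared noise sequence can be checked numerically with a minimal sketch. It covers only the overdamped regime, integrated with a simple Euler-Maruyama step, and is not the exact solution presented in the paper. The stiffness schedule k(t) = k0 (1 + 0.5 sin(omega t)) and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def shared_noise_gap(x0=1.0, y0=-1.0, steps=2000, dt=0.01,
                     gamma=1.0, kT=1.0, k0=1.0, omega=1.0, seed=0):
    """Integrate two overdamped Langevin trajectories in a harmonic well
    with time-dependent stiffness, driven by the SAME noise sequence.

    Returns the list of gaps |x - y| after each step (index 0 is the start).
    """
    rng = random.Random(seed)
    x, y = x0, y0
    noise_scale = math.sqrt(2.0 * kT * dt / gamma)
    gaps = [abs(x - y)]
    for n in range(steps):
        t = n * dt
        k = k0 * (1.0 + 0.5 * math.sin(omega * t))  # assumed stiffness schedule
        xi = rng.gauss(0.0, 1.0)                    # one draw, used by both particles
        x += -(k / gamma) * x * dt + noise_scale * xi
        y += -(k / gamma) * y * dt + noise_scale * xi
        gaps.append(abs(x - y))
    return gaps


if __name__ == "__main__":
    gaps = shared_noise_gap()
    print(f"initial gap {gaps[0]:.3f}, final gap {gaps[-1]:.3e}")
```

    Because both particles receive the same random kick, the noise cancels in the difference x - y, which then shrinks by a factor (1 - k(t) dt / gamma) per step while the stiffness stays positive. The underdamped and critically damped cases discussed in the abstract would need velocities as well and are not covered here.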

  18. Semi-synthesis of murine prion protein by native chemical ligation and chemical activation for preparation of polypeptide-α-thioester.

    PubMed

    Shi, Lei; Chen, Huai; Zhang, Si-Yu; Chu, Ting-Ting; Zhao, Yu-Fen; Chen, Yong-Xiang; Li, Yan-Mei

    2017-06-01

    Prions are the suspected pathogens of the fatal transmissible spongiform encephalopathies. Strategies to access homogeneous prion protein (PrP) are required to fully comprehend the molecular mechanism of prion diseases. However, the polypeptide fragments from PrP show a high tendency to form aggregates, which is a major obstacle to protein synthesis and purification. In this study, murine prion sequence 90 to 230, which is the core three-dimensional structure domain, was constructed from three segments, murine PrP (mPrP)(90-177), mPrP(178-212), and mPrP(213-230), by combining protein expression, chemical synthesis and chemical ligation. The protein sequence 90 to 177 was obtained by expression and finally converted into the polypeptide hydrazide by chemical activation of a cysteine in the tail. The other two polypeptide fragments of the C-terminal were obtained by chemical synthesis, which utilized the strategies of isopeptide and pseudoproline building blocks to complete the synthesis of such difficult sequences. The three segments were finally assembled by sequential native chemical ligation. This strategy will allow more straightforward access to homogeneously modified PrP variants. Copyright © 2017 European Peptide Society and John Wiley & Sons, Ltd.

  19. Human action recognition based on spatial-temporal descriptors using key poses

    NASA Astrophysics Data System (ADS)

    Hu, Shuo; Chen, Yuxin; Wang, Huaibao; Zuo, Yaqing

    2014-11-01

    Human action recognition is an important area of pattern recognition today because of its direct application in settings such as surveillance and virtual reality. In this paper, a simple and effective human action recognition method is presented based on the key poses of the human silhouette and a spatio-temporal feature. First, the contour points of the human silhouette are extracted, the key poses are learned by K-means clustering based on the Euclidean distance between each contour point and the centre point of the silhouette, and the type of each action is labeled for later matching. Second, we obtain the trajectory of the centre point across frames and create a spatio-temporal feature value, denoted W, to describe the motion direction and speed of each action. The value W contains the location and temporal order of each point on the trajectory. Finally, the matching stage compares the key poses and W between training sequences and test sequences; the nearest-neighbor sequence is found and its label supplies the final result. Experiments on the publicly available Weizmann datasets show that the proposed method can improve accuracy by distinguishing ambiguous poses and increase suitability for real-time applications by reducing the computational cost.

  20. Structure-function analysis of diacylglycerol acyltransferase sequences from 70 organisms

    USDA-ARS's Scientific Manuscript database

    Diacylglycerol acyltransferases (DGATs) catalyze the final and rate-limiting step of triacylglycerol (TAG) biosynthesis in eukaryotic organisms. Understanding the roles of DGATs will help to create transgenic plants with value-added properties and provide clues for therapeutic intervention for obes...

  1. Sequence analysis of diacylglycerol acyltransferases

    USDA-ARS's Scientific Manuscript database

    Diacylglycerol acyltransferases (DGATs) catalyze the final step of triacylglycerol (TAG) biosynthesis in eukaryotes. DGATs esterify sn-1,2-diacylglycerol with a long-chain fatty acyl-CoA. Plants and animals deficient in DGATs accumulate less TAG and over-expression of DGATs increases TAG. DGAT knock...

  2. 77 FR 35625 - Statement of General Policy on the Sequencing of the Compliance Dates for Final Rules Applicable...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-14

    ... a ``big bang'' approach where all of the rules to be adopted under Title VII go into effect... `big bang' approach to implementation would be too disruptive to the marketplace--particularly given...

  3. DNA SEQUENCE SIMILARITY REQUIREMENTS FOR INTERSPECIFIC RECOMBINATION IN BACILLUS. (R825348)

    EPA Science Inventory

    The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...

  4. Microbial and viral-like rhodopsins present in coastal marine sediments from four polar and subpolar regions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    López, José L.; Golemba, Marcelo; Hernández, Edgardo

    Rhodopsins are broadly distributed. In this work, we analyzed 23 metagenomes corresponding to marine sediment samples from four regions that share cold climate conditions (Norway; Sweden; Argentina and Antarctica). In order to investigate the evolution of viral rhodopsin genes, an initial set of 6224 bacterial rhodopsin sequences according to COG5524 were retrieved from the 23 metagenomes. After selection by the presence of transmembrane domains and alignment, 123 viral (51) and non-viral (72) sequences (>50 amino acids) were finally included in further analysis. Viral rhodopsin genes were homologs of Phaeocystis globosa virus and Organic lake Phycodnavirus. Non-viral microbial rhodopsin genes were ascribed to Bacteroidetes, Planctomycetes, Firmicutes, Actinobacteria, Cyanobacteria, Proteobacteria, Deinococcus-Thermus and Cryptophyta and Fungi. A rescreening using Blastp, using as queries the viral sequences previously described, retrieved 30 sequences (>100 amino acids). Phylogeographic analysis revealed a geographical clustering of the sequences affiliated to the viral group. This clustering was not observed for the microbial non-viral sequences. The phylogenetic reconstruction allowed us to propose the existence of a putative ancestor of viral rhodopsin genes related to Actinobacteria and Chloroflexi. This is the first report about the existence of a phylogeographic association of the viral rhodopsin sequences from marine sediments.

  5. A Segmentation Method for Lung Parenchyma Image Sequences Based on Superpixels and a Self-Generating Neural Forest

    PubMed Central

    Liao, Xiaolei; Zhao, Juanjuan; Jiao, Cheng; Lei, Lei; Qiang, Yan; Cui, Qiang

    2016-01-01

    Background Lung parenchyma segmentation is often performed as an important pre-processing step in the computer-aided diagnosis of lung nodules based on CT image sequences. However, existing lung parenchyma image segmentation methods cannot fully segment all lung parenchyma images and have a slow processing speed, particularly for images in the top and bottom of the lung and the images that contain lung nodules. Method Our proposed method first uses the position of the lung parenchyma image features to obtain lung parenchyma ROI image sequences. A gradient and sequential linear iterative clustering algorithm (GSLIC) for sequence image segmentation is then proposed to segment the ROI image sequences and obtain superpixel samples. The SGNF, which is optimized by a genetic algorithm (GA), is then utilized for superpixel clustering. Finally, the grey and geometric features of the superpixel samples are used to identify and segment all of the lung parenchyma image sequences. Results Our proposed method achieves higher segmentation precision and greater accuracy in less time. It has an average processing time of 42.21 seconds for each dataset and an average volume pixel overlap ratio of 92.22 ± 4.02% for four types of lung parenchyma image sequences. PMID:27532214

  6. Standardization and quality management in next-generation sequencing.

    PubMed

    Endrullat, Christoph; Glökler, Jörn; Franke, Philipp; Frohme, Marcus

    2016-09-01

    DNA sequencing continues to evolve quickly even after > 30 years. Many new platforms suddenly appeared and formerly established systems have vanished in almost the same manner. Since the establishment of next-generation sequencing devices, this progress has gained momentum due to the continually growing demand for higher throughput, lower costs and better quality of data. As a consequence of this rapid development, standardized procedures and data formats as well as comprehensive quality management considerations are still scarce. Here, we listed and summarized current standardization efforts and quality management initiatives from companies, organizations and societies in the form of published studies and ongoing projects. These comprise on the one hand quality documentation issues like technical notes, accreditation checklists and guidelines for validation of sequencing workflows. On the other hand, general standard proposals and quality metrics are developed and applied to the sequencing workflow steps with the main focus on upstream processes. Finally, certain standard developments for downstream pipeline data handling, processing and storage are discussed in brief. These standardization approaches represent a first basis for continuing work in order to prospectively implement next-generation sequencing in important areas such as clinical diagnostics, where reliable results and fast processing are crucial. Additionally, these efforts will exert a decisive influence on traceability and reproducibility of sequence data.

  7. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hraber, Peter; Korber, Bette; Wagh, Kshitij

    Within-host genetic sequencing from samples collected over time provides a dynamic view of how viruses evade host immunity. Immune-driven mutations might stimulate neutralization breadth by selecting antibodies adapted to cycles of immune escape that generate within-subject epitope diversity. Comprehensive identification of immune-escape mutations is experimentally and computationally challenging. With current technology, many more viral sequences can readily be obtained than can be tested for binding and neutralization, making down-selection necessary. Typically, this is done manually, by picking variants that represent different time-points and branches on a phylogenetic tree. Such strategies are likely to miss many relevant mutations and combinations of mutations, and to be redundant for other mutations. Longitudinal Antigenic Sequences and Sites from Intrahost Evolution (LASSIE) uses transmitted founder loss to identify virus “hot-spots” under putative immune selection and chooses sequences that represent recurrent mutations in selected sites. LASSIE favors earliest sequences in which mutations arise. Here, with well-characterized longitudinal Env sequences, we confirmed selected sites were concentrated in antibody contacts and selected sequences represented diverse antigenic phenotypes. Finally, practical applications include rapidly identifying immune targets under selective pressure within a subject, selecting minimal sets of reagents for immunological assays that characterize evolving antibody responses, and for immunogens in polyvalent “cocktail” vaccines.

  8. Foreign Plastid Sequences in Plant Mitochondria are Frequently Acquired Via Mitochondrion-to-Mitochondrion Horizontal Transfer

    PubMed Central

    Gandini, C. L.; Sanchez-Puerta, M. V.

    2017-01-01

    Angiosperm mitochondrial genomes (mtDNA) exhibit variable quantities of alien sequences. Many of these sequences are acquired by intracellular gene transfer (IGT) from the plastid. In addition, frequent events of horizontal gene transfer (HGT) between mitochondria of different species also contribute to their expanded genomes. In contrast, alien sequences are rarely found in plastid genomes. Most of the plant-to-plant HGT events involve mitochondrion-to-mitochondrion transfers. Occasionally, foreign sequences in mtDNAs are plastid-derived (MTPT), raising questions about their origin, frequency, and mechanism of transfer. The rising number of complete mtDNAs allowed us to address these questions. We identified 15 new foreign MTPTs, increasing significantly the number of those previously reported. One out of five of the angiosperm species analyzed contained at least one foreign MTPT, suggesting a remarkable frequency of HGT among plants. By analyzing the flanking regions of the foreign MTPTs, we found strong evidence for mt-to-mt transfers in 65% of the cases. We hypothesize that plastid sequences were initially acquired by the native mtDNA via IGT and then transferred to a distantly-related plant via mitochondrial HGT, rather than directly from a foreign plastid to the mitochondrial genome. Finally, we describe three novel putative cases of mitochondrial-derived sequences among angiosperm plastomes. PMID:28262720

  9. Musical Scales in Tone Sequences Improve Temporal Accuracy.

    PubMed

    Li, Min S; Di Luca, Massimiliano

    2018-01-01

    Predicting the time of stimulus onset is a key component in perception. Previous investigations of perceived timing have focused on the effect of stimulus properties such as rhythm and temporal irregularity, but the influence of non-temporal properties and their role in predicting stimulus timing has not been exhaustively considered. The present study aims to understand how a non-temporal pattern in a sequence of regularly timed stimuli could improve or bias the detection of temporal deviations. We presented interspersed sequences of 3, 4, 5, and 6 auditory tones where only the timing of the last stimulus could slightly deviate from isochrony. Participants reported whether the last tone was 'earlier' or 'later' relative to the expected regular timing. In two conditions, the tones composing the sequence were either organized into musical scales or they were random tones. In one experiment, all sequences ended with the same tone; in the other experiment, each sequence ended with a different tone. Results indicate higher discriminability of anisochrony with musical scales and with longer sequences, irrespective of the knowledge of the final tone. Such an outcome suggests that the predictability of non-temporal properties, as enabled by the musical scale pattern, can be a factor in determining the sensitivity of time judgments.

  10. On the Power and the Systematic Biases of the Detection of Chromosomal Inversions by Paired-End Genome Sequencing

    PubMed Central

    Lucas Lledó, José Ignacio; Cáceres, Mario

    2013-01-01

    One of the most used techniques to study structural variation at a genome level is paired-end mapping (PEM). PEM has the advantage of being able to detect balanced events, such as inversions and translocations. However, inversions are still quite difficult to predict reliably, especially from high-throughput sequencing data. We simulated realistic PEM experiments with different combinations of read and library fragment lengths, including sequencing errors and meaningful base-qualities, to quantify and track down the origin of false positives and negatives along sequencing, mapping, and downstream analysis. We show that PEM is very appropriate to detect a wide range of inversions, even with low coverage data. However, % of inversions located between segmental duplications are expected to go undetected by the most common sequencing strategies. In general, longer DNA libraries improve the detectability of inversions far better than increments of the coverage depth or the read length. Finally, we review the performance of three algorithms to detect inversions —SVDetect, GRIAL, and VariationHunter—, identify common pitfalls, and reveal important differences in their breakpoint precisions. These results stress the importance of the sequencing strategy for the detection of structural variants, especially inversions, and offer guidelines for the design of future genome sequencing projects. PMID:23637806
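
    The core signal that paired-end mapping relies on for inversions can be sketched briefly. The following is a minimal illustration, not the method of SVDetect, GRIAL, or VariationHunter: it assumes a forward/reverse library, in which concordant pairs map to opposite strands, so pairs whose two reads map to the same strand are treated as inversion candidates. The `ReadPair` type, the function names, and the `max_gap` value are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    chrom: str
    pos1: int      # leftmost mapped coordinate of read 1
    strand1: str   # '+' or '-'
    pos2: int      # leftmost mapped coordinate of read 2
    strand2: str   # '+' or '-'


def is_inversion_candidate(pair: ReadPair) -> bool:
    """In a forward/reverse library, same-strand pairs are discordant."""
    return pair.strand1 == pair.strand2


def left_end(pair: ReadPair) -> int:
    return min(pair.pos1, pair.pos2)


def cluster_candidates(pairs, max_gap=500):
    """Group same-strand pairs whose left ends lie within max_gap of the
    previous pair on the same chromosome and strand (assumed threshold)."""
    candidates = sorted(
        (p for p in pairs if is_inversion_candidate(p)),
        key=lambda p: (p.chrom, p.strand1, left_end(p)),
    )
    clusters = []
    for pair in candidates:
        last = clusters[-1][-1] if clusters else None
        if (
            last is not None
            and last.chrom == pair.chrom
            and last.strand1 == pair.strand1
            and left_end(pair) - left_end(last) <= max_gap
        ):
            clusters[-1].append(pair)
        else:
            clusters.append([pair])
    return clusters
```

    A real caller would also model the library fragment-length distribution and mapping quality, which is where the abstract locates many of the false positives and negatives.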

  11. A new method for detecting signal regions in ordered sequences of real numbers, and application to viral genomic data.

    PubMed

    Gog, Julia R; Lever, Andrew M L; Skittrall, Jordan P

    2018-01-01

    We present a fast, robust and parsimonious approach to detecting signals in an ordered sequence of numbers. Our motivation is in seeking a suitable method to take a sequence of scores corresponding to properties of positions in virus genomes, and find outlying regions of low scores. Suitable statistical methods without using complex models or making many assumptions are surprisingly lacking. We resolve this by developing a method that detects regions of low score within sequences of real numbers. The method makes no assumptions a priori about the length of such a region; it gives the explicit location of the region and scores it statistically. It does not use detailed mechanistic models so the method is fast and will be useful in a wide range of applications. We present our approach in detail, and test it on simulated sequences. We show that it is robust to a wide range of signal morphologies, and that it is able to capture multiple signals in the same sequence. Finally we apply it to viral genomic data to identify regions of evolutionary conservation within influenza and rotavirus.

  12. Evolution of Enzyme Superfamilies: Comprehensive Exploration of Sequence-Function Relationships.

    PubMed

    Baier, F; Copp, J N; Tokuriki, N

    2016-11-22

    The sequence and functional diversity of enzyme superfamilies have expanded through billions of years of evolution from a common ancestor. Understanding how protein sequence and functional "space" have expanded, at both the evolutionary and molecular level, is central to biochemistry, molecular biology, and evolutionary biology. Integrative approaches that examine protein sequence, structure, and function have begun to provide comprehensive views of the functional diversity and evolutionary relationships within enzyme superfamilies. In this review, we outline the recent advances in our understanding of enzyme evolution and superfamily functional diversity. We describe the tools that have been used to comprehensively analyze sequence relationships and to characterize sequence and function relationships. We also highlight recent large-scale experimental approaches that systematically determine the activity profiles across enzyme superfamilies. We identify several intriguing insights from this recent body of work. First, promiscuous activities are prevalent among extant enzymes. Second, many divergent proteins retain "function connectivity" via enzyme promiscuity, which can be used to probe the evolutionary potential and history of enzyme superfamilies. Finally, we discuss open questions regarding the intricacies of enzyme divergence, as well as potential research directions that will deepen our understanding of enzyme superfamily evolution.

  13. Cloud-based adaptive exon prediction for DNA analysis

    PubMed Central

    Putluri, Srinivasareddy; Fathima, Shaik Yasmeen

    2018-01-01

    Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of such sensitive data. Under the conventional flow of gene information, gene sequence laboratories send out raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs can be minimised by use of a cloud service. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. Accurate identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques are found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using variable normalised least mean square and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of the various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database. PMID:29515813
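
    The three-base periodicity property mentioned above can be illustrated with a short sketch. This is not the adaptive (normalised least mean square) predictor described in the abstract; it swaps in a plain Fourier measure, summing the spectral power at frequency 1/3 over the four binary base-indicator sequences. The function name and the normalisation by sequence length are assumptions made for illustration.

```python
import cmath


def period3_power(seq: str) -> float:
    """Spectral power at frequency 1/3, summed over the A, C, G and T
    indicator sequences and divided by the sequence length."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0
    total = 0.0
    for base in "ACGT":
        # Fourier sum of the 0/1 indicator sequence of this base at f = 1/3.
        component = sum(
            cmath.exp(-2j * cmath.pi * i / 3)
            for i, b in enumerate(seq)
            if b == base
        )
        total += abs(component) ** 2
    return total / n
```

    Evaluated in a sliding window along a sequence, a measure of this kind tends to peak in coding regions; a perfectly periodic string such as `"ATG" * 30` scores high, while a homopolymer scores close to zero.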

  14. Hepatitis C virus genotypes in Singapore and Indonesia.

    PubMed

    Ng, W C; Guan, R; Tan, M F; Seet, B L; Lim, C A; Ngiam, C M; Sjaifoellah Noer, H M; Lesmana, L

    1995-01-01

    5' untranslated and partial core (C) region sequence of hepatitis C virus (HCV) in 21 Singaporean and 15 Indonesian isolates were amplified by reverse-transcription polymerase chain reaction and sequenced with the use of conserved primer sequences deduced from HCV genomes identified in other geographical regions. The HCV genotypes are predominantly that of Simmonds type 1 and less of type 2 and 3 with the latter genotype currently not detected in Indonesia. The 5' untranslated sequences are related to HCV-1, DK-7 (Denmark), US-11 (United States of America), HCV-J4, SA-10 (South Africa), T-3 (Taiwan), HCV-J6, HCV-J8, Eb-1 and Eb-8. When compared with the prototype HCV-1, insertions are found within the 5' untranslated region of Singaporean isolates and not in the Indonesians. There are Singaporean and Indonesian isolates that have sequences within the 5' untranslated region that differ slightly from each other. Microheterogeneity is observed in the core region of two Singaporeans and one Indonesian isolate. Finally, not all HCV isolates can be amplified with the conserved core sequence primers when compared with the ease with which these isolates can be amplified with 5' untranslated region conserved primers.

  15. FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation

    DOE PAGES

    Bolleman, Jerven T.; Mungall, Christopher J.; Strozzi, Francesco; ...

    2016-06-13

    Nucleotide and protein sequence feature annotations are essential to understand biology on the genomic, transcriptomic, and proteomic level. Using Semantic Web technologies to query biological annotations, there was no standard that described this potentially complex location information as subject-predicate-object triples. In this paper, we have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned “omics” areas. Using the same data format to represent sequence positions that are independent of file formats allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations. Our ontology allows users to uniformly describe – and potentially merge – sequence annotations from multiple sources. Finally, data sources using FALDO can prospectively be retrieved using federalised SPARQL queries against public SPARQL endpoints and/or local private triple stores.
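
    To make the subject-predicate-object idea concrete, the sketch below builds a small Turtle snippet that describes one feature location as a FALDO region with a begin and an end position. It is only an illustration: the helper name is hypothetical, the feature and reference IRIs in the usage note are placeholders under example.org, and the snippet covers just the basic region/position pattern rather than the full ontology.

```python
FALDO_PREFIX = "@prefix faldo: <http://biohackathon.org/resource/faldo#> ."


def faldo_region_turtle(feature_iri, reference_iri, begin, end, forward=True):
    """Return Turtle text locating a feature on a reference sequence.

    Hypothetical helper: the IRIs are supplied by the caller, and each
    position is modelled as an exact, stranded position.
    """
    strand = (
        "faldo:ForwardStrandPosition" if forward
        else "faldo:ReverseStrandPosition"
    )

    def position(value):
        return (
            f"[ a faldo:ExactPosition, {strand} ;\n"
            f"        faldo:position {value} ;\n"
            f"        faldo:reference <{reference_iri}> ]"
        )

    return "\n".join([
        FALDO_PREFIX,
        "",
        f"<{feature_iri}> faldo:location [",
        "    a faldo:Region ;",
        f"    faldo:begin {position(begin)} ;",
        f"    faldo:end {position(end)}",
        "] .",
    ])
```

    For example, `faldo_region_turtle("http://example.org/gene1", "http://example.org/chr1", 100, 250)` yields text that could be loaded into a triple store and queried with SPARQL alongside annotations from other sources.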

  16. VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research

    PubMed Central

    Lai, Zhongwu; Markovets, Aleksandra; Ahdesmaki, Miika; Chapman, Brad; Hofmann, Oliver; McEwen, Robert; Johnson, Justin; Dougherty, Brian; Barrett, J. Carl; Dry, Jonathan R.

    2016-01-01

    Accurate variant calling in next generation sequencing (NGS) is critical to understand cancer genomes better. Here we present VarDict, a novel and versatile variant caller for both DNA- and RNA-sequencing data. VarDict simultaneously calls SNV, MNV, InDels, complex and structural variants, expanding the detected genetic driver landscape of tumors. It performs local realignments on the fly for more accurate allele frequency estimation. VarDict performance scales linearly to sequencing depth, enabling ultra-deep sequencing used to explore tumor evolution or detect tumor DNA circulating in blood. In addition, VarDict performs amplicon aware variant calling for polymerase chain reaction (PCR)-based targeted sequencing often used in diagnostic settings, and is able to detect PCR artifacts. Finally, VarDict also detects differences in somatic and loss of heterozygosity variants between paired samples. VarDict reprocessing of The Cancer Genome Atlas (TCGA) Lung Adenocarcinoma dataset called known driver mutations in KRAS, EGFR, BRAF, PIK3CA and MET in 16% more patients than previously published variant calls. We believe VarDict will greatly facilitate application of NGS in clinical cancer research. PMID:27060149

  17. Targeted or whole genome sequencing of formalin fixed tissue samples: potential applications in cancer genomics.

    PubMed

    Munchel, Sarah; Hoang, Yen; Zhao, Yue; Cottrell, Joseph; Klotzle, Brandy; Godwin, Andrew K; Koestler, Devin; Beyerlein, Peter; Fan, Jian-Bing; Bibikova, Marina; Chien, Jeremy

    2015-09-22

    Current genomic studies are limited by the poor availability of fresh-frozen tissue samples. Although formalin-fixed diagnostic samples are in abundance, they are seldom used in current genomic studies because of the concern of formalin-fixation artifacts. Better characterization of these artifacts will allow the use of archived clinical specimens in translational and clinical research studies. To provide a systematic analysis of formalin-fixation artifacts on Illumina sequencing, we generated 26 DNA sequencing data sets from 13 pairs of matched formalin-fixed paraffin-embedded (FFPE) and fresh-frozen (FF) tissue samples. The results indicate a high rate of concordant calls between matched FF/FFPE pairs at reference and variant positions in three commonly used sequencing approaches (whole genome, whole exome, and targeted exon sequencing). Global mismatch rates and C · G > T · A substitutions were comparable between matched FF/FFPE samples, and discordant rates were low (<0.26%) in all samples. Finally, low-pass whole genome sequencing produces a similar pattern of copy number alterations between FF/FFPE pairs. The results from our studies suggest the potential use of diagnostic FFPE samples for cancer genomic studies to characterize and catalog variations in cancer genomes.

  18. A short review of variants calling for single-cell-sequencing data with applications.

    PubMed

    Wei, Zhuohui; Shu, Chang; Zhang, Changsheng; Huang, Jingying; Cai, Hongmin

    2017-11-01

    The field of single-cell sequencing is rapidly expanding, and many techniques have been developed in the past decade. With this technology, biologists can study not only the heterogeneity between two adjacent cells in the same tissue or organ, but also the evolutionary relationships and degenerative processes in a single cell. Calling variants is the main purpose in analyzing single cell sequencing (SCS) data. Currently, some popular methods used for bulk-cell-sequencing data analysis are tailored directly to be applied in dealing with SCS data. However, SCS requires an extra step of genome amplification to accumulate enough material for sequencing. The amplification yields large biases and thus raises challenges for using the bulk-cell-sequencing methods. In order to provide guidance for the development of specialized analysis methods as well as for using currently developed tools for SCS, this paper aims to bridge the gap. In this paper, we first introduce two popular genome amplification methods and compare their capabilities. Then we introduce a few popular models for calling single-nucleotide polymorphisms and copy-number variations. Finally, breakthrough applications of SCS are summarized to demonstrate its potential in researching cell evolution. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. TagDust2: a generic method to extract reads from sequencing data.

    PubMed

    Lassmann, Timo

    2015-01-28

    Arguably the most basic step in the analysis of next generation sequencing data (NGS) involves the extraction of mappable reads from the raw reads produced by sequencing instruments. The presence of barcodes, adaptors and artifacts subject to sequencing errors makes this step non-trivial. Here I present TagDust2, a generic approach utilizing a library of hidden Markov models (HMM) to accurately extract reads from a wide array of possible read architectures. TagDust2 extracts more reads of higher quality compared to other approaches. Processing of multiplexed single, paired end and libraries containing unique molecular identifiers is fully supported. Two additional post processing steps are included to exclude known contaminants and filter out low complexity sequences. Finally, TagDust2 can automatically detect the library type of sequenced data from a predefined selection. Taken together TagDust2 is a feature rich, flexible and adaptive solution to go from raw to mappable NGS reads in a single step. The ability to recognize and record the contents of raw reads will help to automate and demystify the initial, and often poorly documented, steps in NGS data analysis pipelines. TagDust2 is freely available at: http://tagdust.sourceforge.net .

  20. FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bolleman, Jerven T.; Mungall, Christopher J.; Strozzi, Francesco

    Nucleotide and protein sequence feature annotations are essential to understand biology on the genomic, transcriptomic, and proteomic level. Using Semantic Web technologies to query biological annotations, there was no standard that described this potentially complex location information as subject-predicate-object triples. In this paper, we have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned “omics” areas. Using the same data format to represent sequence positions that are independent of file formats allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations. Our ontology allows users to uniformly describe – and potentially merge – sequence annotations from multiple sources. Finally, data sources using FALDO can prospectively be retrieved using federalised SPARQL queries against public SPARQL endpoints and/or local private triple stores.

  1. The practical evaluation of DNA barcode efficacy.

    PubMed

    Spouge, John L; Mariño-Ramírez, Leonardo

    2012-01-01

    This chapter describes a workflow for measuring the efficacy of a barcode in identifying species. First, assemble individual sequence databases corresponding to each barcode marker. A controlled collection of taxonomic data is preferable to GenBank data, because GenBank data can be problematic, particularly when comparing barcodes based on more than one marker. To ensure proper controls when evaluating species identification, specimens not having a sequence in every marker database should be discarded. Second, select a computer algorithm for assigning species to barcode sequences. No algorithm has yet improved notably on assigning a specimen to the species of its nearest neighbor within a barcode database. Because global sequence alignments (e.g., with the Needleman-Wunsch algorithm, or some related algorithm) examine entire barcode sequences, they generally produce better species assignments than local sequence alignments (e.g., with BLAST). No neighboring method (e.g., global sequence similarity, global sequence distance, or evolutionary distance based on a global alignment) has yet shown a notable superiority in identifying species. Finally, "the probability of correct identification" (PCI) provides an appropriate measurement of barcode efficacy. The overall PCI for a data set is the average of the species PCIs, taken over all species in the data set. This chapter states explicitly how to calculate PCI, how to estimate its statistical sampling error, and how to use data on PCR failure to set limits on how much improvements in PCR technology can improve species identification.
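
    The statement that the overall PCI is the average of the species PCIs can be shown with a small sketch. It assumes, for illustration only, that a species PCI is the fraction of that species' specimens assigned to the correct species; the chapter's own definition and its sampling-error estimate are not reproduced here, and the function name and input layout are hypothetical.

```python
from collections import defaultdict


def overall_pci(results):
    """results: iterable of (true_species, assigned_species) pairs.

    Returns (overall PCI, per-species PCI dict). The overall value is the
    unweighted mean over species, not the pooled fraction of specimens.
    """
    counts = defaultdict(lambda: [0, 0])  # species -> [correct, total]
    for true_species, assigned_species in results:
        counts[true_species][1] += 1
        if assigned_species == true_species:
            counts[true_species][0] += 1
    if not counts:
        raise ValueError("no identification results supplied")
    species_pci = {sp: correct / total for sp, (correct, total) in counts.items()}
    return sum(species_pci.values()) / len(species_pci), species_pci
```

    Averaging over species rather than over specimens keeps heavily sampled species from dominating the score: with four specimens of one species (three correct) and two of another (one correct), the overall PCI is (0.75 + 0.5) / 2 = 0.625, whereas the pooled fraction would be 4/6.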

  2. Longitudinal Antigenic Sequences and Sites from Intra-Host Evolution (LASSIE) identifies immune-selected HIV variants

    DOE PAGES

    Hraber, Peter; Korber, Bette; Wagh, Kshitij; ...

    2015-10-21

    Within-host genetic sequencing from samples collected over time provides a dynamic view of how viruses evade host immunity. Immune-driven mutations might stimulate neutralization breadth by selecting antibodies adapted to cycles of immune escape that generate within-subject epitope diversity. Comprehensive identification of immune-escape mutations is experimentally and computationally challenging. With current technology, many more viral sequences can readily be obtained than can be tested for binding and neutralization, making down-selection necessary. Typically, this is done manually, by picking variants that represent different time-points and branches on a phylogenetic tree. Such strategies are likely to miss many relevant mutations and combinations of mutations, and to be redundant for other mutations. Longitudinal Antigenic Sequences and Sites from Intrahost Evolution (LASSIE) uses transmitted founder loss to identify virus “hot-spots” under putative immune selection and chooses sequences that represent recurrent mutations in selected sites. LASSIE favors earliest sequences in which mutations arise. Here, with well-characterized longitudinal Env sequences, we confirmed selected sites were concentrated in antibody contacts and selected sequences represented diverse antigenic phenotypes. Finally, practical applications include rapidly identifying immune targets under selective pressure within a subject, selecting minimal sets of reagents for immunological assays that characterize evolving antibody responses, and for immunogens in polyvalent “cocktail” vaccines.

  3. A segmentation method for lung nodule image sequences based on superpixels and density-based spatial clustering of applications with noise

    PubMed Central

    Zhang, Wei; Zhang, Xiaolong; Qiang, Yan; Tian, Qi; Tang, Xiaoxian

    2017-01-01

    The fast and accurate segmentation of lung nodule image sequences is the basis of subsequent processing and diagnostic analyses. However, previous research investigating nodule segmentation algorithms cannot entirely segment cavitary nodules, and the segmentation of juxta-vascular nodules is inaccurate and inefficient. To solve these problems, we propose a new method for the segmentation of lung nodule image sequences based on superpixels and density-based spatial clustering of applications with noise (DBSCAN). First, our method uses three-dimensional computed tomography image features of the average intensity projection combined with multi-scale dot enhancement for preprocessing. Hexagonal clustering and morphological optimized sequential linear iterative clustering (HMSLIC) for sequence image oversegmentation is then proposed to obtain superpixel blocks. The adaptive weight coefficient is then constructed to calculate the distance required between superpixels to achieve precise lung nodules positioning and to obtain the subsequent clustering starting block. Moreover, by fitting the distance and detecting the change in slope, an accurate clustering threshold is obtained. Thereafter, a fast DBSCAN superpixel sequence clustering algorithm, which is optimized by the strategy of only clustering the lung nodules and adaptive threshold, is then used to obtain lung nodule mask sequences. Finally, the lung nodule image sequences are obtained. The experimental results show that our method rapidly, completely and accurately segments various types of lung nodule image sequences. PMID:28880916

  4. A pipeline of programs for collecting and analyzing group II intron retroelement sequences from GenBank

    PubMed Central

    2013-01-01

    Background: Accurate and complete identification of mobile elements is a challenging task in the current era of sequencing, given their large numbers and frequent truncations. Group II intron retroelements, which consist of a ribozyme and an intron-encoded protein (IEP), are usually identified in bacterial genomes through their IEP; however, the RNA component that defines the intron boundaries is often difficult to identify because of a lack of strong sequence conservation corresponding to the RNA structure. Compounding the problem of boundary definition is the fact that a majority of group II intron copies in bacteria are truncated. Results: Here we present a pipeline of 11 programs that collect and analyze group II intron sequences from GenBank. The pipeline begins with a BLAST search of GenBank using a set of representative group II IEPs as queries. Subsequent steps download the corresponding genomic sequences and flanks, filter out non-group II introns, assign introns to phylogenetic subclasses, filter out incomplete and/or non-functional introns, and assign IEP sequences and RNA boundaries to the full-length introns. In the final step, the redundancy in the data set is reduced by grouping introns into sets of ≥95% identity, with one example sequence chosen to be the representative. Conclusions: These programs should be useful for comprehensive identification of group II introns in sequence databases as data continue to rapidly accumulate. PMID:24359548
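
    The final redundancy-reduction step can be sketched as a greedy grouping. This is not the pipeline's own program: the greedy strategy, the function names, and the similarity measure are assumptions. In particular, `difflib.SequenceMatcher.ratio()` is used only as a stand-in for an alignment-based percent identity, which a real implementation would compute from a proper alignment.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in for percent identity (assumption): difflib's ratio in [0, 1]."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def group_by_identity(seqs, threshold=0.95, identity=similarity):
    """Greedy grouping: each sequence joins the first group whose
    representative (its first member) it matches at >= threshold,
    otherwise it starts a new group and becomes its representative."""
    groups = []
    # Longest sequences first, so representatives tend to be full length.
    for seq in sorted(seqs, key=len, reverse=True):
        for group in groups:
            if identity(group[0], seq) >= threshold:
                group.append(seq)
                break
        else:
            groups.append([seq])
    return groups
```

    The first member of each returned group plays the role of the representative sequence; passing a different `identity` callable lets an alignment-based measure replace the stand-in without changing the grouping logic.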

  5. Definition of Cis-Acting Elements Regulating Expression of the Drosophila Melanogaster Ninae Opsin Gene by Oligonucleotide-Directed Mutagenesis

    PubMed Central

    Mismer, D.; Rubin, G. M.

    1989-01-01

    We have analyzed the cis-acting regulatory sequences of the Rh1 (ninaE) gene in Drosophila melanogaster by P-element-mediated germline transformation of indicator genes transcribed from mutant ninaE promoter sequences. We have previously shown that a 200-bp region extending from -120 to +67 relative to the transcription start site is sufficient to obtain eye-specific expression from the ninaE promoter. In the present study, 22 different 4-13-bp sequences in the -120/+67 promoter region were altered by oligonucleotide-directed mutagenesis. Several of these sequences were found to be required for proper promoter function; two of these are conserved in the promoter of the homologous gene isolated from the related species Drosophila virilis. Alteration of a conserved 9-bp sequence results in aberrant, low level expression in the body. Alteration of a separate 11-bp sequence, found in the promoter regions of several photoreceptor-specific genes of Drosophila, results in an approximately 15-fold reduction in promoter efficiency but without apparent alteration of tissue-specificity. A protein factor capable of interacting with this 11-bp sequence has been detected by DNaseI footprinting in embryonic nuclear extracts. Finally, we have further characterized two separable enhancer sequences previously shown to be required for normal levels of expression from this promoter. PMID:2521839

  6. Temporal and spatial localization of prediction-error signals in the visual brain.

    PubMed

    Johnston, Patrick; Robinson, Jonathan; Kokkinakis, Athanasios; Ridgeway, Samuel; Simpson, Michael; Johnson, Sam; Kaufman, Jordy; Young, Andrew W

    2017-04-01

    It has been suggested that the brain pre-empts changes in the environment through generating predictions, although real-time electrophysiological evidence of prediction violations in the domain of visual perception remains elusive. In a series of experiments we showed participants sequences of images that followed a predictable implied sequence or whose final image violated the implied sequence. Through careful design we were able to use the same final image transitions across predictable and unpredictable conditions, ensuring that any differences in neural responses were due only to preceding context and not to the images themselves. EEG and MEG recordings showed that early (N170) and mid-latency (N300) visual evoked potentials were robustly modulated by images that violated the implied sequence across a range of types of image change (expression deformations, rigid-rotations and visual field location). This modulation occurred irrespective of stimulus object category. Although the stimuli were static images, MEG source reconstruction of the early latency signal (N/M170) localized expectancy violation signals to brain areas associated with motion perception. Our findings suggest that the N/M170 can index mismatches between predicted and actual visual inputs in a system that predicts trajectories based on ongoing context. More generally we suggest that the N/M170 may reflect a "family" of brain signals generated across widespread regions of the visual brain indexing the resolution of top-down influences and incoming sensory data. This has important implications for understanding the N/M170 and investigating how the brain represents context to generate perceptual predictions. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Alignment of 1000 Genomes Project reads to reference assembly GRCh38.

    PubMed

    Zheng-Bradley, Xiangqun; Streeter, Ian; Fairley, Susan; Richardson, David; Clarke, Laura; Flicek, Paul

    2017-07-01

    The 1000 Genomes Project produced more than 100 trillion basepairs of short read sequence from more than 2600 samples in 26 populations over a period of five years. In its final phase, the project released over 85 million genotyped and phased variants on human reference genome assembly GRCh37. An updated reference assembly, GRCh38, was released in late 2013, but there was insufficient time for the final phase of the project analysis to change to the new assembly. Although it is possible to lift the coordinates of the 1000 Genomes Project variants to the new assembly, this is a potentially error-prone process as coordinate remapping is most appropriate only for non-repetitive regions of the genome and those that did not see significant change between the two assemblies. It will also miss variants in any region that was newly added to GRCh38. Thus, to produce the highest quality variants and genotypes on GRCh38, the best strategy is to realign the reads and recall the variants based on the new alignment. As the first step of variant calling for the 1000 Genomes Project data, we have finished remapping all of the 1000 Genomes sequence reads to GRCh38 with alternative scaffold-aware BWA-MEM. The resulting alignments are available as CRAM, a reference-based sequence compression format. The data have been released on our FTP site and are also available from European Nucleotide Archive to facilitate researchers discovering variants on the primary sequences and alternative contigs of GRCh38. © The Authors 2017. Published by Oxford University Press.

  8. Discriminative prediction of mammalian enhancers from DNA sequence

    PubMed Central

    Lee, Dongwon; Karchin, Rachel; Beer, Michael A.

    2011-01-01

    Accurately predicting regulatory sequences and enhancers in entire genomes is an important but difficult problem, especially in large vertebrate genomes. With the advent of ChIP-seq technology, experimental detection of genome-wide EP300/CREBBP bound regions provides a powerful platform to develop predictive tools for regulatory sequences and to study their sequence properties. Here, we develop a support vector machine (SVM) framework which can accurately identify EP300-bound enhancers using only genomic sequence and an unbiased set of general sequence features. Moreover, we find that the predictive sequence features identified by the SVM classifier reveal biologically relevant sequence elements enriched in the enhancers, but we also identify other features that are significantly depleted in enhancers. The predictive sequence features are evolutionarily conserved and spatially clustered, providing further support of their functional significance. Although our SVM is trained on experimental data, we also predict novel enhancers and show that these putative enhancers are significantly enriched in both ChIP-seq signal and DNase I hypersensitivity signal in the mouse brain and are located near relevant genes. Finally, we present results of comparisons between other EP300/CREBBP data sets using our SVM and uncover sequence elements enriched and/or depleted in the different classes of enhancers. Many of these sequence features play a role in specifying tissue-specific or developmental-stage-specific enhancer activity, but our results indicate that some features operate in a general or tissue-independent manner. In addition to providing a high confidence list of enhancer targets for subsequent experimental investigation, these results contribute to our understanding of the general sequence structure of vertebrate enhancers. PMID:21875935

  9. The quest for rare variants: pooled multiplexed next generation sequencing in plants.

    PubMed

    Marroni, Fabio; Pinosio, Sara; Morgante, Michele

    2012-01-01

    Next generation sequencing (NGS) instruments produce an unprecedented amount of sequence data at contained costs. This gives researchers the possibility of designing studies with adequate power to identify rare variants at a fraction of the economic and labor resources required by individual Sanger sequencing. As of today, few research groups working in plant sciences have exploited this potentiality, showing that pooled NGS provides results in excellent agreement with those obtained by individual Sanger sequencing. The aim of this review is to convey to the reader the general ideas underlying the use of pooled NGS for the identification of rare variants. To facilitate a thorough understanding of the possibilities of the method, we will explain in detail the possible experimental and analytical approaches and discuss their advantages and disadvantages. We will show that information on allele frequency obtained by pooled NGS can be used to accurately compute basic population genetics indexes such as allele frequency, nucleotide diversity, and Tajima's D. Finally, we will discuss applications and future perspectives of the multiplexed NGS approach.
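
    As an illustration of how pooled allele frequencies feed into a population genetics index, the sketch below computes nucleotide diversity for biallelic sites with the standard expected-heterozygosity estimator, 2p(1 - p) scaled by n/(n - 1). It is a simplification under stated assumptions: the function and parameter names are hypothetical, the frequencies are taken as exact, and the extra sampling noise that pooling and finite read depth introduce is ignored. Tajima's D is not covered.

```python
def nucleotide_diversity(alt_freqs, n_chromosomes, seq_length):
    """Per-site nucleotide diversity from alternate-allele frequencies.

    alt_freqs: frequency of the alternate allele at each variable site
    n_chromosomes: number of chromosomes in the pool (sample-size correction)
    seq_length: number of sites surveyed, variable or not
    """
    if n_chromosomes < 2:
        raise ValueError("need at least two chromosomes")
    if seq_length <= 0:
        raise ValueError("seq_length must be positive")
    correction = n_chromosomes / (n_chromosomes - 1)
    heterozygosity = sum(2.0 * p * (1.0 - p) for p in alt_freqs)
    return correction * heterozygosity / seq_length
```

    For a pool of 10 diploid individuals (20 chromosomes) with two variable sites at frequencies 0.5 and 0.1 in a 1000-bp region, the call is `nucleotide_diversity([0.5, 0.1], 20, 1000)`.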

  10. Current state-of-art of STR sequencing in forensic genetics.

    PubMed

    Alonso, Antonio; Barrio, Pedro A; Müller, Petra; Köcher, Steffi; Berger, Burkhard; Martin, Pablo; Bodner, Martin; Willuweit, Sascha; Parson, Walther; Roewer, Lutz; Budowle, Bruce

    2018-05-11

    The current state of validation and implementation strategies of MPS technology for the analysis of STR markers for forensic genetics use is described, covering the topics of the current catalogue of commercial MPS-STR panels, leading MPS-platforms, and MPS-STR data analysis tools. In addition, the developmental and internal validation studies carried out to date to evaluate reliability, sensitivity, mixture analysis, concordance, and the ability to analyze challenged samples are summarized. The results of various MPS-STR population studies that showed a large number of new STR sequence variants that increase the power of discrimination in several forensically-relevant loci are also presented. Finally, various initiatives developed by several international projects and standardization (or guidelines) groups to facilitate application of MPS technology for STR marker analyses are discussed in regard to promoting a standard STR sequence nomenclature, performing population studies to detect sequence variants, and developing a universal system to translate sequence variants into a simple STR nomenclature (numbers and letters) compatible with national STR databases. This article is protected by copyright. All rights reserved.

  11. Whole Genome Sequence of Two Wild-Derived Mus musculus domesticus Inbred Strains, LEWES/EiJ and ZALENDE/EiJ, with Different Diploid Numbers

    PubMed Central

    Morgan, Andrew P.; Didion, John P.; Doran, Anthony G.; Holt, James M.; McMillan, Leonard; Keane, Thomas M.; de Villena, Fernando Pardo-Manuel

    2016-01-01

    Wild-derived mouse inbred strains are becoming increasingly popular for complex traits analysis, evolutionary studies, and systems genetics. Here, we report the whole-genome sequencing of two wild-derived mouse inbred strains, LEWES/EiJ and ZALENDE/EiJ, of Mus musculus domesticus origin. These two inbred strains were selected based on their geographic origin, karyotype, and use in ongoing research. We generated 14× and 18× coverage sequence, respectively, and discovered over 1.1 million novel variants, most of which are private to one of these strains. This report expands the number of wild-derived inbred genomes in the Mus genus from six to eight. The sequence variation can be accessed via an online query tool; variant calls (VCF format) and alignments (BAM format) are available for download from a dedicated ftp site. Finally, the sequencing data have also been stored in a lossless, compressed, and indexed format using the multi-string Burrows-Wheeler transform. All data can be used without restriction. PMID:27765810

  12. RNA Relics and Origin of Life

    PubMed Central

    Demongeot, Jacques; Glade, Nicolas; Moreira, Andrés; Vial, Laurent

    2009-01-01

    A number of small RNA sequences, located in different non-coding sequences and highly preserved across the tree of life, have been suggested to be molecular fossils, of ancient (and possibly primordial) origin. On the other hand, recent years have revealed the existence of ubiquitous roles for small RNA sequences in modern organisms, in functions ranging from cell regulation to antiviral activity. We propose that a single thread can be followed from the beginning of life in RNA structures selected only for stability reasons through the RNA relics and up to the current coevolution of RNA sequences; such an understanding would shed light both on the history and on the present development of the RNA machinery and interactions. After presenting the evidence (by comparing their sequences) that points toward a common thread, we discuss a scenario of genome coevolution (with emphasis on viral infectious processes) and finally propose a plan for the reevaluation of the stereochemical theory of the genetic code; we claim that it may still be relevant, and not only for understanding the origin of life, but also for a comprehensive picture of regulation in present-day cells. PMID:20111682

  13. Nanowire-nanopore transistor sensor for DNA detection during translocation

    NASA Astrophysics Data System (ADS)

    Xie, Ping; Xiong, Qihua; Fang, Ying; Qing, Quan; Lieber, Charles

    2011-03-01

    Nanopore sequencing, a promising low-cost, high-throughput sequencing technique, was proposed more than a decade ago. Because of the incompatibility between the small ionic current signal and the fast translocation speed, and the technical difficulties of large-scale integration of nanopores for direct ionic current sequencing, alternative methods that rely on integrated DNA sensors have been proposed, such as those using capacitive coupling or tunnelling current. However, none of them has yet been demonstrated experimentally. Here we show, for the first time, that an amplified sensor signal has been experimentally recorded from a nanowire-nanopore field effect transistor sensor during DNA translocation. Independent multi-channel recording was also demonstrated for the first time. Our results suggest that the signal arises from a highly localized potential change caused by DNA translocation under non-balanced buffer conditions. Given that this method may produce a larger signal for smaller nanopores, we hope our experiment can be a starting point for a new generation of nanopore sequencing devices with larger signal, higher bandwidth, and large-scale multiplexing capability, and can finally realize the ultimate goal of low-cost, high-throughput sequencing.

  14. High-fidelity target sequencing of individual molecules identified using barcode sequences: de novo detection and absolute quantitation of mutations in plasma cell-free DNA from cancer patients.

    PubMed

    Kukita, Yoji; Matoba, Ryo; Uchida, Junji; Hamakawa, Takuya; Doki, Yuichiro; Imamura, Fumio; Kato, Kikuya

    2015-08-01

    Circulating tumour DNA (ctDNA) is an emerging field of cancer research. However, current ctDNA analysis is usually restricted to one or a few mutation sites due to technical limitations. In the case of massively parallel DNA sequencers, the number of false positives caused by a high read error rate is a major problem. In addition, the final sequence reads do not represent the original DNA population due to the global amplification step during the template preparation. We established a high-fidelity target sequencing system of individual molecules identified in plasma cell-free DNA using barcode sequences; this system consists of the following two steps. (i) A novel target sequencing method that adds barcode sequences by adaptor ligation. This method uses linear amplification to eliminate the errors introduced during the early cycles of polymerase chain reaction. (ii) The monitoring and removal of erroneous barcode tags. This process involves the identification of individual molecules that have been sequenced and for which the number of mutations have been absolute quantitated. Using plasma cell-free DNA from patients with gastric or lung cancer, we demonstrated that the system achieved near complete elimination of false positives and enabled de novo detection and absolute quantitation of mutations in plasma cell-free DNA. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  15. Effect of the sequence data deluge on the performance of methods for detecting protein functional residues.

    PubMed

    Garrido-Martín, Diego; Pazos, Florencio

    2018-02-27

    The exponential accumulation of new sequences in public databases is expected to improve the performance of all the approaches for predicting protein structural and functional features. Nevertheless, this was never assessed or quantified for some widely used methodologies, such as those aimed at detecting functional sites and functional subfamilies in protein multiple sequence alignments. Using raw protein sequences as only input, these approaches can detect fully conserved positions, as well as those with a family-dependent conservation pattern. Both types of residues are routinely used as predictors of functional sites and, consequently, understanding how the sequence content of the databases affects them is relevant and timely. In this work we evaluate how the growth and change with time in the content of sequence databases affect five sequence-based approaches for detecting functional sites and subfamilies. We do that by recreating historical versions of the multiple sequence alignments that would have been obtained in the past based on the database contents at different time points, covering a period of 20 years. Applying the methods to these historical alignments allows quantifying the temporal variation in their performance. Our results show that the number of families to which these methods can be applied sharply increases with time, while their ability to detect potentially functional residues remains almost constant. These results are informative for the methods' developers and final users, and may have implications in the design of new sequencing initiatives.

  16. GuiTope: an application for mapping random-sequence peptides to protein sequences.

    PubMed

    Halperin, Rebecca F; Stafford, Phillip; Emery, Jack S; Navalkar, Krupa Arun; Johnston, Stephen Albert

    2012-01-03

    Random-sequence peptide libraries are a commonly used tool to identify novel ligands for binding antibodies, other proteins, and small molecules. It is often of interest to compare the selected peptide sequences to the natural protein binding partners to infer the exact binding site or the importance of particular residues. The ability to search a set of sequences for similarity to a set of peptides may sometimes enable the prediction of an antibody epitope or a novel binding partner. We have developed a software application designed specifically for this task. GuiTope provides a graphical user interface for aligning peptide sequences to protein sequences. All alignment parameters are accessible to the user including the ability to specify the amino acid frequency in the peptide library; these frequencies often differ significantly from those assumed by popular alignment programs. It also includes a novel feature to align di-peptide inversions, which we have found improves the accuracy of antibody epitope prediction from peptide microarray data and shows utility in analyzing phage display datasets. Finally, GuiTope can randomly select peptides from a given library to estimate a null distribution of scores and calculate statistical significance. GuiTope provides a convenient method for comparing selected peptide sequences to protein sequences, including flexible alignment parameters, novel alignment features, ability to search a database, and statistical significance of results. The software is available as an executable (for PC) at http://www.immunosignature.com/software and ongoing updates and source code will be available at sourceforge.net.
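
    The idea of drawing random peptides from the library to build a null distribution of scores can be sketched as a simple permutation-style test. This is a minimal illustration only: `toy_score` is a hypothetical stand-in (identity counts over ungapped placements) and is not GuiTope's alignment score, and the function names, draw count, and seed are assumptions.

```python
import random

def toy_score(peptide, protein):
    """Hypothetical stand-in score: the best count of identical residues
    over all ungapped placements of the peptide along the protein."""
    best = 0
    for offset in range(len(protein) - len(peptide) + 1):
        window = protein[offset:offset + len(peptide)]
        matches = sum(a == b for a, b in zip(peptide, window))
        best = max(best, matches)
    return best

def empirical_p_value(selected, library, protein, n_draws=1000, seed=0):
    """Estimate how often random library draws score at least as well
    as the selected peptides (summed score over the set)."""
    rng = random.Random(seed)
    observed = sum(toy_score(p, protein) for p in selected)
    at_least_as_good = 0
    for _ in range(n_draws):
        # Draw a random peptide set of the same size from the library.
        draw = rng.sample(library, len(selected))
        if sum(toy_score(p, protein) for p in draw) >= observed:
            at_least_as_good += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_good + 1) / (n_draws + 1)
```

    Sampling from the library itself, rather than from uniform random sequences, means the null distribution reflects the library's own amino acid composition.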

  17. ST-intuitionistic fuzzy metric space with properties

    NASA Astrophysics Data System (ADS)

    Arora, Sahil; Kumar, Tanuj

    2017-07-01

    In this paper, we define ST-intuitionistic fuzzy metric space and study the notions of convergence and completeness of Cauchy sequences. Further, we prove some properties of ST-intuitionistic fuzzy metric space. Finally, we introduce the concept of symmetric ST-intuitionistic fuzzy metric space.

  18. ASSESSMENT OF CLONE IDENTITY AND SEQUENCE FIDELITY FOR 1189 IMAGE CDNA CLONES. (R827402)

    EPA Science Inventory

    The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...

  19. Updates on HRF Payloads Operations in Columbus ATCS

    NASA Technical Reports Server (NTRS)

    DePalo, Savino; Wright, Bruce D.; La,e Robert E.; Challis, Simon; Davenport, Robert; Pietrafesa, Donata

    2011-01-01

    The NASA-developed Human Research Facility 1 (HRF1) and Human Research Facility 2 (HRF2) experiment racks have been operating in the European Space Agency (ESA) Columbus module of the International Space Station (ISS) since Summer 2008. The two racks are of the same design. Since the start of operations, unexpected pressure spikes were observed in the Columbus module's thermal-hydraulic system during the racks' activation sequence. The root cause of these spikes was identified in the activation command sequence in the Rack Interface Controller (RIC), which controls the flow of thermal-hydraulic system fluid through the rack. A new Common RIC Software (CRS) release fixed the bug and was uploaded on both racks in late 2009. This paper gives a short introduction to the topic, describes the Columbus module countermeasures to mitigate the spikes, describes the ground validation test of the new software, and describes the flight checks performed before and after the final upload. Finally, the new on-orbit test designed to further simplify the racks' hydraulic management is presented.

  20. Dawn Auroral Breakup at Saturn Initiated by Auroral Arcs: UVIS/Cassini Beginning of Grand Finale Phase

    NASA Astrophysics Data System (ADS)

    Radioti, A.; Grodent, D.; Yao, Z. H.; Gérard, J.-C.; Badman, S. V.; Pryor, W.; Bonfond, B.

    2017-12-01

    We present Cassini auroral observations obtained on 11 November 2016 with the Ultraviolet Imaging Spectrograph at the beginning of the F-ring orbits and the Grand Finale phase of the mission. The spacecraft made a close approach to Saturn's southern pole and offered a remarkable view of the dayside and nightside aurora. With this sequence we identify, for the first time, the presence of dusk/midnight arcs, which are azimuthally spread from high to low latitudes, suggesting that their source region extends from the outer to middle/inner magnetosphere. The observed arcs could be auroral manifestations of plasma flows propagating toward the planet from the magnetotail, similar to terrestrial "auroral streamers." During the sequence the dawn auroral region brightens and expands poleward. We suggest that the dawn auroral breakup results from a combination of plasma instability and global-scale magnetic field reconfiguration, which is initiated by plasma flows propagating toward the planet. Alternatively, the dawn auroral enhancement could be triggered by tail magnetic reconnection.

  1. Identification of usual interstitial pneumonia pattern using RNA-Seq and machine learning: challenges and solutions.

    PubMed

    Choi, Yoonha; Liu, Tiffany Ting; Pankratz, Daniel G; Colby, Thomas V; Barth, Neil M; Lynch, David A; Walsh, P Sean; Raghu, Ganesh; Kennedy, Giulia C; Huang, Jing

    2018-05-09

    We developed a classifier using RNA sequencing data that identifies the usual interstitial pneumonia (UIP) pattern for the diagnosis of idiopathic pulmonary fibrosis. We addressed significant challenges, including limited sample size, biological and technical sample heterogeneity, and reagent and assay batch effects. We identified inter- and intra-patient heterogeneity, particularly within the non-UIP group. The models classified UIP on transbronchial biopsy samples with a receiver-operating characteristic area under the curve of ~ 0.9 in cross-validation. Using in silico mixed samples in training, we prospectively defined a decision boundary to optimize specificity at ≥85%. The penalized logistic regression model showed greater reproducibility across technical replicates and was chosen as the final model. The final model showed sensitivity of 70% and specificity of 88% in the test set. We demonstrated that the suggested methodologies appropriately addressed challenges of the sample size, disease heterogeneity and technical batch effects and developed a highly accurate and robust classifier leveraging RNA sequencing for the classification of UIP.

  2. Interpreting Microbial Biosynthesis in the Genomic Age: Biological and Practical Considerations

    PubMed Central

    Miller, Ian J.; Chevrette, Marc G.; Kwan, Jason C.

    2017-01-01

    Genome mining has become an increasingly powerful, scalable, and economically accessible tool for the study of natural product biosynthesis and drug discovery. However, there remain important biological and practical problems that can complicate or obscure biosynthetic analysis in genomic and metagenomic sequencing projects. Here, we focus on limitations of available technology as well as computational and experimental strategies to overcome them. We review the unique challenges and approaches in the study of symbiotic and uncultured systems, as well as those associated with biosynthetic gene cluster (BGC) assembly and product prediction. Finally, to explore sequencing parameters that affect the recovery and contiguity of large and repetitive BGCs assembled de novo, we simulate Illumina and PacBio sequencing of the Salinispora tropica genome focusing on assembly of the salinilactam (slm) BGC. PMID:28587290

  3. Anatomy and evolution of database search engines-a central component of mass spectrometry based proteomic workflows.

    PubMed

    Verheggen, Kenneth; Raeder, Helge; Berven, Frode S; Martens, Lennart; Barsnes, Harald; Vaudel, Marc

    2017-09-13

    Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines. © 2017 Wiley Periodicals, Inc.

  4. Whole-Exome Sequencing and Whole-Genome Sequencing in Critically Ill Neonates Suspected to Have Single-Gene Disorders

    PubMed Central

    Smith, Laurie D.; Willig, Laurel K.; Kingsmore, Stephen F.

    2016-01-01

    As the ability to identify the contribution of genetic background to human disease continues to advance, there is no discipline of medicine in which this may have a larger impact than in the care of the ill neonate. Newborns with congenital malformations, syndromic conditions, and inherited disorders often undergo an extensive, expensive, and long diagnostic process, often without a final diagnosis resulting in significant health care, societal, and personal costs. Although ethical concerns have been raised about the use of whole-genome sequencing in medical practice, its role in the diagnosis of rare disorders in ill neonates in tertiary care neonatal intensive care units has the potential to augment or modify the care of this vulnerable population of patients. PMID:26684335

  5. Extreme Precision Antenna Reflector Study Results

    NASA Technical Reports Server (NTRS)

    Sharp, G. R.; Gilger, L. D.; Ard, K. E.

    1985-01-01

    Thermal and mechanical distortion degrade the RF performance of antennas. The complexity of future communications antennas requires accurate, dimensionally stable antenna reflectors and structures built from materials other than those currently used. The advantages and disadvantages of using carbon fibers in an epoxy matrix are reviewed, as well as current reflector fabrication technology and adjustment. The manufacturing sequence and coefficient of thermal expansion of carbon fiber/borosilicate glass composites are described. The construction of a parabolic reflector from this material and the assembling of both reflector and antenna are described. A 3-m-aperture-diameter carbon/glass reflector that can be used as a subassembly for large reflectors is depicted. The deployment sequence for a 10.5-m-aperture-diameter antenna, final reflector adjustment, and the deployment sequence for large reflectors are also illustrated.

  6. Anticipation measures of sequence learning: manual versus oculomotor versions of the serial reaction time task.

    PubMed

    Vakil, Eli; Bloch, Ayala; Cohen, Haggar

    2017-03-01

    The serial reaction time (SRT) task has generated a very large amount of research. Nevertheless the debate continues as to the exact cognitive processes underlying implicit sequence learning. Thus, the first goal of this study is to elucidate the underlying cognitive processes enabling sequence acquisition. We therefore compared reaction time (RT) in sequence learning in a standard manual activated (MA) to that in an ocular activated (OA) version of the task, within a single experimental setting. The second goal is to use eye movement measures to compare anticipation, as an additional indication of sequence learning, between the two versions of the SRT. Performance of the group given the MA version of the task (n = 29) was compared with that of the group given the OA version (n = 30). The results showed that although overall, RT was faster for the OA group, the rate of sequence learning was similar to that of the MA group performing the standard version of the SRT. Because the stimulus-response association is automatic and exists prior to training in the OA task, the decreased reaction time in this version of the task reflects a purer measure of the sequence learning that occurs in the SRT task. The results of this study show that eye tracking anticipation can be measured directly and can serve as a direct measure of sequence learning. Finally, using the OA version of the SRT to study sequence learning presents a significant methodological contribution by making sequence learning studies possible among populations that struggle to perform manual responses.

  7. Statistical Features of the 2010 Beni-Ilmane, Algeria, Aftershock Sequence

    NASA Astrophysics Data System (ADS)

    Hamdache, M.; Peláez, J. A.; Gospodinov, D.; Henares, J.

    2018-03-01

    The aftershock sequence of the 2010 Beni-Ilmane (M_W 5.5) earthquake is studied in depth to analyze the spatial and temporal variability of seismicity parameters of the relationships modeling the sequence. The b value of the frequency-magnitude distribution is examined rigorously. A threshold magnitude of completeness equal to 2.1, using the maximum curvature procedure or the changing point algorithm, and a b value equal to 0.96 ± 0.03 have been obtained for the entire sequence. Two clusters have been identified and characterized by their faulting type, exhibiting b values equal to 0.99 ± 0.05 and 1.04 ± 0.05. Additionally, the temporal decay of the aftershock sequence was examined using a stochastic point process. The analysis was done through the restricted epidemic-type aftershock sequence (RETAS) stochastic model, which allows the possibility to recognize the prevailing clustering pattern of the relaxation process in the examined area. The analysis selected the epidemic-type aftershock sequence (ETAS) model to offer the most appropriate description of the temporal distribution, which presumes that all events in the sequence can cause secondary aftershocks. Finally, the fractal dimensions are estimated using the integral correlation. The obtained D_2 values are 2.15 ± 0.01, 2.23 ± 0.01 and 2.17 ± 0.02 for the entire sequence, and for the first and second cluster, respectively. An analysis of the temporal evolution of the fractal dimensions D_{-2}, D_0, D_2 and the spectral slope has been also performed to derive and characterize the different clusters included in the sequence.
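
    The abstract does not say which estimator produced the b values. As a hedged illustration, the sketch below uses the widely used Aki-Utsu maximum-likelihood estimator for binned magnitudes, with Aki's first-order standard error b/sqrt(N). The function name and the 0.1 magnitude bin width are assumptions, not details taken from the study.

```python
import math

def b_value(magnitudes, mc, bin_width=0.1):
    """Aki-Utsu maximum-likelihood b value for events at or above the
    magnitude of completeness mc (magnitudes assumed binned at bin_width)."""
    # Keep only the complete part of the catalog.
    complete = [m for m in magnitudes if m >= mc - 1e-9]
    if len(complete) < 2:
        raise ValueError("too few events above the completeness magnitude")
    mean_mag = sum(complete) / len(complete)
    # Utsu's correction shifts mc by half a bin to account for binning.
    b = math.log10(math.e) / (mean_mag - (mc - bin_width / 2.0))
    # Aki's first-order standard error.
    return b, b / math.sqrt(len(complete))
```

    Called with a catalog and mc = 2.1, the function returns the b value and its approximate uncertainty; events below mc are ignored.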

  8. VaDiR: an integrated approach to Variant Detection in RNA.

    PubMed

    Neums, Lisa; Suenaga, Seiji; Beyerlein, Peter; Anders, Sara; Koestler, Devin; Mariani, Andrea; Chien, Jeremy

    2018-02-01

    Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered. But the cost associated with it is currently limiting its general use in research. Whole-exome sequencing is used to characterize sequence variations in coding regions, but the cost associated with capture reagents and biases in capture rate limit its full use in research. Additional limitations include uncertainty in assigning the functional significance of the mutations when these mutations are observed in the non-coding region or in genes that are not expressed in cancer tissue. We investigated the feasibility of uncovering mutations from expressed genes using RNA sequencing datasets with a method called Variant Detection in RNA (VaDiR) that integrates 3 variant callers, namely: SNPiR, RVBoost, and MuTect2. The combination of all 3 methods, which we called Tier 1 variants, produced the highest precision with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that the integration of Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels. Our method, VaDiR, provides a possibility of uncovering mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets.
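
    The tiering described above amounts to set operations on the three callers' outputs. The sketch below is illustrative only and is not VaDiR's code: the function name and the (chromosome, position, ref, alt) tuple representation are hypothetical, and the higher-recall set assumes one reading of the abstract, namely Tier 1 plus the calls on which MuTect2 and SNPiR agree.

```python
def combine_calls(snpir, rvboost, mutect2):
    """Combine three sets of variant calls, each variant given as a
    (chromosome, position, ref, alt) tuple."""
    # Tier 1: variants reported by all three callers.
    tier1 = snpir & rvboost & mutect2
    # Assumed reading of the higher-recall set: Tier 1 plus variants
    # on which MuTect2 and SNPiR agree, whatever RVBoost reports.
    higher_recall = tier1 | (snpir & mutect2)
    return {"tier1": tier1, "higher_recall": higher_recall}
```

    Under this reading Tier 1 is always a subset of the higher-recall set, which matches the precision/recall trade-off the abstract reports.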

  9. Polar Lights at Saturn Bid Cassini Farewell

    NASA Image and Video Library

    2017-10-16

    On Sept. 14, 2017, one day before making its final plunge into Saturn's atmosphere, NASA's Cassini spacecraft used its Ultraviolet Imaging Spectrograph, or UVIS, instrument to capture this final view of ultraviolet auroral emissions in the planet's north polar region. The view is centered on the north pole of Saturn, with lines of latitude visible for 80, 70 and 60 degrees. Lines of longitude are spaced 40 degrees apart. The planet's day side is at bottom, while the night side is at top. A sequence of images from this observation has also been assembled into a movie sequence. The last image in the movie was taken about an hour before the still image, which was the actual final UVIS auroral image. Auroral emissions are generated by charged particles traveling along the invisible lines of Saturn's magnetic field. These particles precipitate into the atmosphere, releasing light when they strike gas molecules there. Several individual auroral structures are visible here, despite that this UVIS view was acquired at a fairly large distance from the planet (about 424,000 miles or 683,000 kilometers). Each of these features is connected to a particular phenomenon in Saturn's magnetosphere. For instance, it is possible to identify auroral signatures here that are related to the injection of hot plasma from the dayside magnetosphere, as well as auroral features associated with a change in the magnetic field's shape on the magnetosphere's night side. Several possible scenarios have been postulated over the years to explain Saturn's changing auroral emissions, but researchers are still far from a complete understanding of this complicated puzzle. Researchers will continue to analyze the hundreds of image sequences UVIS obtained of Saturn's auroras during Cassini's 13-year mission, with many new discoveries likely to be made. 
    This image and movie sequence were produced by the Laboratory for Planetary and Atmospheric Physics (LPAP) of the STAR Institute of the University of Liege in Belgium, in collaboration with the UVIS Team. The animation is available at https://photojournal.jpl.nasa.gov/catalog/PIA21899

  10. Robust Vocabulary Instruction in a Readers' Workshop

    ERIC Educational Resources Information Center

    Feezell, Greg

    2012-01-01

    This article presents strategies for integrating explicit vocabulary instruction within a reading workshop. The author begins by describing a process for involving students in word selection. The author then provides a weeklong instructional sequence using student-selected words. Finally, the author briefly examines the role of vocabulary…

  11. Structure-function analysis of diacylglycerol acyltransferase sequences for metabolic engineering and drug discovery

    USDA-ARS's Scientific Manuscript database

    Diacylglycerol acyltransferase families (DGATs) catalyze the final and rate-limiting step of triacylglycerol (TAG) biosynthesis in eukaryotic organisms. DGAT knockout mice are resistant to diet-induced obesity and lack milk secretion. Over-expression of DGATs increases TAG in plants. Therefore, unde...

  12. THE PHYLOGENETIC RELATIONSHIPS OF WHALE-FALL VESICOMYID CLAMS BASED ON MITOCHONDRIAL COI DNA SEQUENCES. (U915626)

    EPA Science Inventory

    The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...

  13. 76 FR 39231 - Federal Acquisition Regulation; Federal Acquisition Circular 2005-53; Introduction

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-07-05

    ... National Aeronautics and Space Administration 48 CFR Chapter 1 Federal Acquisition Regulation; Final Rules... ADMINISTRATION 48 CFR Chapter 1 [Docket FAR 2011-0076, Sequence 5] Federal Acquisition Regulation; Federal Acquisition Circular 2005-53; Introduction AGENCIES: Department of Defense (DoD), General Services...

  14. Apical Coarticulation at Juncture Boundaries

    ERIC Educational Resources Information Center

    Lewis, J.; And Others

    1975-01-01

    Sentences were read by six informants to determine the presence or absence of /n/ in /nth/ sequences. The sentences contained seven different levels of juncture with /nth/ occurring in word final position, intervocalically, and across word boundaries, among other places. Dental coarticulation was not hindered by most junctures. (SC)

  15. IDENTICAL RIBOSOMAL DNA SEQUENCE DATA FROM PFIESTERIA PISCICIDA (DINOPHYCEAE) ISOLATES WITH DIFFERENT TOXICITY PHENOTYPES. (R827084)

    EPA Science Inventory

    The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...

  16. 76 FR 63844 - Federal Travel Regulation (FTR); Lodging Reimbursement

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-10-14

    ...; Docket Number 2011-0024, Sequence 1] RIN 3090-AJ22 Federal Travel Regulation (FTR); Lodging Reimbursement... (GSA) is amending the Federal Travel Regulation (FTR) regarding reimbursement of lodging per diem expenses while on temporary duty travel (TDY). This final rule specifically states GSA's policy in regards...

  17. Spatial methods for deriving crop rotation history

    USDA-ARS's Scientific Manuscript database

    Converting multi-year remote sensing classification data into crop rotations is beneficial by defining length of crop rotation cycles and the specific sequences of intervening crops grown between the final year of a grass seed stand and establishment of a new perennial ryegrass seed crop. Markov mod...

  18. Evolution of proteins.

    NASA Technical Reports Server (NTRS)

    Dayhoff, M. O.

    1971-01-01

    The amino acid sequences of proteins from living organisms are dealt with. The structure of proteins is first discussed; the variation in this structure from one biological group to another is illustrated by the first halves of the sequences of cytochrome c, and a phylogenetic tree is derived from the cytochrome c data. The relative geological times associated with the events of this tree are discussed. Errors which occur in the duplication of cells during the evolutionary process are examined. Particular attention is given to evolution of mutant proteins, globins, ferredoxin, and transfer ribonucleic acids (tRNA's). Finally, a general outline of biological evolution is presented.

  19. Chromatin Immunoprecipitation Sequencing (ChIP-Seq) for Transcription Factors and Chromatin Factors in Arabidopsis thaliana Roots: From Material Collection to Data Analysis.

    PubMed

    Cortijo, Sandra; Charoensawan, Varodom; Roudier, François; Wigge, Philip A

    2018-01-01

    Chromatin immunoprecipitation combined with next-generation sequencing (ChIP-seq) is a powerful technique to investigate in vivo transcription factor (TF) binding to DNA, as well as chromatin marks. Here we provide a detailed protocol for all the key steps to perform ChIP-seq in Arabidopsis thaliana roots, also working on other A. thaliana tissues and in most non-ligneous plants. We detail all steps from material collection, fixation, chromatin preparation, immunoprecipitation, library preparation, and finally computational analysis based on a combination of publicly available tools.

  20. Automated selection of synthetic biology parts for genetic regulatory networks.

    PubMed

    Yaman, Fusun; Bhatia, Swapnil; Adler, Aaron; Densmore, Douglas; Beal, Jacob

    2012-08-17

    Raising the level of abstraction for synthetic biology design requires solving several challenging problems, including mapping abstract designs to DNA sequences. In this paper we present the first formalism and algorithms to address this problem. The key steps of this transformation are feature matching, signal matching, and part matching. Feature matching ensures that the mapping satisfies the regulatory relationships in the abstract design. Signal matching ensures that the expression levels of functional units are compatible. Finally, part matching finds a DNA part sequence that can implement the design. Our software tool MatchMaker implements these three steps.

  1. Mapping and sequencing the human genome: Science, ethics, and public policy. Final report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McInerney, J.D.

    1993-03-31

    Development of Mapping and Sequencing the Human Genome: Science, Ethics, and Public Policy followed the standard process of curriculum development at the Biological Sciences Curriculum Study (BSCS); that process is described. The production of this module was a collaborative effort between BSCS and the American Medical Association (AMA). Appendix A contains a copy of the module. Copies of reports sent to the Department of Energy (DOE) during the development process are contained in Appendix B; all reports should be on file at DOE. Appendix B also contains copies of status reports submitted to the BSCS Board of Directors.

  2. A survey and evaluations of histogram-based statistics in alignment-free sequence comparison.

    PubMed

    Luczak, Brian B; James, Benjamin T; Girgis, Hani Z

    2017-12-06

    Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover's distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover's distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. The source code of the benchmarking tool is available as Supplementary Materials. © The Author 2017. 
Published by Oxford University Press.
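
    The k-mer histogram comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' benchmarking tool: the function names are hypothetical, only two simple single statistics are shown, and `paired_statistic` is an invented example of a multiplicative combination involving length difference, not one of the 33 statistics evaluated in the paper.

```python
from collections import Counter
import math


def kmer_histogram(seq, k):
    """Count every overlapping k-mer (word of length k) in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def manhattan(h1, h2):
    """Sum of absolute count differences; 0 means identical histograms."""
    return sum(abs(h1[w] - h2[w]) for w in set(h1) | set(h2))


def cosine_similarity(h1, h2):
    """Cosine of the angle between the two count vectors (1.0 = same direction)."""
    dot = sum(h1[w] * h2[w] for w in set(h1) & set(h2))
    norm1 = math.sqrt(sum(c * c for c in h1.values()))
    norm2 = math.sqrt(sum(c * c for c in h2.values()))
    return dot / (norm1 * norm2)


def paired_statistic(seq1, seq2, k):
    """Hypothetical multiplicative combination: Manhattan distance times
    (1 + length difference). Illustrative only, not taken from the paper."""
    h1, h2 = kmer_histogram(seq1, k), kmer_histogram(seq2, k)
    return manhattan(h1, h2) * (1 + abs(len(seq1) - len(seq2)))
```

    Both sequences must be at least k letters long, otherwise a histogram is empty and the cosine is undefined. Because each histogram is built in a single pass over its sequence, such statistics avoid the quadratic cost of alignment, which is the motivation the abstract gives for using them as a fast filter.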

  3. MolIDE: a homology modeling framework you can click with.

    PubMed

    Canutescu, Adrian A; Dunbrack, Roland L

    2005-06-15

    Molecular Integrated Development Environment (MolIDE) is an integrated application designed to provide homology modeling tools and protocols under a uniform, user-friendly graphical interface. Its main purpose is to combine the most frequent modeling steps in a semi-automatic, interactive way, guiding the user from the target protein sequence to the final three-dimensional protein structure. The typical basic homology modeling process is composed of building sequence profiles of the target sequence family, secondary structure prediction, sequence alignment with PDB structures, assisted alignment editing, side-chain prediction and loop building. All of these steps are available through a graphical user interface. MolIDE's user-friendly and streamlined interactive modeling protocol allows the user to focus on the important modeling questions, hiding from the user the raw data generation and conversion steps. MolIDE was designed from the ground up as an open-source, cross-platform, extensible framework. This allows developers to integrate additional third-party programs to MolIDE. http://dunbrack.fccc.edu/molide/molide.php rl_dunbrack@fccc.edu.

  4. Evidence for two attentional components in visual working memory.

    PubMed

    Allen, Richard J; Baddeley, Alan D; Hitch, Graham J

    2014-11-01

    How does executive attentional control contribute to memory for sequences of visual objects, and what does this reveal about storage and processing in working memory? Three experiments examined the impact of a concurrent executive load (backward counting) on memory for sequences of individually presented visual objects. Experiments 1 and 2 found disruptive concurrent load effects of equivalent magnitude on memory for shapes, colors, and colored shape conjunctions (as measured by single-probe recognition). These effects were present only for Items 1 and 2 in a 3-item sequence; the final item was always impervious to this disruption. This pattern of findings was precisely replicated in Experiment 3 when using a cued verbal recall measure of shape-color binding, with error analysis providing additional insights concerning attention-related loss of early-sequence items. These findings indicate an important role for executive processes in maintaining representations of earlier encountered stimuli in an active form alongside privileged storage of the most recent stimulus. PsycINFO Database Record (c) 2014 APA, all rights reserved.

  5. Mission strategy for cometary exploration in the 1980's

    NASA Technical Reports Server (NTRS)

    Farquhar, R. W.

    1974-01-01

    A sequence of ballistic intercept missions to comets is proposed. The mission set is composed of a well-known group of periodic comets whose physical properties are dissimilar. In addition to full descriptions of the nominal mission profiles, earth-based sighting conditions and estimates of cometary ephemeris errors are presented for each target comet. The first mission of the sequence is a slow flyby (approximately 8 km/sec) of Encke's comet near its perihelion in 1980. Because of a near resonance in the orbital periods of Encke and the spacecraft, it is possible to retarget the spacecraft for a second Encke encounter in 1984. The second mission of the sequence also consists of two cometary encounters but in this case different comets are involved; Giacobini-Zinner in 1985 and Borrelly in 1987. The final mission of the sequence calls for a simultaneous launch of two spacecraft towards Halley's comet in 1985. One spacecraft is targeted for a pre-perihelion intercept at a heliocentric distance of 1.37 AU.

  6. Investigation of the design of a metal-lined fully wrapped composite vessel under high internal pressure

    NASA Astrophysics Data System (ADS)

    Kalaycıoğlu, Barış; Husnu Dirikolu, M.

    2010-09-01

    In this study, a Type III composite pressure vessel (ISO 11439:2000) loaded with high internal pressure is investigated in terms of the effect of the orientation of the element coordinate system while simulating the continuous variation of the fibre angle, the effect of symmetric and non-symmetric composite wall stacking sequences, and lastly, a stacking sequence evaluation for reducing the cylindrical section-end cap transition region stress concentration. The research was performed using an Ansys® model with 2.9 l volume, 6061 T6 aluminium liner/Kevlar® 49-Epoxy vessel material, and a service internal pressure loading of 22 MPa. The results show that symmetric stacking sequences give higher burst pressures by up to 15%. Stacking sequence evaluations provided a further 7% pressure-carrying capacity as well as reduced stress concentration in the transition region. Finally, the Type III vessel under consideration provides a 45% lighter construction as compared with an all metal (Type I) vessel.

  7. Threading DNA through nanopores for biosensing applications

    NASA Astrophysics Data System (ADS)

    Fyta, Maria

    2015-07-01

    This review outlines the recent achievements in the field of nanopore research. Nanopores are typically used in single-molecule experiments and are believed to have a high potential to realize an ultra-fast and very cheap genome sequencer. Here, the various types of nanopore materials, ranging from biological to 2D nanopores, are discussed together with their advantages and disadvantages. These nanopores can utilize different protocols to read out the DNA nucleobases. Although the first nanopore devices have reached the market, many still have issues which do not allow a full realization of a nanopore sequencer able to sequence the human genome in about a day. Ways to control the DNA, its dynamics and speed as the biomolecule translocates the nanopore in order to increase the signal-to-noise ratio in the reading-out process are examined in this review. Finally, the advantages, as well as the drawbacks in distinguishing the DNA nucleotides, i.e., the genetic information, are presented in view of their importance in the field of nanopore sequencing.

  8. Buckling Test Results from the 8-Foot-Diameter Orthogrid-Stiffened Cylinder Test Article TA01. [Test Dates: 19-21 November 2008]

    NASA Technical Reports Server (NTRS)

    Hilburger, Mark W.; Waters, W. Allen, Jr.; Haynie, Waddy T.

    2015-01-01

    Results from the testing of cylinder test article SBKF-P2-CYLTA01 (referred to herein as TA01) are presented. The testing was conducted at the Marshall Space Flight Center (MSFC), November 19-21, 2008, in support of the Shell Buckling Knockdown Factor (SBKF) Project. The test was used to verify the performance of a newly constructed buckling test facility at MSFC and to verify the test article design and analysis approach used by the SBKF project researchers. TA01 is an 8-foot-diameter (96-inches), 78.0-inch long, aluminum-lithium (Al-Li), orthogrid-stiffened cylindrical shell similar to those used in current state-of-the-art launch vehicle structures and was designed to exhibit global buckling when subjected to compression loads. Five different load sequences were applied to TA01 during testing and included four sub-critical load sequences, i.e., loading conditions that did not cause buckling or material failure, and one final load sequence to buckling and collapse. The sub-critical load sequences consisted of either uniform axial compression loading or combined axial compression and bending and the final load sequence subjected TA01 to uniform axial compression. Traditional displacement transducers and strain gages were used to monitor the test article response at nearly 300 locations and an advanced digital image correlation system was used to obtain low-speed and high-speed full-field displacement measurements of the outer surface of the test article. Overall, the test facility and test article performed as designed. In particular, the test facility successfully applied all desired load combinations to the test article and was able to test safely into the postbuckling range of loading, and the test article failed by global buckling. In addition, the test results correlated well with initial pretest predictions.

  9. Characterization of on-target generated tryptic peptides from Giberella zeae conidia spore proteins by means of matrix-assisted laser desorption/ionization mass spectrometry.

    PubMed

    Dong, Hongjuan; Marchetti-Deschmann, Martina; Allmaier, Günter

    2014-01-01

    Traditionally characterization of microbial proteins is performed by a complex sequence of steps with the final step to be either Edman sequencing or mass spectrometry, which generally takes several weeks or months to be complete. In this work, we proposed a strategy for the characterization of tryptic peptides derived from Giberella zeae (anamorph: Fusarium graminearum) proteins in parallel to intact cell mass spectrometry (ICMS) in which no complicated and time-consuming steps were needed. Experimentally, after a simple washing treatment of the spores, the aliquots of the intact G. zeae macro conidia spores solution, were deposited two times onto one MALDI (matrix-assisted laser desorption ionization) mass spectrometry (MS) target (two spots). One spot was used for ICMS and the second spot was subject to a brief on-target digestion with bead-immobilized or non-immobilized trypsin. Subsequently, one spot was analyzed immediately by MALDI MS in the linear mode (ICMS) whereas the second spot containing the digested material was investigated by MALDI MS in the reflectron mode ("peptide mass fingerprint") followed by protonated peptide selection for MS/MS (post source decay (PSD) fragment ion) analysis. Based on the formed fragment ions of selected tryptic peptides a complete or partial amino acid sequence was generated by manual de novo sequencing. These sequence data were used for homology search for protein identification. Finally four different peptides of varying abundances have been identified successfully allowing the verification that our desorbed/ionized surface compounds were indeed derived from proteins. The presence of three different proteins could be found unambiguously. Interestingly, one of these proteins is belonging to the ribosomal superfamily which indicates that not only surface-associated proteins were digested. 
This strategy minimized the amount of time and labor required for obtaining deeper information on spore preparations within the nowadays widely used ICMS approach. Copyright © 2013 Elsevier Ltd. All rights reserved.

  10. ChAy/Bx, a novel chimeric high-molecular-weight glutenin subunit gene apparently created by homoeologous recombination in Triticum turgidum ssp. dicoccoides.

    PubMed

    Guo, Xiao-Hui; Bi, Zhe-Guang; Wu, Bi-Hua; Wang, Zhen-Zhen; Hu, Ji-Liang; Zheng, You-Liang; Liu, Deng-Cai

    2013-12-01

    High-molecular-weight glutenin subunits (HMW-GSs) are of considerable interest, because they play a crucial role in determining dough viscoelastic properties and end-use quality of wheat flour. In this paper, ChAy/Bx, a novel chimeric HMW-GS gene from Triticum turgidum ssp. dicoccoides (AABB, 2n=4x=28) accession D129, was isolated and characterized. Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) analysis revealed that the electrophoretic mobility of the glutenin subunit encoded by ChAy/Bx was slightly faster than that of 1Dy12. The complete ORF of ChAy/Bx contained 1,671 bp encoding a deduced polypeptide of 555 amino acid residues (or 534 amino acid residues for the mature protein), making it the smallest HMW-GS gene known from Triticum species. Sequence analysis showed that ChAy/Bx was neither a conventional x-type nor a conventional y-type subunit gene, but a novel chimeric gene. Its first 1305 nt sequence was highly homologous with the corresponding sequence of 1Ay type genes, while its final 366 nt sequence was highly homologous with the corresponding sequence of 1Bx type genes. The mature ChAy/Bx protein consisted of the N-terminus of 1Ay type subunit (the first 414 amino acid residues) and the C-terminus of 1Bx type subunit (the final 120 amino acid residues). Secondary structure prediction showed that ChAy/Bx contained some domains of 1Ay subunit and some domains of 1Bx subunit. The special structure of this HMW glutenin chimera ChAy/Bx subunit might have unique effects on the end-use quality of wheat flour. Here we propose that homoeologous recombination might be a novel pathway for allelic variation or molecular evolution of HMW-GSs. © 2013.

  11. Optimized mtDNA Control Region Primer Extension Capture Analysis for Forensically Relevant Samples and Highly Compromised mtDNA of Different Age and Origin

    PubMed Central

    Eduardoff, Mayra; Xavier, Catarina; Strobl, Christina; Casas-Vargas, Andrea; Parson, Walther

    2017-01-01

    The analysis of mitochondrial DNA (mtDNA) has proven useful in forensic genetics and ancient DNA (aDNA) studies, where specimens are often highly compromised and DNA quality and quantity are low. In forensic genetics, the mtDNA control region (CR) is commonly sequenced using established Sanger-type Sequencing (STS) protocols involving fragment sizes down to approximately 150 base pairs (bp). Recent developments include Massively Parallel Sequencing (MPS) of (multiplex) PCR-generated libraries using the same amplicon sizes. Molecular genetic studies on archaeological remains that harbor more degraded aDNA have pioneered alternative approaches to target mtDNA, such as capture hybridization and primer extension capture (PEC) methods followed by MPS. These assays target smaller mtDNA fragment sizes (down to 50 bp or less), and have proven to be substantially more successful in obtaining useful mtDNA sequences from these samples compared to electrophoretic methods. Here, we present the modification and optimization of a PEC method, earlier developed for sequencing the Neanderthal mitochondrial genome, with forensic applications in mind. Our approach was designed for a more sensitive enrichment of the mtDNA CR in a single tube assay and short laboratory turnaround times, thus complying with forensic practices. We characterized the method using sheared, high quantity mtDNA (six samples), and tested challenging forensic samples (n = 2) as well as compromised solid tissue samples (n = 15) up to 8 kyrs of age. The PEC MPS method produced reliable and plausible mtDNA haplotypes that were useful in the forensic context. It yielded plausible data in samples that did not provide results with STS and other MPS techniques. We addressed the issue of contamination by including four generations of negative controls, and discuss the results in the forensic context. 
We finally offer perspectives for future research to enable the validation and accreditation of the PEC MPS method for final implementation in forensic genetic laboratories. PMID:28934125

  12. Initial Characterization of the Pf-Int Recombinase from the Malaria Parasite Plasmodium falciparum

    PubMed Central

    Ghorbal, Mehdi; Scheidig-Benatar, Christine; Bouizem, Salma; Thomas, Christophe; Paisley, Genevieve; Faltermeier, Claire; Liu, Melanie; Scherf, Artur; Lopez-Rubio, Jose-Juan; Gopaul, Deshmukh N.

    2012-01-01

    Background Genetic variation is an essential means of evolution and adaptation in many organisms in response to environmental change. Certain DNA alterations can be carried out by site-specific recombinases (SSRs) that fall into two families: the serine and the tyrosine recombinases. SSRs are seldom found in eukaryotes. A gene homologous to a tyrosine site-specific recombinase has been identified in the genome of Plasmodium falciparum. The sequence is highly conserved among five other members of Plasmodia. Methodology/Principal Findings The predicted open reading frame encodes for a ∼57 kDa protein containing a C-terminal domain including the putative tyrosine recombinase conserved active site residues R-H-R-(H/W)-Y. The N-terminus has the typical alpha-helical bundle and potentially a mixed alpha-beta domain resembling that of λ-Int. Pf-Int mRNA is expressed differentially during the P. falciparum erythrocytic life stages, peaking in the schizont stage. Recombinant Pf-Int and affinity chromatography of DNA from genomic or synthetic origin were used to identify potential DNA targets after sequencing or micro-array hybridization. Interestingly, the sequences captured also included highly variable subtelomeric genes such as var, rif, and stevor sequences. Electrophoretic mobility shift assays with DNA were carried out to verify Pf-Int/DNA binding. Finally, Pf-Int knock-out parasites were created in order to investigate the biological role of Pf-Int. Conclusions/Significance Our data identify for the first time a malaria parasite gene with structural and functional features of recombinases. Pf-Int may bind to and alter DNA, either in a sequence specific or in a non-specific fashion, and may contribute to programmed or random DNA rearrangements. Pf-Int is the first molecular player identified with a potential role in genome plasticity in this pathogen. 
Finally, Pf-Int knock-out parasite is viable showing no detectable impact on blood stage development, which is compatible with such function. PMID:23056326

  13. New Sequences with Low Correlation and Large Family Size

    NASA Astrophysics Data System (ADS)

    Zeng, Fanxin

    In direct-sequence code-division multiple-access (DS-CDMA) communication systems and direct-sequence ultra wideband (DS-UWB) radios, sequences with low correlation and large family size are important for reducing multiple access interference (MAI) and accepting more active users, respectively. In this paper, a new collection of families of sequences of length $p^n-1$, which includes three constructions, is proposed. The maximum number of cyclically distinct families without GMW sequences in each construction is $\phi(p^n-1)/n \cdot \phi(p^m-1)/m$, where $p$ is a prime number, $n$ is an even number, and $n=2m$, and these sequences can be binary or polyphase depending upon choice of the parameter $p$. In Construction I, there are $p^n$ distinct sequences within each family and the new sequences have at most $d+2$ nontrivial periodic correlation values $\{-p^m-1, -1, p^m-1, 2p^m-1, \ldots, dp^m-1\}$. In Construction II, the new sequences have large family size $p^{2n}$ and possibly take the nontrivial correlation values in $\{-p^m-1, -1, p^m-1, 2p^m-1, \ldots, (3d-4)p^m-1\}$. In Construction III, the new sequences possess the largest family size $p^{(d-1)n}$ and have at most $2d$ correlation levels $\{-p^m-1, -1, p^m-1, 2p^m-1, \ldots, (2d-2)p^m-1\}$. Three constructions are near-optimal with respect to the Welch bound because the values of their Welch-Ratios are moderate, $WR \approx d$, $WR \approx 3d-4$ and $WR \approx 2d-2$, respectively. Each family in Constructions I, II and III contains a GMW sequence. In addition, Helleseth sequences and Niho sequences are special cases in Constructions I and III, and their restriction conditions to the integers $m$ and $n$, $p^m \not\equiv 2 \pmod{3}$ and $n \equiv 0 \pmod{4}$, respectively, are removed in our sequences. Our sequences in Construction III include the sequences with Niho type decimation $3 \cdot 2^m-2$, too. Finally, some open questions are pointed out and an example that illustrates the performance of these sequences is given.
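
    The notions of periodic correlation and the Welch bound used in this abstract can be made concrete with a small Python sketch. It does not implement any of the three constructions: it only builds a length-7 binary m-sequence (the simplest binary case, $p=2$), evaluates its periodic correlation, and computes the standard Welch lower bound on the maximum correlation for a family of $M$ sequences of period $N$. All function names are hypothetical, and how the paper defines its Welch-Ratio is not stated in the abstract, so no ratio is computed here.

```python
import math


def lfsr_m_sequence():
    """One period (length 7) of the binary m-sequence s[i+3] = s[i+1] XOR s[i]."""
    s = [1, 1, 1]
    while len(s) < 7:
        s.append(s[-2] ^ s[-3])
    return s


def periodic_correlation(a, b, shift):
    """Periodic correlation of two binary sequences after mapping bit x to (-1)**x."""
    n = len(a)
    return sum((-1) ** (a[i] ^ b[(i + shift) % n]) for i in range(n))


def welch_bound(period, family_size):
    """Welch lower bound on the maximum nontrivial correlation magnitude
    for family_size sequences of the given period."""
    return period * math.sqrt((family_size - 1) / (period * family_size - 1))


seq = lfsr_m_sequence()
off_peak = [periodic_correlation(seq, seq, t) for t in range(1, len(seq))]
```

    For this m-sequence every off-peak autocorrelation value equals -1, the same value that appears in each correlation set listed in the abstract.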

  14. PET Imaging Stability Measurements During Simultaneous Pulsing of Aggressive MR Sequences on the SIGNA PET/MR System.

    PubMed

    Deller, Timothy W; Khalighi, Mohammad Mehdi; Jansen, Floris P; Glover, Gary H

    2018-01-01

    The recent introduction of simultaneous whole-body PET/MR scanners has enabled new research taking advantage of the complementary information obtainable with PET and MRI. One such application is kinetic modeling, which requires high levels of PET quantitative stability. To accomplish the required PET stability levels, the PET subsystem must be sufficiently isolated from the effects of MR activity. Performance measurements have previously been published, demonstrating sufficient PET stability in the presence of MR pulsing for typical clinical use; however, PET stability during radiofrequency (RF)-intensive and gradient-intensive sequences has not previously been evaluated for a clinical whole-body scanner. In this work, PET stability of the GE SIGNA PET/MR was examined during simultaneous scanning of aggressive MR pulse sequences. Methods: PET performance tests were acquired with MR idle and during simultaneous MR pulsing. Recent system improvements mitigating RF interference and gain variation were used. A fast recovery fast spin echo MR sequence was selected for high RF power, and an echo planar imaging sequence was selected for its high heat-inducing gradients. Measurements were performed to determine PET stability under varying MR conditions using the following metrics: sensitivity, scatter fraction, contrast recovery, uniformity, count rate performance, and image quantitation. A final PET quantitative stability assessment for simultaneous PET scanning during functional MRI studies was performed with a spiral in-and-out gradient echo sequence. Results: Quantitation stability of a 68 Ge flood phantom was demonstrated within 0.34%. Normalized sensitivity was stable during simultaneous scanning within 0.3%. Scatter fraction measured with a 68 Ge line source in the scatter phantom was stable within the range of 40.4%-40.6%. Contrast recovery and uniformity were comparable for PET images acquired simultaneously with multiple MR conditions. 
Peak noise equivalent count rate was 224 kcps at an effective activity concentration of 18.6 kBq/mL, and the count rate curves and scatter fraction curve were consistent for the alternating MR pulsing states. A final test demonstrated quantitative stability during a spiral functional MRI sequence. Conclusion: PET stability metrics demonstrated that PET quantitation was not affected during simultaneous aggressive MRI. This stability enables demanding applications such as kinetic modeling. © 2018 by the Society of Nuclear Medicine and Molecular Imaging.

  15. Presence of Stenotrophomonas maltophilia exhibiting high genetic similarity to clinical isolates in final effluents of pig farm wastewater treatment plants.

    PubMed

    Kim, Young-Ji; Park, Jin-Hyeong; Seo, Kun-Ho

    2018-03-01

    Although the prevalence of community-acquired Stenotrophomonas maltophilia infections is sharply increasing, the sources and likely transmission routes of this bacterium are poorly understood. We studied the significance of the presence of S. maltophilia in final effluents and receiving rivers of pig farm wastewater treatment plants (WWTPs). The loads and antibiotic resistance profiles of S. maltophilia in final effluents were assessed. Antibiotic resistance determinants and biofilm formation genes were detected by PCR, and genetic similarity to clinical isolates was investigated using multilocus sequence typing (MLST). S. maltophilia was recovered from final effluents at two of three farms and one corresponding receiving river. Tests of resistance to antibiotics recommended for S. maltophilia infection revealed that for each agent, at least one isolate was classified as resistant or intermediate, with the exception of minocycline. Furthermore, multidrug resistant S. maltophilia susceptible to antibiotics of only two categories was isolated and found to carry the sul2 gene, conferring trimethoprim/sulfamethoxazole resistance. All isolates carried spgM, encoding a major factor in biofilm formation. MLST revealed that isolates of the same sequence type (ST; ST189) were present in both effluent and receiving river samples, and phylogenetic analysis showed that all of the STs identified in this study clustered with clinical isolates. Moreover, one isolate (ST192) recovered in this investigation demonstrated 99.61% sequence identity with a clinical isolate (ST98) associated with a fatal infection in South Korea. Thus, the pathogenicity of the isolates reported here is likely similar to that of those from clinical environments, and WWTPs may play a role as a source of S. maltophilia from which this bacterium spreads to human communities. To the best of our knowledge, this represents the first report of S. maltophilia in pig farm WWTPs. 
Our results indicate that nationwide epidemiological investigations are needed to examine the possible link between WWTP-derived S. maltophilia and hospital- and community-acquired infections. Copyright © 2017 Elsevier GmbH. All rights reserved.

  16. A comprehensive aligned nifH gene database: a multipurpose tool for studies of nitrogen-fixing bacteria.

    PubMed

    Gaby, John Christian; Buckley, Daniel H

    2014-01-01

    We describe a nitrogenase gene sequence database that facilitates analysis of the evolution and ecology of nitrogen-fixing organisms. The database contains 32 954 aligned nitrogenase nifH sequences linked to phylogenetic trees and associated sequence metadata. The database includes 185 linked multigene entries including full-length nifH, nifD, nifK and 16S ribosomal RNA (rRNA) gene sequences. Evolutionary analyses enabled by the multigene entries support an ancient horizontal transfer of nitrogenase genes between Archaea and Bacteria and provide evidence that nifH has a different history of horizontal gene transfer from the nifDK enzyme core. Further analyses show that lineages in nitrogenase cluster I and cluster III have different rates of substitution within nifD, suggesting that nifD is under different selection pressure in these two lineages. Finally, we find that the genetic divergence of nifH and 16S rRNA genes does not correlate well at sequence dissimilarity values used commonly to define microbial species, as strains having <3% sequence dissimilarity in their 16S rRNA genes can have up to 23% dissimilarity in nifH. The nifH database has a number of uses including phylogenetic and evolutionary analyses, the design and assessment of primers/probes and the evaluation of nitrogenase sequence diversity. Database URL: http://www.css.cornell.edu/faculty/buckley/nifh.htm.
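
    The percent sequence dissimilarity quoted in this abstract can be sketched in a few lines of Python. This is an illustrative assumption about how such a value is computed from two rows of an alignment, not the database's own method: the function name is hypothetical, and columns where either sequence has a gap are simply skipped, which is only one of several conventions in use.

```python
def percent_dissimilarity(a, b, gap="-"):
    """Percentage of mismatching positions between two aligned sequences,
    ignoring columns in which either sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = mismatches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * mismatches / compared
```

    Applied to the same pair of strains, such a function could return under 3% for the aligned 16S rRNA genes while returning up to 23% for nifH, which is the decoupling the abstract reports.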

  17. Structural and sequence features of two residue turns in beta-hairpins.

    PubMed

    Madan, Bharat; Seo, Sung Yong; Lee, Sun-Gu

    2014-09-01

    Beta-turns in beta-hairpins have been implicated as important sites in protein folding. In particular, two residue β-turns, the most abundant connecting elements in beta-hairpins, have been a major target for engineering protein stability and folding. In this study, we attempted to investigate and update the structural and sequence properties of two residue turns in beta-hairpins with a large data set. For this, 3977 beta-turns were extracted from 2394 nonhomologous protein chains and analyzed. First, the distribution, dihedral angles and twists of two residue turn types were determined, and compared with previous data. The trend of turn type occurrence and most structural features of the turn types were similar to previous results, but for the first time Type II turns in beta-hairpins were identified. Second, sequence motifs for the turn types were devised based on amino acid positional potentials of two-residue turns, and their distributions were examined. From this study, we could identify code-like sequence motifs for the two residue beta-turn types. Finally, structural and sequence properties of beta-strands in the beta-hairpins were analyzed, which revealed that the beta-strands showed no specific sequence and structural patterns for turn types. The analytical results in this study are expected to be a reference in the engineering or design of beta-hairpin turn structures and sequences. © 2014 Wiley Periodicals, Inc.

  18. A bacterial genetic screen identifies functional coding sequences of the insect mariner transposable element Famar1 amplified from the genome of the earwig, Forficula auricularia.

    PubMed

    Barry, Elizabeth G; Witherspoon, David J; Lampe, David J

    2004-02-01

    Transposons of the mariner family are widespread in animal genomes and have apparently infected them by horizontal transfer. Most species carry only old defective copies of particular mariner transposons that have diverged greatly from their active horizontally transferred ancestor, while a few contain young, very similar, and active copies. We report here the use of a whole-genome screen in bacteria to isolate somewhat diverged Famar1 copies from the European earwig, Forficula auricularia, that encode functional transposases. Functional and nonfunctional coding sequences of Famar1 and nonfunctional copies of Ammar1 from the European honey bee, Apis mellifera, were sequenced to examine their molecular evolution. No selection for sequence conservation was detected in any clade of a tree derived from these sequences, not even on branches leading to functional copies. This agrees with the current model for mariner transposon evolution that expects neutral evolution within particular hosts, with selection for function occurring only upon horizontal transfer to a new host. Our results further suggest that mariners are not finely tuned genetic entities and that a greater amount of sequence diversification than had previously been appreciated can occur in functional copies in a single host lineage. Finally, this method of isolating active copies can be used to isolate other novel active transposons without resorting to reconstruction of ancestral sequences.

  19. A comprehensive aligned nifH gene database: a multipurpose tool for studies of nitrogen-fixing bacteria

    PubMed Central

    Gaby, John Christian; Buckley, Daniel H.

    2014-01-01

    We describe a nitrogenase gene sequence database that facilitates analysis of the evolution and ecology of nitrogen-fixing organisms. The database contains 32 954 aligned nitrogenase nifH sequences linked to phylogenetic trees and associated sequence metadata. The database includes 185 linked multigene entries including full-length nifH, nifD, nifK and 16S ribosomal RNA (rRNA) gene sequences. Evolutionary analyses enabled by the multigene entries support an ancient horizontal transfer of nitrogenase genes between Archaea and Bacteria and provide evidence that nifH has a different history of horizontal gene transfer from the nifDK enzyme core. Further analyses show that lineages in nitrogenase cluster I and cluster III have different rates of substitution within nifD, suggesting that nifD is under different selection pressure in these two lineages. Finally, we find that the genetic divergence of nifH and 16S rRNA genes does not correlate well at sequence dissimilarity values used commonly to define microbial species, as strains having <3% sequence dissimilarity in their 16S rRNA genes can have up to 23% dissimilarity in nifH. The nifH database has a number of uses including phylogenetic and evolutionary analyses, the design and assessment of primers/probes and the evaluation of nitrogenase sequence diversity. Database URL: http://www.css.cornell.edu/faculty/buckley/nifh.htm PMID:24501396

  20. Microbial bioinformatics 2020.

    PubMed

    Pallen, Mark J

    2016-09-01

    Microbial bioinformatics in 2020 will remain a vibrant, creative discipline, adding value to the ever-growing flood of new sequence data, while embracing novel technologies and fresh approaches. Databases and search strategies will struggle to cope and manual curation will not be sustainable during the scale-up to the million-microbial-genome era. Microbial taxonomy will have to adapt to a situation in which most microorganisms are discovered and characterised through the analysis of sequences. Genome sequencing will become a routine approach in clinical and research laboratories, with fresh demands for interpretable user-friendly outputs. The "internet of things" will penetrate healthcare systems, so that even a piece of hospital plumbing might have its own IP address that can be integrated with pathogen genome sequences. Microbiome mania will continue, but the tide will turn from molecular barcoding towards metagenomics. Crowd-sourced analyses will collide with cloud computing, but eternal vigilance will be the price of preventing the misinterpretation and overselling of microbial sequence data. Output from hand-held sequencers will be analysed on mobile devices. Open-source training materials will address the need for the development of a skilled labour force. As we boldly go into the third decade of the twenty-first century, microbial sequence space will remain the final frontier! © 2016 The Author. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.

  1. A novel endo-beta-1,3-glucanase, BGN13.1, involved in the mycoparasitism of Trichoderma harzianum.

    PubMed Central

    de la Cruz, J; Pintor-Toro, J A; Benítez, T; Llobell, A; Romero, L C

    1995-01-01

    The mycoparasitic fungus Trichoderma harzianum CECT 2413 produces at least three extracellular beta-1,3-glucanases. The most basic of these extracellular enzymes, named BGN13.1, was expressed when either fungal cell wall polymers or autoclaved mycelia from different fungi were used as the carbon source. BGN13.1 was purified to electrophoretic homogeneity and was biochemically characterized. The enzyme was specific for beta-1,3 linkages and has an endolytic mode of action. A synthetic oligonucleotide primer based on the sequence of an internal peptide was designed to clone the cDNA corresponding to BGN13.1. The deduced amino acid sequence predicted a molecular mass of 78 kDa for the mature protein. Analysis of the amino acid sequence indicates that the enzyme contains three regions, one N-terminal leader sequence; another, nondefined sequence; and one cysteine-rich C-terminal sequence. Sequence comparison shows that this beta-1,3-glucanase, first described for filamentous fungi, belongs to a family different from that of its previously described bacterial, yeast, and plant counterparts. Enzymatic-activity, protein, and mRNA data indicated that bgn13.1 is repressed by glucose and induced by either fungal cell wall polymers or autoclaved yeast cells and mycelia. Finally, experimental evidence showed that the enzyme hydrolyzes yeast and fungal cell walls. PMID:7592488

  2. Structure-function analysis of diacylglycerol acyltransferase sequences from tung tree and 82 other Organisms

    USDA-ARS's Scientific Manuscript database

    Diacylglycerol acyltransferase family (DGATs) catalyzes the final and rate-limiting step of triacylglycerol (TAG) biosynthesis in eukaryotic organisms. DGATs esterify sn-1,2-diacylglycerol with a long-chain fatty acyl-CoA. Understanding the roles of DGATs will help to create transgenic plants with v...

  3. Variant amino acid residues alter the enzyme activity of peanut type 2 Diacylglycerol Acyltransferases

    USDA-ARS's Scientific Manuscript database

    Diacylglycerol acyltransferase (DGAT) catalyzes the final, rate-limiting step in triacylglycerol (TAG) biosynthesis via the acyl-CoA-dependent acylation of diacylglycerol. In this study, type-2 DGAT2 genes were cloned from eleven peanut cultivars. Sequence analysis revealed at least eight peanut D...

  4. 75 FR 79459

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-12-20

    ... DEPARTMENT OF AGRICULTURE Regulation Sequence Title Identifier Rulemaking Stage Number Number 1 Wholesale Pork Reporting Program 0581-AD07 Proposed Rule Stage 2 National Dairy Promotion and Research Program; Dairy Import Assessments, DA-08-0050 0581-AC87 Final Rule Stage 3 Animal Welfare; Regulations and Standards for Birds 0579-AC02 Proposed...

  5. Developing Musical Creativity through Improvisation in the Large Performance Classroom

    ERIC Educational Resources Information Center

    Norgaard, Martin

    2017-01-01

    Improvisation is an ideal way to develop musical creativity in ensemble settings. This article describes two prominent theoretical frameworks related to improvisation. Next, based on research with developing and expert improvisers, it discusses how to sequence improvisatory activities so that students feel accomplished at every step. Finally, the…

  6. Securities and Exchange Commission Semiannual Regulatory Agenda

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-26

    ... Insurance Contracts 3235-AK49 DIVISION OF INVESTMENT MANAGEMENT--Final Rule Stage Regulation Sequence Title Identifier Number Number 425 Amendments to Form ADV 3235-AI17 DIVISION OF INVESTMENT MANAGEMENT--Completed... 15c2-2: Confirmation of Transactions in Open-End Management Investment Company 3235-AJ11 Shares, Unit...

  7. Microscale Synthesis of 1-Bromo-3-Chloro-5-Iodobenzene: An Improved Deamination of 4-Bromo-2-Chloro-6-Iodoaniline

    ERIC Educational Resources Information Center

    Pelter, Michael W.; Pelter, Libbie S. W.; Colovic, Dusanka; Strug, Regina

    2004-01-01

    The sequence of microscale mixing of 1-bromo-3-chloro-5-iodobenzene along with reductive deamination of 4-bromo-2-chloro-6-iodoaniline is described. This novel deamination approach is beneficial in final product separation and higher product output.

  8. Faculty Development for Gerontology Program Development. A Final Report.

    ERIC Educational Resources Information Center

    Peterson, David A.; Wendt, Pamela F.

    The University of Southern California's gerontology faculty development program sought to enhance gerontology programs by preparing two to three faculty members from each of several college campuses in Southern California to become core committees that would facilitate an organized sequence of gerontology instruction within their institutions. All…

  9. The Development of Validated Museum Exhibits. Final Report.

    ERIC Educational Resources Information Center

    Nicol, Elizabeth H.

    Exhibit development, as conceived in this report, is an evolutionary process, drawing the museum visitor into the collaborative venture of testing and improving the exhibits. The findings of contemporary learning research were put to work in the arrangement of activities and specimens that engaged children through self-instructional sequences. The…

  10. USE OF ICC/PCR AND NUCLEIC ACID SEQUENCING TO OVERCOME FALSE NEGATIVE RESULTS IN ENVIRONMENTAL VIRUS SURVEYS. (R824756)

    EPA Science Inventory

    The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...

  11. Project UPSTART. Final Report, October 1, 1983-September 30, 1984.

    ERIC Educational Resources Information Center

    Frain, Joan

    Project UPSTART, during this fourth year of outreach, offered assistance in replicating its developed Sequenced Neuro-Sensorimotor Program (SNSP) for severely multihandicapped infants, pre-schoolers, young adults and their families. Future replication sites were identified. Programs received outreach assistance in the areas of staff training,…

  12. COMBINING DNA SEQUENCES AND MORPHOLOGY IN SYSTEMATICS: TESTING THE VALIDITY OF THE DRAGONFLY SPECIES CORDULEGASTER BILINEATA. (R826599)

    EPA Science Inventory

    The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...

  13. 78 FR 1618 - Semiannual Agenda and Fiscal Year 2013 Regulatory Plan

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-01-08

    ... Regulation Sequence No. Title Identifier No. 369 Indoor Tanning Services; 1545-BJ40 Cosmetic Services Excise... DEPARTMENT OF THE TREASURY (TREAS) Internal Revenue Service (IRS) Final Rule Stage 369. Indoor Tanning.... 7805 Abstract: Proposed regulations provide guidance on the indoor tanning services tax made by the...

  14. Developing a Methodology for Designing Systems of Instruction.

    ERIC Educational Resources Information Center

    Carpenter, Polly

    This report presents a description of a process for instructional system design, identification of the steps in the design process, and determination of their sequence and interrelationships. As currently envisioned, several interrelated steps must be taken, five of which provide the inputs to the final design process. There are analysis of…

  15. Mapping Simple Repeated DNA Sequences in Heterochromatin of Drosophila Melanogaster

    PubMed Central

    Lohe, A. R.; Hilliker, A. J.; Roberts, P. A.

    1993-01-01

    Heterochromatin in Drosophila has unusual genetic, cytological and molecular properties. Highly repeated DNA sequences (satellites) are the principal component of heterochromatin. Using probes from cloned satellites, we have constructed a chromosome map of 10 highly repeated, simple DNA sequences in heterochromatin of mitotic chromosomes of Drosophila melanogaster. Despite extensive sequence homology among some satellites, chromosomal locations could be distinguished by stringent in situ hybridizations for each satellite. Only two of the localizations previously determined using gradient-purified bulk satellite probes are correct. Eight new satellite localizations are presented, providing a megabase-level chromosome map of one-quarter of the genome. Five major satellites each exhibit a multichromosome distribution, and five minor satellites hybridize to single sites on the Y chromosome. Satellites closely related in sequence are often located near one another on the same chromosome. About 80% of Y chromosome DNA is composed of nine simple repeated sequences, in particular (AAGAC)(n) (8 Mb), (AAGAG)(n) (7 Mb) and (AATAT)(n) (6 Mb). Similarly, more than 70% of the DNA in chromosome 2 heterochromatin is composed of five simple repeated sequences. We have also generated a high resolution map of satellites in chromosome 2 heterochromatin, using a series of translocation chromosomes whose breakpoints in heterochromatin were ordered by N-banding. Finally, staining and banding patterns of heterochromatic regions are correlated with the locations of specific repeated DNA sequences. The basis for the cytochemical heterogeneity in banding appears to depend exclusively on the different satellite DNAs present in heterochromatin. PMID:8375654

  16. Nanopore sequencing in microgravity

    PubMed Central

    McIntyre, Alexa B R; Rizzardi, Lindsay; Yu, Angela M; Alexander, Noah; Rosen, Gail L; Botkin, Douglas J; Stahl, Sarah E; John, Kristen K; Castro-Wallace, Sarah L; McGrath, Ken; Burton, Aaron S; Feinberg, Andrew P; Mason, Christopher E

    2016-01-01

    Rapid DNA sequencing and analysis has been a long-sought goal in remote research and point-of-care medicine. In microgravity, DNA sequencing can facilitate novel astrobiological research and close monitoring of crew health, but spaceflight places stringent restrictions on the mass and volume of instruments, crew operation time, and instrument functionality. The recent emergence of portable, nanopore-based tools with streamlined sample preparation protocols finally enables DNA sequencing on missions in microgravity. As a first step toward sequencing in space and aboard the International Space Station (ISS), we tested the Oxford Nanopore Technologies MinION during a parabolic flight to understand the effects of variable gravity on the instrument and data. In a successful proof-of-principle experiment, we found that the instrument generated DNA reads over the course of the flight, including the first ever sequenced in microgravity, and additional reads measured after the flight concluded its parabolas. Here we detail modifications to the sample-loading procedures to facilitate nanopore sequencing aboard the ISS and in other microgravity environments. We also evaluate existing analysis methods and outline two new approaches, the first based on a wave-fingerprint method and the second on entropy signal mapping. Computationally light analysis methods offer the potential for in situ species identification, but are limited by the error profiles (stays, skips, and mismatches) of older nanopore data. Higher accuracies attainable with modified sample processing methods and the latest version of flow cells will further enable the use of nanopore sequencers for diagnostics and research in space. PMID:28725742

  17. Decision Tree Algorithm-Generated Single-Nucleotide Polymorphism Barcodes of rbcL Genes for 38 Brassicaceae Species Tagging.

    PubMed

    Yang, Cheng-Hong; Wu, Kuo-Chuan; Chuang, Li-Yeh; Chang, Hsueh-Wei

    2018-01-01

    DNA barcode sequences are accumulating in large data sets. A barcode is generally a sequence larger than 1000 base pairs and generates a computational burden. Although the DNA barcode was originally envisioned as straightforward species tags, the identification usage of barcode sequences is rarely emphasized currently. Single-nucleotide polymorphism (SNP) association studies provide us an idea that the SNPs may be the ideal target of feature selection to discriminate between different species. We hypothesize that SNP-based barcodes may be more effective than the full length of DNA barcode sequences for species discrimination. To address this issue, we tested a ribulose diphosphate carboxylase (rbcL) SNP barcoding (RSB) strategy using a decision tree algorithm. After alignment and trimming, 31 SNPs were discovered in the rbcL sequences from 38 Brassicaceae plant species. In the decision tree construction, these SNPs were computed to set up the decision rule to assign the sequences into 2 groups level by level. After algorithm processing, 37 nodes and 31 loci were required for discriminating 38 species. Finally, the sequence tags consisting of 31 rbcL SNP barcodes were identified for discriminating 38 Brassicaceae species based on the decision tree-selected SNP pattern using the RSB method. Taken together, this study provides the rationale that the SNP aspect of the DNA barcode for the rbcL gene is a useful and effective sequence for tagging 38 Brassicaceae species.
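
    The level-by-level assignment of sequences into 2 groups described above can be sketched as a recursive binary split on one SNP locus per node. The abstract does not give the authors' splitting rule, so this is only an illustration under an assumed rule (pick the locus that gives the most balanced split). The toy SNP strings and the function names are hypothetical. A binary tree that separates k species into single-species leaves always has k - 1 decision nodes, which is consistent with the 37 nodes reported for 38 species.

```python
from collections import Counter


def build_tree(species_to_snps):
    """Recursively split species into two groups using one SNP locus per node."""
    names = sorted(species_to_snps)
    if len(names) == 1:
        return names[0]  # leaf: a single species remains
    n_loci = len(species_to_snps[names[0]])
    best = None
    for locus in range(n_loci):
        counts = Counter(species_to_snps[name][locus] for name in names)
        if len(counts) < 2:
            continue  # locus is not informative within this group
        allele, size = counts.most_common(1)[0]
        balance = abs(2 * size - len(names))  # 0 means a perfectly even split
        if best is None or balance < best[0]:
            best = (balance, locus, allele)
    if best is None:
        raise ValueError(f"SNPs cannot discriminate: {names}")
    _, locus, allele = best
    match = {n: s for n, s in species_to_snps.items() if s[locus] == allele}
    other = {n: s for n, s in species_to_snps.items() if s[locus] != allele}
    return {"locus": locus, "allele": allele,
            "match": build_tree(match), "other": build_tree(other)}


def count_decision_nodes(tree):
    """Number of internal (decision) nodes in the tree."""
    if not isinstance(tree, dict):
        return 0
    return 1 + count_decision_nodes(tree["match"]) + count_decision_nodes(tree["other"])


def classify(tree, snps):
    """Follow the decision rules until a species label is reached."""
    while isinstance(tree, dict):
        branch = "match" if snps[tree["locus"]] == tree["allele"] else "other"
        tree = tree[branch]
    return tree


# Toy SNP barcodes (3 loci, 4 made-up species); not data from the study.
toy = {"sp1": "AAG", "sp2": "ACG", "sp3": "TAG", "sp4": "TCT"}
tree = build_tree(toy)
print(count_decision_nodes(tree))
print(classify(tree, "TCT"))
```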

  18. A sequence in the rat Pit-1 gene promoter confers synergistic activation by glucocorticoids and protein kinase-C.

    PubMed

    Jong, M T; Raaka, B M; Samuels, H H

    1994-10-01

    The 5'-flanking region of the gene for Pit-1, a pituitary-specific transcription factor, was isolated from a rat liver genomic library and sequenced. Expression of a reporter construct containing Pit-1 promoter sequences linked to the bacterial chloramphenicol acetyltransferase (CAT) gene was assessed by transient transfection in rat pituitary GH4C1 cells. Treatment of transfected cells with either dexamethasone (DEX) for 48 h or the phorbol ester 12-O-tetradecanoylphorbol 13-acetate (TPA) for the final 20 h of the 48-h posttransfection period had minimal effects on CAT expression. However, CAT activity was elevated about 20-fold when transfected cells were treated with both DEX and TPA. This apparent synergistic activation was lost when DEX treatment was also limited to the final 20 h of the 48-h posttransfection period, suggesting that a time-dependent accumulation of a DEX-induced gene product might be involved. This putative DEX-induced product appeared to be relatively stable, because synergistic activation was observed in cells treated with DEX alone for 36 h, followed by a 10-h incubation without DEX before the addition of TPA. The Pit-1 gene promoter region between -210 and -142 from the transcription start site conferred synergistic regulation by DEX and TPA when placed upstream of position -105 in the herpes viral thymidine kinase promoter.(ABSTRACT TRUNCATED AT 250 WORDS)

  19. Directing folding pathways for multi-component DNA origami nanostructures with complex topology

    NASA Astrophysics Data System (ADS)

    Marras, A. E.; Zhou, L.; Kolliopoulos, V.; Su, H.-J.; Castro, C. E.

    2016-05-01

    Molecular self-assembly has become a well-established technique to design complex nanostructures and hierarchical mesoscale assemblies. The typical approach is to design binding complementarity into nucleotide or amino acid sequences to achieve the desired final geometry. However, with an increasing interest in dynamic nanodevices, the need to design structures with motion has necessitated the development of multi-component structures. While this has been achieved through hierarchical assembly of similar structural units, here we focus on the assembly of topologically complex structures, specifically with concentric components, where post-folding assembly is not feasible. We exploit the ability to direct folding pathways to program the sequence of assembly and present a novel approach of designing the strand topology of intermediate folding states to program the topology of the final structure, in this case a DNA origami slider structure that functions much like a piston-cylinder assembly in an engine. The ability to program the sequence and control orientation and topology of multi-component DNA origami nanostructures provides a foundation for a new class of structures with internal and external moving parts and complex scaffold topology. Furthermore, this work provides critical insight to guide the design of intermediate states along a DNA origami folding pathway and to further understand the details of DNA origami self-assembly to more broadly control folding states and landscapes.

  20. Using the auxiliary camera for system calibration of 3D measurement by digital speckle

    NASA Astrophysics Data System (ADS)

    Xue, Junpeng; Su, Xianyu; Zhang, Qican

    2014-06-01

    The study of 3D shape measurement by digital speckle temporal sequence correlation has drawn a lot of attention because of its advantages; however, the measurement mainly recovers the depth z-coordinate, and the horizontal physical coordinates (x, y) are usually given only as image pixel coordinates. In this paper, a new approach to system calibration is proposed. With an auxiliary camera, we set up a temporary binocular vision system, which is used to calibrate the horizontal coordinates (mm) while the temporal sequence reference-speckle-sets are calibrated. First, the binocular vision system is calibrated using the traditional method. Then, the digital speckles are projected onto the reference plane, which is moved in equal steps in the depth direction, and temporal sequence speckle images are acquired with the camera as reference sets. When the reference plane is in the first position and in the final position, a crossed fringe pattern is projected onto the plane. The pixel coordinates of the control points are extracted from the images by Fourier analysis, and their physical coordinates are calculated by the binocular vision system. The physical coordinates corresponding to each pixel of the images are calculated by an interpolation algorithm. Finally, the x and y corresponding to an arbitrary depth value z are obtained by the geometric formula. Experiments show that our method can quickly and flexibly measure the 3D shape of an object as a point cloud.

  1. A two-stage stochastic rule-based model to determine pre-assembly buffer content

    NASA Astrophysics Data System (ADS)

    Gunay, Elif Elcin; Kula, Ufuk

    2018-01-01

    This study considers the instant decision-making needs of automobile manufacturers for resequencing vehicles before final assembly (FA). We propose a rule-based two-stage stochastic model to determine the number of spare vehicles that should be kept in the pre-assembly buffer to restore the altered sequence due to paint defects and upstream department constraints. The first stage of the model decides the spare vehicle quantities, whereas the second stage recovers the scrambled sequence with respect to pre-defined rules. The problem is solved by the sample average approximation (SAA) algorithm. We conduct a numerical study to compare the solutions of the heuristic model with the optimal ones and provide the following insights: (i) as the mismatch between paint entrance and scheduled sequence decreases, the rule-based heuristic model recovers the scrambled sequence as well as the optimal resequencing model, (ii) the rule-based model is more sensitive to the mismatch between the paint entrance and scheduled sequences for recovering the scrambled sequence, (iii) as the defect rate increases, the difference in recovery effectiveness between rule-based heuristic and optimal solutions increases, (iv) as buffer capacity increases, the recovery effectiveness of the optimization model outperforms that of the heuristic model, (v) as expected, the rule-based model holds more inventory than the optimization model.

  2. The Relevance of HLA Sequencing in Population Genetics Studies

    PubMed Central

    Sanchez-Mazas, Alicia

    2014-01-01

    Next generation sequencing (NGS) is currently being adapted by different biotechnological platforms to the standard typing method for HLA polymorphism, the huge diversity of which makes this initiative particularly challenging. Boosting the molecular characterization of the HLA genes through efficient, rapid, and low-cost technologies is expected to amplify the success of tissue transplantation by enabling us to find donor-recipient matching for rare phenotypes. But the application of NGS technologies to the molecular mapping of the MHC region also anticipates essential changes in population genetic studies. Huge amounts of HLA sequence data will be available in the next years for different populations, with the potential to change our understanding of HLA variation in humans. In this review, we first explain how HLA sequencing allows a better assessment of the HLA diversity in human populations, taking also into account the methodological difficulties it introduces at the statistical level; secondly, we show how analyzing HLA sequence variation may improve our comprehension of population genetic relationships by facilitating the identification of demographic events that marked human evolution; finally, we discuss the interest of both HLA and genome-wide sequencing and genotyping in detecting functionally significant SNPs in the MHC region, the latter having also contributed to the makeup of the HLA molecular diversity observed today. PMID:25126587

  3. Transcriptomic analysis of Siberian ginseng (Eleutherococcus senticosus) to discover genes involved in saponin biosynthesis.

    PubMed

    Hwang, Hwan-Su; Lee, Hyoshin; Choi, Yong Eui

    2015-03-14

    Eleutherococcus senticosus, Siberian ginseng, is a highly valued woody medicinal plant belonging to the family Araliaceae. E. senticosus produces a rich variety of saponins such as oleanane-type, noroleanane-type, 29-hydroxyoleanan-type, and lupane-type saponins. Genomic or transcriptomic approaches have not been used to investigate the saponin biosynthetic pathway in this plant. In this study, de novo sequencing was performed to select candidate genes involved in the saponin biosynthetic pathway. A half-plate 454 pyrosequencing run produced 627,923 high-quality reads with an average sequence length of 422 bases. De novo assembly generated 72,811 unique sequences, including 15,217 contigs and 57,594 singletons. Approximately 48,300 (66.3%) unique sequences were annotated using BLAST similarity searches. All of the mevalonate pathway genes for saponin biosynthesis starting from acetyl-CoA were isolated. Moreover, 206 reads of cytochrome P450 (CYP) and 145 reads of uridine diphosphate glycosyltransferase (UGT) sequences were isolated. Based on methyl jasmonate (MeJA) treatment and real-time PCR (qPCR) analysis, 3 CYPs and 3 UGTs were finally selected as candidate genes involved in the saponin biosynthetic pathway. The identified sequences associated with saponin biosynthesis will facilitate the study of the functional genomics of saponin biosynthesis and genetic engineering of E. senticosus.
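
    The assembly figures quoted above can be checked with a few lines of arithmetic. This sketch only re-derives numbers already stated in the abstract; the variable names are illustrative.

```python
# Figures as reported in the abstract.
contigs = 15_217
singletons = 57_594
unique_sequences = 72_811
annotated = 48_300  # given as "approximately" in the abstract

# Contigs plus singletons should account for every unique sequence.
assert contigs + singletons == unique_sequences

# Share of unique sequences annotated by BLAST similarity searches.
annotated_pct = round(100 * annotated / unique_sequences, 1)
print(annotated_pct)  # 66.3
```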

  4. Accuracy of the high-throughput amplicon sequencing to identify species within the genus Aspergillus.

    PubMed

    Lee, Seungeun; Yamamoto, Naomichi

    2015-12-01

    This study characterized the accuracy of high-throughput amplicon sequencing to identify species within the genus Aspergillus. To this end, we sequenced the internal transcribed spacer 1 (ITS1), β-tubulin (BenA), and calmodulin (CaM) gene encoding sequences as DNA markers from eight reference Aspergillus strains with known identities using 300-bp sequencing on the Illumina MiSeq platform, and compared them with the BLASTn outputs. The identifications with the sequences longer than 250 bp were accurate at the section rank, with some ambiguities observed at the species rank due to mostly cross detection of sibling species. Additionally, in silico analysis was performed to predict the identification accuracy for all species in the genus Aspergillus, where 107, 210, and 187 species were predicted to be identifiable down to the species rank based on ITS1, BenA, and CaM, respectively. Finally, air filter samples were analysed to quantify the relative abundances of Aspergillus species in outdoor air. The results were reproducible across biological duplicates both at the species and section ranks, but not strongly correlated between ITS1 and BenA, suggesting the Aspergillus detection can be taxonomically biased depending on the selection of the DNA markers and/or primers. Copyright © 2015 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.

  5. Project 1: Microbial Genomes: A Genomic Approach to Understanding the Evolution of Virulence. Project 2: From Genomes to Life: Drosophilia Development in Space and Time

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Robert DeSalle

    2004-09-10

    This project seeks to use the genomes of two close relatives, A. actinomycetemcomitans and H. aphrophilus, to understand the evolutionary changes that take place in a genome to make it more or less virulent. Our primary specific aim of this project was to sequence, annotate, and analyze the genomes of Actinobacillus actinomycetemcomitans (CU1000, serotype f) and Haemophilus aphrophilus. With these genome sequences we have then compared the whole genome sequences to each other and to the current Aa (HK1651 www.genome.ou.edu) genome project sequence along with other fully sequenced Pasteurellaceae to determine inter and intra species differences that may account for the differences and similarities in disease. We also propose to create and curate a comprehensive database where sequence information and analysis for the Pasteurellaceae (family that includes the genera Actinobacillus and Haemophilus) are readily accessible. And finally we have proposed to develop phylogenetic techniques that can be used to efficiently and accurately examine the evolution of genomes. Below we report on progress we have made on these major specific aims. Progress on the specific aims is reported below under two major headings--experimental approaches and bioinformatics and systematic biology approaches.

  6. Protecting unknown two-qubit entangled states by nesting Uhrig's dynamical decoupling sequences

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mukhtar, Musawwadah; Soh, Wee Tee; Saw, Thuan Beng

    2010-11-15

    Future quantum technologies rely heavily on good protection of quantum entanglement against environment-induced decoherence. A recent study showed that an extension of Uhrig's dynamical decoupling (UDD) sequence can (in theory) lock an arbitrary but known two-qubit entangled state to the Nth order using a sequence of N control pulses [Mukhtar et al., Phys. Rev. A 81, 012331 (2010)]. By nesting three layers of explicitly constructed UDD sequences, here we first consider the protection of unknown two-qubit states as superposition of two known basis states, without making assumptions of the system-environment coupling. It is found that the obtained decoherence suppression can be highly sensitive to the ordering of the three UDD layers and can be remarkably effective with the correct ordering. The detailed theoretical results are useful for general understanding of the nature of controlled quantum dynamics under nested UDD. As an extension of our three-layer UDD, it is finally pointed out that a completely unknown two-qubit state can be protected by nesting four layers of UDD sequences. This work indicates that when UDD is applicable (e.g., when the environment has a sharp frequency cutoff and when control pulses can be taken as instantaneous pulses), dynamical decoupling using nested UDD sequences is a powerful approach for entanglement protection.

  7. Genome-Wide Prediction and Analysis of 3D-Domain Swapped Proteins in the Human Genome from Sequence Information.

    PubMed

    Upadhyay, Atul Kumar; Sowdhamini, Ramanathan

    2016-01-01

    3D-domain swapping is one of the mechanisms of protein oligomerization, and the proteins exhibiting this phenomenon have many biological functions. These proteins, which undergo domain swapping, have attracted much attention owing to their involvement in human diseases, such as conformational diseases, amyloidosis, serpinopathies, and proteinopathies. Early recognition of proteins across the whole human genome that retain a tendency to domain swap will enable many aspects of disease control management. Predictive models were developed by using machine learning approaches with an average accuracy of 78% (85.6% sensitivity, 87.5% specificity, and an MCC value of 0.72) to predict putative domain swapping in protein sequences. These models were applied to many complete genomes with special emphasis on the human genome. Nearly 44% of the protein sequences in the human genome were predicted positive for domain swapping. Enrichment analysis was performed on the positively predicted sequences from the human genome for their domain distribution, disease association and functional importance based on Gene Ontology (GO). Enrichment analysis was also performed to infer a better understanding of the functional importance of these sequences. Finally, we developed hinge region prediction for a given putative domain-swapped sequence by using important physicochemical properties of amino acids.
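
    The abstract above reports accuracy, sensitivity, specificity, and MCC without defining them. As a minimal sketch, the standard definitions can be computed from a binary confusion matrix as follows. The function name and the example counts are hypothetical and are not taken from the paper.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
print(metrics)
```

    Unlike accuracy, the MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside sensitivity and specificity for imbalanced data.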

  8. Solid-phase proximity ligation assays for individual or parallel protein analyses with readout via real-time PCR or sequencing.

    PubMed

    Nong, Rachel Yuan; Wu, Di; Yan, Junhong; Hammond, Maria; Gu, Gucci Jijuan; Kamali-Moghaddam, Masood; Landegren, Ulf; Darmanis, Spyros

    2013-06-01

    Solid-phase proximity ligation assays share properties with the classical sandwich immunoassays for protein detection. The proteins captured via antibodies on solid supports are, however, detected not by single antibodies with detectable functions, but by pairs of antibodies with attached DNA strands. Upon recognition by these sets of three antibodies, pairs of DNA strands brought in proximity are joined by ligation. The ligated reporter DNA strands are then detected via methods such as real-time PCR or next-generation sequencing (NGS). We describe how to construct assays that can offer improved detection specificity by virtue of recognition by three antibodies, as well as enhanced sensitivity owing to reduced background and amplified detection. Finally, we also illustrate how the assays can be applied for parallel detection of proteins, taking advantage of the oligonucleotide ligation step to avoid background problems that might arise with multiplexing. The protocol for the singleplex solid-phase proximity ligation assay takes ~5 h. The multiplex version of the assay takes 7-8 h depending on whether quantitative PCR (qPCR) or sequencing is used as the readout. The time for the sequencing-based protocol includes the library preparation but not the actual sequencing, as times may vary based on the choice of sequencing platform.

  9. The relevance of HLA sequencing in population genetics studies.

    PubMed

    Sanchez-Mazas, Alicia; Meyer, Diogo

    2014-01-01

    Next generation sequencing (NGS) is currently being adapted by different biotechnological platforms to the standard typing method for HLA polymorphism, the huge diversity of which makes this initiative particularly challenging. Boosting the molecular characterization of the HLA genes through efficient, rapid, and low-cost technologies is expected to amplify the success of tissue transplantation by enabling us to find donor-recipient matching for rare phenotypes. But the application of NGS technologies to the molecular mapping of the MHC region also anticipates essential changes in population genetic studies. Huge amounts of HLA sequence data will be available in the next years for different populations, with the potential to change our understanding of HLA variation in humans. In this review, we first explain how HLA sequencing allows a better assessment of the HLA diversity in human populations, taking also into account the methodological difficulties it introduces at the statistical level; secondly, we show how analyzing HLA sequence variation may improve our comprehension of population genetic relationships by facilitating the identification of demographic events that marked human evolution; finally, we discuss the interest of both HLA and genome-wide sequencing and genotyping in detecting functionally significant SNPs in the MHC region, the latter having also contributed to the makeup of the HLA molecular diversity observed today.

  10. HIV Type 1 Transmission Networks Among Men Having Sex with Men and Heterosexuals in Kenya

    PubMed Central

    Faria, Nuno Rodrigues; Hassan, Amin; Hamers, Raph L.; Mutua, Gaudensia; Anzala, Omu; Mandaliya, Kishor; Cane, Patricia; Berkley, James A.; Rinke de Wit, Tobias F.; Wallis, Carole; Graham, Susan M.; Price, Matthew A.; Coutinho, Roel A.; Sanders, Eduard J.

    2014-01-01

    We performed a molecular phylogenetic study on HIV-1 polymerase sequences of men who have sex with men (MSM) and heterosexual patient samples in Kenya to characterize any observed HIV-1 transmission networks. HIV-1 polymerase sequences were obtained from samples in Nairobi and coastal Kenya from 84 MSM, 226 other men, and 364 women from 2005 to 2010. Using Bayesian phylogenetics, we tested whether sequences clustered by sexual orientation and geographic location. In addition, we used trait diffusion analyses to identify significant epidemiological links and to quantify the number of transmissions between risk groups. Finally, we compared 84 MSM sequences with all HIV-1 sequences available online at GenBank. Significant clustering of sequences from MSM at both coastal Kenya and Nairobi was found, with evidence of HIV-1 transmission between both locations. Although a transmission pair between a coastal MSM and woman was confirmed, no significant HIV-1 transmission was evident between MSM and the comparison population for the predominant subtype A (60%). However, a weak but significant link was evident when studying all subtypes together. GenBank comparison did not reveal other important transmission links. Our data suggest infrequent intermingling of MSM and heterosexual HIV-1 epidemics in Kenya. PMID:23947948

  11. RNase H-assisted RNA-primed rolling circle amplification for targeted RNA sequence detection.

    PubMed

    Takahashi, Hirokazu; Ohkawachi, Masahiko; Horio, Kyohei; Kobori, Toshiro; Aki, Tsunehiro; Matsumura, Yukihiko; Nakashimada, Yutaka; Okamura, Yoshiko

    2018-05-17

    RNA-primed rolling circle amplification (RPRCA) is a useful laboratory method for RNA detection; however, the detection of RNA is limited by the lack of information on 3'-terminal sequences. We found that conventional RPRCA using pre-circularized probes could potentially detect the internal sequence of target RNA molecules in combination with RNase H. However, the specificity for mRNA detection was low, presumably due to non-specific hybridization of non-target RNA with the circular probe. To overcome this technical problem, we developed a method for detecting a sequence of interest in target RNA molecules via RNase H-assisted RPRCA using padlock probes. When padlock probes are hybridized to the target RNA molecule, they are converted to the circular form by SplintR ligase. Subsequently, RNase H creates nick sites only in the hybridized RNA sequence, and single-stranded DNA is finally synthesized from the nick site by phi29 DNA polymerase. This method could specifically detect at least 10 fmol of the target RNA molecule without reverse transcription. Moreover, this method detected GFP mRNA present in 10 ng of total RNA isolated from Escherichia coli without background DNA amplification. Therefore, this method can potentially detect almost all types of RNA molecules without reverse transcription and reveal full-length sequence information.

  12. Sequence features of viral and human Internal Ribosome Entry Sites predictive of their activity

    PubMed Central

    Elias-Kirma, Shani; Nir, Ronit; Segal, Eran

    2017-01-01

    Translation of mRNAs through Internal Ribosome Entry Sites (IRESs) has emerged as a prominent mechanism of cellular and viral initiation. It supports cap-independent translation of select cellular genes under normal conditions, and in conditions when cap-dependent translation is inhibited. IRES structure and sequence are believed to be involved in this process. However due to the small number of IRESs known, there have been no systematic investigations of the determinants of IRES activity. With the recent discovery of thousands of novel IRESs in human and viruses, the next challenge is to decipher the sequence determinants of IRES activity. We present the first in-depth computational analysis of a large body of IRESs, exploring RNA sequence features predictive of IRES activity. We identified predictive k-mer features resembling IRES trans-acting factor (ITAF) binding motifs across human and viral IRESs, and found that their effect on expression depends on their sequence, number and position. Our results also suggest that the architecture of retroviral IRESs differs from that of other viruses, presumably due to their exposure to the nuclear environment. Finally, we measured IRES activity of synthetically designed sequences to confirm our prediction of increasing activity as a function of the number of short IRES elements. PMID:28922394
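
    The abstract above describes k-mer features whose effect depends on their sequence, number, and position. As a minimal sketch of how such features might be extracted (not the authors' pipeline), the following counts every k-mer in an RNA sequence and lists the start positions of one chosen k-mer. The function names and the toy sequence are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_positions(seq, kmer):
    """Return the 0-based start positions of each occurrence of kmer in seq."""
    seq, kmer = seq.upper(), kmer.upper()
    k = len(kmer)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] == kmer]


# Toy sequence for illustration only; it is not a real IRES.
toy_rna = "ACUUUUACUUUU"
counts = kmer_counts(toy_rna, 4)
positions = kmer_positions(toy_rna, "UUUU")
print(counts["UUUU"], positions)
```

    The count gives the "number" feature and the position list gives the "position" feature for a given k-mer; a predictive model would then take such values as inputs.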

  13. AntiClustal: Multiple Sequence Alignment by antipole clustering and linear approximate 1-median computation.

    PubMed

    Di Pietro, C; Di Pietro, V; Emmanuele, G; Ferro, A; Maugeri, T; Modica, E; Pigola, G; Pulvirenti, A; Purrello, M; Ragusa, M; Scalia, M; Shasha, D; Travali, S; Zimmitti, V

    2003-01-01

    In this paper we present a new Multiple Sequence Alignment (MSA) algorithm called AntiClusAl. The method makes use of the commonly used idea of aligning homologous sequences belonging to classes generated by some clustering algorithm, and then continuing the alignment process in a bottom-up way along a suitable tree structure. The final result is then read at the root of the tree. Multiple sequence alignment in each cluster makes use of progressive alignment with the 1-median (center) of the cluster. The 1-median of a set S of sequences is the element of S which minimizes the average distance from any other sequence in S. Its exact computation requires quadratic time. The basic idea of our proposed algorithm is to make use of a simple and natural algorithmic technique based on randomized tournaments, which has been successfully applied to large search problems in general metric spaces. In particular, a clustering algorithm called Antipole tree and an approximate linear 1-median computation are used. Compared with Clustal W, a widely used tool for MSA, our algorithm shows better running times with fully comparable alignment quality. A successful biological application showing high amino acid conservation during evolution of Xenopus laevis SOD2 is also cited.
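
    The two 1-median computations mentioned above can be sketched as follows: an exact version that compares all pairs (quadratic time), and a randomized-tournament approximation in which small random groups each elect a local winner until few candidates remain. This is an illustrative sketch, not the published AntiClusAl code. The Hamming distance, the group size, and all function names are assumptions made for the example.

```python
import random


def hamming(a, b):
    """Toy distance for equal-length strings (stand-in for an alignment distance)."""
    return sum(x != y for x, y in zip(a, b))


def exact_one_median(items, dist):
    """Element minimizing the summed (hence average) distance to all others: O(n^2)."""
    return min(items, key=lambda c: sum(dist(c, other) for other in items))


def tournament_one_median(items, dist, group_size=3, seed=0):
    """Approximate 1-median: winners of small random groups advance round by round."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    rng = random.Random(seed)
    pool = list(items)
    while len(pool) > group_size:
        rng.shuffle(pool)
        # Each group of at most group_size elements elects its local 1-median.
        pool = [exact_one_median(pool[i:i + group_size], dist)
                for i in range(0, len(pool), group_size)]
    return exact_one_median(pool, dist)


seqs = ["AAAA", "AAAT", "AATT", "TTTT", "AAAC"]
exact = exact_one_median(seqs, hamming)
approx = tournament_one_median(seqs, hamming)
print(exact, approx)
```

    Each round evaluates a constant number of distances per element and shrinks the pool by roughly a factor of group_size, so the total work grows linearly with the number of sequences. The result is an approximation and need not equal the exact 1-median.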

  14. Habits as action sequences: hierarchical action control and changes in outcome value

    PubMed Central

    Dezfouli, Amir; Lingawi, Nura W.; Balleine, Bernard W.

    2014-01-01

    Goal-directed action involves making high-level choices that are implemented using previously acquired action sequences to attain desired goals. Such a hierarchical schema is necessary for goal-directed actions to be scalable to real-life situations, but results in decision-making that is less flexible than when action sequences are unfolded and the decision-maker deliberates step-by-step over the outcome of each individual action. In particular, from this perspective, the offline revaluation of any outcomes that fall within action sequence boundaries will be invisible to the high-level planner resulting in decisions that are insensitive to such changes. Here, within the context of a two-stage decision-making task, we demonstrate that this property can explain the emergence of habits. Next, we show how this hierarchical account explains the insensitivity of over-trained actions to changes in outcome value. Finally, we provide new data that show that, under extended extinction conditions, habitual behaviour can revert to goal-directed control, presumably as a consequence of decomposing action sequences into single actions. This hierarchical view suggests that the development of action sequences and the insensitivity of actions to changes in outcome value are essentially two sides of the same coin, explaining why these two aspects of automatic behaviour involve a shared neural structure. PMID:25267824

  15. A reference human genome dataset of the BGISEQ-500 sequencer.

    PubMed

    Huang, Jie; Liang, Xinming; Xuan, Yuankai; Geng, Chunyu; Li, Yuxiang; Lu, Haorong; Qu, Shoufang; Mei, Xianglin; Chen, Hongbo; Yu, Ting; Sun, Nan; Rao, Junhua; Wang, Jiahao; Zhang, Wenwei; Chen, Ying; Liao, Sha; Jiang, Hui; Liu, Xin; Yang, Zhaopeng; Mu, Feng; Gao, Shangxian

    2017-05-01

    BGISEQ-500 is a new desktop sequencer developed by BGI. Using DNA nanoball and combinational probe anchor synthesis developed from Complete Genomics™ sequencing technologies, it generates short reads at a large scale. Here, we present the first human whole-genome sequencing dataset of BGISEQ-500. The dataset was generated by sequencing the widely used cell line HG001 (NA12878) in two sequencing runs of paired-end 50 bp (PE50) and two sequencing runs of paired-end 100 bp (PE100). We also include examples of the raw images from the sequencer for reference. Finally, we identified variations using this dataset, estimated the accuracy of the variations, and compared it to that of the variations identified from similar amounts of publicly available HiSeq2500 data. We found similar single nucleotide polymorphism (SNP) detection accuracy for the BGISEQ-500 PE100 data (false positive rate [FPR] = 0.00020%, sensitivity = 96.20%) compared to the PE150 HiSeq2500 data (FPR = 0.00017%, sensitivity = 96.60%), and better SNP detection accuracy than for the PE50 data (FPR = 0.0006%, sensitivity = 94.15%). But for insertions and deletions (indels), we found lower accuracy for BGISEQ-500 data (FPR = 0.00069% and 0.00067% for PE100 and PE50 respectively, sensitivity = 88.52% and 70.93%) than the HiSeq2500 data (FPR = 0.00032%, sensitivity = 96.28%). Our dataset can serve as the reference dataset, providing basic information not just for future development, but also for all research and applications based on the new sequencing platform. © The Authors 2017. Published by Oxford University Press.

  16. Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment

    PubMed Central

    2011-01-01

    Background: Many Bioinformatics studies begin with a multiple sequence alignment as the foundation for their research. This is because multiple sequence alignment can be a useful technique for studying molecular evolution and analyzing sequence structure relationships. Results: In this paper, we have proposed a Vertical Decomposition with Genetic Algorithm (VDGA) for Multiple Sequence Alignment (MSA). In VDGA, we divide the sequences vertically into two or more subsequences, and then solve them individually using a guide tree approach. Finally, we combine all the subsequences to generate a new multiple sequence alignment. This technique is applied on the solutions of the initial generation and of each child generation within VDGA. We have used two mechanisms to generate an initial population in this research: the first mechanism is to generate guide trees with randomly selected sequences and the second is shuffling the sequences inside such trees. Two different genetic operators have been implemented with VDGA. To test the performance of our algorithm, we have compared it with existing well-known methods, namely PRRP, CLUSTALX, DIALIGN, HMMT, SB_PIMA, ML_PIMA, MULTALIGN, and PILEUP8, and also other methods, based on Genetic Algorithms (GA), such as SAGA, MSA-GA and RBT-GA, by solving a number of benchmark datasets from BAliBase 2.0. Conclusions: The experimental results showed that the VDGA with three vertical divisions was the most successful variant for most of the test cases in comparison to other divisions considered with VDGA. The experimental results also confirmed that VDGA outperformed the other methods considered in this research. PMID:21867510
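
    The divide, solve, and recombine structure described above can be sketched as follows. This is only an illustration of the vertical decomposition step: the guide-tree alignment and the genetic algorithm are replaced by a placeholder that right-pads each block with gaps. The proportional cut points and all function names are assumptions, not details from the paper.

```python
def vertical_split(seqs, parts):
    """Cut every sequence into `parts` consecutive pieces at proportional positions."""
    blocks = []
    for p in range(parts):
        block = []
        for s in seqs:
            start = len(s) * p // parts
            end = len(s) * (p + 1) // parts
            block.append(s[start:end])
        blocks.append(block)
    return blocks


def pad_align(block):
    """Placeholder aligner: right-pad with gaps (stands in for a guide-tree alignment)."""
    width = max(len(s) for s in block)
    return [s.ljust(width, "-") for s in block]


def align_by_vertical_blocks(seqs, parts, align=pad_align):
    """Align each vertical block independently, then concatenate the blocks row by row."""
    aligned_blocks = [align(block) for block in vertical_split(seqs, parts)]
    return ["".join(pieces) for pieces in zip(*aligned_blocks)]


seqs = ["ACGTAC", "ACGT", "ACGTACGT"]
alignment = align_by_vertical_blocks(seqs, parts=2)
for row in alignment:
    print(row)
```

    Any real aligner with the same list-in, list-out interface could be passed as `align`. Because every block comes back with equal-length rows, the concatenated rows again form a rectangular alignment.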

  17. Specialized microbial databases for inductive exploration of microbial genome sequences

    PubMed Central

    Fang, Gang; Ho, Christine; Qiu, Yaowu; Cubas, Virginie; Yu, Zhou; Cabau, Cédric; Cheung, Frankie; Moszer, Ivan; Danchin, Antoine

    2005-01-01

    Background: The enormous amount of genome sequence data calls for user-oriented databases to manage sequences and annotations. Queries must include search tools permitting function identification through exploration of related objects. Methods: The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subquery, have been implemented. Results: Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used for organizing sequence data from prokaryotic genomes of particular interest in China. GenoChore, a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya) has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition they provide a weekly update of searches against the world-wide protein sequences data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, as well as providing extra facilities to query sequences using predefined sequence patterns. Conclusion: This growing set of specialized microbial databases organizes data created by the first Chinese bacterial genome programs (ThermaList, Thermoanaerobacter tengcongensis, LeptoList, with two different genomes of Leptospira interrogans and SepiList, Staphylococcus epidermidis) associated to related organisms for comparison. PMID:15698474

  18. Neural Encoding and Integration of Learned Probabilistic Sequences in Avian Sensory-Motor Circuitry

    PubMed Central

    Brainard, Michael S.

    2013-01-01

    Many complex behaviors, such as human speech and birdsong, reflect a set of categorical actions that can be flexibly organized into variable sequences. However, little is known about how the brain encodes the probabilities of such sequences. Behavioral sequences are typically characterized by the probability of transitioning from a given action to any subsequent action (which we term “divergence probability”). In contrast, we hypothesized that neural circuits might encode the probability of transitioning to a given action from any preceding action (which we term “convergence probability”). The convergence probability of repeatedly experienced sequences could naturally become encoded by Hebbian plasticity operating on the patterns of neural activity associated with those sequences. To determine whether convergence probability is encoded in the nervous system, we investigated how auditory-motor neurons in vocal premotor nucleus HVC of songbirds encode different probabilistic characterizations of produced syllable sequences. We recorded responses to auditory playback of pseudorandomly sequenced syllables from the bird's repertoire, and found that variations in responses to a given syllable could be explained by a positive linear dependence on the convergence probability of preceding sequences. Furthermore, convergence probability accounted for more response variation than other probabilistic characterizations, including divergence probability. Finally, we found that responses integrated over >7–10 syllables (∼700–1000 ms) with the sign, gain, and temporal extent of integration depending on convergence probability. Our results demonstrate that convergence probability is encoded in sensory-motor circuitry of the song-system, and suggest that encoding of convergence probability is a general feature of sensory-motor circuits. PMID:24198363
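
    The two probabilities defined above differ only in how a transition count is normalized: divergence probability divides by all transitions leaving the preceding syllable, while convergence probability divides by all transitions arriving at the following syllable. A minimal sketch of both estimates, with a hypothetical function name and toy syllable strings rather than recorded song data:

```python
from collections import Counter


def transition_probabilities(sequences):
    """Estimate divergence P(next=b | current=a) and convergence P(prev=a | current=b)."""
    pair_counts = Counter()
    outgoing = Counter()   # transitions leaving each syllable
    incoming = Counter()   # transitions arriving at each syllable
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            outgoing[a] += 1
            incoming[b] += 1
    divergence = {(a, b): n / outgoing[a] for (a, b), n in pair_counts.items()}
    convergence = {(a, b): n / incoming[b] for (a, b), n in pair_counts.items()}
    return divergence, convergence


# Toy songs: each character is one syllable.
songs = ["abc", "abd", "cbd"]
divergence, convergence = transition_probabilities(songs)
print(divergence[("a", "b")], convergence[("a", "b")])
```

    In the toy data, syllable a is always followed by b, so the divergence probability of the transition from a to b is 1, yet b is also reached from c, so its convergence probability is only 2/3. The same transition can therefore score very differently under the two characterizations.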

  19. The zero age main sequence of WIMP burners

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fairbairn, Malcolm; Scott, Pat; Edsjoe, Joakim

    2008-02-15

    We modify a stellar structure code to estimate the effect upon the main sequence of the accretion of weakly-interacting dark matter onto stars and its subsequent annihilation. The effect upon the stars depends upon whether the energy generation rate from dark matter annihilation is large enough to shut off the nuclear burning in the star. Main sequence weakly-interacting massive particle (WIMP) burners look much like proto-stars moving on the Hayashi track, although they are in principle completely stable. We make some brief comments about where such stars could be found, how they might be observed and more detailed simulations which are currently in progress. Finally we comment on whether or not it is possible to link the paradoxically hot, young stars found at the galactic center with WIMP burners.

  20. Standard Mutation Nomenclature in Molecular Diagnostics

    PubMed Central

    Ogino, Shuji; Gulley, Margaret L.; den Dunnen, Johan T.; Wilson, Robert B.

    2007-01-01

    To translate basic research findings into clinical practice, it is essential that information about mutations and variations in the human genome is communicated easily and unequivocally. Unfortunately, there has been much confusion regarding the description of genetic sequence variants. This is largely because research articles that first report novel sequence variants do not often use standard nomenclature, and the final genomic sequence is compiled over many separate entries. In this article, we discuss issues crucial to clear communication, using examples of genes that are commonly assayed in clinical laboratories. Although molecular diagnostics is a dynamic field, this should not inhibit the need for and movement toward consensus nomenclature for accurate reporting among laboratories. Our aim is to alert laboratory scientists and other health care professionals to the important issues and provide a foundation for further discussions that will ultimately lead to solutions. PMID:17251329

  1. Molecular epidemiology of Oropouche virus, Brazil.

    PubMed

    Vasconcelos, Helena Baldez; Nunes, Márcio R T; Casseb, Lívia M N; Carvalho, Valéria L; Pinto da Silva, Eliana V; Silva, Mayra; Casseb, Samir M M; Vasconcelos, Pedro F C

    2011-05-01

    Oropouche virus (OROV) is the causative agent of Oropouche fever, an urban febrile arboviral disease widespread in South America, with >30 epidemics reported in Brazil and other Latin American countries during 1960-2009. To describe the molecular epidemiology of OROV, we analyzed the entire N gene sequences (small RNA) of 66 strains and 35 partial Gn (medium RNA) and large RNA gene sequences. Distinct patterns of OROV strain clustered according to N, Gn, and large gene sequences, which suggests that each RNA segment had a different evolutionary history and that the classification in genotypes must consider the genetic information for all genetic segments. Finally, time-scale analysis based on the N gene showed that OROV emerged in Brazil ≈223 years ago and that genotype I (based on N gene data) was responsible for the emergence of all other genotypes and for virus dispersal.

  2. A Folding Zone in the Ribosomal Exit Tunnel for Kv1.3 Helix Formation

    PubMed Central

    Tu, LiWei; Deutsch, Carol

    2010-01-01

    Although it is now clear that protein secondary structure can be acquired early, while the nascent peptide resides within the ribosomal exit tunnel, the principles governing folding of native polytopic proteins have not yet been elucidated. We now report an extensive investigation of native Kv1.3, a voltage-gated K+ channel, including transmembrane and linker segments synthesized in sequence. These native segments form helices vectorially (N- to C-terminus) only in a permissive vestibule located in the last 20 Å of the tunnel. Native linker sequences similarly fold in this vestibule. Finally, secondary structure acquired in the ribosome is retained in the translocon. These findings emerge from accessibility studies of a diversity of native transmembrane and linker sequences and may therefore be applicable to protein biogenesis in general. PMID:20060838

  3. An insight into the sialotranscriptome of the seed-feeding bug, Oncopeltus fasciatus.

    PubMed

    Francischetti, Ivo M B; Lopes, Angela H; Dias, Felipe A; Pham, Van M; Ribeiro, José M C

    2007-09-01

    The salivary transcriptome of the seed-feeding hemipteran, Oncopeltus fasciatus (milkweed bug), is described following assembly of 1025 expressed sequence tags (ESTs) into 305 clusters of related sequences. Inspection of these sequences reveals abundance of low complexity, putative secreted products rich in the amino acids (aa) glycine, serine or threonine, which might function as silk or mucins and assist food canal lubrication and sealing of the feeding site around the mouthparts. Several protease inhibitors were found, including abundant expression of cystatin transcripts that may inhibit cysteine proteases common in seeds that might injure the insect or induce plant apoptosis. Serine proteases and lipases are described that might assist digestion and liquefaction of seed proteins and oils. Finally, several novel putative proteins are described with no known function that might affect plant physiology or act as antimicrobials.

  4. Object-Oriented Query Language For Events Detection From Images Sequences

    NASA Astrophysics Data System (ADS)

    Ganea, Ion Eugen

    2015-09-01

    This paper presents a method for representing events extracted from image sequences, together with the query language used for event detection. Using an object-oriented model, the spatial and temporal relationships between salient objects, and also between events, are stored and queried. This work aims to unify the storing and querying phases of video event processing. The object-oriented language syntax used for event processing allows the instantiation of the index classes in order to improve the accuracy of the query results. The experiments were performed on image sequences from the sports domain, and they show the reliability and robustness of the proposed language. To extend the language, a specific syntax will be added for constructing templates for abnormal events and for detecting incidents, which is the final goal of the research.

  5. [Using exon combined target region capture sequencing chip to detect the disease-causing genes of retinitis pigmentosa].

    PubMed

    Rong, Weining; Chen, Xuejuan; Li, Huiping; Liu, Yani; Sheng, Xunlun

    2014-06-01

    To detect the disease-causing genes of 10 retinitis pigmentosa (RP) pedigrees by using an exon combined target region capture sequencing chip. Pedigree investigation study. From October 2010 to December 2013, 10 RP pedigrees were recruited for this study at Ningxia Eye Hospital. All the patients and family members received complete ophthalmic examinations. DNA was extracted from patients, family members and controls. An exon combined target region capture sequencing chip was used to screen for candidate disease-causing mutations. Polymerase chain reaction (PCR) and direct sequencing were used to confirm the disease-causing mutations. Seventy patients and 23 normal family members were recruited from 10 pedigrees. Among the 10 RP pedigrees, 1 was an autosomal dominant pedigree and 9 were autosomal recessive pedigrees. Seven mutations in 5 genes were detected in 5 pedigrees. A frameshift mutation in the BBS7 gene was detected in pedigree No. 2; the patients of this pedigree also had central obesity, polydactyly and mental handicap, and the pedigree was finally diagnosed with Bardet-Biedl syndrome. A missense mutation was detected in each of pedigrees No. 7 and No. 10. Because the patients also suffered from deafness, the final diagnosis was Usher syndrome. A missense mutation in the C3 gene, which is related to age-related macular degeneration, was also detected in pedigree No. 7. A nonsense mutation and a missense mutation in the CRB1 gene were detected in pedigree No. 1, and a splice-site mutation in the PROM1 gene was detected in pedigree No. 5. Retinitis pigmentosa is a genetic eye disease with diverse clinical phenotypes. Rapid and effective genetic diagnosis technology combined with analysis of clinical characteristics helps to improve the clinical diagnosis of RP.

  6. Removal of organic matter and ammonium from landfill leachate through different scenarios: Operational cost evaluation in a full-scale case study of a Flemish landfill.

    PubMed

    Oloibiri, Violet; Chys, Michael; De Wandel, Stijn; Demeestere, Kristof; Van Hulle, Stijn W H

    2017-12-01

    Several scenarios are available to landfilling facilities to effectively treat leachate at the lowest possible cost. In this study, the performance of various leachate treatment sequences to remove COD and nitrogen from a leachate stream and the associated cost are presented. The results show that, to achieve 100% nitrogen removal, autotrophic nitrogen removal (ANR) or a combination of ANR and nitrification-denitrification (N-dN) is more cost effective than using only the N-dN process (0.58 €/m³) without changing the leachate polishing costs associated with granular activated carbon (GAC). Treatment of N-dN effluent by ozonation or coagulation led to the reduction of the COD concentration by 10% and 59% respectively before GAC adsorption. This reduced GAC costs and subsequently reduced the overall treatment costs by 7% (ozonation) and 22% (coagulation). On the contrary, using Fenton oxidation to reduce the COD concentration of N-dN effluent by 63% increased the overall leachate treatment costs by 3%. Leachate treatment sequences employing ANR for nitrogen removal followed by ozonation or Fenton or coagulation for COD removal and final polishing with GAC are on average 33% cheaper than a sequence with N-dN + GAC only. When ANR is the preceding step and GAC the final step, the choice of AOP (i.e., ozonation or Fenton) did not affect the total treatment costs, which amounted to 1.43 €/m³ (ozonation) and 1.42 €/m³ (Fenton). In all the investigated leachate treatment trains, the sequence with ANR + coagulation + GAC is the most cost effective at 0.94 €/m³. Copyright © 2016 Elsevier Ltd. All rights reserved.

  7. MoRFPred-plus: Computational Identification of MoRFs in Protein Sequences using Physicochemical Properties and HMM profiles.

    PubMed

    Sharma, Ronesh; Bayarjargal, Maitsetseg; Tsunoda, Tatsuhiko; Patil, Ashwini; Sharma, Alok

    2018-01-21

    Intrinsically Disordered Proteins (IDPs) lack stable tertiary structure and actively participate in performing various biological functions. These IDPs expose short binding regions called Molecular Recognition Features (MoRFs) that permit interaction with structured protein regions. Upon interaction they undergo a disorder-to-order transition, as a result of which their functionality arises. Predicting these MoRFs in disordered protein sequences is a challenging task. In this study, we present MoRFpred-plus, an improved predictor over our previously proposed predictor for identifying MoRFs in disordered protein sequences. Two independent propensity scores are computed by incorporating physicochemical properties and HMM profiles, and these scores are combined to predict the final MoRF propensity score for a given residue. The first score reflects the characteristics of a query residue to be part of a MoRF region based on the composition and similarity of assumed MoRF and flank regions. The second score reflects the characteristics of a query residue to be part of a MoRF region based on the properties of the flanks around the given residue in the query protein sequence. The propensity scores are processed and common averaging is applied to generate the final prediction score of MoRFpred-plus. The performance of the proposed predictor is compared with the available MoRF predictors MoRFchibi, MoRFpred, and ANCHOR. Using previously collected training and test sets used to evaluate the mentioned predictors, the proposed predictor outperforms these predictors and generates a lower false positive rate. In addition, MoRFpred-plus is a downloadable predictor, which makes it useful as it can be used as input to other computational tools. https://github.com/roneshsharma/MoRFpred-plus/wiki/MoRFpred-plus:-Download. Copyright © 2017 Elsevier Ltd. All rights reserved.
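
    The score-combination step described in this abstract can be illustrated with a minimal sketch. It assumes that "common averaging" means a plain per-residue arithmetic mean of the two propensity scores; the abstract does not specify the preprocessing or any decision threshold, so the function names and the 0.5 cut-off below are hypothetical and are not taken from the MoRFpred-plus code.

```python
from typing import List


def combine_scores(score_a: List[float], score_b: List[float]) -> List[float]:
    """Average two per-residue propensity scores position by position.

    Assumption: a simple arithmetic mean; the published method may
    process the scores differently before averaging.
    """
    if len(score_a) != len(score_b):
        raise ValueError("both score vectors must cover the same residues")
    return [(a + b) / 2.0 for a, b in zip(score_a, score_b)]


def predicted_morf_residues(scores: List[float], threshold: float = 0.5) -> List[int]:
    """Return 0-based indices of residues whose combined score reaches the threshold.

    The threshold value is a placeholder chosen for illustration only.
    """
    return [i for i, s in enumerate(scores) if s >= threshold]


# Toy example with three residues.
combined = combine_scores([0.25, 1.0, 0.0], [0.75, 0.5, 0.25])
flagged = predicted_morf_residues(combined)
```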

  8. Instances of erroneous DNA barcoding of metazoan invertebrates: Are universal cox1 gene primers too "universal"?

    PubMed

    Mioduchowska, Monika; Czyż, Michał Jan; Gołdyn, Bartłomiej; Kur, Jarosław; Sell, Jerzy

    2018-01-01

    The cytochrome c oxidase subunit I (cox1) gene is the main mitochondrial molecular marker playing a pivotal role in phylogenetic research and is a crucial barcode sequence. Folmer's "universal" primers designed to amplify this gene in metazoan invertebrates allowed quick and easy barcode and phylogenetic analysis. On the other hand, the increase in the number of studies on barcoding leads to more frequent publishing of incorrect sequences, due to amplification of non-target taxa and insufficient analysis of the obtained sequences. Consequently, some sequences deposited in genetic databases are incorrectly described as obtained from invertebrates, while being in fact bacterial sequences. In our study, in which we used Folmer's primers to amplify COI sequences of the crustacean fairy shrimp Branchipus schaefferi (Fischer 1834), we also obtained COI sequences of microbial contaminants from Aeromonas sp. However, when we searched the GenBank database for sequences closely matching these contaminations we found entries described as representatives of Gastrotricha and Mollusca. When these entries were compared with other sequences bearing the same names in the database, the genetic distance between the incorrect and correct sequences amplified from the same species was ca. 65%. Although the responsibility for the correct molecular identification of species rests on researchers, the errors found in already published sequence data have not been re-evaluated so far. On the basis of the standard sampling technique we have estimated with 95% probability that the chances of finding incorrectly described metazoan sequences in GenBank depend on the systematic group, and vary from less than 1% (Mollusca and Arthropoda) up to 6.9% (Gastrotricha). Consequently, the increasing popularity of DNA barcoding and metabarcoding analysis may lead to overestimation of species diversity. Finally, the study also discusses the sources of the problems with amplification of non-target sequences.

  9. Mapping-by-sequencing in complex polyploid genomes using genic sequence capture: a case study to map yellow rust resistance in hexaploid wheat.

    PubMed

    Gardiner, Laura-Jayne; Bansept-Basler, Pauline; Olohan, Lisa; Joynson, Ryan; Brenchley, Rachel; Hall, Neil; O'Sullivan, Donal M; Hall, Anthony

    2016-08-01

    Previously we extended the utility of mapping-by-sequencing by combining it with sequence capture and mapping sequence data to pseudo-chromosomes that were organized using wheat-Brachypodium synteny. This, with a bespoke haplotyping algorithm, enabled us to map the flowering time locus in the diploid wheat Triticum monococcum L. identifying a set of deleted genes (Gardiner et al., 2014). Here, we develop this combination of gene enrichment and sliding window mapping-by-synteny analysis to map the Yr6 locus for yellow stripe rust resistance in hexaploid wheat. A 110 MB NimbleGen capture probe set was used to enrich and sequence a doubled haploid mapping population of hexaploid wheat derived from an Avalon and Cadenza cross. The Yr6 locus was identified by mapping to the POPSEQ chromosomal pseudomolecules using a bespoke pipeline and algorithm (Chapman et al., 2015). Furthermore the same locus was identified using newly developed pseudo-chromosome sequences as a mapping reference that are based on the genic sequence used for sequence enrichment. The pseudo-chromosomes allow us to demonstrate the application of mapping-by-sequencing to even poorly defined polyploidy genomes where chromosomes are incomplete and sub-genome assemblies are collapsed. This analysis uniquely enabled us to: compare wheat genome annotations; identify the Yr6 locus - defining a smaller genic region than was previously possible; associate the interval with one wheat sub-genome and increase the density of SNP markers associated. Finally, we built the pipeline in iPlant, making it a user-friendly community resource for phenotype mapping. © 2016 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.

  10. Characterization of HIV Transmission in South-East Austria

    PubMed Central

    Kessler, Harald H.; Haas, Bernhard; Stelzl, Evelyn; Weninger, Karin; Little, Susan J.; Mehta, Sanjay R.

    2016-01-01

    To gain deeper insight into the epidemiology of HIV-1 transmission in South-East Austria we performed a retrospective analysis of 259 HIV-1 partial pol sequences obtained from unique individuals newly diagnosed with HIV infection in South-East Austria from 2008 through 2014. After quality filtering, putative transmission linkages were inferred when two sequences were ≤1.5% genetically different. Multiple linkages were resolved into putative transmission clusters. Further phylogenetic analyses were performed using BEAST v1.8.1. Finally, we investigated putative links between the 259 sequences from South-East Austria and all publicly available HIV polymerase sequences in the Los Alamos National Laboratory HIV sequence database. We found that 45.6% (118/259) of the sampled sequences were genetically linked with at least one other sequence from South-East Austria forming putative transmission clusters. Clustering individuals were more likely to be men who have sex with men (MSM; p<0.001), infected with subtype B (p<0.001) or subtype F (p = 0.02). Among clustered males who reported only heterosexual (HSX) sex as an HIV risk, 47% clustered closely with MSM (either as pairs or within larger MSM clusters). One hundred and seven of the 259 sequences (41.3%) from South-East Austria had at least one putative inferred linkage with sequences from a total of 69 other countries. In conclusion, analysis of HIV-1 sequences from newly diagnosed individuals residing in South-East Austria revealed a high degree of national and international clustering mainly within MSM. Interestingly, we found that a high number of heterosexual males clustered within MSM networks, suggesting either linkage between risk groups or misrepresentation of sexual risk behaviors by subjects. PMID:26967154
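
    The linkage rule in this abstract (two sequences are linked when they are ≤1.5% genetically different, and multiple linkages are resolved into clusters) can be sketched as a connected-components computation. This is a minimal illustration, not the authors' pipeline: it assumes the pairwise genetic distances have already been computed by some method of choice, and the function name and input layout are hypothetical.

```python
from typing import Dict, List, Tuple


def transmission_clusters(
    ids: List[str],
    distances: Dict[Tuple[str, str], float],
    threshold: float = 0.015,
) -> List[List[str]]:
    """Group sequences into putative clusters using a distance threshold.

    `distances` maps an (id_a, id_b) pair to a precomputed genetic
    distance expressed as a fraction (0.015 corresponds to 1.5%).
    Sequences joined by a chain of linkages end up in one cluster.
    """
    parent = {i: i for i in ids}

    def find(x: str) -> str:
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        if d <= threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    # Unlinked sequences (singletons) are not reported as clusters.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


example = transmission_clusters(
    ["A", "B", "C", "D"],
    {("A", "B"): 0.010, ("B", "C"): 0.015, ("A", "C"): 0.020, ("C", "D"): 0.030},
)
```

    In the toy input, A and C are more than 1.5% apart but still share a cluster because both are linked to B, which is what resolving multiple linkages into clusters implies under this single-linkage assumption.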

  11. Characterization of HIV Transmission in South-East Austria.

    PubMed

    Hoenigl, Martin; Chaillon, Antoine; Kessler, Harald H; Haas, Bernhard; Stelzl, Evelyn; Weninger, Karin; Little, Susan J; Mehta, Sanjay R

    2016-01-01

    To gain deeper insight into the epidemiology of HIV-1 transmission in South-East Austria we performed a retrospective analysis of 259 HIV-1 partial pol sequences obtained from unique individuals newly diagnosed with HIV infection in South-East Austria from 2008 through 2014. After quality filtering, putative transmission linkages were inferred when two sequences were ≤1.5% genetically different. Multiple linkages were resolved into putative transmission clusters. Further phylogenetic analyses were performed using BEAST v1.8.1. Finally, we investigated putative links between the 259 sequences from South-East Austria and all publicly available HIV polymerase sequences in the Los Alamos National Laboratory HIV sequence database. We found that 45.6% (118/259) of the sampled sequences were genetically linked with at least one other sequence from South-East Austria forming putative transmission clusters. Clustering individuals were more likely to be men who have sex with men (MSM; p<0.001), infected with subtype B (p<0.001) or subtype F (p = 0.02). Among clustered males who reported only heterosexual (HSX) sex as an HIV risk, 47% clustered closely with MSM (either as pairs or within larger MSM clusters). One hundred and seven of the 259 sequences (41.3%) from South-East Austria had at least one putative inferred linkage with sequences from a total of 69 other countries. In conclusion, analysis of HIV-1 sequences from newly diagnosed individuals residing in South-East Austria revealed a high degree of national and international clustering mainly within MSM. Interestingly, we found that a high number of heterosexual males clustered within MSM networks, suggesting either linkage between risk groups or misrepresentation of sexual risk behaviors by subjects.

  12. Mining new crystal protein genes from Bacillus thuringiensis on the basis of mixed plasmid-enriched genome sequencing and a computational pipeline.

    PubMed

    Ye, Weixing; Zhu, Lei; Liu, Yingying; Crickmore, Neil; Peng, Donghai; Ruan, Lifang; Sun, Ming

    2012-07-01

    We have designed a high-throughput system for the identification of novel crystal protein genes (cry) from Bacillus thuringiensis strains. The system was developed with two goals: (i) to acquire the mixed plasmid-enriched genomic sequence of B. thuringiensis using next-generation sequencing biotechnology, and (ii) to identify cry genes with a computational pipeline (using BtToxin_scanner). In our pipeline method, we employed three different kinds of well-developed prediction methods, BLAST, hidden Markov model (HMM), and support vector machine (SVM), to predict the presence of Cry toxin genes. The pipeline proved to be fast (average speed, 1.02 Mb/min for proteins and open reading frames [ORFs] and 1.80 Mb/min for nucleotide sequences), sensitive (it detected 40% more protein toxin genes than a keyword extraction method using genomic sequences downloaded from GenBank), and highly specific. Twenty-one strains from our laboratory's collection were selected based on their plasmid pattern and/or crystal morphology. The plasmid-enriched genomic DNA was extracted from these strains and mixed for Illumina sequencing. The sequencing data were de novo assembled, and a total of 113 candidate cry sequences were identified using the computational pipeline. Twenty-seven candidate sequences were selected on the basis of their low level of sequence identity to known cry genes, and eight full-length genes were obtained with PCR. Finally, three new cry-type genes (primary ranks) and five cry holotypes, which were designated cry8Ac1, cry7Ha1, cry21Ca1, cry32Fa1, and cry21Da1 by the B. thuringiensis Toxin Nomenclature Committee, were identified. The system described here is both efficient and cost-effective and can greatly accelerate the discovery of novel cry genes.

  13. Fast, accurate and easy-to-pipeline methods for amplicon sequence processing

    NASA Astrophysics Data System (ADS)

    Antonielli, Livio; Sessitsch, Angela

    2016-04-01

    Next generation sequencing (NGS) technologies have been established for years as an essential resource in microbiology. While metagenomic studies can benefit from the continuously increasing throughput of the Illumina (Solexa) technology, the spread of third generation sequencing technologies (PacBio, Oxford Nanopore) is taking whole genome sequencing beyond the assembly of fragmented draft genomes, making it now possible to finish bacterial genomes even without short read correction. Besides (meta)genomic analysis, next-gen amplicon sequencing is still fundamental for microbial studies. Amplicon sequencing of the 16S rRNA gene and ITS (Internal Transcribed Spacer) remains a well-established, widespread method for a multitude of purposes concerning the identification and comparison of archaeal/bacterial (16S rRNA gene) and fungal (ITS) communities occurring in diverse environments. Numerous pipelines have been developed to process NGS-derived amplicon sequences, among which Mothur, QIIME and USEARCH are the most well-known and cited. The entire process from initial raw sequence data through read error correction, paired-end read assembly, primer stripping, quality filtering, clustering, OTU taxonomic classification and BIOM table rarefaction, as well as alternative "normalization" methods, will be addressed. An effective and accurate strategy will be presented using state-of-the-art bioinformatic tools, and an example of a straightforward one-script pipeline for 16S rRNA gene or ITS MiSeq amplicon sequencing will be provided. Finally, instructions on how to automatically retrieve nucleotide sequences from NCBI, and thereby apply the pipeline to targets other than the 16S rRNA gene (Greengenes, SILVA) and ITS (UNITE), will be discussed.

  14. Complete mitochondrial genome sequences of the northern spotted owl (Strix occidentalis caurina) and the barred owl (Strix varia; Aves: Strigiformes: Strigidae) confirm the presence of a duplicated control region

    PubMed Central

    Henderson, James B.; Sellas, Anna B.; Fuchs, Jérôme; Bowie, Rauri C.K.; Dumbacher, John P.

    2017-01-01

    We report here the successful assembly of the complete mitochondrial genomes of the northern spotted owl (Strix occidentalis caurina) and the barred owl (S. varia). We utilized sequence data from two sequencing methodologies, Illumina paired-end sequence data with insert lengths ranging from approximately 250 nucleotides (nt) to 9,600 nt and read lengths from 100–375 nt and Sanger-derived sequences. We employed multiple assemblers and alignment methods to generate the final assemblies. The circular genomes of S. o. caurina and S. varia are comprised of 19,948 nt and 18,975 nt, respectively. Both code for two rRNAs, twenty-two tRNAs, and thirteen polypeptides. They both have duplicated control region sequences with complex repeat structures. We were not able to assemble the control regions solely using Illumina paired-end sequence data. By fully spanning the control regions, Sanger-derived sequences enabled accurate and complete assembly of these mitochondrial genomes. These are the first complete mitochondrial genome sequences of owls (Aves: Strigiformes) possessing duplicated control regions. We searched the nuclear genome of S. o. caurina for copies of mitochondrial genes and found at least nine separate stretches of nuclear copies of gene sequences originating in the mitochondrial genome (Numts). The Numts ranged from 226–19,522 nt in length and included copies of all mitochondrial genes except tRNAPro, ND6, and tRNAGlu. Strix occidentalis caurina and S. varia exhibited an average of 10.74% (8.68% uncorrected p-distance) divergence across the non-tRNA mitochondrial genes. PMID:29038757
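
    The uncorrected p-distance quoted in this abstract is the proportion of aligned sites at which two sequences differ. The following is a minimal sketch of that calculation for two pre-aligned sequences; skipping alignment columns that contain a gap is an assumption made here for illustration, and the function name is hypothetical. The corrected divergence figure in the abstract relies on a substitution model that is not reproduced here.

```python
def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Uncorrected p-distance between two aligned sequences.

    Columns where either sequence has a gap are ignored (an assumption;
    other gap treatments exist).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# One mismatch over eight compared sites.
d = p_distance("ACGTACGT", "ACGTACGA")
```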

  15. Diagnosis of local hepatic tuberculosis through next-generation sequencing: Smarter, faster and better.

    PubMed

    Ai, Jing-Wen; Li, Yang; Cheng, Qi; Cui, Peng; Wu, Hong-Long; Xu, Bin; Zhang, Wen-Hong

    2018-06-01

    A 45-year-old man who complained of continuous fever and multiple hepatic masses was admitted to our hospital. Repeated MRI manifestations were similar, while the radiological reports suggested contradictory diagnoses, pointing to infection or malignancy respectively. Pathologic examination of the liver tissue showed no direct evidence of either infection or tumor. We performed next-generation sequencing on the liver tissue and peripheral blood to further investigate the possible etiology. High throughput sequencing was performed on the liver lesion tissues using the BGISEQ-100 platform, and data were mapped to the Microbial Genome Databases after filtering out low quality data and human reads. We identified a total of 299 sequencing reads of Mycobacterium tuberculosis (M. tuberculosis) complex sequences from the liver tissue, including 8,229 of 4,424,435 of the M. tuberculosis nucleotide sequences; Mycobacterium africanum, Mycobacterium bovis, and Mycobacterium canettii were also detected owing to the 99.9% sequence identity among these strains. No specific Mycobacterium tuberculosis nucleotide sequence was detected in the peripheral blood sample. The patient's symptoms quickly resolved after anti-tuberculosis treatment, and repeated Ziehl-Neelsen staining of the liver tissue finally identified small numbers of positive bacilli. The diagnosis of this patient was difficult to establish before the next-generation sequencing because of contradictory radiological results and negative pathological findings. More sensitive diagnostic methods are urgently needed. This is the first case report of hepatic tuberculosis confirmed by next-generation sequencing, and it marks the promising potential of next-generation sequencing in the diagnosis of hepatic lesions of unknown etiology. Copyright © 2018 Elsevier Masson SAS. All rights reserved.

  16. CpG PatternFinder: a Windows-based utility program for easy and rapid identification of the CpG methylation status of DNA.

    PubMed

    Xu, Yi-Hua; Manoharan, Herbert T; Pitot, Henry C

    2007-09-01

    The bisulfite genomic sequencing technique is one of the most widely used techniques to study sequence-specific DNA methylation because of its unambiguous ability to reveal DNA methylation status at single-nucleotide resolution. One characteristic feature of the bisulfite genomic sequencing technique is that a number of sample sequence files are produced from a single DNA sample. The PCR products of bisulfite-treated DNA samples cannot be sequenced directly because they are heterogeneous in nature; therefore they must be cloned into suitable plasmids and then sequenced. This procedure generates an enormous number of sample DNA sequence files and also adds extra bases belonging to the plasmids to the sequence, which causes problems in the final sequence comparison. Finding the methylation status of each CpG in each sample sequence is not an easy job, and CpG PatternFinder was developed for this purpose. The main functions of CpG PatternFinder are: (i) to analyze the reference sequence to obtain CpG and non-CpG-C residue position information; (ii) to tailor sample sequence files (delete insertions and mark deletions from the sample sequence files) based on a configuration of ClustalW multiple alignment; (iii) to align sample sequence files with a reference file to obtain bisulfite conversion efficiency and CpG methylation status; and (iv) to produce graphics, highlighted aligned sequence text and a summary report which can be easily exported to the Microsoft Office suite. CpG PatternFinder is designed to operate cooperatively with BioEdit, a freeware program available on the internet. It can handle up to 100 files of sample DNA sequences simultaneously, and the total CpG pattern analysis process can be finished in minutes. CpG PatternFinder is an ideal software tool for DNA methylation studies to determine the differential methylation pattern in a large number of individuals in a population. Previously we developed the CpG Analyzer program; CpG PatternFinder is our further effort to create software tools for DNA methylation studies.

  17. The complete chloroplast genome sequences of Lychnis wilfordii and Silene capitata and comparative analyses with other Caryophyllaceae genomes.

    PubMed

    Kang, Jong-Soo; Lee, Byoung Yoon; Kwak, Myounghai

    2017-01-01

    The complete chloroplast genomes of Lychnis wilfordii and Silene capitata were determined and compared with ten previously reported Caryophyllaceae chloroplast genomes. The chloroplast genome sequences of L. wilfordii and S. capitata contain 152,320 bp and 150,224 bp, respectively. The gene contents and orders among 12 Caryophyllaceae species are consistent, but several microstructural changes have occurred. Expansion of the inverted repeat (IR) regions at the large single copy (LSC)/IRb and small single copy (SSC)/IR boundaries led to partial or entire gene duplications. Additionally, rearrangements of the LSC region were caused by gene inversions and/or transpositions. The 18 kb inversions, which occurred three times in different lineages of tribe Sileneae, were thought to be facilitated by the intermolecular duplicated sequences. Sequence analyses of the L. wilfordii and S. capitata genomes revealed 39 and 43 repeats, respectively, including forward, palindromic, and reverse repeats. In addition, a total of 67 and 56 simple sequence repeats were discovered in the L. wilfordii and S. capitata chloroplast genomes, respectively. Finally, we constructed phylogenetic trees of the 12 Caryophyllaceae species and two Amaranthaceae species based on 73 protein-coding genes using both maximum parsimony and likelihood methods.

  18. A New Method for Setting Calculation Sequence of Directional Relay Protection in Multi-Loop Networks

    NASA Astrophysics Data System (ADS)

    Haijun, Xiong; Qi, Zhang

    2016-08-01

    The workload of relay protection setting calculation in multi-loop networks may be reduced effectively by optimizing the setting calculation sequence. A new method for determining the setting calculation sequence of directional distance relay protection in multi-loop networks, based on the minimum broken nodes cost vector (MBNCV), was proposed to solve the problems experienced with current methods. Existing methods based on the minimum breakpoint set (MBPS) lead to more broken edges when untying the loops in the dependency relationships of relays, possibly leading to more iterative calculation workload in setting calculations. A model-driven approach based on behavior trees (BT) was presented to improve adaptability to similar problems. After extending the BT model by adding real-time system characteristics, a timed BT was derived and the dependency relationships in multi-loop networks were then modeled. The model was translated into communication sequence process (CSP) models and an optimized setting calculation sequence in multi-loop networks was finally calculated by tools. A 5-node multi-loop network was used as an example to demonstrate the effectiveness of the modeling and calculation method. Several examples were then calculated, with results indicating that the method effectively reduces the number of forced broken edges for protection setting calculation in multi-loop networks.

  19. Homopolymer tail-mediated ligation PCR: a streamlined and highly efficient method for DNA cloning and library construction.

    PubMed

    Lazinski, David W; Camilli, Andrew

    2013-01-01

    The amplification of DNA fragments, cloned between user-defined 5' and 3' end sequences, is a prerequisite step in the use of many current applications including massively parallel sequencing (MPS). Here we describe an improved method, called homopolymer tail-mediated ligation PCR (HTML-PCR), that requires very little starting template, minimal hands-on effort, is cost-effective, and is suited for use in high-throughput and robotic methodologies. HTML-PCR starts with the addition of homopolymer tails of controlled lengths to the 3' termini of a double-stranded genomic template. The homopolymer tails enable the annealing-assisted ligation of a hybrid oligonucleotide to the template's recessed 5' ends. The hybrid oligonucleotide has a user-defined sequence at its 5' end. This primer, together with a second primer composed of a longer region complementary to the homopolymer tail and fused to a second 5' user-defined sequence, are used in a PCR reaction to generate the final product. The user-defined sequences can be varied to enable compatibility with a wide variety of downstream applications. We demonstrate our new method by constructing MPS libraries starting from nanogram and sub-nanogram quantities of Vibrio cholerae and Streptococcus pneumoniae genomic DNA.

  20. Integrated analysis of RNA-binding protein complexes using in vitro selection and high-throughput sequencing and sequence specificity landscapes (SEQRS).

    PubMed

    Lou, Tzu-Fang; Weidmann, Chase A; Killingsworth, Jordan; Tanaka Hall, Traci M; Goldstrohm, Aaron C; Campbell, Zachary T

    2017-04-15

    RNA-binding proteins (RBPs) collaborate to control virtually every aspect of RNA function. Tremendous progress has been made in the area of global assessment of RBP specificity using next-generation sequencing approaches both in vivo and in vitro. Understanding how protein-protein interactions enable precise combinatorial regulation of RNA remains a significant problem. Addressing this challenge requires tools that can quantitatively determine the specificities of both individual proteins and multimeric complexes in an unbiased and comprehensive way. One approach utilizes in vitro selection, high-throughput sequencing, and sequence-specificity landscapes (SEQRS). We outline a SEQRS experiment focused on obtaining the specificity of a multi-protein complex between Drosophila RBPs Pumilio (Pum) and Nanos (Nos). We discuss the necessary controls in this type of experiment and examine how the resulting data can be complemented with structural and cell-based reporter assays. Additionally, SEQRS data can be integrated with functional genomics data to uncover biological function. Finally, we propose extensions of the technique that will enhance our understanding of multi-protein regulatory complexes assembled onto RNA. Copyright © 2016 Elsevier Inc. All rights reserved.

  1. Human action classification using procrustes shape theory

    NASA Astrophysics Data System (ADS)

    Cho, Wanhyun; Kim, Sangkyoon; Park, Soonyoung; Lee, Myungeun

    2015-02-01

    In this paper, we propose a new method that classifies a human action using Procrustes shape theory. First, we extract a pre-shape configuration vector of landmarks from each frame of an image sequence representing an arbitrary human action, and then derive the Procrustes fit vector for each pre-shape configuration vector. Second, we extract a set of pre-shape vectors from the training samples stored in a database, and compute a Procrustes mean shape vector for these pre-shape vectors. Third, we extract a sequence of pre-shape vectors from the input video, and project this sequence onto the tangent space, taking as pole the sequence of mean shape vectors corresponding to a target video. We then calculate the Procrustes distance between the two sequences, the projected pre-shape vectors on the tangent space and the mean shape vectors. Finally, we classify the input video into the human action class with the minimum Procrustes distance. We assess the performance of the proposed method using one public dataset, the Weizmann human action dataset. Experimental results reveal that the proposed method performs very well on this dataset.
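
    The minimum-distance classification idea in this abstract can be illustrated with a simplified sketch. It works on a single frame of 2D landmarks rather than on frame sequences, omits the tangent-space projection, and uses the standard full Procrustes distance for planar shapes written with complex coordinates. The function names and the toy class means are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def preshape(landmarks: Sequence[Point]) -> List[complex]:
    """Centre the landmarks and scale to unit norm (removes location and size)."""
    z = [complex(x, y) for x, y in landmarks]
    centroid = sum(z) / len(z)
    centred = [p - centroid for p in z]
    norm = math.sqrt(sum(abs(p) ** 2 for p in centred))
    if norm == 0:
        raise ValueError("all landmarks coincide")
    return [p / norm for p in centred]


def procrustes_distance(shape_a: Sequence[Point], shape_b: Sequence[Point]) -> float:
    """Full Procrustes distance between two planar landmark configurations."""
    if len(shape_a) != len(shape_b):
        raise ValueError("shapes need the same number of landmarks")
    za, zb = preshape(shape_a), preshape(shape_b)
    # The modulus of the complex inner product is invariant to rotation.
    inner = sum(a.conjugate() * b for a, b in zip(za, zb))
    return math.sqrt(max(0.0, 1.0 - abs(inner) ** 2))


def classify(query: Sequence[Point], class_means: Dict[str, Sequence[Point]]) -> str:
    """Assign the query to the class whose mean shape is nearest."""
    return min(class_means, key=lambda c: procrustes_distance(query, class_means[c]))


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
flat = [(0, 0), (3, 0), (3, 0.2), (0, 0.2)]
# The square rotated by 90 degrees, doubled in size and shifted.
moved_square = [(5, 5), (5, 7), (3, 7), (3, 5)]
label = classify(moved_square, {"square": square, "flat": flat})
```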

  2. Poly(A)-tag deep sequencing data processing to extract poly(A) sites.

    PubMed

    Wu, Xiaohui; Ji, Guoli; Li, Qingshun Quinn

    2015-01-01

    Polyadenylation [poly(A)] is an essential posttranscriptional processing step in the maturation of eukaryotic mRNA. The advent of next-generation sequencing (NGS) technology has offered feasible means to generate large-scale data and new opportunities for intensive study of polyadenylation, particularly deep sequencing of the transcriptome targeting the junction of 3'-UTR and the poly(A) tail of the transcript. To take advantage of this unprecedented amount of data, we present an automated workflow to identify polyadenylation sites by integrating NGS data cleaning, processing, mapping, normalizing, and clustering. In this pipeline, a series of Perl scripts are seamlessly integrated to iteratively map the single- or paired-end sequences to the reference genome. After mapping, the poly(A) tags (PATs) at the same genome coordinate are grouped into one cleavage site, and the internal priming artifacts removed. Then the ambiguous region is introduced to parse the genome annotation for cleavage site clustering. Finally, cleavage sites within a close range of 24 nucleotides and from different samples can be clustered into poly(A) clusters. This procedure could be used to identify thousands of reliable poly(A) clusters from millions of NGS sequences in different tissues or treatments.
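
    The final clustering step in this abstract (cleavage sites within a close range of 24 nucleotides are grouped into poly(A) clusters) can be sketched as follows. This is a minimal illustration in Python rather than the Perl scripts the abstract mentions. It assumes the sites lie on one chromosome and strand and that a site joins the current cluster when it is at most 24 nt from the previous site; the exact rule used by the authors is not given in the abstract, and the function name is hypothetical.

```python
from typing import Iterable, List


def cluster_cleavage_sites(positions: Iterable[int], max_gap: int = 24) -> List[List[int]]:
    """Group sorted cleavage-site coordinates into poly(A) clusters.

    A new cluster starts whenever the distance to the previous site
    exceeds `max_gap` nucleotides.
    """
    clusters: List[List[int]] = []
    for pos in sorted(set(positions)):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


sites = [300, 100, 110, 130, 200, 224]
pac = cluster_cleavage_sites(sites)
```

    With this chaining rule a cluster can span more than 24 nt in total, since only the gap between neighbouring sites is checked; a stricter variant would cap the overall cluster width instead.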

  3. Probing the Rare Biosphere of the North-West Mediterranean Sea: An Experiment with High Sequencing Effort.

    PubMed

    Crespo, Bibiana G; Wallhead, Philip J; Logares, Ramiro; Pedrós-Alió, Carlos

    2016-01-01

    High-throughput sequencing (HTS) techniques have suggested the existence of a wealth of species with very low relative abundance: the rare biosphere. We attempted to exhaustively map this rare biosphere in two water samples by performing an exceptionally deep pyrosequencing analysis (~500,000 final reads per sample). Species data were derived by a 97% identity criterion and various parametric distributions were fitted to the observed counts. Using the best-fitting Sichel distribution we estimate a total species richness of 1,568-1,669 (95% Credible Interval) and 5,027-5,196 for surface and deep water samples respectively, implying that 84-89% of the total richness in those two samples was sequenced, and we predict that a quadrupling of the present sequencing effort would suffice to observe 90% of the total richness in both samples. Comparing the HTS results with a culturing approach we found that most of the cultured taxa were not obtained by HTS, despite the high sequencing effort. Culturing therefore remains a useful tool for uncovering marine bacterial diversity, in addition to its other uses for studying the ecology of marine bacteria.

  4. HMG-D is an architecture-specific protein that preferentially binds to DNA containing the dinucleotide TG.

    PubMed Central

    Churchill, M E; Jones, D N; Glaser, T; Hefner, H; Searles, M A; Travers, A A

    1995-01-01

    The high mobility group (HMG) protein HMG-D from Drosophila melanogaster is a highly abundant chromosomal protein that is closely related to the vertebrate HMG domain proteins HMG1 and HMG2. In general, chromosomal HMG domain proteins lack sequence specificity. However, using both NMR spectroscopy and standard biochemical techniques we show that binding of HMG-D to a single DNA site is sequence selective. The preferred duplex DNA binding site comprises at least 5 bp and contains the deformable dinucleotide TG embedded in A/T-rich sequences. The TG motif constitutes a common core element in the binding sites of the well-characterized sequence-specific HMG domain proteins. We show that a conserved aromatic residue in helix 1 of the HMG domain may be involved in recognition of this core sequence. In common with other HMG domain proteins HMG-D binds preferentially to DNA sites that are stably bent and underwound, therefore HMG-D can be considered an architecture-specific protein. Finally, we show that HMG-D bends DNA and may confer a superhelical DNA conformation at a natural DNA binding site in the Drosophila fushi tarazu scaffold-associated region. PMID:7720717

  5. PRIMAL: Page Rank-Based Indoor Mapping and Localization Using Gene-Sequenced Unlabeled WLAN Received Signal Strength

    PubMed Central

    Zhou, Mu; Zhang, Qiao; Xu, Kunjie; Tian, Zengshan; Wang, Yanmeng; He, Wei

    2015-01-01

    Due to the wide deployment of wireless local area networks (WLAN), received signal strength (RSS)-based indoor WLAN localization has attracted considerable attention in both academia and industry. In this paper, we propose a novel page rank-based indoor mapping and localization (PRIMAL) by using the gene-sequenced unlabeled WLAN RSS for simultaneous localization and mapping (SLAM). Specifically, first of all, based on the observation of the motion patterns of the people in the target environment, we use the Allen logic to construct the mobility graph to characterize the connectivity among different areas of interest. Second, the concept of gene sequencing is utilized to assemble the sporadically-collected RSS sequences into a signal graph based on the transition relations among different RSS sequences. Third, we apply the graph drawing approach to exhibit both the mobility graph and signal graph in a more readable manner. Finally, the page rank (PR) algorithm is proposed to construct the mapping from the signal graph into the mobility graph. The experimental results show that the proposed approach achieves satisfactory localization accuracy and meanwhile avoids the intensive time and labor cost involved in the conventional location fingerprinting-based indoor WLAN localization. PMID:26404274
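
    The abstract above names the page rank (PR) algorithm but does not give its details. As a rough illustration only, the following is a generic power-iteration PageRank on a small directed graph. It is not the PRIMAL mapping procedure itself; the graph, node names, damping factor, and tolerance are all assumptions made for this sketch.

```python
def pagerank(graph, damping=0.85, tol=1e-10, max_iter=200):
    """Generic power-iteration PageRank on a dict {node: [successor, ...]}.

    Every successor must also appear as a key of `graph`.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        # Each node passes an equal share of its rank to every successor.
        for v in nodes:
            for w in graph[v]:
                new_rank[w] += damping * rank[v] / len(graph[v])
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical signal graph: nodes stand for RSS sequences,
# edges for observed transitions between them (illustrative only).
signal_graph = {"s1": ["s2"], "s2": ["s3"], "s3": ["s1", "s2"]}
ranks = pagerank(signal_graph)
```

    In this toy graph, node s2 receives links from both other nodes and therefore ends up with the largest score.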

  6. Whole exome sequencing identifies novel genes for fetal hemoglobin response to hydroxyurea in children with sickle cell anemia.

    PubMed

    Sheehan, Vivien A; Crosby, Jacy R; Sabo, Aniko; Mortier, Nicole A; Howard, Thad A; Muzny, Donna M; Dugan-Perez, Shannon; Aygun, Banu; Nottage, Kerri A; Boerwinkle, Eric; Gibbs, Richard A; Ware, Russell E; Flanagan, Jonathan M

    2014-01-01

    Hydroxyurea has proven efficacy in children and adults with sickle cell anemia (SCA), but with considerable inter-individual variability in the amount of fetal hemoglobin (HbF) produced. Sibling and twin studies indicate that some of that drug response variation is heritable. To test the hypothesis that genetic modifiers influence pharmacological induction of HbF, we investigated phenotype-genotype associations using whole exome sequencing of children with SCA treated prospectively with hydroxyurea to maximum tolerated dose (MTD). We analyzed 171 unrelated patients enrolled in two prospective clinical trials, all treated with dose escalation to MTD. We examined two MTD drug response phenotypes: ΔHbF (final %HbF minus baseline %HbF), and final %HbF. Analyzing individual genetic variants, we identified multiple low frequency and common variants associated with HbF induction by hydroxyurea. A validation cohort of 130 pediatric sickle cell patients treated to MTD with hydroxyurea was genotyped for 13 non-synonymous variants with the strongest association with HbF response to hydroxyurea in the discovery cohort. A coding variant in Spalt-like transcription factor, or SALL2, was associated with higher final HbF in this second independent replication sample and SALL2 represents an outstanding novel candidate gene for further investigation. These findings may help focus future functional studies and provide new insights into the pharmacological HbF upregulation by hydroxyurea in patients with SCA.

  7. Short Interspersed Nuclear Element (SINE) Sequences in the Genome of the Human Pathogenic Fungus Aspergillus fumigatus Af293

    PubMed Central

    Kanhayuwa, Lakkhana; Coutts, Robert H. A.

    2016-01-01

    Novel families of short interspersed nuclear element (SINE) sequences in the human pathogenic fungus Aspergillus fumigatus, clinical isolate Af293, were identified and categorised into tRNA-related and 5S rRNA-related SINEs. Eight predicted tRNA-related SINE families originating from different tRNAs, and nominated as AfuSINE2 sequences, contained target site duplications of short direct repeat sequences (4–14 bp) flanking the elements, an extended tRNA-unrelated region and typical features of RNA polymerase III promoter sequences. The elements ranged in size from 140–493 bp and were present in low copy number in the genome and five out of eight were actively transcribed. One putative tRNAArg-derived sequence, AfuSINE2-1a possessed a unique feature of repeated trinucleotide ACT residues at its 3’-terminus. This element was similar in sequence to the I-4_AO element found in A. oryzae and an I-1_AF long nuclear interspersed element-like sequence identified in A. fumigatus Af293. Families of 5S rRNA-related SINE sequences, nominated as AfuSINE3, were also identified and their 5'-5S rRNA-related regions show 50–65% and 60–75% similarity to respectively A. fumigatus 5S rRNAs and SINE3-1_AO found in A. oryzae. A. fumigatus Af293 contains five copies of AfuSINE3 sequences ranging in size from 259–343 bp and two out of five AfuSINE3 sequences were actively transcribed. Investigations on AfuSINE distribution in the fungal genome revealed that the elements are enriched in pericentromeric and subtelomeric regions and inserted within gene-rich regions. We also demonstrated that some, but not all, AfuSINE sequences are targeted by host RNA silencing mechanisms. Finally, we demonstrated that infection of the fungus with mycoviruses had no apparent effects on SINE activity. PMID:27736869

  8. Short Interspersed Nuclear Element (SINE) Sequences in the Genome of the Human Pathogenic Fungus Aspergillus fumigatus Af293.

    PubMed

    Kanhayuwa, Lakkhana; Coutts, Robert H A

    2016-01-01

    Novel families of short interspersed nuclear element (SINE) sequences in the human pathogenic fungus Aspergillus fumigatus, clinical isolate Af293, were identified and categorised into tRNA-related and 5S rRNA-related SINEs. Eight predicted tRNA-related SINE families originating from different tRNAs, and nominated as AfuSINE2 sequences, contained target site duplications of short direct repeat sequences (4-14 bp) flanking the elements, an extended tRNA-unrelated region and typical features of RNA polymerase III promoter sequences. The elements ranged in size from 140-493 bp and were present in low copy number in the genome and five out of eight were actively transcribed. One putative tRNAArg-derived sequence, AfuSINE2-1a possessed a unique feature of repeated trinucleotide ACT residues at its 3'-terminus. This element was similar in sequence to the I-4_AO element found in A. oryzae and an I-1_AF long nuclear interspersed element-like sequence identified in A. fumigatus Af293. Families of 5S rRNA-related SINE sequences, nominated as AfuSINE3, were also identified and their 5'-5S rRNA-related regions show 50-65% and 60-75% similarity to respectively A. fumigatus 5S rRNAs and SINE3-1_AO found in A. oryzae. A. fumigatus Af293 contains five copies of AfuSINE3 sequences ranging in size from 259-343 bp and two out of five AfuSINE3 sequences were actively transcribed. Investigations on AfuSINE distribution in the fungal genome revealed that the elements are enriched in pericentromeric and subtelomeric regions and inserted within gene-rich regions. We also demonstrated that some, but not all, AfuSINE sequences are targeted by host RNA silencing mechanisms. Finally, we demonstrated that infection of the fungus with mycoviruses had no apparent effects on SINE activity.

  9. [Multiplexing mapping of human cDNAs]. Final report, September 1, 1991--February 28, 1994

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    Using PCR with automated product analysis, 329 human brain cDNA sequences have been assigned to individual human chromosomes. Primers were designed from single-pass cDNA sequences (expressed sequence tags, ESTs). Primers were used in PCR reactions with DNA from somatic cell hybrid mapping panels as templates, often with multiplexing. Many of the mapped ESTs match sequence database records. To evaluate these matches, the position of the primers relative to the matching region (In), the BLAST scores and the Poisson probability values of the EST/sequence record match were determined. In cases where the gene product stringently identified by the sequence match had already been mapped, the gene locus determined by EST was consistent with the previous position, which strongly supports the validity of assigning unknown genes to human chromosomes based on the EST sequence matches. In the present cases, mapping the ESTs to a chromosome can also be considered to have mapped the known gene product: rolipram-sensitive cAMP phosphodiesterase, chromosome 1; protein phosphatase 2Aβ, chromosome 4; alpha-catenin, chromosome 5; the ELE1 oncogene, chromosome 10q11.2 or q2.1-q23; MXII protein, chromosome 10q24-qter; ribosomal protein L18a homologue, chromosome 14; ribosomal protein L3, chromosome 17; and moesin, Xp11-cen. There were also ESTs mapped that were closely related to non-human sequence records. These matches therefore can be considered to identify human counterparts of known gene products, or members of known gene families. Examples of these include membrane proteins, translation-associated proteins, structural proteins, and enzymes. These data then demonstrate that single-pass sequence information is sufficient to design PCR primers useful for assigning cDNA sequences to human chromosomes. When the EST sequence matches previous sequence database records, the chromosome assignments of the EST can be used to make preliminary assignments of the human gene to a chromosome.

  10. Gene and genon concept: coding versus regulation

    PubMed Central

    2007-01-01

    We analyse here the definition of the gene in order to distinguish, on the basis of modern insight in molecular biology, what the gene is coding for, namely a specific polypeptide, and how its expression is realized and controlled. Before the coding role of the DNA was discovered, a gene was identified with a specific phenotypic trait, from Mendel through Morgan up to Benzer. Subsequently, however, molecular biologists ventured to define a gene at the level of the DNA sequence in terms of coding. As is becoming ever more evident, the relations between information stored at DNA level and functional products are very intricate, and the regulatory aspects are as important and essential as the information coding for products. This approach led, thus, to a conceptual hybrid that confused coding, regulation and functional aspects. In this essay, we develop a definition of the gene that once again starts from the functional aspect. A cellular function can be represented by a polypeptide or an RNA. In the case of the polypeptide, its biochemical identity is determined by the mRNA prior to translation, and that is where we locate the gene. The steps from specific, but possibly separated sequence fragments at DNA level to that final mRNA then can be analysed in terms of regulation. For that purpose, we coin the new term “genon”. In that manner, we can clearly separate product and regulative information while keeping the fundamental relation between coding and function without the need to introduce a conceptual hybrid. In mRNA, the program regulating the expression of a gene is superimposed onto and added to the coding sequence in cis - we call it the genon. The complementary external control of a given mRNA by trans-acting factors is incorporated in its transgenon. A consequence of this definition is that, in eukaryotes, the gene is, in most cases, not yet present at DNA level. 
Rather, it is assembled by RNA processing, including differential splicing, from various pieces, as steered by the genon. It emerges finally as an uninterrupted nucleic acid sequence at mRNA level just prior to translation, in faithful correspondence with the amino acid sequence to be produced as a polypeptide. After translation, the genon has fulfilled its role and expires. The distinction between the protein coding information as materialised in the final polypeptide and the processing information represented by the genon allows us to set up a new information theoretic scheme. The standard sequence information determined by the genetic code expresses the relation between coding sequence and product. Backward analysis asks from which coding region in the DNA a given polypeptide originates. The (more interesting) forward analysis asks in how many polypeptides of how many different types a given DNA segment is expressed. This concerns the control of the expression process for which we have introduced the genon concept. Thus, the information theoretic analysis can capture the complementary aspects of coding and regulation, of gene and genon. PMID:18087760

  11. Evaluation of Advanced Microwave Landing System Procedures in the New York Terminal Area

    DTIC Science & Technology

    1991-03-01

    sector controller called the CAMRN sector who must then sequence that traffic with multiple feeders from the south before handing off to the final...Right (13R) were all being used by landing traffic, the final controller handled the runway 22 arrivals and the CAMRN controller handled the runway 13R...Feeder Fix AAL678 DC10 H 00:09:00 AAL68 B767 H 00:23:00 AAL588 A300 H 00:27:00 PAA224 A300 H 01:20:00 4/ TWAll L101 H 01:34:00 CAMRN Feeder Fix DAL144

  12. Double photoionization of the Be isoelectronic sequence

    NASA Astrophysics Data System (ADS)

    Barmaki, S.; Albert, M. A.; Belliveau, J.; Laulan, S.

    2018-05-01

    We investigate the double photoionization (DPI) process along the Be isoelectronic sequence (Be‑Ne6+) by solving the time-dependent Schrödinger equation with a spectral method of configuration interaction type. The DPI cross sections that we obtain are in good agreement with other reported data. We also present the first results of double-to-single photoionization cross section ratios for Be-like ions in support of possible photofragmentation experiments with x-ray free electron lasers. Finally, we probe the mutual interaction of the valence electrons at different photon energies and examine the subsequent redistribution of the excess photon energy among them.

  13. A high-throughput multiplex method adapted for GMO detection.

    PubMed

    Chaouachi, Maher; Chupeau, Gaëlle; Berard, Aurélie; McKhann, Heather; Romaniuk, Marcel; Giancola, Sandra; Laval, Valérie; Bertheau, Yves; Brunel, Dominique

    2008-12-24

    A high-throughput multiplex assay for the detection of genetically modified organisms (GMO) was developed on the basis of the existing SNPlex method designed for SNP genotyping. This SNPlex assay allows the simultaneous detection of up to 48 short DNA sequences (approximately 70 bp; "signature sequences") from taxa endogenous reference genes, from GMO constructions, screening targets, construct-specific, and event-specific targets, and finally from donor organisms. This assay avoids certain shortcomings of multiplex PCR-based methods already in widespread use for GMO detection. The assay demonstrated high specificity and sensitivity. The results suggest that this assay is reliable, flexible, and cost- and time-effective for high-throughput GMO detection.

  14. Problems with DNA

    ERIC Educational Resources Information Center

    Erickson, Keith A.; Franciszkowicz, Marc J.

    2010-01-01

    A modified version of this project was used during the final seven days of a year-long calculus sequence at the United States Military Academy to introduce students to the nature of integrative learning. Students from different majors were brought together in groups and spent the first few days going over the mathematics material presented here.…

  15. 78 FR 44237 - Improving Government Regulations; Unified Agenda of Federal Regulatory and Deregulatory Actions

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-07-23

    .... Brenda Bowen, telephone 703-428-6173, or write to the U.S. Army Records Management and Declassification..., Administration and Management. Defense Acquisition Regulations Council--Final Rule Stage Regulation Sequence No... Ownership of Offeror 0750-AH58 (DFARS Case 2011-D044). 120 Release of Fundamental 0750-AH92 Research...

  16. Stage Theory and Research on Tobacco, Alcohol, and Other Drug Use.

    ERIC Educational Resources Information Center

    Werch, Chudley E.; Anzalone, Debra

    1995-01-01

    Examines the conceptual and empirical foundations of individual drug use stage development and progression related to tobacco, alcohol, and other drugs. Research examining interdrug use progression among youths supports the idea of a generally invariant sequence, involving nonuse to legal drug use, marijuana, and finally other illegal drug use.…

  17. A Correlational Analysis of the Effects of Learner and Linear Programming Characteristics on Learning Programmed Instruction. Final Report.

    ERIC Educational Resources Information Center

    Seibert, Warren F.; Reid, Christopher J.

    Learning and retention may be influenced by subtle instructional stimulus characteristics and certain visual memory aptitudes. Ten stimulus characteristics were chosen for study; 50 sequences of programed instructional material were specially written to conform to sampled values of each stimulus characteristic. Seventy-three freshman subjects…

  18. Children's Reproduction of Modeled Sequential Actions. Final Report.

    ERIC Educational Resources Information Center

    Uzgiris, Ina C.

    This paper describes seven interrelated studies concerned with children's understanding of sequential actions and with the effects of observing a model on this understanding. A total of 546 elementary and secondary school students served as subjects for the studies. The tasks for all of the studies involved deriving the pattern for a sequence from…

  19. The Technique of Documentary Film Production. Revised Edition.

    ERIC Educational Resources Information Center

    Baddeley, W. Hugh

    The means and methods of producing factual films are dealt with step-by-step from the initial idea to making and distributing the final release prints. The author describes preparing a script; breaking the script down to shooting sequences; budgeting for everything from materials to insurance; preliminary planning for actors, commentators,…

  20. THE CRYSTAL STRUCTURE AND AMINO ACID SEQUENCE OF DEHALOPEROXIDASE FROM AMPHITRITE ORNATA INDICATE COMMON ANCESTRY WITH GLOBINS. (R827612E02)

    EPA Science Inventory

    The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...

  1. Energy Management Technician Curriculum Development. Final Report.

    ERIC Educational Resources Information Center

    Sarvis, Robert E.

    This document is the result of an effort to develop a comprehensive curriculum to train community college students as energy management technicians. The main body of the document contains the energy management technician training curriculum and course content for the proposed courses in the two-year sequence; a report of how the curriculum was…

  2. 75 FR 41994 - Federal Management Regulation; Home-to-Work Transportation

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-07-20

    ...; Docket 2010-0013, Sequence 1] RIN 3090-AJ05 Federal Management Regulation; Home-to-Work Transportation... clarify existing Home-to-Work Transportation policy. This final rule updates and clarifies who is not... establish policy regarding home-to-work transportation. Section 102-5.20 defines who is not covered by the...

  3. Aircraft and avionic related research required to develop an effective high-speed runway exit system

    NASA Technical Reports Server (NTRS)

    Schoen, M. L.; Hosford, J. E.; Graham, J. M., Jr.; Preston, O. W.; Frankel, R. S.; Erickson, J. B.

    1979-01-01

    Research was conducted to increase airport capacity by studying the feasibility of the longitudinal separation between aircraft sequences on final approach. The multidisciplinary factors which include the utility of high speed exits for efficient runway operations were described along with recommendations and highlights of these studies.

  4. Sequence Curriculum: High School to College. Middlesex Community College/Haddam-Killingworth High School. Final Report.

    ERIC Educational Resources Information Center

    Middlesex Community Coll., Middletown, CT.

    Through a collaborative effort between Middlesex Community College (MxCC) and Haddam-Killingworth High School (HKHS), students taking specific high school courses in television production, broadcast journalism, electronics, and photography are granted college credit by MxCC upon admission to the college's Broadcast Communication Program. The…

  5. Education and Labor Force Skills in Postwar Japan. Final Report.

    ERIC Educational Resources Information Center

    Taira, Koji; Levine, Solomon B.

    As early as elementary school, a Japanese child faces a sequence of narrowing choices for an occupational future. Through decisions on further schooling, curriculum, and job entry, earlier choices severely restrict later ones. Usually, men go to four-year universities to study engineering or social sciences. Women generally attend two-year…

  6. An 11-year history of crop rotation into new perennial ryegrass and tall fescue

    USDA-ARS's Scientific Manuscript database

    Converting multi-year remote sensing classification data into crop rotations is beneficial by defining the length of crop rotation cycles and the specific sequences of intervening crops grown between the final year of a grass seed stand and establishment of new perennial ryegrass and tall fescue see...

  7. Fire Fighter Level I-II-III [and] Practical Skills Test. Wisconsin Fire Service Certification Series. Final Revision.

    ERIC Educational Resources Information Center

    Pribyl, Paul F.

    Practical skills tests are provided for fire fighter trainees in the Wisconsin Fire Service Certification Series, Fire Fighter Levels I, II, and III. A course introduction appears first and contains this information: recommended instructional sequence, required facilities, instructional methodology, requirements for certification, course…

  8. A Multistep Synthesis Featuring Classic Carbonyl Chemistry for the Advanced Organic Chemistry Laboratory

    ERIC Educational Resources Information Center

    Duff, David B.; Abbe, Tyler G.; Goess, Brian C.

    2012-01-01

    A multistep synthesis of 5-isopropyl-1,3-cyclohexanedione is carried out from three commodity chemicals. The sequence involves an aldol condensation, Dieckmann-type annulation, ester hydrolysis, and decarboxylation. No purification is required until after the final step, at which point gravity column chromatography provides the desired product in…

  9. 10 CFR 2.705 - Discovery-additional methods.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 10 CFR 73.22(b)(3) or 73.23(b)(3), as applicable, by submitting fingerprints to the NRC Office of... fingerprints. However, before a final adverse determination by the NRC Office of Administration on an... discovery may be used in any sequence and the fact that a party is conducting discovery, whether by...

  10. 10 CFR 2.705 - Discovery-additional methods.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 10 CFR 73.22(b)(3) or 73.23(b)(3), as applicable, by submitting fingerprints to the NRC Office of... fingerprints. However, before a final adverse determination by the NRC Office of Administration on an... discovery may be used in any sequence and the fact that a party is conducting discovery, whether by...

  11. 10 CFR 2.705 - Discovery-additional methods.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 10 CFR 73.22(b)(3) or 73.23(b)(3), as applicable, by submitting fingerprints to the NRC Office of... fingerprints. However, before a final adverse determination by the NRC Office of Administration on an... discovery may be used in any sequence and the fact that a party is conducting discovery, whether by...

  12. 10 CFR 2.705 - Discovery-additional methods.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 10 CFR 73.22(b)(3) or 73.23(b)(3), as applicable, by submitting fingerprints to the NRC Office of... fingerprints. However, before a final adverse determination by the NRC Office of Administration on an... discovery may be used in any sequence and the fact that a party is conducting discovery, whether by...

  13. Rigorous Tests of Student Outcomes in CTE Programs of Study: Final Report

    ERIC Educational Resources Information Center

    Castellano, Marisa; Sundell, Kirsten E.; Overman, Laura T.; Richardson, George B.; Stone, James R., III

    2014-01-01

    This study was designed to investigate the relationship between participation in federally mandated college and career-preparatory programs--known as programs of study (POS)--and high school achievement outcomes. POS are an organized approach to college and career readiness that offer an aligned sequence of courses spanning secondary and…

  14. High School-College Sequenced Curriculum in Early Childhood Education. Final Report.

    ERIC Educational Resources Information Center

    Greenblatt, Cynthia; Lawrence, Terri

    This report describes a curriculum that enables students from Hartford Public High School to take a course which is relevant to the Early Childhood Program at Greater Hartford Community College. Successful completion of the course enables students to earn three college credits and meet high school graduation requirements. Objectives of the project…

  15. Synthesis of a Fluorescent Acridone Using a Grignard Addition, Oxidation, and Nucleophilic Aromatic Substitution Reaction Sequence

    ERIC Educational Resources Information Center

    Goodrich, Samuel; Patel, Miloni; Woydziak, Zachary R.

    2015-01-01

    A three-pot synthesis oriented for an undergraduate organic chemistry laboratory was developed to construct a fluorescent acridone molecule. This laboratory experiment utilizes Grignard addition to an aldehyde, alcohol oxidation, and iterative nucleophilic aromatic substitution steps to produce the final product. Each of the intermediates and the…

  16. Different Lactobacillus populations dominate in "Chorizo de León" manufacturing performed in different production plants.

    PubMed

    Quijada, Narciso M; De Filippis, Francesca; Sanz, José Javier; García-Fernández, María Del Camino; Rodríguez-Lázaro, David; Ercolini, Danilo; Hernández, Marta

    2018-04-01

    "Chorizo de León" is a high-value Spanish dry fermented sausage traditionally manufactured without the use of starter cultures, owing to the activity of a house-specific autochthonous microbiota that naturally contaminates the meat from the environment, the equipment and the raw materials. Lactic acid bacteria (particularly Lactobacillus) and coagulase-negative cocci (mainly Staphylococcus) have been reported as the most important bacterial groups regarding the organoleptic and safety properties of dry fermented sausages. In this study, samples from raw minced meat to final products were taken from five different producers and the microbial diversity was investigated by high-throughput sequencing of 16S rRNA gene amplicons. The diverse microbial composition observed during the first stages of "Chorizo de León" evolved during ripening to a microbiota mainly composed of Lactobacillus in the final product. Oligotyping performed on 16S rRNA gene sequences of Lactobacillus and Staphylococcus populations revealed sub-genus level diversity within the different manufacturers, likely responsible for the characteristic organoleptic properties of the products from different companies.

  17. Automated segmentation of three-dimensional MR brain images

    NASA Astrophysics Data System (ADS)

    Park, Jonggeun; Baek, Byungjun; Ahn, Choong-Il; Ku, Kyo Bum; Jeong, Dong Kyun; Lee, Chulhee

    2006-03-01

    Brain segmentation is a challenging problem due to the complexity of the brain. In this paper, we propose an automated brain segmentation method for 3D magnetic resonance (MR) brain images, which are represented as a sequence of 2D brain images. The proposed method consists of three steps: pre-processing, removal of non-brain regions (e.g., the skull, meninges, other organs, etc.), and spinal cord restoration. In pre-processing, we perform adaptive thresholding which takes into account the variable intensities of MR brain images corresponding to various image acquisition conditions. In the segmentation process, we iteratively apply 2D morphological operations and masking to the sequences of 2D sagittal, coronal, and axial planes in order to remove non-brain tissues. Next, the final 3D brain regions are obtained by applying an OR operation to the segmentation results of the three planes. Finally, we reconstruct the spinal cord truncated during the previous processes. Experiments are performed with fifteen 3D MR brain image sets with 8-bit gray-scale. Experimental results show that the proposed algorithm is fast and provides robust and satisfactory results.
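
    The step that combines the three per-plane results with an OR operation can be sketched as a voxel-wise union of binary masks. This is a minimal illustration and not the authors' implementation; the function name, the nested-list mask layout, and the tiny example volumes are assumptions.

```python
def union_masks(*masks):
    """Voxel-wise OR of equally shaped 3D binary masks given as nested lists."""
    first = masks[0]
    return [
        [
            # A voxel is kept if any of the per-plane masks marks it as brain.
            [int(any(m[z][y][x] for m in masks)) for x in range(len(first[z][y]))]
            for y in range(len(first[z]))
        ]
        for z in range(len(first))
    ]


# Hypothetical 1x2x2 masks standing in for the sagittal, coronal
# and axial segmentation results (illustrative values only).
sagittal = [[[1, 0], [0, 0]]]
coronal = [[[0, 1], [0, 0]]]
axial = [[[0, 0], [0, 1]]]
brain = union_masks(sagittal, coronal, axial)
```

    A voxel survives in `brain` whenever at least one of the three passes labels it as brain tissue, which matches the OR combination described in the abstract.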

  18. An enriched multimedia eBook application to facilitate learning of anatomy.

    PubMed

    Stirling, Allan; Birt, James

    2014-01-01

    This pilot study compared the use of an enriched multimedia eBook with traditional methods for teaching the gross anatomy of the heart and great vessels. Seventy-one first-year students from an Australian medical school participated in the study. Students' abilities were examined by pretest, intervention, and post-test measurements. Perceptions and attitudes toward eBook technology were examined by survey questions. Results indicated a strongly positive user experience coupled with increased marks; however, there were no statistically significant results for the eBook method of delivery alone outperforming the traditional anatomy practical session. Results did show a statistically significant difference in the final marks achieved based on the sequencing of the learning modalities. With initial interaction with the multimedia content followed by active experimentation in the anatomy lab, students' performance was improved in the final test. The data obtained support the role of eBook technology as a useful adjunct to traditional methods in the modern anatomy curriculum. Further study is needed to investigate the importance of sequencing of teaching interventions.

  19. Thermo-mechanical characterization of a thermoplastic composite and prediction of the residual stresses and lamina curvature during cooling

    NASA Astrophysics Data System (ADS)

    Péron, Mael; Jacquemin, Frédéric; Casari, Pascal; Orange, Gilles; Bailleul, Jean-Luc; Boyard, Nicolas

    2017-10-01

    The prediction of process induced stresses during the cooling of thermoplastic composites still represents a challenge for the scientific community. However, a precise determination of these stresses is necessary in order to optimize the process conditions and thus lower the stresses effects on the final part health. A model is presented here, that permits the estimation of residual stresses during cooling. It relies on the nonlinear laminate theory, which has been adapted to arbitrary layup sequences. The developed model takes into account the heat transfers through the thickness of the laminate, together with the crystallization kinetics. The development of the composite mechanical properties during cooling is addressed by an incremental linear elastic constitutive law, which also considers thermal and crystallization strains. In order to feed the aforementioned model, a glass fiber and PA6.6 matrix unidirectional (UD) composite has been characterized. This work finally focuses on the identification of the material and process related parameters that lower the residual stresses level, including the ply sequence, the fiber volume fraction and the cooling rate.

  20. A novel approach on accelerated ageing towards reliability optimization of high concentration photovoltaic cells

    NASA Astrophysics Data System (ADS)

    Tsanakas, John A.; Jaffre, Damien; Sicre, Mathieu; Elouamari, Rachid; Vossier, Alexis; de Salins, Jean-Edouard; Bechou, Laurent; Levrier, Bruno; Perona, Arnaud; Dollet, Alain

    2014-09-01

    This paper presents a preliminary study of a novel approach proposed for highly accelerated ageing and reliability optimization of high concentrating photovoltaic (HCPV) cells and assemblies. The approach aims to overcome several limitations of some current accelerated ageing tests (AAT) adopted to date, proposing the use of an alternative experimental set-up for performing faster and more realistic thermal cycles, under real sun, without the involvement of an environmental chamber. The study also includes specific characterization techniques, before and after each AAT sequence, which respectively provide the initial and final diagnosis on the condition of the tested sample. The data acquired from these diagnostic/characterization methods are then used as indices to determine both quantitatively and qualitatively the severity of degradation and, thus, the ageing level for each tested HCPV assembly or cell sample. The ultimate goal of such "initial diagnosis - AAT - final diagnosis" sequences is to provide the basis for future work on the reliability analysis of the main degradation mechanisms and confident prediction of failure propagation in HCPV cells, by means of acceleration factor (AF) and mean-time-to-failure (MTTF) estimations.

  1. Observation learning versus physical practice leads to different consolidation outcomes in a movement timing task.

    PubMed

    Trempe, Maxime; Sabourin, Maxime; Rohbanfard, Hassan; Proteau, Luc

    2011-03-01

    Motor learning is a process that extends beyond training sessions. Specifically, physical practice triggers a series of physiological changes in the CNS that are regrouped under the term "consolidation" (Stickgold and Walker 2007). These changes can result in between-session improvement or performance stabilization (Walker 2005). In a series of three experiments, we tested whether consolidation also occurs following observation. In Experiment 1, participants observed an expert model perform a sequence of arm movements. Although we found evidence of observation learning, no significant difference was revealed between participants asked to reproduce the observed sequence either 5 min or 24 h later (no between-session improvement). In Experiment 2, two groups of participants observed an expert model perform two distinct movement sequences (A and B) either 10 min or 8 h apart; participants then physically performed both sequences after a 24-h break. Participants in the 8-h group performed Sequence B less accurately compared to participants in the 10-min group, suggesting that the memory representation of the first sequence had been stabilized and that it interfered with the learning of the second sequence. Finally, in Experiment 3, the initial observation phase was replaced by a physical practice phase. In contrast with the results of Experiment 2, participants in the 8-h group performed Sequence B significantly more accurately compared to participants in the 10-min group. Together, our results suggest that the memory representation of a skill learned through observation undergoes consolidation. However, consolidation of an observed motor skill leads to distinct behavioural outcomes in comparison with physical practice.

  2. An electrooculogram-based binary saccade sequence classification (BSSC) technique for augmentative communication and control.

    PubMed

    Keegan, Johnalan; Burke, Edward; Condron, James

    2009-01-01

    In the field of assistive technology, the electrooculogram (EOG) can be used as a channel of communication and the basis of a man-machine interface. For many people with severe motor disabilities, simple actions such as changing the TV channel require assistance. This paper describes a method of detecting saccadic eye movements and the use of a saccade sequence classification algorithm to facilitate communication and control. Saccades are fast eye movements that occur when a person's gaze jumps from one fixation point to another. The classification is based on pre-defined sequences of saccades, guided by a static visual template (e.g. a page or poster). The template, consisting of a table of symbols each having a clearly identifiable fixation point, is situated within view of the user. To execute a particular command, the user moves his or her gaze through a pre-defined path of eye movements. This results in a well-formed sequence of saccades, which is translated into a command if a match is found in a library of predefined sequences. A coordinate transformation algorithm is applied to each candidate sequence of recorded saccades to mitigate the effect of changes in the user's position and orientation relative to the visual template. Upon recognition of a saccade sequence from the library, its associated command is executed. A preliminary experiment, in which two subjects were instructed to perform a series of command sequences consisting of 8 different commands, is presented in the final sections. The system is also shown to be extensible to facilitate convenient text entry via an alphabetic visual template.
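
    The library-matching step described in this abstract can be sketched as below. This is only an illustration, not the authors' implementation: the direction labels, the contents of `COMMAND_LIBRARY` and the function name `classify_sequence` are hypothetical, and saccade detection and the coordinate transformation are assumed to have already produced a clean sequence of quantized directions.

```python
from typing import Optional, Sequence

# Hypothetical library: each command is keyed by a pre-defined saccade path.
# Directions are assumed to be already quantized to U/D/L/R.
COMMAND_LIBRARY = {
    ("R", "D", "L"): "tv_channel_up",
    ("L", "D", "R"): "tv_channel_down",
    ("U", "R", "D"): "volume_up",
}


def classify_sequence(saccades: Sequence[str]) -> Optional[str]:
    """Return the command whose stored path matches exactly, or None if no match."""
    return COMMAND_LIBRARY.get(tuple(saccades))
```

    An exact lookup is the simplest possible matcher; a real system would probably need some tolerance for missed or spurious saccades, which this sketch does not attempt.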

  3. Self-locked aptamer probe mediated cascade amplification strategy for highly sensitive and selective detection of protein and small molecule.

    PubMed

    Li, Wei; Jiang, Wei; Wang, Lei

    2016-10-12

    In this work, a novel self-locked aptamer probe mediated cascade amplification strategy has been constructed for highly sensitive and specific detection of protein. First, the self-locked aptamer probe was designed with three functions: one was specific molecular recognition attributed to the aptamer sequence, the second was signal transduction owing to the transduction sequence, and the third was self-locking through the hybridization of the transduction sequence and part of the aptamer sequence. Then, the aptamer sequence specifically recognized the target and folded into a three-way helix junction, leading to the release of the transduction sequence. Next, the 3'-end of this three-way junction acted as a primer to trigger the strand displacement amplification (SDA), yielding a large amount of primers. Finally, the primers initiated the dual-exponential rolling circle amplification (DE-RCA) and generated numerous G-quadruplex sequences. By inserting the fluorescent dye N-methyl mesoporphyrin IX (NMM), an enhanced fluorescence signal was achieved. In this strategy, the self-locked aptamer probe was more stable, which reduced the interference signals generated by uncontrollable folding in the unbound state. Through the cascade amplification of SDA and DE-RCA, the sensitivity was further improved, with a detection limit of 3.8 × 10⁻¹⁶ mol/L for protein detection. Furthermore, by changing the aptamer sequence of the probe, sensitive and selective detection of adenosine has also been achieved, suggesting that the proposed strategy has good versatility and can be widely used in sensitive and selective detection of biomolecules. Copyright © 2016 Elsevier B.V. All rights reserved.

  4. Phylogenetic stratigraphy in the Guerrero Negro hypersaline microbial mat.

    PubMed

    Harris, J Kirk; Caporaso, J Gregory; Walker, Jeffrey J; Spear, John R; Gold, Nicholas J; Robertson, Charles E; Hugenholtz, Philip; Goodrich, Julia; McDonald, Daniel; Knights, Dan; Marshall, Paul; Tufo, Henry; Knight, Rob; Pace, Norman R

    2013-01-01

    The microbial mats of Guerrero Negro (GN), Baja California Sur, Mexico historically were considered a simple environment, dominated by cyanobacteria and sulfate-reducing bacteria. Culture-independent rRNA community profiling instead revealed these microbial mats as among the most phylogenetically diverse environments known. A preliminary molecular survey of the GN mat based on only ∼1500 small subunit rRNA gene sequences discovered several new phylum-level groups in the bacterial phylogenetic domain and many previously undetected lower-level taxa. We determined an additional ∼119,000 nearly full-length sequences and 28,000 >200 nucleotide 454 reads from a 10-layer depth profile of the GN mat. With this unprecedented coverage of long sequences from one environment, we confirm the mat is phylogenetically stratified, presumably corresponding to light and geochemical gradients throughout the depth of the mat. Previous shotgun metagenomic data from the same depth profile show the same stratified pattern and suggest that metagenome properties may be predictable from rRNA gene sequences. We verify previously identified novel lineages and identify new phylogenetic diversity at lower taxonomic levels, for example, thousands of operational taxonomic units at the family-genus levels differ considerably from known sequences. The new sequences populate parts of the bacterial phylogenetic tree that previously were poorly described, but indicate that any comprehensive survey of GN diversity has only begun. Finally, we show that taxonomic conclusions are generally congruent between Sanger and 454 sequencing technologies, with the taxonomic resolution achieved dependent on the abundance of reference sequences in the relevant region of the rRNA tree of life.

  5. Short reads from honey bee (Apis sp.) sequencing projects reflect microbial associate diversity

    PubMed Central

    Hurst, Gregory D.D.

    2017-01-01

    High throughput (or ‘next generation’) sequencing has transformed most areas of biological research and is now a standard method that underpins empirical study of organismal biology, and (through comparison of genomes), reveals patterns of evolution. For projects focused on animals, these sequencing methods do not discriminate between the primary target of sequencing (the animal genome) and ‘contaminating’ material, such as associated microbes. A common first step is to filter out these contaminants to allow better assembly of the animal genome or transcriptome. Here, we aimed to assess if these ‘contaminations’ provide information with regard to biologically important microorganisms associated with the individual. To achieve this, we examined whether the short read data from Apis retrieved elements of its well established microbiome. To this end, we screened almost 1,000 short read libraries of honey bee (Apis sp.) DNA sequencing project for the presence of microbial sequences, and find sequences from known honey bee microbial associates in at least 11% of them. Further to this, we screened ∼500 Apis RNA sequencing libraries for evidence of viral infections, which were found to be present in about half of them. We then used the data to reconstruct draft genomes of three Apis associated bacteria, as well as several viral strains de novo. We conclude that ‘contamination’ in short read sequencing libraries can provide useful genomic information on microbial taxa known to be associated with the target organisms, and may even lead to the discovery of novel associations. Finally, we demonstrate that RNAseq samples from experiments commonly carry uneven viral loads across libraries. We note variation in viral presence and load may be a confounding feature of differential gene expression analyses, and as such it should be incorporated as a random factor in analyses. PMID:28717593

  6. Volume interpolated 3D-spoiled gradient echo sequence is better than dynamic contrast spin echo sequence for MRI detection of corticotropin secreting pituitary microadenomas.

    PubMed

    Kasaliwal, Rajeev; Sankhe, Shilpa S; Lila, Anurag R; Budyal, Sweta R; Jagtap, Varsha S; Sarathi, Vijaya; Kakade, Harshal; Bandgar, Tushar; Menon, Padmavathy S; Shah, Nalini S

    2013-06-01

    Various techniques have been attempted to increase the yield of magnetic resonance imaging (MRI) for localization of pituitary microadenomas in corticotropin (ACTH)-dependent Cushing's syndrome (CS). The objective was to compare the performance of dynamic contrast spin echo (DC-SE) and volume interpolated 3D-spoiled gradient echo (VI-SGE) MR sequences in the diagnostic evaluation of ACTH-dependent CS. Data were analysed retrospectively from a series of ACTH-dependent CS patients treated over a 2-year period at a tertiary care referral centre (2009-2011). Thirty-six patients (24 female and 12 male) were diagnosed with ACTH-dependent CS during the study period. All patients underwent MRI by both sequences during a single examination. Cases with negative and equivocal pituitary MR imaging underwent corticotropin-releasing hormone (CRH) stimulated bilateral inferior petrosal sinus sampling (BIPSS) to confirm pituitary origin of the ACTH excess state. Thirty patients were finally diagnosed with Cushing's disease (CD) [based on histopathological proof of adenoma and/or remission (partial/complete) of hypercortisolism postsurgery]. Six patients were diagnosed with histopathologically proven ectopic CS. Of 30 patients with CD, 24 patients had microadenomas and 6 patients had macroadenomas. The DC-SE MRI sequence was able to identify microadenomas in 16 of 24 patients, whereas the postcontrast VI-SGE sequence was able to identify microadenomas in 21 of 24 patients. All six patients with ectopic CS had negative pituitary MR imaging by both techniques (specificity: 100%). The VI-SGE MR sequence was better for localization of pituitary microadenomas, particularly when the DC-SE MR sequence was negative or equivocal, and should be used in addition to the DC-SE MR sequence for the evaluation of ACTH-dependent CS. © 2012 John Wiley & Sons Ltd.
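
    The detection counts reported in this abstract (16 of 24 and 21 of 24 microadenomas) can be turned into detection rates with a few lines of arithmetic. The helper name `detection_rate` is made up for illustration, and the percentages are derived here from the reported counts rather than quoted from the paper.

```python
def detection_rate(detected: int, total: int) -> float:
    """Fraction of confirmed microadenomas that a sequence identified."""
    if total <= 0:
        raise ValueError("total must be positive")
    return detected / total


# Counts taken from the abstract: 24 microadenomas in total.
dc_se = detection_rate(16, 24)
vi_sge = detection_rate(21, 24)

print(f"DC-SE:  {dc_se:.1%}")   # 66.7%
print(f"VI-SGE: {vi_sge:.1%}")  # 87.5%
```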

  7. Simian virus 40 major late promoter: an upstream DNA sequence required for efficient in vitro transcription.

    PubMed Central

    Brady, J; Radonovich, M; Thoren, M; Das, G; Salzman, N P

    1984-01-01

    We have previously identified an 11-base DNA sequence, 5'-G-G-T-A-C-C-T-A-A-C-C-3' (simian virus 40 [SV40] map position 294 to 304), which is important in the control of SV40 late RNA expression in vitro and in vivo (Brady et al., Cell 31:625-633, 1982). We report here the identification of another domain of the SV40 late promoter. A series of mutants with deletions extending from SV40 map position 0 to 300 was prepared by nuclease BAL 31 treatment. The cloned templates were then analyzed for efficiency and accuracy of late SV40 RNA expression in the Manley in vitro transcription system. Our studies showed that, in addition to the promoter domain near map position 300, there are essential DNA sequences between nucleotide positions 74 and 95 that are required for efficient expression of late SV40 RNA. Included in this SV40 DNA sequence were two of the six GGGCGG SV40 repeat sequences and an 11-nucleotide segment which showed strong homology with the upstream sequences required for the efficient in vitro and in vivo expression of the histone H2A gene. This upstream promoter sequence supported transcription with the same efficiency even when it was moved 72 nucleotides closer to the major late cap site. In vitro promoter competition analysis demonstrated that the upstream promoter sequence, independent of the 294 to 304 promoter element, is capable of binding polymerase-transcription factors required for SV40 late gene transcription. Finally, we show that DNA sequences which control the specificity of RNA initiation at nucleotide 325 lie downstream of map position 294. PMID:6321950

  8. A proline-rich sequence unique to MEK1 and MEK2 is required for raf binding and regulates MEK function.

    PubMed

    Catling, A D; Schaeffer, H J; Reuter, C W; Reddy, G R; Weber, M J

    1995-10-01

    Mammalian MEK1 and MEK2 contain a proline-rich (PR) sequence that is absent both from the yeast homologs Ste7 and Byr1 and from a recently cloned activator of the JNK/stress-activated protein kinases, SEK1/MKK4. Since this PR sequence occurs in MEKs that are regulated by Raf family enzymes but is missing from MEKs and SEKs activated independently of Raf, we sought to investigate the role of this sequence in MEK1 and MEK2 regulation and function. Deletion of the PR sequence from MEK1 blocked the ability of MEK1 to associate with members of the Raf family and markedly attenuated activation of the protein in vivo following growth factor stimulation. In addition, this sequence was necessary for efficient activation of MEK1 in vitro by B-Raf but dispensable for activation by a novel MEK1 activator which we have previously detected in fractionated fibroblast extracts. Furthermore, we found that a phosphorylation site within the PR sequence of MEK1 was required for sustained MEK1 activity in response to serum stimulation of quiescent fibroblasts. Consistent with this observation, we observed that MEK2, which lacks a phosphorylation site at the corresponding position, was activated only transiently following serum stimulation. Finally, we found that deletion of the PR sequence from a constitutively activated MEK1 mutant rendered the protein nontransforming in Rat1 fibroblasts. These observations indicate a critical role for the PR sequence in directing specific protein-protein interactions important for the activation, inactivation, and downstream functioning of the MEKs.

  9. A proline-rich sequence unique to MEK1 and MEK2 is required for raf binding and regulates MEK function.

    PubMed Central

    Catling, A D; Schaeffer, H J; Reuter, C W; Reddy, G R; Weber, M J

    1995-01-01

    Mammalian MEK1 and MEK2 contain a proline-rich (PR) sequence that is absent both from the yeast homologs Ste7 and Byr1 and from a recently cloned activator of the JNK/stress-activated protein kinases, SEK1/MKK4. Since this PR sequence occurs in MEKs that are regulated by Raf family enzymes but is missing from MEKs and SEKs activated independently of Raf, we sought to investigate the role of this sequence in MEK1 and MEK2 regulation and function. Deletion of the PR sequence from MEK1 blocked the ability of MEK1 to associate with members of the Raf family and markedly attenuated activation of the protein in vivo following growth factor stimulation. In addition, this sequence was necessary for efficient activation of MEK1 in vitro by B-Raf but dispensable for activation by a novel MEK1 activator which we have previously detected in fractionated fibroblast extracts. Furthermore, we found that a phosphorylation site within the PR sequence of MEK1 was required for sustained MEK1 activity in response to serum stimulation of quiescent fibroblasts. Consistent with this observation, we observed that MEK2, which lacks a phosphorylation site at the corresponding position, was activated only transiently following serum stimulation. Finally, we found that deletion of the PR sequence from a constitutively activated MEK1 mutant rendered the protein nontransforming in Rat1 fibroblasts. These observations indicate a critical role for the PR sequence in directing specific protein-protein interactions important for the activation, inactivation, and downstream functioning of the MEKs. PMID:7565670

  10. Quantitative statistical analysis of cis-regulatory sequences in ABA/VP1- and CBF/DREB1-regulated genes of Arabidopsis.

    PubMed

    Suzuki, Masaharu; Ketterling, Matthew G; McCarty, Donald R

    2005-09-01

    We have developed a simple quantitative computational approach for objective analysis of cis-regulatory sequences in promoters of coregulated genes. The program, designated MotifFinder, identifies oligo sequences that are overrepresented in promoters of coregulated genes. We used this approach to analyze promoter sequences of Viviparous1 (VP1)/abscisic acid (ABA)-regulated genes and cold-regulated genes, respectively, of Arabidopsis (Arabidopsis thaliana). We detected significantly enriched sequences in up-regulated genes but not in down-regulated genes. This result suggests that gene activation but not repression is mediated by specific and common sequence elements in promoters. The enriched motifs include several known cis-regulatory sequences as well as previously unidentified motifs. With respect to known cis-elements, we dissected the flanking nucleotides of the core sequences of Sph element, ABA response elements (ABREs), and the C repeat/dehydration-responsive element. This analysis identified the motif variants that may correlate with qualitative and quantitative differences in gene expression. While both VP1 and cold responses are mediated in part by ABA signaling via ABREs, these responses correlate with unique ABRE variants distinguished by nucleotides flanking the ACGT core. ABRE and Sph motifs are tightly associated uniquely in the coregulated set of genes showing a strict dependence on VP1 and ABA signaling. Finally, analysis of distribution of the enriched sequences revealed a striking concentration of enriched motifs in a proximal 200-base region of VP1/ABA and cold-regulated promoters. Overall, each class of coregulated genes possesses a discrete set of the enriched motifs with unique distributions in their promoters that may account for the specificity of gene regulation.
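
    The idea of finding oligo sequences that are overrepresented in promoters of coregulated genes can be sketched as follows. This is not the MotifFinder program: the function names, the choice of k = 6, the presence/absence counting and the pseudocount-smoothed ratio are all assumptions made for illustration, and the abstract does not say which statistic the authors used.

```python
from collections import Counter
from typing import Dict, Iterable


def kmer_presence(promoters: Iterable[str], k: int) -> Counter:
    """Count, for each k-mer, how many promoters contain it at least once."""
    counts: Counter = Counter()
    for seq in promoters:
        seq = seq.upper()
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(seen)
    return counts


def enrichment(coregulated: Iterable[str], background: Iterable[str],
               k: int = 6, pseudocount: float = 1.0) -> Dict[str, float]:
    """Ratio of the smoothed fraction of promoters containing each k-mer
    in the coregulated set to the same fraction in the background set."""
    coregulated = list(coregulated)
    background = list(background)
    fg = kmer_presence(coregulated, k)
    bg = kmer_presence(background, k)
    scores = {}
    for kmer, n_fg in fg.items():
        frac_fg = (n_fg + pseudocount) / (len(coregulated) + pseudocount)
        frac_bg = (bg.get(kmer, 0) + pseudocount) / (len(background) + pseudocount)
        scores[kmer] = frac_fg / frac_bg
    return scores


# Toy sequences (made up): both coregulated promoters carry an ACGT-core motif.
scores = enrichment(
    ["TTACGTGGCAA", "GGACGTGGCTT"],
    ["TTTTTTTTTTT", "GGGGACGTAAA", "CCCCCCCCCCC"],
)
```

    A ratio like this only ranks candidates; judging significance would need a proper statistical test against a realistic background set.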

  11. The gene space in wheat: the complete γ-gliadin gene family from the wheat cultivar Chinese Spring.

    PubMed

    Anderson, Olin D; Huo, Naxin; Gu, Yong Q

    2013-06-01

    The complete set of unique γ-gliadin genes is described for the wheat cultivar Chinese Spring using a combination of expressed sequence tag (EST) and Roche 454 DNA sequences. Assemblies of Chinese Spring ESTs yielded 11 different γ-gliadin gene sequences. Two of the sequences encode identical polypeptides and are assumed to be the result of a recent gene duplication. One gene has a 3' coding mutation that changes the reading frame in the final eight codons. A second assembly of Chinese Spring γ-gliadin sequences was generated using Roche 454 total genomic DNA sequences. The 454 assembly confirmed the same 11 active genes as the EST assembly plus two pseudogenes not represented by ESTs. These 13 γ-gliadin sequences represent the complete unique set of γ-gliadin genes for cv Chinese Spring, although not ruled out are additional genes that are exact duplications of these 13 genes. A comparison with the ESTs of two other hexaploid cultivars (Butte 86 and Recital) finds that the most active genes are present in all three cultivars, with exceptions likely due to too few ESTs for detection in Butte 86 and Recital. A comparison of the numbers of ESTs per gene indicates differential levels of expression within the γ-gliadin gene family. Genome assignments were made for 6 of the 13 Chinese Spring γ-gliadin genes, i.e., one assignment from a match to two γ-gliadin genes found within a tetraploid wheat A genome BAC and four genes that match four distinct γ-gliadin sequences assembled from Roche 454 sequences from Aegilops tauschii, the hexaploid wheat D-genome ancestor.

  12. Short reads from honey bee (Apis sp.) sequencing projects reflect microbial associate diversity.

    PubMed

    Gerth, Michael; Hurst, Gregory D D

    2017-01-01

    High throughput (or 'next generation') sequencing has transformed most areas of biological research and is now a standard method that underpins empirical study of organismal biology, and (through comparison of genomes), reveals patterns of evolution. For projects focused on animals, these sequencing methods do not discriminate between the primary target of sequencing (the animal genome) and 'contaminating' material, such as associated microbes. A common first step is to filter out these contaminants to allow better assembly of the animal genome or transcriptome. Here, we aimed to assess if these 'contaminations' provide information with regard to biologically important microorganisms associated with the individual. To achieve this, we examined whether the short read data from Apis retrieved elements of its well established microbiome. To this end, we screened almost 1,000 short read libraries of honey bee (Apis sp.) DNA sequencing project for the presence of microbial sequences, and find sequences from known honey bee microbial associates in at least 11% of them. Further to this, we screened ∼500 Apis RNA sequencing libraries for evidence of viral infections, which were found to be present in about half of them. We then used the data to reconstruct draft genomes of three Apis associated bacteria, as well as several viral strains de novo. We conclude that 'contamination' in short read sequencing libraries can provide useful genomic information on microbial taxa known to be associated with the target organisms, and may even lead to the discovery of novel associations. Finally, we demonstrate that RNAseq samples from experiments commonly carry uneven viral loads across libraries. We note variation in viral presence and load may be a confounding feature of differential gene expression analyses, and as such it should be incorporated as a random factor in analyses.

  13. Analysis of codon usage in beta-tubulin sequences of helminths.

    PubMed

    von Samson-Himmelstjerna, G; Harder, A; Failing, K; Pape, M; Schnieder, T

    2003-07-01

    Codon usage bias has been shown to be correlated with gene expression levels in many organisms, including the nematode Caenorhabditis elegans. Here, the codon usage (cu) characteristics for a set of currently available beta-tubulin coding sequences of helminths were assessed by calculating several indices, including the effective codon number (Nc), the intrinsic codon deviation index (ICDI), the P2 value and the mutational response index (MRI). The P2 value gives a measure of translational pressure, which has been shown to be correlated to high gene expression levels in some organisms, but it has not yet been analysed in that respect in helminths. For all but two of the C. elegans beta-tubulin coding sequences investigated, the P2 value was the only index that indicated the presence of codon usage bias. Therefore, we propose that in general the helminth beta-tubulin sequences investigated here are not expressed at high levels. Furthermore, we calculated the correlation coefficients for the cu patterns of the helminth beta-tubulin sequences compared with those of highly expressed genes in organisms such as Escherichia coli and C. elegans. It was found that beta-tubulin cu patterns for all sequences of members of the Strongylida were significantly correlated to those for highly expressed C. elegans genes. This approach provides a new measure for comparing the adaptation of cu of a particular coding sequence with that of highly expressed genes in possible expression systems. Finally, using the cu patterns of the sequences studied, a phylogenetic tree was constructed. The topology of this tree was very much in concordance with that of a phylogeny based on small subunit ribosomal DNA sequence alignments.
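
    Comparing the codon usage pattern of one coding sequence with that of a reference set, as described in this abstract, can be illustrated with a small sketch. The abstract does not state which correlation coefficient was used, so a plain Pearson correlation over the 64 codon frequencies is assumed here, and the names `codon_frequencies` and `pearson` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt
from typing import List, Sequence

# All 64 codons in a fixed order so that frequency vectors are comparable.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(cds: str) -> List[float]:
    """Relative frequency of each of the 64 codons in an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[c] for c in CODONS)  # codons with ambiguous bases are skipped
    if total == 0:
        raise ValueError("no valid codons found")
    return [counts[c] / total for c in CODONS]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation undefined for a constant vector")
    return cov / (sx * sy)
```

    In use, `pearson(codon_frequencies(gene), codon_frequencies(reference))` gives one number per gene; indices such as Nc or P2 need their own formulas and are not covered by this sketch.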

  14. Using Behavior Sequence Analysis to Map Serial Killers' Life Histories.

    PubMed

    Keatley, David A; Golightly, Hayley; Shephard, Rebecca; Yaksic, Enzo; Reid, Sasha

    2018-03-01

    The aim of the current research was to provide a novel method for mapping the developmental sequences of serial killers' life histories. An in-depth biographical account of serial killers' lives, from birth through to conviction, was gained and analyzed using Behavior Sequence Analysis. The analyses highlight similarities in behavioral events across the serial killers' lives, indicating not only which risk factors occur, but the temporal order of these factors. Results focused on early childhood environment, indicating the role of parental abuse; behaviors and events surrounding criminal histories of serial killers, showing that many had previous convictions and were known to police for other crimes; behaviors surrounding their murders, highlighting differences in victim choice and modus operandi; and, finally, trial pleas and convictions. The present research, therefore, provides a novel approach to synthesizing large volumes of data on criminals and presenting results in accessible, understandable outcomes.

  15. Infrared maritime target detection using a probabilistic single Gaussian model of sea clutter in Fourier domain

    NASA Astrophysics Data System (ADS)

    Zhou, Anran; Xie, Weixin; Pei, Jihong; Chen, Yapei

    2018-02-01

    For ship target detection in cluttered infrared image sequences, a robust detection method, based on a probabilistic single Gaussian model of the sea background in the Fourier domain, is put forward. The amplitude spectrum sequences at each frequency point of the pure seawater images in the Fourier domain, being more stable than the gray value sequences of each background pixel in the spatial domain, are modeled as a Gaussian. Next, a probability weighted matrix is built based on the stability of the pure seawater's total energy spectrum in the row direction, to make the Gaussian model more accurate. Then, the foreground frequency points are separated from the background frequency points by the model. Finally, the false-alarm points are removed utilizing ships' shape features. The performance of the proposed method is tested by visual and quantitative comparisons with other methods.

  16. Multiple Myeloma Genomics: A Systematic Review.

    PubMed

    Weaver, Casey J; Tariman, Joseph D

    2017-08-01

    This integrative review describes the genomic variants that have been found to be associated with poor prognosis in patients diagnosed with multiple myeloma (MM). Second, it identifies MM genetic and genomic changes using next-generation sequencing, specifically whole-genome sequencing or exome sequencing. A search for peer-reviewed articles through PubMed, EBSCOhost, and DePaul WorldCat Libraries Worldwide yielded 33 articles that were included in the final analysis. The most commonly reported genetic changes were KRAS, NRAS, TP53, FAM46C, BRAF, DIS3, ATM, and CCND1. These genetic changes play a role in the pathogenesis of MM, prognostication, and therapeutic targets for novel therapies. MM genetics and genomics are expanding rapidly; oncology nurse clinicians must have basic competencies in genetics and genomics to help patients understand the complexities of genetic and genomic alterations and be able to refer patients to appropriate genomic professionals if needed. Copyright © 2017 Elsevier Inc. All rights reserved.

  17. The UK10K project identifies rare variants in health and disease.

    PubMed

    Walter, Klaudia; Min, Josine L; Huang, Jie; Crooks, Lucy; Memari, Yasin; McCarthy, Shane; Perry, John R B; Xu, ChangJiang; Futema, Marta; Lawson, Daniel; Iotchkova, Valentina; Schiffels, Stephan; Hendricks, Audrey E; Danecek, Petr; Li, Rui; Floyd, James; Wain, Louise V; Barroso, Inês; Humphries, Steve E; Hurles, Matthew E; Zeggini, Eleftheria; Barrett, Jeffrey C; Plagnol, Vincent; Richards, J Brent; Greenwood, Celia M T; Timpson, Nicholas J; Durbin, Richard; Soranzo, Nicole

    2015-10-01

    The contribution of rare and low-frequency variants to human traits is largely unexplored. Here we describe insights from sequencing whole genomes (low read depth, 7×) or exomes (high read depth, 80×) of nearly 10,000 individuals from population-based and disease collections. In extensively phenotyped cohorts we characterize over 24 million novel sequence variants, generate a highly accurate imputation reference panel and identify novel alleles associated with levels of triglycerides (APOB), adiponectin (ADIPOQ) and low-density lipoprotein cholesterol (LDLR and RGAG1) from single-marker and rare variant aggregation tests. We describe population structure and functional annotation of rare and low-frequency variants, use the data to estimate the benefits of sequencing for association studies, and summarize lessons from disease-specific collections. Finally, we make available an extensive resource, including individual-level genetic and phenotypic data and web-based tools to facilitate the exploration of association results.

  18. A Personal Journey of Discovery: Developing Technology and Changing Biology

    NASA Astrophysics Data System (ADS)

    Hood, Lee

    2008-07-01

    This autobiographical article describes my experiences in developing chemically based, biological technologies for deciphering biological information: DNA, RNA, proteins, interactions, and networks. The instruments developed include protein and DNA sequencers and synthesizers, as well as ink-jet technology for synthesizing DNA chips. Diverse new strategies for doing biology also arose from novel applications of these instruments. The functioning of these instruments can be integrated to generate powerful new approaches to cloning and characterizing genes from a small amount of protein sequence or to using gene sequences to synthesize peptide fragments so as to characterize various properties of the proteins. I also discuss the five paradigm changes in which I have participated: the development and integration of biological instrumentation; the human genome project; cross-disciplinary biology; systems biology; and predictive, personalized, preventive, and participatory (P4) medicine. Finally, I discuss the origins, the philosophy, some accomplishments, and the future trajectories of the Institute for Systems Biology.

  19. The dynamics of genome replication using deep sequencing

    PubMed Central

    Müller, Carolin A.; Hawkins, Michelle; Retkute, Renata; Malla, Sunir; Wilson, Ray; Blythe, Martin J.; Nakato, Ryuichiro; Komata, Makiko; Shirahige, Katsuhiko; de Moura, Alessandro P.S.; Nieduszynski, Conrad A.

    2014-01-01

    Eukaryotic genomes are replicated from multiple DNA replication origins. We present complementary deep sequencing approaches to measure origin location and activity in Saccharomyces cerevisiae. Measuring the increase in DNA copy number during a synchronous S-phase allowed the precise determination of genome replication. To map origin locations, replication forks were stalled close to their initiation sites; therefore, copy number enrichment was limited to origins. Replication timing profiles were generated from asynchronous cultures using fluorescence-activated cell sorting. Applying this technique we show that the replication profiles of haploid and diploid cells are indistinguishable, indicating that both cell types use the same cohort of origins with the same activities. Finally, increasing sequencing depth allowed the direct measure of replication dynamics from an exponentially growing culture. This is the first time this approach, called marker frequency analysis, has been successfully applied to a eukaryote. These data provide a high-resolution resource and methodological framework for studying genome biology. PMID:24089142

  20. Therapeutic change in interaction: conversation analysis of a transforming sequence.

    PubMed

    Voutilainen, Liisa; Perakyla, Anssi; Ruusuvuori, Johanna

    2011-05-01

    A process of change within a single case of cognitive-constructivist therapy is analyzed by means of conversation analysis (CA). The focus is on a process of change in the sequences of interaction, which consist of the therapist's conclusion and the patient's response to it. In the conclusions, the therapist investigates and challenges the patient's tendency to transform her feelings of disappointment and anger into self-blame. Over the course of the therapy, the patient's responses to these conclusions are recast: from the patient first rejecting the conclusion, to then being ambivalent, and finally to agreeing with the therapist. On the basis of this case study, we suggest that an analysis that focuses on sequences of talk that are interactionally similar offers a sensitive method to investigate the manifestation of therapeutic change. It is suggested that this line of research can complement assimilation analysis and other methods of analyzing changes in a client's talk.

  1. The Degradome database: mammalian proteases and diseases of proteolysis.

    PubMed

    Quesada, Víctor; Ordóñez, Gonzalo R; Sánchez, Luis M; Puente, Xose S; López-Otín, Carlos

    2009-01-01

    The degradome is defined as the complete set of proteases present in an organism. The recent availability of whole genomic sequences from multiple organisms has led us to predict the contents of the degradomes of several mammalian species. To ensure the fidelity of these predictions, our methods have included manual curation of individual sequences and, when necessary, direct cloning and sequencing experiments. The results of these studies in human, chimpanzee, mouse and rat have been incorporated into the Degradome database, which can be accessed through a web interface at http://degradome.uniovi.es. The annotations about each individual protease can be retrieved by browsing catalytic classes and families or by searching specific terms. This web site also provides detailed information about genetic diseases of proteolysis, a growing field of great importance for multiple users. Finally, the user can find additional information about protease structures, protease inhibitors, ancillary domains of proteases and differences between mammalian degradomes.

  2. The Degradome database: mammalian proteases and diseases of proteolysis

    PubMed Central

    Quesada, Víctor; Ordóñez, Gonzalo R.; Sánchez, Luis M.; Puente, Xose S.; López-Otín, Carlos

    2009-01-01

    The degradome is defined as the complete set of proteases present in an organism. The recent availability of whole genomic sequences from multiple organisms has led us to predict the contents of the degradomes of several mammalian species. To ensure the fidelity of these predictions, our methods have included manual curation of individual sequences and, when necessary, direct cloning and sequencing experiments. The results of these studies in human, chimpanzee, mouse and rat have been incorporated into the Degradome database, which can be accessed through a web interface at http://degradome.uniovi.es. The annotations about each individual protease can be retrieved by browsing catalytic classes and families or by searching specific terms. This web site also provides detailed information about genetic diseases of proteolysis, a growing field of great importance for multiple users. Finally, the user can find additional information about protease structures, protease inhibitors, ancillary domains of proteases and differences between mammalian degradomes. PMID:18776217

  3. Chemical biology on the genome.

    PubMed

    Balasubramanian, Shankar

    2014-08-15

    In this article I discuss studies towards understanding the structure and function of DNA in the context of genomes from the perspective of a chemist. The first area I describe concerns the studies that led to the invention and subsequent development of a method for sequencing DNA on a genome scale at high speed and low cost, now known as Solexa/Illumina sequencing. The second theme will feature the four-stranded DNA structure known as a G-quadruplex with a focus on its fundamental properties, its presence in cellular genomic DNA and the prospects for targeting such a structure in cells with small molecules. The final topic for discussion is naturally occurring chemically modified DNA bases with an emphasis on chemistry for decoding (or sequencing) such modifications in genomic DNA. The genome is a fruitful topic to be further elucidated by the creation and application of chemical approaches. Copyright © 2014 Elsevier Ltd. All rights reserved.

  4. Molecular Epidemiology of Oropouche Virus, Brazil

    PubMed Central

    Vasconcelos, Helena Baldez; Nunes, Márcio R.T.; Casseb, Lívia M.N.; Carvalho, Valéria L.; Pinto da Silva, Eliana V.; Silva, Mayra; Casseb, Samir M.M.

    2011-01-01

    Oropouche virus (OROV) is the causative agent of Oropouche fever, an urban febrile arboviral disease widespread in South America, with >30 epidemics reported in Brazil and other Latin American countries during 1960–2009. To describe the molecular epidemiology of OROV, we analyzed the entire N gene sequences (small RNA) of 66 strains and 35 partial Gn (medium RNA) and large RNA gene sequences. Distinct patterns of OROV strain clustered according to N, Gn, and large gene sequences, which suggests that each RNA segment had a different evolutionary history and that the classification in genotypes must consider the genetic information for all genetic segments. Finally, time-scale analysis based on the N gene showed that OROV emerged in Brazil ≈223 years ago and that genotype I (based on N gene data) was responsible for the emergence of all other genotypes and for virus dispersal. PMID:21529387

  5. Analysis of methylated patterns and quality-related genes in tobacco (Nicotiana tabacum) cultivars.

    PubMed

    Jiao, Junna; Jia, Yanlong; Lv, Zhuangwei; Sun, Chuanfei; Gao, Lijie; Yan, Xiaoxiao; Cui, Liusu; Tang, Zongxiang; Yan, Benju

    2014-08-01

    Methylation-sensitive amplified polymorphism was used in this study to investigate epigenetic information of four tobacco cultivars: Yunyan 85, NC89, K326, and Yunyan 87. The DNA fragments with methylated information were cloned by reamplified PCR and sequenced. The results of Blast alignments showed that the genes with methylation information included chitinase, nitrate reductase, chloroplast DNA, mitochondrial DNA, ornithine decarboxylase, ribulose carboxylase, and promoter sequences. Homologous comparison in three cloned gene sequences (nitrate reductase, ornithine decarboxylase, and ribulose decarboxylase) indicated that geographic factors had significant influence on the whole genome methylation. Introns also contained different information in different tobacco cultivars. These findings suggest that synthetic mechanisms for tobacco aromatic components could be affected by different environmental factors leading to variation of noncoding regions in the genome, which finally results in different fragrance and taste in different tobacco cultivars.

  6. Diversity of halophilic archaea from six hypersaline environments in Turkey.

    PubMed

    Ozcan, Birgul; Ozcengiz, Gulay; Coleri, Arzu; Cokmus, Cumhur

    2007-06-01

    The diversity of archaeal strains from six hypersaline environments in Turkey was analyzed by comparing their phenotypic characteristics and 16S rDNA sequences. Thirty-three isolates were characterized in terms of their phenotypic properties including morphological and biochemical characteristics, susceptibility to different antibiotics, and total lipid and plasmid contents, and finally compared by 16S rDNA gene sequences. The results showed that all isolates belong to the family Halobacteriaceae. Phylogenetic analyses using approximately 1,388 bp comparisons of 16S rDNA sequences demonstrated that all isolates clustered closely to species belonging to 9 genera, namely Halorubrum (8 isolates), Natrinema (5 isolates), Haloarcula (4 isolates), Natronococcus (4 isolates), Natrialba (4 isolates), Haloferax (3 isolates), Haloterrigena (3 isolates), Halalkalicoccus (1 isolate), and Halomicrobium (1 isolate). The results revealed a high diversity among the isolated halophilic strains and indicated that some of these strains constitute new taxa of extremely halophilic archaea.

  7. CRISPRTarget

    PubMed Central

    Biswas, Ambarish; Gagnon, Joshua N.; Brouns, Stan J.J.; Fineran, Peter C.; Brown, Chris M.

    2013-01-01

    The bacterial and archaeal CRISPR/Cas adaptive immune system targets specific protospacer nucleotide sequences in invading organisms. This requires base pairing between processed CRISPR RNA and the target protospacer. For type I and II CRISPR/Cas systems, protospacer adjacent motifs (PAM) are essential for target recognition, and for type III, mismatches in the flanking sequences are important in the antiviral response. In this study, we examine the properties of each class of CRISPR. We use this information to provide a tool (CRISPRTarget) that predicts the most likely targets of CRISPR RNAs (http://bioanalysis.otago.ac.nz/CRISPRTarget). This can be used to discover targets in newly sequenced genomic or metagenomic data. To test its utility, we discover features and targets of well-characterized Streptococcus thermophilus and Sulfolobus solfataricus type II and III CRISPR/Cas systems. Finally, in Pectobacterium species, we identify new CRISPR targets and propose a model of temperate phage exposure and subsequent inhibition by the type I CRISPR/Cas systems. PMID:23492433

  8. Defining functional distance using manifold embeddings of gene ontology annotations

    PubMed Central

    Lerman, Gilad; Shakhnovich, Boris E.

    2007-01-01

    Although rigorous measures of similarity for sequence and structure are now well established, the problem of defining functional relationships has been particularly daunting. Here, we present several manifold embedding techniques to compute distances between Gene Ontology (GO) functional annotations and consequently estimate functional distances between protein domains. To evaluate accuracy, we correlate the functional distance to the well established measures of sequence, structural, and phylogenetic similarities. Finally, we show that manual classification of structures into folds and superfamilies is mirrored by proximity in the newly defined function space. We show how functional distances place structure–function relationships in biological context resulting in insight into divergent and convergent evolution. The methods and results in this paper can be readily generalized and applied to a wide array of biologically relevant investigations, such as accuracy of annotation transference, the relationship between sequence, structure, and function, or coherence of expression modules. PMID:17595300

  9. Barcode extension for analysis and reconstruction of structures

    NASA Astrophysics Data System (ADS)

    Myhrvold, Cameron; Baym, Michael; Hanikel, Nikita; Ong, Luvena L.; Gootenberg, Jonathan S.; Yin, Peng

    2017-03-01

    Collections of DNA sequences can be rationally designed to self-assemble into predictable three-dimensional structures. The geometric and functional diversity of DNA nanostructures created to date has been enhanced by improvements in DNA synthesis and computational design. However, existing methods for structure characterization typically image the final product or laboriously determine the presence of individual, labelled strands using gel electrophoresis. Here we introduce a new method of structure characterization that uses barcode extension and next-generation DNA sequencing to quantitatively measure the incorporation of every strand into a DNA nanostructure. By quantifying the relative abundances of distinct DNA species in product and monomer bands, we can study the influence of geometry and sequence on assembly. We have tested our method using 2D and 3D DNA brick and DNA origami structures. Our method is general and should be extensible to a wide variety of DNA nanostructures.

  10. Barcode extension for analysis and reconstruction of structures.

    PubMed

    Myhrvold, Cameron; Baym, Michael; Hanikel, Nikita; Ong, Luvena L; Gootenberg, Jonathan S; Yin, Peng

    2017-03-13

    Collections of DNA sequences can be rationally designed to self-assemble into predictable three-dimensional structures. The geometric and functional diversity of DNA nanostructures created to date has been enhanced by improvements in DNA synthesis and computational design. However, existing methods for structure characterization typically image the final product or laboriously determine the presence of individual, labelled strands using gel electrophoresis. Here we introduce a new method of structure characterization that uses barcode extension and next-generation DNA sequencing to quantitatively measure the incorporation of every strand into a DNA nanostructure. By quantifying the relative abundances of distinct DNA species in product and monomer bands, we can study the influence of geometry and sequence on assembly. We have tested our method using 2D and 3D DNA brick and DNA origami structures. Our method is general and should be extensible to a wide variety of DNA nanostructures.

  11. Performance Analysis of Direct-Sequence Code-Division Multiple-Access Communications with Asymmetric Quadrature Phase-Shift-Keying Modulation

    NASA Technical Reports Server (NTRS)

    Wang, C.-W.; Stark, W.

    2005-01-01

    This article considers a quaternary direct-sequence code-division multiple-access (DS-CDMA) communication system with asymmetric quadrature phase-shift-keying (AQPSK) modulation for unequal error protection (UEP) capability. Both time synchronous and asynchronous cases are investigated. An expression for the probability distribution of the multiple-access interference is derived. The exact bit-error performance and the approximate performance using a Gaussian approximation and random signature sequences are evaluated by extending the techniques used for uniform quadrature phase-shift-keying (QPSK) and binary phase-shift-keying (BPSK) DS-CDMA systems. Finally, a general system model with unequal user power and the near-far problem is considered and analyzed. The results show that, for a system with UEP capability, the less protected data bits are more sensitive to the near-far effect that occurs in a multiple-access environment than are the more protected bits.

  12. Repeated Evolution of the Pyrrolizidine Alkaloid–Mediated Defense System in Separate Angiosperm Lineages

    PubMed Central

    Reimann, Andreas; Nurhayati, Niknik; Backenköhler, Anita; Ober, Dietrich

    2004-01-01

    Species of several unrelated families within the angiosperms are able to constitutively produce pyrrolizidine alkaloids as a defense against herbivores. In pyrrolizidine alkaloid (PA) biosynthesis, homospermidine synthase (HSS) catalyzes the first specific step. HSS was recruited during angiosperm evolution from deoxyhypusine synthase (DHS), an enzyme involved in the posttranslational activation of eukaryotic initiation factor 5A. Phylogenetic analysis of 23 cDNA sequences coding for HSS and DHS of various angiosperm species revealed at least four independent recruitments of HSS from DHS: one within the Boraginaceae, one within the monocots, and two within the Asteraceae family. Furthermore, sequence analyses indicated elevated substitution rates within HSS-coding sequences after each gene duplication, with an increased level of nonsynonymous mutations. However, the contradiction between the polyphyletic origin of the first enzyme in PA biosynthesis and the structural identity of the final biosynthetic PA products needs clarification. PMID:15466410

  13. Phonologic-graphemic transcodifier for Portuguese Language spoken in Brazil (PLB)

    NASA Astrophysics Data System (ADS)

    Fragadasilva, Francisco Jose; Saotome, Osamu; Deoliveira, Carlos Alberto

    An automatic speech-to-text transformer system, suited to unlimited vocabulary, is presented. The basic acoustic units considered are the allophones of the phonemes corresponding to the Portuguese language spoken in Brazil (PLB). The input to the system is a phonetic sequence, from a former step of isolated word recognition of slowly spoken speech. In a first stage, the system eliminates phonetic elements that don't belong to PLB. Using knowledge sources such as phonetics, phonology, orthography, and a PLB-specific lexicon, the output is a sequence of written words, ordered by a probabilistic criterion, that constitutes the set of graphemic possibilities for that input sequence. Pronunciation differences of some regions of Brazil are considered, but only those that cause differences in phonological transcription, because those at the phonetic level are absorbed during the transformation to the phonological level. In the final stage, all possible written words are analyzed from an orthographic and grammatical point of view, to eliminate the incorrect ones.

  14. High-quality de novo assembly of the apple genome and methylome dynamics of early fruit development.

    PubMed

    Daccord, Nicolas; Celton, Jean-Marc; Linsmith, Gareth; Becker, Claude; Choisne, Nathalie; Schijlen, Elio; van de Geest, Henri; Bianco, Luca; Micheletti, Diego; Velasco, Riccardo; Di Pierro, Erica Adele; Gouzy, Jérôme; Rees, D Jasper G; Guérif, Philippe; Muranty, Hélène; Durel, Charles-Eric; Laurens, François; Lespinasse, Yves; Gaillard, Sylvain; Aubourg, Sébastien; Quesneville, Hadi; Weigel, Detlef; van de Weg, Eric; Troggio, Michela; Bucher, Etienne

    2017-07-01

    Using the latest sequencing and optical mapping technologies, we have produced a high-quality de novo assembly of the apple (Malus domestica Borkh.) genome. Repeat sequences, which represented over half of the assembly, provided an unprecedented opportunity to investigate the uncharacterized regions of a tree genome; we identified a new hyper-repetitive retrotransposon sequence that was over-represented in heterochromatic regions and estimated that a major burst of different transposable elements (TEs) occurred 21 million years ago. Notably, the timing of this TE burst coincided with the uplift of the Tian Shan mountains, which is thought to be the center of the location where the apple originated, suggesting that TEs and associated processes may have contributed to the diversification of the apple ancestor and possibly to its divergence from pear. Finally, genome-wide DNA methylation data suggest that epigenetic marks may contribute to agronomically relevant aspects, such as apple fruit development.

  15. Application of multi-objective optimization to pooled experiments of next generation sequencing for detection of rare mutations.

    PubMed

    Zilinskas, Julius; Lančinskas, Algirdas; Guarracino, Mario Rosario

    2014-01-01

    In this paper we propose some mathematical models to plan a Next Generation Sequencing experiment to detect rare mutations in pools of patients. A mathematical optimization problem is formulated for optimal pooling, with respect to minimization of the experiment cost. Then, two different strategies to replicate patients in pools are proposed, which have the advantage to decrease the overall costs. Finally, a multi-objective optimization formulation is proposed, where the trade-off between the probability to detect a mutation and overall costs is taken into account. The proposed solutions are devised in pursuance of the following advantages: (i) the solution guarantees mutations are detectable in the experimental setting, and (ii) the cost of the NGS experiment and its biological validation using Sanger sequencing is minimized. Simulations show replicating pools can decrease overall experimental cost, thus making pooling an interesting option.

  16. Barcode extension for analysis and reconstruction of structures

    PubMed Central

    Myhrvold, Cameron; Baym, Michael; Hanikel, Nikita; Ong, Luvena L; Gootenberg, Jonathan S; Yin, Peng

    2017-01-01

    Collections of DNA sequences can be rationally designed to self-assemble into predictable three-dimensional structures. The geometric and functional diversity of DNA nanostructures created to date has been enhanced by improvements in DNA synthesis and computational design. However, existing methods for structure characterization typically image the final product or laboriously determine the presence of individual, labelled strands using gel electrophoresis. Here we introduce a new method of structure characterization that uses barcode extension and next-generation DNA sequencing to quantitatively measure the incorporation of every strand into a DNA nanostructure. By quantifying the relative abundances of distinct DNA species in product and monomer bands, we can study the influence of geometry and sequence on assembly. We have tested our method using 2D and 3D DNA brick and DNA origami structures. Our method is general and should be extensible to a wide variety of DNA nanostructures. PMID:28287117

  17. Single-cell isolation by a modular single-cell pipette for RNA-sequencing.

    PubMed

    Zhang, Kai; Gao, Min; Chong, Zechen; Li, Ying; Han, Xin; Chen, Rui; Qin, Lidong

    2016-11-29

    Single-cell transcriptome sequencing requires a convenient and reliable method to rapidly isolate a live cell into a specific container such as a PCR tube. Here, we report a modular single-cell pipette (mSCP) consisting of three modular components, a SCP-Tip, an air-displacement pipette (ADP), and ADP-Tips, that can be easily assembled, disassembled, and reassembled. By assembling the SCP-Tip containing a hydrodynamic trap, the mSCP can isolate single cells from 5-10 cells per μL of cell suspension. The mSCP is compatible with microscopic identification of captured single cells to finally achieve 100% single-cell isolation efficiency. The isolated live single cells are in submicroliter volumes and well suited for single-cell PCR analysis and RNA-sequencing. The mSCP possesses merits of convenience, rapidness, and high efficiency, making it a powerful tool to isolate single cells for transcriptome analysis.

  18. Data on publications, structural analyses, and queries used to build and utilize the AlloRep database.

    PubMed

    Sousa, Filipa L; Parente, Daniel J; Hessman, Jacob A; Chazelle, Allen; Teichmann, Sarah A; Swint-Kruse, Liskin

    2016-09-01

    The AlloRep database (www.AlloRep.org) (Sousa et al., 2016) [1] compiles extensive sequence, mutagenesis, and structural information for the LacI/GalR family of transcription regulators. Sequence alignments are presented for >3000 proteins in 45 paralog subfamilies and as a subsampled alignment of the whole family. Phenotypic and biochemical data on almost 6000 mutants have been compiled from an exhaustive search of the literature; citations for these data are included herein. These data include information about oligomerization state, stability, DNA binding and allosteric regulation. Protein structural data for 65 proteins are presented as easily-accessible, residue-contact networks. Finally, this article includes example queries to enable the use of the AlloRep database. See the related article, "AlloRep: a repository of sequence, structural and mutagenesis data for the LacI/GalR transcription regulators" (Sousa et al., 2016) [1].

  19. Exome Sequence Reveals Mutations in CoA Synthase as a Cause of Neurodegeneration with Brain Iron Accumulation

    PubMed Central

    Dusi, Sabrina; Valletta, Lorella; Haack, Tobias B.; Tsuchiya, Yugo; Venco, Paola; Pasqualato, Sebastiano; Goffrini, Paola; Tigano, Marco; Demchenko, Nikita; Wieland, Thomas; Schwarzmayr, Thomas; Strom, Tim M.; Invernizzi, Federica; Garavaglia, Barbara; Gregory, Allison; Sanford, Lynn; Hamada, Jeffrey; Bettencourt, Conceição; Houlden, Henry; Chiapparini, Luisa; Zorzi, Giovanna; Kurian, Manju A.; Nardocci, Nardo; Prokisch, Holger; Hayflick, Susan; Gout, Ivan; Tiranti, Valeria

    2014-01-01

    Neurodegeneration with brain iron accumulation (NBIA) comprises a clinically and genetically heterogeneous group of disorders with progressive extrapyramidal signs and neurological deterioration, characterized by iron accumulation in the basal ganglia. Exome sequencing revealed the presence of recessive missense mutations in COASY, encoding coenzyme A (CoA) synthase in one NBIA-affected subject. A second unrelated individual carrying mutations in COASY was identified by Sanger sequence analysis. CoA synthase is a bifunctional enzyme catalyzing the final steps of CoA biosynthesis by coupling phosphopantetheine with ATP to form dephospho-CoA and its subsequent phosphorylation to generate CoA. We demonstrate alterations in RNA and protein expression levels of CoA synthase, as well as CoA amount, in fibroblasts derived from the two clinical cases and in yeast. This is the second inborn error of coenzyme A biosynthesis to be implicated in NBIA. PMID:24360804

  20. Utility of whole exome sequencing in the diagnosis of Usher syndrome: Report of novel compound heterozygous MYO7A mutations.

    PubMed

    Ramzan, Khushnooda; Al-Owain, Mohammed; Huma, Rozeena; Al-Hazzaa, Selwa A F; Al-Ageel, Sarah; Imtiaz, Faiqa; Al-Sayed, Moeenaldeen

    2018-05-01

    Next generation sequencing (NGS), such as targeted panel sequencing, whole-exome sequencing and whole-genome sequencing, has led to an exponential increase of elucidated genetic causes in both rare diseases and common but heterogeneous disorders. NGS is applied in both research and clinical settings, and clinical exome sequencing (CES), which provides not only the sequence variation data but also clinical interpretation, aids in reaching a final conclusion with regard to a genetic diagnosis. Usher syndrome is a group of disorders, characterized by bilateral sensorineural hearing loss, with or without vestibular dysfunction and retinitis pigmentosa. The index patient, a 2-year-old child, was initially diagnosed with nonsyndromic hearing impairment. Homozygosity mapping followed by CES was utilized as a diagnostic tool to identify the genetic basis of his hearing loss. A paternally inherited novel insertion, c.198_199insA (p.Val67Serfs*73) and a maternally inherited novel deletion, c.1219_1226del (p.Phe407Aspfs*33) in gene MYO7A were found in compound heterozygous state in the index patient. The result expands the mutational spectrum of MYO7A. In addition, it helped in early diagnosis of the syndrome, in planning and adjustments for the patient, as well as in future family planning. This study highlights the clinical effectiveness of CES for Usher syndrome diagnosis in a child presenting with congenital hearing loss. Copyright © 2018. Published by Elsevier B.V.

  1. Overlap and diversity in antimicrobial peptide databases: compiling a non-redundant set of sequences.

    PubMed

    Aguilera-Mendoza, Longendri; Marrero-Ponce, Yovani; Tellez-Ibarra, Roberto; Llorente-Quesada, Monica T; Salgado, Jesús; Barigye, Stephen J; Liu, Jun

    2015-08-01

    The large variety of antimicrobial peptide (AMP) databases developed to date are characterized by a substantial overlap of data and similarity of sequences. Our goals are to analyze the levels of redundancy for all available AMP databases and use this information to build a new non-redundant sequence database. For this purpose, a new software tool is introduced. A comparative study of 25 AMP databases reveals the overlap and diversity among them and the internal diversity within each database. The overlap analysis shows that only one database (Peptaibol) contains exclusive data, not present in any other, whereas all sequences in the LAMP_Patent database are included in CAMP_Patent. However, the majority of databases have their own set of unique sequences, as well as some overlap with other databases. The complete set of non-duplicate sequences comprises 16 990 cases, which is almost half of the total number of reported peptides. On the other hand, the diversity analysis identifies the most and least diverse databases and proves that all databases exhibit some level of redundancy. Finally, we present a new parallel-free software, named Dover Analyzer, developed to compute the overlap and diversity between any number of databases and compile a set of non-redundant sequences. These results are useful for selecting or building a suitable representative set of AMPs, according to specific needs. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  2. Statistical learning of movement.

    PubMed

    Ongchoco, Joan Danielle Khonghun; Uddenberg, Stefan; Chun, Marvin M

    2016-12-01

    The environment is dynamic, but objects move in predictable and characteristic ways, whether they are a dancer in motion, or a bee buzzing around in flight. Sequences of movement are comprised of simpler motion trajectory elements chained together. But how do we know where one trajectory element ends and another begins, much like we parse words from continuous streams of speech? As a novel test of statistical learning, we explored the ability to parse continuous movement sequences into simpler element trajectories. Across four experiments, we showed that people can robustly parse such sequences from a continuous stream of trajectories under increasingly stringent tests of segmentation ability and statistical learning. Observers viewed a single dot as it moved along simple sequences of paths, and were later able to discriminate these sequences from novel and partial ones shown at test. Observers demonstrated this ability when there were potentially helpful trajectory-segmentation cues such as a common origin for all movements (Experiment 1); when the dot's motions were entirely continuous and unconstrained (Experiment 2); when sequences were tested against partial sequences as a more stringent test of statistical learning (Experiment 3); and finally, even when the element trajectories were in fact pairs of trajectories, so that abrupt directional changes in the dot's motion could no longer signal inter-trajectory boundaries (Experiment 4). These results suggest that observers can automatically extract regularities in movement - an ability that may underpin our capacity to learn more complex biological motions, as in sport or dance.

  3. A world of opportunities with nanopore sequencing.

    PubMed

    Leggett, Richard M; Clark, Matthew D

    2017-11-28

    Oxford Nanopore Technologies' MinION sequencer was launched in pre-release form in 2014 and represents an exciting new sequencing paradigm. The device offers multi-kilobase reads and a streamed mode of operation that allows processing of reads as they are generated. Crucially, it is an extremely compact device that is powered from the USB port of a laptop computer, enabling it to be taken out of the lab and facilitating previously impossible in-field sequencing experiments to be undertaken. Many of the initial publications concerning the platform focused on provision of tools to access and analyse the new sequence formats and then demonstrating the assembly of microbial genomes. More recently, as throughput and accuracy have increased, it has been possible to begin work involving more complex genomes and metagenomes. With the release of the high-throughput GridION X5 and PromethION platforms, the sequencing of large genomes will become more cost efficient, and enable the leveraging of extremely long (>100 kb) reads for resolution of complex genomic structures. This review provides a brief overview of nanopore sequencing technology, describes the growing range of nanopore bioinformatics tools, and highlights some of the most influential publications that have emerged over the last 2 years. Finally, we look to the future and the potential the platform has to disrupt work in human, microbiome, and plant genomics. © The Author 2017. Published by Oxford University Press on behalf of the Society for Experimental Biology. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  4. Representation of item position in immediate serial recall: Evidence from intrusion errors.

    PubMed

    Fischer-Baum, Simon; McCloskey, Michael

    2015-09-01

    In immediate serial recall, participants are asked to recall novel sequences of items in the correct order. Theories of the representations and processes required for this task differ in how order information is maintained; some have argued that order is represented through item-to-item associations, while others have argued that each item is coded for its position in a sequence, with position being defined either by distance from the start of the sequence, or by distance from both the start and the end of the sequence. Previous researchers have used error analyses to adjudicate between these different proposals. However, these previous attempts have not allowed researchers to examine the full set of alternative proposals. In the current study, we analyzed errors produced in 2 immediate serial recall experiments that differ in the modality of input (visual vs. aural presentation of words) and the modality of output (typed vs. spoken responses), using new analysis methods that allow for a greater number of alternative hypotheses to be considered. We find evidence that sequence positions are represented relative to both the start and the end of the sequence, and show a contribution of the end-based representation beyond the final item in the sequence. We also find limited evidence for item-to-item associations, suggesting that both a start-end positional scheme and item-to-item associations play a role in representing item order in immediate serial recall. (c) 2015 APA, all rights reserved.

  5. C-TERMINAL FRAGMENT OF TRANSFORMING GROWTH FACTOR BETA-INDUCED PROTEIN (TGFBIp) IS REQUIRED FOR APOPTOSIS IN HUMAN OSTEOSARCOMA CELLS

    PubMed Central

    Zamilpa, Rogelio; Rupaimoole, Rajesha; Phelix, Clyde F.; Somaraki-Cormier, Maria; Haskins, William; Asmis, Reto; LeBaron, Richard G.

    2009-01-01

    Transforming growth factor beta induced protein (TGFBIp), is secreted into the extracellular space. When fragmentation of C-terminal portions is blocked, apoptosis is low, even when the protein is overexpressed. If fragmentation occurs, apoptosis is observed. Whether full-length TGFBIp or integrin-binding fragments released from its C-terminus is necessary for apoptosis remains equivocal. More importantly, the exact portion of the C-terminus that conveys the pro-apoptotic property of TGFBIp is uncertain. It is reportedly within the final 166 amino acids. We sought to determine if this property is dependent upon the final 69 amino acids containing the integrin-binding, EPDIM and RGD, sequences. With MG-63 osteosarcoma cells, transforming growth factor (TGF)-β1 treatment increased expression of TGFBIp over 72 hours (p<0.001). At this time point, apoptosis was significantly increased (p<0.001) and was prevented by an anti-TGFBIp, polyclonal antibody (p<0.05). Overexpression of TGFBIp by transient transfection produced a 2-fold increase in apoptosis (p<0.01). Exogenous purified TGFBIp at concentrations of 37 to 150 nM produced a dose dependent increase in apoptosis (p<0.001). Mass spectrometry analysis of TGFBIp isolated from conditioned medium of cells treated with TGF-β1 revealed truncated forms of TGFBIp that lacked integrin-binding sequences in the C-terminus. Recombinant TGFBIp truncated, similarly, at amino acid 614 failed to induce apoptosis. A recombinant fragment encoding the final 69 amino acids of the TGFBIp C-terminus produced significant apoptosis. This apoptosis level was comparable to that induced by TGF-β1 upregulation of endogenous TGFBIp. Mutation of the integrin-binding sequence EPDIM, but not RGD, blocked apoptosis (p<0.001). These pro-apoptotic actions are dependent on the C-terminus most likely to interact with integrins. PMID:19505574

  6. First results from the IllustrisTNG simulations: the galaxy colour bimodality

    NASA Astrophysics Data System (ADS)

    Nelson, Dylan; Pillepich, Annalisa; Springel, Volker; Weinberger, Rainer; Hernquist, Lars; Pakmor, Rüdiger; Genel, Shy; Torrey, Paul; Vogelsberger, Mark; Kauffmann, Guinevere; Marinacci, Federico; Naiman, Jill

    2018-03-01

    We introduce the first two simulations of the IllustrisTNG project, a next generation of cosmological magnetohydrodynamical simulations, focusing on the optical colours of galaxies. We explore TNG100, a rerun of the original Illustris box, and TNG300, which includes 2 × 2500^3 resolution elements in a volume 20 times larger. Here, we present first results on the galaxy colour bimodality at low redshift. Accounting for the attenuation of stellar light by dust, we compare the simulated (g - r) colours of 10^9 < M⋆/M⊙ < 10^12.5 galaxies to the observed distribution from the Sloan Digital Sky Survey. We find a striking improvement with respect to the original Illustris simulation, as well as excellent quantitative agreement with the observations, with a sharp transition in median colour from blue to red at a characteristic M⋆ ∼ 10^10.5 M⊙. Investigating the build-up of the colour-mass plane and the formation of the red sequence, we demonstrate that the primary driver of galaxy colour transition is supermassive black hole feedback in its low accretion state. Across the entire population the median colour transition time-scale Δt_green is ∼1.6 Gyr, a value which drops for increasingly massive galaxies. We find signatures of the physical process of quenching: at fixed stellar mass, redder galaxies have lower star formation rates, gas fractions, and gas metallicities; their stellar populations are also older and their large-scale interstellar magnetic fields weaker than in bluer galaxies. Finally, we measure the amount of stellar mass growth on the red sequence. Galaxies with M⋆ > 10^11 M⊙ which redden at z < 1 accumulate on average ∼25 per cent of their final z = 0 mass post-reddening; at the same time, ∼18 per cent of such massive galaxies acquire half or more of their final stellar mass while on the red sequence.

  7. Prediction of β-turns in proteins from multiple alignment using neural network

    PubMed Central

    Kaur, Harpreet; Raghava, Gajendra Pal Singh

    2003-01-01

    A neural network-based method has been developed for the prediction of β-turns in proteins by using multiple sequence alignment. Two feed-forward back-propagation networks with a single hidden layer are used where the first-sequence structure network is trained with the multiple sequence alignment in the form of PSI-BLAST–generated position-specific scoring matrices. The initial predictions from the first network and PSIPRED-predicted secondary structure are used as input to the second structure-structure network to refine the predictions obtained from the first net. A significant improvement in prediction accuracy has been achieved by using evolutionary information contained in the multiple sequence alignment. The final network yields an overall prediction accuracy of 75.5% when tested by sevenfold cross-validation on a set of 426 nonhomologous protein chains. The corresponding Qpred, Qobs, and Matthews correlation coefficient values are 49.8%, 72.3%, and 0.43, respectively, and are the best among all the previously published β-turn prediction methods. The Web server BetaTPred2 (http://www.imtech.res.in/raghava/betatpred2/) has been developed based on this approach. PMID:12592033
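    The accuracy measures quoted in this abstract can be computed from a per-residue confusion matrix. The sketch below assumes the definitions commonly used in the β-turn prediction literature: overall accuracy is the percentage of all residues predicted correctly, Qpred is the percentage of predicted turn residues that are correct, and Qobs is the percentage of observed turn residues that are predicted. The counts in the example are invented for illustration and are not the paper's data.

```python
import math


def turn_metrics(tp, tn, fp, fn):
    """Compute overall accuracy, Qpred, Qobs (all in percent) and the Matthews correlation coefficient.

    tp/fn: observed turn residues predicted as turn / non-turn.
    tn/fp: observed non-turn residues predicted as non-turn / turn.
    """
    total = tp + tn + fp + fn
    q_total = 100.0 * (tp + tn) / total
    q_pred = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0
    q_obs = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return q_total, q_pred, q_obs, mcc


# Invented counts, for illustration only.
q_total, q_pred, q_obs, mcc = turn_metrics(tp=40, tn=40, fp=10, fn=10)
```

    Because turn residues are a minority class, overall accuracy alone can look high for a predictor that rarely predicts turns, which is why MCC is usually reported alongside it.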

  8. Prediction of pi-turns in proteins using PSI-BLAST profiles and secondary structure information.

    PubMed

    Wang, Yan; Xue, Zhi-Dong; Shi, Xiao-Hong; Xu, Jin

    2006-09-01

    Due to the structural and functional importance of tight turns, some methods have been proposed to predict gamma-turns, beta-turns, and alpha-turns in proteins. In the past, studies of pi-turns were made, but not a single prediction approach has been developed so far. It will be useful to develop a method for identifying pi-turns in a protein sequence. In this paper, the support vector machine (SVM) method has been introduced to predict pi-turns from the amino acid sequence. The training and testing of this approach is performed with a newly collected data set of 640 non-homologous protein chains containing 1931 pi-turns. Different sequence encoding schemes have been explored in order to investigate their effects on the prediction performance. With multiple sequence alignment and predicted secondary structure, the final SVM model yields a Matthews correlation coefficient (MCC) of 0.556 by a 7-fold cross-validation. A web server implementing the prediction method is available at the following URL: http://210.42.106.80/piturn/.

  9. Otoplasty: sequencing the operation for improved results.

    PubMed

    Hoehn, James G; Ashruf, Salman

    2005-01-01

    After studying this article, the participant should be able to: 1. Understand the anatomy and embryology of the external ear. 2. Understand the anatomic causes of the prominent ear. 3. Understand the operative maneuvers used to shape the external ear. 4. Be able to sequence the otoplasty for consistent results. 5. Understand the possible complications of the otoplasty procedure. Correction of prominent ears is a common plastic surgical procedure. Proper execution of the surgical techniques is dependent on the surgeon's understanding of the surgical procedure. This understanding is best founded on an understanding of the historical bases for the operative steps and the execution of these operative steps in a logical fashion. This article describes the concept of sequencing the operation of otoplasty to produce predictable results combining the technical contributions from many authors. The historical, embryological, and anatomic bases for the operation are also discussed. Finally, the authors' preferred techniques are presented. Sequencing the steps in the preoperative assessment, preoperative planning, patient management, operative technique, and postoperative care will produce reproducible results for the attentive surgeon. Careful attention to the details of the operation of otoplasty will avoid many postoperative problems.

  10. High-resolution biophysical analysis of the dynamics of nucleosome formation

    PubMed Central

    Hatakeyama, Akiko; Hartmann, Brigitte; Travers, Andrew; Nogues, Claude; Buckle, Malcolm

    2016-01-01

    We describe a biophysical approach that enables changes in the structure of DNA to be followed during nucleosome formation in in vitro reconstitution with either the canonical “Widom” sequence or a judiciously mutated sequence. The rapid non-perturbing photochemical analysis presented here provides ‘snapshots’ of the DNA configuration at any given moment in time during nucleosome formation under a very broad range of reaction conditions. Changes in DNA photochemical reactivity upon protein binding are interpreted as being mainly induced by alterations in individual base pair roll angles. The results strengthen the importance of the role of an initial (H3/H4)2 histone tetramer-DNA interaction and highlight the modulation of this early event by the DNA sequence. (H3/H4)2 binding precedes and dictates subsequent H2A/H2B-DNA interactions, which are less affected by the DNA sequence, leading to the final octameric nucleosome. Overall, our results provide a novel, exciting way to investigate those biophysical properties of DNA that constitute a crucial component in nucleosome formation and stabilization. PMID:27263658

  11. TITAN: inference of copy number architectures in clonal cell populations from tumor whole-genome sequence data

    PubMed Central

    Roth, Andrew; Khattra, Jaswinder; Ho, Julie; Yap, Damian; Prentice, Leah M.; Melnyk, Nataliya; McPherson, Andrew; Bashashati, Ali; Laks, Emma; Biele, Justina; Ding, Jiarui; Le, Alan; Rosner, Jamie; Shumansky, Karey; Marra, Marco A.; Gilks, C. Blake; Huntsman, David G.; McAlpine, Jessica N.; Aparicio, Samuel

    2014-01-01

    The evolution of cancer genomes within a single tumor creates mixed cell populations with divergent somatic mutational landscapes. Inference of tumor subpopulations has been disproportionately focused on the assessment of somatic point mutations, whereas computational methods targeting evolutionary dynamics of copy number alterations (CNA) and loss of heterozygosity (LOH) in whole-genome sequencing data remain underdeveloped. We present a novel probabilistic model, TITAN, to infer CNA and LOH events while accounting for mixtures of cell populations, thereby estimating the proportion of cells harboring each event. We evaluate TITAN on idealized mixtures, simulating clonal populations from whole-genome sequences taken from genomically heterogeneous ovarian tumor sites collected from the same patient. In addition, we show in 23 whole genomes of breast tumors that the inference of CNA and LOH using TITAN critically informs population structure and the nature of the evolving cancer genome. Finally, we experimentally validated subclonal predictions using fluorescence in situ hybridization (FISH) and single-cell sequencing from an ovarian cancer patient sample, thereby recapitulating the key modeling assumptions of TITAN. PMID:25060187

  12. Analysis of the origin of predictability in human communications

    NASA Astrophysics Data System (ADS)

    Zhang, Lin; Liu, Yani; Wu, Ye; Xiao, Jinghua

    2014-01-01

    Human behaviors in daily life can be traced through communications via electronic devices. E-mails, short messages and cell-phone calls can be used to investigate the predictability of communication partners’ patterns, because these three are the most representative and common behaviors in daily communications. In this paper, we show that all three modes have apparent predictability in partners’ patterns, and moreover, that short message users’ sequences have the highest predictability among the three. We also reveal that people with fewer communication partners have higher predictability. Finally, we investigate the origin of predictability, which comes from two aspects. One is the intrinsic pattern in the partner sequence, that is, people prefer to communicate with a fixed partner after another fixed one. The other is the burst, which is communicating with the same partner several times in a row. The high burst in the short message communication pattern is one of the main reasons for its high predictability, the intrinsic pattern in the e-mail partner sequence is the main reason for its predictability, and the predictability of the cell-phone call partner sequence comes from both aspects.
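    The two sources of predictability named in this abstract, bursts and partner-to-partner patterns, can be made concrete with a small sketch. It is not the estimator used in the paper: `burst_fraction` and `successor_counts` are hypothetical helper names, and the example sequence is invented.

```python
from collections import Counter, defaultdict


def burst_fraction(partners):
    """Fraction of consecutive contacts that repeat the previous partner."""
    pairs = list(zip(partners, partners[1:]))
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


def successor_counts(partners):
    """Count which partner follows which, ignoring repeats of the same partner.

    A partner whose successor is nearly always the same person reflects the
    'fixed partner after another fixed one' pattern described above.
    """
    counts = defaultdict(Counter)
    for a, b in zip(partners, partners[1:]):
        if a != b:
            counts[a][b] += 1
    return counts


# Invented partner sequence for one user, in time order.
example = ["a", "a", "b", "c", "b", "c", "a"]
bursts = burst_fraction(example)
followers = successor_counts(example)
```

    Separating repeats from transitions keeps the two effects apart: a sequence can be highly bursty while its transitions between distinct partners remain close to random, and the reverse.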

  13. Covalent attachment of TAT peptides and thiolated alkyl molecules on GaAs surfaces.

    PubMed

    Cho, Youngnam; Ivanisevic, Albena

    2005-07-07

    Four TAT peptide fragments were used to functionalize GaAs surfaces by adsorption from solution. In addition, two well-studied alkylthiols, mercaptohexadecanoic acid (MHA) and 1-octadecanethiol (ODT) were utilized as references to understand the structure of the TAT peptide monolayer on GaAs. The different sequences of TAT peptides were employed in recognition experiments where a synthetic RNA sequence was tested to verify the specific interaction with the TAT peptide. The modified GaAs surfaces were characterized by atomic force microscopy (AFM), X-ray photoelectron spectroscopy (XPS), and Fourier transform infrared reflection absorption spectroscopy (FT-IRRAS). AFM studies were used to compare the surface roughness before and after functionalization. XPS allowed us to characterize the chemical composition of the GaAs surface and conclude that the monolayers composed of different sequences of peptides have similar surface chemistries. Finally, FT-IRRAS experiments enabled us to deduce that the TAT peptide monolayers have a fairly ordered and densely packed alkyl chain structure. The recognition experiments showed preferred interaction of the RNA sequence toward peptides with high arginine content.

  14. A Hierarchical Convolutional Neural Network for vesicle fusion event classification.

    PubMed

    Li, Haohan; Mao, Yunxiang; Yin, Zhaozheng; Xu, Yingke

    2017-09-01

    Quantitative analysis of vesicle exocytosis and classification of different modes of vesicle fusion from fluorescence microscopy are of primary importance for biomedical research. In this paper, we propose a novel Hierarchical Convolutional Neural Network (HCNN) method to automatically identify vesicle fusion events in time-lapse Total Internal Reflection Fluorescence Microscopy (TIRFM) image sequences. Firstly, a detection and tracking method is developed to extract image patch sequences containing potential fusion events. Then, a Gaussian Mixture Model (GMM) is applied on each image patch of the patch sequence with outliers rejected for robust Gaussian fitting. By utilizing the high-level time-series intensity change features introduced by GMM and the visual appearance features embedded in some key moments of the fusion process, the proposed HCNN architecture is able to classify each candidate patch sequence into three classes: full fusion event, partial fusion event and non-fusion event. Finally, we validate the performance of our method on 9 challenging datasets that have been annotated by cell biologists, and our method achieves better performance than three previous methods. Copyright © 2017 Elsevier Ltd. All rights reserved.

  15. Microbes in deep marine sediments viewed through amplicon sequencing and metagenomics

    NASA Astrophysics Data System (ADS)

    Biddle, J.; Leon, Z. R.; Russell, J. A., III; Martino, A. J.

    2016-12-01

    Nearly twenty percent of microbial biomass on Earth can be found in the marine subsurface. The majority of this is concentrated on continental margins, which have been investigated by scientific drilling. On the Costa Rica Margin, Iberian Margin and Peru Margins, sediment samples have been investigated through DNA extraction followed by amplicon and metagenomic sequencing. Overall samples show a high degree of microbial diversity, including many lineages of newly defined groups. In this talk, metagenome assembled genomes of unusual lineages will be presented, including their relationships to shallower relatives. From Costa Rica, in particular, we have retrieved deep relatives of Lokiarchaeota and Thorarchaeota, as well as other deeply branching archaeal relatives. We discuss their genome similarities to both other archaea and eukaryotes. From the Iberian Margin, relatives of Atribacteria and Aerophobetes will be discussed. Finally, we will detail the knowledge lost or gained depending on whether samples are studied via amplicon sequencing or total metagenomics, as studies in other environments have shown that up to 15% of microbial diversity is ignored when samples are studied via amplicon sequencing alone.

  16. Effects of tonal language background on tests of temporal sequencing in children.

    PubMed

    Mukari, Siti Zamratol-Mai S; Yu, Xuan; Ishak, Wan Syafira; Mazlan, Rafidah

    2015-01-01

    The aims of the present study were to determine the effects of language background on the performance of the pitch pattern sequence test (PPST) and duration pattern sequence test (DPST). As temporal order sequencing may be affected by age and working memory, these factors were also studied. Performance of tonal and non-tonal language speakers on the PPST and DPST was compared. Twenty-eight native Mandarin (tonal language) speakers and twenty-nine native Malay (non-tonal language) speakers between seven and nine years old participated in this study. The results revealed that relative to native Malay speakers, native Mandarin speakers demonstrated better scores on the PPST in both humming and verbal labeling responses. However, a similar language effect was not apparent in the DPST. An age effect was only significant in the PPST (verbal labeling). Finally, no significant effect of working memory was found on the PPST and the DPST. These findings suggest that the PPST is affected by tonal language background, and highlight the importance of developing different normative values for tonal and non-tonal language speakers.

  17. Reference genotype and exome data from an Australian Aboriginal population for health-based research

    PubMed Central

    Tang, Dave; Anderson, Denise; Francis, Richard W.; Syn, Genevieve; Jamieson, Sarra E.; Lassmann, Timo; Blackwell, Jenefer M.

    2016-01-01

    Genetic analyses, including genome-wide association studies and whole exome sequencing (WES), provide powerful tools for the analysis of complex and rare genetic diseases. To date there are no reference data for Aboriginal Australians to underpin the translation of health-based genomic research. Here we provide a catalogue of variants called after sequencing the exomes of 72 Aboriginal individuals to a depth of 20X coverage in ∼80% of the sequenced nucleotides. We determined 320,976 single nucleotide variants (SNVs) and 47,313 insertions/deletions using the Genome Analysis Toolkit. We had previously genotyped a subset of the Aboriginal individuals (70/72) using the Illumina Omni2.5 BeadChip platform and found ~99% concordance at overlapping sites, which suggests high quality genotyping. Finally, we compared our SNVs to six publicly available variant databases, such as dbSNP and the Exome Sequencing Project, and 70,115 of our SNVs did not overlap any of the single nucleotide polymorphic sites in all the databases. Our data set provides a useful reference point for genomic studies on Aboriginal Australians. PMID:27070114
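    The concordance check mentioned in this abstract, comparing exome calls with array genotypes at overlapping sites, can be sketched as follows. The data layout (a dictionary keyed by chromosome and position with `A/G`-style genotype strings) and the function names are assumptions made for illustration; they do not reflect the authors' pipeline or any Genome Analysis Toolkit interface.

```python
def canonical(genotype):
    """Treat genotypes as unordered allele pairs, so 'A/G' equals 'G/A'."""
    return tuple(sorted(genotype.split("/")))


def concordance(calls_a, calls_b):
    """Fraction of sites present in both call sets whose genotypes agree.

    Returns None when the two call sets share no sites.
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        return None
    matches = sum(canonical(calls_a[s]) == canonical(calls_b[s]) for s in shared)
    return matches / len(shared)


# Invented calls for one sample, keyed by (chromosome, position).
exome = {("chr1", 100): "A/G", ("chr1", 200): "C/C", ("chr2", 50): "T/T"}
array = {("chr1", 100): "G/A", ("chr1", 200): "C/T", ("chr3", 5): "A/A"}
rate = concordance(exome, array)
```

    Restricting the comparison to shared sites matters because the two platforms cover different positions; sites seen by only one platform say nothing about agreement.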

  18. Reference genotype and exome data from an Australian Aboriginal population for health-based research.

    PubMed

    Tang, Dave; Anderson, Denise; Francis, Richard W; Syn, Genevieve; Jamieson, Sarra E; Lassmann, Timo; Blackwell, Jenefer M

    2016-04-12

    Genetic analyses, including genome-wide association studies and whole exome sequencing (WES), provide powerful tools for the analysis of complex and rare genetic diseases. To date there are no reference data for Aboriginal Australians to underpin the translation of health-based genomic research. Here we provide a catalogue of variants called after sequencing the exomes of 72 Aboriginal individuals to a depth of 20X coverage in ∼80% of the sequenced nucleotides. We determined 320,976 single nucleotide variants (SNVs) and 47,313 insertions/deletions using the Genome Analysis Toolkit. We had previously genotyped a subset of the Aboriginal individuals (70/72) using the Illumina Omni2.5 BeadChip platform and found ~99% concordance at overlapping sites, which suggests high quality genotyping. Finally, we compared our SNVs to six publicly available variant databases, such as dbSNP and the Exome Sequencing Project, and 70,115 of our SNVs did not overlap any of the single nucleotide polymorphic sites in all the databases. Our data set provides a useful reference point for genomic studies on Aboriginal Australians.

  19. Characterization of the complete mitochondrial genome of Ortleppascaris sinensis (Nematoda: Heterocheilidae) and comparative mitogenomic analysis of eighteen Ascaridida nematodes.

    PubMed

    Zhao, J H; Tu, G J; Wu, X B; Li, C P

    2018-05-01

    Ortleppascaris sinensis (Nematoda: Ascaridida) is a dominant intestinal nematode of the captive Chinese alligator. However, the epidemiology, molecular ecology and population genetics of this parasite remain largely unexplored. In this study, the complete mitochondrial (mt) genome sequence of O. sinensis was first determined using a polymerase chain reaction (PCR)-based primer-walking strategy, and this is also the first sequencing of the complete mitochondrial genome of a member of the genus Ortleppascaris. The circular mitochondrial genome (13,828 bp) of O. sinensis contained 12 protein-coding, 22 transfer RNA and 2 ribosomal RNA genes, but lacked the ATP synthetase subunit 8 gene. Finally, phylogenetic analysis of mtDNAs indicated that the genus Ortleppascaris should be attributed to the family Heterocheilidae. It is necessary to sequence more mtDNAs of Ortleppascaris nematodes in the future to test and confirm our conclusion. The complete mitochondrial genome sequence of O. sinensis reported here should contribute to molecular diagnosis, epidemiological investigations and ecological studies of O. sinensis and other related Ascaridida nematodes.

  20. The report of my death was an exaggeration: A review for researchers using microsatellites in the 21st century

    PubMed Central

    Hodel, Richard G. J.; Segovia-Salcedo, M. Claudia; Landis, Jacob B.; Crowl, Andrew A.; Sun, Miao; Liu, Xiaoxian; Gitzendanner, Matthew A.; Douglas, Norman A.; Germain-Aubrey, Charlotte C.; Chen, Shichao; Soltis, Douglas E.; Soltis, Pamela S.

    2016-01-01

    Microsatellites, or simple sequence repeats (SSRs), have long played a major role in genetic studies due to their typically high polymorphism. They have diverse applications, including genome mapping, forensics, ascertaining parentage, population and conservation genetics, identification of the parentage of polyploids, and phylogeography. We compare SSRs and newer methods, such as genotyping by sequencing (GBS) and restriction site associated DNA sequencing (RAD-Seq), and offer recommendations for researchers considering which genetic markers to use. We also review the variety of techniques currently used for identifying microsatellite loci and developing primers, with a particular focus on those that make use of next-generation sequencing (NGS). Additionally, we review software for microsatellite development and report on an experiment to assess the utility of currently available software for SSR development. Finally, we discuss the future of microsatellites and make recommendations for researchers preparing to use microsatellites. We argue that microsatellites still have an important place in the genomic age as they remain effective and cost-efficient markers. PMID:27347456

  1. WormBase 2014: new views of curated biology

    PubMed Central

    Harris, Todd W.; Baran, Joachim; Bieri, Tamberlyn; Cabunoc, Abigail; Chan, Juancarlos; Chen, Wen J.; Davis, Paul; Done, James; Grove, Christian; Howe, Kevin; Kishore, Ranjana; Lee, Raymond; Li, Yuling; Muller, Hans-Michael; Nakamura, Cecilia; Ozersky, Philip; Paulini, Michael; Raciti, Daniela; Schindelman, Gary; Tuli, Mary Ann; Auken, Kimberly Van; Wang, Daniel; Wang, Xiaodong; Williams, Gary; Wong, J. D.; Yook, Karen; Schedl, Tim; Hodgkin, Jonathan; Berriman, Matthew; Kersey, Paul; Spieth, John; Stein, Lincoln; Sternberg, Paul W.

    2014-01-01

    WormBase (http://www.wormbase.org/) is a highly curated resource dedicated to supporting research using the model organism Caenorhabditis elegans. With an electronic history predating the World Wide Web, WormBase contains information ranging from the sequence and phenotype of individual alleles to genome-wide studies generated using next-generation sequencing technologies. In recent years, we have expanded the contents to include data on additional nematodes of agricultural and medical significance, bringing the knowledge of C. elegans to bear on these systems and providing support for underserved research communities. Manual curation of the primary literature remains a central focus of the WormBase project, providing users with reliable, up-to-date and highly cross-linked information. In this update, we describe efforts to organize the original atomized and highly contextualized curated data into integrated syntheses of discrete biological topics. Next, we discuss our experiences coping with the vast increase in available genome sequences made possible through next-generation sequencing platforms. Finally, we describe some of the features and tools of the new WormBase Web site that help users better find and explore data of interest. PMID:24194605

  2. Coping with Volume and Variety in Temporal Event Sequences: Strategies for Sharpening Analytic Focus.

    PubMed

    Fan Du; Shneiderman, Ben; Plaisant, Catherine; Malik, Sana; Perer, Adam

    2017-06-01

    The growing volume and variety of data presents both opportunities and challenges for visual analytics. Addressing these challenges is needed for big data to provide valuable insights and novel solutions for business, security, social media, and healthcare. In the case of temporal event sequence analytics it is the number of events in the data and variety of temporal sequence patterns that challenges users of visual analytic tools. This paper describes 15 strategies for sharpening analytic focus that analysts can use to reduce the data volume and pattern variety. Four groups of strategies are proposed: (1) extraction strategies, (2) temporal folding, (3) pattern simplification strategies, and (4) iterative strategies. For each strategy, we provide examples of the use and impact of this strategy on volume and/or variety. Examples are selected from 20 case studies gathered from either our own work, the literature, or based on email interviews with individuals who conducted the analyses and developers who observed analysts using the tools. Finally, we discuss how these strategies might be combined and report on the feedback from 10 senior event sequence analysts.

  3. SubCellProt: predicting protein subcellular localization using machine learning approaches.

    PubMed

    Garg, Prabha; Sharma, Virag; Chaudhari, Pradeep; Roy, Nilanjan

    2009-01-01

    High-throughput genome sequencing projects continue to churn out enormous amounts of raw sequence data. However, most of this raw sequence data is unannotated and, hence, not very useful. Among the various approaches to decipher the function of a protein, one is to determine its localization. Experimental approaches for proteome annotation including determination of a protein's subcellular localizations are very costly and labor intensive. Besides the available experimental methods, in silico methods present alternative approaches to accomplish this task. Here, we present two machine learning approaches for prediction of the subcellular localization of a protein from the primary sequence information. Two machine learning algorithms, k Nearest Neighbor (k-NN) and Probabilistic Neural Network (PNN) were used to classify an unknown protein into one of the 11 subcellular localizations. The final prediction is made on the basis of a consensus of the predictions made by two algorithms and a probability is assigned to it. The results indicate that the primary sequence derived features like amino acid composition, sequence order and physicochemical properties can be used to assign subcellular localization with a fair degree of accuracy. Moreover, with the enhanced accuracy of our approach and the definition of a prediction domain, this method can be used for proteome annotation in a high throughput manner. SubCellProt is available at www.databases.niper.ac.in/SubCellProt.

  4. Large-scale transcriptome sequencing and gene analyses in the crab-eating macaque (Macaca fascicularis) for biomedical research

    PubMed Central

    2012-01-01

    Background As a human replacement, the crab-eating macaque (Macaca fascicularis) is an invaluable non-human primate model for biomedical research, but the lack of genetic information on this primate has represented a significant obstacle for its broader use. Results Here, we sequenced the transcriptome of 16 tissues originated from two individuals of crab-eating macaque (male and female), and identified genes to resolve the main obstacles for understanding the biological response of the crab-eating macaque. From 4 million reads with 1.4 billion base sequences, 31,786 isotigs containing genes similar to those of humans, 12,672 novel isotigs, and 348,160 singletons were identified using the GS FLX sequencing method. Approximately 86% of human genes were represented among the genes sequenced in this study. Additionally, 175 tissue-specific transcripts were identified, 81 of which were experimentally validated. In total, 4,314 alternative splicing (AS) events were identified and analyzed. Intriguingly, 10.4% of AS events were associated with transposable element (TE) insertions. Finally, investigation of TE exonization events and evolutionary analysis were conducted, revealing interesting phenomena of human-specific amplified trends in TE exonization events. Conclusions This report represents the first large-scale transcriptome sequencing and genetic analyses of M. fascicularis and could contribute to its utility for biomedical research and basic biology. PMID:22554259

  5. Understanding the complex evolution of rapidly mutating viruses with deep sequencing: Beyond the analysis of viral diversity.

    PubMed

    Leung, Preston; Eltahla, Auda A; Lloyd, Andrew R; Bull, Rowena A; Luciani, Fabio

    2017-07-15

    With the advent of affordable deep sequencing technologies, detection of low frequency variants within genetically diverse viral populations can now be achieved with unprecedented depth and efficiency. The high-resolution data provided by next generation sequencing technologies is currently recognised as the gold standard in estimation of viral diversity. In the analysis of rapidly mutating viruses, longitudinal deep sequencing datasets from viral genomes during individual infection episodes, as well as at the epidemiological level during outbreaks, now allow for more sophisticated analyses such as statistical estimates of the impact of complex mutation patterns on the evolution of the viral populations both within and between hosts. These analyses are revealing more accurate descriptions of the evolutionary dynamics that underpin the rapid adaptation of these viruses to the host response, and to drug therapies. This review assesses recent developments in methods and provides informative research examples using deep sequencing data generated from rapidly mutating viruses infecting humans, particularly hepatitis C virus (HCV), human immunodeficiency virus (HIV), Ebola virus and influenza virus, to understand the evolution of viral genomes and to explore the relationship between viral mutations and the host adaptive immune response. Finally, we discuss limitations in current technologies, and future directions that take advantage of publicly available large deep sequencing datasets. Copyright © 2016 Elsevier B.V. All rights reserved.

  6. Regularized rare variant enrichment analysis for case-control exome sequencing data.

    PubMed

    Larson, Nicholas B; Schaid, Daniel J

    2014-02-01

    Rare variants have recently garnered an immense amount of attention in genetic association analysis. However, unlike methods traditionally used for single marker analysis in GWAS, rare variant analysis often requires some method of aggregation, since single marker approaches are poorly powered for typical sequencing study sample sizes. Advancements in sequencing technologies have rendered next-generation sequencing platforms a realistic alternative to traditional genotyping arrays. Exome sequencing in particular not only provides base-level resolution of genetic coding regions, but also a natural paradigm for aggregation via genes and exons. Here, we propose the use of penalized regression in combination with variant aggregation measures to identify rare variant enrichment in exome sequencing data. In contrast to marginal gene-level testing, we simultaneously evaluate the effects of rare variants in multiple genes, focusing on gene-based least absolute shrinkage and selection operator (LASSO) and exon-based sparse group LASSO models. By using gene membership as a grouping variable, the sparse group LASSO can be used as a gene-centric analysis of rare variants while also providing a penalized approach toward identifying specific regions of interest. We apply extensive simulations to evaluate the performance of these approaches with respect to specificity and sensitivity, comparing these results to multiple competing marginal testing methods. Finally, we discuss our findings and outline future research. © 2013 WILEY PERIODICALS, INC.

  7. Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species

    PubMed Central

    2013-01-01

    Background The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. Results In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. Conclusions Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another. PMID:23870653

  8. Combined subtraction hybridization and polymerase chain reaction amplification procedure for isolation of strain-specific Rhizobium DNA sequences.

    PubMed Central

    Bjourson, A J; Stone, C E; Cooper, J E

    1992-01-01

    A novel subtraction hybridization procedure, incorporating a combination of four separation strategies, was developed to isolate unique DNA sequences from a strain of Rhizobium leguminosarum bv. trifolii. Sau3A-digested DNA from this strain, i.e., the probe strain, was ligated to a linker and hybridized in solution with an excess of pooled subtracter DNA from seven other strains of the same biovar which had been restricted, ligated to a different, biotinylated, subtracter-specific linker, and amplified by polymerase chain reaction to incorporate dUTP. Subtracter DNA and subtracter-probe hybrids were removed by phenol-chloroform extraction of a streptavidin-biotin-DNA complex. NENSORB chromatography of the sequences remaining in the aqueous layer captured biotinylated subtracter DNA which may have escaped removal by phenol-chloroform treatment. Any traces of contaminating subtracter DNA were removed by digestion with uracil DNA glycosylase. Finally, remaining sequences were amplified by polymerase chain reaction with a probe strain-specific primer, labelled with 32P, and tested for specificity in dot blot hybridizations against total genomic target DNA from each strain in the subtracter pool. Two rounds of subtraction-amplification were sufficient to remove cross-hybridizing sequences and to give a probe which hybridized only with homologous target DNA. The method is applicable to the isolation of DNA and RNA sequences from both procaryotic and eucaryotic cells. Images PMID:1637166

  9. Discovery and profiling of novel and conserved microRNAs during flower development in Carya cathayensis via deep sequencing.

    PubMed

    Wang, Zheng Jia; Huang, Jian Qin; Huang, You Jun; Li, Zheng; Zheng, Bing Song

    2012-08-01

    Hickory (Carya cathayensis Sarg.) is an economically important woody plant in China, but its long juvenile phase delays yield. MicroRNAs (miRNAs) are critical regulators of genes and important for normal plant development and physiology, including flower development. We used Solexa technology to sequence two small RNA libraries from two floral differentiation stages in hickory to identify miRNAs related to flower development. We identified 39 conserved miRNA sequences from 114 loci belonging to 23 families as well as two novel and ten potential novel miRNAs belonging to nine families. Moreover, 35 conserved miRNA*s and two novel miRNA*s were detected. Twenty miRNA sequences from 49 loci belonging to 11 families were differentially expressed; all were up-regulated at the later stage of flower development in hickory. Quantitative real-time PCR of 12 conserved miRNA sequences, five novel miRNA families, and two novel miRNA*s validated that all were expressed during hickory flower development, and the expression patterns were similar to those detected with Solexa sequencing. Finally, a total of 146 targets of the novel and conserved miRNAs were predicted. This study identified a diverse set of miRNAs that were closely related to hickory flower development and that could help in plant floral induction.

  10. Mission Management Computer and Sequencing Hardware for RLV-TD HEX-01 Mission

    NASA Astrophysics Data System (ADS)

    Gupta, Sukrat; Raj, Remya; Mathew, Asha Mary; Koshy, Anna Priya; Paramasivam, R.; Mookiah, T.

    2017-12-01

    The Reusable Launch Vehicle-Technology Demonstrator Hypersonic Experiment (RLV-TD HEX-01) mission posed some unique challenges in the design and development of avionics hardware. This work presents the details of mission critical avionics hardware, mainly the Mission Management Computer (MMC) and sequencing hardware. The Navigation, Guidance and Control (NGC) chain for RLV-TD is dual redundant with cross-strapped Remote Terminals (RTs) interfaced through a MIL-STD-1553B bus. The MMC is the Bus Controller on the 1553 bus and performs GPS aided navigation, guidance, digital autopilot and sequencing for the RLV-TD launch vehicle at different periodicities (10, 20, 500 ms). Digital autopilot execution in the MMC with a periodicity of 10 ms (in ascent phase) is introduced for the first time and was successfully demonstrated in flight. The MMC is built around an Intel i960 processor and has inbuilt fault tolerance features like ECC for memories. Fault Detection and Isolation schemes are implemented to isolate a failed MMC. The sequencing hardware comprises the Stage Processing System (SPS) and the Command Execution Module (CEM). The SPS is an 'RT' on the 1553 bus which receives the sequencing and control related commands from the MMCs and posts them to downstream modules after proper error handling for final execution. The SPS is designed as a high reliability system by incorporating various fault tolerance and fault detection features. The CEM is a relay based module for sequence command execution.

  11. Systematic and stochastic influences on the performance of the MinION nanopore sequencer across a range of nucleotide bias

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Krishnakumar, Raga; Sinha, Anupama; Bird, Sara W.

    Emerging sequencing technologies are allowing us to characterize environmental, clinical and laboratory samples with increasing speed and detail, including real-time analysis and interpretation of data. One example of this is being able to rapidly and accurately detect a wide range of pathogenic organisms, both in the clinic and the field. Genomes can have radically different GC content, however, such that accurate sequence analysis can be challenging depending upon the technology used. Here, we have characterized the performance of the Oxford MinION nanopore sequencer for detection and evaluation of organisms with a range of genomic nucleotide bias. We have diagnosed the quality of base-calling across individual reads and discovered that the position within the read affects base-calling and quality scores. Finally, we have evaluated the performance of the current state-of-the-art neural network-based MinION basecaller, characterizing its behavior with respect to systemic errors as well as context- and sequence-specific errors. Overall, we present a detailed characterization of the capabilities of the MinION in terms of generating high-accuracy sequence data from genomes with a wide range of nucleotide content. This study provides a framework for designing the appropriate experiments that are likely to lead to accurate and rapid field-forward diagnostics.

  12. A transcriptome-wide, organ-specific regulatory map of Dendrobium officinale, an important traditional Chinese orchid herb

    PubMed Central

    Meng, Yijun; Yu, Dongliang; Xue, Jie; Lu, Jiangjie; Feng, Shangguo; Shen, Chenjia; Wang, Huizhong

    2016-01-01

    Dendrobium officinale is an important traditional Chinese herb. Here, we did a transcriptome-wide, organ-specific study on this valuable plant by combining RNA, small RNA (sRNA) and degradome sequencing. RNA sequencing of four organs (flower, root, leaf and stem) of Dendrobium officinale enabled us to obtain 536,558 assembled transcripts, from which 2,645, 256, 42 and 54 were identified to be highly expressed in the four organs respectively. Based on sRNA sequencing, 2,038, 2, 21 and 24 sRNAs were identified to be specifically accumulated in the four organs respectively. A total of 1,047 mature microRNA (miRNA) candidates were detected. Based on secondary structure predictions and sequencing, tens of potential miRNA precursors were identified from the assembled transcripts. Interestingly, phase-distributed sRNAs with degradome-based processing evidence were discovered on the long-stem structures of two precursors. Target identification was performed for the 1,047 miRNA candidates, resulting in the discovery of 1,257 miRNA-target pairs. Finally, some biologically meaningful subnetworks involving hormone signaling, development, secondary metabolism and Argonaute 1-related regulation were established. All of the sequencing data sets are available at NCBI Sequence Read Archive (http://www.ncbi.nlm.nih.gov/sra/). In summary, our study provides a valuable resource for in-depth molecular and functional studies on this important Chinese orchid herb. PMID:26732614

  13. Modeling of prepregs during automated draping sequences

    NASA Astrophysics Data System (ADS)

    Krogh, Christian; Glud, Jens A.; Jakobsen, Johnny

    2017-10-01

    The behavior of woven prepreg fabric during automated draping sequences is investigated. A drape tool under development with an arrangement of grippers facilitates the placement of a woven prepreg fabric in a mold. It is essential that the draped configuration is free from wrinkles and other defects. The present study aims at setting up a virtual draping framework capable of modeling the draping process from the initial flat fabric to the final double curved shape and aims at assisting the development of an automated drape tool. The virtual draping framework consists of a kinematic mapping algorithm used to generate target points on the mold which are used as input to a draping sequence planner. The draping sequence planner prescribes the displacement history for each gripper in the drape tool and these displacements are then applied to each gripper in a transient model of the draping sequence. The model is based on a transient finite element analysis with the material's constitutive behavior currently being approximated as linear elastic orthotropic. In-plane tensile and bias-extension tests as well as bending tests are conducted and used as input for the model. The virtual draping framework shows good potential for obtaining a better understanding of the drape process and guiding the development of the drape tool. However, results obtained from using the framework on a simple test case indicate that the generation of draping sequences is non-trivial.

  14. The Universal Protein Resource (UniProt): an expanding universe of protein information.

    PubMed

    Wu, Cathy H; Apweiler, Rolf; Bairoch, Amos; Natale, Darren A; Barker, Winona C; Boeckmann, Brigitte; Ferro, Serenella; Gasteiger, Elisabeth; Huang, Hongzhan; Lopez, Rodrigo; Magrane, Michele; Martin, Maria J; Mazumder, Raja; O'Donovan, Claire; Redaschi, Nicole; Suzek, Baris

    2006-01-01

    The Universal Protein Resource (UniProt) provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations and literature-based evidence attribution enable scientists to analyse proteins and query across databases. The UniProt Reference Clusters (UniRef) speed similarity searches via sequence space compression by merging sequences that are 100% (UniRef100), 90% (UniRef90) or 50% (UniRef50) identical. Finally, the UniProt Archive (UniParc) stores all publicly available protein sequences, containing the history of sequence data with links to the source databases. UniProt databases continue to grow in size and in availability of information. Recent and upcoming changes to database contents, formats, controlled vocabularies and services are described. New download availability includes all major releases of UniProtKB, sequence collections by taxonomic division and complete proteomes. A bibliography mapping service has been added, and an ID mapping service will be available soon. UniProt databases can be accessed online at http://www.uniprot.org or downloaded at ftp://ftp.uniprot.org/pub/databases/.

  15. Systematic and stochastic influences on the performance of the MinION nanopore sequencer across a range of nucleotide bias

    DOE PAGES

    Krishnakumar, Raga; Sinha, Anupama; Bird, Sara W.; ...

    2018-02-16

    Emerging sequencing technologies are allowing us to characterize environmental, clinical and laboratory samples with increasing speed and detail, including real-time analysis and interpretation of data. One example of this is being able to rapidly and accurately detect a wide range of pathogenic organisms, both in the clinic and the field. Genomes can have radically different GC content, however, such that accurate sequence analysis can be challenging depending upon the technology used. Here, we have characterized the performance of the Oxford MinION nanopore sequencer for detection and evaluation of organisms with a range of genomic nucleotide bias. We have diagnosed the quality of base-calling across individual reads and discovered that the position within the read affects base-calling and quality scores. Finally, we have evaluated the performance of the current state-of-the-art neural network-based MinION basecaller, characterizing its behavior with respect to systemic errors as well as context- and sequence-specific errors. Overall, we present a detailed characterization of the capabilities of the MinION in terms of generating high-accuracy sequence data from genomes with a wide range of nucleotide content. This study provides a framework for designing the appropriate experiments that are likely to lead to accurate and rapid field-forward diagnostics.

  16. DEEP MOTIF DASHBOARD: VISUALIZING AND UNDERSTANDING GENOMIC SEQUENCES USING DEEP NEURAL NETWORKS.

    PubMed

    Lanchantin, Jack; Singh, Ritambhara; Wang, Beilun; Qi, Yanjun

    2017-01-01

    Deep neural network (DNN) models have recently obtained state-of-the-art prediction accuracy for the transcription factor binding site (TFBS) classification task. However, it remains unclear how these approaches identify meaningful DNA sequence signals and give insights as to why TFs bind to certain locations. In this paper, we propose a toolkit called the Deep Motif Dashboard (DeMo Dashboard) which provides a suite of visualization strategies to extract motifs, or sequence patterns, from deep neural network models for TFBS classification. We demonstrate how to visualize and understand three important DNN models: convolutional, recurrent, and convolutional-recurrent networks. Our first visualization method is finding a test sequence's saliency map, which uses first-order derivatives to describe the importance of each nucleotide in making the final prediction. Second, considering that recurrent models make predictions in a temporal manner (from one end of a TFBS sequence to the other), we introduce temporal output scores, indicating the prediction score of a model over time for a sequential input. Lastly, a class-specific visualization strategy finds the optimal input sequence for a given TFBS positive class via stochastic gradient optimization. Our experimental results indicate that a convolutional-recurrent architecture performs the best among the three architectures. The visualization techniques indicate that CNN-RNN makes predictions by modeling both motifs and dependencies among them.

  17. Genome-Wide Single-Nucleotide Polymorphisms Discovery and High-Density Genetic Map Construction in Cauliflower Using Specific-Locus Amplified Fragment Sequencing

    PubMed Central

    Zhao, Zhenqing; Gu, Honghui; Sheng, Xiaoguang; Yu, Huifang; Wang, Jiansheng; Huang, Long; Wang, Dan

    2016-01-01

    Molecular markers and genetic maps play an important role in plant genomics and breeding studies. Cauliflower is an important and distinctive vegetable; however, very few molecular resources have been reported for this species. In this study, a novel, specific-locus amplified fragment (SLAF) sequencing strategy was employed for large-scale single nucleotide polymorphism (SNP) discovery and high-density genetic map construction in a double-haploid, segregating population of cauliflower. A total of 12.47 Gb raw data containing 77.92 M pair-end reads were obtained after processing and 6815 polymorphic SLAFs between the two parents were detected. The average sequencing depths reached 52.66-fold for the female parent and 49.35-fold for the male parent. Subsequently, these polymorphic SLAFs were used to genotype the population and further filtered based on several criteria to construct a genetic linkage map of cauliflower. Finally, 1776 high-quality SLAF markers, including 2741 SNPs, constituted the linkage map with average data integrity of 95.68%. The final map spanned a total genetic length of 890.01 cM with an average marker interval of 0.50 cM, and covered 364.9 Mb of the reference genome. The markers and genetic map developed in this study could provide an important foundation not only for comparative genomics studies within Brassica oleracea species but also for quantitative trait loci identification and molecular breeding of cauliflower. PMID:27047515
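
    The average marker interval quoted above can be checked against the other figures in the abstract. The following is a minimal sketch, assuming the interval is simply total map length divided by marker count; the abstract does not state the exact formula, and a per-linkage-group calculation would differ slightly. The function and variable names are illustrative only.

```python
# Figures quoted in the abstract above.
total_length_cm = 890.01   # total genetic length of the map, in centimorgans
n_markers = 1776           # high-quality SLAF markers on the map


def average_marker_interval(total_length: float, markers: int) -> float:
    """Mean spacing, taken here (as an assumption) as length / marker count."""
    if markers <= 0:
        raise ValueError("markers must be positive")
    return total_length / markers


interval = average_marker_interval(total_length_cm, n_markers)
print(f"{interval:.2f} cM")  # rounds to the 0.50 cM reported in the abstract
```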

  18. SFM: A novel sequence-based fusion method for disease genes identification and prioritization.

    PubMed

    Yousef, Abdulaziz; Moghadam Charkari, Nasrollah

    2015-10-21

    The identification of disease genes from the human genome is of great importance to improve diagnosis and treatment of disease. Several machine learning methods have been introduced to identify disease genes. However, these methods mostly differ in the prior knowledge used to construct the feature vector for each instance (gene), the ways of selecting negative data (non-disease genes) where there is no investigational approach to find them, and the classification methods used to make the final decision. In this work, a novel Sequence-based fusion method (SFM) is proposed to identify disease genes. In this regard, unlike existing methods, instead of using noisy and incomplete prior knowledge, the amino acid sequence of the proteins, which is universal data, has been used to represent the genes (proteins) as four different feature vectors. To select more likely negative data from candidate genes, the intersection set of four negative sets which are generated using a distance approach is considered. Then, Decision Tree (C4.5) has been applied as a fusion method to combine the results of four independent state-of-the-art predictors based on the support vector machine (SVM) algorithm, and to make the final decision. The experimental results of the proposed method have been evaluated by some standard measures. The results indicate precision, recall and F-measure of 82.6%, 85.6% and 84%, respectively. These results confirm the efficiency and validity of the proposed method. Copyright © 2015 Elsevier Ltd. All rights reserved.
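
    The three reported measures are related by a simple formula, which can serve as a consistency check. The sketch below assumes the abstract's "F-measure" is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not say which variant was used, and the function name is illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, expressed as fractions.
f1 = f_measure(0.826, 0.856)
print(f"{f1:.3f}")  # about 0.841, consistent with the reported value of 84
```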

  19. A re-evaluation of the final step of vanillin biosynthesis in the orchid Vanilla planifolia.

    PubMed

    Yang, Hailian; Barros-Rios, Jaime; Kourteva, Galina; Rao, Xiaolan; Chen, Fang; Shen, Hui; Liu, Chenggang; Podstolski, Andrzej; Belanger, Faith; Havkin-Frenkel, Daphna; Dixon, Richard A

    2017-07-01

    A recent publication describes an enzyme from the vanilla orchid Vanilla planifolia with the ability to convert ferulic acid directly to vanillin. The authors propose that this represents the final step in the biosynthesis of vanillin, which is then converted to its storage form, glucovanillin, by glycosylation. The existence of such a "vanillin synthase" could enable biotechnological production of vanillin from ferulic acid using a "natural" vanilla enzyme. The proposed vanillin synthase exhibits high identity to cysteine proteases, and is identical at the protein sequence level to a protein identified in 2003 as being associated with the conversion of 4-coumaric acid to 4-hydroxybenzaldehyde. We here demonstrate that the recombinant cysteine protease-like protein, whether expressed in an in vitro transcription-translation system, E. coli, yeast, or plants, is unable to convert ferulic acid to vanillin. Rather, the protein is a component of an enzyme complex that preferentially converts 4-coumaric acid to 4-hydroxybenzaldehyde, as demonstrated by the purification of this complex and peptide sequencing. Furthermore, RNA sequencing provides evidence that this protein is expressed in many tissues of V. planifolia irrespective of whether or not they produce vanillin. On the basis of our results, V. planifolia does not appear to contain a cysteine protease-like "vanillin synthase" that can, by itself, directly convert ferulic acid to vanillin. The pathway to vanillin in V. planifolia is yet to be conclusively determined. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. Transcriptome sequencing reveals high isoform diversity in the ant Formica exsecta

    PubMed Central

    Paviala, Jenni; Morandin, Claire; Wheat, Christopher; Sundström, Liselotte; Helanterä, Heikki

    2017-01-01

    Transcriptome resources for social insects have the potential to provide new insight into polyphenism, i.e., how divergent phenotypes arise from the same genome. Here we present a transcriptome based on paired-end RNA sequencing data for the ant Formica exsecta (Formicidae, Hymenoptera). The RNA sequencing libraries were constructed from samples of several life stages of both sexes and female castes of queens and workers, in order to maximize representation of expressed genes. We first compare the performance of common assembly and scaffolding software (Trinity, Velvet-Oases, and SOAPdenovo-trans), in producing de novo assemblies. Second, we annotate the resulting expressed contigs to the currently published genomes of ants, and other insects, including the honeybee, to filter genes that have annotation evidence of being true genes. Our pipeline resulted in a final assembly of altogether 39,262 mRNA transcripts, with an average coverage of >300X, belonging to 17,496 unique genes with annotation in the related ant species. From these genes, 536 genes were unique to one caste or sex only, highlighting the importance of comprehensive sampling. Our final assembly also showed expression of several splice variants in 6,975 genes, and we show that accounting for splice variants affects the outcome of downstream analyses such as gene ontologies. Our transcriptome provides an outstanding resource for future genetic studies on F. exsecta and other ant species, and the presented transcriptome assembly can be adapted to any non-model species that has genomic resources available from a related taxon. PMID:29177112

  1. Association of Streptomyces community composition determined by PCR-denaturing gradient gel electrophoresis with indoor mold status

    PubMed Central

    Johansson, Elisabet; Reponen, Tiina; Meller, Jarek; Vesper, Stephen; Yadav, Jagjit

    2014-01-01

    Both Streptomyces species and mold species have previously been isolated from moisture-damaged building materials; however, an association between these two groups of microorganisms in indoor environments is not clear. In this study we used a culture-independent method, PCR denaturing gradient gel electrophoresis (PCR-DGGE), to investigate the composition of the Streptomyces community in house dust. Twenty-three dust samples each from two sets of homes categorized as high-mold and low-mold based on mold specific quantitative PCR-analysis were used in the study. Taxonomic identification of prominent bands was performed by cloning and sequencing. Associations between DGGE amplicon band intensities and home mold status were assessed using univariate analyses, as well as multivariate recursive partitioning (decision trees) to test the predictive value of combinations of band intensities. In the final classification tree, a combination of two bands was significantly associated with mold status of the home (p = 0.001). The sequence corresponding to one of the bands in the final decision tree matched a group of Streptomyces species that included S. coelicolor and S. sampsonii, both of which have been isolated from moisture-damaged buildings previously. The closest match for the majority of sequences corresponding to a second band consisted of a group of Streptomyces species that included S. hygroscopicus, an important producer of antibiotics and immunosuppressors. Taken together, the study showed that DGGE can be a useful tool for identifying bacterial species that may be more prevalent in mold-damaged buildings. PMID:25331035

  2. Stopping decisions: information order effects on nonfocal evaluations.

    PubMed

    Yu, Michael; Gonzalez, Cleotilde

    2013-08-01

    We investigated how the order in which information is presented affects when a person decides to stop performing a task. A stopping decision is a decision to stop performing a task on the basis of a sequence of cues. Previous order-effects models do not account for how these contexts limit available working memory for making such decisions. Participants decided how long to perform a task known as the Work Hazard Game that began by rewarding points but later cost points if work continued after an unannounced "emergency." An additive sequence of cues indicated the probability of an emergency. Study 1 involved a three-group design with cue sequences that indicated the same risk at each decision point but whose final cue presented a high, medium, or low probability. Study 2 had a 2 x 2 design with high or low final cues and an easy or a challenging task. In Study 1, participants stopped sooner when the most recent cue presented a high rather than low probability (p = .09), despite the same emergency risk. In Study 2, participants stopped sooner when the most recent cue presented a high rather than low probability for the challenging task but not for the easy task (p = .08). Stopping decisions appear sensitive to the most recent cue observed while experiencing task load. Participants responded to the same risks differently only on the basis of a change in presentation. Findings may be relevant for research and training for hazardous jobs, such as subsurface coal mining, fishing, and trucking.

  3. Breast MRI at Very Short TE (minTE): Image Analysis of minTE Sequences on Non-Fat-Saturated, Subtracted T1-Weighted Images.

    PubMed

    Wenkel, Evelyn; Janka, Rolf; Geppert, Christian; Kaemmerer, Nadine; Hartmann, Arndt; Uder, Michael; Hammon, Matthias; Brand, Michael

    2017-02-01

    Purpose: The aim was to evaluate a minimum echo time (minTE) protocol for breast magnetic resonance imaging (MRI) in patients with breast lesions compared to a standard TE (nTE) protocol. Methods: Breasts of 144 women were examined with a 1.5 Tesla MRI scanner. In addition to the standard gradient-echo sequence with nTE (4.8 ms), a variant with minimum TE (1.2 ms) was used in an interleaved fashion, which leads to a better temporal resolution and should reduce the scan time by approximately 50 %. Lesion sizes were measured and the signal-to-noise ratio (SNR) as well as the contrast-to-noise ratio (CNR) were calculated. Subjective confidence was evaluated using a 3-point scale before looking at the nTE sequences (1 = very sure that I can identify a lesion and classify it, 2 = quite sure that I can identify a lesion and classify it, 3 = definitely want to see nTE for final assessment) and the subjective image quality of all examinations was evaluated using a four-grade scale (1 = sharp, 2 = slight blur, 3 = moderate blur and 4 = severe blur/not evaluable) for lesion and skin sharpness. Lesion morphology and contrast enhancement were also evaluated. Results: With minTE sequences, no lesion was rated with "definitely want to see nTE sequences for final assessment". The longitudinal and transverse diameters did not differ significantly (p > 0.05). With minTE, lesions and skin were rated to be significantly more blurry (p < 0.01 for lesions and p < 0.05 for skin). There was no difference between the two sequences with respect to SNR, CNR, lesion morphology, contrast enhancement and detection of multifocal disease. Conclusion: Dynamic breast MRI with a minTE protocol is feasible without a major loss of information (SNR, CNR, lesion morphology, contrast enhancement and lesion sizes) and the temporal resolution can be increased by a factor of 2 using minTE sequences. Key points: · Increase of temporal resolution for a better in-flow curve. · Dynamic breast MRI with a shorter TE is possible without relevant loss of information. · Possible decrease of the overall scan time. Citation Format: Wenkel E, Janka R, Geppert C et al. Breast MRI at Very Short TE (minTE): Image Analysis of minTE Sequences on Non-Fat-Saturated, Subtracted T1-Weighted Images. Fortschr Röntgenstr 2017; 189: 137 - 145. © Georg Thieme Verlag KG Stuttgart · New York.

  4. Too Much of a Good Thing: Stronger Bilingual Inhibition Leads to Larger Lag-2 Task Repetition Costs

    ERIC Educational Resources Information Center

    Prior, Anat

    2012-01-01

    Inhibitory control and monitoring abilities of Hebrew-English bilingual and English monolingual university students were compared, in a paradigm requiring participants to switch between performing three distinct tasks. Inhibitory control was gauged by lag-2 task repetition costs, namely decreased performance on the final trial of sequences of type…

  5. MOLECULAR CLONING OF CYP1A FROM THE ESTUARINE FISH FUNDULUS HETEROCLITUS AND PHYLOGENETIC ANALYSIS OF CYP1A GENES: UPDATE WITH NEW SEQUENCES. (R827102)

    EPA Science Inventory

    The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...

  6. The Effect of Animacy on Children's Noun Order in Verb-Final Sequences.

    ERIC Educational Resources Information Center

    Lempert, Henrietta

    1988-01-01

    Examines effect of training 70 preschool children with animate agent + animate patient sentences (AAV) or animate agent + inanimate patient sentences (IAV). Children were tested with noun-noun-verb (NNV) order sentence to assess whether AAV or IAV produced better comprehension. AAV and IAV showed comparable results at age three, IAV resulted in…

  7. ISOLATION OF A CYTOCHROME P450 3A CDNA SEQUENCE (CYP3A30) FROM THE MARINE TELEOST AND PHYLOGENETIC ANALYSIS OF CYP3A GENES. (R827102)

    EPA Science Inventory

    The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...

  8. A "Vision and Change" Reform of Introductory Biology Shifts Faculty Perceptions and Use of Active Learning

    ERIC Educational Resources Information Center

    Auerbach, Anna Jo; Schussler, Elisabeth

    2017-01-01

    Increasing faculty use of active-learning (AL) pedagogies in college classrooms is a persistent challenge in biology education. A large research-intensive university implemented changes to its biology majors' two-course introductory sequence as outlined by the "Vision and Change in Undergraduate Biology Education" final report. One goal…

  9. σ54-dependent regulome in Desulfovibrio vulgaris Hildenborough

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kazakov, Alexey E.; Rajeev, Lara; Chen, Amy

    2015-11-10

    The σ54 subunit controls a unique class of promoters in bacteria. Such promoters, without exception, require enhancer binding proteins (EBPs) for transcription initiation. Desulfovibrio vulgaris Hildenborough, a model bacterium for sulfate reduction studies, has a high number of EBPs, more than most sequenced bacteria. The cellular processes regulated by many of these EBPs remain unknown.

  10. An Advantage for Perceptual Edges in Young Infants' Memory for Speech

    ERIC Educational Resources Information Center

    Hochmann, Jean-Rémy; Langus, Alan; Mehler, Jacques

    2016-01-01

    Models of language acquisition are constrained by the information that learners can extract from their input. Experiment 1 investigated whether 3-month-old infants are able to encode a repeated, unsegmented sequence of five syllables. Event-related-potentials showed that infants reacted to a change of the initial or the final syllable, but not to…

  11. Biofilm Formation by a Metabolically Versatile Bacterium

    DTIC Science & Technology

    2009-03-19

    Biofilm formation by a metabolically versatile bacterium: final report. Rhodopseudomonas palustris is a photosynthetic bacterium that has good potential as a biocatalyst for the production of hydrogen gas, a biofuel...agricultural waste. We characterized five new Rhodopseudomonas genome sequences and isolated and described R. palustris mutant strains that produce

  12. A Deduction of the Golden Spiral Equation via Powers of the Golden Ratio φ

    ERIC Educational Resources Information Center

    Zahn, Maurício

    2017-01-01

    This paper presents an interesting deduction of the Golden Spiral equation in a suitable polar coordinate system. For this purpose, the concepts of Golden Ratio and Golden Rectangle, and a significant result for the calculation of powers of the Golden Ratio φ using terms of the Fibonacci sequence are mentioned. Finally, various geometrical…
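
    The abstract does not reproduce the paper's formulas, so the following is only a minimal sketch of the kind of result it refers to. It assumes the well-known identity φ^n = F(n)·φ + F(n−1) for powers of the Golden Ratio and the commonly quoted spiral form r = a·φ^(2θ/π), which may differ from the polar form derived in the paper. All function names are hypothetical.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # the Golden Ratio


def fibonacci(n):
    """Return F(n) with F(0) = 0 and F(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def phi_power(n):
    """Compute phi**n for n >= 1 from Fibonacci terms.

    Uses the standard identity phi**n = F(n) * phi + F(n - 1);
    whether the paper uses exactly this form is an assumption.
    """
    return fibonacci(n) * PHI + fibonacci(n - 1)


def golden_spiral_radius(theta, a=1.0):
    """Radius at angle theta (radians) for r = a * phi**(2 * theta / pi).

    This is the commonly quoted Golden Spiral equation (the radius grows
    by a factor of phi every quarter turn), assumed here for illustration.
    """
    return a * PHI ** (2 * theta / math.pi)
```

    For example, `phi_power(5)` and `PHI ** 5` agree to floating-point precision, and `golden_spiral_radius(math.pi / 2)` returns φ for a = 1.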

  13. Sequencing from dried blood spots in infants with "false positive" newborn screen for MCAD deficiency.

    PubMed

    McCandless, Shawn E; Chandrasekar, Ram; Linard, Sharon; Kikano, Sandra; Rice, Lorrie

    2013-01-01

    Newborn screening (NBS) for medium chain acyl-CoA dehydrogenase deficiency (MCADD), one of the most common disorders identified, uses measurement of octanoylcarnitine (C8) from dried blood spots. In the state of Ohio, as in many places, primary care providers, with or without consultation from a metabolic specialist, may perform "confirmatory testing", with the final diagnostic decision returned to the state. Confirmatory testing may involve measurement of metabolites, enzyme analysis, mutation screening, or sequencing. We now report sequencing results for infants said to have "false positive" NBS results for MCAD deficiency, or who died before confirmatory testing could be performed. Dried blood spots (DBS) were obtained from all 18 available NBS cards identified as "false positive" by NBS for the 3 year period after screening began in Ohio in 2003 (N=20, thus 2 had no DBS available), and from all 6 infants with abnormal screens who died before confirmatory testing could be obtained. DNA extracted from DBS was screened for the common c.985A>G mutation in exon 11 of the ACADM gene, using a specific restriction digest method, followed by sequencing of the 12 exons, intron-exon junctions, and several hundred base pairs of the 5' untranslated region. The NBS cut-off value for C8 used was 0.7 μmol/L. Sequencing of ACADM in six neonates with elevated C8 on NBS who died before confirmatory testing was obtained did not identify any significant variants in the coding region of the gene, suggesting that MCADD was not a contributing factor in these deaths. The mean C8 for the 18 surviving infants labeled as "False Positives" was 0.90 (95%CI 0.77-1.15), much lower than the mean value for confirmed cases. Ten of the 18 were premature births weighing <1200 g, the rest were normal sized and full term. 
Eight infants, mostly full term with appropriate birth weight, were heterozygous for the common c.985A>G mutation; one of those also has a novel sequence change identified in exon 9 that predicts a PRO to LEU change at residue 258 of the protein. Both the phase and any possible clinical significance of the variant are unknown, but several lines of evidence suggest that it could lead to protein malfunction. That child had an NBS C8 of 2.2, more than double the mean for the False Positive group. Unfortunately, the study design did not provide clinical outcome data, but the child is not known to have presented clinically by age 7 years. These results suggest that sequencing of ACADM from dried blood spots can be one useful follow-up tool to provide accurate genetic counseling in the situation of an infant with elevated C8 on NBS who dies before confirmatory testing is obtained. Of surviving neonates, there appear to be two populations of infants with false positive NBS C8 values: 1) term AGA infants who are heterozygous for the common c.985A>G mutation, and, 2) premature infants, regardless of carrier status. The finding of two sequence variants in an infant reported to the state as not affected suggests the possibility that some infants with two mutations may be reported as normal at follow-up. State registries may wish to consider asking that metabolic specialists, who are most familiar with the variability of these rare disorders, be involved in the final diagnostic evaluation. Finally, providers may wish to consider ACADM sequencing, or other diagnostic testing, as part of the confirmatory evaluation for infants with NBS C8 concentrations that are significantly above the cut-off value, even if plasma and urine metabolites are not strikingly increased. Copyright © 2012 Elsevier Inc. All rights reserved.

  14. AfterQC: automatic filtering, trimming, error removing and quality control for fastq data.

    PubMed

    Chen, Shifu; Huang, Tanxiao; Zhou, Yanqing; Han, Yue; Xu, Mingyan; Gu, Jia

    2017-03-14

    Some applications, especially those clinical applications requiring high accuracy of sequencing data, usually have to face the troubles caused by unavoidable sequencing errors. Several tools have been proposed to profile the sequencing quality, but few of them can quantify or correct the sequencing errors. This unmet requirement motivated us to develop AfterQC, a tool with functions to profile sequencing errors and correct most of them, plus highly automated quality control and data filtering features. Different from most tools, AfterQC analyses the overlapping of paired sequences for pair-end sequencing data. Based on overlapping analysis, AfterQC can detect and cut adapters, and furthermore it gives a novel function to correct wrong bases in the overlapping regions. Another new feature is to detect and visualise sequencing bubbles, which can be commonly found on the flowcell lanes and may raise sequencing errors. Besides normal per cycle quality and base content plotting, AfterQC also provides features like polyX (a long sub-sequence of a same base X) filtering, automatic trimming and K-MER based strand bias profiling. For each single or pair of FastQ files, AfterQC filters out bad reads, detects and eliminates sequencer's bubble effects, trims reads at front and tail, detects the sequencing errors and corrects part of them, and finally outputs clean data and generates HTML reports with interactive figures. AfterQC can run in batch mode with multiprocess support, it can run with a single FastQ file, a single pair of FastQ files (for pair-end sequencing), or a folder for all included FastQ files to be processed automatically. Based on overlapping analysis, AfterQC can estimate the sequencing error rate and profile the error transform distribution. The results of our error profiling tests show that the error distribution is highly platform dependent. 
Much more than just another new quality control (QC) tool, AfterQC is able to perform quality control, data filtering, error profiling and base correction automatically. Experimental results show that AfterQC can help to eliminate the sequencing errors for pair-end sequencing data to provide much cleaner outputs, and consequently help to reduce the false-positive variants, especially for the low-frequency somatic mutations. While providing rich configurable options, AfterQC can detect and set all the options automatically and require no argument in most cases.
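
    The overlap-based correction described above can be illustrated with a small sketch. This is not AfterQC's actual code or API: the function names, the thresholds (`min_overlap`, `max_mismatch`) and the rule of trusting the base with the higher Phred quality are assumptions made for illustration. It also handles only the case where read 2 starts inside read 1.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_overlap(r1, r2_rc, min_overlap=10, max_mismatch=2):
    """Return the offset in r1 where r2_rc starts to overlap, or None.

    Tries the longest overlap first and accepts the first offset with
    at most max_mismatch differing bases (hypothetical thresholds).
    """
    for offset in range(len(r1) - min_overlap + 1):
        length = min(len(r1) - offset, len(r2_rc))
        if length < min_overlap:
            break
        mismatches = sum(r1[offset + i] != r2_rc[i] for i in range(length))
        if mismatches <= max_mismatch:
            return offset
    return None


def correct_pair(r1, q1, r2, q2, min_overlap=10, max_mismatch=2):
    """Correct mismatched bases in the overlap of a read pair.

    q1 and q2 are lists of Phred scores. At each mismatch the base with
    the higher quality overwrites the other; ties are left untouched.
    Returns (corrected_r1, corrected_r2, number_of_corrections).
    """
    r2_rc = reverse_complement(r2)
    q2_rc = q2[::-1]
    offset = find_overlap(r1, r2_rc, min_overlap, max_mismatch)
    if offset is None:
        return r1, r2, 0
    b1, b2 = list(r1), list(r2_rc)
    corrected = 0
    for i in range(min(len(r1) - offset, len(r2_rc))):
        j = offset + i
        if b1[j] == b2[i]:
            continue
        if q1[j] > q2_rc[i]:
            b2[i] = b1[j]
            corrected += 1
        elif q2_rc[i] > q1[j]:
            b1[j] = b2[i]
            corrected += 1
    return "".join(b1), reverse_complement("".join(b2)), corrected
```

    A pair whose reads share a 10-base overlap, with one low-quality miscall in read 1, would come back with that base replaced by the higher-quality call from read 2.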

  15. Integrating multi-view transmission system into MPEG-21 stereoscopic and multi-view DIA (digital item adaptation)

    NASA Astrophysics Data System (ADS)

    Lee, Seungwon; Park, Ilkwon; Kim, Manbae; Byun, Hyeran

    2006-10-01

    As digital broadcasting technologies have progressed rapidly, users' expectations for realistic and interactive broadcasting services have also increased. As one such service, 3D multi-view broadcasting has received much attention recently. In general, all the view sequences acquired at the server are transmitted to the client. Then, the user can select some or all of the views according to display capabilities. However, this kind of system requires high processing power at the server as well as the client, which poses a difficulty in practical applications. To overcome this problem, a relatively simple method is to transmit only the two view sequences requested by the client in order to deliver a stereoscopic video. In this system, effective communication between the server and the client is an important aspect. In this paper, we propose an efficient multi-view system that transmits two view sequences and their depth maps according to the user's request. The view selection process is integrated into MPEG-21 DIA (Digital Item Adaptation) so that our system is compatible with the MPEG-21 multimedia framework. DIA is generally composed of resource adaptation and descriptor adaptation. One merit is that the SVA (stereoscopic video adaptation) descriptors defined in the DIA standard are used to deliver users' preferences and device capabilities. Furthermore, multi-view descriptions related to the multi-view camera and system are newly introduced. The syntax of the descriptions and their elements is represented in XML (eXtensible Markup Language) schema. If the client requests an adapted descriptor (e.g., view numbers) from the server, then the server sends its associated view sequences. Finally, we present a method that can reduce the visual discomfort that might occur while viewing stereoscopic video. This phenomenon happens when the view changes as well as when a stereoscopic image produces excessive disparity caused by a large baseline between two cameras. To address the former, IVR (intermediate view reconstruction) is employed for smooth transition between two stereoscopic view sequences. For the latter, a disparity adjustment scheme is used. Finally, through a testbed implementation and experiments, we show the value and potential of our system.

  16. In silico mining and characterization of simple sequence repeats from gilthead sea bream (Sparus aurata) expressed sequence tags (EST-SSRs); PCR amplification, polymorphism evaluation and multiplexing and cross-species assays.

    PubMed

    Vogiatzi, Emmanouella; Lagnel, Jacques; Pakaki, Victoria; Louro, Bruno; Canario, Adelino V M; Reinhardt, Richard; Kotoulas, Georgios; Magoulas, Antonios; Tsigenopoulos, Costas S

    2011-06-01

    We screened for simple sequence repeats (SSRs) found in ESTs derived from an EST-database development project ('Marine Genomics Europe' Network of Excellence). Different motifs of di-, tri-, tetra-, penta- and hexanucleotide SSRs were evaluated for variation in length and position in the expressed sequences, relative abundance and distribution in gilthead sea bream (Sparus aurata). We found 899 ESTs that harbor 997 SSRs (4.94%). On average, one SSR was found per 2.95 kb of EST sequence and the dinucleotide SSRs are the most abundant accounting for 47.6% of the total number. EST-SSRs were used as template for primer design. 664 primer pairs could be successfully identified and a subset of 206 pairs of primers was synthesized, PCR-tested and visualized on ethidium bromide stained agarose gels. The main objective was to further assess the potential of EST-SSRs as informative markers and investigate their cross-species amplification in sixteen teleost fish species: seven sparid species and nine other species from different families. Approximately 78% of the primer pairs gave PCR products of expected size in gilthead sea bream, and as expected, the rate of successful amplification of sea bream EST-SSRs was higher in sparids, lower in other perciforms and even lower in species of the Clupeiform and Gadiform orders. We finally determined the polymorphism and the heterozygosity of 63 markers in a wild gilthead sea bream population; fifty-eight loci were found to be polymorphic with the expected heterozygosity and the number of alleles ranging from 0.089 to 0.946 and from 2 to 27, respectively. These tools and markers are expected to enhance the available genetic linkage map in gilthead sea bream, to assist comparative mapping and genome analyses for this species and further with other model fish species and finally to help advance genetic analysis for cultivated and wild populations and accelerate breeding programs. Copyright © 2011 Elsevier B.V. All rights reserved.
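
    As a rough illustration of the two computations mentioned above, the sketch below scans a sequence for perfect di- to hexanucleotide repeats and computes expected heterozygosity as 1 − Σp². It is not the pipeline used in the study: the minimum repeat counts in `MIN_REPEATS` are assumed values, the function names are hypothetical, and the heterozygosity formula is the simple form without a small-sample correction.

```python
import re

# Assumed minimum number of repeats per motif length (not from the abstract).
MIN_REPEATS = {2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq):
    """Return (start, motif, repeat_count) for perfect SSRs in seq."""
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # A motif of `size` bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # e.g. ACAC is reported as the dinucleotide AC
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return hits


def expected_heterozygosity(allele_counts):
    """Expected heterozygosity 1 - sum(p_i**2) from allele counts."""
    total = sum(allele_counts)
    return 1.0 - sum((count / total) ** 2 for count in allele_counts)
```

    For instance, a locus with two equally frequent alleles gives an expected heterozygosity of 0.5, while a monomorphic locus gives 0.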

  17. A novel method based on new adaptive LVQ neural network for predicting protein-protein interactions from protein sequences.

    PubMed

    Yousef, Abdulaziz; Moghadam Charkari, Nasrollah

    2013-11-07

    Protein-Protein interaction (PPI) is one of the most important data in understanding the cellular processes. Many interesting methods have been proposed in order to predict PPIs. However, the methods which are based on the sequence of proteins as a prior knowledge are more universal. In this paper, a sequence-based, fast, and adaptive PPI prediction method is introduced to assign two proteins to an interaction class (yes, no). First, in order to improve the presentation of the sequences, twelve physicochemical properties of amino acid have been used by different representation methods to transform the sequence of protein pairs into different feature vectors. Then, for speeding up the learning process and reducing the effect of noise PPI data, principal component analysis (PCA) is carried out as a proper feature extraction algorithm. Finally, a new and adaptive Learning Vector Quantization (LVQ) predictor is designed to deal with different models of datasets that are classified into balanced and imbalanced datasets. The accuracy of 93.88%, 90.03%, and 89.72% has been found on S. cerevisiae, H. pylori, and independent datasets, respectively. The results of various experiments indicate the efficiency and validity of the method. © 2013 Published by Elsevier Ltd.
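
    The abstract does not describe the adaptive LVQ variant in detail, so the sketch below shows only the classic LVQ1 update rule that such predictors build on: the nearest prototype moves toward a sample of the same class and away from a sample of a different class. The function names, the linearly decaying learning rate and the toy feature vectors are assumptions; the feature encoding and PCA steps are omitted.

```python
import random


def nearest(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda k: sum((p - xi) ** 2 for p, xi in zip(prototypes[k], x)),
    )


def train_lvq1(samples, labels, prototypes, proto_labels,
               lr=0.1, epochs=20, seed=0):
    """Train prototypes with the classic LVQ1 rule (not the paper's variant)."""
    rng = random.Random(seed)
    protos = [list(p) for p in prototypes]
    order = list(range(len(samples)))
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # assumed linear decay schedule
        rng.shuffle(order)
        for idx in order:
            x = samples[idx]
            k = nearest(protos, x)
            # Attract the winner if its class matches, repel it otherwise.
            sign = 1.0 if proto_labels[k] == labels[idx] else -1.0
            protos[k] = [p + sign * rate * (xi - p)
                         for p, xi in zip(protos[k], x)]
    return protos


def predict(prototypes, proto_labels, x):
    """Label of the nearest prototype."""
    return proto_labels[nearest(prototypes, x)]
```

    In a PPI setting the samples would be the reduced feature vectors of protein pairs and the two labels would stand for "interacting" and "non-interacting"; here any numeric vectors work.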

  18. Homopolymer tail-mediated ligation PCR: a streamlined and highly efficient method for DNA cloning and library construction

    PubMed Central

    Lazinski, David W.; Camilli, Andrew

    2013-01-01

    The amplification of DNA fragments, cloned between user-defined 5′ and 3′ end sequences, is a prerequisite step in the use of many current applications including massively parallel sequencing (MPS). Here we describe an improved method, called homopolymer tail-mediated ligation PCR (HTML-PCR), that requires very little starting template, minimal hands-on effort, is cost-effective, and is suited for use in high-throughput and robotic methodologies. HTML-PCR starts with the addition of homopolymer tails of controlled lengths to the 3′ termini of a double-stranded genomic template. The homopolymer tails enable the annealing-assisted ligation of a hybrid oligonucleotide to the template's recessed 5′ ends. The hybrid oligonucleotide has a user-defined sequence at its 5′ end. This primer, together with a second primer composed of a longer region complementary to the homopolymer tail and fused to a second 5′ user-defined sequence, are used in a PCR reaction to generate the final product. The user-defined sequences can be varied to enable compatibility with a wide variety of downstream applications. We demonstrate our new method by constructing MPS libraries starting from nanogram and sub-nanogram quantities of Vibrio cholerae and Streptococcus pneumoniae genomic DNA. PMID:23311318

  19. The Complete Chloroplast Genome of 17 Individuals of Pest Species Jacobaea vulgaris: SNPs, Microsatellites and Barcoding Markers for Population and Phylogenetic Studies

    PubMed Central

    Doorduin, Leonie; Gravendeel, Barbara; Lammers, Youri; Ariyurek, Yavuz; Chin-A-Woeng, Thomas; Vrieling, Klaas

    2011-01-01

    Invasive individuals from the pest species Jacobaea vulgaris show different allocation patterns in defence and growth compared with native individuals. To examine if these changes are caused by fast evolution, it is necessary to identify native source populations and compare these with invasive populations. For this purpose, we are in need of intraspecific polymorphic markers. We therefore sequenced the complete chloroplast genomes of 12 native and 5 invasive individuals of J. vulgaris with next generation sequencing and discovered single-nucleotide polymorphisms (SNPs) and microsatellites. This is the first study in which the chloroplast genome of that many individuals within a single species was sequenced. Thirty-two SNPs and 34 microsatellite regions were found. For none of the individuals, differences were found between the inverted repeats. Furthermore, being the first chloroplast genome sequenced in the Senecioneae clade, we compared it with four other members of the Asteraceae family to identify new regions for phylogenetic inference within this clade and also within the Asteraceae family. Five markers (ndhC-trnV, ndhC-atpE, rps18-rpl20, clpP and psbM-trnD) contained parsimony-informative characters higher than 2%. Finally, we compared two procedures of preparing chloroplast DNA for next generation sequencing. PMID:21444340

  20. Bacterial Pathogens and Community Composition in Advanced Sewage Treatment Systems Revealed by Metagenomics Analysis Based on High-Throughput Sequencing

    PubMed Central

    Lu, Xin; Zhang, Xu-Xiang; Wang, Zhu; Huang, Kailong; Wang, Yuan; Liang, Weigang; Tan, Yunfei; Liu, Bo; Tang, Junying

    2015-01-01

    This study used 454 pyrosequencing, Illumina high-throughput sequencing and metagenomic analysis to investigate bacterial pathogens and their potential virulence in a sewage treatment plant (STP) applying both conventional and advanced treatment processes. Pyrosequencing and Illumina sequencing consistently demonstrated that Arcobacter genus occupied over 43.42% of total abundance of potential pathogens in the STP. At species level, potential pathogens Arcobacter butzleri, Aeromonas hydrophila and Klebsiella pneumoniae dominated in raw sewage, which was also confirmed by quantitative real time PCR. Illumina sequencing also revealed prevalence of various types of pathogenicity islands and virulence proteins in the STP. Most of the potential pathogens and virulence factors were eliminated in the STP, and the removal efficiency mainly depended on oxidation ditch. Compared with sand filtration, magnetic resin seemed to have higher removals in most of the potential pathogens and virulence factors. However, presence of the residual A. butzleri in the final effluent still deserves more concerns. The findings indicate that sewage acts as an important source of environmental pathogens, but STPs can effectively control their spread in the environment. Joint use of the high-throughput sequencing technologies is considered a reliable method for deep and comprehensive overview of environmental bacterial virulence. PMID:25938416
