A Comparison of Cut Scores Using Multiple Standard Setting Methods.
ERIC Educational Resources Information Center
Impara, James C.; Plake, Barbara S.
This paper reports the results of using several alternative methods of setting cut scores. The methods used were: (1) a variation of the Angoff method (1971); (2) a variation of the borderline group method; and (3) an advanced impact method (G. Dillon, 1996). The results discussed are from studies undertaken to set the cut scores for fourth grade…
Generalizing Observational Study Results: Applying Propensity Score Methods to Complex Surveys
DuGoff, Eva H; Schuler, Megan; Stuart, Elizabeth A
2014-01-01
Objective: To provide a tutorial for using propensity score methods with complex survey data. Data Sources: Simulated data and the 2008 Medical Expenditure Panel Survey. Study Design: Using simulation, we compared the following methods for estimating the treatment effect: a naïve estimate (ignoring both survey weights and propensity scores), survey weighting, propensity score methods (nearest neighbor matching, weighting, and subclassification), and propensity score methods in combination with survey weighting. Methods are compared in terms of bias and 95 percent confidence interval coverage. In Example 2, we used these methods to estimate the effect on health care spending of having a generalist versus a specialist as a usual source of care. Principal Findings: In general, combining a propensity score method and survey weighting is necessary to achieve unbiased treatment effect estimates that are generalizable to the original survey target population. Conclusions: Propensity score methods are an essential tool for addressing confounding in observational studies. Ignoring survey weights may lead to results that are not generalizable to the survey target population. This paper clarifies the appropriate inferences for different propensity score methods and suggests guidelines for selecting an appropriate propensity score method based on a researcher’s goal. PMID:23855598
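The abstract's central recommendation, combining a propensity score method with survey weighting, can be sketched on simulated data. Everything below (the data-generating model, the stratum-based propensity estimate, and the weight values) is a hypothetical illustration, not the authors' code:

```python
import numpy as np

# Hypothetical simulation: a binary confounder x drives both treatment
# assignment and the (unequal) survey sampling weights; true effect = 2.0.
rng = np.random.default_rng(0)
n = 5000
x = rng.integers(0, 2, n)
t = rng.binomial(1, np.where(x == 1, 0.7, 0.3))
svy_w = np.where(x == 1, 0.5, 2.0)          # assumed survey design weights
y = 2.0 * t + 1.5 * x + rng.normal(0, 1, n)

# Propensity score estimated within strata of x (a stand-in for a fitted model)
ps = np.array([t[x == v].mean() for v in (0, 1)])[x]

# Inverse-probability-of-treatment weights multiplied by survey weights,
# the combination the abstract finds necessary for generalizable estimates
w = np.where(t == 1, 1 / ps, 1 / (1 - ps)) * svy_w

naive = y[t == 1].mean() - y[t == 0].mean()  # ignores weights and propensity
adjusted = (np.average(y[t == 1], weights=w[t == 1])
            - np.average(y[t == 0], weights=w[t == 0]))
```

Here the naive contrast is inflated by confounding, while the combined weighting recovers an estimate near the true effect for the survey target population.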
Mallett, Susan; Halligan, Steve; Collins, Gary S.; Altman, Doug G.
2014-01-01
Background: Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC), for the evaluation of CT colonography for the detection of polyps, either with or without computer assisted detection. Methods: In a multireader multicase study of 10 readers and 107 cases we compared sensitivity and specificity, using radiological reporting of the presence or absence of polyps, to ROC AUC calculated from confidence scores concerning the presence of polyps. Both methods were assessed against a reference standard. Here we focus on five readers, selected to illustrate issues in design and analysis. We compared diagnostic measures within readers, showing that differences in results are due to statistical methods. Results: Reader performance varied widely depending on whether sensitivity and specificity or ROC AUC was used. There were problems using confidence scores: in assigning scores to all cases; in use of zero scores when no polyps were identified; the bimodal non-normal distribution of scores; fitting ROC curves due to extrapolation beyond the study data; and the undue influence of a few false positive results. Variation due to use of different ROC methods exceeded differences between test results for ROC AUC. Conclusions: The confidence scores recorded in our study violated many assumptions of ROC AUC methods, rendering these methods inappropriate. The problems we identified will apply to other detection studies using confidence scores. We found sensitivity and specificity were a more reliable and clinically appropriate method to compare diagnostic tests. PMID:25353643
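The contrast the authors draw, fixed-threshold sensitivity/specificity versus ROC AUC computed from confidence scores, can be made concrete with a toy reader. The case labels and scores below are invented; note the zero confidence scores for cases where no polyp was reported, one of the problems the paper highlights:

```python
# Hypothetical single-reader data: truth = polyp present (1) or absent (0),
# score = reader's 0-100 confidence that a polyp is present.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
score = [90, 75, 40, 10, 0, 0, 55, 0]

# Sensitivity / specificity at a fixed reporting threshold
pred = [s >= 50 for s in score]
tp = sum(p and t == 1 for p, t in zip(pred, truth))
tn = sum((not p) and t == 0 for p, t in zip(pred, truth))
sens = tp / truth.count(1)
spec = tn / truth.count(0)

# ROC AUC via the Mann-Whitney statistic; tied scores (the zeros) count 0.5
pos = [s for s, t in zip(score, truth) if t == 1]
neg = [s for s, t in zip(score, truth) if t == 0]
wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
auc = wins / (len(pos) * len(neg))
```

The zero-score ties between a missed polyp and polyp-free cases contribute only half-credit to the AUC, showing how such scores distort the rank-based measure while leaving the threshold-based measures unaffected.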
Swiderska, Zaneta; Korzynska, Anna; Markiewicz, Tomasz; Lorent, Malgorzata; Zak, Jakub; Wesolowska, Anna; Roszkowiak, Lukasz; Slodkowska, Janina; Grala, Bartlomiej
2015-01-01
Background. This paper presents the study concerning hot-spot selection in the assessment of whole slide images of tissue sections collected from meningioma patients. The samples were immunohistochemically stained to determine the Ki-67/MIB-1 proliferation index used for prognosis and treatment planning. Objective. The observer performance was examined by comparing results of the proposed method of automatic hot-spot selection in whole slide images, results of traditional scoring under a microscope, and results of a pathologist's manual hot-spot selection. Methods. The results of scoring the Ki-67 index using optical scoring under a microscope, software for Ki-67 index quantification based on hot spots selected by two pathologists (resp., once and three times), and the same software but on hot spots selected by proposed automatic methods were compared using Kendall's tau-b statistics. Results. Results show intra- and interobserver agreement. The agreement between Ki-67 scoring with manual and automatic hot-spot selection is high, while agreement between Ki-67 index scoring results in whole slide images and traditional microscopic examination is lower. Conclusions. The agreement observed for the three scoring methods shows that automation of area selection is an effective tool in supporting physicians and in increasing the reliability of Ki-67 scoring in meningioma.
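Agreement among the scoring methods in this study was quantified with Kendall's tau-b. A minimal pure-Python version of the statistic, with the standard tie correction, looks like this; the rater scores in the test of agreement are made up:

```python
import math

def kendall_tau_b(x, y):
    """Kendall's tau-b between two raters' scores, correcting for ties.

    Assumes each sequence has some variation (all-tied input would give
    a zero denominator)."""
    conc = disc = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue          # tied in both: excluded from all counts
            elif dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif (dx > 0) == (dy > 0):
                conc += 1
            else:
                disc += 1
    return (conc - disc) / math.sqrt(
        (conc + disc + ties_x) * (conc + disc + ties_y))
```

For example, two pathologists' Ki-67 rankings of four hot spots with one tie each yield `kendall_tau_b([1, 2, 2, 3], [1, 2, 3, 3])`, a high but imperfect agreement.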
Graphical method for comparative statistical study of vaccine potency tests.
Pay, T W; Hingley, P J
1984-03-01
Producers and consumers are interested in some of the intrinsic characteristics of vaccine potency assays when comparatively evaluating suitable experimental designs. A graphical method is developed which represents the precision of test results, the sensitivity of such results to changes in dosage, and the relevance of the results in the way they reflect the protection afforded in the host species. The graphs can be constructed from Producer's scores and Consumer's scores on each of the scales of test score, antigen dose, and probability of protection against disease. A method for calculating these scores is suggested and illustrated for single and multiple component vaccines, for tests which do or do not employ a standard reference preparation, and for tests which employ quantitative or quantal scoring systems.
The performance of different propensity score methods for estimating marginal hazard ratios.
Austin, Peter C
2013-07-20
Propensity score methods are increasingly being used to reduce or minimize the effects of confounding when estimating the effects of treatments, exposures, or interventions when using observational or non-randomized data. Under the assumption of no unmeasured confounders, previous research has shown that propensity score methods allow for unbiased estimation of linear treatment effects (e.g., differences in means or proportions). However, in biomedical research, time-to-event outcomes occur frequently. There is a paucity of research into the performance of different propensity score methods for estimating the effect of treatment on time-to-event outcomes. Furthermore, propensity score methods allow for the estimation of marginal or population-average treatment effects. We conducted an extensive series of Monte Carlo simulations to examine the performance of propensity score matching (1:1 greedy nearest-neighbor matching within propensity score calipers), stratification on the propensity score, inverse probability of treatment weighting (IPTW) using the propensity score, and covariate adjustment using the propensity score to estimate marginal hazard ratios. We found that both propensity score matching and IPTW using the propensity score allow for the estimation of marginal hazard ratios with minimal bias. Of these two approaches, IPTW using the propensity score resulted in estimates with lower mean squared error when estimating the effect of treatment in the treated. Stratification on the propensity score and covariate adjustment using the propensity score result in biased estimation of both marginal and conditional hazard ratios. Applied researchers are encouraged to use propensity score matching and IPTW using the propensity score when estimating the relative effect of treatment on time-to-event outcomes. Copyright © 2012 John Wiley & Sons, Ltd.
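The first method the simulations examine, 1:1 greedy nearest-neighbor matching within propensity score calipers, can be sketched directly. The propensity values, IDs, and fixed caliper below are hypothetical (a common practical choice is a caliper of 0.2 standard deviations of the logit of the propensity score):

```python
def greedy_caliper_match(treated, controls, caliper):
    """1:1 greedy nearest-neighbor matching within a propensity caliper.

    treated / controls: dicts mapping subject id -> propensity score.
    Treated subjects are processed in increasing propensity order; each
    takes the closest remaining control, but only if it lies within the
    caliper. Controls are matched without replacement."""
    pool = dict(controls)
    matches = {}
    for t_id, t_ps in sorted(treated.items(), key=lambda kv: kv[1]):
        if not pool:
            break
        c_id = min(pool, key=lambda c: abs(pool[c] - t_ps))
        if abs(pool[c_id] - t_ps) <= caliper:
            matches[t_id] = c_id
            del pool[c_id]
    return matches

# Hypothetical propensity scores; subject "C" finds no control within caliper
treated = {"A": 0.30, "B": 0.62, "C": 0.90}
controls = {"u": 0.28, "v": 0.65, "w": 0.40, "x": 0.05}
matches = greedy_caliper_match(treated, controls, caliper=0.1)
```

Unmatched treated subjects like "C" are simply excluded, which is why matched analyses estimate the effect of treatment in the (matchable) treated.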
A Comparison of Presentation Levels to Maximize Word Recognition Scores
Guthrie, Leslie A.; Mackersie, Carol L.
2010-01-01
Background: While testing suprathreshold word recognition at multiple levels is considered best practice, studies on practice patterns do not suggest that this is common practice. Audiologists often test at a presentation level intended to maximize recognition scores, but methods for selecting this level are not well established for a wide range of hearing losses. Purpose: To determine the presentation level methods that resulted in maximum suprathreshold phoneme-recognition scores while avoiding loudness discomfort. Research Design: Performance-intensity functions were obtained for 40 participants with sensorineural hearing loss using the Computer-Assisted Speech Perception Assessment. Participants had either gradually sloping (mild, moderate, moderately severe/severe) or steeply sloping losses. Performance-intensity functions were obtained at presentation levels ranging from 10 dB above the SRT to 5 dB below the UCL (uncomfortable level). In addition, categorical loudness ratings were obtained across a range of intensities using speech stimuli. Scores obtained at UCL - 5 dB (the maximum level below loudness discomfort) were compared to four alternative presentation-level methods: two sensation-level (SL) methods (2 kHz reference, SRT reference), a fixed-level method (95 dB SPL), and the most comfortable loudness level (MCL). For the SL methods, scores used in the analysis were selected separately for the SRT and 2 kHz references based on several criteria. The general goal was to choose levels that represented asymptotic performance while avoiding loudness discomfort. The selection of SLs varied across the range of hearing losses. Results: Scores obtained using the different presentation-level methods were compared to scores obtained using UCL - 5 dB. For the mild hearing loss group, the mean phoneme scores were similar for all presentation levels. For the moderately severe/severe group, the highest mean score was obtained using UCL - 5 dB. For the moderate and steeply sloping groups, the mean scores obtained using 2 kHz SL were equivalent to UCL - 5 dB, whereas scores obtained using the SRT SL were significantly lower than those obtained using UCL - 5 dB. The mean scores corresponding to MCL and 95 dB SPL were significantly lower than scores for UCL - 5 dB for the moderate and the moderately severe/severe groups. Conclusions: For participants with mild to moderate gradually sloping losses and for those with steeply sloping losses, the UCL - 5 dB and the 2 kHz SL methods resulted in the highest scores without exceeding listeners' UCLs. For participants with moderately severe/severe losses, the UCL - 5 dB method resulted in the highest phoneme recognition scores. PMID:19594086
Preequating with Empirical Item Characteristic Curves: An Observed-Score Preequating Method
ERIC Educational Resources Information Center
Zu, Jiyun; Puhan, Gautam
2014-01-01
Preequating is in demand because it reduces score reporting time. In this article, we evaluated an observed-score preequating method: the empirical item characteristic curve (EICC) method, which makes preequating without item response theory (IRT) possible. EICC preequating results were compared with a criterion equating and with IRT true-score…
The assignment of scores procedure for ordinal categorical data.
Chen, Han-Ching; Wang, Nae-Sheng
2014-01-01
Ordinal data are the most frequently encountered type of data in the social sciences. Many statistical methods can be used to process such data. One common method is to assign scores to the data, convert them into interval data, and then perform statistical analysis. Several authors have recently developed methods for assigning scores to ordered categorical data. This paper proposes an approach that defines a scoring system for an ordinal categorical variable based on an underlying continuous latent distribution, with interpretation illustrated through three case study examples. The results show that the proposed score system performs well for skewed ordinal categorical data.
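One standard way to realize "scores based on an underlying continuous latent distribution" is to assign each ordered category the mean of a standard normal variable truncated to that category's quantile interval (so-called normal scores). The sketch below is a generic illustration of that idea, not necessarily the authors' exact system:

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def Phi_inv(p):
    """Inverse CDF by bisection (adequate for illustration)."""
    lo, hi = -10.0, 10.0
    for _ in range(80):
        mid = (lo + hi) / 2
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def latent_normal_scores(counts):
    """Score each ordered category by the mean of a standard normal
    truncated to that category's quantile interval. counts holds the
    observed frequency of each ordered category."""
    n = sum(counts)
    props = [c / n for c in counts]
    bounds = [-math.inf]
    cum = 0.0
    for p in props[:-1]:
        cum += p
        bounds.append(Phi_inv(cum))
    bounds.append(math.inf)
    dens = lambda z: 0.0 if math.isinf(z) else phi(z)
    # Truncated-normal mean on (a, b): (phi(a) - phi(b)) / (Phi(b) - Phi(a))
    return [(dens(a) - dens(b)) / p
            for a, b, p in zip(bounds, bounds[1:], props)]
```

For a skewed frequency distribution the resulting category spacings are unequal, which is exactly what distinguishes this system from naive integer scoring.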
Overview of BioCreative II gene mention recognition.
Smith, Larry; Tanabe, Lorraine K; Ando, Rie Johnson nee; Kuo, Cheng-Ju; Chung, I-Fang; Hsu, Chun-Nan; Lin, Yu-Shi; Klinger, Roman; Friedrich, Christoph M; Ganchev, Kuzman; Torii, Manabu; Liu, Hongfang; Haddow, Barry; Struble, Craig A; Povinelli, Richard J; Vlachos, Andreas; Baumgartner, William A; Hunter, Lawrence; Carpenter, Bob; Tsai, Richard Tzong-Han; Dai, Hong-Jie; Liu, Feng; Chen, Yifei; Sun, Chengjie; Katrenko, Sophia; Adriaans, Pieter; Blaschke, Christian; Torres, Rafael; Neves, Mariana; Nakov, Preslav; Divoli, Anna; Maña-López, Manuel; Mata, Jacinto; Wilbur, W John
2008-01-01
Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task participants designed systems to identify substrings in sentences corresponding to gene name mentions. A variety of different methods were used and the results varied, with a highest achieved F1 score of 0.8721. Here we present brief descriptions of all the methods used and a statistical analysis of the results. We also demonstrate that, by combining the results from all submissions, an F1 score of 0.9066 is feasible, and furthermore that the best result makes use of the lowest scoring submissions.
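The mechanics of scoring and combining submissions can be illustrated with invented mention spans. Majority voting is just one simple way to pool systems (the workshop's actual combination analysis may differ), and the spans and system outputs below are hypothetical:

```python
from collections import Counter

def f1(tp, fp, fn):
    """F1 from true-positive, false-positive, false-negative counts."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return 2 * p * r / (p + r)

def score(pred, gold):
    """Exact-span matching F1 of a predicted mention set against gold."""
    tp = len(pred & gold)
    return f1(tp, len(pred - gold), len(gold - pred))

# Hypothetical gold gene-mention character spans in one sentence,
# plus three systems' outputs
gold = {(0, 4), (10, 15), (20, 27)}
systems = [
    {(0, 4), (10, 15)},             # misses one mention
    {(0, 4), (20, 27), (30, 33)},   # one false positive
    {(10, 15), (20, 27)},
]

# Majority-vote combination: keep spans proposed by at least 2 of 3 systems
votes = Counter(span for s in systems for span in s)
combined = {span for span, v in votes.items() if v >= 2}
```

In this toy case the combined set outscores every individual system, mirroring the paper's finding that pooling submissions beats the best single system.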
Yu, Jingkai; Finley, Russell L
2009-01-01
High-throughput experimental and computational methods are generating a wealth of protein-protein interaction data for a variety of organisms. However, data produced by current state-of-the-art methods include many false positives, which can hinder the analyses needed to derive biological insights. One way to address this problem is to assign confidence scores that reflect the reliability and biological significance of each interaction. Most previously described scoring methods use a set of likely true positives to train a model to score all interactions in a dataset. A single positive training set, however, may be biased and not representative of true interaction space. We demonstrate a method to score protein interactions by utilizing multiple independent sets of training positives to reduce the potential bias inherent in using a single training set. We used a set of benchmark yeast protein interactions to show that our approach outperforms other scoring methods. Our approach can also score interactions across data types, which makes it more widely applicable than many previously proposed methods. We applied the method to protein interaction data from both Drosophila melanogaster and Homo sapiens. Independent evaluations show that the resulting confidence scores accurately reflect the biological significance of the interactions.
Stevenson, Jennifer L; Hart, Kari R
2017-06-01
The current study systematically investigated the effects of scoring and categorization methods on the psychometric properties of the Autism-Spectrum Quotient. Four hundred and three college students completed the Autism-Spectrum Quotient at least once. Total scores on the Autism-Spectrum Quotient had acceptable internal consistency and test-retest reliability using a binary or Likert scoring method, but the results were more varied for the subscales. Overall, Likert scoring yielded higher internal consistency and test-retest reliability than binary scoring. However, agreement in categorization of low and high autistic traits was poor over time (except for a median split on Likert scores). The results support using Likert scoring and administering the Autism-Spectrum Quotient at the same time as the task of interest with neurotypical participants.
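Internal consistency under the two scoring schemes can be compared with Cronbach's alpha. The tiny response matrix below is invented; it only illustrates how collapsing 4-point Likert items to binary (as in standard Autism-Spectrum Quotient scoring) typically loses reliability:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()      # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)       # variance of total scores
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical 4-point Likert responses (5 respondents x 3 items)
likert = np.array([[1, 1, 2],
                   [2, 2, 2],
                   [3, 2, 3],
                   [3, 4, 4],
                   [4, 4, 4]])
binary = (likert >= 3).astype(float)   # 1/0 collapsing of the 4-point items

alpha_likert = cronbach_alpha(likert)
alpha_binary = cronbach_alpha(binary)
```

Even in this small example the dichotomized items yield a lower alpha, consistent with the study's finding that Likert scoring gives higher internal consistency.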
Shorey, Ryan C.; Brasfield, Hope; Febres, Jeniimarie; Cornelius, Tara L.; Stuart, Gregory L.
2012-01-01
Psychological aggression in females’ dating relationships has received increased empirical attention in recent years. However, researchers have used numerous measures of psychological aggression, and various scoring methods with these measures, making it difficult to compare across studies on psychological aggression. In addition, research has yet to examine whether different scoring methods for psychological aggression measures may affect the psychometric properties of these instruments. The current study examined three self-report measures of psychological aggression within a sample of female college students (N = 108), including their psychometric properties when scored using frequency, sum, and variety scores. Results showed that the Revised Conflict Tactics Scales (CTS2) had variable internal consistency depending on the scoring method used and good validity; the Multidimensional Measure of Emotional Abuse (MMEA) and the Follingstad Psychological Aggression Scale (FPAS) both had good internal consistency and validity across scoring methods. Implications of these findings for the assessment of psychological aggression and future research are discussed. PMID:23393957
Moore, Tyler M.; Reise, Steven P.; Roalf, David R.; Satterthwaite, Theodore D.; Davatzikos, Christos; Bilker, Warren B.; Port, Allison M.; Jackson, Chad T.; Ruparel, Kosha; Savitt, Adam P.; Baron, Robert B.; Gur, Raquel E.; Gur, Ruben C.
2016-01-01
Traditional “paper-and-pencil” testing is imprecise in measuring speed and hence limited in assessing performance efficiency, but computerized testing permits precision in measuring itemwise response time. We present a method of scoring performance efficiency (combining information from accuracy and speed) at the item level. Using a community sample of 9,498 youths age 8-21, we calculated item-level efficiency scores on four neurocognitive tests, and compared the concurrent, convergent, discriminant, and predictive validity of these scores to simple averaging of standardized speed and accuracy-summed scores. Concurrent validity was measured by the scores' abilities to distinguish men from women and their correlations with age; convergent and discriminant validity were measured by correlations with other scores inside and outside of their neurocognitive domains; predictive validity was measured by correlations with brain volume in regions associated with the specific neurocognitive abilities. Results provide support for the ability of itemwise efficiency scoring to detect signals as strong as those detected by standard efficiency scoring methods. We find no evidence of superior validity of the itemwise scores over traditional scores, but point out several advantages of the former. The itemwise efficiency scoring method shows promise as an alternative to standard efficiency scoring methods, with overall moderate support from tests of four different types of validity. This method allows the use of existing item analysis methods and provides the convenient ability to adjust the overall emphasis of accuracy versus speed in the efficiency score, thus adjusting the scoring to the real-world demands the test is aiming to fulfill. PMID:26866796
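An itemwise efficiency score that combines accuracy and speed with an adjustable emphasis, as the abstract describes, can be sketched as follows. The z-scoring scheme, the log transform of response time, and the `speed_weight` parameter are illustrative assumptions, not the paper's exact formula:

```python
import numpy as np

def itemwise_efficiency(correct, rt, speed_weight=0.5):
    """Per-item efficiency: z-scored accuracy combined with z-scored
    (negated) log response time; higher = more efficient. speed_weight
    tunes the accuracy-versus-speed emphasis, mirroring the adjustable
    emphasis the paper describes. Assumes every item shows some
    variation in both accuracy and RT across respondents."""
    correct = np.asarray(correct, dtype=float)
    z_acc = (correct - correct.mean(axis=0)) / correct.std(axis=0)
    log_rt = np.log(np.asarray(rt, dtype=float))
    z_speed = -(log_rt - log_rt.mean(axis=0)) / log_rt.std(axis=0)
    return (1 - speed_weight) * z_acc + speed_weight * z_speed

# Hypothetical data: 4 respondents x 2 items
correct = [[1, 1], [0, 1], [1, 0], [0, 0]]           # item correctness
rt = [[1.0, 2.0], [2.0, 1.0], [1.0, 1.0], [2.0, 2.0]]  # response times (s)
eff = itemwise_efficiency(correct, rt)
```

Raising `speed_weight` toward 1 shifts the score toward pure speed, matching the paper's point that the emphasis can be tuned to the real-world demands of the test.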
Clinical outcomes of arthroscopic single and double row repair in full thickness rotator cuff tears
Ji, Jong-Hun; Shafi, Mohamed; Kim, Weon-Yoo; Kim, Young-Yul
2010-01-01
Background: There has been recent interest in the double row repair method for arthroscopic rotator cuff repair, following favourable biomechanical results reported by some studies. The purpose of this study was to compare the clinical results of arthroscopic single row and double row repair methods in full-thickness rotator cuff tears. Materials and Methods: 22 patients who underwent arthroscopic single row repair (group I) and 25 patients who underwent double row repair (group II) from March 2003 to March 2005 were retrospectively evaluated and compared for clinical outcomes. The mean age was 58 years and 56 years for groups I and II, respectively. The average follow-up in the two groups was 24 months. Evaluation was done using the University of California Los Angeles (UCLA) rating scale and the shoulder index of the American Shoulder and Elbow Surgeons (ASES). Results: The mean ASES score increased from 30.48 to 87.40 in group I and from 32.00 to 91.45 in group II. The mean UCLA score increased from a preoperative 12.23 to 30.82 in group I and from 12.20 to 32.40 in group II. No statistically significant clinical differences were found between the two methods, but based on the UCLA subscores, the double row repair method yielded better strength and greater patient satisfaction than the single row repair method. Conclusions: Comparing the two methods, the double row repair group showed better clinical results in strength recovery and greater patient satisfaction, but no statistically significant clinical difference was found between the two methods. PMID:20697485
Assessing Equating Results on Different Equating Criteria
ERIC Educational Resources Information Center
Tong, Ye; Kolen, Michael
2005-01-01
The performance of three equating methods--the presmoothed equipercentile method, the item response theory (IRT) true score method, and the IRT observed score method--were examined based on three equating criteria: the same distributions property, the first-order equity property, and the second-order equity property. The magnitude of the…
Hansen, Karen E; Blank, Robert D; Palermo, Lisa; Fink, Howard A; Orwoll, Eric S
2014-01-01
Summary: In this study, the area under the curve was highest when using the lowest vertebral body T-score to diagnose osteoporosis. In men for whom hip imaging is not possible, the lowest vertebral body T-score improves the ability to diagnose osteoporosis in men who are likely to have an incident fragility fracture. Purpose: Spine T-scores have limited ability to predict fragility fracture. We hypothesized that using the lowest vertebral body T-score to diagnose osteoporosis would better predict fracture. Methods: Among men enrolled in the Osteoporotic Fractures in Men Study, we identified cases with incident clinical fracture (n=484) and controls without fracture (n=1,516). We analyzed the lumbar spine BMD in cases and controls (n=2,000) to record the L1-L4 (referent), the lowest vertebral body, and ISCD-determined T-scores using a male normative database, and the L1-L4 T-score using a female normative database. We compared the ability of each method to diagnose osteoporosis and therefore predict incident clinical fragility fracture, using the area under the receiver operating characteristic curve (AUC) and the net reclassification index (NRI) as measures of diagnostic accuracy. ISCD-determined T-scores could be determined in only 60% of participants (n=1,205). Results: Among 1,205 men, the AUC for predicting incident clinical fracture was 0.546 for the L1-L4 male, 0.542 for the L1-L4 female, 0.585 for the lowest vertebral body, and 0.559 for the ISCD-determined T-score. The lowest vertebral body AUC was the only one significantly different from the referent method (p=0.002). Likewise, a diagnosis of osteoporosis based on the lowest vertebral body T-score demonstrated a significantly better NRI than the referent method (net NRI +0.077, p=0.005). By contrast, the net NRI for the other methods of analysis did not differ from the referent method. Conclusion: Our study suggests that in men, the lowest vertebral body T-score is an acceptable method by which to estimate fracture risk. PMID:24850381
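The difference between diagnosing from the L1-L4 mean and from the lowest vertebral body is easy to see on toy data. The per-vertebra T-scores below are hypothetical; the point is that the minimum flags men whose mean is pulled up by healthier vertebrae:

```python
# Hypothetical per-vertebra T-scores (L1..L4) for three men
tscores = {
    "p1": [-1.0, -1.2, -2.7, -1.1],   # one osteoporotic vertebra
    "p2": [-2.0, -2.1, -1.9, -2.2],   # uniformly osteopenic
    "p3": [-2.6, -2.8, -3.0, -2.9],   # uniformly osteoporotic
}
mean_l1_l4 = {p: sum(v) / len(v) for p, v in tscores.items()}
lowest = {p: min(v) for p, v in tscores.items()}

# WHO-style diagnostic cutoff of T <= -2.5 applied to each summary
osteo_mean = {p for p, t in mean_l1_l4.items() if t <= -2.5}
osteo_lowest = {p for p, t in lowest.items() if t <= -2.5}
```

Participant "p1" is classified as osteoporotic only by the lowest-vertebra rule, the kind of reclassification that drives the improved NRI reported above.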
Semi-automated scoring of triple-probe FISH in human sperm using confocal microscopy.
Branch, Francesca; Nguyen, GiaLinh; Porter, Nicholas; Young, Heather A; Martenies, Sheena E; McCray, Nathan; Deloid, Glen; Popratiloff, Anastas; Perry, Melissa J
2017-09-01
Structural and numerical sperm chromosomal aberrations result from abnormal meiosis and are directly linked to infertility. Any live births that arise from aneuploid conceptuses can result in syndromes such as Klinefelter, Turner, XYY, and Edwards. Multi-probe fluorescence in situ hybridization (FISH) is commonly used to study sperm aneuploidy; however, manual FISH scoring in sperm samples is labor-intensive and introduces errors. Automated scoring methods are continuously evolving. One challenging aspect of optimizing automated sperm FISH scoring has been the overlap in excitation and emission of the fluorescent probes used to enumerate the chromosomes of interest. Our objective was to demonstrate the feasibility of combining confocal microscopy and spectral imaging with high-throughput methods for accurately measuring sperm aneuploidy. Our approach used confocal microscopy to analyze numerical chromosomal abnormalities in human sperm using enhanced slide preparation and rigorous semi-automated scoring methods. FISH for chromosomes X, Y, and 18 was conducted to determine sex chromosome disomy in sperm nuclei. Online spectral linear unmixing was applied for effective separation of four fluorochromes while decreasing data acquisition time. Semi-automated image processing, segmentation, classification, and scoring were performed on 10 slides using custom image processing and analysis software, and results were compared with manual methods. No significant differences in disomy frequencies were seen between the semi-automated and manual methods. Samples treated with pepsin were observed to have reduced background autofluorescence and a more uniform distribution of cells. These results demonstrate that semi-automated methods using spectral imaging on a confocal platform are a feasible approach for analyzing numerical chromosomal aberrations in sperm, and are comparable to manual methods. © 2017 International Society for Advancement of Cytometry.
Quantitative prediction of drug side effects based on drug-related features.
Niu, Yanqing; Zhang, Wen
2017-09-01
Unexpected side effects of drugs are a great concern in drug development, and the identification of side effects is an important task. Recently, machine learning methods have been proposed to predict the presence or absence of side effects of interest for drugs, but it is difficult to make accurate predictions for all of them. In this paper, we transform the side effect profiles of drugs into quantitative scores by summing up their side effects with weights. The quantitative scores may measure the dangers of drugs and thus help to compare the risk of different drugs. Here, we attempt to predict the quantitative scores of drugs, namely quantitative prediction. Specifically, we explore a variety of drug-related features and evaluate their discriminative power for quantitative prediction. Then, we consider several feature combination strategies (direct combination, average scoring ensemble combination) to integrate three informative features: chemical substructures, targets, and treatment indications. Finally, the average scoring ensemble model, which produces the better performance, is used as the final quantitative prediction model. Since the weights for side effects are empirical values, we randomly generate different weights in the simulation experiments. The experimental results show that the quantitative method is robust to different weights and produces satisfying results. Although other state-of-the-art methods cannot make quantitative predictions directly, their prediction results can be transformed into quantitative scores. By indirect comparison, the proposed method produces much better results than benchmark methods in quantitative prediction. In conclusion, the proposed method is promising for the quantitative prediction of side effects, and may work cooperatively with existing state-of-the-art methods to reveal the dangers of drugs.
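As a rough illustration of the weighted-sum scoring the abstract describes, a drug's quantitative score can be computed from its binary side-effect profile. The side-effect names and weights below are invented for illustration, not taken from the paper:

```python
# Hypothetical sketch: a drug's quantitative score is the weighted sum of
# its side-effect profile (1 = side effect present, 0 = absent).
# Names and weights are illustrative, not from the paper.

def quantitative_score(profile, weights):
    """Sum the weights of the side effects a drug exhibits."""
    return sum(weights[e] for e, present in profile.items() if present)

weights = {"nausea": 1.0, "hepatotoxicity": 5.0, "headache": 0.5}
drug_a = {"nausea": 1, "hepatotoxicity": 0, "headache": 1}
drug_b = {"nausea": 0, "hepatotoxicity": 1, "headache": 0}

print(quantitative_score(drug_a, weights))  # 1.5
print(quantitative_score(drug_b, weights))  # 5.0 -> riskier profile
```

Because the paper treats the weights as empirical values, the scores only support relative comparisons of drug risk under a given weighting.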
Jamali, Jamshid; Ayatollahi, Seyyed Mohammad Taghi
2015-01-01
Background: Nurses constitute the majority of health care providers. Their mental health can affect the quality of services and patients’ satisfaction. The General Health Questionnaire (GHQ-12) is a general screening tool used to detect mental disorders. The scoring method and thresholds for this questionnaire are debatable, and the cut-off points can vary from sample to sample. This study was conducted to estimate the prevalence of mental disorders among Iranian nurses using the GHQ-12 and to compare Latent Class Analysis (LCA) and K-means clustering with the traditional scoring method. Methodology: A cross-sectional study was carried out in the Fars and Bushehr provinces of southern Iran in 2014. Participants were 771 Iranian nurses, who filled out the GHQ-12 questionnaire. The traditional scoring method, LCA, and K-means were used to estimate the prevalence of mental disorder among Iranian nurses. Cohen’s kappa statistic was applied to assess the agreement of LCA and K-means with the traditional scoring method of the GHQ-12. Results: The proportions of nurses with mental disorder according to the scoring method, LCA, and K-means were 36.3% (n=280), 32.2% (n=248), and 26.5% (n=204), respectively. LCA and logistic regression revealed that the prevalence of mental disorder in females was significantly higher than in males. Conclusion: Mental disorder in nurses was at a medium level compared to other people living in Iran. There was little difference among the prevalences of mental disorder estimated by the scoring method, K-means, and LCA. Given the advantages of LCA over K-means and the differing results of the scoring method, we suggest LCA for classifying Iranian nurses according to their mental health outcomes using the GHQ-12 questionnaire. PMID:26622202
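The agreement statistic used above, Cohen's kappa, corrects observed agreement for the agreement expected by chance. A minimal sketch with made-up classifications (1 = mental disorder), standing in for the traditional scoring versus LCA assignments:

```python
# Minimal Cohen's kappa between two binary classifications.
# kappa = (p_observed - p_chance) / (1 - p_chance)

def cohens_kappa(a, b):
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n                     # observed agreement
    labels = set(a) | set(b)
    p_e = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)  # chance agreement
    return (p_o - p_e) / (1 - p_e)

scoring = [1, 1, 0, 0, 1, 0, 0, 0]   # traditional GHQ-12 scoring (made up)
lca     = [1, 1, 0, 0, 0, 0, 0, 1]   # LCA class assignment (made up)
print(round(cohens_kappa(scoring, lca), 3))  # 0.467
```

A kappa near 0 indicates chance-level agreement; values approaching 1 indicate strong agreement between the two classification methods.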
Stürmer, Til; Joshi, Manisha; Glynn, Robert J.; Avorn, Jerry; Rothman, Kenneth J.; Schneeweiss, Sebastian
2006-01-01
Objective Propensity score analyses attempt to control for confounding in non-experimental studies by adjusting for the likelihood that a given patient is exposed. Such analyses have been proposed to address confounding by indication, but there is little empirical evidence that they achieve better control than conventional multivariate outcome modeling. Study design and methods Using PubMed and Science Citation Index, we assessed the use of propensity scores over time and critically evaluated studies published through 2003. Results Use of propensity scores increased from a total of 8 papers before 1998 to 71 in 2003. Most of the 177 published studies abstracted assessed medications (N=60) or surgical interventions (N=51), mainly in cardiology and cardiac surgery (N=90). Whether PS methods or conventional outcome models were used to control for confounding had little effect on results in those studies in which such comparison was possible. Only 9 out of 69 studies (13%) had an effect estimate that differed by more than 20% from that obtained with a conventional outcome model in all PS analyses presented. Conclusions Publication of results based on propensity score methods has increased dramatically, but there is little evidence that these methods yield substantially different estimates compared with conventional multivariable methods. PMID:16632131
Lu, R; Xiao, Y
2017-07-18
Objective: To evaluate the clinical value of ultrasonic elastography and an ultrasonography comprehensive scoring method in the diagnosis of cervical lesions. Methods: A total of 116 patients were selected from the Department of Gynecology of the first hospital affiliated with Central South University from March 2014 to September 2015. All of the lesions were preoperatively examined by Doppler ultrasound and elastography. The elasticity score was determined by a 5-point scoring method. Calculation of the strain ratio was based on a comparison of the average strain measured in the lesion with that of adjacent tissue of the same depth, size, and shape. All of these ultrasonic parameters were quantified and summed to arrive at ultrasonography comprehensive scores. Using surgical pathology as the gold standard, the sensitivity, specificity, and accuracy of Doppler ultrasound, the elasticity score and strain ratio methods, and the ultrasonography comprehensive scoring method were comparatively analyzed. Results: (1) The sensitivity, specificity, and accuracy of Doppler ultrasound in diagnosing cervical lesions were 82.89% (63/76), 85.0% (34/40), and 83.62% (97/116), respectively. (2) The sensitivity, specificity, and accuracy of the elasticity score method were 77.63% (59/76), 82.5% (33/40), and 79.31% (92/116), respectively; the sensitivity, specificity, and accuracy of the strain ratio method were 84.21% (64/76), 87.5% (35/40), and 85.34% (99/116), respectively. (3) The sensitivity, specificity, and accuracy of the ultrasonography comprehensive scoring method were 90.79% (69/76), 92.5% (37/40), and 91.38% (106/116), respectively. Conclusion: (1) Ultrasonic elastography had clear diagnostic value in cervical lesions, and strain ratio measurement can be more objective than the elasticity score method. (2) The combined application of the ultrasonography comprehensive scoring method, ultrasonic elastography, and conventional sonography was more accurate than any single parameter.
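The diagnostic indices reported above follow directly from their counts (e.g., 63 true positives out of 76 lesions). A short check, using the Doppler ultrasound figures from the abstract:

```python
# Sensitivity, specificity, and accuracy from a 2x2 confusion table.

def diagnostics(tp, fn, tn, fp):
    sens = tp / (tp + fn)             # true positives / all diseased
    spec = tn / (tn + fp)             # true negatives / all healthy
    acc = (tp + tn) / (tp + fn + tn + fp)
    return sens, spec, acc

# Doppler ultrasound figures from the abstract: 63/76, 34/40, 97/116
sens, spec, acc = diagnostics(tp=63, fn=13, tn=34, fp=6)
print(f"{sens:.2%} {spec:.1%} {acc:.2%}")  # 82.89% 85.0% 83.62%
```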
Burden, Anne; Roche, Nicolas; Miglio, Cristiana; Hillyer, Elizabeth V; Postma, Dirkje S; Herings, Ron MC; Overbeek, Jetty A; Khalid, Javaria Mona; van Eickels, Daniela; Price, David B
2017-01-01
Background Cohort matching and regression modeling are used in observational studies to control for confounding factors when estimating treatment effects. Our objective was to evaluate exact matching and propensity score methods by applying them in a 1-year pre–post historical database study to investigate asthma-related outcomes by treatment. Methods We drew on longitudinal medical record data in the PHARMO database for asthma patients prescribed the treatments to be compared (ciclesonide and fine-particle inhaled corticosteroid [ICS]). Propensity score methods that we evaluated were propensity score matching (PSM) using two different algorithms, the inverse probability of treatment weighting (IPTW), covariate adjustment using the propensity score, and propensity score stratification. We defined balance, using standardized differences, as differences of <10% between cohorts. Results Of 4064 eligible patients, 1382 (34%) were prescribed ciclesonide and 2682 (66%) fine-particle ICS. The IPTW and propensity score-based methods retained more patients (96%–100%) than exact matching (90%); exact matching selected less severe patients. Standardized differences were >10% for four variables in the exact-matched dataset and <10% for both PSM algorithms and the weighted pseudo-dataset used in the IPTW method. With all methods, ciclesonide was associated with better 1-year asthma-related outcomes, at one-third the prescribed dose, than fine-particle ICS; results varied slightly by method, but direction and statistical significance remained the same. Conclusion We found that each method has its particular strengths, and we recommend at least two methods be applied for each matched cohort study to evaluate the robustness of the findings. Balance diagnostics should be applied with all methods to check the balance of confounders between treatment cohorts. 
If exact matching is used, the calculation of a propensity score could be useful to identify variables that require balancing, thereby informing the choice of matching criteria together with clinical considerations. PMID:28356782
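Two quantities central to this study, the standardized difference used as a balance diagnostic (target: <10%) and the inverse probability of treatment weights (IPTW), can be sketched as follows. The covariate values and propensity score below are illustrative, not from the PHARMO data:

```python
# Sketch: standardized difference of a covariate between two cohorts, and
# IPTW weights from fitted propensity scores. All numbers are made up.
import math

def standardized_difference(x_treated, x_control):
    m1 = sum(x_treated) / len(x_treated)
    m0 = sum(x_control) / len(x_control)
    v1 = sum((x - m1) ** 2 for x in x_treated) / (len(x_treated) - 1)
    v0 = sum((x - m0) ** 2 for x in x_control) / (len(x_control) - 1)
    return (m1 - m0) / math.sqrt((v1 + v0) / 2)   # pooled-SD scale

def iptw_weight(treated, ps):
    """ATE weights: 1/ps for treated subjects, 1/(1-ps) for controls."""
    return 1.0 / ps if treated else 1.0 / (1.0 - ps)

age_ciclesonide = [42.0, 55.0, 61.0, 38.0]   # illustrative ages
age_fp_ics      = [47.0, 59.0, 63.0, 51.0]
d = standardized_difference(age_ciclesonide, age_fp_ics)
print(f"standardized difference: {d:.1%}")   # balance target: |d| < 10%
print(iptw_weight(True, 0.34), iptw_weight(False, 0.34))
```

In the weighted pseudo-dataset, standardized differences are recomputed on the weighted samples; confounders with |d| above 10% remain imbalanced and need further adjustment.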
A new method of scoring radiographic change in rheumatoid arthritis.
Rau, R; Wassenberg, S; Herborn, G; Stucki, G; Gebler, A
1998-11-01
To test the reliability and define the minimal detectable change of a new radiographic scoring method in rheumatoid arthritis (RA). Following the recommendations of an expert panel, a new radiographic scoring method was defined. It scores 38 joints [all proximal interphalangeal (PIP) and metacarpophalangeal joints, 4 sites in the wrists, the IP of the great toes, and metatarsophalangeals 2 to 5], regarding only the amount of joint surface destruction, on a 0 to 5 scale for each joint; each grade represents 20% of joint surface destruction. The method was tested by 5 readers on a set of 7 serial radiographs of the hands and forefeet of 20 patients with progressive and destructive RA. Analysis of variance was performed, as it provides the best information about the capability of a method to detect real change and defines its sensitivity in terms of the minimal detectable change. Analysis of variance showed a high probability that the readers detected real change, with a ratio of intrapatient to intrareader standard deviation of 2.6. It also confirmed that one reader could detect a change of 3.5% of the total score with a probability of 95%, and that different readers agreed upon a change of 4.6%. Inexperienced readers performed comparably to experienced readers. The time required for reading averaged less than 10 minutes per set. The new radiographic scoring method proved to be reliable, precise, and easy to learn, with reasonable cost. Compared with published data, it may provide better results than the widely used Larsen score. These features favor our new method for use in clinical trials and in long-term observational studies in RA.
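The arithmetic of the score is straightforward: with 38 joints graded 0 to 5, the maximum total is 190, and changes can be expressed as a percentage of that maximum (the 3.5% minimal detectable change above corresponds to roughly 7 raw points). The patient grades below are invented:

```python
# Sketch of the 38-joint score: each joint is graded 0-5 (each grade
# representing ~20% of joint surface destroyed), so max total = 38 * 5.
JOINTS = 38
GRADE_MAX = 5
TOTAL_MAX = JOINTS * GRADE_MAX  # 190

def percent_of_total(raw_score):
    return 100.0 * raw_score / TOTAL_MAX

grades = [0] * 30 + [1, 1, 2, 2, 3, 3, 4, 5]   # illustrative patient, 38 joints
raw = sum(grades)
print(raw, f"{percent_of_total(raw):.1f}%")    # 21 raw points, ~11% of maximum
```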
Ruvinsky, Anatoly M
2007-06-01
We present results of testing the ability of eleven popular scoring functions to predict native docked positions using a recently developed method (Ruvinsky and Kozintsev, J Comput Chem 2005, 26, 1089) for estimating the entropy contributions of relative motions to protein-ligand binding affinity. The method is based on the integration of the configurational integral over clusters obtained from multiple docked positions. We use a test set of 100 PDB protein-ligand complexes and ensembles of 101 docked positions generated by Wang et al. (J Med Chem 2003, 46, 2287) for each ligand in the test set. To test the suggested method, we compared the averaged root-mean-square deviations (RMSD) of the top-scored ligand docked positions, with and without entropy contributions, relative to the experimentally determined positions. We demonstrate that the method increases docking accuracy by 10-21% when used in conjunction with the AutoDock scoring function, by 2-25% with G-Score, by 7-41% with D-Score, by 0-8% with LigScore, by 1-6% with PLP, by 0-12% with LUDI, by 2-8% with F-Score, by 7-29% with ChemScore, by 0-9% with X-Score, by 2-19% with PMF, and by 1-7% with DrugScore. We also compared the performance of the suggested method with a method based on ranking by cluster occupancy only. We analyze how the choice of the clustering RMSD and the lower bound on dense clusters affects the docking accuracy of the scoring methods, and derive optimal intervals of the clustering RMSD for the 11 scoring functions.
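The accuracy measure used throughout, RMSD between a docked pose and the experimental pose, is the root of the mean squared per-atom displacement. A minimal sketch with two-atom toy coordinates:

```python
# Heavy-atom RMSD between a docked pose and the experimental pose,
# the accuracy measure used to compare scoring functions.
# Coordinates below are illustrative.
import math

def rmsd(pose_a, pose_b):
    assert len(pose_a) == len(pose_b)
    sq = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
             for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b))
    return math.sqrt(sq / len(pose_a))

docked  = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]   # angstroms
crystal = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0)]
print(rmsd(docked, crystal))  # 1.0
```

The same distance, computed between pairs of docked poses, also drives the clustering step whose RMSD threshold the paper optimizes.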
Malau-Aduli, Bunmi Sherifat; Teague, Peta-Ann; D'Souza, Karen; Heal, Clare; Turner, Richard; Garne, David L; van der Vleuten, Cees
2017-12-01
A key issue underpinning the usefulness of the OSCE assessment to medical education is standard setting, but the majority of standard-setting methods remain challenging for performance assessment because they produce varying passing marks. Several studies have compared standard-setting methods; however, most of these studies are limited by their experimental scope, or use data on examinee performance at a single OSCE station or from a single medical school. This collaborative study between 10 Australian medical schools investigated the effect of standard-setting methods on OSCE cut scores and failure rates. This research used 5256 examinee scores from seven shared OSCE stations to calculate cut scores and failure rates using two different compromise standard-setting methods, namely the Borderline Regression and Cohen's methods. The results of this study indicate that Cohen's method yields similar outcomes to the Borderline Regression method, particularly for large examinee cohort sizes. However, with lower examinee numbers on a station, the Borderline Regression method resulted in higher cut scores and larger difference margins in the failure rates. Cohen's method yields similar outcomes as the Borderline Regression method and its application for benchmarking purposes and in resource-limited settings is justifiable, particularly with large examinee numbers.
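The Borderline Regression method referenced above regresses examinees' station checklist scores on the examiners' global ratings and takes the predicted score at the "borderline" grade as the cut score. A minimal sketch with invented data (here grade 2 is assumed to be "borderline" on a 1-5 global scale):

```python
# Borderline Regression sketch: fit checklist score ~ global grade by
# ordinary least squares, then predict the score at the borderline grade.
# Grades, scores, and the grade scale are illustrative.

def linear_fit(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

grades = [1, 2, 2, 3, 3, 4, 4, 5]           # examiner global ratings
scores = [40, 48, 52, 60, 62, 70, 74, 82]   # station checklist scores
slope, intercept = linear_fit(grades, scores)

BORDERLINE = 2
cut = slope * BORDERLINE + intercept        # predicted score at borderline
print(round(cut, 1))  # 50.3
```

Cohen's method, by contrast, anchors the cut score to a fixed fraction of a high-performing examinee's score (commonly a percentage of the 95th-percentile score), which is why it needs no borderline judgments and remains usable in resource-limited settings.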
Zhao, Gai; Bian, Yang; Li, Ming
2013-12-18
To analyze the impact of passed items above the ceiling level in the gross motor subtest of the Peabody Developmental Motor Scales (PDMS-2) on its assessment results. The subtests of the PDMS-2 were administered to 124 children aged 1.2 to 71 months. In addition to the original scoring method, a new scoring method that includes passed items above the ceiling was developed. The standard scores and quotients of the two scoring methods were compared using the independent-samples t test. Only one child passed items above the ceiling in the stationary subtest, compared with 19 children in the locomotion subtest and 17 children in the visual-motor integration subtest. When the scores of these passed items were included in the raw scores, the total raw scores gained 1-12 points, the standard scores gained 0-1 points, and the motor quotients gained 0-3 points. The diagnostic classification changed in only two children. There was no significant difference between the two methods in motor quotients or standard scores for any specific subtest (P>0.05). Passing items above the ceiling of the PDMS-2 is not a rare situation; it usually takes place in the locomotion and visual-motor integration subtests. Including these passed items in the scoring system does not make a significant difference in the standard scores of the subtests or the developmental motor quotients (DMQ), which supports the original ceiling rule of not passing 3 items in a row. However, putting the passed items above the ceiling into the raw score will improve tracking of children's developmental trajectories and intervention effects.
Green, Kerry M.; Stuart, Elizabeth A.
2014-01-01
Objective This study provides guidance on how propensity score methods can be combined with moderation analyses (i.e., effect modification) to examine subgroup differences in potential causal effects in non-experimental studies. As a motivating example, we focus on how depression may affect subsequent substance use differently for men and women. Method Using data from a longitudinal community cohort study (N=952) of urban African Americans with assessments in childhood, adolescence, young adulthood and midlife, we estimate the influence of depression by young adulthood on substance use outcomes in midlife, and whether that influence varies by gender. We illustrate and compare five different techniques for estimating subgroup effects using propensity score methods, including separate propensity score models and matching for men and women, a joint propensity score model for men and women with matching separately and together by gender, and a joint male/female propensity score model that includes theoretically important gender interactions with matching separately and together by gender. Results Analyses showed that estimating separate models for men and women yielded the best balance and, therefore, is a preferred technique when subgroup analyses are of interest, at least in this data. Results also showed substance use consequences of depression but no significant gender differences. Conclusions It is critical to prespecify subgroup effects before the estimation of propensity scores and to check balance within subgroups regardless of the type of propensity score model used. Results also suggest that depression may affect multiple substance use outcomes in midlife for both men and women relatively equally. PMID:24731233
Roguev, Assen; Ryan, Colm J; Xu, Jiewei; Colson, Isabelle; Hartsuiker, Edgar; Krogan, Nevan
2018-02-01
This protocol describes computational analysis of genetic interaction screens, ranging from data capture (plate imaging) to downstream analyses. Plate imaging approaches using both digital camera and office flatbed scanners are included, along with a protocol for the extraction of colony size measurements from the resulting images. A commonly used genetic interaction scoring method, calculation of the S-score, is discussed. These methods require minimal computer skills, but some familiarity with MATLAB and Linux/Unix is a plus. Finally, an outline for using clustering and visualization software for analysis of resulting data sets is provided. © 2018 Cold Spring Harbor Laboratory Press.
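The S-score mentioned above is, in essence, a modified t-statistic comparing a double mutant's colony sizes against control colony sizes; the published version adds variance smoothing and bounds that are omitted in this simplified sketch. Colony-size values below are invented:

```python
# Simplified S-score sketch: a t-like statistic on colony sizes.
# (The published S-score adds variance smoothing; omitted here.)
import math
import statistics

def s_score(experiment, control):
    var_e = statistics.variance(experiment)
    var_c = statistics.variance(control)
    se = math.sqrt(var_e / len(experiment) + var_c / len(control))
    return (statistics.mean(experiment) - statistics.mean(control)) / se

exp_sizes  = [310.0, 295.0, 305.0, 290.0]   # illustrative colony areas (px)
ctrl_sizes = [400.0, 410.0, 395.0, 405.0]
print(s_score(exp_sizes, ctrl_sizes))       # negative => aggravating (sick)
```

Negative S-scores indicate colonies smaller than expected (aggravating/synthetic-sick interactions); positive scores indicate alleviating interactions.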
Orsini, A; Pezzuti, L; Hulbert, S
2015-05-01
It is now widely known that children with severe intellectual disability show a 'floor effect' on the Wechsler scales. This effect emerges because the practice of transforming raw scores into scaled scores eliminates any variability present in participants with low intellectual ability and because intelligence quotient (IQ) scores are limited insofar as they do not measure scores lower than 40. Following Hessl et al.'s results, the present authors propose a method for the computation of the Wechsler Intelligence Scale for Children--4th Ed. (WISC-IV)'s IQ and Indexes in intellectually disabled participants affected by a floored pattern of results. The Italian standardization sample (n = 2200) for the WISC-IV was used. The method presented in this study highlights the limits of the 'floor effect' of the WISC-IV in children with serious intellectual disability who present a profile with weighted scores of 1 in all the subtests despite some variability in the raw scores. Such method eliminates the floor effect of the scale and therefore makes it possible to analyse the strengths and weaknesses of the WISC-IV's Indexes in these participants. The Authors reflect on clinical utility of this method and on the meaning of raw score of 0 on subtest. © 2014 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
Austin, Peter C
2018-05-20
Propensity score methods are increasingly being used to estimate the effects of treatments and exposures when using observational data. The propensity score was initially developed for use with binary exposures. The generalized propensity score (GPS) is an extension of the propensity score for use with quantitative or continuous exposures (eg, dose or quantity of medication, income, or years of education). We used Monte Carlo simulations to examine the performance of different methods of using the GPS to estimate the effect of continuous exposures on binary outcomes. We examined covariate adjustment using the GPS and weighting using weights based on the inverse of the GPS. We examined both the use of ordinary least squares to estimate the propensity function and the use of the covariate balancing propensity score algorithm. The use of methods based on the GPS was compared with the use of G-computation. All methods resulted in essentially unbiased estimation of the population dose-response function. However, GPS-based weighting tended to result in estimates that displayed greater variability and had higher mean squared error when the magnitude of confounding was strong. Of the methods based on the GPS, covariate adjustment using the GPS tended to result in estimates with lower variability and mean squared error when the magnitude of confounding was strong. We illustrate the application of these methods by estimating the effect of average neighborhood income on the probability of death within 1 year of hospitalization for an acute myocardial infarction. © 2018 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
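One common way to operationalize the GPS for a continuous exposure, consistent with the OLS approach described above, is to model the exposure as normally distributed around a covariate-dependent mean and use density ratios as stabilized weights. The sketch below is a toy illustration (fitted values, residual SD, and marginal SD are all invented; the OLS fit itself is omitted):

```python
# GPS sketch for a continuous exposure: evaluate the normal density of
# each subject's exposure at its OLS-fitted mean; stabilized weights
# divide the marginal exposure density by the GPS. Toy numbers throughout.
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

exposure = [1.0, 2.0, 3.0, 4.0]   # e.g. scaled neighborhood income
fitted   = [1.2, 1.8, 3.3, 3.7]   # fitted means from OLS of exposure on covariates
sigma    = 0.5                    # residual SD from the OLS fit (assumed)
mu_marg  = sum(exposure) / len(exposure)
sd_marg  = 1.29                   # marginal SD of exposure (assumed)

weights = [normal_pdf(a, mu_marg, sd_marg) / normal_pdf(a, m, sigma)
           for a, m in zip(exposure, fitted)]
print([round(w, 3) for w in weights])
```

Subjects whose exposure is poorly explained by their covariates receive large weights, which is one source of the extra variability the simulations attribute to GPS-based weighting under strong confounding.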
2012-01-01
Background Existing methods for predicting protein solubility on overexpression in Escherichia coli advance performance by using ensemble classifiers such as two-stage support vector machine (SVM) based classifiers and a number of feature types such as physicochemical properties, amino acid and dipeptide composition, accompanied with feature selection. It is desirable to develop a simple and easily interpretable method for predicting protein solubility, compared to existing complex SVM-based methods. Results This study proposes a novel scoring card method (SCM) by using dipeptide composition only to estimate solubility scores of sequences for predicting protein solubility. SCM calculates the propensities of 400 individual dipeptides to be soluble using statistic discrimination between soluble and insoluble proteins of a training data set. Consequently, the propensity scores of all dipeptides are further optimized using an intelligent genetic algorithm. The solubility score of a sequence is determined by the weighted sum of all propensity scores and dipeptide composition. To evaluate SCM by performance comparisons, four data sets with different sizes and variation degrees of experimental conditions were used. The results show that the simple method SCM with interpretable propensities of dipeptides has promising performance, compared with existing SVM-based ensemble methods with a number of feature types. Furthermore, the propensities of dipeptides and solubility scores of sequences can provide insights to protein solubility. For example, the analysis of dipeptide scores shows high propensity of α-helix structure and thermophilic proteins to be soluble. Conclusions The propensities of individual dipeptides to be soluble are varied for proteins under altered experimental conditions. 
For accurately predicting protein solubility using SCM, it is better to customize the score card of dipeptide propensities by using a training data set under the same specified experimental conditions. The proposed method SCM with solubility scores and dipeptide propensities can be easily applied to the protein function prediction problems that dipeptide composition features play an important role. Availability The used datasets, source codes of SCM, and supplementary files are available at http://iclab.life.nctu.edu.tw/SCM/. PMID:23282103
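The core of the SCM idea, a sequence score formed as the dipeptide-composition-weighted sum of propensity scores, can be sketched in a few lines. The propensity values in the score card below are invented for illustration; the paper's actual card holds optimized propensities for all 400 dipeptides:

```python
# SCM sketch: solubility score = sum over dipeptides of
# (fraction of that dipeptide in the sequence) * (its propensity score).

def dipeptide_composition(seq):
    total = len(seq) - 1                      # number of overlapping dipeptides
    comp = {}
    for i in range(total):
        dp = seq[i:i + 2]
        comp[dp] = comp.get(dp, 0) + 1
    return {dp: n / total for dp, n in comp.items()}

def scm_score(seq, propensity, default=0.5):
    comp = dipeptide_composition(seq)
    return sum(frac * propensity.get(dp, default) for dp, frac in comp.items())

propensity = {"AL": 0.9, "LK": 0.7, "KA": 0.6}   # hypothetical score card
print(round(scm_score("ALKAL", propensity), 3))  # 0.775
```

Classification then reduces to comparing the score against a threshold learned from the soluble/insoluble training split.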
Comparative study of multimodal biometric recognition by fusion of iris and fingerprint.
Benaliouche, Houda; Touahria, Mohamed
2014-01-01
This research investigates the comparative performance from three different approaches for multimodal recognition of combined iris and fingerprints: classical sum rule, weighted sum rule, and fuzzy logic method. The scores from the different biometric traits of iris and fingerprint are fused at the matching score and the decision levels. The scores combination approach is used after normalization of both scores using the min-max rule. Our experimental results suggest that the fuzzy logic method for the matching scores combinations at the decision level is the best followed by the classical weighted sum rule and the classical sum rule in order. The performance evaluation of each method is reported in terms of matching time, error rates, and accuracy after doing exhaustive tests on the public CASIA-Iris databases V1 and V2 and the FVC 2004 fingerprint database. Experimental results prior to fusion and after fusion are presented followed by their comparison with related works in the current literature. The fusion by fuzzy logic decision mimics the human reasoning in a soft and simple way and gives enhanced results. PMID:24605065
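The min-max normalization and the two classical fusion rules compared above can be sketched directly. The score ranges and fusion weights below are illustrative, not the ones used in the study:

```python
# Min-max normalization of matching scores, then sum-rule and
# weighted-sum fusion of iris and fingerprint scores.
# Ranges and weights are illustrative.

def min_max(score, lo, hi):
    """Map a raw matcher score into [0, 1]."""
    return (score - lo) / (hi - lo)

iris_score = min_max(62.0, lo=0.0, hi=100.0)   # 0.62
fp_score   = min_max(0.45, lo=0.0, hi=1.0)     # 0.45

sum_rule     = iris_score + fp_score             # classical sum rule
weighted_sum = 0.6 * iris_score + 0.4 * fp_score # iris weighted higher
print(sum_rule, weighted_sum)
```

The fuzzy logic variant replaces the fixed weights with membership functions and rules over the normalized scores, which is what lets it mimic human reasoning at the decision level.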
Clinical outcomes of arthroscopic single and double row repair in full thickness rotator cuff tears.
Ji, Jong-Hun; Shafi, Mohamed; Kim, Weon-Yoo; Kim, Young-Yul
2010-07-01
There has been recent interest in the double row repair method for arthroscopic rotator cuff repair, following favourable biomechanical results reported by some studies. The purpose of this study was to compare the clinical results of the arthroscopic single row and double row repair methods in full-thickness rotator cuff tears. 22 patients who underwent arthroscopic single row repair (group I) and 25 patients who underwent double row repair (group II) from March 2003 to March 2005 were retrospectively evaluated and compared for clinical outcomes. The mean age was 58 years and 56 years for groups I and II, respectively. The average follow-up in the two groups was 24 months. The evaluation was done using the University of California Los Angeles (UCLA) rating scale and the shoulder index of the American Shoulder and Elbow Surgeons (ASES). In group I, the mean ASES score increased from 30.48 to 87.40; in group II, it increased from 32.00 to 91.45. The mean UCLA score increased from a preoperative 12.23 to 30.82 in group I and from 12.20 to 32.40 in group II. No statistically significant clinical differences were found between the two methods, but based on the subscores of the UCLA score, the double row repair method yielded better results for strength and gave more satisfaction to the patients than the single row repair method.
Shulruf, Boaz; Turner, Rolf; Poole, Phillippa; Wilkinson, Tim
2013-05-01
The decision to pass or fail a medical student is a 'high stakes' one. The aim of this study is to introduce and demonstrate the feasibility and practicality of a new objective standard-setting method for determining the pass/fail cut-off score from borderline grades. Three methods for setting up pass/fail cut-off scores were compared: the Regression Method, the Borderline Group Method, and the new Objective Borderline Method (OBM). Using Year 5 students' OSCE results from one medical school we established the pass/fail cut-off scores by the abovementioned three methods. The comparison indicated that the pass/fail cut-off scores generated by the OBM were similar to those generated by the more established methods (0.840 ≤ r ≤ 0.998; p < .0001). Based on theoretical and empirical analysis, we suggest that the OBM has advantages over existing methods in that it combines objectivity, realism, robust empirical basis and, no less importantly, is simple to use.
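Of the three methods compared, the Borderline Group Method is the simplest to state: the cut score is the mean checklist score of the examinees judged "borderline" by the examiners. A minimal sketch with invented scores and grades:

```python
# Borderline Group Method sketch: cut score = mean checklist score of
# examinees whose global grade is "borderline". Data are made up.

def borderline_group_cut(scores, grades, borderline="B"):
    group = [s for s, g in zip(scores, grades) if g == borderline]
    return sum(group) / len(group)

scores = [55.0, 48.0, 62.0, 51.0, 70.0]
grades = ["B", "B", "P", "B", "P"]       # P = pass, B = borderline
print(borderline_group_cut(scores, grades))  # mean of 55, 48, 51
```

The Regression Method instead predicts the cut from a fitted score-on-grade line, and the OBM proposed in this paper derives it from the distribution of borderline grades; the study's finding is that all three produce closely correlated cut scores (0.840 ≤ r ≤ 0.998).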
Psychometric challenges and proposed solutions when scoring facial emotion expression codes.
Olderbak, Sally; Hildebrandt, Andrea; Pinkpank, Thomas; Sommer, Werner; Wilhelm, Oliver
2014-12-01
Coding of facial emotion expressions is increasingly performed by automated emotion expression scoring software; however, there is limited discussion on how best to score the resulting codes. We present a discussion of facial emotion expression theories and a review of contemporary emotion expression coding methodology. We highlight methodological challenges pertinent to scoring software-coded facial emotion expression codes and present important psychometric research questions centered on comparing competing scoring procedures of these codes. Then, on the basis of a time series data set collected to assess individual differences in facial emotion expression ability, we derive, apply, and evaluate several statistical procedures, including four scoring methods and four data treatments, to score software-coded emotion expression data. These scoring procedures are illustrated to inform analysis decisions pertaining to the scoring and data treatment of other emotion expression questions and under different experimental circumstances. Overall, we found applying loess smoothing and controlling for baseline facial emotion expression and facial plasticity are recommended methods of data treatment. When scoring facial emotion expression ability, maximum score is preferred. Finally, we discuss the scoring methods and data treatments in the larger context of emotion expression research.
Impact of Different Creatinine Measurement Methods on Liver Transplant Allocation
Kaiser, Thorsten; Kinny-Köster, Benedict; Bartels, Michael; Parthaune, Tanja; Schmidt, Michael; Thiery, Joachim
2014-01-01
Introduction The model for end-stage liver disease (MELD) score is used in many countries to prioritize organ allocation for the majority of patients who require orthotopic liver transplantation. This score is calculated based on the following laboratory parameters: creatinine, bilirubin and the international normalized ratio (INR). Consequently, high measurement accuracy is essential for equitable and fair organ allocation. For serum creatinine measurements, the Jaffé method and enzymatic detection are well-established routine diagnostic tests. Methods A total of 1,013 samples from 445 patients on the waiting list or in evaluation for liver transplantation were measured using both creatinine methods from November 2012 to September 2013 at the University Hospital Leipzig, Germany. The measurements were performed in parallel according to the manufacturer’s instructions after the samples arrived at the Institute of Laboratory Medicine. Patients who had required renal replacement therapy twice in the previous week were excluded from the analyses. Results Despite the good correlation between the results of both creatinine quantification methods, relevant differences were observed, which led to different MELD scores. The Jaffé measurement led to a greater MELD score in 163/1,013 (16.1%) samples, with differences of up to 4 points in one patient, whereas differences of up to 2 points were identified in 15/1,013 (1.5%) samples using the enzymatic assay. Overall, 50/152 (32.9%) patients with MELD scores >20 had higher scores when the Jaffé method was used. Discussion Using the Jaffé method to measure creatinine levels in samples from patients who require liver transplantation may lead to a systematic preference in organ allocation. In this study, the differences were particularly pronounced in samples with MELD scores >20, which has clinical relevance in the context of urgency of transplantation.
These data suggest that official recommendations are needed to determine which laboratory diagnostic methods should be used when calculating MELD scores. PMID:24587188
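The creatinine sensitivity described above follows directly from the MELD formula. A minimal sketch of the classic (pre-2016, non-sodium) UNOS calculation, with the usual flooring of each laboratory value at 1.0 and capping of creatinine at 4.0:

```python
import math

def meld(creatinine_mg_dl, bilirubin_mg_dl, inr, on_dialysis=False):
    # Classic MELD: lab values are floored at 1.0; creatinine is capped
    # at 4.0 (and set to 4.0 after recent dialysis); score capped at 40.
    creat = 4.0 if on_dialysis else min(max(creatinine_mg_dl, 1.0), 4.0)
    bili = max(bilirubin_mg_dl, 1.0)
    inr = max(inr, 1.0)
    score = (9.57 * math.log(creat) + 3.78 * math.log(bili)
             + 11.2 * math.log(inr) + 6.43)
    return min(round(score), 40)
```

With this formula, reporting a creatinine of 1.5 mg/dL instead of 1.2 mg/dL (bilirubin 3.0 mg/dL, INR 1.5) shifts the score from 17 to 19, illustrating how method-dependent creatinine results can change allocation priority.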
Gloudeman, Mark W; Shah-Manek, Bijal; Wong, Terri H; Vo, Christina; Ip, Eric J
2018-02-01
The flipped teaching method was implemented through a series of multiple condensed videos for pharmaceutical calculations with student perceptions and academic performance assessed post-intervention. Student perceptions from the intervention group were assessed via an online survey. Pharmaceutical exam scores of the intervention group were compared to the control group. The intervention group spent a greater amount of class time on active learning. The majority of students (68.2%) thought that the flipped teaching method was more effective to learn pharmaceutical calculations than the traditional method. The mean exam scores of the intervention group were not significantly different than the control group (80.5 ± 15.8% vs 77.8 ± 16.8%; p = 0.253). Previous studies on the flipped teaching method have shown mixed results in regards to student perceptions and exam scores, where either student satisfaction increased or exam scores improved, but rarely both. The flipped teaching method was rated favorably by a majority of students. The flipped teaching method resulted in similar outcomes in pharmaceutical calculations exam scores, and it appears to be an acceptable and effective option to deliver pharmaceutical calculations in a Doctor of Pharmacy program. Copyright © 2017 Elsevier Inc. All rights reserved.
Rice, J P; Saccone, N L; Corbett, J
2001-01-01
The lod score method originated in a seminal article by Newton Morton in 1955. The method is broadly concerned with issues of power and the posterior probability of linkage, ensuring that a reported linkage has a high probability of being a true linkage. In addition, the method is sequential, so that pedigrees or lod curves may be combined from published reports to pool data for analysis. This approach has been remarkably successful for 50 years in identifying disease genes for Mendelian disorders. After discussing these issues, we consider the situation for complex disorders, where the maximum lod score (MLS) statistic shares some of the advantages of the traditional lod score approach but is limited by unknown power and the lack of sharing of the primary data needed to optimally combine analytic results. We may still learn from the lod score method as we explore new methods in molecular biology and genetic analysis to utilize the complete human DNA sequence and the cataloging of all human genes.
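For phase-known meioses, the lod score reduces to a comparison of binomial likelihoods at a candidate recombination fraction θ against the null value θ = 0.5. A minimal sketch (function and variable names are ours):

```python
import math

def lod(theta, recombinants, meioses):
    # LOD = log10( L(theta) / L(0.5) ) with binomial likelihood
    # L(theta) = theta^r * (1 - theta)^(n - r) over n phase-known meioses.
    r, n = recombinants, meioses
    def log_likelihood(t):
        return r * math.log(t) + (n - r) * math.log(1 - t)
    return (log_likelihood(theta) - log_likelihood(0.5)) / math.log(10)
```

For example, 2 recombinants in 20 informative meioses give a lod of about 3.2 at θ = 0.1, just past the conventional threshold of 3 for declaring linkage.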
Ocular dominance stability and reading skill: a controversial relationship.
Zeri, Fabrizio; De Luca, Maria; Spinelli, Donatella; Zoccolotti, Pierluigi
2011-11-01
Evidence is mixed concerning the relationship between stability of ocular dominance and reading deficits. Contrasting results may be due to the use of different tests of dominance, different samples of readers, and different scoring methods. The aim of this study was to investigate the relationship among ocular dominance, general visual abilities, and reading performance, and to evaluate the consistency and reliability of different tests of ocular dominance and the effects of different types of eye dominance scoring. In a group of young adults, we measured: (a) main optometric parameters; (b) reading time and accuracy; and (c) ocular dominance in two sighting and four motor tests. Dominance was determined using different scoring methods (relative, absolute, and binary scores). All dominance tests showed good levels of internal reliability. Sighting tests were consistent regardless of the scoring method, and all participants had stable dominance. Three of four motor tests were moderately consistent when dominance was measured with relative scores but not when it was measured with absolute or binary scores. No relationship was found between stability of dominance and reading performance, regardless of the type of test or scoring method. No systematic pattern of correlation was found between binocular vision variables and dominance measures. Choosing the type of motor test to measure ocular dominance is crucial, because the level of consistency among tests is low to moderate. Furthermore, motor tests were not correlated with reading performances. Present results suggest caution when trying to link reading difficulties with specific profiles of ocular dominance.
Stuart, Elizabeth A.; Lee, Brian K.; Leacy, Finbarr P.
2013-01-01
Objective Examining covariate balance is the prescribed method for determining when propensity score methods are successful at reducing bias. This study assessed the performance of various balance measures, including a proposed balance measure based on the prognostic score (also known as the disease-risk score), to determine which balance measures best correlate with bias in the treatment effect estimate. Study Design and Setting The correlations of multiple common balance measures with bias in the treatment effect estimate produced by weighting by the odds, subclassification on the propensity score, and full matching on the propensity score were calculated. Simulated data were used, based on realistic data settings. Settings included both continuous and binary covariates and continuous covariates only. Results The standardized mean difference in prognostic scores, the mean standardized mean difference, and the mean t-statistic all had high correlations with bias in the effect estimate. Overall, prognostic scores displayed the highest correlations of all the balance measures considered. Prognostic score measure performance was generally not affected by model misspecification and performed well under a variety of scenarios. Conclusion Researchers should consider using prognostic score–based balance measures for assessing the performance of propensity score methods for reducing bias in non-experimental studies. PMID:23849158
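The most common of the balance measures examined above, the standardized mean difference, is simple to compute for a single covariate. A small sketch using a pooled-SD variant (the study's exact definitions may differ):

```python
from statistics import mean, variance

def standardized_mean_difference(treated, control):
    # SMD = (mean_t - mean_c) / pooled SD; absolute values below ~0.1
    # are commonly read as adequate balance on that covariate.
    pooled_sd = ((variance(treated) + variance(control)) / 2) ** 0.5
    return (mean(treated) - mean(control)) / pooled_sd
```

The prognostic-score balance measure discussed above applies the same computation to predicted outcomes rather than to raw covariates.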
Carias, D; Cioccia, A M; Hevia, P
1995-06-01
Protein digestibility is a key factor in the determination of protein quality using the chemical score. Since there are several methods available for determining protein digestibility, the purpose of this study was to compare three methods in vitro (pH drop, pH stat and pepsin digestibility) and two methods in vivo (true and apparent digestibility in rats) in the determination of the protein digestibility of: casein, soy protein isolate, fish meal, black beans, corn meal and wheat flour. The results showed that in the case of highly digestible proteins all methods agreed very well. However, this agreement was much less apparent in the case of proteins with digestibilities below 85%. As a result, the chemical score of these proteins varied substantially depending upon the method used to determine their digestibility. Thus, when the chemical score of the proteins analyzed was corrected by the true protein digestibility measured in rats, they ranked as: casein 83.56, soy 76.11, corn-beans mixtures (1:1) 58.14, fish meal 55.25, black beans 47.93, corn meal 46.06 and wheat flour 32.77. In contrast, when the chemical score of these proteins was corrected by the pepsin digestibility method, the lowest quality was assigned to fish meal. In summary, these results indicate that for non-conventional proteins, or for known proteins that have been subjected to processing, protein digestibility should be measured in vivo.
A method for modelling GP practice level deprivation scores using GIS
Strong, Mark; Maheswaran, Ravi; Pearson, Tim; Fryers, Paul
2007-01-01
Background A measure of general practice level socioeconomic deprivation can be used to explore the association between deprivation and other practice characteristics. An area-based categorisation is commonly chosen as the basis for such a deprivation measure. Ideally a practice population-weighted area-based deprivation score would be calculated using individual level spatially referenced data. However, these data are often unavailable. One approach is to link the practice postcode to an area-based deprivation score, but this method has limitations. This study aimed to develop a Geographical Information Systems (GIS) based model that could better predict a practice population-weighted deprivation score in the absence of patient level data than simple practice postcode linkage. Results We calculated predicted practice level Index of Multiple Deprivation (IMD) 2004 deprivation scores using two methods that did not require patient level data. Firstly we linked the practice postcode to an IMD 2004 score, and secondly we used a GIS model derived using data from Rotherham, UK. We compared our two sets of predicted scores to "gold standard" practice population-weighted scores for practices in Doncaster, Havering and Warrington. Overall, the practice postcode linkage method overestimated "gold standard" IMD scores by 2.54 points (95% CI 0.94, 4.14), whereas our modelling method showed no such bias (mean difference 0.36, 95% CI -0.30, 1.02). The postcode-linked method systematically underestimated the gold standard score in less deprived areas, and overestimated it in more deprived areas. Our modelling method showed a small underestimation in scores at higher levels of deprivation in Havering, but showed no bias in Doncaster or Warrington. The postcode-linked method showed more variability when predicting scores than did the GIS modelling method. 
Conclusion A GIS based model can be used to predict a practice population-weighted area-based deprivation measure in the absence of patient level data. Our modelled measure generally had better agreement with the population-weighted measure than did a postcode-linked measure. Our model may also avoid an underestimation of IMD scores in less deprived areas, and overestimation of scores in more deprived areas, seen when using postcode linked scores. The proposed method may be of use to researchers who do not have access to patient level spatially referenced data. PMID:17822545
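The "gold standard" target in the study above is straightforward once patient-level data are available: each small area's IMD score is weighted by the number of the practice's patients registered there. A minimal sketch (the data structure is ours):

```python
def population_weighted_imd(areas):
    # Practice-level score: the IMD of each small area weighted by the
    # number of the practice's patients living in that area.
    # `areas` is a list of (patient_count, imd_score) pairs.
    total_patients = sum(patients for patients, imd in areas)
    return sum(patients * imd for patients, imd in areas) / total_patients

# Example: 1,200 patients in a deprived area, 800 in a less deprived one.
score = population_weighted_imd([(1200, 35.0), (800, 12.0)])
```

The GIS model in the paper approximates this quantity when the (patient_count, area) pairs are unknown; the postcode-linked method instead uses only the single area containing the practice.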
Comparison of Manual Refraction Versus Autorefraction in 60 Diabetic Retinopathy Patients
Shirzadi, Keyvan; Shahraki, Kourosh; Yahaghi, Emad; Makateb, Ali; Khosravifard, Keivan
2016-01-01
Aim: The purpose of the study was to compare manual refraction with autorefraction in diabetic retinopathy patients. Material and Methods: The study was conducted at the Be’sat Army Hospital from 2013-2015. In the present study, differences between two common refractometry methods (manual refractometry and autorefractometry) in the diagnosis and follow-up of retinopathy in patients affected with diabetes are investigated. Results: Our results showed a significant difference in the visual acuity scores of patients between manual and autorefractometry. Despite this, the spherical equivalent scores of the two refractometry methods did not show a statistically significant difference in these patients. Conclusion: Although manual refraction is comparable with autorefraction for evaluating spherical equivalent scores in diabetic patients affected with retinopathy, the visual acuity results from these two methods are not comparable. PMID:27703289
Tian, Yuxi; Schuemie, Martijn J; Suchard, Marc A
2018-06-22
Propensity score adjustment is a popular approach for confounding control in observational studies. Reliable frameworks are needed to determine relative propensity score performance in large-scale studies, and to establish optimal propensity score model selection methods. We detail a propensity score evaluation framework that includes synthetic and real-world data experiments. Our synthetic experimental design extends the 'plasmode' framework and simulates survival data under known effect sizes, and our real-world experiments use a set of negative control outcomes with presumed null effect sizes. In reproductions of two published cohort studies, we compare two propensity score estimation methods that contrast in their model selection approach: L1-regularized regression that conducts a penalized likelihood regression, and the 'high-dimensional propensity score' (hdPS) that employs a univariate covariate screen. We evaluate methods on a range of outcome-dependent and outcome-independent metrics. L1-regularization propensity score methods achieve superior model fit, covariate balance and negative control bias reduction compared with the hdPS. Simulation results are mixed and fluctuate with simulation parameters, revealing a limitation of simulation under the proportional hazards framework. Including regularization with the hdPS reduces commonly reported non-convergence issues but has little effect on propensity score performance. L1-regularization incorporates all covariates simultaneously into the propensity score model and offers propensity score performance superior to the hdPS marginal screen.
Goos, Matthias; Schubach, Fabian; Seifert, Gabriel; Boeker, Martin
2016-08-17
Health professionals often manage medical problems in critical situations under time pressure and on the basis of vague information. In recent years, dual process theory has provided a framework of cognitive processes to assist students in developing clinical reasoning skills, which are critical especially in surgery due to the high workload and elevated stress levels. However, clinical reasoning skills can be observed only indirectly, and the corresponding constructs are difficult to measure in order to assess student performance. The script concordance test has been established in this field. A number of studies suggest that the test delivers a valid assessment of clinical reasoning. However, different scoring methods have been suggested. They reflect different interpretations of the underlying construct. In this work we want to shed light on the theoretical framework of script theory and give an idea of script concordance testing. We constructed a script concordance test in the clinical context of "acute abdomen" and compared previously proposed scores with regard to their validity. A test comprising 52 items in 18 clinical scenarios was developed, revised along the guidelines and administered to 56 fourth- and fifth-year medical students at the end of a blended-learning seminar. We scored the answers using five different scoring methods (distance (2×), aggregate (2×), single best answer) and compared the scoring keys, the resulting final scores and Cronbach's α after normalization of the raw scores. All scores except the single best answer calculation achieved acceptable reliability (≥ 0.75), as measured by Cronbach's α. Students were clearly distinguishable from the experts, whose results were set to a mean of 80 and an SD of 5 by the normalization process.
With the two aggregate scoring methods, the students' mean values were between 62.5 (AGGPEN) and 63.9 (AGG), equivalent to about three expert SD below the experts' mean value (Cronbach's α: 0.76 (AGGPEN) and 0.75 (AGG)). With the two distance scoring methods, the students' mean was between 62.8 (DMODE) and 66.8 (DMEAN), equivalent to about two expert SD below the experts' mean value (Cronbach's α: 0.77 (DMODE) and 0.79 (DMEAN)). In this study the single best answer (SBA) scoring key yielded the worst psychometric results (Cronbach's α: 0.68). Assuming the psychometric properties of the script concordance test scores are valid, clinical reasoning skills can be measured reliably with different scoring keys in the SCT presented here. Psychometrically, the distance methods seem to be superior, wherein inherent statistical properties of the scales might play a significant role. For methodological reasons, the aggregate methods can also be used. Despite the limitations and complexity of the underlying scoring process and the calculation of reliability, we advocate for the SCT because it allows a new perspective on the measurement and teaching of cognitive skills.
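For readers unfamiliar with SCT scoring, aggregate methods typically award partial credit in proportion to how many expert panelists chose each response option, most commonly dividing each option's panel count by the modal count. A sketch under that assumption (the AGG/AGGPEN variants above differ in detail):

```python
def sct_aggregate_key(panel_answers):
    # Aggregate scoring key: the credit for each response option is the
    # number of panelists who chose it divided by the modal (most
    # frequent) count, so the majority answer earns full credit.
    counts = {}
    for answer in panel_answers:
        counts[answer] = counts.get(answer, 0) + 1
    modal = max(counts.values())
    return {answer: n / modal for answer, n in counts.items()}

# Ten panelists on a -1 / 0 / +1 Likert item:
key = sct_aggregate_key([1] * 6 + [0] * 3 + [-1])
```

Here an examinee answering +1 scores 1.0, 0 scores 0.5, and -1 scores 1/6, reflecting the distribution of expert opinion rather than a single "correct" answer.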
Music behind Scores: Case Study of Learning Improvisation with "Playback Orchestra" Method
ERIC Educational Resources Information Center
Juntunen, P.; Ruokonen, I.; Ruismäki, H.
2015-01-01
For music students in the early stages of learning, the music may seem to be hidden behind the scores. To support home practising, Juntunen has created the "Playback Orchestra" method with which the students can practise with the support of the notation program playback of the full orchestra. The results of testing the method with…
ERIC Educational Resources Information Center
Grant, Mary C.; Zhang, Lilly; Damiano, Michele
2009-01-01
This study investigated kernel equating methods by comparing these methods to operational equatings for two tests in the SAT Subject Tests[TM] program. GENASYS (ETS, 2007) was used for all equating methods and scaled score kernel equating results were compared to Tucker, Levine observed score, chained linear, and chained equipercentile equating…
Döring, Sophie; Arzi, Boaz; Barich, Catherine R; Hatcher, David C; Kass, Philip H; Verstraete, Frank J M
2018-01-01
OBJECTIVE To evaluate the diagnostic yield of dental radiography (Rad method) and 3 cone-beam CT (CBCT) methods for the identification of predefined anatomic landmarks in brachycephalic dogs. ANIMALS 19 client-owned brachycephalic dogs admitted for evaluation and treatment of dental disease. PROCEDURES 26 predefined anatomic landmarks were evaluated separately by use of the Rad method and 3 CBCT software modules (serial CBCT slices and custom cross sections, tridimensional rendering, and reconstructed panoramic views). A semiquantitative scoring system was used, and mean scores were calculated for each anatomic landmark and imaging method. The Friedman test was used to evaluate values for significant differences in diagnostic yield. For values that were significant, the Wilcoxon signed rank test was used with the Bonferroni-Holm multiple comparison adjustment to determine significant differences among each of the 6 possible pairs of diagnostic methods. RESULTS Differences of diagnostic yield among the Rad and 3 CBCT methods were significant for 19 of 26 anatomic landmarks. For these landmarks, Rad scores were significantly higher than scores for reconstructed panoramic views for 4 of 19 anatomic landmarks, but Rad scores were significantly lower than scores for reconstructed panoramic views for 8 anatomic landmarks, tridimensional rendering for 18 anatomic landmarks, and serial CBCT slices and custom cross sections for all 19 anatomic landmarks. CONCLUSIONS AND CLINICAL RELEVANCE CBCT methods were better suited than dental radiography for the identification of anatomic landmarks in brachycephalic dogs. Results of this study can serve as a basis for CBCT evaluation of dental disorders in brachycephalic dogs.
Assessment of LVEF using a new 16-segment wall motion score in echocardiography.
Lebeau, Real; Serri, Karim; Lorenzo, Maria Di; Sauvé, Claude; Le, Van Hoai Viet; Soulières, Vicky; El-Rayes, Malak; Pagé, Maude; Zaïani, Chimène; Garot, Jérôme; Poulin, Frédéric
2018-06-01
The Simpson biplane method and 3D transthoracic echocardiography (TTE), radionuclide angiography (RNA) and cardiac magnetic resonance imaging (CMR) are the most accepted techniques for left ventricular ejection fraction (LVEF) assessment. The wall motion score index (WMSI) by TTE is an accepted complement. However, the conversion from WMSI to LVEF is obtained through a regression equation, which may limit its use. In this retrospective study, we aimed to validate a new method to derive LVEF from the wall motion score in 95 patients. The new score consists of attributing a segmental EF to each LV segment based on the wall motion score and averaging all 16 segmental EFs into a global LVEF. This segmental EF score was calculated on TTE in 95 patients, and RNA was used as the reference LVEF method. LVEF using the new segmental EF 15-40-65 score on TTE was compared to the reference method using linear regression and Bland-Altman analyses. The median LVEF was 45% (interquartile range 32-53%; range 15 to 65%). Our new segmental EF 15-40-65 score derived on TTE correlated strongly with RNA-LVEF (r = 0.97). Overall, the new score resulted in good agreement of LVEF compared to RNA (mean bias 0.61%). The standard deviation of the inter-method differences between the new score and RNA was 6.2%, indicating good precision. LVEF assessment using segmental EF derived from the wall motion score applied to each of the 16 LV segments has excellent correlation and agreement with a reference method. © 2018 The authors.
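The name "15-40-65" suggests the segmental EF values attributed to akinetic, hypokinetic and normal segments respectively; the value for dyskinetic segments is not stated in the abstract, so mapping it to 15 below is our assumption:

```python
# Wall motion score -> assumed segmental EF (%):
# 1 = normal, 2 = hypokinetic, 3 = akinetic, 4 = dyskinetic (assumed 15).
SEGMENTAL_EF = {1: 65, 2: 40, 3: 15, 4: 15}

def lvef_from_wall_motion(scores16):
    # Global LVEF = mean of the 16 per-segment EFs implied by the
    # wall motion score of each LV segment.
    if len(scores16) != 16:
        raise ValueError("expected 16 segmental scores")
    return sum(SEGMENTAL_EF[s] for s in scores16) / 16
```

For instance, a ventricle with eight normal and eight akinetic segments yields (8×65 + 8×15)/16 = 40%, with no regression equation needed.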
Houssaini, Allal; Assoumou, Lambert; Miller, Veronica; Calvez, Vincent; Marcelin, Anne-Geneviève; Flandre, Philippe
2013-01-01
Background Several attempts have been made to determine HIV-1 resistance from genotype resistance testing. We compare scoring methods for building weighted genotyping scores and commonly used systems to determine whether the virus of a HIV-infected patient is resistant. Methods and Principal Findings Three statistical methods (linear discriminant analysis, support vector machine and logistic regression) are used to determine the weight of mutations involved in HIV resistance. We compared these weighted scores with known interpretation systems (ANRS, REGA and Stanford HIV-db) to classify patients as resistant or not. Our methodology is illustrated on the Forum for Collaborative HIV Research didanosine database (N = 1453). The database was divided into four samples according to the country of enrolment (France, USA/Canada, Italy and Spain/UK/Switzerland). The total sample and the four country-based samples allow external validation (one sample is used to estimate a score and the other samples are used to validate it). We used the observed precision to compare the performance of newly derived scores with other interpretation systems. Our results show that newly derived scores performed better than or similar to existing interpretation systems, even with external validation sets. No difference was found between the three methods investigated. Our analysis identified four new mutations associated with didanosine resistance: D123S, Q207K, H208Y and K223Q. Conclusions We explored the potential of three statistical methods to construct weighted scores for didanosine resistance. Our proposed scores performed at least as well as already existing interpretation systems and previously unrecognized didanosine-resistance associated mutations were identified. This approach could be used for building scores of genotypic resistance to other antiretroviral drugs. PMID:23555613
A new approach for computing a flood vulnerability index using cluster analysis
NASA Astrophysics Data System (ADS)
Fernandez, Paulo; Mourato, Sandra; Moreira, Madalena; Pereira, Luísa
2016-08-01
A Flood Vulnerability Index (FloodVI) was developed using Principal Component Analysis (PCA) and a new aggregation method based on Cluster Analysis (CA). PCA simplifies a large number of variables into a few uncorrelated factors representing the social, economic, physical and environmental dimensions of vulnerability. CA groups areas that have the same characteristics in terms of vulnerability into vulnerability classes. With CA, the grouping of the areas determines their classification, contrary to other aggregation methods in which the areas' classification determines their grouping. Other aggregation methods distribute the areas into classes artificially, by imposing a certain probability for an area to belong to a certain class under the assumption that the aggregation measure used is normally distributed; CA does not constrain the distribution of the areas by the classes. FloodVI was designed at the neighbourhood level and was applied to the Portuguese municipality of Vila Nova de Gaia, where several flood events have taken place in the recent past. The FloodVI sensitivity was assessed using three different aggregation methods: the sum of component scores, the first component score and the weighted sum of component scores. The results highlight the sensitivity of the FloodVI to different aggregation methods. The sum of component scores and the weighted sum of component scores showed similar results. The first component score aggregation method classifies almost all areas as having medium vulnerability, whereas the results obtained using CA show a distinct differentiation of vulnerability, where hot spots can be clearly identified. The information provided by records of previous flood events corroborates the results obtained with CA, because the inundated areas with greater damage are those identified as high and very high vulnerability areas by CA. This supports the fact that CA provides a reliable FloodVI.
Estimating health state utility values for comorbid health conditions using SF-6D data.
Ara, Roberta; Brazier, John
2011-01-01
When health state utility values for comorbid health conditions are not available, data from cohorts with single conditions are used to estimate scores. The methods used can produce very different results and there is currently no consensus on which is the most appropriate approach. The objective of the current study was to compare the accuracy of five different methods within the same dataset. Data collected during five Welsh Health Surveys were subgrouped by health status. Mean short-form 6 dimension (SF-6D) scores for cohorts with a specific health condition were used to estimate mean SF-6D scores for cohorts with comorbid conditions using the additive, multiplicative, and minimum methods, the adjusted decrement estimator (ADE), and a linear regression model. The mean SF-6D for subgroups with comorbid health conditions ranged from 0.4648 to 0.6068. The linear model produced the most accurate scores for the comorbid health conditions with 88% of values accurate to within the minimum important difference for the SF-6D. The additive and minimum methods underestimated or overestimated the actual SF-6D scores respectively. The multiplicative and ADE methods both underestimated the majority of scores. However, both methods performed better when estimating scores smaller than 0.50. Although the range in actual health state utility values (HSUVs) was relatively small, our data covered the lower end of the index and the majority of previous research has involved actual HSUVs at the upper end of possible ranges. Although the linear model gave the most accurate results in our data, additional research is required to validate our findings. Copyright © 2011 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
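The first three estimation approaches compared above have simple closed forms when the baseline (no-condition) utility is taken as 1.0; a sketch of those three (the ADE and the regression model need fitted parameters and are omitted):

```python
def combine_utilities(u1, u2, method="multiplicative"):
    # Common approaches for estimating a comorbid health state utility
    # from two single-condition utilities, with baseline utility 1.0.
    if method == "additive":        # subtract both utility decrements
        return 1.0 - ((1.0 - u1) + (1.0 - u2))
    if method == "multiplicative":  # multiply the two utilities
        return u1 * u2
    if method == "minimum":         # keep the worse condition's utility
        return min(u1, u2)
    raise ValueError(f"unknown method: {method}")
```

With u1 = 0.8 and u2 = 0.7, the three methods give 0.5, 0.56 and 0.7 respectively, which matches the paper's observation that the additive and minimum methods bracket the range (under- and over-estimating) while the multiplicative method sits between them.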
NASA Astrophysics Data System (ADS)
Leonardi, Marcelo
The primary purpose of this study was to examine the impact of a scheduling change from a trimester 4x4 block schedule to a modified hybrid schedule on student achievement in ninth grade biology courses. This study examined the impact of the scheduling change on student achievement through teacher-created benchmark assessments in Genetics, DNA, and Evolution, and on the California Standardized Test in Biology. The secondary purpose of this study was to examine ninth grade biology teachers' perceptions of ninth grade biology student achievement. Using a mixed methods research approach, data were collected both quantitatively and qualitatively, aligned to the research questions. Quantitative methods included gathering data from departmental benchmark exams and the California Standardized Test in Biology and conducting multivariate analyses of covariance and analyses of covariance to determine significant differences. Qualitative methods included journal-entry questions and focus group interviews. The results revealed a statistically significant increase in scores on both the DNA and Evolution benchmark exams. DNA and Evolution benchmark exams showed significant improvements from the change in scheduling format. The scheduling change was responsible for 1.5% of the increase in DNA benchmark scores and 2% of the increase in Evolution benchmark scores. The results revealed a statistically significant decrease in scores on the Genetics benchmark exam as a result of the scheduling change. The scheduling change was responsible for 1% of the decrease in Genetics benchmark scores. The results also revealed a statistically significant increase in scores on the CST Biology exam. The scheduling change was responsible for 0.7% of the increase in CST Biology scores. Results of the focus group discussions indicated that all teachers preferred the modified hybrid schedule over the trimester schedule and that it improved student achievement.
Comparing State SAT Scores: Problems, Biases, and Corrections.
ERIC Educational Resources Information Center
Gohmann, Stephen F.
1988-01-01
One method to correct for selection bias in comparing Scholastic Aptitude Test (SAT) scores among states is presented, which is a modification of J. J. Heckman's Selection Bias Correction (1976, 1979). Empirical results suggest that sample selection bias is present in SAT score regressions. (SLD)
Benchmarking protein-protein interface predictions: why you should care about protein size.
Martin, Juliette
2014-07-01
A number of predictive methods have been developed to predict protein-protein binding sites. Each new method is traditionally benchmarked using sets of protein structures of various sizes, and global statistics are used to assess the quality of the prediction. Little attention has been paid to the potential bias due to protein size on these statistics. Indeed, small proteins involve proportionally more residues at interfaces than large ones. If a predictive method is biased toward small proteins, this can lead to an over-estimation of its performance. Here, we investigate the bias due to the size effect when benchmarking protein-protein interface prediction on the widely used docking benchmark 4.0. First, we simulate random scores that favor small proteins over large ones. Instead of the 0.5 AUC (Area Under the Curve) value expected by chance, these biased scores result in an AUC equal to 0.6 using hypergeometric distributions, and up to 0.65 using constant scores. We then use real prediction results to illustrate how to detect the size bias by shuffling, and subsequently correct it using a simple conversion of the scores into normalized ranks. In addition, we investigate the scores produced by eight published methods and show that they are all affected by the size effect, which can change their relative ranking. The size effect also has an impact on linear combination scores by modifying the relative contributions of each method. In the future, systematic corrections should be applied when benchmarking predictive methods using data sets with mixed protein sizes. © 2014 Wiley Periodicals, Inc.
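The rank-based correction mentioned above can be sketched as follows: within each protein, raw residue scores are replaced by their rank scaled to (0, 1], so that proteins of different sizes contribute comparable score distributions to the pooled benchmark (ties are ignored here for brevity; the details are ours):

```python
def normalized_ranks(scores):
    # Replace each raw score by rank / n within one protein, so the
    # best-scoring residue always maps to 1.0 regardless of chain length.
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank / n
    return ranks
```

After this transformation, a constant or size-correlated score can no longer inflate the pooled AUC, because every protein's scores occupy the same range with the same distribution.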
Technology Performance Level (TPL) Scoring Tool
DOE Office of Scientific and Technical Information (OSTI.GOV)
Weber, Jochem; Roberts, Jesse D.; Costello, Ronan
2016-09-01
Three different ways of combining scores are used in the revised formulation: arithmetic mean, geometric mean, and multiplication with normalisation. The arithmetic mean is used when combining scores that measure similar attributes, e.g. costs. It behaves like a logical OR: when combining costs, it does not matter what the individual costs are, only what the combined cost is. The geometric mean and multiplication are used when combining scores that measure disparate attributes. Multiplication behaves like a logical AND and is used to combine 'must haves'; as a result, it is more punitive than the geometric mean, since a good combined score requires a good score in ALL of the inputs, e.g. the different types of survivability are 'must haves.' On balance, the revised TPL is probably less punitive than the previous spreadsheet, as multiplication is used sparingly as a method of combining scores. This is in line with the feedback of the Wave Energy Prize judges.
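The three combination rules described above can be sketched as below. The 0-9 score scale used for the normalisation constant is an assumption for illustration; the abstract does not state the scale.

```python
# Minimal sketch of the three score-combination rules: arithmetic mean
# ("OR"-like), geometric mean, and multiplication with normalization
# ("AND"-like, most punitive). Assumes scores on a 0-9 scale.
import math

def arithmetic_mean(scores):
    return sum(scores) / len(scores)

def geometric_mean(scores):
    return math.prod(scores) ** (1 / len(scores))

def multiply_normalized(scores, max_score=9):
    # One low "must have" drags the combined score down.
    return max_score * math.prod(s / max_score for s in scores)

costs = arithmetic_mean([6, 8])            # similar attributes -> 7.0
must_haves = multiply_normalized([9, 9, 3])  # one weak input -> low result
```

Note how `multiply_normalized([9, 9, 3])` collapses to roughly 3 even though two inputs are perfect, which is the punitive behavior the abstract attributes to the multiplicative rule.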
Building a composite score of general practitioners' intrinsic motivation: a comparison of methods.
Sicsic, Jonathan; Le Vaillant, Marc; Franc, Carine
2014-04-01
Pay-for-performance programmes have been widely implemented in primary care, but few studies have investigated their potential adverse effects on the intrinsic motivation of general practitioners (GPs) even though intrinsic motivation may be a key determinant of quality in health care. Our aim was to compare methods for developing a composite score of GPs' intrinsic motivation and to select the one that is most consistent with self-reported data. Data were collected through a postal survey of French GPs in private practice. Using a set of variables selected to characterize the dimensions of intrinsic motivation, three alternative composite scores were calculated based on a multiple correspondence analysis (MCA), a confirmatory factor analysis (CFA) and a two-parameter logistic model (2-PLM). Weighted kappa coefficients were used to evaluate variation in GPs' ranks according to each method. The three methods produced similar results on both the estimation of the indicators' weights and the order of GP rank lists. All weighted kappa coefficients were >0.80. The CFA and 2-PLM produced the most similar results. There was little difference among the three methods' results, validating our measure of GPs' intrinsic motivation. The 2-PLM appeared theoretically and empirically more robust for establishing the intrinsic motivation score. Code JEL C38, C43, I18.
Statistical analysis to assess automated level of suspicion scoring methods in breast ultrasound
NASA Astrophysics Data System (ADS)
Galperin, Michael
2003-05-01
A well-defined rule-based system has been developed for scoring the Level of Suspicion (LOS) on a 0-5 scale, based on a qualitative lexicon describing the ultrasound appearance of breast lesions. The purpose of the research is to assess and select one of the automated quantitative LOS scoring methods developed during preliminary studies on reducing benign biopsies. The study used a Computer Aided Imaging System (CAIS) to improve the uniformity and accuracy of applying the LOS scheme by automatically detecting, analyzing and comparing breast masses. The overall goal is to reduce biopsies on masses with lower levels of suspicion, rather than to increase the accuracy of diagnosis of cancers (which require biopsy anyway). On complex cyst and fibroadenoma cases, experienced radiologists were up to 50% less certain in true negatives than CAIS. Full correlation analysis was applied to determine which of the proposed LOS quantification methods serves CAIS accuracy best. This paper presents current results of applying statistical analysis to automated LOS scoring quantification for breast masses with known biopsy results. The First Order Ranking method was found to yield the most accurate results. The CAIS system (Image Companion, Data Companion software) was developed by Almen Laboratories and was used to achieve the results.
Curtis, David; Knight, Jo; Sham, Pak C
2005-09-01
Although LOD score methods have been applied to diseases with complex modes of inheritance, linkage analysis of quantitative traits has tended to rely on non-parametric methods based on regression or variance components analysis. Here, we describe a new method for LOD score analysis of quantitative traits which does not require specification of a mode of inheritance. The technique is derived from the MFLINK method for dichotomous traits. A range of plausible transmission models is constructed, constrained to yield the correct population mean and variance for the trait but differing with respect to the contribution to the variance due to the locus under consideration. Maximized LOD scores under homogeneity and admixture are calculated, as is a model-free LOD score which compares the maximized likelihoods under admixture assuming linkage and no linkage. These LOD scores have known asymptotic distributions and hence can be used to provide a statistical test for linkage. The method has been implemented in a program called QMFLINK. It was applied to data sets simulated using a variety of transmission models and to a measure of monoamine oxidase activity in 105 pedigrees from the Collaborative Study on the Genetics of Alcoholism. With the simulated data, the results showed that the new method could detect linkage well if the true allele frequency for the trait was close to that specified. However, it performed poorly on models in which the true allele frequency was much rarer. For the Collaborative Study on the Genetics of Alcoholism data set only a modest overlap was observed between the results obtained from the new method and those obtained when the same data were analysed previously using regression and variance components analysis. Of interest is that D17S250 produced a maximized LOD score under homogeneity and admixture of 2.6 but did not indicate linkage using the previous methods. 
However, this region did produce evidence for linkage in a separate data set, suggesting that QMFLINK may have been able to detect a true linkage which was not picked up by the other methods. The application of model-free LOD score analysis to quantitative traits is novel and deserves further evaluation of its merits and disadvantages relative to other methods.
Fully Convolutional Network-Based Multifocus Image Fusion.
Guo, Xiaopeng; Nie, Rencan; Cao, Jinde; Zhou, Dongming; Qian, Wenhua
2018-07-01
As the optical lenses of cameras always have a limited depth of field, captured images of the same scene are not all in focus. Multifocus image fusion is an efficient technology that can synthesize an all-in-focus image from several partially focused images. Previous methods have accomplished the fusion task in spatial or transform domains. However, fusion rules are always a problem in most methods. In this letter, from the aspect of focus region detection, we propose a novel multifocus image fusion method based on a fully convolutional network (FCN) learned from synthesized multifocus images. The primary novelty of this method is that the pixel-wise focus regions are detected through a learned FCN, and the entire image, not just image patches, is exploited to train the FCN. First, we synthesize 4500 pairs of multifocus images by repeatedly applying a Gaussian filter to each image from PASCAL VOC 2012 to train the FCN. After that, a pair of source images is fed into the trained FCN, and two score maps indicating the focus property are generated. Next, an inverted score map is averaged with the other score map to produce an aggregative score map, which takes full advantage of the focus probabilities in the two score maps. We apply a fully connected conditional random field (CRF) to the aggregative score map to obtain and refine a binary decision map for the fusion task. Finally, we exploit a weighted strategy based on the refined decision map to produce the fused image. To demonstrate the performance of the proposed method, we compare its fused results with several state-of-the-art methods on both a gray data set and a color data set. Experimental results show that the proposed method achieves superior fusion performance in both human visual quality and objective assessment.
Xu, Feng; Beyazoglu, Turker; Hefner, Evan; Gurkan, Umut Atakan
2011-01-01
Cellular alignment plays a critical role in functional, physical, and biological characteristics of many tissue types, such as muscle, tendon, nerve, and cornea. Current efforts toward regeneration of these tissues include replicating the cellular microenvironment by developing biomaterials that facilitate cellular alignment. To assess the functional effectiveness of the engineered microenvironments, one essential criterion is quantification of cellular alignment. Therefore, there is a need for rapid, accurate, and adaptable methodologies to quantify cellular alignment for tissue engineering applications. To address this need, we developed an automated method, binarization-based extraction of alignment score (BEAS), to determine cell orientation distribution in a wide variety of microscopic images. This method combines a sequenced application of median and band-pass filters, locally adaptive thresholding approaches and image processing techniques. Cellular alignment score is obtained by applying a robust scoring algorithm to the orientation distribution. We validated the BEAS method by comparing the results with the existing approaches reported in literature (i.e., manual, radial fast Fourier transform-radial sum, and gradient based approaches). Validation results indicated that the BEAS method resulted in statistically comparable alignment scores with the manual method (coefficient of determination R2=0.92). Therefore, the BEAS method introduced in this study could enable accurate, convenient, and adaptable evaluation of engineered tissue constructs and biomaterials in terms of cellular alignment and organization. PMID:21370940
Chrzanowski, Frank
2008-01-01
Two numerical methods, Decision Analysis (DA) and Potential Problem Analysis (PPA), are presented as alternative selection methods to the logical method presented in Part I. In DA, properties are weighted and outcomes are scored. The weighted scores for each candidate are totaled, and final selection is based on the totals; higher scores indicate better candidates. In PPA, potential problems are assigned a seriousness factor, and test outcomes are used to define the probability of occurrence. The seriousness-probability products are totaled, and forms with minimal scores are preferred. DA and PPA have never been compared to the logical-elimination method. Additional data were available for two forms of McN-5707 to provide complete preformulation data for five candidate forms. Weight and seriousness factors (independent variables) were obtained from a survey of experienced formulators. Scores and probabilities (dependent variables) were provided independently by Preformulation. The rankings of the five candidate forms, best to worst, were similar for all three methods. These results validate the applicability of DA and PPA for candidate form selection. DA and PPA are particularly applicable in cases where there are many candidate forms and where each form has some degree of unfavorable properties.
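The two totaling rules described above can be sketched as follows. All weights, seriousness factors, and scores are illustrative values, not the McN-5707 data.

```python
# Sketch of the DA and PPA totals. In DA, higher weighted totals are
# better; in PPA, lower seriousness-probability totals are preferred.

def da_total(weights, scores):
    """Decision Analysis: weighted total of property scores; higher is better."""
    return sum(w * s for w, s in zip(weights, scores))

def ppa_total(seriousness, probabilities):
    """Potential Problem Analysis: sum of seriousness x probability; lower is better."""
    return sum(s * p for s, p in zip(seriousness, probabilities))

# One candidate form: property scores against survey-derived weights...
da = da_total([5, 3, 2], [8, 6, 4])
# ...and its potential problems with test-derived probabilities.
ppa = ppa_total([9, 4], [0.1, 0.5])  # approximately 2.9
```

Ranking candidates then amounts to sorting by `da_total` descending (or `ppa_total` ascending) and comparing the resulting order with the logical-elimination choice.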
Purely Structural Protein Scoring Functions Using Support Vector Machine and Ensemble Learning.
Mirzaei, Shokoufeh; Sidi, Tomer; Keasar, Chen; Crivelli, Silvia
2016-08-24
The function of a protein is determined by its structure, which creates a need for efficient methods of protein structure determination to advance scientific and medical research. Because current experimental structure determination methods carry a high price tag, computational predictions are highly desirable. Given a protein sequence, computational methods produce numerous 3D structures known as decoys. However, selection of the best quality decoys is challenging, as the end users can handle only a few. Therefore, scoring functions are central to decoy selection. They combine measurable features into a single-number indicator of decoy quality. Unfortunately, current scoring functions do not consistently select the best decoys. Machine learning techniques offer great potential to improve decoy scoring. This paper presents two machine-learning based scoring functions to predict the quality of protein structures, i.e., the similarity between the predicted structure and the experimental one, without knowing the latter. We use different metrics to compare these scoring functions against three state-of-the-art scores. This is a first attempt at comparing different scoring functions using the same non-redundant dataset for training and testing and the same features. The results show that adding informative features may be more significant than the method used.
Umay, Ebru Karaca; Unlu, Ece; Saylam, Guleser Kılıc; Cakci, Aytul; Korkmaz, Hakan
2013-09-01
In this study, we aimed to evaluate dysphagia in early stroke patients using a bedside screening test, flexible fiberoptic endoscopic evaluation of swallowing (FFEES), and electrophysiological evaluation (EE), and to compare the effectiveness of these methods. Twenty-four patients who were hospitalized in our clinic within the first 3 months after stroke were included in this study. Patients were evaluated using a bedside screening test [including bedside dysphagia score (BDS), neurological examination dysphagia score (NEDS), and total dysphagia score (TDS)] and the FFEES and EE methods. Patients were divided into normal-swallowing and dysphagia groups according to the results of the evaluation methods. Patients with dysphagia as determined by any one of these methods were compared to the patients with normal swallowing based on the results of the other two methods. Based on the results of our study, a high BDS was positively correlated with dysphagia identified by the FFEES and EE methods. Moreover, the FFEES and EE results were positively correlated with each other. There was no significant correlation between NEDS and TDS levels and either the EE or the FFEES method. Bedside screening tests should be used mainly as an initial screening tool; the FFEES and EE methods should then be combined in patients found to be at risk. This diagnostic algorithm may provide a practical and fast solution for selected stroke patients.
The Apgar score has survived the test of time.
Finster, Mieczyslaw; Wood, Margaret
2005-04-01
In 1953, Virginia Apgar, M.D., published her proposal for a new method of evaluating the newborn infant. The avowed purpose of this paper was to establish a simple and clear classification of newborn infants which could be used to compare the results of obstetric practices, types of maternal pain relief and the results of resuscitation. Having considered several objective signs pertaining to the condition of the infant at birth, she selected five that could be evaluated and taught to the delivery room personnel without difficulty. These signs were heart rate, respiratory effort, reflex irritability, muscle tone and color. Sixty seconds after the complete birth of the baby, a rating of zero, one or two was given to each sign, depending on whether it was absent or present. Virginia Apgar reviewed the anesthesia records of 1025 infants born alive at Columbia Presbyterian Medical Center during the period of this report. All had been rated by her method. Infants in poor condition scored 0-2, infants in fair condition scored 3-7, while scores of 8-10 were achieved by infants in good condition. The most favorable score 1 min after birth was obtained by infants delivered vaginally with the occiput the presenting part (average 8.4). Newborns delivered by version and breech extraction had the lowest score (average 6.3). Infants delivered by cesarean section were more vigorous (average score 8.0) when spinal was the method of anesthesia versus an average score of 5.0 when general anesthesia was used. Correlating the 60 s score with neonatal mortality, Virginia Apgar found that mature infants receiving scores of 0, 1 or 2 had a neonatal death rate of 14%; those scoring 3, 4, 5, 6 or 7 had a death rate of 1.1%; and those in the 8-10 score group had a death rate of 0.13%. She concluded that the prognosis of an infant is excellent if he receives one of the upper three scores, and poor if he receives one of the lowest three.
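The scoring rule described above is simple enough to state as code: five signs, each rated 0, 1, or 2 at 60 seconds, summed to a 0-10 score and mapped to Apgar's original condition bands from the abstract.

```python
# Sketch of the Apgar scoring rule: sum of five 0-2 ratings, with the
# 0-2 / 3-7 / 8-10 condition bands reported in the 1953 paper.

SIGNS = ("heart rate", "respiratory effort", "reflex irritability",
         "muscle tone", "color")

def apgar(ratings):
    """ratings: dict mapping each sign to 0, 1, or 2."""
    assert set(ratings) == set(SIGNS)
    assert all(r in (0, 1, 2) for r in ratings.values())
    return sum(ratings.values())

def condition(score):
    if score <= 2:
        return "poor"
    if score <= 7:
        return "fair"
    return "good"
```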
Mansoorian, Mohammad Reza; Hosseiny, Marzeih Sadat; Khosravan, Shahla; Alami, Ali; Alaviani, Mehri
2015-06-01
Despite the benefits of the objective structured assessment of technical skills (OSATS) and its appropriateness for evaluating the clinical abilities of nursing students, few studies are available on the application of this method in nursing education. The purpose of this study was to compare the effect of using OSATS and traditional methods on the students' learning. We also aimed to assess students' views about these two methods and about the scores they received in them in a medical emergency course. A quasi-experimental study was performed on 45 first-semester students in nursing and medical emergencies taking a course on fundamentals of practice. The students were selected by a census method and evaluated by both the OSATS and traditional methods. Data collection was performed using checklists prepared based on the 'text book of nursing procedures checklists' published by the Iranian nursing organization, and a questionnaire covering learning rate and students' estimates of their received scores. Descriptive statistics as well as paired t-tests and independent samples t-tests were used in data analysis. The mean of students' scores in OSATS was significantly higher than their mean score in the traditional method (P = 0.01). Moreover, the mean self-evaluation score after the traditional method was approximately the same as the score the students received in that exam, whereas the mean self-evaluation score after the OSATS was somewhat lower than the scores the students received in the OSATS exam. Most students believed that OSATS can evaluate a wider range of students' knowledge and skills compared to the traditional method. Results of this study indicated the better effect of OSATS on learning and its relative superiority in precise assessment of clinical skills compared with the traditional evaluation method. Therefore, we recommend using this method in the evaluation of students in practical courses.
Proposal for a new categorization of aseptic processing facilities based on risk assessment scores.
Katayama, Hirohito; Toda, Atsushi; Tokunaga, Yuji; Katoh, Shigeo
2008-01-01
Risk assessment of aseptic processing facilities was performed using two published risk assessment tools. Calculated risk scores were compared with experimental test results, including environmental monitoring and media fill run results, in three different types of facilities. The two risk assessment tools gave a generally similar outcome. However, depending on the tool used, variations were observed in the relative scores between the facilities. For the facility yielding the lowest risk scores, the corresponding experimental test results showed no contamination, indicating that these conventional testing methods are insufficient to evaluate this kind of facility. A conventional facility having acceptable aseptic processing lines gave relatively high risk scores. The facility showing a rather high risk score demonstrated the usefulness of conventional microbiological test methods. Considering the significant gaps observed in the calculated risk scores and in the conventional microbiological test results between advanced and conventional facilities, we propose a facility categorization based on risk assessment. The most important risk factor in aseptic processing is human intervention. When human intervention is eliminated from the process by advanced hardware design, the aseptic processing facility can be classified into a new risk category that is better suited to assuring sterility based on a new set of criteria rather than on currently used microbiological analysis. To fully benefit from advanced technologies, we propose three risk categories for these aseptic facilities.
Jamali, Jamshid; Ayatollahi, Seyyed Mohammad Taghi
2015-10-01
Nurses constitute the majority of health care providers, and their mental health can affect the quality of services and patients' satisfaction. The General Health Questionnaire (GHQ-12) is a general screening tool used to detect mental disorders. The scoring method and thresholds for this questionnaire are debatable, and the cut-off points can vary from sample to sample. This study was conducted to estimate the prevalence of mental disorders among Iranian nurses using the GHQ-12 and to compare Latent Class Analysis (LCA) and K-means clustering with the traditional scoring method. A cross-sectional study was carried out in the Fars and Bushehr provinces of southern Iran in 2014. Participants were 771 Iranian nurses, who filled out the GHQ-12 questionnaire. The traditional scoring method, LCA, and K-means were used to estimate the prevalence of mental disorder among Iranian nurses. Cohen's kappa statistic was applied to assess the agreement of LCA and K-means with the traditional scoring method of the GHQ-12. The proportions of nurses with mental disorder according to the scoring method, LCA, and K-means were 36.3% (n=280), 32.2% (n=248), and 26.5% (n=204), respectively. LCA and logistic regression revealed that the prevalence of mental disorder in females was significantly higher than in males. Mental disorder in nurses was at a medium level compared to other people living in Iran. There was little difference between the prevalences of mental disorder estimated by the scoring method, K-means, and LCA. Given the advantages of LCA over K-means and the differing results of the scoring method, we suggest LCA for classifying Iranian nurses according to their mental health outcomes using the GHQ-12 questionnaire.
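The agreement statistic used in the abstract above, Cohen's kappa for two binary classifications (disorder / no disorder), can be sketched generically; this is a textbook implementation, not the study's code.

```python
# Cohen's kappa for two binary label sequences: observed agreement
# corrected for the agreement expected by chance.

def cohens_kappa(a, b):
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n   # observed agreement
    pa1 = sum(a) / n                             # rater A's positive rate
    pb1 = sum(b) / n                             # rater B's positive rate
    pe = pa1 * pb1 + (1 - pa1) * (1 - pb1)       # chance agreement
    return (po - pe) / (1 - pe)

# E.g., comparing LCA-style labels against traditional-scoring labels:
k = cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

Kappa near 1 means the two classification methods agree well beyond chance; values near 0 mean agreement is no better than chance.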
A KARAOKE System Singing Evaluation Method that More Closely Matches Human Evaluation
NASA Astrophysics Data System (ADS)
Takeuchi, Hideyo; Hoguro, Masahiro; Umezaki, Taizo
KARAOKE is a popular amusement for old and young, and many KARAOKE machines have a singing evaluation function. However, it is often said that the scores given by KARAOKE machines do not match human evaluation. In this paper, a KARAOKE scoring method strongly correlated with human evaluation is proposed: songs are evaluated based on the distance between the singing pitch and the musical scale, employing a vibrato extraction method based on template matching of the spectrum. The results show that correlation coefficients between scores given by the proposed system and human evaluation are -0.76∼-0.89.
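The pitch-distance idea above can be sketched as follows, with heavy assumptions: sung pitch is converted to semitones and scored by its distance to the nearest equal-tempered scale step. The reference frequency and the rounding to the nearest semitone are illustrative choices, not details from the paper (whose correlation with human scores is negative because larger distances mean worse singing).

```python
# Assumed sketch: distance of a sung pitch to the nearest semitone of
# the equal-tempered scale, in semitones (0 = exactly on a scale step).
import math

def semitones(freq_hz, ref_hz=440.0):
    return 12 * math.log2(freq_hz / ref_hz)

def pitch_error(freq_hz):
    st = semitones(freq_hz)
    return abs(st - round(st))

# Averaging per-frame errors gives a per-song penalty score.
mean_error = sum(pitch_error(f) for f in (440.0, 452.0, 880.0)) / 3
```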
Application of the Optimized Summed Scored Attributes Method to Sex Estimation in Asian Crania.
Tallman, Sean D; Go, Matthew C
2018-05-01
The optimized summed scored attributes (OSSA) method was recently introduced and validated for nonmetric ancestry estimation between American Black and White individuals. The method proceeds by scoring, dichotomizing, and subsequently summing ordinal morphoscopic trait scores to maximize between-group differences. This study tests the applicability of the OSSA method for sex estimation using five cranial traits given the methodological similarities between classifying sex and ancestry. A large sample of documented crania from Japan and Thailand (n = 744 males, 320 females) are used to develop a heuristically selected OSSA sectioning point of ≤1 separating males and females. This sectioning point is validated using a holdout sample of Japanese, Thai, and Filipino (n = 178 males, 82 females) individuals. The results indicate a general correct classification rate of 82% using all five traits, and 81% when excluding the mental eminence. Designating an OSSA score of 2 as indeterminate is recommended. © 2017 American Academy of Forensic Sciences.
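The OSSA scoring logic described above can be sketched as follows. The per-trait dichotomization thresholds and the trait list beyond the mental eminence are illustrative assumptions (standard cranial sex-estimation traits); the sectioning point of ≤1 and the recommendation to treat a score of 2 as indeterminate come from the abstract.

```python
# Sketch of OSSA for sex estimation: ordinal trait scores are
# dichotomized (0/1) at per-trait thresholds, then summed; a summed
# score <= 1 classifies female, 2 is indeterminate, >= 3 male.
# Thresholds below are illustrative, not the published values.

THRESHOLDS = {"glabella": 3, "supraorbital margin": 3,
              "mastoid process": 3, "nuchal crest": 3,
              "mental eminence": 3}

def ossa_score(trait_scores, thresholds=THRESHOLDS):
    return sum(1 if trait_scores[t] >= thresholds[t] else 0
               for t in thresholds)

def classify(score):
    if score <= 1:
        return "female"
    if score == 2:
        return "indeterminate"
    return "male"
```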
A quality score for coronary artery tree extraction results
NASA Astrophysics Data System (ADS)
Cao, Qing; Broersen, Alexander; Kitslaar, Pieter H.; Lelieveldt, Boudewijn P. F.; Dijkstra, Jouke
2018-02-01
Coronary artery trees (CATs) are often extracted to aid the fully automatic analysis of coronary artery disease on coronary computed tomography angiography (CCTA) images. Automatically extracted CATs often miss some arteries or include wrong extractions, which require manual corrections before performing successive steps. For analyzing a large number of datasets, a manual quality check of the extraction results is time-consuming. This paper presents a method to automatically calculate quality scores for extracted CATs in terms of the clinical significance of the extracted arteries and the completeness of the extracted CAT. Both right dominant (RD) and left dominant (LD) anatomical statistical models are generated and exploited in developing the quality score. To automatically determine which model should be used, a dominance type detection method is also designed. Experiments are performed on the automatically extracted and manually refined CATs from 42 datasets to evaluate the proposed quality score. In 39 (92.9%) cases, the proposed method measures the quality of the manually refined CATs with higher scores than the automatically extracted CATs. On a 100-point scale, the average scores for the automatically extracted and the manually refined CATs are 82.0 (+/-15.8) and 88.9 (+/-5.4), respectively. The proposed quality score will assist the automatic processing of CAT extractions for large cohorts which contain both RD and LD cases. To the best of our knowledge, this is the first time that a general quality score for an extracted CAT has been presented.
Accountancy, teaching methods, sex, and American College Test scores.
Heritage, J; Harper, B S; Harper, J P
1990-10-01
This study examines the significance of sex, methodology, academic preparation, and age as related to the development of judgmental and problem-solving skills. Sex, American College Test (ACT) Mathematics scores, composite ACT scores, grades in course work, grade point average (GPA), and age were used in studying the effects of teaching method on 96 students' ability to analyze data in financial statements. Results reflect positively on accounting students compared with the general college population, and on the women students in particular.
Measuring the Interestingness of Articles in a Limited User Environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pon, Raymond K.
Search engines, such as Google, assign scores to news articles based on their relevancy to a query. However, not all relevant articles for the query may be interesting to a user. For example, if the article is old or yields little new information, the article would be uninteresting. Relevancy scores do not take into account what makes an article interesting, which varies from user to user. Although methods such as collaborative filtering have been shown to be effective in recommendation systems, in a limited user environment there are not enough users to make collaborative filtering effective. A general framework, called iScore, is presented for defining and measuring the 'interestingness' of articles, incorporating user feedback. iScore addresses various aspects of what makes an article interesting, such as topic relevancy, uniqueness, freshness, source reputation, and writing style. It employs various methods to measure these features and uses a classifier operating on these features to recommend articles. The basic iScore configuration is shown to improve recommendation results by as much as 20%. In addition to the basic iScore features, additional features are presented to address the deficiencies of existing feature extractors, such as one that tracks multiple topics, called MTT, and a version of the Rocchio algorithm that learns its parameters online as it processes documents, called eRocchio. The inclusion of MTT and eRocchio in iScore is shown to improve iScore recommendation results by as much as 3.1% and 5.6%, respectively. Additionally, in the TREC11 Adaptive Filter Task, eRocchio is shown to be 10% better than the best filter in the last run of the task. In addition to these two major topic relevancy measures, other features are also introduced that employ language models, phrases, clustering, and changes in topics to improve recommendation results.
These additional features are shown to improve recommendation results by iScore by up to 14%. Due to the varying reasons users hold regarding why an article is interesting, an online feature selection method in naive Bayes is also introduced. Online feature selection can improve recommendation results in iScore by up to 18.9%. In summary, iScore in its best configuration can outperform traditional IR techniques by as much as 50.7%. iScore and its components are evaluated in the news recommendation task using three datasets from Yahoo! News, actual users, and Digg. iScore and its components are also evaluated in the TREC Adaptive Filter task using the Reuters RCV1 corpus.
Zheng, Jie; Erzurumluoglu, A Mesut; Elsworth, Benjamin L; Kemp, John P; Howe, Laurence; Haycock, Philip C; Hemani, Gibran; Tansey, Katherine; Laurin, Charles; Pourcain, Beate St; Warrington, Nicole M; Finucane, Hilary K; Price, Alkes L; Bulik-Sullivan, Brendan K; Anttila, Verneri; Paternoster, Lavinia; Gaunt, Tom R; Evans, David M; Neale, Benjamin M
2017-01-15
LD score regression is a reliable and efficient method of using genome-wide association study (GWAS) summary-level results data to estimate the SNP heritability of complex traits and diseases, partition this heritability into functional categories, and estimate the genetic correlation between different phenotypes. Because the method relies on summary-level results data, LD score regression is computationally tractable even for very large sample sizes. However, publicly available GWAS summary-level data are typically stored in different databases and have different formats, making it difficult to apply LD score regression to estimate genetic correlations across many different traits simultaneously. In this manuscript, we describe LD Hub, a centralized database of summary-level GWAS results for 173 diseases/traits from different publicly available resources/consortia, and a web interface that automates the LD score regression analysis pipeline. To demonstrate functionality and validate our software, we replicated previously reported LD score regression analyses of 49 traits/diseases using LD Hub and estimated SNP heritability and the genetic correlation across the different phenotypes. We also present new results obtained by uploading a recent atopic dermatitis GWAS meta-analysis to examine the genetic correlation between the condition and other potentially related traits. In response to the growing availability of publicly accessible GWAS summary-level results data, our database and the accompanying web interface will ensure maximal uptake of the LD score regression methodology, provide a useful database for the public dissemination of GWAS results, and provide a method for easily screening hundreds of traits for overlapping genetic aetiologies. The web interface and instructions for using LD Hub are available at http://ldsc.broadinstitute.org/. Contact: jie.zheng@bristol.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author 2016. Published by Oxford University Press.
Sanders, Sharon; Flaws, Dylan; Than, Martin; Pickering, John W; Doust, Jenny; Glasziou, Paul
2016-01-01
Scoring systems are developed to assist clinicians in making a diagnosis. However, their uptake is often limited because they are cumbersome to use, requiring information on many predictors or complicated calculations. We examined whether, and how, simplifications affected the performance of a validated score for identifying adults with chest pain in an emergency department who have a low risk of major adverse cardiac events. We simplified the Emergency Department Assessment of Chest pain Score (EDACS) by three methods: (1) giving equal weight to each predictor included in the score, (2) reducing the number of predictors, and (3) using both methods, giving equal weight to a reduced number of predictors. The diagnostic accuracy of the simplified scores was compared with the original score in the derivation (n = 1,974) and validation (n = 909) data sets. There was no difference in the overall accuracy of the simplified versions of the score compared with the original EDACS as measured by the area under the receiver operating characteristic curve (0.74 to 0.75 for the simplified versions vs. 0.75 for the original score in the validation cohort). With score cut-offs set to maintain the sensitivity of the combination of score and tests (electrocardiogram and cardiac troponin) at a level acceptable to clinicians (99%), simplification reduced the proportion of patients classified as low risk from 50% with the original score to between 22% and 42%. Simplification of a clinical score resulted in similar overall accuracy but reduced the proportion classified as low risk and therefore eligible for early discharge compared with the original score. Whether the trade-off is acceptable will depend on the context in which the score is to be used. Developers of clinical scores should consider simplification as a method to increase uptake, but further studies are needed to determine the best methods of deriving and evaluating simplified scores. Copyright © 2016 Elsevier Inc.
All rights reserved.
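The "equal weight" simplification (method 1) can be sketched as replacing each predictor's original integer weight with +1 or -1 according to its sign. The predictor names and weights below are hypothetical illustrations, not the real EDACS items.

```python
# Sketch of the "equal weight" simplification: replace each predictor's
# original weight by +1/-1 according to its sign. Names and weights here
# are invented for illustration, not the actual EDACS score.

def score(weights, predictors):
    """Weighted sum of binary predictor indicators."""
    return sum(w * predictors[name] for name, w in weights.items())

original = {"age_band": 2, "diaphoresis": 3, "pain_radiation": 5, "reproduced_by_palpation": -4}
simplified = {name: (1 if w > 0 else -1) for name, w in original.items()}

patient = {"age_band": 1, "diaphoresis": 0, "pain_radiation": 1, "reproduced_by_palpation": 1}
print(score(original, patient))    # 2 + 5 - 4 = 3
print(score(simplified, patient))  # 1 + 1 - 1 = 1
```

Because the simplified scale is coarser, the cut-off must be re-chosen to preserve sensitivity, which is exactly where the study observed the loss in the proportion classified as low risk.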
Automated scoring system of standard uptake value for torso FDG-PET images
NASA Astrophysics Data System (ADS)
Hara, Takeshi; Kobayashi, Tatsunori; Kawai, Kazunao; Zhou, Xiangrong; Itoh, Satoshi; Katafuchi, Tetsuro; Fujita, Hiroshi
2008-03-01
The purpose of this work was to develop an automated method to calculate the SUV score for the torso region on FDG-PET scans. The three-dimensional distributions of the mean and standard deviation of SUV were stored per volume to score the SUV at the corresponding pixel position within unknown scans. The modeling method is based on the SPM approach, using the correction technique of the Euler characteristic and resel (resolution element). We employed 197 normal cases (male: 143, female: 54) to assemble the normal FDG metabolism distribution. The physiques were registered to each other in a rectangular parallelepiped shape using affine transformation and the thin-plate-spline technique. The regions of the three organs were determined by a semi-automated procedure. Seventy-three abnormal spots were used to estimate the effectiveness of the scoring method. As a result, the score images correctly showed that the scores for normal cases lay between zero and plus/minus 2 SD, and most of the scores of abnormal spots associated with cancer were larger than the upper bound of the SUV interval of normal organs.
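The per-voxel scoring described above reduces to a z-map against the stored normal model: score = (SUV - mean) / SD at each registered voxel, so normal tissue falls roughly within plus/minus 2 SD. A minimal sketch, with illustrative shapes and values:

```python
import numpy as np

# Per-voxel scoring sketch: the normal model stores the mean and SD of SUV at
# each registered voxel; an unknown scan is scored as a z-map. The arrays here
# are illustrative, not a real FDG-PET volume.

def suv_z_map(suv, model_mean, model_sd):
    """Return (SUV - mean) / SD per voxel, guarding against zero SD."""
    sd = np.where(model_sd > 0, model_sd, np.inf)  # zero-SD voxels score 0
    return (suv - model_mean) / sd

mean = np.full((2, 2, 2), 2.0)
sd = np.full((2, 2, 2), 0.5)
scan = np.full((2, 2, 2), 2.0)
scan[0, 0, 0] = 6.0  # hot spot: z = (6 - 2) / 0.5 = 8

z = suv_z_map(scan, mean, sd)
print(z[0, 0, 0], z[1, 1, 1])  # 8.0 0.0
```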
Quantification of myocardial fibrosis by digital image analysis and interactive stereology
2014-01-01
Background Cardiac fibrosis disrupts the normal myocardial structure and has a direct impact on heart function and survival. Despite already available digital methods, the pathologist’s visual score is still widely considered as ground truth and used as a primary method in histomorphometric evaluations. The aim of this study was to compare the accuracy of digital image analysis tools and the pathologist’s visual scoring for evaluating fibrosis in human myocardial biopsies, based on reference data obtained by point counting performed on the same images. Methods Endomyocardial biopsy material from 38 patients diagnosed with inflammatory dilated cardiomyopathy was used. The extent of total cardiac fibrosis was assessed by image analysis on Masson’s trichrome-stained tissue specimens using the automated Colocalization and Genie software, by stereology grid count, and manually by the pathologist’s visual score. Results A total of 116 slides were analyzed. The mean results obtained by the Colocalization software (13.72 ± 12.24%) were closest to the reference value of stereology (RVS), while the Genie software and the pathologist’s score gave a slight underestimation. RVS values correlated strongly with values obtained using the Colocalization and Genie software (r > 0.9, p < 0.001), as well as with the pathologist’s visual score. Differences in fibrosis quantification by Colocalization and RVS were statistically insignificant. However, significant bias was found in the results obtained by Genie versus RVS and by the pathologist’s score versus RVS, with mean differences of -1.61% and 2.24%, respectively. Bland-Altman plots showed a bidirectional bias dependent on the magnitude of the measurement: the Colocalization software overestimated the area fraction of fibrosis at the lower end, and underestimated it at the higher end of the RVS values. Meanwhile, the Genie software as well as the pathologist’s score showed more uniform results throughout the values, with a slight underestimation in the mid-range for both.
Conclusion Both applied digital image analysis methods revealed almost perfect correlation with the criterion standard obtained by stereology grid count and, in terms of accuracy, outperformed the pathologist’s visual score. The Genie algorithm proved to be the method of choice, with the only drawback being a slight underestimation bias, which is considered acceptable for both clinical and research evaluations. Virtual slides The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/9857909611227193 PMID:24912374
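The stereology grid count used as the reference value works by classifying each grid point overlaid on the slide; the area fraction of fibrosis is the share of tissue-hitting points that land on fibrosis. A minimal sketch with illustrative labels:

```python
# Point-counting sketch: each grid point is classified; the area fraction of
# fibrosis is the number of points hitting fibrosis over all points hitting
# tissue. The tissue classes and counts below are illustrative.

def area_fraction(hits, target="fibrosis", tissue=("fibrosis", "myocardium")):
    """Percent of tissue-hitting grid points that land on the target class."""
    on_tissue = [h for h in hits if h in tissue]
    return 100.0 * sum(h == target for h in on_tissue) / len(on_tissue)

grid = ["myocardium"] * 43 + ["fibrosis"] * 7 + ["background"] * 14
print(area_fraction(grid))  # 7 of 50 tissue points -> 14.0 (%)
```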
Stiegler, Marjorie; Hobbs, Gene; Martinelli, Susan M; Zvara, David; Arora, Harendra; Chen, Fei
2018-01-01
Background Simulation is an effective method for creating objective summative assessments of resident trainees. Real-time assessment (RTA) in simulated patient care environments is logistically challenging, especially when evaluating a large group of residents in multiple simulation scenarios. To date, there is very little data comparing RTA with delayed (hours, days, or weeks later) video-based assessment (DA) for simulation-based assessments of Accreditation Council for Graduate Medical Education (ACGME) sub-competency milestones. We hypothesized that sub-competency milestone evaluation scores obtained from DA, via audio-video recordings, are equivalent to the scores obtained from RTA. Methods Forty-one anesthesiology residents were evaluated in three separate simulated scenarios, representing different ACGME sub-competency milestones. All scenarios had one faculty member perform RTA and two additional faculty members perform DA. Subsequently, the scores generated by RTA were compared with the average scores generated by DA. Variance component analysis was conducted to assess the amount of variation in scores attributable to residents and raters. Results Paired t-tests showed no significant difference in scores between RTA and averaged DA for all cases. Cases 1, 2, and 3 showed an intraclass correlation coefficient (ICC) of 0.67, 0.85, and 0.50 for agreement between RTA scores and averaged DA scores, respectively. Analysis of variance of the scores assigned by the three raters showed a small proportion of variance attributable to raters (4% to 15%). Conclusions The results demonstrate that video-based delayed assessment is as reliable as real-time assessment, as both assessment methods yielded comparable scores. Based on a department’s needs or logistical constraints, our findings support the use of either real-time or delayed video evaluation for assessing milestones in a simulated patient care environment. PMID:29736352
Jakovljev, Aleksandra; Bergh, Kåre
2015-11-06
Bloodstream infections represent serious conditions carrying a high mortality and morbidity rate. Rapid identification of microorganisms and prompt institution of adequate antimicrobial therapy is of utmost importance for a successful outcome. Aiming at the development of a rapid, simplified and efficient protocol, we developed and compared two in-house preparatory methods for the direct identification of bacteria from positive blood culture flasks (BD BACTEC FX system) by using matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI TOF MS). Both methods employed saponin and distilled water for erythrocyte lysis. In method A the cellular pellet was overlaid with formic acid on the MALDI TOF target plate for protein extraction, whereas in method B the pellet was exposed to formic acid followed by acetonitrile prior to placing on the target plate. Best results were obtained by method A. Direct identification was achieved for 81.9 % and 65.8 % (50.3 % and 26.2 % with scores >2.0) of organisms by method A and method B, respectively. Overall concordance with final identification was 100 % to genus and 97.9 % to species level. By applying a lower cut-off score value, the levels of identification obtained by method A and method B increased to 89.3 % and 77.8 % of organisms (81.9 % and 65.8 % identified with scores >1.7), respectively. Using the lowered score criteria, concordance with final results was obtained for 99.3 % of genus and 96.6 % of species identifications. The reliability of results, rapid performance (approximately 25 min) and applicability of in-house method A have contributed to implementation of this robust and cost-effective method in our laboratory.
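The score interpretation described above follows a simple threshold rule: scores above 2.0 count as confident identifications, and the study's lowered criterion accepts scores above 1.7. A sketch of that rule (the thresholds are taken from the abstract; the label wording is ours):

```python
# Score-threshold interpretation for MALDI-TOF MS identification, using the
# cut-offs mentioned in the abstract (>2.0 standard, >1.7 lowered criterion).

def interpret(score, species_cutoff=2.0, genus_cutoff=1.7):
    if score > species_cutoff:
        return "species-level identification"
    if score > genus_cutoff:
        return "genus-level (lowered cut-off) identification"
    return "no reliable identification"

print(interpret(2.31))  # species-level identification
print(interpret(1.85))  # genus-level (lowered cut-off) identification
print(interpret(1.42))  # no reliable identification
```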
Quantifying Leisure Physical Activity and Its Relation to Bone Density and Strength
SHEDD, KRISTINE M.; HANSON, KATHY B.; ALEKEL, D. LEE; SCHIFERL, DANIEL J.; HANSON, LAURA N.; VAN LOAN, MARTA D.
2010-01-01
Purpose Compare three published methods of quantifying physical activity (total activity, peak strain, and bone-loading exposure (BLE) scores) and identify their associations with areal bone mineral density (aBMD), volumetric BMD (vBMD), and bone strength. Methods Postmenopausal women (N = 239; mean age: 53.8 yr) from Iowa (ISU) and California (UCD) completed the Paffenbarger Physical Activity Questionnaire, which was scored with each method. Dual energy x-ray absorptiometry assessed aBMD at the spine, hip, and femoral neck, and peripheral quantitative computed tomography (pQCT) measured vBMD and bone strength properties at the distal tibia and midshaft femur. Results UCD women had higher total activity scores and hours per week of leisure activity. All scoring methods were correlated with each other. No method was associated with aBMD. Peak strain score was negatively associated with polar moment of inertia and strength–strain index at the tibia, and total activity score was positively associated with cortical area and thickness at the femur. Separating by geographic site, the peak strain and hip BLE scores were negatively associated with pQCT measures at the tibia and femur among ISU subjects. Among UCD women, no method was significantly associated with any tibia measure, but total activity score was positively associated with measures at the femur (P < 0.05 for all associations). Conclusion Given the significantly greater hours per week of leisure activity done by UCD subjects, duration may be an important determinant of the effect physical activity has on bone. The positive association between leisure physical activity (assessed by the total activity score) and cortical bone measures in postmenopausal women may indicate a lifestyle factor that can help offset age-related bone loss. PMID:18046190
Dong, Chengliang; Wei, Peng; Jian, Xueqiu; Gibbs, Richard; Boerwinkle, Eric; Wang, Kai; Liu, Xiaoming
2015-01-01
Accurate deleteriousness prediction for nonsynonymous variants is crucial for distinguishing pathogenic mutations from background polymorphisms in whole exome sequencing (WES) studies. Although many deleteriousness prediction methods have been developed, their prediction results are sometimes inconsistent with each other and their relative merits are still unclear in practical applications. To address these issues, we comprehensively evaluated the predictive performance of 18 current deleteriousness-scoring methods, including 11 function prediction scores (PolyPhen-2, SIFT, MutationTaster, Mutation Assessor, FATHMM, LRT, PANTHER, PhD-SNP, SNAP, SNPs&GO and MutPred), 3 conservation scores (GERP++, SiPhy and PhyloP) and 4 ensemble scores (CADD, PON-P, KGGSeq and CONDEL). We found that FATHMM and KGGSeq had the highest discriminative power among independent scores and ensemble scores, respectively. Moreover, to ensure unbiased performance evaluation of these prediction scores, we manually collected three distinct testing datasets, on which no current prediction scores were tuned. In addition, we developed two new ensemble scores that integrate nine independent scores and allele frequency. Our scores achieved the highest discriminative power compared with all the deleteriousness prediction scores tested and showed low false-positive prediction rate for benign yet rare nonsynonymous variants, which demonstrated the value of combining information from multiple orthologous approaches. Finally, to facilitate variant prioritization in WES studies, we have pre-computed our ensemble scores for 87 347 044 possible variants in the whole-exome and made them publicly available through the ANNOVAR software and the dbNSFP database. PMID:25552646
Do Examinees Understand Score Reports for Alternate Methods of Scoring Computer Based Tests?
ERIC Educational Resources Information Center
Whittaker, Tiffany A.; Williams, Natasha J.; Dodd, Barbara G.
2011-01-01
This study assessed the interpretability of scaled scores based on either number correct (NC) scoring for a paper-and-pencil test or one of two methods of scoring computer-based tests: an item pattern (IP) scoring method and a method based on equated NC scoring. The equated NC scoring method for computer-based tests was proposed as an alternative…
Ramanujam, Nedunchelian; Kaliappan, Manivannan
2016-01-01
Nowadays, automatic multidocument text summarization systems can successfully retrieve the summary sentences from the input documents, but they have many limitations, such as inaccurate extraction of essential sentences, low coverage, poor coherence among the sentences, and redundancy. This paper introduces a new timestamp approach combined with a Naïve Bayesian classification approach for multidocument text summarization. The timestamp gives the summary an ordered look, yielding a coherent-looking summary, and extracts the more relevant information from the multiple documents. A scoring strategy is also used to calculate scores for the words based on word frequency. The higher linguistic quality is estimated in terms of readability and comprehensibility. To show the efficiency of the proposed method, this paper presents a comparison between the proposed method and the existing MEAD algorithm. The timestamp procedure is also applied to the MEAD algorithm and the results are compared with the proposed method. The results show that the proposed method requires less time than the existing MEAD algorithm to execute the summarization process. Moreover, the proposed method achieves better precision, recall, and F-score than the existing clustering with lexical chaining approach. PMID:27034971
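The two core ideas (frequency-based sentence scoring, then timestamp ordering of the selected sentences) can be sketched as follows. This is a generic reading of the abstract, not the authors' exact algorithm; the sentences are invented.

```python
from collections import Counter

# Sketch: score sentences by summed word frequency across the documents,
# pick the top-k, then order the selection by timestamp so the summary
# reads coherently. A generic illustration, not the paper's exact method.

def summarize(sentences, k=2):
    """sentences: list of (timestamp, text). Returns top-k texts in time order."""
    freq = Counter(w for _, text in sentences for w in text.lower().split())
    scored = sorted(sentences, key=lambda s: -sum(freq[w] for w in s[1].lower().split()))
    return [text for _, text in sorted(scored[:k])]

docs = [
    (2, "the model improves summary coverage"),
    (1, "the model reduces redundancy in the summary"),
    (3, "unrelated aside"),
]
print(summarize(docs))
```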
Cecilio-Fernandes, Dario; Medema, Harro; Collares, Carlos Fernando; Schuwirth, Lambert; Cohen-Schotanus, Janke; Tio, René A
2017-11-09
Progress testing is an assessment tool used to periodically assess all students at the end-of-curriculum level. Because students cannot know everything, it is important that they recognize their lack of knowledge. For that reason, the formula-scoring method has usually been used. However, where partial knowledge needs to be taken into account, the number-right scoring method is used. Research comparing both methods has yielded conflicting results. As far as we know, in all these studies, Classical Test Theory or Generalizability Theory was used to analyze the data. In contrast to these studies, we will explore the use of the Rasch model to compare both methods. A 2 × 2 crossover design was used in a study where 298 students from four medical schools participated. A sample of 200 previously used questions from the progress tests was selected. The data were analyzed using the Rasch model, which provides fit parameters, reliability coefficients, and response option analysis. The fit parameters were in the optimal interval ranging from 0.50 to 1.50, and the means were around 1.00. The person and item reliability coefficients were higher in the number-right condition than in the formula-scoring condition. The response option analysis showed that the majority of dysfunctional items emerged in the formula-scoring condition. The findings of this study support the use of number-right scoring over formula scoring. Rasch model analyses showed that tests with number-right scoring have better psychometric properties than formula scoring. However, choosing the appropriate scoring method should depend not only on psychometric properties but also on self-directed test-taking strategies and metacognitive skills.
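The two scoring rules compared in the study have simple textbook forms (the exact variant used by the progress tests may differ): number-right scoring counts correct answers only, while formula scoring subtracts W/(k-1) for W wrong answers on k-option items, so blind guessing has an expected score of zero and omitting is never penalized.

```python
# Textbook forms of the two scoring rules; the progress tests' exact variant
# may differ. Formula scoring makes the expected value of guessing zero.

def number_right(correct, wrong, omitted, options=4):
    return correct

def formula_score(correct, wrong, omitted, options=4):
    return correct - wrong / (options - 1)

# 60 correct, 24 wrong, 16 omitted on 4-option items:
print(number_right(60, 24, 16))   # 60
print(formula_score(60, 24, 16))  # 60 - 24/3 = 52.0
```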
Improving IQ measurement in intellectual disabilities using true deviation from population norms
2014-01-01
Background Intellectual disability (ID) is characterized by global cognitive deficits, yet the very IQ tests used to assess ID have limited range and precision in this population, especially for more impaired individuals. Methods We describe the development and validation of a method of raw z-score transformation (based on general population norms) that ameliorates floor effects and improves the precision of IQ measurement in ID using the Stanford Binet 5 (SB5) in fragile X syndrome (FXS; n = 106), the leading inherited cause of ID, and in individuals with idiopathic autism spectrum disorder (ASD; n = 205). We compared the distributional characteristics and Q-Q plots from the standardized scores with the deviation z-scores. Additionally, we examined the relationship between both scoring methods and multiple criterion measures. Results We found evidence that substantial and meaningful variation in cognitive ability on standardized IQ tests among individuals with ID is lost when converting raw scores to standardized scaled, index and IQ scores. Use of the deviation z-score method rectifies this problem, and accounts for significant additional variance in criterion validation measures, above and beyond the usual IQ scores. Additionally, individual and group-level cognitive strengths and weaknesses are recovered using deviation scores. Conclusion Traditional methods for generating IQ scores in lower functioning individuals with ID are inaccurate and inadequate, leading to erroneously flat profiles. However, assessment of cognitive abilities is substantially improved by measuring true deviation in performance from standardization sample norms. This work has important implications for standardized test development, clinical assessment, and research for which IQ is an important measure of interest in individuals with neurodevelopmental disorders and other forms of cognitive impairment. PMID:26491488
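The deviation z-score idea reduces to expressing a raw subtest score directly against the general-population raw-score norm for that age, instead of passing it through floor-limited scaled-score tables. A minimal sketch with invented norm values:

```python
# Deviation z-score sketch: z = (raw - population mean) / population SD.
# The norm mean and SD below are invented for illustration.

def deviation_z(raw, norm_mean, norm_sd):
    return (raw - norm_mean) / norm_sd

# Two individuals who would both hit the test floor (same lowest scaled
# score) can still be distinguished on the z-score metric:
print(deviation_z(12, norm_mean=40.0, norm_sd=8.0))  # -3.5
print(deviation_z(4, norm_mean=40.0, norm_sd=8.0))   # -4.5
```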
Burden, Anne; Roche, Nicolas; Miglio, Cristiana; Hillyer, Elizabeth V; Postma, Dirkje S; Herings, Ron Mc; Overbeek, Jetty A; Khalid, Javaria Mona; van Eickels, Daniela; Price, David B
2017-01-01
Cohort matching and regression modeling are used in observational studies to control for confounding factors when estimating treatment effects. Our objective was to evaluate exact matching and propensity score methods by applying them in a 1-year pre-post historical database study to investigate asthma-related outcomes by treatment. We drew on longitudinal medical record data in the PHARMO database for asthma patients prescribed the treatments to be compared (ciclesonide and fine-particle inhaled corticosteroid [ICS]). Propensity score methods that we evaluated were propensity score matching (PSM) using two different algorithms, the inverse probability of treatment weighting (IPTW), covariate adjustment using the propensity score, and propensity score stratification. We defined balance, using standardized differences, as differences of <10% between cohorts. Of 4064 eligible patients, 1382 (34%) were prescribed ciclesonide and 2682 (66%) fine-particle ICS. The IPTW and propensity score-based methods retained more patients (96%-100%) than exact matching (90%); exact matching selected less severe patients. Standardized differences were >10% for four variables in the exact-matched dataset and <10% for both PSM algorithms and the weighted pseudo-dataset used in the IPTW method. With all methods, ciclesonide was associated with better 1-year asthma-related outcomes, at one-third the prescribed dose, than fine-particle ICS; results varied slightly by method, but direction and statistical significance remained the same. We found that each method has its particular strengths, and we recommend at least two methods be applied for each matched cohort study to evaluate the robustness of the findings. Balance diagnostics should be applied with all methods to check the balance of confounders between treatment cohorts. 
If exact matching is used, the calculation of a propensity score could be useful to identify variables that require balancing, thereby informing the choice of matching criteria together with clinical considerations.
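Of the propensity score methods evaluated above, IPTW has the simplest closed form: each treated unit is weighted by 1/e and each control by 1/(1-e), where e is the propensity score; in a complex survey setting the IPTW weight is typically multiplied by the survey weight. A minimal sketch:

```python
# Minimal IPTW sketch: weight treated units by 1/e and controls by 1/(1-e).
# Combining with a survey design multiplies in the survey weight.

def iptw_weight(treated, propensity):
    return 1.0 / propensity if treated else 1.0 / (1.0 - propensity)

def combined_weight(treated, propensity, survey_weight):
    return survey_weight * iptw_weight(treated, propensity)

print(iptw_weight(True, 0.25))             # 4.0
print(iptw_weight(False, 0.25))            # 1.333...
print(combined_weight(True, 0.25, 150.0))  # 600.0
```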
Deep learning of mutation-gene-drug relations from the literature.
Lee, Kyubum; Kim, Byounggun; Choi, Yonghwa; Kim, Sunkyu; Shin, Wonho; Lee, Sunwon; Park, Sungjoon; Kim, Seongsoon; Tan, Aik Choon; Kang, Jaewoo
2018-01-25
Molecular biomarkers that can predict drug efficacy in cancer patients are crucial components for the advancement of precision medicine. However, identifying these molecular biomarkers remains a laborious and challenging task. Next-generation sequencing of patients and preclinical models have increasingly led to the identification of novel gene-mutation-drug relations, and these results have been reported and published in the scientific literature. Here, we present two new computational methods that utilize all the PubMed articles as domain specific background knowledge to assist in the extraction and curation of gene-mutation-drug relations from the literature. The first method uses the Biomedical Entity Search Tool (BEST) scoring results as some of the features to train the machine learning classifiers. The second method uses not only the BEST scoring results, but also word vectors in a deep convolutional neural network model that are constructed from and trained on numerous documents such as PubMed abstracts and Google News articles. Using the features obtained from both the BEST search engine scores and word vectors, we extract mutation-gene and mutation-drug relations from the literature using machine learning classifiers such as random forest and deep convolutional neural networks. Our methods achieved better results compared with the state-of-the-art methods. We used our proposed features in a simple machine learning model, and obtained F1-scores of 0.96 and 0.82 for mutation-gene and mutation-drug relation classification, respectively. We also developed a deep learning classification model using convolutional neural networks, BEST scores, and the word embeddings that are pre-trained on PubMed or Google News data. Using deep learning, the classification accuracy improved, and F1-scores of 0.96 and 0.86 were obtained for the mutation-gene and mutation-drug relations, respectively. 
We believe that our computational methods described in this research could be used as an important tool in identifying molecular biomarkers that predict drug responses in cancer patients. We also built a database of these mutation-gene-drug relations that were extracted from all the PubMed abstracts. We believe that our database can prove to be a valuable resource for precision medicine researchers.
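The F1-scores reported above are the harmonic mean of precision and recall. A minimal computation from raw counts (the counts below are illustrative, not the study's confusion matrix):

```python
# F1 = harmonic mean of precision and recall, from raw classification counts.

def f1_score(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# e.g. 96 true positives, 4 false positives, 4 false negatives:
print(round(f1_score(96, 4, 4), 2))  # 0.96
```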
FLiGS Score: A New Method of Outcome Assessment for Lip Carcinoma–Treated Patients
Grassi, Rita; Toia, Francesca; Di Rosa, Luigi; Cordova, Adriana
2015-01-01
Background: Lip cancer and its treatment have considerable functional and cosmetic effects with resultant nutritional and physical detriments. As we continue to investigate new treatment regimens, we are simultaneously required to assess postoperative outcomes to design interventions that lessen the adverse impact of this disease process. We wish to introduce the Functional Lip Glasgow Scale (FLiGS) score as a new method of outcome assessment to measure the effect of lip cancer and its treatment on patients’ daily functioning. Methods: Fifty patients affected by lip squamous cell carcinoma were recruited between 2009 and 2013. Patients were asked to fill in the FLiGS questionnaire before surgery and at 1 month, 6 months, and 1 year after surgery. The subscores were used to calculate a total FLiGS score of global oral disability. Statistical analysis was performed to test validity and reliability. Results: FLiGS scores improved significantly from preoperative to 12-month postoperative values (P < 0.001). Statistical evidence of validity was provided through the Spearman correlation coefficient (rs), which was >0.30 for all surveys (P < 0.001). FLiGS score reliability was shown through examination of internal consistency and test-retest reliability. Conclusions: The FLiGS score is a simple way of assessing functional impairment related to lip cancer before and after surgery. It is sensitive, valid, reliable, and clinically relevant: it provides useful information to orient the physician in postoperative management and in the rehabilitation program. PMID:26034652
Effect of Item Arrangement, Knowledge of Arrangement, and Test Anxiety on Two Scoring Methods.
ERIC Educational Resources Information Center
Plake, Barbara S.; And Others
1981-01-01
Number right and elimination scores were analyzed on a college-level mathematics exam assembled from pretest data. Anxiety measures were administered along with the experimental forms to undergraduates. Results suggest that neither test scores nor attitudes are influenced by item order, knowledge thereof, or anxiety level. (Author/GK)
MRL and SuperFine+MRL: new supertree methods
2012-01-01
Background Supertree methods combine trees on subsets of the full taxon set together to produce a tree on the entire set of taxa. Of the many supertree methods, the most popular is MRP (Matrix Representation with Parsimony), a method that operates by first encoding the input set of source trees by a large matrix (the "MRP matrix") over {0,1, ?}, and then running maximum parsimony heuristics on the MRP matrix. Experimental studies evaluating MRP in comparison to other supertree methods have established that for large datasets, MRP generally produces trees of equal or greater accuracy than other methods, and can run on larger datasets. A recent development in supertree methods is SuperFine+MRP, a method that combines MRP with a divide-and-conquer approach, and produces more accurate trees in less time than MRP. In this paper we consider a new approach for supertree estimation, called MRL (Matrix Representation with Likelihood). MRL begins with the same MRP matrix, but then analyzes the MRP matrix using heuristics (such as RAxML) for 2-state Maximum Likelihood. Results We compared MRP and SuperFine+MRP with MRL and SuperFine+MRL on simulated and biological datasets. We examined the MRP and MRL scores of each method on a wide range of datasets, as well as the resulting topological accuracy of the trees. Our experimental results show that MRL, coupled with a very good ML heuristic such as RAxML, produced more accurate trees than MRP, and MRL scores were more strongly correlated with topological accuracy than MRP scores. Conclusions SuperFine+MRP, when based upon a good MP heuristic, such as TNT, produces among the best scores for both MRP and MRL, and is generally faster and more topologically accurate than other supertree methods we tested. PMID:22280525
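The MRP matrix shared by MRP and MRL encodes every non-trivial bipartition (internal edge) of every source tree as one column: 1 for taxa inside the clade, 0 for the remaining taxa of that tree, and ? for taxa the tree does not contain. A minimal sketch of the encoding (the tree representation is simplified to explicit clade sets):

```python
# Sketch of the MRP encoding over {0, 1, ?}. Each source tree is given as
# (its taxon set, a list of its non-trivial clades as frozensets); real
# implementations would extract clades from a tree structure.

def mrp_matrix(taxa, source_trees):
    matrix = {t: [] for t in taxa}
    for tree_taxa, clades in source_trees:
        for clade in clades:
            for t in taxa:
                if t not in tree_taxa:
                    matrix[t].append("?")
                else:
                    matrix[t].append("1" if t in clade else "0")
    return {t: "".join(row) for t, row in matrix.items()}

taxa = ["A", "B", "C", "D"]
trees = [({"A", "B", "C"}, [frozenset({"A", "B"})]),
         ({"B", "C", "D"}, [frozenset({"C", "D"})])]
print(mrp_matrix(taxa, trees))  # {'A': '1?', 'B': '10', 'C': '01', 'D': '?1'}
```

MRP then runs maximum parsimony heuristics on this matrix, whereas MRL analyzes the identical matrix under a 2-state maximum likelihood model.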
Piram, Maryam; Frenkel, Joost; Gattorno, Marco; Ozen, Seza; Lachmann, Helen J; Goldbach-Mansky, Raphaela; Hentgen, Véronique; Neven, Bénédicte; Stankovic Stojanovic, Katia; Simon, Anna; Kuemmerle-Deschner, Jasmin; Hoffman, Hal; Stojanov, Silvia; Duquesne, Agnès; Pillet, Pascal; Martini, Alberto; Pouchot, Jacques; Koné-Paut, Isabelle
2012-01-01
Background The systemic autoinflammatory disorders (SAID) share many clinical manifestations, albeit with variable patterns, intensity and frequency. A common definition of disease activity would be rational and useful in the management of these lifelong diseases. Moreover, standardised disease activity scores are required for the assessment of new therapies in constant development. The aim of this study was to develop preliminary activity scores for familial Mediterranean fever, mevalonate kinase deficiency, tumour necrosis factor receptor-1-associated periodic syndrome and cryopyrin-associated periodic syndromes (CAPS). Methods The study was conducted using two well-recognised consensus formation methods: the Delphi technique and the nominal group technique. The results from a two-step survey and data from parent/patient interviews were used as preliminary data to develop the agenda for a consensus conference to build a provisional scoring system. Results 24 of 65 experts in SAID from 20 countries answered the web questionnaire and 16 attended the consensus conference. There was consensus agreement to develop separate activity scores for each disease but with a common format based on patient diaries. Fever and disease-specific clinical variables were scored according to their severity. A final score was generated by summing the score of all the variables divided by the number of days over which the diary was completed. Scores varied from 0 to 16 (0–13 in CAPS). These scores were developed for the purpose of clinical studies but could be used in clinical practice. Conclusion Using widely recognised consensus formation techniques, preliminary scores were obtained to measure disease activity in four main SAID. Further prospective validation study of this instrument will follow. PMID:21081528
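The diary-based score described above sums the severity scores of all variables over all diary entries and divides by the number of days covered. A minimal sketch (the symptom names and severities are illustrative, not the consensus variable set):

```python
# Sketch of the diary-based activity score: sum the per-day severity scores
# of fever and disease-specific variables, then divide by the number of
# diary days. Variables below are illustrative only.

def activity_score(diary):
    """diary: list of per-day dicts mapping variable -> severity score."""
    total = sum(sum(day.values()) for day in diary)
    return total / len(diary)

diary = [
    {"fever": 2, "rash": 1, "arthralgia": 0},
    {"fever": 1, "rash": 0, "arthralgia": 1},
    {"fever": 0, "rash": 0, "arthralgia": 0},
]
print(activity_score(diary))  # (3 + 2 + 0) / 3 days
```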
A Method for the Alignment of Heterogeneous Macromolecules from Electron Microscopy
Shatsky, Maxim; Hall, Richard J.; Brenner, Steven E.; Glaeser, Robert M.
2009-01-01
We propose a feature-based image alignment method for single-particle electron microscopy that is able to accommodate various similarity scoring functions while efficiently sampling the two-dimensional transformational space. We use this image alignment method to evaluate the performance of a scoring function that is based on the Mutual Information (MI) of two images rather than one that is based on the cross-correlation function. We show that alignment using MI for the scoring function has far less model-dependent bias than is found with cross-correlation based alignment. We also demonstrate that MI improves the alignment of some types of heterogeneous data, provided that the signal to noise ratio is relatively high. These results indicate, therefore, that use of MI as the scoring function is well suited for the alignment of class-averages computed from single particle images. Our method is tested on data from three model structures and one real dataset. PMID:19166941
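A mutual-information score between two aligned images is conventionally computed from their joint intensity histogram, MI = sum over bins of p(x,y) log(p(x,y) / (p(x) p(y))). A minimal sketch (bin count and test images are illustrative):

```python
import numpy as np

# MI scoring sketch from the joint intensity histogram of two aligned images.
# Bin count and the random test images are illustrative choices.

def mutual_information(a, b, bins=8):
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0  # skip empty bins (0 * log 0 -> 0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
img = rng.random((32, 32))
noise = rng.random((32, 32))
# An image shares more information with itself than with unrelated noise:
print(mutual_information(img, img) > mutual_information(img, noise))  # True
```

Unlike cross-correlation, this score depends only on the statistical dependence of the two intensity distributions, which is the property the paper exploits to reduce model-dependent bias.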
Wolfe, Frederick; Lane, Nancy E; Buckland-Wright, Chris
2002-12-01
Current radiographic evaluation of knee osteoarthritis (OA) depends primarily on the presence and severity of joint space narrowing (JSN) and osteophytes. Radiographic JSN is a function of the actual JSN caused by articular cartilage loss and the observable JSN artifactually caused when the tibial and femoral surfaces diverge due to variations in the patient's knee position. Views yielding the greatest JSN are the most accurate. Osteophytes are also dependent on positioning. This study investigated the consequences of positioning on JSN and osteophytes in clinical studies in which the outcome of knee OA is scored. In total, 1105 patients underwent 1175 paired radiographic examinations using weight-bearing (WB) standard anterior-posterior (AP) extended knee views (AP-WB), semiflexed WB posterior-anterior views with the knee in contact with the film and the 1st metatarsophalangeal (MTP) joint under the film plane (MTP) (method of Buckland-Wright), and WB PA views with the tip of the great toe at the film plane, 20 degrees of knee flexion and 5 degrees downward angulation of the x-ray tube (schuss-tunnel view). Careful attention was given to proper positioning. JSN and osteophytes were scored on a 0-3 scale. JSN was significantly greater by the MTP and schuss-tunnel methods than by the AP-WB method, but no difference was found between the MTP and schuss-tunnel methods. In addition, disagreement was identified in 34% of MTP and AP-WB scores. In 69.3% of disagreements the scores were more abnormal in the MTP view. When the disagreements were studied, the mean MTP score was 1.68 compared to 1.12 for the AP-WB score. Fifty-seven knees were scored as 3 by the MTP view and as 2 by the AP-WB view, and 8 knees were scored as 3 by the AP-WB view and 2 by the MTP view. Little difference in osteophytes was noted among the 3 methods, although fewer osteophytes were identified by the schuss-tunnel method than the AP-WB method.
Using the clinical reading methods of this study, the MTP and schuss-tunnel views were equivalent when compared to each other. When compared with the AP-WB view, the schuss-tunnel view resulted in a lower osteophyte score. These results, based on clinical readings, are similar to previous computerized analyses that indicated that the MTP and schuss-tunnel views were superior to the AP-WB, but that the MTP view was superior to the schuss-tunnel view.
Sengupta Chattopadhyay, Amrita; Hsiao, Ching-Lin; Chang, Chien Ching; Lian, Ie-Bin; Fann, Cathy S J
2014-01-01
Identifying susceptibility genes that influence complex diseases is extremely difficult because loci often influence the disease state through genetic interactions. Numerous approaches to detect disease-associated SNP-SNP interactions have been developed, but none consistently generates high-quality results under different disease scenarios. Using summarizing techniques to combine a number of existing methods may provide a solution to this problem. Here we used three popular non-parametric methods (Gini, absolute probability difference (APD), and entropy) to develop two novel summary scores, namely the principal component score (PCS) and Z-sum score (ZSS), with which to predict disease-associated genetic interactions. We used a simulation study to compare performance of the non-parametric scores, the summary scores, the scaled-sum score (SSS; used in polymorphism interaction analysis (PIA)), and multifactor dimensionality reduction (MDR). The non-parametric methods achieved high power, but no non-parametric method outperformed all others under a variety of epistatic scenarios. PCS and ZSS, however, outperformed MDR. PCS, ZSS and SSS displayed controlled type I errors (<0.05), in contrast to the individual Gini (GS), APD (APDS) and entropy (ES) scores (>0.05). A real data study using the Genetic Analysis Workshop 16 (GAW 16) rheumatoid arthritis dataset identified a number of interesting SNP-SNP interactions. © 2013 Elsevier B.V. All rights reserved.
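The two summary scores described above can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: it assumes the three non-parametric scores (Gini, APD, entropy) have already been computed for each SNP pair, standardizes them, and forms ZSS as the row sum of z-scores and PCS as the projection onto the leading principal component.

```python
import numpy as np

def z_sum_score(score_matrix):
    """Z-sum score: standardize each method's scores, then sum across methods."""
    z = (score_matrix - score_matrix.mean(axis=0)) / score_matrix.std(axis=0)
    return z.sum(axis=1)

def pc_score(score_matrix):
    """Principal component score: project standardized scores onto the
    leading eigenvector of their covariance (correlation) matrix."""
    z = (score_matrix - score_matrix.mean(axis=0)) / score_matrix.std(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(z, rowvar=False))
    leading = eigvecs[:, np.argmax(eigvals)]
    if leading.sum() < 0:  # fix sign so higher raw scores give higher PCS
        leading = -leading
    return z @ leading

# Toy example: 5 SNP pairs scored by three methods (Gini, APD, entropy).
scores = np.array([[0.10, 0.20, 0.15],
                   [0.80, 0.75, 0.70],
                   [0.20, 0.25, 0.30],
                   [0.90, 0.85, 0.95],
                   [0.15, 0.10, 0.20]])
zss = z_sum_score(scores)
pcs = pc_score(scores)
```

Because the three component scores are positively correlated, both summaries rank the pair that all three methods flag most strongly (the fourth row) at the top.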
Verification of learner’s differences by team-based learning in biochemistry classes
2017-01-01
Purpose We tested the effect of team-based learning (TBL) on medical education through second-year premedical students' TBL scores in biochemistry classes over 5 years. Methods We analyzed the results based on test scores before and after the students' debate. The students were divided into groups for statistical analysis as follows: group 1 comprised the top-ranked students, group 3 comprised the low-ranked students, and group 2 comprised the medium-ranked students. Group T comprised all 382 students (the total number of students in groups 1, 2, and 3). To calibrate the difficulty of the tests, original scores were converted into standardized scores. We assessed the differences between the tests using the Student t-test, and the relationship between scores before and after TBL using linear regression. Results Although there was a decrease in the lowest score, groups T and 3 showed a significant increase in both original and standardized scores; there was also an increase in the standardized score of group 3. There was a positive correlation between the pre- and post-debate scores in groups T and 2, and the beta values of the pre-debate scores predicting the changes between the pre- and post-debate scores were statistically significant for both original and standardized scores. Conclusion TBL is an educational method that helps students improve their grades, particularly low-ranked students. PMID:29207457
Austin, Peter C; Schuster, Tibor
2016-10-01
Observational studies are increasingly being used to estimate the effect of treatments, interventions and exposures on outcomes that can occur over time. Historically, the hazard ratio, which is a relative measure of effect, has been reported. However, medical decision making is best informed when both relative and absolute measures of effect are reported. When outcomes are time-to-event in nature, the effect of treatment can also be quantified as the change in mean or median survival time due to treatment and the absolute reduction in the probability of the occurrence of an event within a specified duration of follow-up. We describe how three different propensity score methods, propensity score matching, stratification on the propensity score and inverse probability of treatment weighting using the propensity score, can be used to estimate absolute measures of treatment effect on survival outcomes. These methods are all based on estimating marginal survival functions under treatment and lack of treatment. We then conducted an extensive series of Monte Carlo simulations to compare the relative performance of these methods for estimating the absolute effects of treatment on survival outcomes. We found that stratification on the propensity score resulted in the greatest bias. Caliper matching on the propensity score and a method based on earlier work by Cole and Hernán tended to have the best performance for estimating absolute effects of treatment on survival outcomes. When the prevalence of treatment was less extreme, inverse probability of treatment weighting-based methods tended to perform better than matching-based methods. © The Author(s) 2014.
The creation, management, and use of data quality information for life cycle assessment.
Edelen, Ashley; Ingwersen, Wesley W
2018-04-01
Despite growing access to data, questions of "best fit" data and the appropriate use of results in supporting decision making still plague the life cycle assessment (LCA) community. This discussion paper addresses revisions to assessing data quality captured in a new US Environmental Protection Agency guidance document as well as additional recommendations on data quality creation, management, and use in LCA databases and studies. Existing data quality systems and approaches in LCA were reviewed and tested. The evaluations resulted in a revision to a commonly used pedigree matrix, for which flow and process level data quality indicators are described, more clarity for scoring criteria, and further guidance on interpretation are given. Increased training for practitioners on data quality application and its limits are recommended. A multi-faceted approach to data quality assessment utilizing the pedigree method alongside uncertainty analysis in result interpretation is recommended. A method of data quality score aggregation is proposed and recommendations for usage of data quality scores in existing data are made to enable improved use of data quality scores in LCA results interpretation. Roles for data generators, data repositories, and data users are described in LCA data quality management. Guidance is provided on using data with data quality scores from other systems alongside data with scores from the new system. The new pedigree matrix and recommended data quality aggregation procedure can now be implemented in openLCA software. Additional ways in which data quality assessment might be improved and expanded are described. Interoperability efforts in LCA data should focus on descriptors to enable user scoring of data quality rather than translation of existing scores. Developing and using data quality indicators for additional dimensions of LCA data, and automation of data quality scoring through metadata extraction and comparison to goal and scope are needed.
Shi, Xiaohu; Zhang, Jingfen; He, Zhiquan; Shang, Yi; Xu, Dong
2011-09-01
One of the major challenges in protein tertiary structure prediction is structure quality assessment. In many cases, protein structure prediction tools generate good structural models, but fail to select the best models from a huge number of candidates as the final output. In this study, we developed a sampling-based machine-learning method to rank protein structural models by integrating multiple scores and features. First, features such as predicted secondary structure, solvent accessibility and residue-residue contact information are integrated by two Radial Basis Function (RBF) models trained from different datasets. Then, the two RBF scores and five selected scoring functions developed by others, i.e., Opus-CA, Opus-PSP, DFIRE, RAPDF, and Cheng Score, are synthesized by a sampling method. Finally, another integrated RBF model ranks the structural models according to the features of the sampling distribution. We tested the proposed method using two different datasets, including the CASP server prediction models of all CASP8 targets and a set of models generated by our in-house software MUFOLD. The test results show that our method outperforms any individual scoring function on both best-model selection and overall correlation between the predicted and actual rankings of structural quality.
Compression of next-generation sequencing quality scores using memetic algorithm
2014-01-01
Background The exponential growth of next-generation sequencing (NGS) derived DNA data poses great challenges to data storage and transmission. Although many compression algorithms have been proposed for DNA reads in NGS data, few methods are designed specifically to handle the quality scores. Results In this paper we present a memetic algorithm (MA) based NGS quality score data compressor, namely MMQSC. The algorithm extracts raw quality score sequences from FASTQ formatted files, and designs a compression codebook using MA-based multimodal optimization. The input data is then compressed in a substitutional manner. Experimental results on five representative NGS data sets show that MMQSC obtains a higher compression ratio than other state-of-the-art methods. Notably, MMQSC is a lossless reference-free compression algorithm, yet obtains an average compression ratio of 22.82% on the experimental data sets. Conclusions The proposed MMQSC compresses NGS quality score data effectively. It can be utilized to improve the overall compression ratio on FASTQ formatted files. PMID:25474747
Feature and Score Fusion Based Multiple Classifier Selection for Iris Recognition
Islam, Md. Rabiul
2014-01-01
The aim of this work is to propose a new feature and score fusion based iris recognition approach in which a voting method over a Multiple Classifier Selection technique is applied. The outputs of four Discrete Hidden Markov Model classifiers, that is, a left iris based unimodal system, a right iris based unimodal system, a left-right iris feature fusion based multimodal system, and a left-right iris likelihood ratio score fusion based multimodal system, are combined using the voting method to achieve the final recognition result. The CASIA-IrisV4 database has been used to measure the performance of the proposed system with various dimensions. Experimental results show the versatility of the proposed system of four different classifiers with various dimensions. Finally, the recognition accuracy of the proposed system has been compared with the existing N hamming distance score fusion approach proposed by Ma et al., the log-likelihood ratio score fusion approach proposed by Schmid et al., and the single level feature fusion approach proposed by Hollingsworth et al. PMID:25114676
ERIC Educational Resources Information Center
Holley, Hope D.
2017-01-01
Despite research showing that high-stakes tests do not improve knowledge, Florida requires students to pass an Algebra I End-of-Course exam (EOC) to earn a high school diploma. Test passing scores are determined by a raw-score to t-score to scale-score analysis. This method ultimately results in a comparative test model where students' passage is…
NASA Astrophysics Data System (ADS)
Wu, Jing; Ferns, Gordon; Giles, John; Lewis, Emma
2012-03-01
Inter- and intra-observer variability is a problem often faced when an expert or observer is tasked with assessing the severity of a disease. This issue is keenly felt in coronary calcium scoring of patients suffering from atherosclerosis, where in clinical practice the observer must identify first the presence, and then the location, of candidate calcified plaques found within the coronary arteries that may prevent oxygenated blood flow to the heart muscle. However, it can be difficult for a human observer to differentiate calcified plaques that are located in the coronary arteries from those found in surrounding anatomy such as the mitral valve or pericardium. In addition to the benefits to scoring accuracy, fast, low-dose multi-slice CT imaging can acquire the entire heart within a single breath hold. This exposes the patient to a lower radiation dose, which is beneficial for a progressive disease such as atherosclerosis, where multiple scans may be required. Presented here is a fully automated method for calcium scoring using both the traditional Agatston method and the volume scoring method. Elimination of unwanted regions of the cardiac image slices, such as lungs, ribs, and vertebrae, is carried out using adaptive heart isolation. Such regions cannot contain calcified plaques but can be of a similar intensity, and their removal aids detection. Removal of both the ascending and descending aortas, as they contain clinically insignificant plaques, is necessary before the final calcium scores are calculated and examined against ground truth scores averaged from three expert observers. The results presented here are intended to show the feasibility of, and requirement for, an automated scoring method to reduce the subjectivity and reproducibility error inherent in manual clinical calcium scoring.
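The Agatston method referenced above weights each calcified lesion's area by a density factor derived from its peak attenuation. The following is a minimal sketch of the conventional scoring rule only (the 130 HU threshold and the density bins are the standard Agatston definitions), not the paper's automated detection pipeline:

```python
import numpy as np

def density_weight(peak_hu):
    """Conventional Agatston density factor from a lesion's peak attenuation."""
    if peak_hu < 130:
        return 0  # below the calcium detection threshold
    if peak_hu < 200:
        return 1
    if peak_hu < 300:
        return 2
    if peak_hu < 400:
        return 3
    return 4

def agatston_score(lesions, pixel_area_mm2):
    """Sum over lesions of (lesion area in mm^2) * density weight.
    Each lesion is given as an array of HU values, one per pixel."""
    total = 0.0
    for hu in lesions:
        hu = np.asarray(hu)
        area = hu.size * pixel_area_mm2
        total += area * density_weight(hu.max())
    return total

# One lesion of 5 pixels peaking at 250 HU (weight 2), pixel area 0.5 mm^2:
score = agatston_score([[135, 180, 250, 140, 131]], pixel_area_mm2=0.5)
```

The volume score mentioned in the abstract instead sums the volume of voxels above the threshold, without the density weighting.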
Scoring clustering solutions by their biological relevance.
Gat-Viks, I; Sharan, R; Shamir, R
2003-12-12
A central step in the analysis of gene expression data is the identification of groups of genes that exhibit similar expression patterns. Clustering gene expression data into homogeneous groups was shown to be instrumental in functional annotation, tissue classification, regulatory motif identification, and other applications. Although there is a rich literature on clustering algorithms for gene expression analysis, very few works addressed the systematic comparison and evaluation of clustering results. Typically, different clustering algorithms yield different clustering solutions on the same data, and there is no agreed upon guideline for choosing among them. We developed a novel statistically based method for assessing a clustering solution according to prior biological knowledge. Our method can be used to compare different clustering solutions or to optimize the parameters of a clustering algorithm. The method is based on projecting vectors of biological attributes of the clustered elements onto the real line, such that the ratio of between-groups and within-group variance estimators is maximized. The projected data are then scored using a non-parametric analysis of variance test, and the score's confidence is evaluated. We validate our approach using simulated data and show that our scoring method outperforms several extant methods, including the separation to homogeneity ratio and the silhouette measure. We apply our method to evaluate results of several clustering methods on yeast cell-cycle gene expression data. The software is available from the authors upon request.
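The between-groups to within-group variance ratio at the heart of the scoring method above is the classic one-way ANOVA F-statistic. The sketch below scores a clustering by a single (already projected) biological attribute; it is illustrative only, since the paper maximizes this ratio over projections of attribute vectors and then applies a non-parametric test:

```python
import numpy as np

def f_ratio(values, labels):
    """One-way ANOVA F-statistic: between-group variance estimate over
    within-group variance estimate for a 1-D attribute under a clustering."""
    values, labels = np.asarray(values, float), np.asarray(labels)
    grand = values.mean()
    groups = [values[labels == g] for g in np.unique(labels)]
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# A clustering that separates the attribute well scores much higher:
attr = np.array([1.0, 1.1, 0.9, 5.0, 5.2, 4.8])
good = f_ratio(attr, [0, 0, 0, 1, 1, 1])
bad = f_ratio(attr, [0, 1, 0, 1, 0, 1])
```

A higher ratio indicates that the clustering aligns with the biological attribute, which is the sense in which one clustering solution is scored as more biologically relevant than another.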
Comparison of Manual Refraction Versus Autorefraction in 60 Diabetic Retinopathy Patients.
Shirzadi, Keyvan; Shahraki, Kourosh; Yahaghi, Emad; Makateb, Ali; Khosravifard, Keivan
2016-07-27
The purpose of the study was to compare manual refraction with autorefraction in diabetic retinopathy patients. The study was conducted at the Be'sat Army Hospital from 2013-2015. In the present study, differences between two common refractometry methods (manual refractometry and autorefractometry) in the diagnosis and follow-up of retinopathy in patients with diabetes are investigated. Our results showed a significant difference between manual and autorefractometry in patients' visual acuity scores. Despite this, the spherical equivalent scores of the two refractometry methods did not show a statistically significant difference in these patients. Thus, although manual refraction is comparable with autorefraction for evaluating spherical equivalent scores in diabetic patients with retinopathy, the visual acuity results from the two methods are not comparable.
Prediction of true test scores from observed item scores and ancillary data.
Haberman, Shelby J; Yao, Lili; Sinharay, Sandip
2015-05-01
In many educational tests which involve constructed responses, a traditional test score is obtained by adding together item scores obtained through holistic scoring by trained human raters. For example, this practice was used until 2008 in the case of GRE® General Analytical Writing and until 2009 in the case of TOEFL® iBT Writing. With use of natural language processing, it is possible to obtain additional information concerning item responses from computer programs such as e-rater®. In addition, available information relevant to examinee performance may include scores on related tests. We suggest application of standard results from classical test theory to the available data to obtain best linear predictors of true traditional test scores. In performing such analysis, we require estimation of variances and covariances of measurement errors, a task which can be quite difficult in the case of tests with limited numbers of items and with multiple measurements per item. As a consequence, a new estimation method is suggested based on samples of examinees who have taken an assessment more than once. Such samples are typically not random samples of the general population of examinees, so that we apply statistical adjustment methods to obtain the needed estimated variances and covariances of measurement errors. To examine practical implications of the suggested methods of analysis, applications are made to GRE General Analytical Writing and TOEFL iBT Writing. Results obtained indicate that substantial improvements are possible both in terms of reliability of scoring and in terms of assessment reliability. © 2015 The British Psychological Society.
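In the simplest single-score case, the classical-test-theory "best linear predictor of the true score" reduces to Kelley's formula: the observed score is regressed toward the group mean in proportion to the test's reliability. A minimal sketch (the reliability and mean below are illustrative numbers, not values from the paper, whose estimators also incorporate ancillary scores):

```python
def kelley_true_score(observed, reliability, group_mean):
    """Kelley's regressed estimate of the true score:
    T_hat = rho * X + (1 - rho) * mu, where rho is test reliability."""
    return reliability * observed + (1.0 - reliability) * group_mean

# An observed score of 60 on a test with reliability 0.8, group mean 50,
# is shrunk toward the mean (regressed estimate close to 58):
estimate = kelley_true_score(60, 0.8, 50)
```

The paper's contribution can be read as extending this idea: with several correlated measurements per examinee (human raters, e-rater features, related tests), the shrinkage weights become a vector of regression coefficients estimated from the variances and covariances of measurement errors.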
RAId_aPS: MS/MS Analysis with Multiple Scoring Functions and Spectrum-Specific Statistics
Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo
2010-01-01
Statistically meaningful comparison/combination of peptide identification results from various search methods is impeded by the lack of a universal statistical standard. Providing an E-value calibration protocol, we demonstrated earlier the feasibility of translating either the score or heuristic E-value reported by any method into the textbook-defined E-value, which may serve as the universal statistical standard. This protocol, although robust, may lose spectrum-specific statistics and might require a new calibration when changes in experimental setup occur. To mitigate these issues, we developed a new MS/MS search tool, RAId_aPS, that is able to provide spectrum-specific E-values for additive scoring functions. Given a selection of scoring functions out of RAId score, K-score, Hyperscore and XCorr, RAId_aPS generates the corresponding score histograms of all possible peptides using dynamic programming. Using these score histograms to assign E-values enables a calibration-free protocol for accurate significance assignment for each scoring function. RAId_aPS features four different modes: (i) compute the total number of possible peptides for a given molecular mass range, (ii) generate the score histogram given a MS/MS spectrum and a scoring function, (iii) reassign E-values for a list of candidate peptides given a MS/MS spectrum and the scoring functions chosen, and (iv) perform database searches using selected scoring functions. In modes (iii) and (iv), RAId_aPS is also capable of combining results from different scoring functions using spectrum-specific statistics. The web link is http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/raid_aps/index.html. Relevant binaries for Linux, Windows, and Mac OS X are available from the same page. PMID:21103371
Initial assessment of facial nerve paralysis based on motion analysis using an optical flow method.
Samsudin, Wan Syahirah W; Sundaraj, Kenneth; Ahmad, Amirozi; Salleh, Hasriah
2016-01-01
An initial assessment method that can classify as well as categorize the severity of paralysis into one of six levels according to the House-Brackmann (HB) system based on facial landmarks motion using an Optical Flow (OF) algorithm is proposed. The desired landmarks were obtained from the video recordings of 5 normal and 3 Bell's Palsy subjects and tracked using the Kanade-Lucas-Tomasi (KLT) method. A new scoring system based on the motion analysis using area measurement is proposed. This scoring system uses the individual scores from the facial exercises and grades the paralysis based on the HB system. The proposed method has obtained promising results and may play a pivotal role towards improved rehabilitation programs for patients.
Ahlman, Mark A; Nietert, Paul J; Wahlquist, Amy E; Serguson, Jill M; Berry, Max W; Suranyi, Pal; Liu, Songtao; Spicer, Kenneth M
2014-01-01
Purpose: In the effort to reduce radiation exposure to patients undergoing myocardial perfusion imaging (MPI) with SPECT/CT, we evaluate the feasibility of a single CT for attenuation correction (AC) of single-day rest (R)/stress (S) perfusion. Methods: Processing of 20 single-isotope and 20 dual-isotope MPI studies with perfusion defects was retrospectively repeated in three steps: (1) the standard method, using a concurrent R-CT for AC of R-SPECT and S-CT for S-SPECT; (2) the standard method repeated; and (3) with the R-CT used for AC of S-SPECT, and the S-CT used for AC of R-SPECT. Intra-class correlation coefficients (ICC) and Cohen's kappa were used to measure intra-operator variability in sum scoring. Results: The highest level of intra-operator reliability was seen with the reproduction of the sum rest score (SRS) and sum stress score (SSS) (ICC > 95%). ICCs were > 85% for SRS and SSS when alternate CTs were used for AC, but when sum difference scores were calculated, ICC values were much lower (~22% to 27%), which may imply that neither CT substitution resulted in a reproducible difference score. Similar results were seen when evaluating dichotomous outcomes (sum score difference of ≥ 4) when comparing different processing techniques (kappas ~0.32 to 0.43). Conclusions: When a single CT is used for AC of both rest and stress SPECT, there is disproportionately high variability in sum scoring that is independent of user error. This information can be used to direct further investigation into radiation reduction for common imaging exams in nuclear medicine. PMID:24482701
Nakanishi, Rine; Sankaran, Sethuraman; Grady, Leo; Malpeso, Jenifer; Yousfi, Razik; Osawa, Kazuhiro; Ceponiene, Indre; Nazarat, Negin; Rahmani, Sina; Kissel, Kendall; Jayawardena, Eranthi; Dailing, Christopher; Zarins, Christopher; Koo, Bon-Kwon; Min, James K; Taylor, Charles A; Budoff, Matthew J
2018-03-23
Our goal was to evaluate the efficacy of a fully automated method for assessing the image quality (IQ) of coronary computed tomography angiography (CCTA). The machine learning method was trained using 75 CCTA studies by mapping features (noise, contrast, misregistration scores, and un-interpretability index) to an IQ score based on manual ground truth data. The automated method was validated on a set of 50 CCTA studies and subsequently tested on a new set of 172 CCTA studies against visual IQ scores on a 5-point Likert scale. The area under the curve in the validation set was 0.96. In the 172 CCTA studies, our method yielded a Cohen's kappa statistic of 0.67 (p < 0.01) for the agreement between automated and visual IQ assessment. Of the patients graded visually as good to excellent (n = 163), fair (n = 6), and poor (n = 3), 155, 5, and 2, respectively, received an automated IQ score > 50%. Fully automated assessment of the IQ of CCTA data sets by machine learning was reproducible and provided similar results compared with visual analysis, within the limits of inter-operator variability. • The proposed method enables automated and reproducible image quality assessment. • Machine learning and visual assessments yielded comparable estimates of image quality. • Automated assessment potentially allows for more standardised image quality. • Image quality assessment enables standardization of clinical trial results across different datasets.
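Cohen's kappa, the agreement statistic reported above, corrects raw rater agreement for the agreement expected by chance. A minimal sketch of the standard computation from a confusion matrix (the data below are illustrative, not the study's):

```python
import numpy as np

def cohens_kappa(confusion):
    """Cohen's kappa from a confusion matrix between two raters:
    (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
    p_e is the chance agreement implied by the marginal totals."""
    confusion = np.asarray(confusion, dtype=float)
    n = confusion.sum()
    p_o = np.trace(confusion) / n                                  # observed
    p_e = (confusion.sum(axis=0) @ confusion.sum(axis=1)) / n ** 2  # chance
    return (p_o - p_e) / (1.0 - p_e)

# Two graders agreeing on 90 of 100 cases with balanced marginals:
kappa = cohens_kappa([[45, 5],
                      [5, 45]])
```

With balanced marginals, chance agreement is 0.5, so 90% raw agreement corresponds to a kappa of 0.8; the study's kappa of 0.67 sits in the range conventionally described as substantial agreement.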
A Two-Step Bayesian Approach for Propensity Score Analysis: Simulations and Case Study.
Kaplan, David; Chen, Jianshen
2012-07-01
A two-step Bayesian propensity score approach is introduced that incorporates prior information in the propensity score equation and outcome equation without the problems associated with simultaneous Bayesian propensity score approaches. The corresponding variance estimators are also provided. The two-step Bayesian propensity score is provided for three methods of implementation: propensity score stratification, weighting, and optimal full matching. Three simulation studies and one case study are presented to elaborate the proposed two-step Bayesian propensity score approach. Results of the simulation studies reveal that greater precision in the propensity score equation yields better recovery of the frequentist-based treatment effect. A slight advantage is shown for the Bayesian approach in small samples. Results also reveal that greater precision around the wrong treatment effect can lead to seriously distorted results. However, greater precision around the correct treatment effect parameter yields quite good results, with slight improvement seen with greater precision in the propensity score equation. A comparison of coverage rates for the conventional frequentist approach and proposed Bayesian approach is also provided. The case study reveals that credible intervals are wider than frequentist confidence intervals when priors are non-informative.
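Two of the implementation methods named above, stratification on the propensity score and weighting, can be sketched with plain numpy. This is an illustrative frequentist sketch, not the paper's two-step Bayesian estimator, and it assumes the propensity scores `ps` have already been estimated (e.g., by a logistic regression of treatment on covariates):

```python
import numpy as np

def iptw_ate(y, t, ps):
    """Inverse-probability-of-treatment-weighted average treatment effect
    with normalized (Hajek) weights: 1/ps for treated, 1/(1-ps) for controls."""
    w1, w0 = t / ps, (1 - t) / (1 - ps)
    return (w1 @ y) / w1.sum() - (w0 @ y) / w0.sum()

def stratified_ate(y, t, ps, n_strata=5):
    """Stratify on propensity-score quantiles and average the within-stratum
    treated-minus-control differences, weighted by stratum size.
    Strata lacking one treatment arm are skipped."""
    edges = np.quantile(ps, np.linspace(0, 1, n_strata + 1))
    strata = np.clip(np.searchsorted(edges, ps, side="right") - 1,
                     0, n_strata - 1)
    effect, total = 0.0, 0
    for s in range(n_strata):
        m = strata == s
        treated, control = m & (t == 1), m & (t == 0)
        if treated.sum() == 0 or control.sum() == 0:
            continue
        effect += m.sum() * (y[treated].mean() - y[control].mean())
        total += m.sum()
    return effect / total

# Synthetic data with a true treatment effect of 2.0, confounded
# through the propensity score itself:
rng = np.random.default_rng(0)
ps = rng.uniform(0.2, 0.8, 5000)
t = (rng.uniform(size=5000) < ps).astype(float)
y = 2.0 * t + ps
ate_weighting = iptw_ate(y, t, ps)
ate_strata = stratified_ate(y, t, ps)
```

Both estimators recover approximately the true effect here, though classical results suggest quintile stratification removes only most, not all, of the confounding bias, which is consistent with the paper's interest in precision in the propensity score equation.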
Cousins, Matthew M.; Swan, David; Magaret, Craig A.; Hoover, Donald R.; Eshleman, Susan H.
2012-01-01
Background HIV diversity may be a useful biomarker for discriminating between recent and non-recent HIV infection. The high resolution melting (HRM) diversity assay was developed to quantify HIV diversity in viral populations without sequencing. In this assay, HIV diversity is expressed as a single numeric HRM score that represents the width of a melting peak. HRM scores are highly associated with diversity measures obtained with next generation sequencing. In this report, a software package, the HRM Diversity Assay Analysis Tool (DivMelt), was developed to automate calculation of HRM scores from melting curve data. Methods DivMelt uses computational algorithms to calculate HRM scores by identifying the start (T1) and end (T2) melting temperatures for a DNA sample and subtracting them (T2–T1 = HRM score). DivMelt contains many user-supplied analysis parameters to allow analyses to be tailored to different contexts. DivMelt analysis options were optimized to discriminate between recent and non-recent HIV infection and to maximize HRM score reproducibility. HRM scores calculated using DivMelt were compared to HRM scores obtained using a manual method that is based on visual inspection of DNA melting curves. Results HRM scores generated with DivMelt agreed with manually generated HRM scores obtained from the same DNA melting data. Optimal parameters for discriminating between recent and non-recent HIV infection were identified. DivMelt provided greater discrimination between recent and non-recent HIV infection than the manual method. Conclusion DivMelt provides a rapid, accurate method of determining HRM scores from melting curve data, facilitating use of the HRM diversity assay for large-scale studies. PMID:23240016
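The T2 − T1 computation that DivMelt automates can be sketched as follows. This is an illustrative reconstruction, not the DivMelt algorithm: here the melt region is defined as where the melt rate (−dF/dT) exceeds a fixed fraction of its peak, a stand-in for DivMelt's user-supplied analysis parameters.

```python
import numpy as np

def hrm_score(temps, fluorescence, frac=0.05):
    """Return (T1, T2, HRM score) from a melting curve.
    T1 and T2 bracket the region where the melt rate (-dF/dT) exceeds
    `frac` of its maximum; the HRM score is T2 - T1."""
    rate = -np.gradient(np.asarray(fluorescence, float),
                        np.asarray(temps, float))
    active = np.where(rate > frac * rate.max())[0]
    t1, t2 = temps[active[0]], temps[active[-1]]
    return t1, t2, t2 - t1

# Synthetic sigmoidal melt centered at 85 C (fluorescence falls as the
# DNA duplex melts); a wider transition would give a larger HRM score:
temps = np.linspace(70.0, 95.0, 251)
fluor = 1.0 / (1.0 + np.exp((temps - 85.0) / 0.8))
t1, t2, score = hrm_score(temps, fluor)
```

A more diverse viral population melts over a broader temperature range, widening the peak and hence the HRM score, which is the basis for discriminating recent from non-recent infection.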
Data-Driven Benchmarking of Building Energy Efficiency Utilizing Statistical Frontier Models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kavousian, A; Rajagopal, R
2014-01-01
Frontier methods quantify the energy efficiency of buildings by forming an efficient frontier (best-practice technology) and by comparing all buildings against that frontier. Because energy consumption fluctuates over time, the efficiency scores are stochastic random variables. Existing applications of frontier methods in energy efficiency either treat efficiency scores as deterministic values or estimate their uncertainty by resampling from one set of measurements. Availability of smart meter data (repeated measurements of energy consumption of buildings) enables using actual data to estimate the uncertainty in efficiency scores. Additionally, existing applications assume a linear form for the efficient frontier; i.e., they assume that the best-practice technology scales up and down proportionally with building characteristics. However, previous research shows that buildings are nonlinear systems. This paper proposes a statistical method called stochastic energy efficiency frontier (SEEF) to estimate a bias-corrected efficiency score and its confidence intervals from measured data. The paper proposes an algorithm to specify the functional form of the frontier, identify the probability distribution of the efficiency score of each building using measured data, and rank buildings based on their energy efficiency. To illustrate the power of SEEF, this paper presents the results from applying SEEF to a smart meter data set of 307 residential buildings in the United States. SEEF efficiency scores are used to rank individual buildings based on energy efficiency, to compare subpopulations of buildings, and to identify irregular behavior of buildings across different time-of-use periods. SEEF is an improvement to the energy-intensity method (comparing kWh/sq.ft.): whereas SEEF identifies efficient buildings across the entire spectrum of building sizes, the energy-intensity method showed bias toward smaller buildings.
The results of this research are expected to assist researchers and practitioners in comparing and ranking (i.e., benchmarking) buildings more robustly and over a wider range of building types and sizes. Eventually, doing so is expected to result in improved resource allocation in energy-efficiency programs.
ERIC Educational Resources Information Center
Klinger, Don A.; Rogers, W. Todd
2003-01-01
The estimation accuracy of procedures based on classical test score theory and item response theory (generalized partial credit model) were compared for examinations consisting of multiple-choice and extended-response items. Analysis of British Columbia Scholarship Examination results found an error rate of about 10 percent for both methods, with…
Unsupervised Deep Learning Applied to Breast Density Segmentation and Mammographic Risk Scoring.
Kallenberg, Michiel; Petersen, Kersten; Nielsen, Mads; Ng, Andrew Y; Pengfei Diao; Igel, Christian; Vachon, Celine M; Holland, Katharina; Winkel, Rikke Rass; Karssemeijer, Nico; Lillholm, Martin
2016-05-01
Mammographic risk scoring has commonly been automated by extracting a set of handcrafted features from mammograms, and relating the responses directly or indirectly to breast cancer risk. We present a method that learns a feature hierarchy from unlabeled data. When the learned features are used as the input to a simple classifier, two different tasks can be addressed: i) breast density segmentation, and ii) scoring of mammographic texture. The proposed model learns features at multiple scales. To control the model's capacity, a novel sparsity regularizer is introduced that incorporates both lifetime and population sparsity. We evaluated our method on three different clinical datasets. Our state-of-the-art results show that the learned breast density scores have a very strong positive relationship with manual ones, and that the learned texture scores are predictive of breast cancer. The model is easy to apply and generalizes to many other segmentation and scoring problems.
Double propensity-score adjustment: A solution to design bias or bias due to incomplete matching.
Austin, Peter C
2017-02-01
Propensity-score matching is frequently used to reduce the effects of confounding when using observational data to estimate the effects of treatments. Matching allows one to estimate the average effect of treatment in the treated. Rosenbaum and Rubin coined the term "bias due to incomplete matching" to describe the bias that can occur when some treated subjects are excluded from the matched sample because no appropriate control subject was available. The presence of incomplete matching raises important questions around the generalizability of estimated treatment effects to the entire population of treated subjects. We describe an analytic solution to address the bias due to incomplete matching. Our method is based on using optimal or nearest neighbor matching, rather than caliper matching (which frequently results in the exclusion of some treated subjects). Within the sample matched on the propensity score, covariate adjustment using the propensity score is then employed to impute missing potential outcomes under lack of treatment for each treated subject. Using Monte Carlo simulations, we found that the proposed method resulted in estimates of treatment effect that were essentially unbiased. This method resulted in decreased bias compared to caliper matching alone and compared to either optimal matching or nearest neighbor matching alone. Caliper matching alone resulted in design bias or bias due to incomplete matching, while optimal matching or nearest neighbor matching alone resulted in bias due to residual confounding. The proposed method also tended to result in estimates with decreased mean squared error compared to when caliper matching was used.
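The matching-plus-adjustment pipeline described in this abstract can be sketched as follows. This is a toy illustration of the general idea, not the authors' implementation: propensity scores are taken as given, each treated subject is matched to its nearest control (with replacement, so no treated subject is excluded), and a regression of the control outcome on the propensity score imputes each treated subject's missing potential outcome. All data are invented.

```python
# Toy sketch: nearest-neighbor matching followed by covariate adjustment
# on the propensity score. Propensity scores are assumed already estimated.

def nearest_neighbor_match(treated_ps, control_ps):
    """Match each treated subject to the control with the closest
    propensity score (with replacement), so no treated subject is excluded."""
    return [min(range(len(control_ps)), key=lambda k: abs(control_ps[k] - p))
            for p in treated_ps]

def fit_line(xs, ys):
    """Least squares for y = a + b*x, used to model the control outcome
    as a function of the propensity score within the matched sample."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def att_estimate(treated_ps, treated_y, control_ps, control_y):
    """Average effect of treatment in the treated: observed treated outcome
    minus a regression-imputed outcome under lack of treatment."""
    match = nearest_neighbor_match(treated_ps, control_ps)
    a, b = fit_line([control_ps[j] for j in match],
                    [control_y[j] for j in match])
    gaps = [y - (a + b * p) for p, y in zip(treated_ps, treated_y)]
    return sum(gaps) / len(gaps)

# Invented data: outcomes rise with the propensity score (confounding),
# and the true effect of treatment in the treated is +2.
print(att_estimate([0.6, 0.7, 0.8], [8.0, 9.0, 10.0],
                   [0.55, 0.65, 0.75, 0.85], [5.5, 6.5, 7.5, 8.5]))
```

Because matching uses replacement rather than a caliper, every treated subject is retained, which is the property the paper's method relies on to avoid bias due to incomplete matching.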
Automatic Summarization as a Combinatorial Optimization Problem
NASA Astrophysics Data System (ADS)
Hirao, Tsutomu; Suzuki, Jun; Isozaki, Hideki
We derived the oracle summary with the highest ROUGE score that can be achieved by integrating sentence extraction with sentence compression from the reference abstract. Analysis of the oracle revealed that summarization systems have to assign an appropriate compression rate to each sentence in the document. In accordance with this observation, this paper formulates summarization as a combinatorial optimization problem: selecting, from a pool consisting of sentences at various compression rates, the set of sentences that maximizes the sum of the sentence scores, subject to length constraints. The score of a sentence is defined by its compression rate, content words, and positional information. The parameters for the compression rates and positional information are optimized by minimizing the loss between the scores of oracles and those of candidates. The results obtained on the TSC-2 corpus showed that our method outperformed the previous systems with statistical significance.
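The selection step described in this abstract is a multiple-choice knapsack: at most one compression variant per sentence, maximizing total score under a length budget. A minimal sketch with invented lengths and scores (not the paper's scoring function):

```python
# Multiple-choice knapsack sketch: each sentence contributes at most one
# of its compressed variants, and total length must stay within the budget.
# Lengths and scores below are invented for illustration.

def summarize(sentences, budget):
    """sentences: list of variant lists; each variant is (length, score).
    Returns the best total score achievable within the length budget."""
    best = [0.0] * (budget + 1)          # best[l] = max score with length <= l
    for variants in sentences:
        new = best[:]                    # option: skip this sentence entirely
        for length, score in variants:
            for l in range(length, budget + 1):
                cand = best[l - length] + score   # built from the pre-sentence
                if cand > new[l]:                 # table, so at most one
                    new[l] = cand                 # variant per sentence
        best = new
    return best[budget]

# Two sentences, each with a long and a short variant; a budget of 7 units
# forces the short variant of sentence 1 plus the long variant of sentence 2.
pool = [[(5, 3.0), (3, 2.0)], [(4, 2.5), (2, 1.0)]]
print(summarize(pool, 7))
```

The dynamic program makes the key trade-off explicit: a shorter, lower-scoring compression of one sentence can free enough budget for a higher-scoring variant of another.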
Drawing causal inferences using propensity scores: a practical guide for community psychologists.
Lanza, Stephanie T; Moore, Julia E; Butera, Nicole M
2013-12-01
Confounding present in observational data impedes community psychologists' ability to draw causal inferences. This paper describes propensity score methods as a conceptually straightforward approach to drawing causal inferences from observational data. A step-by-step demonstration of three propensity score methods (weighting, matching, and subclassification) is presented in the context of an empirical examination of the causal effect of preschool experiences (Head Start vs. parental care) on reading development in kindergarten. Although the unadjusted population estimate indicated that children with parental care had substantially higher reading scores than children who attended Head Start, all propensity score adjustments reduced the size of this overall causal effect by more than half. The causal effect was also defined and estimated among children who attended Head Start; the results provide no evidence for improved reading if those children had instead received parental care. We carefully define different causal effects and discuss their respective policy implications, summarize advantages and limitations of each propensity score method, and provide SAS and R syntax so that community psychologists may conduct causal inference in their own research.
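As one concrete illustration of the weighting method mentioned above, here is a minimal inverse-propensity-weighting sketch (toy data invented for the example; the paper itself provides SAS and R syntax):

```python
# Minimal inverse-propensity-weighting sketch. Treated subjects get weight
# 1/e(x), controls 1/(1 - e(x)); the weighted group means estimate the
# counterfactual means. Propensity scores are assumed known.

def iptw_ate(treated, ps, outcome):
    num_t = den_t = num_c = den_c = 0.0
    for t, p, y in zip(treated, ps, outcome):
        if t:
            w = 1.0 / p
            num_t += w * y
            den_t += w
        else:
            w = 1.0 / (1.0 - p)
            num_c += w * y
            den_c += w
    return num_t / den_t - num_c / den_c

# Confounded toy data: high-propensity subjects also have higher outcomes,
# so the naive difference in means (about 1.9) overstates the true effect
# of 1.0, which the weighted estimate recovers.
treated = [1, 1, 1, 1, 0, 1, 0]
ps      = [0.8, 0.8, 0.8, 0.8, 0.8, 0.5, 0.5]
outcome = [9.0, 9.0, 9.0, 9.0, 8.0, 6.0, 5.0]
print(iptw_ate(treated, ps, outcome))
```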
Soleymani, Mohammad Reza; Hemmati, Soheila; Ashrafi-Rizi, Hassan; Shahrzadi, Leila
2017-01-01
BACKGROUND AND OBJECTIVE: Maintaining and improving the health of children requires making them more aware of personal hygiene through proper education. Several studies indicate that teaching delivered through informal methods is readily understandable to children. Therefore, the goal of this study was to compare the effects of creative drama and storytelling education methods on increasing children's awareness of personal hygiene. METHODS: This is an applied study conducted using a semi-experimental design in two groups. The study population consisted of 85 children participating in the 4th center of the Institute for the Intellectual Development of Children and Young Adults in Isfahan, 40 of whom were randomly selected and placed in storytelling and creative drama groups of 20 members each. The data-gathering tool was a questionnaire created by the researchers, whose content validity was confirmed by health education experts. The gathered data were analyzed using both descriptive (mean and standard deviation) and analytical (independent t-test and paired t-test) statistical methods. RESULTS: The findings showed a significant difference between the awareness scores of both groups before and after the intervention. The average awareness score of the storytelling group increased from 50.69 to 86.83, while that of the creative drama group increased from 57.37 to 85.09. Furthermore, according to the paired t-test results, there was no significant difference between the average scores of the storytelling and creative drama groups. CONCLUSIONS: The results of the current study showed that although both storytelling and creative drama are effective in increasing children's awareness of personal hygiene, there is no significant difference between the two methods. PMID:29114550
Two-Way Gene Interaction From Microarray Data Based on Correlation Methods
Alavi Majd, Hamid; Talebi, Atefeh; Gilany, Kambiz; Khayyer, Nasibeh
2016-01-01
Background Gene networks have generated a massive explosion in the development of high-throughput techniques for monitoring various aspects of gene activity. Networks offer a natural way to model interactions between genes, and extracting gene network information from high-throughput genomic data is an important and difficult task. Objectives The purpose of this study is to construct a two-way gene network based on parametric and nonparametric correlation coefficients. The first step in constructing a gene co-expression network is to score all pairs of gene vectors; the second step is to select a score threshold and connect all gene pairs whose scores exceed this value. Materials and Methods In this foundation-application study, we constructed two-way gene networks using nonparametric methods, such as Spearman’s rank correlation coefficient and Blomqvist’s measure, and compared them with Pearson’s correlation coefficient. We surveyed six genes associated with venous thrombosis, built a matrix whose entries represent the score for the corresponding gene pair, and obtained two-way interactions using Pearson’s correlation, Spearman’s rank correlation, and Blomqvist’s coefficient. Finally, these methods were compared with Cytoscape, based on BIND, and Gene Ontology, based on molecular-function visual methods; R software version 3.2 and Bioconductor were used to perform these analyses. Results Based on the Pearson and Spearman correlations, the results were the same and were confirmed by the Cytoscape and GO visual methods; however, Blomqvist’s coefficient was not confirmed by the visual methods. Conclusions Some results of the correlation coefficients do not agree with the visualizations, possibly because of the small number of observations. PMID:27621916
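The two construction steps described above (score every gene pair, then threshold) can be sketched as follows, with invented expression vectors and an assumed cutoff; tie handling in the Spearman ranks is omitted for brevity:

```python
# Sketch of co-expression network construction: score every gene pair with
# a correlation coefficient, then connect pairs whose absolute score
# exceeds a threshold. Data and cutoff are invented.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def rank(v):
    """Rank positions 1..n (ties not handled in this sketch)."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for pos, i in enumerate(order):
        r[i] = pos + 1
    return r

def spearman(x, y):
    return pearson(rank(x), rank(y))   # Pearson computed on ranks

def edges(expr, corr, threshold):
    """expr: dict gene -> expression vector. Returns the thresholded edges."""
    genes = sorted(expr)
    return {(g, h) for i, g in enumerate(genes) for h in genes[i + 1:]
            if abs(corr(expr[g], expr[h])) > threshold}

expr = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 1, 3, 2]}
print(edges(expr, pearson, 0.9))   # only the strongly correlated pair survives
```

Swapping `pearson` for `spearman` in the `edges` call reproduces the comparison the study makes between the parametric and nonparametric scores.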
NASA Astrophysics Data System (ADS)
Wu, Jing; Ferns, Gordon; Giles, John; Lewis, Emma
2012-02-01
Inter- and intra-observer variability is a problem often faced when an expert or observer is tasked with assessing the severity of a disease. This issue is keenly felt in coronary calcium scoring of patients suffering from atherosclerosis, where in clinical practice the observer must identify first the presence, and then the location, of candidate calcified plaques found within the coronary arteries that may prevent oxygenated blood flow to the heart muscle. This can be challenging for a human observer because it is difficult to differentiate calcified plaques located in the coronary arteries from those found in surrounding anatomy such as the mitral valve or pericardium. The inclusion of false positives or exclusion of true positives will incorrectly alter the patient's calcium score, possibly leading to incorrect treatment prescription. In addition to the benefits to scoring accuracy, fast, low-dose multi-slice CT imaging can acquire the entire heart within a single breath hold, exposing the patient to a lower radiation dose; for a progressive disease such as atherosclerosis, where multiple scans may be required, this is beneficial to the patient's health. Presented here is a fully automated method for calcium scoring using both the traditional Agatston method and the volume scoring method. Unwanted regions of the cardiac image slices, such as lungs, ribs, and vertebrae, are eliminated using adaptive heart isolation; such regions cannot contain calcified plaques but can be of similar intensity, and their removal aids detection. Removal of both the ascending and descending aortas, as they contain clinically insignificant plaques, is necessary before the final calcium scores are calculated and examined against ground-truth scores averaged over three expert observers.
The results presented here are intended to show the need for, and feasibility of, an automated scoring method that reduces the subjectivity and reproducibility error inherent in manual clinical calcium scoring.
Bastien, Olivier; Ortet, Philippe; Roy, Sylvaine; Maréchal, Eric
2005-03-10
Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which the addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that would conserve the information of pair-wise sequence alignments, respect the probabilistic properties of Z-scores (Monte Carlo methods applied to pair-wise comparisons), and be the basis for a novel method of consistent and stable phylogenetic reconstruction. We have built up a spatial representation of protein sequences using concepts from particle physics (configuration space) and respecting a frame of constraints deduced from pair-wise alignment score properties in information theory. The obtained configuration space of homologous proteins (CSHP) allows the representation of real and shuffled sequences, and thereupon an expression of the TULIP theorem for Z-score probabilities. Based on the CSHP, we propose a phylogeny reconstruction using Z-scores. The deduced trees, called TULIP trees, are consistent with multiple-alignment-based trees. Furthermore, the TULIP tree reconstruction method provides a solution for some previously reported incongruent results, such as the apicomplexan enolase phylogeny. The CSHP is a unified model that conserves mutual information between proteins in the way physical models conserve energy. Applications include the reconstruction of evolutionarily consistent and robust trees, the topology of which is based on a spatial representation that is not reordered after the addition or removal of sequences. The CSHP and its assigned phylogenetic topology provide a powerful and easily updated representation for massive pair-wise genome comparisons based on Z-score computations.
Hsieh, Jui-Hua; Yin, Shuangye; Wang, Xiang S; Liu, Shubin; Dokholyan, Nikolay V; Tropsha, Alexander
2012-01-23
Poor performance of scoring functions is a well-known bottleneck in structure-based virtual screening (VS), which is most frequently manifested in the scoring functions' inability to discriminate between true ligands and known nonbinders (therefore designated as binding decoys). This deficiency leads to a large number of false positive hits resulting from VS. We have hypothesized that filtering out or penalizing docking poses recognized as non-native (i.e., pose decoys) should improve the performance of VS in terms of improved identification of true binders. Using several concepts from the field of cheminformatics, we have developed a novel approach to identifying pose decoys from an ensemble of poses generated by computational docking procedures. We demonstrate that the use of a target-specific pose (scoring) filter in combination with a physical force field-based scoring function (MedusaScore) leads to significant improvement of hit rates in VS studies for 12 of the 13 benchmark sets from the clustered version of the Database of Useful Decoys (DUD). This new hybrid scoring function outperforms several conventional structure-based scoring functions, including XSCORE::HMSCORE, ChemScore, PLP, and Chemgauss3, in 6 out of 13 data sets at the early stage of VS (up to 1% of the decoys of the screening database). We compare our hybrid method with several novel VS methods that were recently reported to have good performances on the same DUD data sets. We find that the retrieved ligands using our method are chemically more diverse in comparison with two ligand-based methods (FieldScreen and FLAP::LBX). We also compare our method with FLAP::RBLB, a high-performance VS method that also utilizes both the receptor and the cognate ligand structures. Interestingly, we find that the top ligands retrieved using our method are highly complementary to those retrieved using FLAP::RBLB, hinting at effective directions for best VS applications. 
We suggest that this integrative VS approach combining cheminformatics and molecular mechanics methodologies may be applied to a broad variety of protein targets to improve the outcome of structure-based drug discovery studies.
The contribution of clinical assessments to the diagnostic algorithm of pulmonary embolism.
Turan, Onur; Turgut, Deniz; Gunay, Turkan; Yilmaz, Erkan; Turan, Ayse; Akkoclu, Atila
2017-01-01
Pulmonary thromboembolism (PE) is a major disease in respiratory emergencies. Thoracic CT angiography (CTA) is an important method for visualizing PE. Because of the high radiation and contrast exposure, the method should be performed selectively in patients in whom PE is suspected. The aim of the study was to identify the role of clinical scoring systems, evaluated against CTA results, in diagnosing PE. The study investigated 196 patients referred to the hospital emergency service in whom PE was suspected and CTA was performed. They were evaluated by empirical, Wells, Geneva, and Miniati assessments and classified as low, intermediate, or high clinical probability; they were also classified according to serum D-dimer levels. The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated and evaluated against CTA findings. Empirical scoring was found to have the highest sensitivity, while the Wells system had the highest specificity. When low D-dimer levels and "low probability" were evaluated together for each scoring system, the sensitivity was 100% for all methods. Wells scoring with a cut-off score of 4 had the highest specificity (56.1%). Clinical scoring systems may serve as guides for patients in whom PE is suspected in the emergency department. The empirical and Wells scoring systems are effective methods for patient selection. Adding evaluation of serum D-dimer levels to the clinical scores could identify patients in whom CTA should be performed. Since CTA should be used conservatively, clinical scoring systems in conjunction with D-dimer levels can be a useful guide for patient selection.
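The triage logic this study motivates can be sketched as a simple decision rule. The Wells cut-off of 4 follows the abstract; the D-dimer cut-off of 500 ng/mL is an assumed conventional value, not a finding of this study:

```python
# Illustrative decision rule (not a clinical tool): CTA is withheld only
# when both the clinical score and the D-dimer are low. The Wells cut-off
# of 4 follows the abstract; the D-dimer cut-off of 500 ng/mL is an
# assumed conventional value, not taken from this study.

def needs_cta(wells_score, d_dimer_ng_ml, d_dimer_cutoff=500):
    low_probability = wells_score <= 4          # "PE unlikely" by Wells
    low_d_dimer = d_dimer_ng_ml < d_dimer_cutoff
    return not (low_probability and low_d_dimer)

print(needs_cta(6, 300))   # high clinical score alone triggers CTA
print(needs_cta(2, 900))   # elevated D-dimer alone triggers CTA
print(needs_cta(2, 300))   # low on both: CTA can be withheld
```

The rule mirrors the reported finding that combining a low clinical score with a low D-dimer achieved 100% sensitivity, so withholding CTA only in that combined-low group is the conservative choice.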
Albuquerque, Maicon R.; Lopes, Mariana C.; de Paula, Jonas J.; Faria, Larissa O.; Pereira, Eveline T.; da Costa, Varley T.
2017-01-01
In order to understand the reasons that lead individuals to practice physical activity, researchers developed the Motives for Physical Activity Measure-Revised (MPAM-R) scale. In 2010, a translation of the MPAM-R into Portuguese was performed and validated; however, its psychometric measures were not acceptable. In addition, factor scores in some sport psychology scales are calculated as the mean of the scores on the factor's items. Nevertheless, it seems appropriate that items with higher factor loadings, extracted by factor analysis, should have greater weight in the factor score, and items with lower factor loadings less weight. The aims of the present study were to translate and validate a Portuguese version of the MPAM-R and to investigate agreement between two methods used to calculate factor scores. Three hundred volunteers who had been involved in physical activity programs for at least 6 months were recruited. Confirmatory Factor Analysis of the 30 items indicated that this version did not fit the model. After excluding four items, the final model with 26 items showed acceptable fit measures by Exploratory Factor Analysis, and it conceptually supports the five factors of the original proposal. When the two methods of calculating factor scores were compared, our results showed that only the “Enjoyment” and “Appearance” factors showed agreement between methods. Thus, the Portuguese version of the MPAM-R can be used in a Brazilian context, and a new proposal for calculating the factor score seems promising. PMID:28293203
Yildiz, Bulent O.; Bolour, Sheila; Woods, Keslie; Moore, April; Azziz, Ricardo
2010-01-01
BACKGROUND Hirsutism is the presence of excess body or facial terminal (coarse) hair growth in females in a male-like pattern, affects 5–15% of women, and is an important sign of underlying androgen excess. Different methods are available for the assessment of hair growth in women. METHODS We conducted a literature search and analyzed the published studies that reported methods for the assessment of hair growth. We review the basic physiology of hair growth, the development of methods for visually quantifying hair growth, the comparison of these methods with objective measurements of hair growth, how hirsutism may be defined using a visual scoring method, the influence of race and ethnicity on hirsutism, and the impact of hirsutism in diagnosing androgen excess and polycystic ovary syndrome. RESULTS Objective methods for the assessment of hair growth including photographic evaluations and microscopic measurements are available but these techniques have limitations for clinical use, including a significant degree of complexity and a high cost. Alternatively, methods for visually scoring or quantifying the amount of terminal body and facial hair growth have been in use since the early 1920s; these methods are semi-quantitative at best and subject to significant inter-observer variability. The most common visual method of scoring the extent of body and facial terminal hair growth in use today is based on a modification of the method originally described by Ferriman and Gallwey in 1961 (i.e. the mFG method). CONCLUSION Overall, the mFG scoring method is a useful visual instrument for assessing excess terminal hair growth, and the presence of hirsutism, in women. PMID:19567450
Zeger, Scott L.; Kolars, Joseph C.
2008-01-01
Background Little is known about the associations of previous standardized examination scores with scores on subsequent standardized examinations used to assess medical knowledge in internal medicine residencies. Objective To examine associations of previous standardized test scores with subsequent standardized test scores. Design Retrospective cohort study. Participants One hundred ninety-five internal medicine residents. Methods Bivariate associations of United States Medical Licensing Examination (USMLE) Steps and Internal Medicine In-Training Examination (IM-ITE) scores were determined. Random effects analysis adjusting for repeated administrations of the IM-ITE and other variables known or hypothesized to affect IM-ITE score allowed for discrimination of associations of individual USMLE Step scores with IM-ITE scores. Results In bivariate associations, USMLE scores explained 17% to 27% of the variance in IM-ITE scores, and previous IM-ITE scores explained 66% of the variance in subsequent IM-ITE scores. Regression coefficients (95% CI) for adjusted associations of each USMLE Step with IM-ITE scores were USMLE-1 0.19 (0.12, 0.27), USMLE-2 0.23 (0.17, 0.30), and USMLE-3 0.19 (0.09, 0.29). Conclusions No single USMLE Step is more strongly associated with IM-ITE scores than the others. Because previous IM-ITE scores are strongly associated with subsequent IM-ITE scores, appropriate modeling, such as random effects methods, should be used to account for previous IM-ITE administrations in studies for which IM-ITE score is an outcome. PMID:18612735
Specific algorithm method of scoring the Clock Drawing Test applied in cognitively normal elderly
Mendes-Santos, Liana Chaves; Mograbi, Daniel; Spenciere, Bárbara; Charchat-Fichman, Helenice
2015-01-01
The Clock Drawing Test (CDT) is an inexpensive, fast and easily administered measure of cognitive function, especially in the elderly. This instrument is a popular clinical tool widely used in screening for cognitive disorders and dementia. The CDT can be applied in different ways and scoring procedures also vary. Objective The aims of this study were to analyze the performance of elderly on the CDT and evaluate inter-rater reliability of the CDT scored by using a specific algorithm method adapted from Sunderland et al. (1989). Methods We analyzed the CDT of 100 cognitively normal elderly aged 60 years or older. The CDT ("free-drawn") and Mini-Mental State Examination (MMSE) were administered to all participants. Six independent examiners scored the CDT of 30 participants to evaluate inter-rater reliability. Results and Conclusion A score of 5 on the proposed algorithm ("Numbers in reverse order or concentrated"), equivalent to 5 points on the original Sunderland scale, was the most frequent (53.5%). The CDT specific algorithm method used had high inter-rater reliability (p<0.01), and mean score ranged from 5.06 to 5.96. The high frequency of an overall score of 5 points may suggest the need to create more nuanced evaluation criteria, which are sensitive to differences in levels of impairment in visuoconstructive and executive abilities during aging. PMID:29213954
Horn, F K; Mardin, C Y; Bendschneider, D; Jünemann, A G; Adler, W; Tornow, R P
2011-01-01
Purpose To assess the combined diagnostic power of frequency-doubling technique (FDT) perimetry and retinal nerve fibre layer (RNFL) thickness measurements with spectral domain optical coherence tomography (SDOCT). Methods The study included 330 experienced participants in five age-related groups: 77 ‘preperimetric' open-angle glaucoma (OAG) patients, 52 ‘early' OAG, 50 ‘moderate' OAG, 54 ocular hypertensive patients, and 97 healthy subjects. For glaucoma assessment in all subjects, conventional perimetry, evaluation of fundus photographs, FDT-perimetry, and RNFL thickness measurement with SDOCT were performed. Glaucomatous visual field defects were classified using the Glaucoma Staging System. FDT evaluation used a published method with casewise calculation of an ‘FDT-score', including all missed localized probability levels. SDOCT evaluation used mean RNFL thickness and a new individual SDOCT-score considering normal confidence limits in 32 sectors of a peripapillary circular scan. To examine the joint value of both methods, a combined score was introduced. The significance of the difference between receiver-operating-characteristic (ROC) curves was calculated for a specificity of 96%. Results Sensitivity in the preperimetric glaucoma group was 44% for the SDOCT-score, 25% for the FDT-score, and 44% for the combined score; in the early glaucoma group 83, 81, and 89%, respectively; and in the moderate glaucoma group 94, 94, and 98%, respectively, all at a specificity of 96%. The ROC performance of the newly developed combined score is significantly above the single ROC curves of the FDT-score in preperimetric and early OAG and above RNFL thickness in moderate OAG. Conclusion Combining function and morphology by using the FDT-score and the SDOCT-score performs equally well as or better than each single method alone. PMID:21102494
Nalwadda, Gorrette; Tumwesigye, Nazarius M.; Faxelid, Elisabeth; Byamugisha, Josaphat; Mirembe, Florence
2011-01-01
Background Low and inconsistent use of contraceptives by young people contributes to unintended pregnancies. This study assessed the quality of contraceptive services for young people aged 15–24 in two rural districts in Uganda. Methods Five female and two male simulated clients (SCs) interacted with 128 providers at public, private not-for-profit (PNFP), and private for-profit (PFP) health facilities. After consultations, SCs were interviewed using a structured questionnaire. Six aspects of quality of care (client's needs, choice of contraceptive methods, information given to users, client-provider interpersonal relations, constellation of services, and continuity mechanisms) were assessed. Descriptive statistics and factor analysis were performed. Results Means and categorized quality scores for all aspects of quality were low in both public and private facilities. The lowest quality scores were observed in PFP facilities, and medium scores in PNFP facilities. The choice of contraceptive methods and interpersonal relations quality scores were slightly higher in public facilities. Needs assessment scores were highest in PNFP facilities. All facilities were classified as having low scores for appropriate constellation of services. Information given to users was suboptimal, and providers promoted specific contraceptive methods. Only a minority of providers offered the client's preferred method and showed respect for privacy. Conclusions The quality of contraceptive services provided to young people was low. Concurrent quality improvements and strengthening of health systems are needed. PMID:22132168
ERIC Educational Resources Information Center
Klein, Anna C.; Whitney, Douglas R.
Procedures and related issues involved in the application of trait-treatment interaction (TTI) to institutional research, in general, and to placement and proficiency testing, in particular, are discussed and illustrated. Traditional methods for choosing cut-off scores are compared and proposals for evaluating the results in the TTI framework are…
SU-F-T-423: Automating Treatment Planning for Cervical Cancer in Low- and Middle- Income Countries
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kisling, K; Zhang, L; Yang, J
Purpose: To develop and test two independent algorithms that automatically create the photon treatment fields for a four-field box beam arrangement, a common treatment technique for cervical cancer in low- and middle-income countries. Methods: Two algorithms were developed and integrated into Eclipse using its Advanced Programming Interface. 3D Method: We automatically segment bony anatomy on CT using an in-house multi-atlas contouring tool and project the structures into the beam's-eye-view. We identify anatomical landmarks on the projections to define the field apertures. 2D Method: We generate DRRs for all four beams. An atlas of DRRs for six standard patients with corresponding field apertures is deformably registered to the test patient DRRs. The set of deformed atlas apertures is fitted to an expected shape to define the final apertures. Both algorithms were tested on 39 patient CTs, and the resulting treatment fields were scored by a radiation oncologist. We also investigated the feasibility of using one algorithm as an independent check of the other. Results: 96% of the 3D-Method-generated fields and 79% of the 2D-Method-generated fields were scored acceptable for treatment (“Per Protocol” or “Acceptable Variation”). The 3D Method generated more fields scored “Per Protocol” than the 2D Method (62% versus 17%). The 4% of the 3D-Method-generated fields that were scored “Unacceptable Deviation” were all due to an improper L5 vertebra contour resulting in an unacceptable superior jaw position. When these same patients were planned with the 2D Method, the superior jaw was acceptable, suggesting that the 2D Method can be used to independently check the 3D Method. 
These algorithms will be implemented for fully automated cervical treatment planning.« less
Polyadenylation site prediction using PolyA-iEP method.
Kavakiotis, Ioannis; Tzanis, George; Vlahavas, Ioannis
2014-01-01
This chapter presents a method called PolyA-iEP that has been developed for the prediction of polyadenylation sites. More precisely, PolyA-iEP is a method that recognizes mRNA 3' ends that contain polyadenylation sites. It is a modular system consisting of two main components: the first exploits the advantages of emerging patterns, and the second is a distance-based scoring method. The outputs of the two components are finally combined by a classifier. The final results reach very high sensitivity and specificity.
Ertmer, David J.
2012-01-01
Purpose This investigation sought to determine whether scores from a commonly used word-based articulation test are closely associated with speech intelligibility in children with hearing loss. If the scores are closely related, articulation testing results might be used to estimate intelligibility. If not, the importance of direct assessment of intelligibility would be reinforced. Methods Forty-four children with hearing losses produced words from the Goldman-Fristoe Test of Articulation-2 and sets of 10 short sentences. Correlation analyses were conducted between scores for seven word-based predictor variables and percent-intelligible scores derived from listener judgments of stimulus sentences. Results Six of seven predictor variables were significantly correlated with percent-intelligible scores. However, regression analysis revealed that no single predictor variable or multivariable model accounted for more than 25% of the variability in intelligibility scores. Implications The findings confirm the importance of assessing connected speech intelligibility directly. PMID:20220022
[Equating scores using bridging stations on the clinical performance examination].
Yoo, Dong-Mi; Han, Jae-Jin
2013-06-01
This study examined the use of the Tucker linear equating method to produce individual students' scores in 3 groups with bridging stations over 3 consecutive days of the clinical performance examination (CPX) and compared the differences in scoring patterns by bridging number. Data were drawn from 88 examinees from 3 different CPX groups (DAY1, DAY2, and DAY3), each of which comprised 6 stations. Each group had 3 common (bridging) stations, and each group had 2 or 3 stations that differed from the other groups. DAY1 and DAY3 were equated to DAY2. Equated mean scores and standard deviations were compared with the originals. DAY1 and DAY3 were equated again, and the differences in scores (equated score minus raw score) were compared across the 3 sets of equated scores. By equating to DAY2, DAY1 decreased in mean score from 58.188 to 56.549 and changed in standard deviation from 4.991 to 5.046, and DAY3 fell in mean score from 58.351 to 58.057 and changed in standard deviation from 5.546 to 5.856, which demonstrates that the scores of examinees in DAY1 and DAY3 were adjusted after equating. The patterns in score differences among the equated sets for DAY1, DAY2, and DAY3 yielded information on the soundness of the equating results from individual and overall comparisons. To generate equated scores between the 3 groups on 3 consecutive days of the CPX, we applied the Tucker linear equating method. We also present a method for equating the other days to the anchoring day using the bridging stations.
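The mapping at the heart of any linear equating step can be sketched in a few lines. The snippet below is a minimal mean-sigma equating sketch, not the full Tucker method (which additionally uses the bridging-station scores to estimate synthetic-population moments); the day-wise score lists are hypothetical.

```python
import statistics as stats

def linear_equate(x, mean_x, sd_x, mean_ref, sd_ref):
    # Map a raw score x from one form onto the reference form's scale
    # so the equated scores take on the reference mean and standard
    # deviation (simplified mean-sigma equating).
    return mean_ref + (sd_ref / sd_x) * (x - mean_x)

# Hypothetical raw scores for two exam days sharing bridging stations
day1 = [52.1, 57.3, 61.0, 55.4, 58.9]
day2 = [55.0, 59.8, 62.2, 56.7, 60.3]

m1, s1 = stats.mean(day1), stats.stdev(day1)
m2, s2 = stats.mean(day2), stats.stdev(day2)

# Equate DAY1 scores onto the DAY2 (anchoring day) scale
equated = [linear_equate(x, m1, s1, m2, s2) for x in day1]
```

By construction, the equated DAY1 scores take on DAY2's mean and standard deviation; the Tucker method refines the target moments using the common stations rather than taking each day's observed moments at face value.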
A method to score radiographic change in psoriatic arthritis.
Wassenberg, S; Fischer-Kahle, V; Herborn, G; Rau, R
2001-06-01
Radiographic features of psoriatic arthritis (PsA) are very characteristic and differ from those observed in rheumatoid arthritis in two respects: (1) the distribution of affected joints (e.g., DIP joint involvement) and (2) the simultaneous presence of destructive changes and bone proliferation. A scoring method for PsA therefore has to account for these characteristics. To develop, describe, and validate a method for scoring radiographic changes in patients with PsA. Forty joints of the hands and feet are scored for destruction and proliferation. In the destruction score (DS), grading on a 0-5 scale is based on the amount of joint surface destruction: 0 = normal, 1 = one or more erosions with an interruption of the cortical plate of > 1 mm, with destruction of up to 10% of the total joint surface, 2 = 11-25%, 3 = 26-50%, 4 = 51-75%, 5 = > 75% joint surface destruction. The proliferation score (PS) sums up any kind of bony proliferation typical of PsA, graded 0-4: 0 = normal, 1 = bony proliferation of 1-2 mm or bone growth < 25% of the original size (diameter), 2 = bony proliferation of 2-3 mm or bone growth of 25-50%, 3 = bony proliferation > 3 mm or bone growth > 50%, 4 = bony ankylosis. The DS (0-200) and the PS (0-160) can be summed to give the total score (0-360). VALIDATION OF THE METHOD: To validate the method, x-rays of 20 patients with active PsA taken 3 years apart were read twice in pairs, with knowledge of the chronological order but without knowledge of the demographic, clinical, or laboratory data of the patients. The data were analyzed with a hierarchical analysis-of-variance model. There was good agreement between the first and second readings of the same rater and between the two raters regarding the destruction score. The agreement regarding the proliferation score was lower but still acceptable.
The reliability of the method to describe change over time, expressed as the ratio of progression (intra-patient variance) to measurement error (inter-rater variance), was 3.9 for the DS, 2.8 for the PS, and 4.1 for the total score. The minimal detectable change when the readings of two raters were compared (inter-rater MDC) was 5.8%, 5.0%, and 4.6%, respectively, of the maximum possible score for the destruction, proliferation, and total scores. These data compare well with the results of standard scoring methods in rheumatoid arthritis. We propose a method for scoring radiographic change in psoriatic arthritis that reliably quantifies the progression of the disease seen on radiographs.
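The per-joint grading rules quoted above translate directly into a scoring routine. The sketch below hard-codes the cut-offs from the abstract; the joint data are hypothetical, and the handling of sub-millimetre proliferation (grade 1) is an assumption on my part.

```python
def destruction_grade(pct):
    # Grade one joint 0-5 by % of joint surface destroyed,
    # using the cut-offs given in the abstract.
    if pct == 0:  return 0
    if pct <= 10: return 1
    if pct <= 25: return 2
    if pct <= 50: return 3
    if pct <= 75: return 4
    return 5

def proliferation_grade(mm, ankylosis=False):
    # Grade one joint 0-4 by millimetres of bony proliferation;
    # bony ankylosis scores 4 regardless of size.
    if ankylosis:  return 4
    if mm == 0:    return 0
    if mm <= 2:    return 1
    if mm <= 3:    return 2
    return 3

def psa_score(joints):
    # joints: (pct_destroyed, proliferation_mm) per joint, up to
    # 40 joints; returns (DS 0-200, PS 0-160, total 0-360).
    ds = sum(destruction_grade(p) for p, _ in joints)
    ps = sum(proliferation_grade(m) for _, m in joints)
    return ds, ps, ds + ps

ds, ps, total = psa_score([(5, 0), (30, 2.5), (80, 4.0)])
# grades: DS 1+3+5 = 9, PS 0+2+3 = 5, total 14
```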
[Cancer nursing care education programs: the effectiveness of different teaching methods].
Cheng, Yun-Ju; Kao, Yu-Hsiu
2012-10-01
In-service education directly affects the quality of cancer care. Using classroom teaching to deliver in-service education is often ineffective due to participants' large workloads and shift requirements. This study evaluated the learning effectiveness of different teaching methods in the dimensions of knowledge, attitude, and learning satisfaction. This study used a quasi-experimental design. Participants were cancer ward nurses working at one medical center in northern Taiwan. Participants were divided into an experimental group and a control group. The experimental group took an e-learning course and the control group took a standard classroom course using the same basic course material. Researchers evaluated the learning efficacy of each group using a questionnaire based on the quality of cancer nursing care learning effectiveness scale. All participants answered the questionnaire once before and once after completing the course. (1) Post-test "knowledge" scores were significantly higher than pre-test scores for both groups. Post-test "attitude" scores were significantly higher for the control group, while the experimental group reported no significant change. (2) After a covariance analysis of the pre-test scores for both groups, the post-test score for the experimental group was significantly lower than that of the control group in the knowledge dimension. Post-test scores did not differ significantly from pre-test scores for either group in the attitude dimension. (3) Post-test satisfaction scores did not differ significantly between the two groups with regard to teaching methods. The e-learning method, however, was demonstrated to be more flexible than classroom teaching. Study results demonstrate the importance of employing a variety of teaching methods to instruct clinical nursing staff. We suggest that both classroom teaching and e-learning methods be used to enhance the quality of cancer nursing care education programs.
We also encourage incorporating interactivity between student and instructor into e-learning course designs to enhance their effectiveness.
Assessment of calcium scoring performance in cardiac computed tomography.
Ulzheimer, Stefan; Kalender, Willi A
2003-03-01
Electron beam tomography (EBT) has been used for cardiac diagnosis and the quantitative assessment of coronary calcium since the late 1980s. The introduction of mechanical multi-slice spiral CT (MSCT) scanners with shorter rotation times opened new possibilities for cardiac imaging with conventional CT scanners. The purpose of this work was to qualitatively and quantitatively evaluate the performance of EBT and MSCT for the task of coronary artery calcium imaging as a function of acquisition protocol, heart rate, spiral reconstruction algorithm (where applicable), and calcium scoring method. A cardiac CT semi-anthropomorphic phantom was designed and manufactured for the investigation of all relevant image quality parameters in cardiac CT. This phantom includes various test objects, some of which can be moved within the anthropomorphic phantom in a manner that mimics realistic heart motion. These tools were used to qualitatively and quantitatively demonstrate the accuracy of coronary calcium imaging using typical protocols for an electron beam scanner (Evolution C-150XP, Imatron, South San Francisco, Calif.) and a 0.5-s four-slice spiral CT scanner (Sensation 4, Siemens, Erlangen, Germany). A special focus was put on the method of quantifying coronary calcium, and three scoring systems were evaluated (Agatston, volume, and mass scoring). Good reproducibility in coronary calcium scoring is always the result of a combination of high temporal and spatial resolution; consequently, thin-slice protocols in combination with retrospective gating on MSCT scanners yielded the best results. The Agatston score was found to be the least reproducible scoring method. The hydroxyapatite mass, being more reproducible, comparable across different scanners, and a physical quantitative measure, appears to be the method of choice for future clinical studies. The hydroxyapatite mass is highly correlated with the Agatston score.
The introduced phantoms can be used to quantitatively assess the performance characteristics of, for example, different scanners, reconstruction algorithms, and quantification methods in cardiac CT. This is especially important for quantitative tasks, such as the determination of the amount of calcium in the coronary arteries, to achieve high and constant quality in this field.
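For reference, the Agatston score that the phantom study found least reproducible is computed from per-lesion area and peak attenuation. The sketch below uses the standard density weights; the lesion list is hypothetical, and calibration details (slice thickness, kVp) are deliberately omitted.

```python
def agatston_weight(peak_hu):
    # Standard Agatston density factor from the lesion's peak CT number
    if peak_hu < 130: return 0   # below the calcium threshold
    if peak_hu < 200: return 1
    if peak_hu < 300: return 2
    if peak_hu < 400: return 3
    return 4

def agatston_score(lesions):
    # lesions: (area_mm2, peak_hu) per lesion per slice;
    # each contributes its area times its density weight.
    return sum(area * agatston_weight(hu) for area, hu in lesions)

score = agatston_score([(4.0, 250), (2.5, 410)])
# 4.0*2 + 2.5*4 = 18.0
```

The step-function weighting is one source of the score's poor reproducibility: a small change in peak HU near a threshold changes a lesion's contribution discontinuously, which the volume and mass scores avoid.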
Svetnik, Vladimir; Ma, Junshui; Soper, Keith A.; Doran, Scott; Renger, John J.; Deacon, Steve; Koblan, Ken S.
2007-01-01
Objective: To evaluate the performance of 2 automated systems, Morpheus and Somnolyzer24X7, with various levels of human review/editing, in scoring polysomnographic (PSG) recordings from a clinical trial using zolpidem in a model of transient insomnia. Methods: 164 all-night PSG recordings from 82 subjects, collected during 2 nights of sleep (one under placebo and one under zolpidem 10 mg treatment), were used. For each recording, 6 different methods were used to provide sleep stage scores based on Rechtschaffen & Kales criteria: 1) full manual scoring, 2) automated scoring by Morpheus, 3) automated scoring by Somnolyzer24X7, 4) automated scoring by Morpheus with full manual review, 5) automated scoring by Morpheus with partial manual review, 6) automated scoring by Somnolyzer24X7 with partial manual review. Ten traditional clinical efficacy measures of sleep initiation, maintenance, and architecture were calculated. Results: Pair-wise epoch-by-epoch agreements between fully automated and manual scores were in the range of intersite manual scoring agreements reported in the literature (70%-72%). Pair-wise epoch-by-epoch agreements between manually reviewed automated scores were higher (73%-76%). The direction and statistical significance of treatment effect sizes using traditional efficacy endpoints were essentially the same whichever method was used. As the degree of manual review increased, the magnitude of the effect size approached that estimated with fully manual scoring. Conclusion: Automated or semi-automated sleep PSG scoring offers a valuable alternative to costly, time-consuming manual scoring with its intrasite and intersite variability, especially in large multicenter clinical trials. The reduction in scoring variability may also reduce the required sample size of a clinical trial. Citation: Svetnik V; Ma J; Soper KA; Doran S; Renger JJ; Deacon S; Koblan KS.
Evaluation of automated and semi-automated scoring of polysomnographic recordings from a clinical trial using zolpidem in the treatment of insomnia. SLEEP 2007;30(11):1562-1574. PMID:18041489
Automated Agatston score computation in non-ECG gated CT scans using deep learning
NASA Astrophysics Data System (ADS)
Cano-Espinosa, Carlos; González, Germán; Washko, George R.; Cazorla, Miguel; San José Estépar, Raúl
2018-03-01
Introduction: The Agatston score is a well-established metric of cardiovascular disease related to clinical outcomes. It is computed from CT scans by (a) measuring the volume and intensity of the atherosclerotic plaques and (b) aggregating that information into an index. Objective: To build a convolutional neural network that takes a non-contrast chest CT scan as input and outputs the associated Agatston score directly, without a prior segmentation of coronary artery calcifications (CAC). Materials and methods: We use a database of 5973 non-contrast non-ECG gated chest CT scans for which the Agatston score has been manually computed. The heart in each scan is cropped automatically using an object detector. The database is split into 4973 cases for training and 1000 for testing. We train a 3D deep convolutional neural network to regress the Agatston score directly from the extracted hearts. Results: The proposed method yields a Pearson correlation coefficient of r = 0.93, p <= 0.0001 against the manual reference standard in the 1000 test cases. It further correctly stratifies 72.6% of the cases with respect to standard risk groups. This compares with more complex state-of-the-art methods based on prior segmentation of the CACs, which achieve r = 0.94 in ECG-gated pulmonary CT. Conclusions: A convolutional neural network can regress the Agatston score directly from an image of the heart, without prior segmentation of the CACs. This is a new and simpler paradigm for Agatston score computation that yields results similar to the state-of-the-art literature.
Leske, David A.; Hatt, Sarah R.; Liebermann, Laura; Holmes, Jonathan M.
2016-01-01
Purpose We compare two methods of analysis for Rasch scoring pre- to postintervention data: Rasch lookup table versus de novo stacked Rasch analysis using the Adult Strabismus-20 (AS-20). Methods One hundred forty-seven subjects completed the AS-20 questionnaire prior to surgery and 6 weeks postoperatively. Subjects were classified 6 weeks postoperatively as “success,” “partial success,” or “failure” based on angle and diplopia status. Postoperative change in AS-20 scores was compared for all four AS-20 domains (self-perception, interactions, reading function, and general function) overall and by success status using two methods: (1) applying historical Rasch threshold measures from lookup tables and (2) performing a stacked de novo Rasch analysis. Change was assessed by analyzing effect size, improvement exceeding 95% limits of agreement (LOA), and score distributions. Results Effect sizes were similar for all AS-20 domains whether obtained from lookup tables or stacked analysis. Similar proportions exceeded 95% LOAs using lookup tables versus stacked analysis. Improvement in median score was observed for all AS-20 domains using lookup tables and stacked analysis (P < 0.0001 for all comparisons). Conclusions The Rasch-scored AS-20 is a responsive and valid instrument designed to measure strabismus-specific health-related quality of life. When analyzing pre- to postoperative change in AS-20 scores, Rasch lookup tables and de novo stacked Rasch analysis yield essentially the same results. Translational Relevance We describe a practical application of lookup tables, allowing the clinician or researcher to score the Rasch-calibrated AS-20 questionnaire without specialized software. PMID:26933524
Patellar tendinopathy: late-stage results from surgical treatment☆
Cenni, Marcos Henrique Frauendorf; Silva, Thiago Daniel Macedo; do Nascimento, Bruno Fajardo; de Andrade, Rodrigo Cristiano; Júnior, Lúcio Flávio Biondi Pinheiro; Nicolai, Oscar Pinheiro
2015-01-01
Objective To evaluate the late-stage results from surgical treatment of patellar tendinopathy (PT), using the VISA score (Victorian Institute of Sport Tendon Study Group) and the Verheyden method. Methods This was a retrospective study in which the postoperative results from 12 patients (14 knees) operated on between July 2002 and February 2011 were evaluated. The patients included in the study presented patellar tendinopathy that was refractory to conservative treatment, without any other concomitant lesions. Patients who were not properly followed up during the postoperative period were excluded. Results Using the Verheyden method, nine patients were considered to have very good results, two had good results, and one had poor results. In relation to the VISA score, the mean was 92.4 points, and only two patients had scores below 70 points (66 and 55 points). Conclusion When surgical treatment for patellar tendinopathy is correctly indicated, it has good long-term results. PMID:26535202
A comparison of interteaching and lecture in the college classroom.
Saville, Bryan K; Zinn, Tracy E; Neef, Nancy A; Van Norman, Renee; Ferreri, Summer J
2006-01-01
Interteaching is a new method of classroom instruction that is based on behavioral principles but offers more flexibility than other behaviorally based methods. We examined the effectiveness of interteaching relative to a traditional form of classroom instruction: the lecture. In Study 1, participants in a graduate course in special education took short quizzes after alternating conditions of interteaching and lecture. Quiz scores following interteaching were higher than quiz scores following lecture, although both methods improved performance relative to pretest measures. In Study 2, we also alternated interteaching and lecture but counterbalanced the conditions across two sections of an undergraduate research methods class. After each unit of information, participants from both sections took the same test. Again, test scores following interteaching were higher than test scores following lecture. In addition, students correctly answered more interteaching-based questions than lecture-based questions on a cumulative final test. In both studies, the majority of students reported a preference for interteaching relative to traditional lecture. In sum, the results suggest that interteaching may be an effective alternative to traditional lecture-based methods of instruction.
Probabilistic determination of probe locations from distance data
Xu, Xiao-Ping; Slaughter, Brian D.; Volkmann, Niels
2013-01-01
Distance constraints, in principle, can be employed to determine information about the location of probes within a three-dimensional volume. Traditional methods for locating probes from distance constraints involve optimization of scoring functions that measure how well the probe location fits the distance data, exploring only a small subset of the scoring function landscape in the process. These methods are not guaranteed to find the global optimum and provide no means to relate the identified optimum to all other optima in scoring space. Here, we introduce a method for the location of probes from distance information that is based on probability calculus. This method allows exploration of the entire scoring space by directly combining probability functions representing the distance data and information about attachment sites. The approach is guaranteed to identify the global optimum and enables the derivation of confidence intervals for the probe location as well as statistical quantification of ambiguities. We apply the method to determine the location of a fluorescence probe using distances derived by FRET and show that the resulting location matches that independently derived by electron microscopy. PMID:23770585
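The core idea of scoring every candidate location by combining distance likelihoods, rather than descending to a single optimum, can be illustrated with a brute-force grid search. This is only an illustrative sketch: the paper's method works with full probability densities and attachment-site information, whereas the anchors, measurement error sigma, and grid below are all invented.

```python
import itertools
import math

def locate_probe(anchors, distances, sigma=0.5, lo=0.0, hi=2.0, step=0.5):
    # Score every grid point by the product of Gaussian likelihoods
    # of the measured anchor-to-probe distances, and return the
    # globally best-scoring point (no local-optimization trap).
    axis = [lo + i * step for i in range(int(round((hi - lo) / step)) + 1)]
    best_pt, best_score = None, 0.0
    for pt in itertools.product(axis, repeat=3):
        score = 1.0
        for anchor, d in zip(anchors, distances):
            r = math.dist(pt, anchor)
            score *= math.exp(-((r - d) ** 2) / (2 * sigma ** 2))
        if score > best_score:
            best_pt, best_score = pt, score
    return best_pt

# Four hypothetical attachment sites and exact distances to a known point
anchors = [(0, 0, 0), (2, 0, 0), (0, 2, 0), (0, 0, 2)]
true_loc = (1.0, 1.0, 1.0)
dists = [math.dist(true_loc, a) for a in anchors]
best = locate_probe(anchors, dists)
# best == (1.0, 1.0, 1.0)
```

Because every grid point is scored, the same scan also yields confidence regions and exposes ambiguous (multi-modal) solutions, which a single-optimum search would hide.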
Graphic report of the results from propensity score method analyses.
Shrier, Ian; Pang, Menglan; Platt, Robert W
2017-08-01
To increase transparency in studies reporting propensity scores by using graphical methods that clearly illustrate (1) the number of participant exclusions that occur as a consequence of the analytic strategy and (2) whether treatment effects are constant or heterogeneous across propensity scores. We applied graphical methods to a real-world pharmacoepidemiologic study that evaluated the effect of initiating statin medication on 1-year all-cause mortality post-myocardial infarction. We propose graphical methods to show the consequences of trimming and matching on the exclusion of participants from the analysis. We also propose the use of meta-analytical forest plots to show the magnitude of effect heterogeneity. A density plot with vertical lines demonstrated the proportion of subjects excluded because of trimming. A frequency plot with horizontal lines demonstrated the proportion of subjects excluded because of matching. An augmented forest plot illustrated the amount of effect heterogeneity present in the data. Our proposed techniques present additional and useful information that helps readers understand the sample that is analyzed with propensity score methods and whether effect heterogeneity is present. Copyright © 2017 Elsevier Inc. All rights reserved.
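The trimming step whose consequences the proposed density plot makes visible can be sketched simply. The interval [0.1, 0.9] below is a common but hypothetical trimming rule, and the propensity scores are made up.

```python
def trim_report(ps, lo=0.1, hi=0.9):
    # Apply a propensity-score trimming rule and report how many
    # subjects it excludes -- the information the proposed density
    # plot with vertical trimming lines conveys graphically.
    kept = [p for p in ps if lo <= p <= hi]
    excluded = len(ps) - len(kept)
    return kept, excluded, excluded / len(ps)

# Hypothetical estimated propensity scores for six subjects
ps = [0.05, 0.12, 0.4, 0.55, 0.72, 0.95]
kept, n_excl, frac = trim_report(ps)
# n_excl == 2 (the 0.05 and 0.95 subjects); frac ~= 0.333
```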
Wood, Jonathan S; Donnell, Eric T; Porter, Richard J
2015-02-01
A variety of study designs and analysis methods have been used to evaluate the performance of traffic safety countermeasures. The most common study designs and methods include observational before-after studies using the empirical Bayes method and cross-sectional studies using regression models. The propensity scores-potential outcomes framework has recently been proposed as an alternative traffic safety countermeasure evaluation method to address the challenges associated with the selection biases that can be part of cross-sectional studies. Crash modification factors derived from the application of all three methods have not yet been compared. This paper compares the results of retrospective, observational evaluations of a traffic safety countermeasure using both before-after and cross-sectional study designs. The paper describes the strengths and limitations of each method, focusing primarily on how each addresses site selection bias, a common issue in observational safety studies. The Safety Edge paving technique, which seeks to mitigate crashes related to roadway departure events, is the countermeasure used in the present study to compare the alternative evaluation methods. The results indicated that all three methods yielded results that were consistent with each other and with previous research. The empirical Bayes results had the smallest standard errors. It is concluded that the propensity scores with potential outcomes framework is a viable alternative analysis method to the empirical Bayes before-after study. It should be considered whenever a before-after study is not possible or practical. Copyright © 2014 Elsevier Ltd. All rights reserved.
Pezzuti, L; Nacinovich, R; Oggiano, S; Bomba, M; Ferri, R; La Stella, A; Rossetti, S; Orsini, A
2018-07-01
Individuals with Down syndrome generally show a floor effect on Wechsler Scales, manifested by flat profiles and by many or all of the weighted scores on the subtests being equal to 1. The main aim of the present paper is to use the statistical Hessl method and the extended statistical method of Orsini, Pezzuti and Hulbert with a sample of individuals with Down syndrome (n = 128; 72 boys and 56 girls) to highlight the variability of performance on Wechsler Intelligence Scale for Children-Fourth Edition subtests and indices, revealing strengths and weaknesses of this population that otherwise appear flattened. With the traditional transformation of raw scores into weighted scores, a very high percentage of subtests received a weighted score of 1 in the Down syndrome sample, with a floor effect and without any statistically significant difference between the four core Wechsler Intelligence Scale for Children-Fourth Edition indices. The results using the traditional transformation thus suggest a deep cognitive impairment in those with Down syndrome. Conversely, using the new statistical method, it is immediately apparent that the variability of the scores, on both subtests and indices, is wider than with the traditional method. Children with Down syndrome show greater ability in the Verbal Comprehension Index than in the Working Memory Index. © 2018 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
Peterson, Thomas A; Nehrt, Nathan L; Park, DoHwan
2012-01-01
Background and objective With recent breakthroughs in high-throughput sequencing, identifying deleterious mutations is one of the key challenges for personalized medicine. At the gene and protein level, it has proven difficult to determine the impact of previously unknown variants. A statistical method has been developed to assess the significance of disease mutation clusters on protein domains by incorporating domain functional annotations to assist in the functional characterization of novel variants. Methods Disease mutations aggregated from multiple databases were mapped to domains, and were classified as either cancer- or non-cancer-related. The statistical method for identifying significantly disease-associated domain positions was applied to both sets of mutations and to randomly generated mutation sets for comparison. To leverage the known function of protein domain regions, the method optionally distributes significant scores to associated functional feature positions. Results Most disease mutations are localized within protein domains and display a tendency to cluster at individual domain positions. The method identified significant disease mutation hotspots in both the cancer and non-cancer datasets. The domain significance scores (DS-scores) for cancer form a bimodal distribution with hotspots in oncogenes forming a second peak at higher DS-scores than non-cancer, and hotspots in tumor suppressors have scores more similar to non-cancers. In addition, on an independent mutation benchmarking set, the DS-score method identified mutations known to alter protein function with very high precision. Conclusion By aggregating mutations with known disease association at the domain level, the method was able to discover domain positions enriched with multiple occurrences of deleterious mutations while incorporating relevant functional annotations. 
The method can be incorporated into translational bioinformatics tools to characterize rare and novel variants within large-scale sequencing studies. PMID:22319177
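The clustering idea behind the paper can be illustrated with a uniform null model: if n domain-mapped mutations were spread evenly over L positions, the binomial tail probability of seeing k or more at one position flags a hotspot. This is a simplified stand-in for the paper's DS-score (which also incorporates functional annotations), and the mutation positions below are made up.

```python
from collections import Counter
import math

def hotspot_pvalue(k, n, positions):
    # P(X >= k) under Binomial(n, 1/positions): the chance that k or
    # more of n mutations land on one position if mutations were
    # distributed uniformly over the domain.
    p = 1.0 / positions
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

# Hypothetical aligned domain positions of observed disease mutations
muts = [12, 45, 12, 12, 88, 45, 12]
counts = Counter(muts)            # position -> mutation count
n, length = len(muts), 100        # 7 mutations over a 100-position domain
pvals = {pos: hotspot_pvalue(k, n, length) for pos, k in counts.items()}
# position 12 (4 hits) gets a far smaller p-value than the singletons
```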
Dilber, Daniel; Malcic, Ivan
2010-08-01
The Aristotle basic complexity score and the risk adjustment in congenital cardiac surgery-1 method were developed to compare outcomes of congenital cardiac surgery. Both methods were used to compare the results of procedures performed on our patients in Croatian cardiosurgical centres with the results of procedures performed abroad. The study population consisted of all patients with congenital cardiac disease born to Croatian residents between 1 October, 2002 and 1 October, 2007 who underwent a cardiovascular operation during this period. Of the 556 operations, the Aristotle basic complexity score could be assigned to 553 operations and the risk adjustment in congenital cardiac surgery-1 method to 536 operations. Procedures were performed in two institutions in Croatia and seven institutions abroad. The average complexity of cardiac procedures performed in Croatia was significantly lower. With both systems, as complexity increases, so do mortality before discharge and postoperative length of stay. Only after adjustment for complexity do marked differences in mortality and the occurrence of postoperative complications emerge. Both the Aristotle basic complexity score and the risk adjustment in congenital cardiac surgery-1 method were predictive of in-hospital mortality as well as prolonged postoperative length of stay, and can be used as tools in our country to evaluate a cardiosurgical model and recognise potential problems.
Computerized quantitative evaluation of mammographic accreditation phantom images
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Yongbum; Tsai, Du-Yih; Shinohara, Norimitsu
2010-12-15
Purpose: The objective was to develop and investigate an automated scoring scheme for American College of Radiology (ACR) mammographic accreditation phantom (RMI 156, Middleton, WI) images. Methods: The developed method consisted of background subtraction, determination of the region of interest, classification of fiber and mass objects by Mahalanobis distance, detection of specks by template matching, and rule-based scoring. Fifty-one phantom images were collected from 51 facilities for this study (one facility provided one image). A medical physicist and two radiologic technologists also scored the images. The human and computerized scores were compared. Results: In terms of meeting the ACR's criteria, the accuracies of the developed method for computerized evaluation of fibers, masses, and specks were 90%, 80%, and 98%, respectively. Contingency table analysis revealed a significant association between observer and computer scores for microcalcifications (p < 0.05) but not for masses and fibers. Conclusions: The developed method may provide a stable assessment of test-object visibility in mammographic accreditation phantom images with respect to whether an image meets the ACR's criteria, although there is room for improvement in the approach for fiber and mass objects.
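The fiber/mass classification step relies on Mahalanobis distance to class statistics. The sketch below uses a diagonal covariance and invented class means and variances; the paper's classifier would use the full covariance estimated from its training features.

```python
import math

def mahalanobis_diag(x, mean, var):
    # Mahalanobis distance under a diagonal covariance: each feature
    # deviation is scaled by that feature's variance before summing.
    return math.sqrt(sum((xi - mi) ** 2 / vi
                         for xi, mi, vi in zip(x, mean, var)))

# Hypothetical (length_mm, contrast) statistics for two object types
classes = {
    "fiber": ([10.0, 1.5], [4.0, 0.25]),
    "mass":  ([30.0, 8.0], [25.0, 4.0]),
}

def classify(x):
    # Assign the feature vector to the nearest class in
    # Mahalanobis distance.
    return min(classes, key=lambda c: mahalanobis_diag(x, *classes[c]))

print(classify([12.0, 1.8]))   # closest to the fiber statistics
```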
Measuring the degree of integration for an integrated service network
Ye, Chenglin; Browne, Gina; Grdisa, Valerie S; Beyene, Joseph; Thabane, Lehana
2012-01-01
Background Integration involves the coordination of services provided by autonomous agencies and improves the organization and delivery of multiple services for target patients. Current measures generally do not distinguish between agencies' perception and expectation. We propose a method for quantifying agencies' service integration. Using data from the Children's Treatment Network (CTN), we aimed to measure the degree of integration of the CTN agencies in York and Simcoe. Theory and methods We quantified integration as the agreement between perceived and expected levels of involvement and calculated four scores from different perspectives for each agency. We used the average score to measure global network integration and examined the sensitivity of the global score. Results Most agencies' integration scores were <65%. As measured by the agreement between every other agency's perception and expectation, the overall integration of CTN in Simcoe and York was 44% (95% CI: 39%–49%) and 52% (95% CI: 48%–56%), respectively. The sensitivity analysis showed that the global scores were robust. Conclusion Our method extends existing measures of integration and shows good validity. It can also be applied to monitoring improvement and to linking integration with other outcomes. PMID:23593050
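The agreement idea behind the integration score can be sketched as an exact-match rate between perceived and expected involvement levels; the paper's actual measure may treat partial agreement differently, and the involvement levels and number of partner agencies below are hypothetical.

```python
def integration_score(perceived, expected):
    # Fraction of partner agencies for which the perceived level of
    # involvement exactly matches the expected level (a simplified
    # exact-match version of the agreement measure).
    matches = sum(p == e for p, e in zip(perceived, expected))
    return matches / len(perceived)

# Hypothetical involvement levels (0 = none ... 3 = full) with 5 partners
score = integration_score([2, 1, 3, 0, 2], [2, 2, 3, 1, 2])
# 3 of 5 levels match, so the agency's integration score is 0.6
```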
Teaching Fashion Illustration to University Students: Experiential and Expository Methods.
ERIC Educational Resources Information Center
Dragoo, Sheri; Martin, Ruth E.; Horridge, Patricia
1998-01-01
In a fashion illustration course, 24 students were taught using expository methods and 28 with experiential methods. Each method involved 20 lessons over eight weeks. Pre/posttest results indicated that both methods were equally effective in improving scores. (SK)
Lesson plan profile of senior high school biology teachers in Subang
NASA Astrophysics Data System (ADS)
Rohayati, E.; Diana, S. W.; Priyandoko, D.
2018-05-01
Lesson plans play an important role in biology teachers’ teaching and learning process. This study aimed to gain an overview of the lesson plans of biology teachers at senior high schools in Subang who were members of the biology teachers’ association in Subang. The research method was descriptive, with data collected from 30 biology teachers. The results showed that, in terms of subject identity, the lesson plans were in the good category with an average score of 83.33%. Analysis of basic competence was in the fair category with an average score of 74.45%. The compatibility of method/strategy was in the fair category with an average score of 72.22%. The compatibility of instruments, media, and learning resources was in the fair category with an average score of 71.11%. Learning scenarios were in the good category with an average score of 77.00%. The compatibility of evaluation was in the low category with an average score of 56.39%. It can be concluded that biology teachers in Subang were reasonably good at making lesson plans, although the compatibility of evaluation needs to be improved. Teacher training through the biology teachers’ association is recommended to increase teachers’ skills as professionals.
Vijarnsorn, Chodchanok; Laohaprasitiporn, Duangmanee; Durongpisitkul, Kritvikrom; Chantong, Prakul; Soongswang, Jarupim; Cheungsomprasong, Paweena; Nana, Apichart; Sriyoschati, Somchai; Subtaweesin, Thawon; Thongcharoen, Punnarerk; Prakanrattana, Ungkab; Krobprachya, Jiraporn; Pooliam, Julaporn
2011-01-01
Objectives. To determine in-hospital mortality and complications of cardiac surgery in pediatric patients and identify predictors of hospital mortality. Methods. Records of pediatric patients who had undergone cardiac surgery in 2005 were reviewed retrospectively. The risk adjustment for congenital heart surgery (RACHS-1) method, the Aristotle basic complexity score (ABC score), and the Society of Thoracic Surgeons and the European Association for Cardiothoracic Surgery mortality score (STS-EACTS score) were used as measures. Potential predictors were analyzed by risk analysis. Results. 230 pediatric patients had undergone congenital cardiac surgery. Overall in-hospital mortality was 6.1%. ROC analysis of the RACHS-1 categories, ABC levels, and STS-EACTS categories gave validities of 0.78, 0.74, and 0.67, respectively. Mortality risks were found at the high complexity levels of the three tools, bypass time >85 min, and cross-clamp time >60 min. Common morbidities were postoperative pyrexia, bleeding, and pleural effusion. Conclusions. Overall mortality was 6.1%. The RACHS-1 method, ABC score, and STS-EACTS score were helpful for risk stratification. PMID:21738856
Hegazy, Galal; Safwat, Hesham; Seddik, Mahmoud; Al-shal, Ehab A.; Al-Sebai, Ibrahim; Negm, Mohame
2016-01-01
Background: The optimal operative method for acromioclavicular joint reconstruction remains controversial. The modified Weaver-Dunn method is one of the most popular. Anatomic reconstruction of the coracoclavicular ligaments with autogenous tendon grafts, widely used in treating chronic acromioclavicular joint instability, reportedly diminishes pain, eliminates sequelae, and improves function as well as strength. Objective: To compare clinical and radiologic outcomes between a modified Weaver-Dunn procedure and an anatomic coracoclavicular ligament reconstruction technique using autogenous semitendinosus tendon graft. Methods: Twenty patients (mean age, 39 years) with painful, chronic Rockwood type III acromioclavicular joint dislocations underwent surgical reconstruction. In ten patients, a modified Weaver-Dunn procedure was performed; in the other ten, an autogenous semitendinosus tendon graft was used. The mean time between injury and the index procedure was 18 months (range, 9–28). Clinical evaluation was performed using the Oxford Shoulder Score and Nottingham Clavicle Score after a mean follow-up of 27.8 months. Preoperative and postoperative radiographs were compared. Results: In the Weaver-Dunn group, the Oxford Shoulder Score improved from 25±4 to 40±2 points, while the Nottingham Clavicle Score increased from 48±7 to 84±11. In the semitendinosus tendon graft group, the Oxford Shoulder Score improved from 25±3 to 50±2 points and the Nottingham Clavicle Score from 48±8 to 95±8. Conclusion: Acromioclavicular joint reconstruction using the semitendinosus tendon graft achieved better Oxford Shoulder and Nottingham Clavicle Scores than the modified Weaver-Dunn procedure. PMID:27347245
Food safety in food services in Lombardy: proposal for an inspection-scoring model.
Balzaretti, Claudia M; Razzini, Katia; Ziviani, Silvia; Ratti, Sabrina; Milicevic, Vesna; Chiesa, Luca M; Panseri, Sara; Castrica, Marta
2017-10-20
The purpose of this study was to elaborate a checklist with an inspection scoring system at the national level in order to assess compliance with the sanitary hygiene requirements of food services. The inspection scoring system was elaborated taking into account the guidelines drawn up by the NYC Department of Health and Mental Hygiene. Moreover, the checklist was used simultaneously with the standard inspection protocol adopted by Servizio Igiene Alimenti Nutrizione (Ss. I.A.N) and defined by D.G.R 6 March 2017 - n. X/6299 Lombardy Region. The Ss. I.A.N protocol consists of qualitative responses, from which we generated a new protocol with three different gradings: A, B and C. The designed checklist was divided into 17 sections, each corresponding to prerequisites to be verified during the inspection. Every section includes the type of conformity to check and the type of violation: critical or general. Failure to meet the expected compliance generates 4 severity levels that correspond to score classes. A total of 7 food services were checked with the two different inspection methods. The checklist generated a food safety score for each food service ranging from 0.0 (no flaws observed) to 187.2, with three grading classes: A (0.0-28.0); B (29.0-70.0) and C (>71.0). The results of the Ss. I.A.N grading method and the checklist show a positive correlation (r=0.94, P<0.01), suggesting that the methods are comparable. Moreover, our scoring checklist is simpler than the standard method and also allows managers to perform effective surveillance programs in food services.
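The score-to-grade mapping reported above can be expressed directly; how scores falling in the unstated gaps (28-29 and 70-71) are classed is an assumption here:

```python
def grade(score):
    """Map a checklist food-safety score to the reported classes:
    A (0.0-28.0), B (29.0-70.0), C (>71.0). Scores in the unstated
    gaps (28-29, 70-71) are assigned to the higher (worse) class
    here as an assumption."""
    if score <= 28.0:
        return "A"
    if score <= 70.0:
        return "B"
    return "C"

print(grade(0.0), grade(45.5), grade(187.2))  # A B C
```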
2010-01-01
Background The purpose of this study was to reduce the number of items, create a scoring method and assess the psychometric properties of the Freedom from Glasses Value Scale (FGVS), which measures benefits of freedom from glasses perceived by cataract and presbyopic patients after multifocal intraocular lens (IOL) surgery. Methods The 21-item FGVS, developed simultaneously in French and Spanish, was administered by phone during an observational study to 152 French and 152 Spanish patients who had undergone cataract or presbyopia surgery at least 1 year before the study. Reduction of items and creation of the scoring method employed statistical methods (principal component analysis, multitrait analysis) and content analysis. Psychometric properties (validation of the structure, internal consistency reliability, and known-group validity) of the resulting version were assessed in the pooled population and per country. Results One item was deleted and 3 were kept but not aggregated in a dimension. The other 17 items were grouped into 2 dimensions ('global evaluation', 9 items; 'advantages', 8 items) and divided into 5 sub-dimensions, with higher scores indicating higher benefit of surgery. The structure was validated (good item convergent and discriminant validity). Internal consistency reliability was good for all dimensions and sub-dimensions (Cronbach's alphas above 0.70). The FGVS was able to discriminate between patients wearing glasses or not after surgery (higher scores for patients not wearing glasses). FGVS scores were significantly higher in Spain than France; however, the measure had similar psychometric performances in both countries. Conclusions The FGVS is a valid and reliable instrument measuring benefits of freedom from glasses perceived by cataract and presbyopic patients after multifocal IOL surgery. PMID:20497555
Pavlidis, Paul; Qin, Jie; Arango, Victoria; Mann, John J; Sibille, Etienne
2004-06-01
One of the challenges in the analysis of gene expression data is placing the results in the context of other data available about genes and their relationships to each other. Here, we approach this problem in the study of gene expression changes associated with age in two areas of the human prefrontal cortex, comparing two computational methods. The first method, "overrepresentation analysis" (ORA), is based on statistically evaluating the fraction of genes in a particular gene ontology class found among the set of genes showing age-related changes in expression. The second method, "functional class scoring" (FCS), examines the statistical distribution of individual gene scores among all genes in the gene ontology class and does not involve an initial gene selection step. We find that FCS yields more consistent results than ORA, and the results of ORA depended strongly on the gene selection threshold. Our findings highlight the utility of functional class scoring for the analysis of complex expression data sets and emphasize the advantage of considering all available genomic information rather than sets of genes that pass a predetermined "threshold of significance."
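The ORA step described above amounts to a hypergeometric tail test on the overlap between the selected gene set and a gene ontology class (FCS, by contrast, uses all gene scores in the class with no selection threshold). A sketch with toy counts, not the study's data:

```python
from math import comb

def ora_pvalue(N, K, n, k):
    """Hypergeometric upper-tail probability of seeing >= k genes from
    a GO class of size K when n genes are selected out of N total."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / comb(N, n)

# Toy counts: 20 of 1000 background genes are in the class; 50 genes
# pass the age-related-change threshold and 5 of them are in the class.
print(ora_pvalue(1000, 20, 50, 5))
```

Because the test is applied only to genes passing the selection step, its outcome moves with the threshold, which is consistent with the authors' observation that ORA results depend strongly on the gene selection threshold.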
A Comparison of Two Scoring Methods for an Automated Speech Scoring System
ERIC Educational Resources Information Center
Xi, Xiaoming; Higgins, Derrick; Zechner, Klaus; Williamson, David
2012-01-01
This paper compares two alternative scoring methods--multiple regression and classification trees--for an automated speech scoring system used in a practice environment. The two methods were evaluated on two criteria: construct representation and empirical performance in predicting human scores. The empirical performance of the two scoring models…
Common Clinical Practice versus new PRIM Score in Predicting Coronary Heart Disease Risk
Frikke-Schmidt, Ruth; Tybjærg-Hansen, Anne; Schnohr, Peter; Jensen, Gorm B.; Nordestgaard, Børge G.
2011-01-01
Objectives To compare the new Patient Rule Induction Method (PRIM) Score and common clinical practice with the Framingham Point Score for classification of individuals with respect to coronary heart disease (CHD) risk. Methods and Results PRIM Score and the Framingham Point Score were estimated for 11,444 participants from the Copenhagen City Heart Study. Gender-specific cumulative incidences and 10-year absolute CHD risks were estimated for subsets defined by age, total cholesterol, high-density lipoprotein (HDL) cholesterol, blood pressure, diabetes, and smoking categories. PRIM defined seven mutually exclusive subsets in women and men, with cumulative incidences of CHD from 0.01 to 0.22 in women and from 0.03 to 0.26 in men. PRIM versus the Framingham Point Score found 11% versus 4% of all women, and 31% versus 35% of all men, to have 10-year CHD risks >20%. Among women ≥65 years with hypertension and/or diabetes, a 10-year CHD risk >20% was found for 100% with PRIM scoring but for only 18% with the Framingham Point Score. Conclusion Compared to the PRIM Score, common clinical practice with the Framingham Point Score underestimates CHD risk in women, especially in women ≥65 years with hypertension and/or diabetes. PMID:20728887
Evaluation of a New Scoring System for Retinal Nerve Fiber Layer Photography Using HRA1 in 964 Eyes
Hong, Samin; Moon, Jong Wook; Ha, Seung Joo; Kim, Chan Yun; Seong, Gong Je
2007-01-01
Purpose To evaluate retinal nerve fiber layer (RNFL) defects using a new scoring system for RNFL photography with the Heidelberg Retina Angiograph 1 (HRA1). Methods This retrospective study included 128 healthy eyes and 836 primary open-angle glaucoma eyes. RNFL photographs taken with HRA1 were interpreted using a new scoring system and correlated with visual field indices of standard automated perimetry (SAP). The scoring system was based on the presence, darkness, width, and location of RNFL defects. Results The mean RNFL defect score I in the early, moderate, severe, and control groups was 7.3, 9.2, 10.4, and 3.6, respectively. The mean RNFL defect score II in the early, moderate, severe, and control groups was 14.5, 28.5, 43.4, and 3.4, respectively. The correlation between RNFL defect score II and the mean deviation of SAP was the strongest of the various combinations (r=-0.675, P<.001). Conclusions With this new scoring system, we propose a method for semi-quantitative interpretation of RNFL photographs. The scoring system may help distinguish between normal and glaucomatous eyes, and the score is associated with the severity of visual field loss. PMID:18063886
Apply lightweight recognition algorithms in optical music recognition
NASA Astrophysics Data System (ADS)
Pham, Viet-Khoi; Nguyen, Hai-Dang; Nguyen-Khac, Tung-Anh; Tran, Minh-Triet
2015-02-01
Digitizing and transforming musical scores into machine-readable formats helps people to enjoy and learn music, conserves music sheets, and can even assist composers. However, the results of existing methods still require improvement for higher accuracy. The authors therefore propose lightweight algorithms for Optical Music Recognition to help people recognize and automatically play musical scores. In the proposal, after removing staff lines and extracting symbols, each music symbol is represented as a grid of identical M ∗ N cells, and the features are extracted and classified with multiple lightweight SVM classifiers. Through experiments, the authors find that a grid of 10 ∗ 12 cells yields the highest precision. Experimental results on a dataset of 4929 music symbols taken from 18 modern music sheets in the Synthetic Score Database show that the proposed method classifies printed musical scores with accuracy up to 99.56%.
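The grid-of-cells representation can be sketched as follows: each symbol image is divided into rows × cols cells, and the ink density of each cell becomes one feature. The downstream SVM classification is omitted, and the input format is an assumption:

```python
def grid_features(symbol, rows=10, cols=12):
    """Ink density of each cell in a rows x cols grid over a binary
    symbol image (list of 0/1 rows); 10*12 was the paper's best grid."""
    h, w = len(symbol), len(symbol[0])
    feats = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            cell = [symbol[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            feats.append(sum(cell) / len(cell) if cell else 0.0)
    return feats

# A solid 20x24 "symbol": every cell is fully inked.
feats = grid_features([[1] * 24 for _ in range(20)])
print(len(feats), feats[0])  # 120 1.0
```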
Automatic Diagnosis of Obstructive Sleep Apnea/Hypopnea Events Using Respiratory Signals.
Aydoğan, Osman; Öter, Ali; Güney, Kerim; Kıymık, M Kemal; Tuncel, Deniz
2016-12-01
Obstructive Sleep Apnea is a sleep disorder that may lead to various health consequences. While some studies have used real-time systems, numerous others focus on diagnosing Obstructive Sleep Apnea from signals obtained by polysomnography from apnea patients who spend the night in a sleep laboratory. The mean, frequency, and power of the signals obtained from patients are frequently used. In this study, recordings from 74 Obstructive Sleep Apnea patients were scored. A visual-scoring-based algorithm and a morphological filter with Artificial Neural Networks were used to diagnose Obstructive Sleep Apnea. The total scoring accuracy of both methods was calculated and compared with the visual scoring performed by the physician. The visual-scoring-based algorithm reached an average accuracy of 88.33%, while the Artificial Neural Network and morphological filter method reached 87.28%. Scoring success was also analyzed after grouping by apnea/hypopnea. Both methods may enable physicians to reduce the time and cost of diagnosing Obstructive Sleep Apnea, with ease of use.
Detection of vehicle parts based on Faster R-CNN and relative position information
NASA Astrophysics Data System (ADS)
Zhang, Mingwen; Sang, Nong; Chen, Youbin; Gao, Changxin; Wang, Yongzhong
2018-03-01
Detection and recognition of vehicles are two essential tasks in intelligent transportation systems (ITS). A prevalent approach is to first detect the vehicle body, logo, or license plate, and then recognize it, so detection is the most basic and also the most important task. Besides the logo and license plate, other parts, such as the vehicle face, lamps, windshield, and rearview mirrors, are also key parts that reflect the characteristics of a vehicle and can be used to improve the accuracy of the recognition task. In this paper, the detection of vehicle parts, a novel task, is studied. We choose Faster R-CNN as the basic algorithm and take as input the local area of an image where the vehicle body is located, obtaining multiple bounding boxes, each with its own score. Directly choosing the box with the maximum score as the final result is often suboptimal, especially for small objects. This paper presents a method that corrects the original score with relative position information between two parts; the box with the maximum comprehensive score is then chosen as the final result. Compared with the original output strategy, the proposed method performs better.
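The score-correction idea can be sketched as follows; the Gaussian position penalty and the multiplicative blend are illustrative assumptions, not the paper's exact formula:

```python
import math

def corrected_score(box, score, expected_center, sigma=0.2):
    """Scale a detection score by a Gaussian penalty on the distance
    between the box centre and the position expected relative to the
    vehicle body (both in body-normalised coordinates)."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    d2 = (cx - expected_center[0]) ** 2 + (cy - expected_center[1]) ** 2
    return score * math.exp(-d2 / (2 * sigma ** 2))

# Two candidate lamp boxes: the lower-scored one lies where a lamp is
# expected (near the bottom-left of the body), so it wins after correction.
candidates = [((0.4, 0.4, 0.6, 0.6), 0.90), ((0.05, 0.75, 0.25, 0.95), 0.80)]
expected = (0.15, 0.85)
best = max(candidates, key=lambda bs: corrected_score(bs[0], bs[1], expected))
print(best[0])  # (0.05, 0.75, 0.25, 0.95)
```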
Li, R; Li, C T; Zhao, S M; Li, H X; Li, L; Wu, R G; Zhang, C C; Sun, H Y
2017-04-01
To establish a query table of IBS critical values and identification power for detection systems with different numbers of STR loci under different false-judgment standards. Samples from 267 pairs of full siblings and 360 pairs of unrelated individuals were collected, and 19 autosomal STR loci were genotyped with the Goldeneye™ 20A system. Full siblings were determined using the IBS scoring method according to the 'Regulation for biological full sibling testing'. The critical values and identification power for detection systems with different numbers of STR loci under different false-judgment standards were calculated theoretically. According to the formal IBS scoring criteria, the identification power for full siblings and unrelated individuals was 0.7640 and the rate of false judgment was 0. The results of the theoretical calculation were consistent with the sample observations. A query table of IBS critical values for full-sibling detection systems with different numbers of STR loci was successfully established. The IBS scoring method defined by the regulation has high detection efficiency and a low false-judgment rate, providing a relatively conservative result. The query table of IBS critical values provides important reference data for judging the results of full-sibling testing and has considerable practical value. Copyright© by the Editorial Department of Journal of Forensic Medicine
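The per-locus identity-by-state count underlying IBS scoring can be sketched as below; the locus names and genotypes are hypothetical, and the regulation's actual critical-value comparison is omitted:

```python
def ibs_score(profile1, profile2):
    """Sum over shared STR loci of the number of alleles identical by
    state (0, 1, or 2 per locus) between two genotype profiles."""
    total = 0
    for locus in profile1.keys() & profile2.keys():
        pool = list(profile1[locus])
        for allele in profile2[locus]:
            if allele in pool:
                pool.remove(allele)
                total += 1
    return total

# Hypothetical 3-locus profiles (real systems type 19+ loci).
p1 = {"D3S1358": (15, 16), "vWA": (17, 17), "FGA": (21, 24)}
p2 = {"D3S1358": (15, 17), "vWA": (17, 18), "FGA": (22, 23)}
print(ibs_score(p1, p2))  # 2
```

Classifying a pair as full siblings would then compare this total against the tabulated critical value for the number of loci typed.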
Methods and statistics for combining motif match scores.
Bailey, T L; Gribskov, M
1998-01-01
Position-specific scoring matrices are useful for representing and searching for protein sequence motifs. A sequence family can often be described by a group of one or more motifs, and an effective search must combine the scores for matching a sequence to each of the motifs in the group. We describe three methods for combining match scores and estimating the statistical significance of the combined scores, and we evaluate the search quality (classification accuracy) and the accuracy of the estimate of statistical significance of each. The three methods are: 1) sum of scores, 2) sum of reduced variates, and 3) product of score p-values. We show that method 3) is superior to the other two in both regards, and that combining motif scores indeed gives better search accuracy. The MAST sequence homology search algorithm utilizing the product of p-values scoring method is available for interactive use and downloading at http://www.sdsc.edu/MEME.
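Method 3), the product of score p-values, has a closed-form combined significance: for n independent uniform p-values, P(product ≤ x) = x · Σ_{i=0}^{n-1} (−ln x)^i / i!. A sketch of that computation:

```python
import math

def combined_pvalue(pvalues):
    """Significance of the product of n independent uniform p-values:
    P(prod <= x) = x * sum_{i=0}^{n-1} (-ln x)**i / i!"""
    x = math.prod(pvalues)
    return x * sum((-math.log(x)) ** i / math.factorial(i)
                   for i in range(len(pvalues)))

print(round(combined_pvalue([0.05]), 4))        # 0.05
print(round(combined_pvalue([0.05, 0.05]), 4))  # 0.0175
```

Note that the raw product alone would overstate significance; the correction term accounts for the number of motifs combined.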
Song, Hyun Seok
2011-01-01
Background This study compared the results of patients treated for ulnar impaction syndrome using an ulnar shortening osteotomy (USO) alone with those treated with combined arthroscopic debridement and USO. Methods The results of 27 wrists were reviewed retrospectively. They were divided into three groups: group A (USO alone, 10 cases), group B (combined arthroscopic debridement and USO, 9 cases), and group C (arthroscopic triangular fibrocartilage complex [TFCC] debridement alone, 8 cases). Wrist function was evaluated using the modified Mayo wrist score, the disabilities of the arm, shoulder and hand (DASH) score, and the Chun and Palmer grading system. Results The modified Mayo wrist score in groups A, B, and C was 74.5 ± 8.9, 73.9 ± 11.6, and 61.3 ± 10.2, respectively (p < 0.05). The DASH score in groups A, B, and C was 15.6 ± 11.8, 19.3 ± 11.9, and 33.2 ± 8.5, respectively (p < 0.05). The average Chun and Palmer grading score in groups A and B was 85.7 ± 8.9 and 84.7 ± 6.7, respectively. The differences in the Mayo wrist score, DASH score, and Chun and Palmer grading score between groups A and B were not significant (p > 0.05). Conclusions Both USO alone and combined arthroscopic TFCC debridement with USO improved wrist function and reduced the level of pain in patients treated for ulnar impaction syndrome. USO alone may be the preferred method of treatment if the torn flap of the TFCC is not unstable. PMID:21909465
Two-Way Gene Interaction From Microarray Data Based on Correlation Methods.
Alavi Majd, Hamid; Talebi, Atefeh; Gilany, Kambiz; Khayyer, Nasibeh
2016-06-01
Gene networks have generated a massive explosion in the development of high-throughput techniques for monitoring various aspects of gene activity. Networks offer a natural way to model interactions between genes, and extracting gene network information from high-throughput genomic data is an important and difficult task. The purpose of this study was to construct a two-way gene network based on parametric and nonparametric correlation coefficients. The first step in constructing a gene co-expression network is to score all pairs of gene vectors; the second step is to select a score threshold and connect all gene pairs whose scores exceed this value. In this foundation-application study, we constructed two-way gene networks using nonparametric methods, such as Spearman's rank correlation coefficient and Blomqvist's measure, and compared them with Pearson's correlation coefficient. We surveyed six genes related to venous thrombosis, built a matrix whose entries represent the score for the corresponding gene pair, and obtained two-way interactions using Pearson's correlation, Spearman's rank correlation, and Blomqvist's coefficient. Finally, these methods were compared with visual methods: Cytoscape, based on BIND, and Gene Ontology, based on molecular function; R software version 3.2 and Bioconductor were used to perform these methods. Based on the Pearson and Spearman correlations, the results were the same and were confirmed by the Cytoscape and GO visual methods; however, Blomqvist's coefficient was not confirmed by the visual methods. Some correlation results did not agree with the visualizations, possibly because of the small amount of data.
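The two construction steps, scoring all gene pairs and thresholding, can be sketched as below; the gene names, expression values, and threshold are illustrative, and the tie-free rank trick stands in for a full Spearman implementation:

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson on the ranks (assumes no ties)."""
    rank = lambda v: [sorted(v).index(e) for e in v]
    return pearson(rank(x), rank(y))

def coexpression_edges(expr, threshold=0.8, corr=pearson):
    """Connect gene pairs whose absolute correlation exceeds the threshold."""
    genes = list(expr)
    return [(g, h) for i, g in enumerate(genes) for h in genes[i + 1:]
            if abs(corr(expr[g], expr[h])) > threshold]

# Toy expression vectors across five samples for three gene names.
expr = {"F5": [1, 2, 3, 4, 5], "F2": [2, 4, 6, 8, 10], "MTHFR": [5, 1, 4, 2, 3]}
print(coexpression_edges(expr))                 # [('F5', 'F2')]
print(coexpression_edges(expr, corr=spearman))  # [('F5', 'F2')]
```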
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhu, W; Wu, Q; Yuan, L
Purpose: To improve the robustness of a knowledge-based automatic lung IMRT planning method and to further validate the reliability of this algorithm by applying it to the planning of clinical cases with non-coplanar beams. Methods: A lung IMRT planning method which automatically determines both plan optimization objectives and beam configurations with non-coplanar beams has been reported previously. A beam efficiency index map is constructed to guide beam angle selection in this algorithm. This index takes into account both the dose contributions from individual beams and the combined effect of multiple beams, which is represented by a beam separation score. We studied the effect of this beam separation score on plan quality and determined the optimal weight for this score. Fourteen clinical plans were re-planned with the knowledge-based algorithm. Significant dosimetric metrics for the PTV and OARs in the automatic plans were compared with those in the clinical plans by the two-sample t-test. In addition, a composite dosimetric quality index was defined to obtain the relationship between plan quality and the beam separation score. Results: On average, we observed more than 15% reduction in conformity index and homogeneity index for the PTV and in V40 and V60 for the heart, with 8% and 3% increases in V5 and V20 for the lungs, respectively. The variation curve of the composite index as a function of the angle spread score shows that 0.6 is the best value for the weight of the beam separation score. Conclusion: The optimal value for the beam angle spread score in automatic lung IMRT planning was obtained. With this value, the model can produce statistically the "best" achievable plans. This method can potentially improve the quality and planning efficiency of IMRT plans with non-coplanar angles.
Scoring the importance of tropical forest landscapes with local people: patterns and insights.
Sheil, Douglas; Liswanti, Nining
2006-07-01
Good natural resource management is scarce in many remote tropical regions. Improved management requires better local consultation, but accessing and understanding the preferences and concerns of stakeholders can be difficult. Scoring, where items are numerically rated in relation to each other, is simple and seems applicable even in situations where capacity and funds are limited, but managers rarely use such methods. Here we investigate scoring with seven indigenous communities threatened by forest loss in Kalimantan, Indonesia. We aimed to clarify the forest's multifaceted importance, using replication, cross-check exercises, and interviews. Results are sometimes surprising, but generally explained by additional investigation that sometimes provides new insights. The consistency of scoring results increases in line with community literacy and wealth. Various benefits and pitfalls are identified and examined. Aside from revealing and clarifying local preferences, scoring has unexplored potential as a quantitative technique. Scoring is an underappreciated management tool with wide potential.
Adjusting for Confounding in Early Postlaunch Settings: Going Beyond Logistic Regression Models.
Schmidt, Amand F; Klungel, Olaf H; Groenwold, Rolf H H
2016-01-01
Postlaunch data on medical treatments can be analyzed to explore adverse events or relative effectiveness in real-life settings. These analyses are often complicated by the number of potential confounders and the possibility of model misspecification. We conducted a simulation study to compare the performance of logistic regression, propensity score, disease risk score, and stabilized inverse probability weighting methods to adjust for confounding. Model misspecification was induced in the independent derivation dataset. We evaluated performance using relative bias and confidence interval coverage of the true effect, among other metrics. At low events per coefficient (1.0 and 0.5), the logistic regression estimates had a large relative bias (greater than -100%). Bias of the disease risk score estimates was at most 13.48% and 18.83%, respectively. For the propensity score model, this was 8.74% and >100%, respectively. At events per coefficient of 1.0 and 0.5, inverse probability weighting frequently failed or reduced to a crude regression, resulting in biases of -8.49% and 24.55%. Coverage of logistic regression estimates fell below the nominal level at events per coefficient ≤5. For the disease risk score, inverse probability weighting, and propensity score, coverage fell below nominal at events per coefficient ≤2.5, ≤1.0, and ≤1.0, respectively. Bias of misspecified disease risk score models was 16.55%. In settings with low events/exposed subjects per coefficient, disease risk score methods can be useful alternatives to logistic regression models, especially when propensity score models cannot be used. Despite the better performance of disease risk score methods than logistic regression and propensity score models in small events-per-coefficient settings, bias and coverage still deviated from nominal.
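Of the compared adjustments, stabilized inverse probability weighting is straightforward to sketch once propensity scores are in hand; the data below are made up, and the propensity model fit itself is omitted:

```python
def stabilized_ipw_weights(treated, propensity):
    """Stabilized weights P(T=t) / P(T=t | X), using the estimated
    propensity score for P(T=1 | X)."""
    p_treat = sum(treated) / len(treated)
    return [p_treat / ps if t else (1 - p_treat) / (1 - ps)
            for t, ps in zip(treated, propensity)]

def weighted_mean_diff(outcome, treated, weights):
    """Weighted difference in mean outcome, treated minus control."""
    def wmean(group):
        pairs = [(y, w) for y, t, w in zip(outcome, treated, weights) if t == group]
        return sum(y * w for y, w in pairs) / sum(w for _, w in pairs)
    return wmean(1) - wmean(0)

# Made-up treatment indicators, estimated propensities, and outcomes.
t  = [1, 1, 0, 0, 1, 0]
ps = [0.8, 0.6, 0.3, 0.4, 0.7, 0.2]
y  = [10, 9, 6, 7, 11, 5]
print(round(weighted_mean_diff(y, t, stabilized_ipw_weights(t, ps)), 2))  # 3.85
```

Stabilization (multiplying by the marginal treatment probability) keeps the weights near 1 and tames the variance inflation that plain inverse-probability weights suffer at extreme propensities.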
Quantification of myocardial fibrosis by digital image analysis and interactive stereology.
Daunoravicius, Dainius; Besusparis, Justinas; Zurauskas, Edvardas; Laurinaviciene, Aida; Bironaite, Daiva; Pankuweit, Sabine; Plancoulaine, Benoit; Herlin, Paulette; Bogomolovas, Julius; Grabauskiene, Virginija; Laurinavicius, Arvydas
2014-06-09
Cardiac fibrosis disrupts the normal myocardial structure and has a direct impact on heart function and survival. Despite already available digital methods, the pathologist's visual score is still widely considered as ground truth and used as a primary method in histomorphometric evaluations. The aim of this study was to compare the accuracy of digital image analysis tools and the pathologist's visual scoring for evaluating fibrosis in human myocardial biopsies, based on reference data obtained by point counting performed on the same images. Endomyocardial biopsy material from 38 patients diagnosed with inflammatory dilated cardiomyopathy was used. The extent of total cardiac fibrosis was assessed by image analysis on Masson's trichrome-stained tissue specimens using automated Colocalization and Genie software, by stereology grid count, and manually by the pathologist's visual score. A total of 116 slides were analyzed. The mean results obtained by the Colocalization software (13.72 ± 12.24%) were closest to the reference value of stereology (RVS), while the Genie software and the pathologist's score gave a slight underestimation. RVS values correlated strongly with values obtained using the Colocalization and Genie (r>0.9, p<0.001) software as well as the pathologist's visual score. Differences in fibrosis quantification by Colocalization and RVS were statistically insignificant. However, significant bias was found in the results obtained by Genie versus RVS and the pathologist's score versus RVS, with mean difference values of -1.61% and 2.24%. Bland-Altman plots showed a bidirectional bias dependent on the magnitude of the measurement: Colocalization software overestimated the area fraction of fibrosis at the lower end and underestimated it at the higher end of the RVS values. Meanwhile, the Genie software as well as the pathologist's score showed more uniform results throughout the values, with a slight underestimation in the mid-range for both.
Both digital image analysis methods showed almost perfect correlation with the criterion standard obtained by stereology grid count and, in terms of accuracy, outperformed the pathologist's visual score. The Genie algorithm proved to be the method of choice, its only drawback being a slight underestimation bias, which is considered acceptable for both clinical and research evaluations. The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/9857909611227193.
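The Bland-Altman analysis used in this study to compare measurement methods can be sketched in a few lines of Python. The area-fraction values below are hypothetical, not the study's data; the calculation (bias and 95% limits of agreement as bias ± 1.96 × SD of the paired differences) is the standard one:

```python
from statistics import mean, stdev

def bland_altman(method_a, method_b):
    """Bland-Altman agreement statistics for two paired measurement methods.

    Returns the mean difference (bias) and the 95% limits of agreement
    (bias +/- 1.96 * SD of the differences)."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# Hypothetical fibrosis area fractions (%): image analysis vs. stereology
image_analysis = [5.1, 12.3, 20.8, 33.0, 41.5]
stereology     = [4.0, 11.9, 21.5, 35.2, 44.1]
bias, (lo, hi) = bland_altman(image_analysis, stereology)
```

Plotting each difference against the pair's mean (the Bland-Altman plot proper) then reveals magnitude-dependent bias of the kind the authors describe.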
Retrieval of memories with the help of music in Alzheimer's disease.
Chevreau, Priscilia; Nizard, Ingrid; Allain, Philippe
2017-09-01
This study focuses on music as a mediator facilitating access to autobiographical memory in Alzheimer's disease (AD). Studies on this topic are rare, but available data have shown a beneficial effect of music on autobiographical performance in AD patients. Based on the "index word" method, we developed the "index music" method for the evaluation of autobiographical memory. The subjects had to recount a memory of their choice prompted by the words or pieces of music presented to them. The task was proposed to 54 patients with a diagnosis of AD according to DSM-IV and NINCDS-ADRDA criteria. All of them had a significant cognitive decline on the MMSE (mean score: 14.5). Patients were matched by age, sex and level of education with 48 control subjects without cognitive impairment (mean MMSE score: 28). Results showed that autobiographical memory quantity scores of AD patients were significantly lower than those of healthy controls under both methods. However, autobiographical memory quality scores of AD patients increased with "index music", whereas those of healthy controls decreased. Also, the autobiographical performance of patients with AD in the index-music condition was not correlated with cognitive performance, in contrast to performance in the index-word condition. These results confirm that music improves access to personal memories in patients with AD. Personal memories could be preserved in patients with AD, and music could constitute an interesting way to stimulate recollection.
Environmental impact assessment of Gonabad municipal waste landfill site using Leopold Matrix
Sajjadi, Seyed Ali; Aliakbari, Zohreh; Matlabi, Mohammad; Biglari, Hamed; Rasouli, Seyedeh Samira
2017-01-01
Introduction An environmental impact assessment (EIA) undertaken before embarking on any project, including a landfill, is a useful tool for reducing the project's potential effects. The main objective of this study was to assess the environmental impact of the current municipal solid waste disposal site of Gonabad by using the Iranian Leopold matrix method. Methods This cross-sectional study was conducted to assess the environmental impacts of a landfill site in Gonabad in 2015 using an Iranian matrix (modified Leopold matrix). The study was based on field visits to the landfill and information collected from various sources; five available options were analyzed and compared: continuation of the current disposal practices, construction of a new sanitary landfill, a recycling plan, a composting plant, and an incineration plant. The best option was proposed to replace the existing landfill. Results The current approach scored 2.35, construction of a new sanitary landfill 1.59, the compost plant 1.57, and the recycling and incineration plants 1.68 and 2.3, respectively. Conclusion Results showed that continuation of the current method of disposal is rejected because of severe environmental damage and health problems. A compost plant, with the lowest negative score, is the best option for the waste disposal site of Gonabad City and has priority over the other four options. PMID:28465797
An evaluation of bias in propensity score-adjusted non-linear regression models.
Wan, Fei; Mitra, Nandita
2018-03-01
Propensity score methods are commonly used to adjust for observed confounding when estimating the conditional treatment effect in observational studies. One popular method, covariate adjustment of the propensity score in a regression model, has been empirically shown to be biased in non-linear models. However, no compelling underlying theoretical reason has been presented. We propose a new framework to investigate bias and consistency of propensity score-adjusted treatment effects in non-linear models that uses a simple geometric approach to forge a link between the consistency of the propensity score estimator and the collapsibility of non-linear models. Under this framework, we demonstrate that adjustment of the propensity score in an outcome model results in the decomposition of observed covariates into the propensity score and a remainder term. Omission of this remainder term from a non-collapsible regression model leads to biased estimates of the conditional odds ratio and conditional hazard ratio, but not for the conditional rate ratio. We further show, via simulation studies, that the bias in these propensity score-adjusted estimators increases with larger treatment effect size, larger covariate effects, and increasing dissimilarity between the coefficients of the covariates in the treatment model versus the outcome model.
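The non-collapsibility at the heart of this argument can be shown numerically without any simulation noise. In the sketch below (illustrative coefficients, not the paper's), a binary covariate X is balanced across treatment arms, so it is not a confounder, yet the marginal odds ratio still differs from the conditional one:

```python
from math import exp

def expit(z):
    return 1.0 / (1.0 + exp(-z))

def odds(p):
    return p / (1.0 - p)

# Conditional model: logit P(Y=1 | T, X) = b0 + b1*T + b2*X
b0, b1, b2 = -1.0, 1.0, 2.0        # conditional OR for treatment is exp(b1)

# X is balanced (P(X=1)=0.5) in BOTH arms, so X is not a confounder.
def marginal_risk(t):
    return 0.5 * expit(b0 + b1 * t) + 0.5 * expit(b0 + b1 * t + b2)

conditional_or = exp(b1)
marginal_or = odds(marginal_risk(1)) / odds(marginal_risk(0))
```

Here `marginal_or` is about 2.23 while `conditional_or` is e ≈ 2.72: omitting a prognostic covariate from a logistic model attenuates the odds ratio even with no confounding, which is why leaving the "remainder term" out of a non-collapsible model biases the conditional estimate.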
Using clustering and a modified classification algorithm for automatic text summarization
NASA Astrophysics Data System (ADS)
Aries, Abdelkrime; Oufaida, Houda; Nouali, Omar
2013-01-01
In this paper we describe a modified classification method designed for extractive summarization. The classification step requires no learning corpus; the input text itself is used for training. First, we cluster the document's sentences to exploit the diversity of topics; then we apply a learning algorithm (here, Naive Bayes) to each cluster, treating it as a class. After obtaining the classification model, we calculate the score of each sentence in each class using a scoring model derived from the classification algorithm. These scores are then used to reorder the sentences and extract the top-ranked ones as the output summary. We conducted experiments using a corpus of scientific papers and compared our results to another summarization system called UNIS. We also examined the impact of tuning the clustering threshold on the resulting summary, as well as the impact of adding more features to the classifier. We found the method promising: it gives good performance, and the addition of new features (which is simple with this method) can improve summary accuracy.
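The cluster-then-classify scoring step can be sketched as follows. This is a minimal reading of the approach, assuming clusters are already given and using a plain multinomial Naive Bayes with Laplace smoothing; the toy sentences and cluster labels are invented for illustration:

```python
from math import log
from collections import Counter

def train_nb(clusters):
    """Multinomial Naive Bayes over sentence clusters (each cluster = one class).
    clusters: dict mapping class label -> list of tokenized sentences."""
    vocab = {w for sents in clusters.values() for s in sents for w in s}
    model = {}
    for label, sents in clusters.items():
        counts = Counter(w for s in sents for w in s)
        total = sum(counts.values())
        # Laplace-smoothed log-likelihood of each vocabulary word
        model[label] = {w: log((counts[w] + 1) / (total + len(vocab)))
                        for w in vocab}
    return model

def score(model, sentence, label):
    """Log-probability score of a sentence under one cluster's class model."""
    return sum(model[label].get(w, 0.0) for w in sentence)

# Hypothetical two-topic document, already clustered
clusters = {
    "topic_a": [["neural", "network", "training"], ["network", "weights"]],
    "topic_b": [["protein", "folding"], ["protein", "structure"]],
}
model = train_nb(clusters)
s = ["neural", "network"]
```

Ranking all sentences by their score within their own class, then taking the top-ranked ones per topic, yields the extractive summary.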
Rezaei, Nazanin; Tavalaee, Zahra; Sayehmiri, Kourosh; Sharifi, Nasibeh; Daliri, Salman
2018-04-01
Some physical, emotional and social changes arise in mothers during the postpartum period, which can affect the quality of life (QOL) of the mother and family. Given the importance of quality of life in the postpartum period and its influencing factors, such as method of delivery, the present study aimed to investigate the relationship between quality of life and method of delivery worldwide, using a systematic review and meta-analysis. The present study is a systematic review and meta-analysis of the relationship between aspects of quality of life and method of delivery, conducted on Persian- and English-language articles published by the end of 2015. For this purpose, the databases Medlib, SID, Scopus, ISI Web of Science, PubMed, Google Scholar, Irandoc, Magiran and Iranmedex were searched using keywords and their combinations. The results of studies were combined using the random-effects model in the meta-analysis. Heterogeneity of studies was assessed using the I2 index and Cochran's test, and data were analyzed using STATA version 11.1 and SPSS version 16. Based on the results of the meta-analysis, physical functioning had the highest quality-of-life mean score in women with vaginal delivery: 74.37 (95% CI: 67.7-81), and mental health had the highest QOL mean score in women with cesarean delivery: 65.8 (95% CI: 62.7-69). Also, based on the time elapsed since delivery, mental health had the highest mean score at less than 1 month, 2 months and 4 months postpartum. Physical pain had the highest mean score at 6 months after giving birth, and mental functioning at 8 months after giving birth. The results of the present meta-analysis showed that the mean scores for most dimensions of quality of life in women with vaginal delivery were higher than in women with cesarean delivery.
Depression and Related Problems in University Students
ERIC Educational Resources Information Center
Field, Tiffany; Diego, Miguel; Pelaez, Martha; Deeds, Osvelia; Delgado, Jeannette
2012-01-01
Method: Depression and related problems were studied in a sample of 283 university students. Results: The students with high depression scores also had high scores on anxiety, intrusive thoughts, controlling intrusive thoughts and sleep disturbances scales. A stepwise regression suggested that those problems contributed to a significant proportion…
Evaluation of temperament scoring methods for beef cattle
USDA-ARS?s Scientific Manuscript database
The objective of this study was to evaluate methods of temperament scoring. Crossbred (n=228) calves were evaluated for temperament by an individual evaluator at weaning by two methods of scoring: 1) pen score (1 to 5 scale, with higher scores indicating increasing degree of nervousness, aggressiven...
A supervised framework for resolving coreference in clinical records.
Rink, Bryan; Roberts, Kirk; Harabagiu, Sanda M
2012-01-01
A method for the automatic resolution of coreference between medical concepts in clinical records. A multiple pass sieve approach utilizing support vector machines (SVMs) at each pass was used to resolve coreference. Information such as lexical similarity, recency of a concept mention, synonymy based on Wikipedia redirects, and local lexical context were used to inform the method. Results were evaluated using an unweighted average of MUC, CEAF, and B(3) coreference evaluation metrics. The datasets used in these research experiments were made available through the 2011 i2b2/VA Shared Task on Coreference. The method achieved an average F score of 0.821 on the ODIE dataset, with a precision of 0.802 and a recall of 0.845. These results compare favorably to the best-performing system with a reported F score of 0.827 on the dataset and the median system F score of 0.800 among the eight teams that participated in the 2011 i2b2/VA Shared Task on Coreference. On the i2b2 dataset, the method achieved an average F score of 0.906, with a precision of 0.895 and a recall of 0.918 compared to the best F score of 0.915 and the median of 0.859 among the 16 participating teams. Post hoc analysis revealed significant performance degradation on pathology reports. The pathology reports were characterized by complex synonymy and very few patient mentions. The use of several simple lexical matching methods had the most impact on achieving competitive performance on the task of coreference resolution. Moreover, the ability to detect patients in electronic medical records helped to improve coreference resolution more than other linguistic analysis.
Sepsis mortality prediction with the Quotient Basis Kernel.
Ribas Ripoll, Vicent J; Vellido, Alfredo; Romero, Enrique; Ruiz-Rodríguez, Juan Carlos
2014-05-01
This paper presents an algorithm to assess the risk of death in patients with sepsis. Sepsis is a common clinical syndrome in the intensive care unit (ICU) that can lead to severe sepsis, a severe state of septic shock or multi-organ failure. The proposed algorithm may be implemented as part of a clinical decision support system that can be used in combination with the scores deployed in the ICU to improve the accuracy, sensitivity and specificity of mortality prediction for patients with sepsis. In this paper, we used the Simplified Acute Physiology Score (SAPS) for ICU patients and the Sequential Organ Failure Assessment (SOFA) to build our kernels and algorithms. In the proposed method, we embed the available data in a suitable feature space and use algorithms based on linear algebra, geometry and statistics for inference. We present a simplified version of the Fisher kernel (practical Fisher kernel for multinomial distributions), as well as a novel kernel that we named the Quotient Basis Kernel (QBK). These kernels are used as the basis for mortality prediction using soft-margin support vector machines. The two new kernels presented are compared against other generative kernels based on the Jensen-Shannon metric (centred, exponential and inverse) and other widely used kernels (linear, polynomial and Gaussian). Clinical relevance is also evaluated by comparing these results with logistic regression and the standard clinical prediction method based on the initial SAPS score. As described in this paper, we tested the new methods via cross-validation with a cohort of 400 test patients. The results obtained using our methods compare favourably with those obtained using alternative kernels (80.18% accuracy for the QBK) and the standard clinical prediction method, which are based on the basal SAPS score or logistic regression (71.32% and 71.55%, respectively). 
The QBK presented a sensitivity and specificity of 79.34% and 83.24%, which outperformed the other kernels analysed, logistic regression and the standard clinical prediction method based on the basal SAPS score. Several scoring systems for patients with sepsis have been introduced and developed over the last 30 years. They allow for the assessment of the severity of disease and provide an estimate of in-hospital mortality. Physiology-based scoring systems are applied to critically ill patients and have a number of advantages over diagnosis-based systems. Severity score systems are often used to stratify critically ill patients for possible inclusion in clinical trials. In this paper, we present an effective algorithm that combines both scoring methodologies for the assessment of death in patients with sepsis that can be used to improve the sensitivity and specificity of the currently available methods. Copyright © 2014 Elsevier B.V. All rights reserved.
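The abstract does not define the QBK itself, but the Jensen-Shannon family of generative kernels it is compared against is standard and easy to sketch. Below is an exponential Jensen-Shannon kernel over discrete score distributions; the two patient profiles are hypothetical:

```python
from math import log, exp

def kl(p, q):
    """Kullback-Leibler divergence for discrete distributions (0*log0 = 0)."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence: symmetrized, bounded KL to the midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def js_kernel(p, q, gamma=1.0):
    """Exponential Jensen-Shannon kernel: 1 for identical distributions,
    decaying toward 0 as the distributions diverge."""
    return exp(-gamma * jsd(p, q))

# Hypothetical normalized severity-score profiles for two patients
p = [0.2, 0.5, 0.3]
q = [0.25, 0.45, 0.3]
```

Such a kernel can be plugged directly into a soft-margin SVM as the Gram-matrix generator, which is how the paper's comparison across kernels is set up.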
Investigation of work-related disorders in truck drivers using RULA method.
Massaccesi, M; Pagnotta, A; Soccetti, A; Masali, M; Masiero, C; Greco, F
2003-07-01
A high incidence of spinal disorders is observed in professional drivers; in particular, back and neck pain result in high rates of morbidity and a low retirement age. A sample of 77 drivers was studied: drivers of rubbish-collection vehicles, who sit in a standard posture, and drivers of road-washing vehicles, who drive with the neck and trunk flexed, bent and twisted. The study used RULA, a method for evaluating exposure to risk factors associated with work-related upper-limb disorders. Results showed a significant association between trunk and neck scores and all self-reported pains, aches or discomforts in the trunk or neck regions in all subjects. In particular, the neck score was significant in both postures, reflecting high loading of the neck. Significantly different posture scores were also recorded for drivers using an adjustable versus a non-adjustable seat. In this first RULA study of the working posture of professional truck drivers, the method proved to be a suitable tool for the rapid evaluation of the loading of the neck and trunk.
A Summary Score for the Framingham Heart Study Neuropsychological Battery
Downer, Brian; Fardo, David W.; Schmitt, Frederick A.
2015-01-01
Objective To calculate three summary scores of the Framingham Heart Study neuropsychological battery and determine which score best differentiates between subjects classified as having normal cognition, test-based impaired learning and memory, test-based multidomain impairment, and dementia. Method The final sample included 2,503 participants. Three summary scores were assessed: (a) composite score that provided equal weight to each subtest, (b) composite score that provided equal weight to each cognitive domain assessed by the neuropsychological battery, and (c) abbreviated score comprised of subtests for learning and memory. Receiver operating characteristic analysis was used to determine which summary score best differentiated between the four cognitive states. Results The summary score that provided equal weight to each subtest best differentiated between the four cognitive states. Discussion A summary score that provides equal weight to each subtest is an efficient way to utilize all of the cognitive data collected by a neuropsychological battery. PMID:25804903
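The difference between the first two summary scores is purely one of weighting, and a short sketch makes it concrete. The z-scores and domain grouping below are hypothetical; the point is that with unequal domain sizes the two composites diverge:

```python
from statistics import mean

def equal_subtest(z):
    """Composite giving each subtest equal weight: mean of all z-scores."""
    return mean(s for domain in z.values() for s in domain)

def equal_domain(z):
    """Composite giving each domain equal weight: mean of domain means."""
    return mean(mean(domain) for domain in z.values())

# Hypothetical z-scores keyed by cognitive domain (domain sizes differ)
z = {
    "memory":    [-1.2, -0.8, -1.0],   # three memory subtests
    "language":  [0.2],
    "executive": [0.1, -0.1],
}
```

With a memory-heavy battery, the equal-subtest composite is pulled further down by the three impaired memory subtests than the equal-domain composite, which is one way such weighting choices can change how well a summary score separates cognitive states.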
A Novel Approach for Lie Detection Based on F-Score and Extreme Learning Machine
Gao, Junfeng; Wang, Zhao; Yang, Yong; Zhang, Wenjia; Tao, Chunyi; Guan, Jinan; Rao, Nini
2013-01-01
A new machine learning method referred to as F-score_ELM was proposed to classify the lying and truth-telling using the electroencephalogram (EEG) signals from 28 guilty and innocent subjects. Thirty-one features were extracted from the probe responses from these subjects. Then, a recently-developed classifier called extreme learning machine (ELM) was combined with F-score, a simple but effective feature selection method, to jointly optimize the number of the hidden nodes of ELM and the feature subset by a grid-searching training procedure. The method was compared to two classification models combining principal component analysis with back-propagation network and support vector machine classifiers. We thoroughly assessed the performance of these classification models including the training and testing time, sensitivity and specificity from the training and testing sets, as well as network size. The experimental results showed that the number of the hidden nodes can be effectively optimized by the proposed method. Also, F-score_ELM obtained the best classification accuracy and required the shortest training and testing time. PMID:23755136
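The F-score feature-selection step can be sketched independently of the ELM. This uses the common Fisher-style definition (between-class separation over within-class spread); the EEG feature values are invented, with column 0 constructed to separate the classes and column 1 not:

```python
from statistics import mean, variance

def f_score(pos, neg):
    """Fisher-style F-score of one feature: squared distances of the class
    means from the overall mean, over the summed within-class variances."""
    allv = pos + neg
    xbar, xp, xn = mean(allv), mean(pos), mean(neg)
    return ((xp - xbar) ** 2 + (xn - xbar) ** 2) / (variance(pos) + variance(neg))

def top_features(pos_rows, neg_rows, k):
    """Rank feature columns by F-score; return indices of the k best."""
    d = len(pos_rows[0])
    ranked = sorted(range(d),
                    key=lambda i: f_score([r[i] for r in pos_rows],
                                          [r[i] for r in neg_rows]),
                    reverse=True)
    return ranked[:k]

# Hypothetical EEG features: column 0 separates classes, column 1 does not
guilty   = [[5.0, 1.0], [5.2, 0.9], [4.8, 1.1]]
innocent = [[1.0, 1.0], [1.2, 1.1], [0.8, 0.9]]
```

In the paper's setup, the retained feature subset and the number of ELM hidden nodes are then optimized jointly by grid search; the ranking above supplies the candidate subsets.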
Menard, J-P; Mazouni, C; Fenollar, F; Raoult, D; Boubli, L; Bretelle, F
2010-12-01
The purpose of this investigation was to determine the diagnostic accuracy of quantitative real-time polymerase chain reaction (PCR) assay in diagnosing bacterial vaginosis versus the standard methods, the Amsel criteria and the Nugent score. The Amsel criteria, the Nugent score, and results from the molecular tool were obtained independently from vaginal samples of 163 pregnant women who reported abnormal vaginal symptoms before 20 weeks gestation. To determine the performance of the molecular tool, we calculated the kappa value, sensitivity, specificity, and positive and negative predictive values. Either or both of the Amsel criteria (≥3 criteria) and the Nugent score (score ≥7) indicated that 25 women (15%) had bacterial vaginosis, and the remaining 138 women did not. DNA levels of Gardnerella vaginalis or Atopobium vaginae exceeded 10^9 copies/mL or 10^8 copies/mL, respectively, in 34 (21%) of the 163 samples. Complete agreement between both reference methods and high concentrations of G. vaginalis and A. vaginae was found in 94.5% of women (154/163 samples, kappa value = 0.81, 95% confidence interval 0.70-0.81). The nine samples with discordant results were categorized as intermediate flora by the Nugent score. The molecular tool predicted bacterial vaginosis with a sensitivity of 100%, a specificity of 93%, a positive predictive value of 73%, and a negative predictive value of 100%. The quantitative real-time PCR assay shows excellent agreement with the results of both reference methods for the diagnosis of bacterial vaginosis.
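The reported accuracy figures follow from the 2x2 table implied by the abstract (25 true positives, 9 false positives, 0 false negatives, 129 true negatives; this is a reconstruction from the stated totals, not numbers taken verbatim from the paper). A sketch of the calculations:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 table
    (index test vs. reference standard)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

def cohen_kappa(a, b, c, d):
    """Cohen's kappa for agreement between two binary ratings.
    a = both positive, b and c = discordant, d = both negative."""
    n = a + b + c + d
    po = (a + d) / n                                   # observed agreement
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2  # chance agreement
    return (po - pe) / (1 - pe)

# Counts reconstructed from the abstract: 34 PCR-positive, 25 reference-positive
m = diagnostic_metrics(25, 9, 0, 129)
kappa = cohen_kappa(25, 9, 0, 129)
```

These counts reproduce the abstract's sensitivity of 100%, specificity of about 93%, NPV of 100% and kappa of about 0.81 (the PPV of 25/34 is roughly 73.5%).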
Shekhawat, Vishal; Banshiwal, Ramesh Chandra; Verma, Rajender Kumar
2017-01-01
Introduction Distal humeral fractures are common fractures of the upper limb and are difficult to treat. If left untreated or inadequately treated, they lead to poor outcomes. Management of distal humeral fractures involves many controversies, one of which is the position of the plates. Aim To compare the clinical and radiological outcomes in patients with intra-articular distal humerus fractures treated using parallel and perpendicular double plating methods. Materials and Methods A total of 38 patients with distal humerus fractures, 20 in the perpendicular plating group (group A) and 18 in the parallel plating group (group B), were included in this prospective randomised study. At each follow-up, patients were evaluated clinically and radiologically for union, and the outcomes were measured in terms of the Mayo Elbow Performance Score (MEPS), comprising pain intensity, range of motion, stability and function. An MEPS greater than 90 is considered excellent; 75 to 89, good; 60 to 74, fair; and less than 60, poor. Results In our study, 15 patients (75%) in group A and 13 patients (72.22%) in group B achieved excellent results. Two patients (10%) in group A and 4 patients (22.22%) in group B attained good results. Complications developed in 2 patients in each group. No significant differences were found between the clinical outcomes of the two plating methods. Conclusion Neither plating technique is superior to the other, as inferred from the insignificant differences in bony union, elbow function and complications between the two techniques. PMID:28384948
Koo, Evonne; McNamara, Sara; Lansing, Bonnie; Olmsted, Russell N.; Rye, Ruth Anne; Fitzgerald, Thomas; Mody, Lona
2016-01-01
Objectives To assess effectiveness of an interactive educational program in increasing knowledge of key infection prevention and control (IPC) principles with emphasis on indwelling device care, hand hygiene and multi-drug resistant organisms (MDROs) among nursing home (NH) healthcare personnel (HCP). Methods We conducted a multi-modal randomized-controlled study involving HCP at 12 NHs. Ten comprehensive and interactive modules covered common IPC topics. We compared: a) intervention and control scores to assess differences in pre-test scores as a result of field interventions; b) pre- and post-test scores to assess knowledge gain and c) magnitude of knowledge gain based on job categories. Results 4,962 tests were returned over the course of the intervention with 389–633 HCP/module. Participants were mostly female certified nursing assistants (CNAs). Score improvement was highest for modules emphasizing hand hygiene, urinary catheter care and MDROs (15.6%, 15.95%, and 22.0%, respectively). After adjusting for cluster study design, knowledge scores were significantly higher after each educational module, suggesting the education delivery method was effective. When compared to CNAs, nursing and rehabilitation personnel scored significantly higher in their knowledge tests. Conclusion Our intervention significantly improved IPC knowledge in HCP, especially for those involved in direct patient care. This increase in knowledge along with preemptive barrier precautions and active surveillance has enhanced resident safety by reducing MDROs and infections in high-risk NH residents. PMID:27553671
Score-moment combined linear discrimination analysis (SMC-LDA) as an improved discrimination method.
Han, Jintae; Chung, Hoeil; Han, Sung-Hwan; Yoon, Moon-Young
2007-01-01
A new discrimination method called score-moment combined linear discrimination analysis (SMC-LDA) has been developed, and its performance has been evaluated using three practical spectroscopic datasets. The key concept of SMC-LDA is to use not only the score from principal component analysis (PCA) but also the moment of the spectrum as inputs for LDA, to improve discrimination. The moment, along with the conventional score, is used in spectroscopy as an effective alternative representation of spectral features. Three different approaches were considered. Initially, the score generated from PCA was projected onto a two-dimensional feature space by maximizing Fisher's criterion function (conventional PCA-LDA). Next, the same procedure was performed using only the moment. Finally, both score and moment were utilized simultaneously for LDA. To evaluate discrimination performance, three spectroscopic datasets were employed: (1) infrared (IR) spectra of normal and malignant stomach tissue, (2) near-infrared (NIR) spectra of diesel and light gas oil (LGO) and (3) Raman spectra of Chinese and Korean ginseng. In each case, the best discrimination results were achieved when both score and moment were used for LDA (SMC-LDA). Since the spectral representation character of the moment differs from that of the score, including both as inputs to LDA provided more diversified and descriptive information.
64 slice MDCT generally underestimates coronary calcium scores as compared to EBT: A phantom study
DOE Office of Scientific and Technical Information (OSTI.GOV)
Greuter, M. J. W.; Dijkstra, H.; Groen, J. M.
The objective of our study was to determine the influence of the sequential and spiral acquisition modes on the concordance and deviation of the calcium score on 64-slice multi-detector computed tomography (MDCT) scanners, in comparison to electron beam tomography (EBT) as the gold standard. An anthropomorphic cardio CT phantom with different calcium inserts was scanned in sequential and spiral acquisition modes on three identical 64-slice MDCT scanners from manufacturer A, three identical 64-slice MDCT scanners from manufacturer B, and an EBT system. Every scan was repeated 30 times with, and 15 times without, a small random variation in the phantom position for both sequential and spiral modes. Significant differences were observed between EBT and 64-slice MDCT data for all inserts, both acquisition modes, and both manufacturers of MDCT systems. High regression coefficients (0.90-0.98) were found between the EBT and 64-slice MDCT data for both scoring methods and both systems, with high correlation coefficients (R²>0.94). System A showed more significant differences between spiral and sequential mode than system B. Almost no differences were observed between scanners of the same manufacturer for the Agatston score, and no differences for the Volume score. The deviations of the Agatston and Volume scores showed regression dependencies approximately equal to the square root of the absolute score. The Agatston and Volume scores obtained with 64-slice MDCT imaging are highly correlated with EBT-obtained scores but are significantly underestimated (-10% to -2%) for both sequential and spiral acquisition modes. System B is less dependent on acquisition mode for calcium scoring than system A. The Volume score shows no intra-manufacturer dependency, and its use is advocated over the Agatston score.
Using the same cut points for MDCT-based calcium scores as for EBT-based calcium scores can result in classifying individuals into too low a risk category. System information and scan protocol are therefore needed for every calcium score procedure to ensure a correct clinical interpretation of the obtained calcium score results.
Everly, Marcee C
2013-02-01
To report the transformation from lecture to more active learning methods in a maternity nursing course and to evaluate whether students' perception of improved learning through active-learning methods is supported by improved test scores. The process of transforming a course into an active-learning model of teaching is described. A voluntary mid-semester survey of student acceptance of the new teaching method was conducted. Course examination results, from both a standardized exam and a cumulative final exam, were compared between students who received lecture in the classroom and students who had active-learning activities in the classroom. Active-learning activities were very acceptable to students. The majority of students reported learning more from active-learning activities in the classroom than from lecture alone, and this belief was supported by improved test scores. Students who had active-learning activities in the classroom scored significantly higher on a standardized assessment test than students who received lecture only. The findings support the use of student reflection to evaluate the effectiveness of active-learning methods and help validate the use of student reflection on improved learning in other research projects. Copyright © 2011 Elsevier Ltd. All rights reserved.
Contrasting OLS and Quantile Regression Approaches to Student "Growth" Percentiles
ERIC Educational Resources Information Center
Castellano, Katherine Elizabeth; Ho, Andrew Dean
2013-01-01
Regression methods can locate student test scores in a conditional distribution, given past scores. This article contrasts and clarifies two approaches to describing these locations in terms of readily interpretable percentile ranks or "conditional status percentile ranks." The first is Betebenner's quantile regression approach that results in…
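The "conditional status percentile rank" idea can be illustrated nonparametrically: locate a student's current score within the distribution of current scores among peers with the same prior score. The peer scores below are hypothetical, and the mid-rank rule for ties is one common convention, not necessarily the one either approach in the article uses:

```python
def conditional_percentile_rank(current, peers):
    """Percentile rank of one student's current score within the current
    scores of peers who shared the same prior score (mid-rank rule for ties)."""
    below = sum(1 for p in peers if p < current)
    equal = sum(1 for p in peers if p == current)
    return 100.0 * (below + 0.5 * equal) / len(peers)

# Hypothetical: current-year scores of students who all scored 300 last year
peers = [280, 290, 300, 310, 320, 330, 340, 350]
pr = conditional_percentile_rank(325, peers)
```

Quantile-regression approaches such as Betebenner's estimate this location by modeling many conditional quantiles of current score given prior score, rather than by binning exact prior-score peers as above.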
Dynamic TIMI Risk Score for STEMI
Amin, Sameer T.; Morrow, David A.; Braunwald, Eugene; Sloan, Sarah; Contant, Charles; Murphy, Sabina; Antman, Elliott M.
2013-01-01
Background Although there are multiple methods of risk stratification for ST‐elevation myocardial infarction (STEMI), this study presents a prospectively validated method for reclassification of patients based on in‐hospital events. A dynamic risk score provides an initial risk stratification and reassessment at discharge. Methods and Results The dynamic TIMI risk score for STEMI was derived in ExTRACT‐TIMI 25 and validated in TRITON‐TIMI 38. Baseline variables were from the original TIMI risk score for STEMI. New variables were major clinical events occurring during the index hospitalization. Each variable was tested individually in a univariate Cox proportional hazards regression. Variables with P<0.05 were incorporated into a full multivariable Cox model to assess the risk of death at 1 year. Each variable was assigned an integer value based on the odds ratio, and the final score was the sum of these values. The dynamic score included the development of in‐hospital MI, arrhythmia, major bleed, stroke, congestive heart failure, recurrent ischemia, and renal failure. The C‐statistic produced by the dynamic score in the derivation database was 0.76, with a net reclassification improvement (NRI) of 0.33 (P<0.0001) from the inclusion of dynamic events to the original TIMI risk score. In the validation database, the C‐statistic was 0.81, with a NRI of 0.35 (P=0.01). Conclusions This score is a prospectively derived, validated means of estimating 1‐year mortality of STEMI at hospital discharge and can serve as a clinically useful tool. By incorporating events during the index hospitalization, it can better define risk and help to guide treatment decisions. PMID:23525425
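The final scoring step described above, assigning an integer point value per in-hospital event and summing, can be sketched as follows. The point values here are illustrative placeholders; the actual weights derived from the odds ratios are in the paper, not the abstract:

```python
def dynamic_risk_score(points, events):
    """Sum the integer point values for the in-hospital events a patient had.
    points: dict event -> integer weight; events: dict event -> bool."""
    return sum(points[e] for e in events if events[e])

# Hypothetical point assignments (the published weights differ)
POINTS = {"in_hospital_mi": 2, "arrhythmia": 2, "major_bleed": 1,
          "stroke": 3, "chf": 3, "recurrent_ischemia": 1, "renal_failure": 3}

patient = {"in_hospital_mi": False, "arrhythmia": True, "major_bleed": True,
           "stroke": False, "chf": True, "recurrent_ischemia": False,
           "renal_failure": False}
score = dynamic_risk_score(POINTS, patient)
```

Added to the baseline TIMI risk score at discharge, such a sum reclassifies patients according to their hospital course, which is what drives the reported net reclassification improvement.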
Cehreli, S Burcak; Polat-Ozsoy, Omur; Sar, Cagla; Cubukcu, H Evren; Cehreli, Zafer C
2012-04-01
The amount of the residual adhesive after bracket debonding is frequently assessed in a qualitative manner, utilizing the adhesive remnant index (ARI). This study aimed to investigate whether quantitative assessment of the adhesive remnant yields more precise results compared to qualitative methods utilizing the 4- and 5-point ARI scales. Twenty debonded brackets were selected. Evaluation and scoring of the adhesive remnant on bracket bases were made consecutively using: 1. qualitative assessment (visual scoring) and 2. quantitative measurement (image analysis) on digital photographs. Image analysis was made on scanning electron micrographs (SEM) and high-precision elemental maps of the adhesive remnant as determined by energy dispersed X-ray spectrometry. Evaluations were made in accordance with the original 4-point and the modified 5-point ARI scales. Intra-class correlation coefficients (ICCs) were calculated, and the data were evaluated using Friedman test followed by Wilcoxon signed ranks test with Bonferroni correction. ICC statistics indicated high levels of agreement for qualitative visual scoring among examiners. The 4-point ARI scale was compliant with the SEM assessments but indicated significantly less adhesive remnant compared to the results of quantitative elemental mapping. When the 5-point scale was used, both quantitative techniques yielded similar results with those obtained qualitatively. These results indicate that qualitative visual scoring using the ARI is capable of generating similar results with those assessed by quantitative image analysis techniques. In particular, visual scoring with the 5-point ARI scale can yield similar results with both the SEM analysis and elemental mapping.
Wang, Candice; Huang, Chin-Chou; Lin, Shing-Jong; Chen, Jaw-Wen
2016-01-01
Objectives The goal of our study was to shed light on educational methods to strengthen medical students' cardiopulmonary resuscitation (CPR) leadership and team skills in order to optimise CPR understanding and success using didactic videos and high-fidelity simulations. Design An observational study. Setting A tertiary medical centre in Northern Taiwan. Participants A total of 104 5–7th year medical students, including 72 men and 32 women. Interventions We provided the medical students with a 2-hour training session on advanced CPR. During each class, we divided the students into 1–2 groups; each group consisted of 4–6 team members. Medical student teams were trained by using either method A or B. Method A started with an instructional CPR video followed by a first CPR simulation. Method B started with a first CPR simulation followed by an instructional CPR video. All students then participated in a second CPR simulation. Outcome measures Student teams were assessed with checklist rating scores in leadership, teamwork and team member skills, global rating scores by an attending physician and video-recording evaluation by 2 independent individuals. Results The 104 medical students were divided into 22 teams. We trained 11 teams using method A and 11 using method B. Total second CPR simulation scores were significantly higher than first CPR simulation scores in leadership (p<0.001), teamwork (p<0.001) and team member skills (p<0.001). For methods A and B students' first CPR simulation scores were similar, but method A students' second CPR simulation scores were significantly higher than those of method B in leadership skills (p=0.034), specifically in the support subcategory (p=0.049). Conclusions Although both teaching strategies improved leadership, teamwork and team member performance, video exposure followed by CPR simulation further increased students' leadership skills compared with CPR simulation followed by video exposure. PMID:27678539
The effect of teaching method on long-term knowledge retention.
Beers, Geri W; Bowden, Susan
2005-11-01
Choosing a teaching strategy that results in knowledge retention on the part of learners can be challenging for educators. Studies on problem-based learning (PBL) have supported its effectiveness, compared to other, more traditional strategies. The results of a previous study comparing the effect of lecture versus PBL on objective test scores indicated there was no significant difference in scores. To measure long-term knowledge retention, the same groups were evaluated 1 year after instruction. The posttest administered in the original study was repeated, and the scores from a comprehensive adult health examination and the endocrine subsection were analyzed. At an alpha level of 0.05, a statistically significant difference was found in the scores on two of the measures. The scores of the PBL group were significantly higher on the endocrine section of the examination and the repeat posttest.
Kasper, Judith D.; Brandt, Jason; Pezzin, Liliana E.
2012-01-01
Objective. To examine the measurement equivalence of items on disability across three international surveys of aging. Method. Data for persons aged 65 and older were drawn from the Health and Retirement Survey (HRS, n = 10,905), English Longitudinal Study of Aging (ELSA, n = 5,437), and Survey of Health, Ageing and Retirement in Europe (SHARE, n = 13,408). Differential item functioning (DIF) was assessed using item response theory (IRT) methods for activities of daily living (ADL) and instrumental activities of daily living (IADL) items. Results. HRS and SHARE exhibited measurement equivalence, but 6 of 11 items in ELSA demonstrated meaningful DIF. At the scale level, this item-level DIF affected scores reflecting greater disability. IRT methods also spread out score distributions and shifted scores higher (toward greater disability). Results for mean disability differences by demographic characteristics, using original and DIF-adjusted scores, were the same overall but differed for some subgroup comparisons involving ELSA. Discussion. Testing and adjusting for DIF is one means of minimizing measurement error in cross-national survey comparisons. IRT methods were used to evaluate potential measurement bias in disability comparisons across three international surveys of aging. The analysis also suggested DIF was mitigated for scales including both ADL and IADL and that summary indexes (counts of limitations) likely underestimate mean disability in these international populations. PMID:22156662
Objectification of perceptual image quality for mobile video
NASA Astrophysics Data System (ADS)
Lee, Seon-Oh; Sim, Dong-Gyu
2011-06-01
This paper presents an objective video quality evaluation method for quantifying the subjective quality of digital mobile video. The proposed method aims to objectify the subjective quality by extracting edgeness and blockiness parameters. To evaluate the performance of the proposed algorithms, we carried out subjective video quality tests with the double-stimulus continuous quality scale method and obtained differential mean opinion score values for 120 mobile video clips. We then compared the performance of the proposed methods with that of existing methods in terms of the differential mean opinion score on the same 120 clips. Experimental results showed that the proposed methods were approximately 10% better than the edge peak signal-to-noise ratio of the J.247 method in terms of the Pearson correlation.
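The Pearson correlation used here to compare objective quality predictions against differential mean opinion scores can be computed directly. The objective/DMOS values below are illustrative toy data, not the study's measurements:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy data: objective quality predictions vs. subjective DMOS values.
objective = [1.0, 2.0, 3.0, 4.0, 5.0]
dmos = [1.1, 1.9, 3.2, 3.8, 5.0]
print(round(pearson(objective, dmos), 3))  # 0.995
```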
He, Hua; McDermott, Michael P.
2012-01-01
Sensitivity and specificity are common measures of the accuracy of a diagnostic test. The usual estimators of these quantities are unbiased if data on the diagnostic test result and the true disease status are obtained from all subjects in an appropriately selected sample. In some studies, verification of the true disease status is performed only for a subset of subjects, possibly depending on the result of the diagnostic test and other characteristics of the subjects. Estimators of sensitivity and specificity based on this subset of subjects are typically biased; this is known as verification bias. Methods have been proposed to correct verification bias under the assumption that the missing data on disease status are missing at random (MAR), that is, the probability of missingness depends on the true (missing) disease status only through the test result and observed covariate information. When some of the covariates are continuous, or the number of covariates is relatively large, the existing methods require parametric models for the probability of disease or the probability of verification (given the test result and covariates), and hence are subject to model misspecification. We propose a new method for correcting verification bias based on the propensity score, defined as the predicted probability of verification given the test result and observed covariates. This is estimated separately for those with positive and negative test results. The new method classifies the verified sample into several subsamples that have homogeneous propensity scores and allows correction for verification bias. Simulation studies demonstrate that the new estimators are more robust to model misspecification than existing methods, but still perform well when the models for the probability of disease and probability of verification are correctly specified. PMID:21856650
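A minimal sketch of the propensity-score-stratification idea, under simplifying assumptions (equal-size strata, a single test-result group, and the MAR assumption): the disease rate among verified subjects in each verification-propensity stratum stands in for the rate among all subjects in that stratum. The `corrected_rate` helper and the toy data are hypothetical illustrations, not the authors' estimator:

```python
# Illustrative sketch (not the authors' exact estimator) of verification-bias
# correction by stratifying on an estimated verification propensity. Within a
# stratum, the disease rate among *verified* subjects stands in for the rate
# among *all* subjects there; strata are combined with weights proportional
# to stratum size. Assumes every stratum contains >= 1 verified subject.

def corrected_rate(subjects, n_strata=2):
    """subjects: (propensity, verified, diseased) tuples for one test-result
    group; diseased is None when the subject was not verified."""
    subjects = sorted(subjects, key=lambda s: s[0])
    size = len(subjects)
    total = 0.0
    for k in range(n_strata):
        stratum = subjects[k * size // n_strata:(k + 1) * size // n_strata]
        verified = [s for s in stratum if s[1]]
        rate = sum(1 for s in verified if s[2]) / len(verified)
        total += rate * len(stratum) / size
    return total

# Toy data: low-propensity subjects are rarely verified but often diseased,
# so the naive verified-only rate understates the true rate.
toy = ([(0.2, True, True)] + [(0.2, False, None)] * 4
       + [(0.8, True, True)] + [(0.8, True, False)] * 4)
verified = [s for s in toy if s[1]]
print(round(sum(1 for s in verified if s[2]) / len(verified), 3))  # 0.333 (naive)
print(round(corrected_rate(toy), 3))  # 0.6 (stratum-weighted)
```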
Strom, Suzanne L; Anderson, Craig L; Yang, Luanna; Canales, Cecilia; Amin, Alpesh; Lotfipour, Shahram; McCoy, C Eric; Osborn, Megan Boysen; Langdorf, Mark I
2015-11-01
Traditional Advanced Cardiac Life Support (ACLS) courses are evaluated using written multiple-choice tests. High-fidelity simulation is a widely used adjunct to didactic content, and has been used in many specialties as a training resource as well as an evaluative tool. There are no data to our knowledge that compare simulation examination scores with written test scores for ACLS courses. To compare and correlate a novel high-fidelity simulation-based evaluation with traditional written testing for senior medical students in an ACLS course. We performed a prospective cohort study to determine the correlation between simulation-based evaluation and traditional written testing in a medical school simulation center. Students were tested on a standard acute coronary syndrome/ventricular fibrillation cardiac arrest scenario. Our primary outcome measure was correlation of exam results for 19 volunteer fourth-year medical students after a 32-hour ACLS-based Resuscitation Boot Camp course. Our secondary outcome was comparison of simulation-based vs. written outcome scores. The composite average score on the written evaluation was substantially higher (93.6%) than the simulation performance score (81.3%, absolute difference 12.3%, 95% CI [10.6-14.0%], p<0.00005). We found a statistically significant moderate correlation between simulation scenario test performance and traditional written testing (Pearson r=0.48, p=0.04), validating the new evaluation method. Simulation-based ACLS evaluation methods correlate with traditional written testing and demonstrate resuscitation knowledge and skills. Simulation may be a more discriminating and challenging testing method, as students scored higher on written evaluation methods compared to simulation.
Application of Ponseti method in patients older than 6 months with congenital talipes equinovarus.
Wang, Yan-zhou; Wang, Xiao-wen; Zhang, Peng; Wang, Xing-shan
2009-08-18
To evaluate the effectiveness of the Ponseti method in the treatment of congenital talipes equinovarus (CTE) in children older than 6 months. The Ponseti method was used to treat 157 cases (227 feet) of CTE in children older than 6 months. All cases were classified by age and by degree of deformity severity. The age group classification was: (1) Group I (6 to 12 months), 113 feet in 81 cases; (2) Group II (1 to 3 years old), 78 feet in 52 cases; (3) Group III (> 3 years old), 36 feet in 24 cases. The degree of deformity was evaluated with the Pirani scoring system. The cases were classified into three groups according to deformity degree: (1) Mild Group (score 1-2.5), 85 feet in 56 cases; (2) Moderate Group (score 3-4.5), 104 feet in 71 cases; (3) Severe Group (score 5-6), 38 feet in 30 cases. A Pirani score of 0-0.5 was regarded as an excellent result. For each group, we evaluated the number of casts used, the percentage of excellent results according to the Pirani score, and the percentage of percutaneous achillotenotomy, and compared the results among groups. The overall percentage of excellent results among all cases was 96.92%. Among the age groups, the percentage of excellence was not statistically different between Group I and Group II (P > 0.05). The percentage of excellence was lower in Group III than in the other groups (P < 0.01). Among the groups classified by deformity degree, the percentage of excellence was lowest in the Severe Group (P < 0.05), and the difference between the Mild Group and Moderate Group was not statistically significant (P > 0.05). The number of casts used differed among groups (P < 0.01), as did the percentages of percutaneous achillotenotomy (P < 0.01). A total of 209 feet in 148 cases were followed up for an average of 3 years and 11 months. Relapse was observed in 40 feet in 29 cases. 
The percentages of relapse were not statistically different among groups (P > 0.05). In this study, using the Ponseti method to treat CTE in children older than 6 months achieved excellent results.
Austin, Peter C
2018-01-01
Propensity score methods are increasingly being used to estimate the effects of treatments and exposures when using observational data. The propensity score was initially developed for use with binary exposures (e.g., active treatment vs. control). The generalized propensity score is an extension of the propensity score for use with quantitative exposures (e.g., dose or quantity of medication, income, years of education). A crucial component of any propensity score analysis is that of balance assessment. This entails assessing the degree to which conditioning on the propensity score (via matching, weighting, or stratification) has balanced measured baseline covariates between exposure groups. Methods for balance assessment have been well described and are frequently implemented when using the propensity score with binary exposures. However, there is a paucity of information on how to assess baseline covariate balance when using the generalized propensity score. We describe how methods based on the standardized difference can be adapted for use with quantitative exposures when using the generalized propensity score. We also describe a method based on assessing the correlation between the quantitative exposure and each covariate in the sample when weighted using generalized propensity score-based weights. We conducted a series of Monte Carlo simulations to evaluate the performance of these methods. We also compared two different methods of estimating the generalized propensity score: ordinary least squares regression and the covariate balancing propensity score method. We illustrate the application of these methods using data on patients hospitalized with a heart attack, with creatinine level as the quantitative exposure.
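The correlation-based balance diagnostic described above can be sketched as a weighted Pearson correlation between the quantitative exposure and a covariate, with weights derived from the generalized propensity score; a value near zero suggests the weighting has broken the exposure-covariate association. The helper below is an illustrative sketch, not the authors' implementation:

```python
import math

def weighted_corr(x, y, w):
    """Weighted Pearson correlation between exposure x and covariate y, using
    (e.g. generalized-propensity-score-based) weights w."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / sw
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / sw
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / sw
    return cov / math.sqrt(vx * vy)

# With equal weights this reduces to the ordinary correlation:
print(round(weighted_corr([1, 2, 3], [2, 4, 6], [1, 1, 1]), 3))  # 1.0
```

In the balance-checking workflow, this would be evaluated for each covariate in turn; large absolute values after weighting flag covariates that remain associated with the exposure.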
Liu, Chengyu; Zhao, Lina; Tang, Hong; Li, Qiao; Wei, Shoushui; Li, Jianqing
2016-08-01
False alarm (FA) rates as high as 86% have been reported in intensive care unit monitors. High FA rates decrease quality of care by slowing staff response times while increasing patient burdens and stresses. In this study, we proposed a rule-based and multi-channel information fusion method for accurately classifying true or false alarms for five life-threatening arrhythmias: asystole (ASY), extreme bradycardia (EBR), extreme tachycardia (ETC), ventricular tachycardia (VTA) and ventricular flutter/fibrillation (VFB). The proposed method consisted of five steps: (1) signal pre-processing, (2) feature detection and validation, (3) true/false alarm determination for each channel, (4) 'real-time' true/false alarm determination and (5) 'retrospective' true/false alarm determination (if needed). Up to four signal channels, that is, two electrocardiogram signals, one arterial blood pressure signal and/or one photoplethysmogram signal, were included in the analysis. Two events were set for the method validation: event 1 for 'real-time' and event 2 for 'retrospective' alarm classification. The results showed that a 100% true positive ratio (i.e. sensitivity) was obtained on the training set for the ASY, EBR, ETC and VFB types, and 94% for the VTA type, accompanied by corresponding true negative ratio (i.e. specificity) results of 93%, 81%, 78%, 85% and 50%, respectively, yielding score values of 96.50, 90.70, 88.89, 92.31 and 64.90, with a final score of 80.57 for event 1 and 79.12 for event 2. For the test set, the proposed method obtained scores of 88.73 for ASY, 77.78 for EBR, 89.92 for ETC, 67.74 for VFB and 61.04 for VTA, with final scores of 71.68 for event 1 and 75.91 for event 2.
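One rule of the kind such a classifier composes can be sketched as follows. The 4-second pause threshold, the alarm window, and the beat times are illustrative assumptions for the sketch, not the paper's tuned parameters:

```python
# Minimal sketch of one rule in a rule-based alarm classifier of this kind:
# an asystole (ASY) alarm is kept as TRUE if the validated beat sequence
# contains a pause longer than a threshold, and rejected as FALSE otherwise.
# Threshold, window, and beat times here are illustrative assumptions.

def asystole_alarm_is_true(beat_times, window=(290.0, 300.0), pause=4.0):
    """beat_times: sorted beat timestamps (seconds) from a validated channel."""
    beats = [t for t in beat_times if window[0] <= t <= window[1]]
    # Include window edges so a pause at either end of the window counts.
    points = [window[0]] + beats + [window[1]]
    gaps = [b - a for a, b in zip(points, points[1:])]
    return max(gaps) > pause

print(asystole_alarm_is_true([290.5, 291.3, 296.9, 297.7]))  # True (5.6 s pause)
```

A multi-channel fusion step would then combine per-channel decisions like this one, falling back to the pressure or photoplethysmogram channel when the ECG features fail validation.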
Automatic Evidence Retrieval for Systematic Reviews
Choong, Miew Keen; Galgani, Filippo; Dunn, Adam G
2014-01-01
Background Snowballing involves recursively pursuing relevant references cited in the retrieved literature and adding them to the search results. Snowballing is an alternative approach to discover additional evidence that was not retrieved through conventional search. Snowballing’s effectiveness makes it best practice in systematic reviews despite being time-consuming and tedious. Objective Our goal was to evaluate an automatic method for citation snowballing’s capacity to identify and retrieve the full text and/or abstracts of cited articles. Methods Using 20 review articles that contained 949 citations to journal or conference articles, we manually searched Microsoft Academic Search (MAS) and identified 78.0% (740/949) of the cited articles that were present in the database. We compared the performance of the automatic citation snowballing method against the results of this manual search, measuring precision, recall, and F1 score. Results The automatic method was able to correctly identify 633 (as proportion of included citations: recall=66.7%, F1 score=79.3%; as proportion of citations in MAS: recall=85.5%, F1 score=91.2%) of citations with high precision (97.7%), and retrieved the full text or abstract for 490 (recall=82.9%, precision=92.1%, F1 score=87.3%) of the 633 correctly retrieved citations. Conclusions The proposed method for automatic citation snowballing is accurate and is capable of obtaining the full texts or abstracts for a substantial proportion of the scholarly citations in review articles. By automating the process of citation snowballing, it may be possible to reduce the time and effort of common evidence surveillance tasks such as keeping trial registries up to date and conducting systematic reviews. PMID:25274020
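The F1 scores reported above are the harmonic mean of precision and recall, which can be checked directly from the quoted figures:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both as proportions)."""
    return 2 * precision * recall / (precision + recall)

# Reproduce the figures reported in the abstract:
print(round(100 * f1_score(0.977, 0.667), 1))  # 79.3 (vs. included citations)
print(round(100 * f1_score(0.977, 0.855), 1))  # 91.2 (vs. citations in MAS)
print(round(100 * f1_score(0.921, 0.829), 1))  # 87.3 (full text/abstract retrieval)
```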
Improving Factor Score Estimation Through the Use of Observed Background Characteristics
Curran, Patrick J.; Cole, Veronica; Bauer, Daniel J.; Hussong, Andrea M.; Gottfredson, Nisha
2016-01-01
A challenge facing nearly all studies in the psychological sciences is how to best combine multiple items into a valid and reliable score to be used in subsequent modelling. The most ubiquitous method is to compute a mean of items, but more contemporary approaches use various forms of latent score estimation. Regardless of approach, outside of large-scale testing applications, scoring models rarely include background characteristics to improve score quality. The current paper used a Monte Carlo simulation design to study score quality for different psychometric models that did and did not include covariates across levels of sample size, number of items, and degree of measurement invariance. The inclusion of covariates improved score quality for nearly all design factors, and in no case did the covariates degrade score quality relative to not considering the influences at all. Results suggest that the inclusion of observed covariates can improve factor score estimation. PMID:28757790
Doosti-Irani, Amin; Mansournia, Mohammad Ali; Rahimi-Foroushani, Abbas; Haddad, Peiman
2017-01-01
Background Palliative treatments and stents are necessary for relieving dysphagia in patients with esophageal cancer. The aim of this study was to simultaneously compare available treatments in terms of complications. Methods Web of Science, Medline, Scopus, Cochrane Library and Embase were searched. Statistical heterogeneity was assessed using the Chi2 test and was quantified by I2. The results of this study were summarized in terms of Risk Ratio (RR). The random effects model was used to report the results. The rank probability for each treatment was calculated using the p-score. Results Out of 17855 references, 24 RCTs reported complications including treatment related death (TRD), bleeding, stent migration, aspiration, severe pain and fistula formation. In the ranking of treatments, thermal ablative therapy (p-score = 0.82), covered Evolution® stent (p-score = 0.70), brachytherapy (p-score = 0.72) and antireflux stent (p-score = 0.74) were better treatments in the network of TRD. Thermal ablative therapy (p-score = 0.86), the conventional stent (p-score = 0.62), covered Evolution® stent (p-score = 0.96) and brachytherapy (p-score = 0.82) were better treatments in the network of bleeding complications. Covered Evolution® (p-score = 0.78), uncovered (p-score = 0.88) and irradiation stents (p-score = 0.65) were better treatments in network of stent migration complications. In the network of severe pain, Conventional self-expandable nitinol alloy covered stent (p-score = 0.73), polyflex (p-score = 0.79), latex prosthesis (p-score = 0.96) and brachytherapy (p-score = 0.65) were better treatments. Conclusion According to our results, thermal ablative therapy, covered Evolution® stents, brachytherapy, and antireflux stents are associated with a lower risk of TRD. Moreover, thermal ablative therapy, conventional, covered Evolution® and brachytherapy had lower risks of bleeding. 
Overall, fewer complications were associated with covered Evolution® stent and brachytherapy. PMID:28968416
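The p-score used for ranking treatments above can be computed as each treatment's mean probability of being better than every competitor, using normal approximations to the differences in relative effects (as in frequentist network meta-analysis). The effect estimates below are toy values, not the study's data:

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def p_scores(effects, ses, smaller_is_better=True):
    """effects/ses: dicts mapping treatment -> point estimate / standard error.
    p-score = mean probability of being better than each other treatment."""
    out = {}
    for i in effects:
        probs = []
        for j in effects:
            if i == j:
                continue
            diff = effects[i] - effects[j]
            se = math.sqrt(ses[i] ** 2 + ses[j] ** 2)
            p = norm_cdf(-diff / se) if smaller_is_better else norm_cdf(diff / se)
            probs.append(p)
        out[i] = sum(probs) / len(probs)
    return out

# Toy log-risk-ratios versus a common reference (illustrative, not study data):
scores = p_scores({"A": -0.5, "B": 0.0, "C": 0.4}, {"A": 0.2, "B": 0.2, "C": 0.2})
print(sorted(scores, key=scores.get, reverse=True))  # ['A', 'B', 'C']
```

A useful sanity check: p-scores always average 0.5 across treatments, since each pairwise probability and its complement sum to 1.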
Gross Motor Development in Children Aged 3-5 Years, United States 2012.
Kit, Brian K; Akinbami, Lara J; Isfahani, Neda Sarafrazi; Ulrich, Dale A
2017-07-01
Objective Gross motor development in early childhood is important in fostering greater interaction with the environment. The purpose of this study is to describe gross motor skills among US children aged 3-5 years using the Test of Gross Motor Development (TGMD-2). Methods We used 2012 NHANES National Youth Fitness Survey (NNYFS) data, which included TGMD-2 scores obtained according to an established protocol. Outcome measures included locomotor and object control raw and age-standardized scores. Means and standard errors were calculated for demographic and weight status with SUDAAN using sample weights to calculate nationally representative estimates, and survey design variables to account for the complex sampling methods. Results The sample included 339 children aged 3-5 years. As expected, locomotor and object control raw scores increased with age. Overall mean standardized scores for locomotor and object control were similar to the mean value previously determined using a normative sample. Girls had a higher mean locomotor, but not mean object control, standardized score than boys (p < 0.05). However, the mean locomotor standardized scores for both boys and girls fell into the range categorized as "average." There were no other differences by age, race/Hispanic origin, weight status, or income in either of the subtest standardized scores (p > 0.05). Conclusions In a nationally representative sample of US children aged 3-5 years, TGMD-2 mean locomotor and object control standardized scores were similar to the established mean. These results suggest that standardized gross motor development among young children generally did not differ by demographic or weight status.
Creating a Computer Adaptive Test Version of the Late-Life Function & Disability Instrument
Jette, Alan M.; Haley, Stephen M.; Ni, Pengsheng; Olarsch, Sippy; Moed, Richard
2009-01-01
Background This study applied Item Response Theory (IRT) and Computer Adaptive Test (CAT) methodologies to develop a prototype function and disability assessment instrument for use in aging research. Herein, we report on the development of the CAT version of the Late-Life Function & Disability instrument (Late-Life FDI) and evaluate its psychometric properties. Methods We employed confirmatory factor analysis, IRT methods, validation, and computer simulation analyses of data collected from 671 older adults residing in residential care facilities. We compared accuracy, precision, and sensitivity to change of scores from CAT versions of two Late-Life FDI scales with scores from the fixed-form instrument. Score estimates from the prototype CAT versus the original instrument were compared in a sample of 40 older adults. Results Distinct function and disability domains were identified within the Late-Life FDI item bank and used to construct two prototype CAT scales. Using retrospective data, scores from computer simulations of the prototype CAT scales were highly correlated with scores from the original instrument. The results of computer simulation, accuracy, precision, and sensitivity to change of the CATs closely approximated those of the fixed-form scales, especially for the 10- or 15-item CAT versions. In the prospective study each CAT was administered in less than 3 minutes and CAT scores were highly correlated with scores generated from the original instrument. Conclusions CAT scores of the Late-Life FDI were highly comparable to those obtained from the full-length instrument with a small loss in accuracy, precision, and sensitivity to change. PMID:19038841
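The core CAT step this kind of instrument relies on, choosing the next item to maximize Fisher information at the current ability estimate, can be sketched under a two-parameter logistic (2PL) IRT model. The item bank below is hypothetical:

```python
import math

def p_2pl(theta, a, b):
    """2PL probability of endorsing an item at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1 - p)

def select_next_item(theta, bank, administered):
    """bank: item_id -> (discrimination a, difficulty b). Pick the most
    informative item not yet administered, given the current theta estimate."""
    remaining = {i: ab for i, ab in bank.items() if i not in administered}
    return max(remaining, key=lambda i: item_information(theta, *remaining[i]))

# Hypothetical 3-item bank of (a, b) parameters:
bank = {"easy": (1.2, -1.5), "medium": (1.5, 0.0), "hard": (1.1, 1.8)}
print(select_next_item(0.1, bank, administered=set()))  # medium
```

After each response, theta would be re-estimated and the loop repeated until a precision or item-count stopping rule is met, which is why a 10- to 15-item CAT can approach the precision of the full fixed form.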
Rempe, Michael J; Clegern, William C; Wisor, Jonathan P
2015-01-01
Introduction Rodent sleep research uses electroencephalography (EEG) and electromyography (EMG) to determine the sleep state of an animal at any given time. EEG and EMG signals, typically sampled at >100 Hz, are segmented arbitrarily into epochs of equal duration (usually 2–10 seconds), and each epoch is scored as wake, slow-wave sleep (SWS), or rapid-eye-movement sleep (REMS), on the basis of visual inspection. Automated state scoring can minimize the burden associated with state scoring and thereby facilitate the use of shorter epoch durations. Methods We developed a semiautomated state-scoring procedure that uses a combination of principal component analysis and naïve Bayes classification, with the EEG and EMG as inputs. We validated this algorithm against human-scored sleep-state scoring of data from C57BL/6J and BALB/CJ mice. We then applied a general homeostatic model to characterize the state-dependent dynamics of sleep slow-wave activity and cerebral glycolytic flux, measured as lactate concentration. Results More than 89% of epochs scored as wake or SWS by the human were scored as the same state by the machine, whether scoring in 2-second or 10-second epochs. The majority of epochs scored as REMS by the human were also scored as REMS by the machine. However, of epochs scored as REMS by the human, more than 10% were scored as SWS by the machine and 18% (10-second epochs) to 28% (2-second epochs) were scored as wake. These biases were not strain-specific, as strain differences in sleep-state timing relative to the light/dark cycle, EEG power spectral profiles, and the homeostatic dynamics of both slow waves and lactate were detected equally effectively with the automated method or the manual scoring method.
Error associated with mathematical modeling of temporal dynamics of both EEG slow-wave activity and cerebral lactate either did not differ significantly when state scoring was done with automated versus visual scoring, or was reduced with automated state scoring relative to manual classification. Conclusions Machine scoring is as effective as human scoring in detecting experimental effects in rodent sleep studies. Automated scoring is an efficient alternative to visual inspection in studies of strain differences in sleep and the temporal dynamics of sleep-related physiological parameters. PMID:26366107
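The naïve Bayes classification step can be sketched as fitting per-state Gaussians to (for example, PCA-reduced) EEG/EMG features and scoring each epoch against them. The feature values and state labels below are invented for illustration, not the study's data:

```python
import math

def gaussian_logpdf(x, mean, var):
    """Log density of a univariate Gaussian at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def fit_naive_bayes(examples):
    """examples: list of (feature_vector, state). Returns, per state, the log
    prior and per-feature Gaussian (mean, variance) parameters."""
    by_state = {}
    for x, s in examples:
        by_state.setdefault(s, []).append(x)
    model = {}
    for s, rows in by_state.items():
        n = len(rows)
        params = []
        for j in range(len(rows[0])):
            col = [r[j] for r in rows]
            mean = sum(col) / n
            var = sum((v - mean) ** 2 for v in col) / n or 1e-9  # guard zero variance
            params.append((mean, var))
        model[s] = (math.log(n / len(examples)), params)
    return model

def classify(model, x):
    """Return the state maximizing log prior + summed feature log-likelihoods."""
    def score(s):
        log_prior, params = model[s]
        return log_prior + sum(gaussian_logpdf(xi, m, v)
                               for xi, (m, v) in zip(x, params))
    return max(model, key=score)

# Toy 2-feature epochs (e.g. EMG power, EEG delta power); values are invented.
examples = [([5.0, 0.3], "wake"), ([5.5, 0.4], "wake"),
            ([0.5, 2.1], "sws"), ([1.0, 2.4], "sws")]
model = fit_naive_bayes(examples)
print(classify(model, [5.2, 0.35]))  # wake
```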
Calculating the risk of a pancreatic fistula after a pancreaticoduodenectomy: a systematic review
Vallance, Abigail E; Young, Alastair L; Macutkiewicz, Christian; Roberts, Keith J; Smith, Andrew M
2015-01-01
Background A post-operative pancreatic fistula (POPF) is a major cause of morbidity and mortality after a pancreaticoduodenectomy (PD). This systematic review aimed to identify all scoring systems to predict POPF after a PD, consider their clinical applicability and assess the study quality. Method An electronic search was performed of Medline (1946–2014) and EMBASE (1996–2014) databases. Results were screened according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, and quality was assessed according to the QUIPS (quality in prognostic studies) tool. Results Six eligible scoring systems were identified. Five studies used the International Study Group on Pancreatic Fistula (ISGPF) definition. The proposed scores feature between two and five variables, and of the 16 total variables, the majority (12) featured in only one score. Three scores could be completed entirely pre-operatively, whereas one score included intra-operative variables and two included post-operative variables. Four scores were internally validated and of these, two scores have been subject to subsequent multicentre review. The median QUIPS score was 38 out of 50 (range 16–50). Conclusion These scores show potential in calculating the individualized patient risk of POPF. There is, however, much variation in current scoring systems and further validation in large multicentre cohorts is now needed. PMID:26456948
Medial Tibial Stress Syndrome: Evidence-Based Prevention
Craig, Debbie I
2008-01-01
Reference: Thacker SB, Gilchrist J, Stroup DF, Kimsey CD. The prevention of shin splints in sports: a systematic review of literature. Med Sci Sports Exerc. 2002;34(1):32–40. Clinical Question: Among physically active individuals, which medial tibial stress syndrome (MTSS) prevention methods are most effective to decrease injury rates? Data Sources: Studies were identified by searching MEDLINE (1966–2000), Current Contents (1996–2000), Biomedical Collection (1993–1999), and Dissertation Abstracts. Reference lists of identified studies were searched manually until no further studies were identified. Experts in the field were contacted, including first authors of randomized controlled trials addressing prevention of MTSS. The Cochrane Collaboration (early stage of Cochrane Database of Systematic Reviews) was contacted. Study Selection: Inclusion criteria included randomized controlled trials or clinical trials comparing different MTSS prevention methods with control groups. Excluded were studies that did not provide primary research data or that addressed treatment and rehabilitation rather than prevention of incident MTSS. Data Extraction: A total of 199 citations were identified. Of these, 4 studies compared prevention methods for MTSS. Three reviewers independently scored the 4 studies. Reviewers were blinded to the authors' names and affiliations but not the results. Each study was evaluated independently for methodologic quality using a 100-point checklist. Final scores were averages of the 3 reviewers' scores. Main Results: Prevention methods studied were shock-absorbent insoles, foam heel pads, Achilles tendon stretching, footwear, and graduated running programs. No statistically significant results were noted for any of the prevention methods. Median quality scores ranged from 29 to 47, revealing flaws in design, control for bias, and statistical methods. Conclusions: No current evidence supports any single prevention method for MTSS. 
The most promising outcomes support the use of shock-absorbing insoles. Well-designed and controlled trials are critically needed to decrease the incidence of this common injury. PMID:18523568
Measuring change in critical thinking skills of dental students educated in a PBL curriculum.
Pardamean, Bens
2012-04-01
This study measured the change in critical thinking skills of dental students educated in a problem-based learning (PBL) pedagogical method. The quantitative analysis focused on measuring students' critical thinking skills achievement from their first through third years of dental education at the University of Southern California. This non-experimental evaluation was based on a volunteer sample of ninety-eight dental students who completed a demographics/academic questionnaire and a psychometric assessment known as the Health Sciences Reasoning Test (HSRT). The HSRT produced the overall critical thinking skills score and additionally generated five subscale scores: analysis, inference, evaluation, deductive reasoning, and inductive reasoning. This study concluded that the students showed no continuous, significant incremental improvement in their overall critical thinking skills score during their PBL-based dental education. Except for the inductive reasoning score, this result was consistent across the remaining four subscale scores. Moreover, after statistical adjustment of the total and subscale scores, no statistically significant differences were found among the three student groups. However, some aspects of critical thinking achievement differed by gender, race, English as first language, and education level.
Zhong, Guangjun; Liang, Zhu; Kan, Jiang; Muheremu, Aikeremujiang
2018-01-01
Objective This study was performed to determine the efficacy of selective peripheral nerve resection for treatment of persistent neuropathic pain after total knee arthroplasty (TKA). Methods Patients who underwent TKA in our department from January 2013 to July 2016 and experienced persistent pain around the knee joint after TKA were retrospectively included in the current study. Sixty patients were divided into experimental and control groups according to the treatment they received. The treatment effect was evaluated by the Hospital for Special Surgery (HSS) knee score and visual analog scale (VAS) pain score preoperatively and at 1, 2, 3, 6, and 12 months postoperatively. Results The HSS knee scores improved in both groups after treatment and were significantly higher in the experimental group than in the control group. The VAS pain scores decreased in both groups after treatment and were significantly lower in the experimental group than in the control group. Conclusions Selective peripheral nerve resection is an effective treatment method for persistent neuropathic pain after TKA.
Soleymani, Mohammad Reza; Hemmati, Soheila; Ashrafi-Rizi, Hassan; Shahrzadi, Leila
2017-01-01
Maintaining and improving the health of children requires making them more aware of personal hygiene through proper education. Several studies indicate that teaching delivered through informal methods is readily understood by children. Therefore, the goal of this study was to compare the effects of the creative drama and storytelling education methods on increasing children's awareness of personal hygiene. This applied, quasi-experimental study was conducted in two groups. The study population consisted of 85 children attending the 4th center of the Institute for the Intellectual Development of Children and Young Adults in Isfahan, 40 of whom were randomly selected and assigned to storytelling and creative drama groups of 20 members each. The data-gathering tool was a researcher-made questionnaire whose content validity was confirmed by health education experts. The gathered data were analyzed using descriptive (mean and standard deviation) and analytical (independent t-test and paired t-test) statistical methods. The findings showed a significant difference between the awareness scores of each group before and after the intervention: the mean awareness score of the storytelling group increased from 50.69 to 86.83, while that of the creative drama group increased from 57.37 to 85.09. However, according to the paired t-test results, there was no significant difference between the mean scores of the storytelling and creative drama groups. The results of the current study showed that although both storytelling and creative drama are effective in increasing children's awareness of personal hygiene, there is no significant difference between the two methods.
Medial tibial stress syndrome: evidence-based prevention.
Craig, Debbie I
2008-01-01
Wyllie, E; Naugle, R; Awad, I; Chelune, G; Lüders, H; Dinner, D; Skibinski, C; Ahl, J
1991-01-01
To assess predictive value of the intracarotid amobarbital procedure (IAP) for decreased postoperative modality-specific memory, we studied 37 temporal lobectomy patients with intractable partial epilepsy who were selected for operation independent of preoperative IAP findings. When ipsilateral IAP failure was defined by an absolute method as a retention score less than 67%, the results were not associated with decreased modality-specific memory after operation. When ipsilateral IAP failure was defined by a comparative method as a retention score at least 20% lower after ipsilateral than contralateral injection, the results showed greater differences between groups, but differences still did not achieve statistical significance. Four left-resection patients who failed the ipsilateral IAP had a median postoperative change in the Wechsler Memory Scale-Revised (WMS-R) Verbal Memory Index score of -14%, whereas 16 left-resection patients who passed the ipsilateral IAP had a mean postoperative change in the WMS-R Verbal Memory Index score of -7.5% (p = 0.12). These results suggested that the IAP interpreted comparatively may be a helpful adjunctive test in assessment of relative risk for modality-specific memory dysfunction after temporal lobectomy, but larger series of operated patients are needed to confirm this possibility. In this series, complete amnesia was not noted after ipsilateral injection, even in patients with postoperative modality-specific memory decline.
Can formative quizzes predict or improve summative exam performance?*
Zhang, Niu; Henderson, Charles N.R.
2015-01-01
Objective Despite wide use, the value of formative exams remains unclear. We evaluated the possible benefits of formative assessments in a physical examination course at our chiropractic college. Methods Three hypotheses were examined: (1) receiving formative quizzes (FQs) will increase summative exam (SX) scores, (2) writing FQ questions will further increase SX scores, and (3) FQs can predict SX scores. Hypotheses were tested across three separate iterations of the class. Results The SX scores for the control group (Class 3) were significantly lower than those of Classes 1 and 2, but writing quiz questions and taking FQs (Class 1) did not produce significantly higher SX scores than taking FQs alone (Class 2). The FQ scores were significant predictors of SX scores, accounting for 52% of the variance in SX scores. Sex, age, academic degrees, and ethnicity were not significant copredictors. Conclusion Our results support the assertion that FQs can improve written SX performance, although having students write quiz questions did not further increase SX scores. We conclude that nonthreatening FQs may be used to enhance student learning and suggest that they may also serve to identify students who, without additional remediation, will perform poorly on subsequent summative written exams. PMID:25517737
Sermsathanasawadi, Nuttawut; Chaivanit, Trakarn; Suparatchatpun, Pinyo; Chinsakchai, Khamin; Wongwanit, Chumpol; Ruangsetakit, Chanean; Mutirangura, Pramook
2017-03-01
Objective To develop a new pretest probability score for deep vein thrombosis (DVT) in unselected population of outpatients and inpatients. Methods The new score was developed using independent factors from 500 patients clinically suspected of leg DVT. The new score was validated in a second group of 315 patients. Results The score consists of four components: unilateral leg pain, confinement to bed, calf enlargement >3 cm compared with the other side, and previous venous thromboembolism. A score ≥2 indicated a high probability while a score <2 indicated low probability. The sensitivity and specificity of the new score were 71.60% and 79.49%, respectively. The area under the receiver operating characteristic curve for the new score was 0.79. The combination of a new score <2 and D-dimer level <500 µg/L had a negative predictive value of 96.43%. Conclusions Our new score was valid in an unselected population of outpatients and inpatients.
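The decision rule described in this abstract is simple enough to sketch in code. Equal one-point weighting of the four components is an assumption (the abstract does not state the weights), and the function names are hypothetical:

```python
def dvt_pretest_score(unilateral_leg_pain, confined_to_bed,
                      calf_enlargement_gt_3cm, previous_vte):
    """Sum the four clinical components; 1 point each is an assumption,
    as the abstract does not state the weighting."""
    return sum([unilateral_leg_pain, confined_to_bed,
                calf_enlargement_gt_3cm, previous_vte])

def dvt_probability(score, d_dimer_ug_per_l=None):
    """Score >= 2 indicates high pretest probability; a low score combined
    with D-dimer < 500 ug/L was reported to have a 96.43% negative
    predictive value."""
    if score >= 2:
        return "high"
    if d_dimer_ug_per_l is not None and d_dimer_ug_per_l < 500:
        return "low (DVT effectively ruled out)"
    return "low"

print(dvt_probability(dvt_pretest_score(True, False, True, False)))  # prints "high"
```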
Khodaveisi, Masoud; Qaderian, Khosro; Oshvandi, Khodayar; Soltanian, Ali Reza; Vardanjani, Mehdi molavi
2017-01-01
Background and aims: Learning plays an important role in developing nursing skills and proper care-giving. The present study compared a team-based learning method and a lecture-based learning method for teaching nursing students the care of patients with diabetes. Method: In this quasi-experimental study, 64 fourth-term students at the nursing colleges of Bukan and Miandoab were included. The data collection tool was a researcher-made knowledge and performance questionnaire on diabetes care (15 knowledge questions and 5 performance questions) whose reliability was confirmed by Cronbach's alpha (r = 0.83). Paired t-tests were used to compare mean knowledge and performance scores within each group between pre-test and post-test, and independent t-tests were used to compare means between the control and intervention groups. Results: There was no statistically significant difference between the two groups in pre-test knowledge and performance scores (p = 0.784). There was a significant difference in mean post-test knowledge and performance scores for diabetes care between the team-based and lecture-based learning groups (p = 0.001). There was also a significant difference between mean pre-test and post-test knowledge scores of diabetes care within the learning groups (p = 0.001). Conclusion: Both team-based and lecture-based learning improved student learning, but the gain was greater with the team-based approach, which is therefore recommended as the preferred method in the education of students.
Semi-automatic computerized approach to radiological quantification in rheumatoid arthritis
NASA Astrophysics Data System (ADS)
Steiner, Wolfgang; Schoeffmann, Sylvia; Prommegger, Andrea; Boegl, Karl; Klinger, Thomas; Peloschek, Philipp; Kainberger, Franz
2004-04-01
Rheumatoid Arthritis (RA) is a common systemic disease predominantly involving the joints. Precise diagnosis and follow-up therapy require objective quantification. For this purpose, radiological analyses using standardized scoring systems are considered to be the most appropriate method. The aim of our study was to develop semi-automatic image analysis software especially applicable to scoring of joints in rheumatic disorders. The X-Ray RheumaCoach software delivers various scoring systems (Larsen score and Ratingen-Rau score) that can be applied by the scorer. In addition to the qualitative assessment of joints performed by the radiologist, a semi-automatic image analysis for joint detection and measurement of bone diameters and swollen tissue supports the image assessment process. More than 3000 radiographs from hands and feet of more than 200 RA patients were collected, analyzed, and statistically evaluated. Radiographs were quantified using the conventional paper-based Larsen score and the X-Ray RheumaCoach software. The use of the software shortened the scoring time by about 25 percent and reduced the rate of erroneous scorings in all our studies. Compared to paper-based scoring methods, the X-Ray RheumaCoach software offers several advantages: (i) structured data analysis and input that minimizes variance by standardization, (ii) faster and more precise calculation of sum scores and indices, (iii) permanent data storage and fast access to the software's database, (iv) the possibility of cross-calculation to other scores, (v) semi-automatic assessment of images, and (vi) reliable documentation of results in the form of graphical printouts.
Zeng, Tao; Mott, Christopher; Mollicone, Daniel; Sanford, Larry D.
2012-01-01
The current standard for monitoring sleep in rats requires labor-intensive surgical procedures and the implantation of chronic electrodes, which have the potential to impact behavior and sleep. With the goal of developing a non-invasive method to determine sleep and wakefulness, we constructed a non-contact monitoring system to measure movement and respiratory activity using signals acquired with pulse Doppler radar and from digitized video analysis. A set of 23 frequency- and time-domain features was derived from these signals and calculated in 10 s epochs. Based on these features, a classification method for automated scoring of wakefulness, non-rapid eye movement (NREM) sleep, and rapid eye movement (REM) sleep in rats was developed using a support vector machine (SVM). We then assessed the utility of the automated scoring system in discriminating wakefulness and sleep by comparing the results to standard scoring of wakefulness and sleep based on concurrently recorded EEG and EMG. Agreement between SVM automated scoring based on selected features and visual scores based on EEG and EMG was approximately 91% for wakefulness, 84% for NREM, and 70% for REM. The results indicate that automated scoring based on non-invasively acquired movement and respiratory activity will be useful for studies requiring discrimination of wakefulness and sleep. However, additional information or signals will be needed to improve discrimination of NREM and REM episodes within sleep. PMID:22178621
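The pipeline this abstract describes — 23 features per 10 s epoch fed to an SVM, then checked against EEG/EMG-based scores — can be sketched with scikit-learn. The data below are synthetic stand-ins, not the study's radar and video signals:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_epochs, n_features = 300, 23           # 10 s epochs, 23 movement/respiration features
X = rng.normal(size=(n_epochs, n_features))
y = rng.integers(0, 3, size=n_epochs)    # 0 = wake, 1 = NREM, 2 = REM (synthetic labels)
X += 0.8 * y[:, None]                    # shift class means so the classes are learnable

# standardize features, then fit an RBF-kernel SVM
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
accuracy = cross_val_score(clf, X, y, cv=5).mean()
print(f"mean cross-validated accuracy: {accuracy:.2f}")
```

Per-class agreement figures like the abstract's (91% wake, 84% NREM, 70% REM) would come from a per-class confusion matrix rather than this overall accuracy.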
Morreale, Federico; Angelino, Donato; Pellegrini, Nicoletta
2018-04-25
Gluten-free (GF) products are consumed both by individuals with celiac disease and by an increasing number of people with no specific medical needs. Although the technological quality of GF products has recently improved, their nutritional quality is still scarcely addressed. Moreover, the few published studies report conflicting results, mostly because the information from product nutrition facts is the only factor considered. The aim of the present study was to develop a score-based method for the nutritional evaluation of 134 packaged Italian GF bakery products and to compare their nutritional quality with that of 162 matched gluten-containing (GC) food items. The score included the information from the nutrition facts and the presence/absence of some nutritionally relevant components in the ingredients list. Results indicated an overall low nutritional quality of the considered GF bakery products. Additionally, with the sole exception of GF bread substitutes, there was no difference in nutritional quality between GF and equivalent GC bakery products. Future research and development of GF bakery products may take advantage of this scoring method, as it represents an easy approach to evaluating their nutritional quality. The present findings do not justify the consumption of packaged GF bakery products by people without any specific medical needs.
Liinamo, A E; Karjalainen, L; Ojala, M; Vilva, V
1997-03-01
Data from field trials of Finnish Hounds between 1988 and 1992 in Finland were used to estimate genetic parameters and environmental effects for measures of hunting performance using REML procedures and an animal model. The original data set included 28,791 field trial records from 5,666 dogs. Males and females had equal hunting performance, whereas experience acquired by age improved trial results compared with results for young dogs (P < .001). Results were mostly better on snow than on bare ground (P < .001), and testing areas, years, months, and their interactions affected results (P < .001). Estimates of heritabilities and repeatabilities were low for most of the 28 measures, mainly due to large residual variances. The highest heritabilities were for frequency of tonguing (h2 = .15), pursuit score (h2 = .13), tongue score (h2 = .13), ghost trailing score (h2 = .12), and merit and final score (both h2 = .11). Estimates of phenotypic and genetic correlations were positive and moderate or high for search scores, pursuit scores, and final scores but lower for other studied measures. The results suggest that, due to low heritabilities, evaluation of breeding values for Finnish Hounds with respect to their hunting ability should be based on animal model BLUP methods instead of mere performance testing. The evaluation system of field trials should also be revised for more reliability.
New methods for analyzing semantic graph based assessments in science education
NASA Astrophysics Data System (ADS)
Vikaros, Lance Steven
This research investigated how the scoring of semantic graphs (known by many as concept maps) could be improved and automated in order to address issues of inter-rater reliability and scalability. As part of the NSF funded SENSE-IT project to introduce secondary school science students to sensor networks (NSF Grant No. 0833440), semantic graphs illustrating how temperature change affects water ecology were collected from 221 students across 16 schools. The graphing task did not constrain students' use of terms, as is often done with semantic graph based assessment due to coding and scoring concerns. The graphing software used provided real-time feedback to help students learn how to construct graphs, stay on topic and effectively communicate ideas. The collected graphs were scored by human raters using assessment methods expected to boost reliability, which included adaptations of traditional holistic and propositional scoring methods, use of expert raters, topical rubrics, and criterion graphs. High levels of inter-rater reliability were achieved, demonstrating that vocabulary constraints may not be necessary after all. To investigate a new approach to automating the scoring of graphs, thirty-two different graph features characterizing graphs' structure, semantics, configuration and process of construction were then used to predict human raters' scoring of graphs in order to identify feature patterns correlated to raters' evaluations of graphs' topical accuracy and complexity. Results led to the development of a regression model able to predict raters' scoring with 77% accuracy, with 46% accuracy expected when used to score new sets of graphs, as estimated via cross-validation tests. Although such performance is comparable to other graph and essay based scoring systems, cross-context testing of the model and methods used to develop it would be needed before it could be recommended for widespread use. 
Still, the findings suggest techniques for improving the reliability and scalability of semantic graph based assessments without requiring constraint of how ideas are expressed.
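The scoring-automation step described above — regressing human raters' scores on graph features, then estimating performance on new graphs via cross-validation — can be illustrated as follows. The features and rater scores are simulated; they are not the SENSE-IT data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_graphs, n_features = 221, 32            # 221 student graphs, 32 graph features
X = rng.normal(size=(n_graphs, n_features))
true_w = rng.normal(size=n_features)
y = X @ true_w + rng.normal(scale=2.0, size=n_graphs)  # simulated rater scores

model = LinearRegression()
fit_r2 = model.fit(X, y).score(X, y)                             # in-sample fit
cv_r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()  # expected on new graphs
print(f"in-sample R^2 = {fit_r2:.2f}, cross-validated R^2 = {cv_r2:.2f}")
```

The gap between the two numbers mirrors the contrast in the abstract between fitted accuracy (77%) and the cross-validated estimate for new sets of graphs (46%): in-sample performance overstates what to expect on unseen data.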
ERIC Educational Resources Information Center
Diaz, Juan Jose; Handa, Sudhanshu
2006-01-01
Not all policy questions can be addressed by social experiments. Nonexperimental evaluation methods provide an alternative to experimental designs but their results depend on untestable assumptions. This paper presents evidence on the reliability of propensity score matching (PSM), which estimates treatment effects under the assumption of…
Applications of Small Area Estimation to Generalization with Subclassification by Propensity Scores
ERIC Educational Resources Information Center
Chan, Wendy
2018-01-01
Policymakers have grown increasingly interested in how experimental results may generalize to a larger population. However, recently developed propensity score-based methods are limited by small sample sizes, where the experimental study is generalized to a population that is at least 20 times larger. This is particularly problematic for methods…
NASA Astrophysics Data System (ADS)
Josephsen, Gary D.; Josephsen, Kelly A.; Beilman, Greg J.; Taylor, Jodie H.; Muiler, Kristine E.
2005-12-01
This is a report of the adaptation of microwave processing in the preparation of liver biopsies for transmission electron microscopy (TEM) to examine ultrastructural damage of mitochondria in the setting of metabolic stress. Hemorrhagic shock was induced in pigs via 35% total blood volume bleed and a 90-min period of shock followed by resuscitation. Hepatic biopsies were collected before shock and after resuscitation. Following collection, biopsies were processed for TEM by a rapid method involving microwave irradiation (Giberson, 2001). Samples pre- and postshock of each of two animals were viewed and scored using the mitochondrial ultrastructure scoring system (Crouser et al., 2002), a system used to quantify the severity of ultrastructural damage during shock. Results showed evidence of increased ultrastructural damage in the postshock samples, which scored 4.00 and 3.42, versus their preshock controls, which scored 1.18 and 1.27. The results of this analysis were similar to those obtained in another model of shock (Crouser et al., 2002). However, the amount of time used to process the samples was significantly shortened with methods involving microwave irradiation.
Perser, Karen; Godfrey, David; Bisson, Leslie
2011-01-01
Context: Double-row rotator cuff repair methods have improved biomechanical performance when compared with single-row repairs. Objective: To review clinical outcomes of single-row versus double-row rotator cuff repair with the hypothesis that double-row rotator cuff repair will result in better clinical and radiographic outcomes. Data Sources: Published literature from January 1980 to April 2010. Key terms included rotator cuff, prospective studies, outcomes, and suture techniques. Study Selection: The literature was systematically searched, and 5 level I and II studies were found comparing clinical outcomes of single-row and double-row rotator cuff repair. Coleman methodology scores were calculated for each article. Data Extraction: Meta-analysis was performed, with treatment effect between single row and double row for clinical outcomes and with odds ratios for radiographic results. The sample size necessary to detect a given difference in clinical outcome between the 2 methods was calculated. Results: Three level I studies had Coleman scores of 80, 74, and 81, and two level II studies had scores of 78 and 73. There were 156 patients with single-row repairs and 147 patients with double-row repairs, both with an average follow-up of 23 months (range, 12-40 months). Double-row repairs resulted in a greater treatment effect for each validated outcome measure in 4 studies, but the differences were not clinically or statistically significant (range, 0.4-2.2 points; 95% confidence interval, –0.19, 4.68 points). Double-row repairs had better radiographic results, but the differences were also not statistically significant (P = 0.13). Two studies had adequate power to detect a 10-point difference between repair methods using the Constant score, and 1 study had power to detect a 5-point difference using the UCLA (University of California, Los Angeles) score. 
Conclusions: Double-row rotator cuff repair does not show a statistically significant improvement in clinical outcome or radiographic healing with short-term follow-up. PMID:23016017
The variability of software scoring of the CDMAM phantom associated with a limited number of images
NASA Astrophysics Data System (ADS)
Yang, Chang-Ying J.; Van Metter, Richard
2007-03-01
Software scoring approaches provide an attractive alternative to human evaluation of CDMAM images from digital mammography systems, particularly for annual quality control testing as recommended by the European Protocol for the Quality Control of the Physical and Technical Aspects of Mammography Screening (EPQCM). Methods for correlating CDCOM-based results with human observer performance have been proposed. A common feature of all methods is the use of a small number (at most eight) of CDMAM images to evaluate the system. This study focuses on the potential variability in the estimated system performance that is associated with these methods. Sets of 36 CDMAM images were acquired under carefully controlled conditions from three different digital mammography systems. The threshold visibility thickness (TVT) for each disk diameter was determined using previously reported post-analysis methods from the CDCOM scorings for a randomly selected group of eight images for one measurement trial. This random selection process was repeated 3000 times to estimate the variability in the resulting TVT values for each disk diameter. The results from using different post-analysis methods, different random selection strategies, and different digital systems were compared. Additional variability of the 0.1 mm disk diameter was explored by comparing the results from two different image data sets acquired under the same conditions from the same system. The magnitude and the type of error estimated for experimental data were explained through modeling. The modeled results also suggest a limitation in the current phantom design for the 0.1 mm diameter disks. Through modeling, it was also found that, because of the binomial statistical nature of the CDMAM test, the true variability of the test could be underestimated by the commonly used method of random re-sampling.
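The source of the variability studied here is easy to reproduce in miniature: each "measurement" uses only 8 of the 36 available images, so repeating the random selection shows how much a small-sample estimate can swing. The per-image detection outcomes below are invented for illustration:

```python
import random
import statistics

random.seed(42)
n_subset, n_trials = 8, 3000

# synthetic outcomes for one disk diameter: 1 if the disk was correctly
# detected on that image, else 0 (an assumed 22/36 detection rate)
detections = [1] * 22 + [0] * 14

trial_means = [
    sum(random.sample(detections, n_subset)) / n_subset  # one 8-image trial
    for _ in range(n_trials)
]
print(f"mean detection fraction:   {statistics.mean(trial_means):.3f}")
print(f"spread across trials (SD): {statistics.stdev(trial_means):.3f}")
```

The standard deviation across trials is the variability an 8-image protocol builds into the estimate; it shrinks only as more images per trial are used.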
Failure mode and effects analysis: a comparison of two common risk prioritisation methods.
McElroy, Lisa M; Khorzad, Rebeca; Nannicelli, Anna P; Brown, Alexandra R; Ladner, Daniela P; Holl, Jane L
2016-05-01
Failure mode and effects analysis (FMEA) is a method of risk assessment increasingly used in healthcare over the past decade. The traditional method, however, can require substantial time and training resources. The goal of this study is to compare a simplified scoring method with the traditional scoring method to determine the degree of congruence in identifying high-risk failures. An FMEA of the operating room (OR) to intensive care unit (ICU) handoff was conducted. Failures were scored and ranked using both the traditional risk priority number (RPN) and criticality-based method, and a simplified method, which designates failures as 'high', 'medium' or 'low' risk. The degree of congruence was determined by first identifying those failures determined to be critical by the traditional method (RPN≥300), and then calculating the per cent congruence with those failures designated critical by the simplified methods (high risk). In total, 79 process failures among 37 individual steps in the OR to ICU handoff process were identified. The traditional method yielded Criticality Indices (CIs) ranging from 18 to 72 and RPNs ranging from 80 to 504. The simplified method ranked 11 failures as 'low' risk, 30 as 'medium' risk and 22 as 'high' risk. The traditional method yielded 24 failures with an RPN ≥300, of which 22 were identified as high risk by the simplified method (92% agreement). The top 20% of CI (≥60) included 12 failures, of which six were designated as high risk by the simplified method (50% agreement). These results suggest that the simplified method of scoring and ranking failures identified by an FMEA can be a useful tool for healthcare organisations with limited access to FMEA expertise. However, the simplified method does not result in the same degree of discrimination in the ranking of failures offered by the traditional method. Published by the BMJ Publishing Group Limited. 
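The traditional ranking described above rests on the risk priority number, RPN = severity × occurrence × detection, with RPN ≥ 300 taken as critical; the per cent congruence is then the share of those critical failures that the simplified method also labels 'high'. A minimal sketch with hypothetical failure-mode scores (the study's actual data are not reproduced here):

```python
def rpn(severity, occurrence, detection):
    """Traditional FMEA risk priority number (each factor scored 1-10)."""
    return severity * occurrence * detection

# hypothetical failure modes: (severity, occurrence, detection, simplified rating)
failures = [
    (9, 7, 8, "high"),     # RPN 504, the study's reported maximum
    (8, 7, 6, "high"),     # RPN 336
    (10, 6, 5, "medium"),  # RPN 300: critical by RPN, missed by the simplified method
    (4, 4, 5, "low"),      # RPN 80, the study's reported minimum
]

critical = [f for f in failures if rpn(*f[:3]) >= 300]
agreeing = [f for f in critical if f[3] == "high"]
print(f"congruence: {100 * len(agreeing) / len(critical):.0f}%")  # prints "congruence: 67%"
```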
Hu, Xuan; Fan, Mingwan; Rong, Wensheng; Lo, Edward C M; Bronkhorst, Ewald; Frencken, Jo E
2014-08-01
The aim of this study was to test the hypothesis that the colour photograph method has a higher level of validity for assessing sealant retention than the visual clinical examination and replica methods. Sealed molars were assessed by two evaluators. The scores for the three methods were compared against consensus scores derived through assessing retention from scanning electron microscopy images (reference standard). The presence/absence (survival) of retained sealants on occlusal surfaces was determined according to the traditional and modified categorizations of retention. Sensitivity, specificity, and Youden-index scores were calculated. Sealant retention assessment scores for visual clinical examinations and for colour photographs were compared with those of the reference standard on 95 surfaces, and sealant retention assessment scores for replicas were compared with those of the reference standard on 33 surfaces. The highest mean Youden-index score for the presence/absence of sealant material was observed for the colour photograph method, followed by that for the replica method; the visual clinical examination method scored lowest. The mean Youden-index score for the survival of retained sealants was highest for the colour photograph method for both the traditional (0.882) and the modified (0.768) categories of sealant retention, whilst the visual clinical examination method had the lowest Youden-index score for these categories (0.745 and 0.063, respectively). The colour photograph method had a higher validity than the replica and the visual examination methods for assessing sealant retention. © 2014 Eur J Oral Sci.
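The Youden index used throughout this comparison is simply J = sensitivity + specificity − 1, ranging from 0 (uninformative) to 1 (perfect). The sensitivity and specificity values below are illustrative assumptions, since the abstract reports Youden scores directly:

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1

# illustrative rates only; not the study's measured values
photo = youden_index(0.95, 0.93)
visual = youden_index(0.90, 0.85)
print(f"colour photograph J = {photo:.2f}, visual examination J = {visual:.2f}")
```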
Correlation of psychomotor skills and didactic performance among dental students in Saudi Arabia
Afify, Ahmed R; Zawawi, Khalid H; Othman, Hisham I; Al-Dharrab, Ayman A
2013-01-01
Objectives The objective of this study is to investigate the correlation between the psychomotor skills and the academic performance of dental students. Methods Didactic and preclinical scores were collected for students who graduated from the Faculty of Dentistry, King Abdulaziz University, Jeddah, Saudi Arabia, in 2011. Three courses (Dental Anatomy, Removable Prosthodontic Denture, and Orthodontics) were selected. Correlations comparing didactic and practical scores were done for the total samples, then for the males and females separately. Results There was no significant correlation between the practical and didactic scores for the three courses for the total sample. There was a significant correlation between all three subjects in the didactic scores. For females, the results showed that there was only a significant correlation between the practical and didactic scores for Dental Anatomy. For males, no correlation was observed between the practical and didactic scores for all subjects. Conclusion In the present sample, didactic performance did not correlate well with the students’ psychomotor performance. PMID:24159266
Corpus-based Statistical Screening for Phrase Identification
Kim, Won; Wilbur, W. John
2000-01-01
Purpose: The authors study the extraction of useful phrases from a natural language database by statistical methods. The aim is to leverage human effort by providing preprocessed phrase lists with a high percentage of useful material. Method: The approach is to develop six different scoring methods that are based on different aspects of phrase occurrence. The emphasis here is not on lexical information or syntactic structure but rather on the statistical properties of word pairs and triples that can be obtained from a large database. Measurements: The Unified Medical Language System (UMLS) incorporates a large list of humanly acceptable phrases in the medical field as a part of its structure. The authors use this list of phrases as a gold standard for validating their methods. A good method is one that ranks the UMLS phrases high among all phrases studied. Measurements are 11-point average precision values and precision-recall curves based on the rankings. Result: The authors find that each of six different scoring methods proves effective in identifying UMLS-quality phrases in a large subset of MEDLINE. These methods are applicable both to word pairs and word triples. All six methods are optimally combined to produce composite scoring methods that are more effective than any single method. The quality of the composite methods appears sufficient to support the automatic placement of hyperlinks in text at the site of highly ranked phrases. Conclusion: Statistical scoring methods provide a promising approach to the extraction of useful phrases from a natural language database for the purpose of indexing or providing hyperlinks in text. PMID:10984469
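The abstract does not detail the six scoring methods, so as a stand-in, here is one standard statistic of word-pair occurrence: pointwise mutual information (PMI), weighted by pair frequency so that recurring pairs outrank one-off rare pairs:

```python
import math
from collections import Counter

def weighted_pmi_scores(tokens):
    """Score adjacent word pairs by frequency-weighted pointwise mutual
    information: count(pair) * log2(P(pair) / (P(w1) * P(w2)))."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n, n2 = len(tokens), len(tokens) - 1
    return {
        pair: c * math.log2((c / n2) / ((unigrams[pair[0]] / n) * (unigrams[pair[1]] / n)))
        for pair, c in bigrams.items()
    }

text = ("myocardial infarction risk rises with age "
        "myocardial infarction treatment differs by age").split()
scores = weighted_pmi_scores(text)
top_pair = max(scores, key=scores.get)   # the recurring pair ranks first
print(top_pair, round(scores[top_pair], 2))
```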
Improving predicted protein loop structure ranking using a Pareto-optimality consensus method
2010-01-01
Background Accurate protein loop structure models are important to understand the functions of many proteins. Identifying native or near-native models by distinguishing them from misfolded ones is a critical step in protein loop structure prediction. Results We have developed a Pareto Optimal Consensus (POC) method, a consensus model-ranking approach that integrates multiple knowledge- or physics-based scoring functions. The procedure for identifying the models of best quality in a model set includes: 1) identifying the models at the Pareto optimal front with respect to a set of scoring functions, and 2) ranking them based on the fuzzy dominance relationship to the rest of the models. We apply the POC method to a large number of decoy sets for loops 4 to 12 residues in length, using a functional space composed of several carefully selected scoring functions: Rosetta, DOPE, DDFIRE, OPLS-AA, and a triplet backbone dihedral potential developed in our lab. Our computational results show that the sets of Pareto-optimal decoys, which are typically composed of ~20% or less of the overall decoys in a set, have a good coverage of the best or near-best decoys in more than 99% of the loop targets. Compared to the individual scoring function yielding the best selection accuracy in the decoy sets, the POC method yields 23%, 37%, and 64% fewer false positives in distinguishing the native conformation, identifying a near-native model (RMSD < 0.5 Å from the native) as top-ranked, and selecting at least one near-native model in the top-5-ranked models, respectively. Similar effectiveness of the POC method is also found in the decoy sets from membrane protein loops. Furthermore, the POC method outperforms other popularly used consensus strategies in model ranking, such as rank-by-number, rank-by-rank, rank-by-vote, and regression-based methods.
Conclusions By integrating multiple knowledge- and physics-based scoring functions based on Pareto optimality and fuzzy dominance, the POC method is effective in distinguishing the best loop models from the other ones within a loop model set. PMID:20642859
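A minimal sketch of the Pareto-front step described above, assuming lower scores are better; the score vectors are illustrative, not the paper's scoring functions:

```python
def pareto_front(scores):
    """Indices of score vectors not dominated by any other (lower is
    better on every scoring function). Dominance: j beats i if j is <=
    i on all functions and strictly < on at least one."""
    def dominates(a, b):
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s)
                       for j, t in enumerate(scores) if j != i)]

# Four decoys scored by two functions; decoy 2 is dominated by all others.
decoys = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0), (1.5, 1.5)]
front = pareto_front(decoys)
```

The POC method then ranks the front members by fuzzy dominance against the remaining models; that second step is omitted here.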
Confronting Compassion Fatigue: Assessment and Intervention in Inpatient Oncology.
Zajac, Lisa M; Moran, Katherine J; Groh, Carla J
2017-08-01
A notable variation among patient satisfaction scores with nursing care was identified. Contributing factors were examined and revealed significant negative correlations between the unit death rate and surviving patients' satisfaction scores. Compassion fatigue (CF) was hypothesized to be a major contributing factor. The objective was to address CF in RNs and oncology care associates (assistive personnel) by developing an intervention to provide bereavement support to staff after patient deaths. A mixed-methods sequential design was used. Instruments included the Professional Quality of Life scale and Press Ganey survey results. Univariate descriptive statistics, frequencies, an independent t test, and an analysis of covariance were used for data analysis. The preintervention results revealed average compassion satisfaction and secondary traumatic stress scores and low burnout scores. No significant difference was noted between pre- and postintervention CF scores. Patients' perception of nurses' skills improved significantly in the second quarter of 2015.
History by history statistical estimators in the BEAM code system.
Walters, B R B; Kawrakow, I; Rogers, D W O
2002-12-01
A history by history method for estimating uncertainties has been implemented in the BEAMnrc and DOSXYZnrc codes, replacing the method of statistical batches. This method groups scored quantities (e.g., dose) by primary history. When phase-space sources are used, this method groups incident particles according to the primary histories that generated them. This necessitated adding markers (negative energy) to phase-space files to indicate the first particle generated by a new primary history. The new method greatly reduces the uncertainty in the uncertainty estimate. It also eliminates one dimension (which kept the results for each batch) from all scoring arrays, halving the memory requirement. Correlations between particles in phase-space sources are taken into account. The only correlations with any significant impact on uncertainty are those introduced by particle recycling. Failure to account for these correlations can result in a significant underestimate of the uncertainty. The previous method of accounting for correlations due to recycling, by placing all recycled particles in the same batch, did work. Neither the new method nor the batch method takes into account correlations between incident particles when a phase-space source is restarted, so one must avoid restarts.
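The estimator described above can be sketched as follows. This is a simplified illustration of grouping scored quantities by primary history and computing the standard error of the mean, not the BEAMnrc implementation itself:

```python
import math
from collections import defaultdict

def history_by_history_uncertainty(events):
    """Estimate a scored quantity and its uncertainty by grouping
    deposits by primary-history ID (a sketch of the batch-free
    estimator). Pooling all deposits from one primary history keeps
    correlated contributions together.

    events: iterable of (history_id, dose) pairs.
    Returns (mean dose per history, standard error of that mean)."""
    per_history = defaultdict(float)
    for hid, dose in events:
        per_history[hid] += dose
    x = list(per_history.values())
    n = len(x)
    mean = sum(x) / n
    # s^2 = (1/(N-1)) * (<x^2> - <x>^2), the uncertainty of the mean
    var_of_mean = (sum(v * v for v in x) / n - mean * mean) / (n - 1)
    return mean, math.sqrt(max(var_of_mean, 0.0))
```

Because each primary history contributes one pooled value, no per-batch dimension is needed in the scoring arrays.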
NASA Astrophysics Data System (ADS)
Onken, Reiner
2017-11-01
A relocatable ocean prediction system (ROPS) was applied to an observational data set collected in June 2014 in the waters to the west of Sardinia (western Mediterranean) in the framework of the REP14-MED experiment. The observational data, comprising more than 6000 temperature and salinity profiles from a fleet of underwater gliders and shipborne probes, were assimilated in the Regional Ocean Modeling System (ROMS), which is the heart of ROPS, and verified against independent observations from ScanFish tows by means of the forecast skill score as defined by Murphy (1993). A simplified objective analysis (OA) method was utilised for assimilation, taking account of only those profiles which were located within a predetermined time window W. As a result of a sensitivity study, the highest skill score was obtained for a correlation length scale C = 12.5 km, W = 24 h, and r = 1, where r is the ratio between the error of the observations and the background error, both for temperature and salinity. Additional ROPS runs showed that (i) the skill score of assimilation runs was mostly higher than the score of a control run without assimilation, (ii) the skill score increased with increasing forecast range, and (iii) the skill score for temperature was higher than the score for salinity in the majority of cases. Furthermore, it is demonstrated that the vast number of observations can be managed by the applied OA method without data reduction, enabling timely operational forecasts even on a commercially available personal computer or a laptop.
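Murphy-style skill scores of the kind used for verification above are defined relative to a reference forecast. A minimal MSE-based sketch (the study's exact formulation is not reproduced here):

```python
def skill_score(forecast, observed, reference):
    """MSE-based skill score in the spirit of Murphy (1993):
    SS = 1 - MSE(forecast) / MSE(reference).
    1 = perfect forecast; 0 = no better than the reference
    (e.g. a control run without assimilation); negative = worse."""
    def mse(pred):
        return sum((p - o) ** 2
                   for p, o in zip(pred, observed)) / len(observed)
    return 1.0 - mse(forecast) / mse(reference)
```

With this convention, "the skill score of assimilation runs was higher than the score of a control run" means the assimilating forecast reduced mean-square error relative to the same reference.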
2012-01-01
Background Chronic wounds affect millions of people and cost billions of dollars in the United States each year. These wounds harbor polymicrobial biofilm communities, which can be difficult to elucidate using culturing methods. Clinical molecular microbiological methods are increasingly being employed to investigate the microbiota of chronic infections, including wounds, as part of standard patient care. However, molecular testing is more sensitive than culturing, so markedly different results are reported to clinicians. This study compares the results of aerobic culturing and molecular testing (culture-free 16S ribosomal DNA sequencing), and it examines the relative abundance score that is generated by the molecular test and the usefulness of the relative abundance score in predicting the likelihood that the same organism would be detected by culture. Methods Parallel samples from 51 chronic wounds were studied using aerobic culturing and 16S DNA sequencing for the identification of bacteria. Results One hundred forty-five (145) unique genera were identified using molecular methods, and 68 of these genera were aerotolerant. Fourteen (14) unique genera were identified using aerobic culture methods. One-third (31/92) of the cultures were determined to be < 1% of the relative abundance of the wound microbiota using molecular testing. At the genus level, molecular testing identified 85% (78/92) of the bacteria that were identified by culture. Conversely, culturing detected 15.7% (78/497) of the aerotolerant bacteria and detected 54.9% of the collective aerotolerant relative abundance of the samples. Aerotolerant bacterial genera (and individual species including Staphylococcus aureus, Pseudomonas aeruginosa, and Enterococcus faecalis) with higher relative abundance scores were more likely to be detected by culture as demonstrated with regression modeling. Conclusion Discordance between molecular and culture testing is often observed.
However, culture-free 16S ribosomal DNA sequencing and its relative abundance score can provide clinicians with insight into which bacteria are most abundant in a sample and which are most likely to be detected by culture. PMID:23176603
Feldacker, Caryl; Chicumbe, Sergio; Dgedge, Martinho; Augusto, Gerito; Cesar, Freide; Robertson, Molly; Mbofana, Francisco; O'Malley, Gabrielle
2014-01-01
Introduction Mozambique suffers from a critical shortage of healthcare workers. Mid-level healthcare workers (Tecnicos de Medicina Geral (TMG)) in Mozambique require less money and time to train than physicians. From 2009–2010, the Mozambique Ministry of Health (MoH) and the International Training and Education Center for Health (I-TECH), University of Washington, Seattle, revised the TMG curriculum. To evaluate the effect of the curriculum revision, we used mixed methods to determine: 1) whether TMGs meet the MoH's basic standards of clinical competency; and 2) whether scores on measurements of clinical knowledge, physical exam, and clinical case scenarios differ by curriculum. Methods T-tests of differences in means examined differences in continuous score variables between curriculum groups. Univariate and multivariate linear regression models assessed curriculum-related and demographic factors associated with assessment scores on each of the three evaluation methods at the p<0.05 level. Qualitative interviews and focus groups informed interpretation. Results We found no significant differences in sex, marital status, and age between the 112 and 189 TMGs in the initial and revised curricula, respectively. Mean scores at graduation of initial-curriculum TMGs were 56.7%, 63.5%, and 49.1% on the clinical cases, knowledge test, and physical exam, respectively. Scores did not differ significantly from TMGs in the revised curriculum. Linear regression models found that training institute was the most significant predictor of TMG scores on both the clinical cases and the physical exam. Conclusion TMGs trained in either curriculum may be inadequately prepared to provide quality care. Curriculum changes are a necessary, but insufficient, part of improving TMG knowledge and skills overall.
A more comprehensive, multi-level approach to improving TMG training that includes post-graduation mentoring, strengthening the pre-service internship training, and greater resources for training institute faculty may result in improvements in TMG capacity and patient care over time. PMID:25068590
Nijhuis-van der Sanden, Maria W G; Driehuis, Femke; Heerkens, Yvonne F; van der Vleuten, Cees P M; van der Wees, Philip J
2017-01-01
Objectives To evaluate the feasibility of a quality improvement programme aimed at enhancing the client-centeredness, effectiveness and transparency of physiotherapy services by addressing three feasibility domains: (1) acceptability of the programme design, (2) appropriateness of the implementation strategy and (3) impact on quality improvement. Design Mixed methods study. Participants and setting 64 physiotherapists working in primary care, organised in a network of communities of practice in the Netherlands. Methods The programme contained: (1) two cycles of online self-assessment and peer assessment (PA) of clinical performance using client records and video-recordings of client communication, followed by face-to-face group discussions, and (2) a clinical audit assessing organisational performance. Assessment was based on predefined performance indicators scored on a 5-point Likert scale. Discussions addressed performance standards and scoring differences. All feasibility domains were evaluated qualitatively with two focus groups and 10 in-depth interviews. In addition, we evaluated the impact on quality improvement quantitatively by comparing self-assessment and PA scores in cycles 1 and 2. Results We identified critical success features relevant to programme development and implementation, such as clarifying expectations at baseline, training in PA skills, prolonged engagement with video-assessment and competent group coaches. Self-reported impact on quality improvement included awareness of clinical and organisational performance, improved evidence-based practice and client-centeredness, and increased motivation to self-direct quality improvement. Differences between self-scores and peer scores on performance indicators were not significant. Between cycles 1 and 2, scores for record keeping improved significantly, but scores for client communication did not.
Conclusions This study demonstrated that bottom-up initiatives to improve healthcare quality can be effective. The results justify ongoing evaluation to inform nationwide implementation when the critical success features are addressed. Further research is necessary to explore the sustainability of the results and the impact on client outcomes in a full-scale study. PMID:28188156
Mclean, Scott; Salmon, Paul M; Gorman, Adam D; Stevens, Nicholas J; Solomon, Colin
2018-02-01
In the current study, social network analysis (SNA) and notational analysis (NA) methods were applied to examine the goal scoring passing networks (GSPN) for all goals scored at the 2016 European Football Championships. The aim of the study was to determine the GSPN characteristics for the overall tournament, between the group and knockout stages, and for the successful and unsuccessful teams. The study also used degree centrality (DC) metrics as a novel method to determine the relative contributions of the pitch locations involved in the GSPN. To determine changes in GSPN characteristics as a function of changing score line, the analysis considered the match status of the game when goals were scored. There were significant differences for SNA metrics as a function of match status, and for the DC metrics in the comparison of the different pitch locations. There were no differences in the SNA metrics for the GSPN between teams in the group and knockout stages, or between the successful and unsuccessful teams. The results indicate that the GSPN had low values for network density, cohesion, connections, and duration. The networks were direct in terms of pitch zones utilised, where 85% of the GSPN included passes that were played within zones or progressed through the zones towards the goal. SNA and NA metrics were significantly different as a function of changing match status. The current study adds to the previous research on goal scoring in football, and demonstrates a novel method to determine the prominent pitch zones involved in the GSPN. These results have implications for match analysis and the coaching process. Copyright © 2017 Elsevier B.V. All rights reserved.
Zeng, Rui; Xiang, Lian-rui; Zeng, Jing; Zuo, Chuan
2017-01-01
Background We aimed to introduce team-based learning (TBL) as one of the teaching methods for diagnostics and to compare its teaching effectiveness with that of the traditional teaching methods. Methods We conducted a randomized controlled trial on diagnostics teaching involving 111 third-year medical undergraduates, using TBL as the experimental intervention, compared with lecture-based learning as the control, for teaching the two topics of symptomatology. Individual Readiness Assurance Test (IRAT)-baseline and Group Readiness Assurance Test (GRAT) were performed in members of each TBL subgroup. The scores in Individual Terminal Test 1 (ITT1) immediately after class and Individual Terminal Test 2 (ITT2) 1 week later were compared between the two groups. A questionnaire and interviews were also implemented to survey the attitudes of students and teachers toward TBL. Results There was no significant difference between the two groups in ITT1 (19.85±4.20 vs 19.70±4.61), while the score of the TBL group was significantly higher than that of the control group in ITT2 (19.15±3.93 vs 17.46±4.65). In the TBL group, the scores of the two terminal tests after the teaching intervention were significantly higher than the baseline test score of individuals. IRAT-baseline, ITT1, and ITT2 scores of students at different academic levels in the TBL teaching exhibited significant differences, but the ITT1-IRAT-baseline and ITT2-IRAT-baseline differences indicated no significant differences among the three subgroups. Conclusion Our TBL approach in symptomatology was highly accepted by students, improved interest and self-directed learning, and resulted in an increase in knowledge acquisition, which significantly improved short-term test scores compared with lecture-based learning. TBL is regarded as an effective teaching method worth promoting. PMID:28331383
ERIC Educational Resources Information Center
Dong, Nianbo; Lipsey, Mark
2014-01-01
When randomized control trials (RCT) are not feasible, researchers seek other methods to make causal inference, e.g., propensity score methods. One of the underlying assumptions required for propensity score methods to obtain unbiased treatment effect estimates is the ignorability assumption, that is, conditional on the propensity score, treatment…
A comparison of nutrient density scores for 100% fruit juices.
Rampersaud, G C
2007-05-01
The 2005 Dietary Guidelines for Americans recommend that consumers choose a variety of nutrient-dense foods. Nutrient density is usually defined as the quantity of nutrients per calorie. Food and nutrition professionals should be aware of the concept of nutrient density, how it might be quantified, and its potential application in food labeling and dietary guidance. This article presents the concept of a nutrient density score and compares nutrient density scores for various 100% fruit juices. One hundred percent fruit juices are popular beverages in the United States, and although they can provide concentrated sources of a variety of nutrients, they can differ considerably in their nutrient profiles. Six methodologies were used to quantify nutrient density, and seven 100% fruit juices were included in the analysis: apple, grape, pink grapefruit, white grapefruit, orange, pineapple, and prune. Food composition data were obtained from the USDA National Nutrient Database for Standard Reference, Release 18. Application of the methods resulted in nutrient density scores with a range of values and magnitudes. The relative scores indicated that citrus juices, particularly pink grapefruit and orange juice, were more nutrient dense compared to the other nonfortified 100% juices included in the analysis. Although the methods differed, the relative ranking of the juices based on nutrient density score was similar for each method. Issues to be addressed regarding the development and application of a nutrient density score include those related to food fortification, nutrient bioavailability, and consumer education and behavior.
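A hedged sketch of a nutrients-per-calorie score of the general kind compared above; the formula and the daily values below are illustrative assumptions, not any of the six published methodologies:

```python
def nutrient_density_score(nutrients, kcal, daily_values):
    """Naive nutrients-per-calorie score: summed percent-daily-value
    contributions (capped at 100% per nutrient) per 100 kcal.
    A generic sketch only; real scoring systems differ in nutrient
    selection, capping, and the treatment of nutrients to limit."""
    pct_dv = sum(min(nutrients.get(n, 0.0) / dv, 1.0) * 100.0
                 for n, dv in daily_values.items())
    return pct_dv / (kcal / 100.0)

# Hypothetical reference daily values and per-serving juice profiles.
dv = {"vitamin_c_mg": 90.0, "potassium_mg": 4700.0}
orange = nutrient_density_score(
    {"vitamin_c_mg": 90.0, "potassium_mg": 450.0}, 110.0, dv)
apple = nutrient_density_score(
    {"vitamin_c_mg": 2.0, "potassium_mg": 250.0}, 114.0, dv)
```

Even this toy score reproduces the qualitative finding above: the citrus juice scores as more nutrient dense than the apple juice at similar calories.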
Assessing Hourly Precipitation Forecast Skill with the Fractions Skill Score
NASA Astrophysics Data System (ADS)
Zhao, Bin; Zhang, Bo
2018-02-01
Statistical methods for categorical (yes/no) forecasts, such as the Threat Score, are typically used in the verification of precipitation forecasts. However, these standard methods are affected by the so-called "double-penalty" problem caused by slight displacements in either space or time with respect to the observations. Spatial techniques have recently been developed to help solve this problem. The fractions skill score (FSS), a neighborhood spatial verification method, directly compares the fractional coverage of events in windows surrounding the observations and forecasts. We applied the FSS to hourly precipitation verification by taking hourly forecast products from the GRAPES (Global/Regional Assimilation and Prediction System) regional model and quantitative precipitation estimation products from the National Meteorological Information Center of China during July and August 2016, and investigated the difference between these results and those obtained with the traditional categorical score. We found that the model spin-up period affected the assessment of stability. Systematic errors played an insignificant role in the fraction Brier score and could be ignored. The dispersion of observations followed a diurnal cycle, and the standard deviation of the forecast had a similar pattern to the reference maximum of the fraction Brier score. The correlation coefficient between the forecasts and the observations behaved similarly to the FSS; that is, the FSS may be a useful index of correlation. Compared with the traditional skill score, the FSS has obvious advantages in distinguishing differences in precipitation time series, especially in the assessment of heavy rainfall.
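The FSS computation described above can be sketched as follows; this is a simplified illustration on a small 2D grid (the threshold and neighbourhood handling are assumptions, not the operational verification code):

```python
def fractions_skill_score(fcst, obs, threshold, n):
    """Fractions skill score on a 2D grid: binarize both fields at
    `threshold`, compute the event fraction in a (2n+1)x(2n+1)
    neighbourhood around each cell, then
    FSS = 1 - FBS / FBS_worst, where FBS is the fraction Brier score."""
    rows, cols = len(fcst), len(fcst[0])
    def fraction(field, i, j):
        cells = [(ii, jj)
                 for ii in range(max(0, i - n), min(rows, i + n + 1))
                 for jj in range(max(0, j - n), min(cols, j + n + 1))]
        return sum(field[ii][jj] >= threshold
                   for ii, jj in cells) / len(cells)
    pf = [fraction(fcst, i, j) for i in range(rows) for j in range(cols)]
    po = [fraction(obs, i, j) for i in range(rows) for j in range(cols)]
    fbs = sum((a - b) ** 2 for a, b in zip(pf, po))
    worst = sum(a * a for a in pf) + sum(b * b for b in po)
    return 1.0 - fbs / worst if worst else 1.0
```

A forecast cell displaced by one grid point scores 0 under point-wise comparison (the double penalty) but recovers full skill once the neighbourhood covers the displacement, which is exactly the behaviour the abstract exploits.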
ERIC Educational Resources Information Center
Wang, Ze; Rohrer, David; Chuang, Chi-ching; Fujiki, Mayo; Herman, Keith; Reinke, Wendy
2015-01-01
This study compared 5 scoring methods in terms of their statistical assumptions. They were then used to score the Teacher Observation of Classroom Adaptation Checklist, a measure consisting of 3 subscales and 21 Likert-type items. The 5 methods used were (a) sum/average scores of items, (b) latent factor scores with continuous indicators, (c)…
The Value of Mixed Methods Research: A Mixed Methods Study
ERIC Educational Resources Information Center
McKim, Courtney A.
2017-01-01
The purpose of this explanatory mixed methods study was to examine the perceived value of mixed methods research for graduate students. The quantitative phase was an experiment examining the effect of a passage's methodology on students' perceived value. Results indicated students scored the mixed methods passage as more valuable than those who…
Austin, Peter C
2018-01-01
Propensity score methods are frequently used to estimate the effects of interventions using observational data. The propensity score was originally developed for use with binary exposures. The generalized propensity score (GPS) is an extension of the propensity score for use with quantitative or continuous exposures (e.g. pack-years of cigarettes smoked, dose of medication, or years of education). We describe how the GPS can be used to estimate the effect of continuous exposures on survival or time-to-event outcomes. To do so we modified the concept of the dose-response function for use with time-to-event outcomes. We used Monte Carlo simulations to examine the performance of different methods of using the GPS to estimate the effect of quantitative exposures on survival or time-to-event outcomes. We examined covariate adjustment using the GPS and weighting using weights based on the inverse of the GPS. The use of methods based on the GPS was compared with the use of conventional G-computation and weighted G-computation. Conventional G-computation resulted in estimates of the dose-response function that displayed the lowest bias and the lowest variability. Between the two GPS-based methods, covariate adjustment using the GPS tended to have the better performance. We illustrate the application of these methods by estimating the effect of average neighbourhood income on the probability of survival following hospitalization for an acute myocardial infarction.
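A hedged sketch of GPS-based weighting for a continuous exposure, assuming a normal exposure model with a single covariate (the paper's simulations and outcome models are not reproduced here):

```python
import math
import statistics

def gps_weights(exposure, covariate):
    """Stabilized inverse-GPS weights for a continuous exposure.

    Numerator: marginal normal density of the exposure.
    Denominator (the GPS): normal density of the exposure given the
    covariate, from a one-covariate linear model fit by least squares.
    A sketch under a normality assumption, not a general estimator."""
    mx = statistics.fmean(covariate)
    me = statistics.fmean(exposure)
    beta = (sum((x - mx) * (e - me)
                for x, e in zip(covariate, exposure))
            / sum((x - mx) ** 2 for x in covariate))
    resid = [e - (me + beta * (x - mx))
             for x, e in zip(covariate, exposure)]
    s_cond = statistics.stdev(resid)     # conditional (GPS) scale
    s_marg = statistics.stdev(exposure)  # marginal scale
    def norm_pdf(z, s):
        return math.exp(-0.5 * (z / s) ** 2) / (s * math.sqrt(2 * math.pi))
    return [norm_pdf(e - me, s_marg) / norm_pdf(r, s_cond)
            for e, r in zip(exposure, resid)]
```

These weights could then enter a weighted survival model; covariate adjustment using the GPS, the better-performing alternative above, would instead include the estimated GPS as a regressor.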
Fu, Szu-Wei; Li, Pei-Chun; Lai, Ying-Hui; Yang, Cheng-Chien; Hsieh, Li-Chun; Tsao, Yu
2017-11-01
Objective: This paper focuses on machine learning based voice conversion (VC) techniques for improving the speech intelligibility of surgical patients who have had parts of their articulators removed. Because of the removal of parts of the articulator, a patient's speech may be distorted and difficult to understand. To overcome this problem, VC methods can be applied to convert the distorted speech such that it is clear and more intelligible. To design an effective VC method, two key points must be considered: 1) the amount of training data may be limited (because speaking for a long time is usually difficult for postoperative patients); 2) rapid conversion is desirable (for better communication). Methods: We propose a novel joint dictionary learning based non-negative matrix factorization (JD-NMF) algorithm. Compared to conventional VC techniques, JD-NMF can perform VC efficiently and effectively with only a small amount of training data. Results: The experimental results demonstrate that the proposed JD-NMF method not only achieves notably higher short-time objective intelligibility (STOI) scores (a standardized objective intelligibility evaluation metric) than those obtained using the original unconverted speech but is also significantly more efficient and effective than a conventional exemplar-based NMF VC method. Conclusion: The proposed JD-NMF method may outperform the state-of-the-art exemplar-based NMF VC method in terms of STOI scores under the desired scenario. Significance: We confirmed the advantages of the proposed joint training criterion for the NMF-based VC. Moreover, we verified that the proposed JD-NMF can effectively improve the speech intelligibility scores of oral surgery patients.
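The dictionary/activation factorization underlying NMF-based VC can be sketched with plain multiplicative-update NMF (the Lee-Seung form). This illustrates the factorization V ≈ WH only, not the proposed joint dictionary learning algorithm:

```python
import random

def nmf(V, r, iters=200, seed=0):
    """Plain NMF via multiplicative updates: V (m x n, nonnegative)
    is approximated by W (m x r dictionary) @ H (r x n activations).
    In NMF-based VC, V would hold spectral frames; conversion swaps
    the source dictionary for the target's while keeping H."""
    rng = random.Random(seed)
    m, n = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(r)]
    def matmul(A, B):
        return [[sum(a * b for a, b in zip(row, col))
                 for col in zip(*B)] for row in A]
    def T(A):
        return [list(c) for c in zip(*A)]
    eps = 1e-9
    for _ in range(iters):
        WH = matmul(W, H)
        num, den = matmul(T(W), V), matmul(T(W), WH)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(n)] for i in range(r)]
        WH = matmul(W, H)
        num, den = matmul(V, T(H)), matmul(WH, T(H))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(r)] for i in range(m)]
    return W, H
```

The multiplicative updates keep both factors nonnegative throughout, which is what makes the columns of W interpretable as additive spectral exemplars.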
Determination of optimal self-drive tourism route using the orienteering problem method
NASA Astrophysics Data System (ADS)
Hashim, Zakiah; Ismail, Wan Rosmanira; Ahmad, Norfaieqah
2013-04-01
This paper determines optimal travel routes for self-drive tourism based on allocations of time and expense, by maximizing the total attraction score assigned to the cities involved. Self-drive tourism is a type of tourism in which tourists hire or travel in their own vehicle; it involves only tourist destinations that can be linked by a network of roads. Normally, the traveling salesman problem (TSP) and multiple traveling salesman problem (MTSP) methods are used for minimization problems, such as determining the shortest travel time or distance. This paper takes an alternative, maximization approach: maximizing the attraction scores, tested on tourism data for ten cities in Kedah. A set of priority scores is used to set the attraction score of each city. The classical orienteering problem was used to determine the optimal travel route. This approach is extended to the team orienteering problem, and the two methods were compared. The two models were solved using LINGO 12.0 software. The results indicate that the team orienteering problem model provides a more appropriate solution than the orienteering problem model.
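A brute-force sketch of the score-maximizing objective described above, with hypothetical cities, scores, and travel times (the study solved LINGO optimization models, not this enumeration):

```python
from itertools import combinations, permutations

def best_route(scores, travel_time, budget, start, end):
    """Brute-force orienteering problem: choose which cities to visit
    between fixed start and end nodes, maximizing total attraction
    score within a travel-time budget. Exponential in the number of
    cities -- a sketch of the objective, workable at the ten-city
    scale of the study."""
    middle = [c for c in scores if c not in (start, end)]
    best_score, best_path = -1, None
    for k in range(len(middle) + 1):
        for subset in combinations(middle, k):
            for order in permutations(subset):
                route = [start, *order, end]
                t = sum(travel_time[a][b]
                        for a, b in zip(route, route[1:]))
                if t <= budget:
                    s = sum(scores[c] for c in route)
                    if s > best_score:
                        best_score, best_path = s, route
    return best_score, best_path

# Hypothetical attraction scores and symmetric travel times (hours).
sc = {"A": 0, "B": 5, "C": 8, "D": 0}
tt = {"A": {"B": 1, "C": 2, "D": 1},
      "B": {"A": 1, "C": 1, "D": 1},
      "C": {"A": 2, "B": 1, "D": 2},
      "D": {"A": 1, "B": 1, "C": 2}}
```

Unlike the TSP, skipping a city is allowed; the budget, not full coverage, constrains the route, which is why a larger time budget admits higher-scoring routes.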
Aggregate Interview Method of ranking orthopedic applicants predicts future performance.
Geissler, Jacqueline; VanHeest, Ann; Tatman, Penny; Gioe, Terence
2013-07-01
This article evaluates and describes a process of ranking orthopedic applicants using what the authors term the Aggregate Interview Method. The authors hypothesized that higher-ranking applicants under this method would perform better than those ranked lower on multiple measures of resident performance. A retrospective review of 115 orthopedic residents was performed at the authors' institution. Residents were grouped into 3 categories by matching rank numbers: 1-5, 6-14, and 15 or higher. Each rank group was compared with resident performance as measured by faculty evaluations, the Orthopaedic In-Training Examination (OITE), and American Board of Orthopaedic Surgery (ABOS) test results. Residents ranked 1-5 scored significantly better on patient care, behavior, and overall competence by faculty evaluation (P<.05). Residents ranked 1-5 scored higher on the OITE compared with those ranked 6-14 during postgraduate years 2 and 3 (P≤.05). Graduates who had been ranked 1-5 had a 100% pass rate on the ABOS part 1 examination on the first attempt. The most favorably ranked residents performed at or above the level of other residents in the program; they did not score worse on any measure. These results support the authors' method of ranking residents. The rigorous Aggregate Interview Method for ranking applicants consistently identified orthopedic resident candidates who scored highly on the Accreditation Council for Graduate Medical Education resident core competencies as measured by faculty evaluations, performed above the national average on the OITE, and passed the ABOS part 1 examination at rates exceeding the national average. Copyright 2013, SLACK Incorporated.
A Mixed QM/MM Scoring Function to Predict Protein-Ligand Binding Affinity
Hayik, Seth A.; Dunbrack, Roland; Merz, Kenneth M.
2010-01-01
Computational methods for predicting protein-ligand binding free energy continue to be popular as a potential cost-cutting method in the drug discovery process. However, accurate predictions are often difficult to make, as estimates must be made for certain electronic and entropic terms in conventional force field based scoring functions. Mixed quantum mechanics/molecular mechanics (QM/MM) methods allow electronic effects for a small region of the protein to be calculated, treating the remaining atoms as a fixed-charge background for the active site. Such a semi-empirical QM/MM scoring function has been implemented in AMBER using DivCon and tested on a set of 23 metalloprotein-ligand complexes, where QM/MM methods provide a particular advantage in the modeling of the metal ion. The binding affinity of this set of proteins can be calculated with an R2 of 0.64 and a standard deviation of 1.88 kcal/mol without fitting, and with an R2 of 0.71 and a standard deviation of 1.69 kcal/mol with fitted weighting of the individual scoring terms. In this study we explore various methods to calculate terms in the binding free energy equation, including entropy estimates and minimization standards. From these studies we found that using the rotatable-bond estimate of ligand entropy results in a reasonable R2 of 0.63 without fitting. We also found that using the ESCF energy of the proteins without minimization resulted in an R2 of 0.57 when using the rotatable-bond entropy estimate. PMID:21221417
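A minimal sketch of fitting a scoring term's weight against experimental binding affinities and reporting R², using ordinary least squares on a single illustrative term (the paper fits weights over several scoring terms simultaneously):

```python
def fit_r2(x, y):
    """Ordinary least squares y ≈ a*x + b and the resulting R².
    A one-term sketch of fitted weighting: x would be a computed
    scoring term, y the experimental binding free energies."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((u - mx) ** 2 for u in x)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    a = sxy / sxx
    b = my - a * mx
    ss_res = sum((v - (a * u + b)) ** 2 for u, v in zip(x, y))
    ss_tot = sum((v - my) ** 2 for v in y)
    return a, b, 1.0 - ss_res / ss_tot
```

Extending this to several weighted terms turns the fit into a multiple regression over the scoring components, which is the sense in which "fitted weighting" raises R² above the unfitted score.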
Škrbić, Biljana; Héberger, Károly; Durišić-Mladenović, Nataša
2013-10-01
Sum of ranking differences (SRD) was applied for comparing multianalyte results obtained by several analytical methods used in one or in different laboratories, i.e., for ranking the overall performances of the methods (or laboratories) in simultaneous determination of the same set of analytes. The data sets for testing the applicability of SRD contained the results reported during one of the proficiency tests (PTs) organized by the EU Reference Laboratory for Polycyclic Aromatic Hydrocarbons (EU-RL-PAH). In this way, SRD was also tested as a discriminant method alternative to the existing average performance scores used to compare multianalyte PT results. SRD should be used along with the z scores, the most commonly used PT performance statistic. SRD was further developed to handle tied rankings among laboratories. Two benchmark concentration series were selected as reference: (a) the assigned PAH concentrations (determined precisely beforehand by the EU-RL-PAH) and (b) the averages of all individual PAH concentrations determined by each laboratory. Ranking relative to the assigned values and also to the average (or median) values pointed to the laboratories with the most extreme results, as well as revealing groups of laboratories with similar overall performances. SRD reveals differences between methods or laboratories even if classical tests cannot. The ranking was validated using comparison of ranks by random numbers (a randomization test) and using sevenfold cross-validation, which highlighted the similarities among the (methods used in) laboratories. Principal component analysis and hierarchical cluster analysis justified the findings based on SRD ranking/grouping. If the PAH concentrations are row-scaled (i.e., z scores are analyzed as input for ranking), SRD can still be used for checking the normality of errors. Moreover, cross-validation of SRD on z scores groups the laboratories similarly. 
The SRD technique is general in nature, i.e., it can be applied to any experimental problem in which multianalyte results obtained either by several analytical procedures, analysts, instruments, or laboratories need to be compared.
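As a rough illustration of the SRD idea (not the authors' implementation), the core computation is simple: rank each laboratory's reported concentrations, rank the benchmark concentrations, and sum the absolute rank differences. All names below are hypothetical.

```python
def ranks(values):
    # Rank items by value (1 = smallest); tied values share the average rank.
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def srd(lab_values, reference_values):
    # Sum of ranking differences: how far a laboratory's ranking of the
    # analytes departs from the ranking implied by the benchmark values.
    rl, rr = ranks(lab_values), ranks(reference_values)
    return sum(abs(a - b) for a, b in zip(rl, rr))
```

A laboratory that reproduces the benchmark ordering exactly gets SRD = 0; larger values indicate poorer overall agreement.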
SADEGHI, ROYA; SEDAGHAT, MOHAMMAD MEHDI; SHA AHMADI, FARAMARZ
2014-01-01
Introduction: Blended learning, a new approach in educational planning, is defined as applying more than one method, strategy, technique or medium in education. Today, owing to the development of Internet infrastructure and the access most students have to it, the Internet can be utilized along with traditional and conventional methods of training. The aim of this study was to compare students' learning and satisfaction under a combination of lecture and e-learning versus conventional lecture methods. Methods: This quasi-experimental study was conducted among the sophomore students of the Public Health School, Tehran University of Medical Science, in 2012-2013. Four classes of the school were randomly selected and divided into two groups. Education in two classes (45 students) used the lecture method, and in the other two classes (48 students) the blended method combining e-learning and lectures. The students' knowledge about tuberculosis in the two groups was measured using pre- and post-tests, administered by sending self-reported electronic questionnaires to the students' email addresses through Google Document software. At the end of the educational programs, students' satisfaction with and comments about the two methods were also collected by questionnaire. Descriptive statistics, paired t-test, independent t-test and ANOVA were computed with SPSS 14, and p≤0.05 was considered statistically significant. Results: The mean pre-test scores of the lecture and blended groups were 13.18±1.37 and 13.35±1.36, respectively; the difference between the pre-test scores of the two groups was not statistically significant (p=0.535). Knowledge scores increased in both groups after training, and the mean and standard deviation of knowledge scores of the lecture and blended groups were 16.51±0.69 and 16.18±1.06, respectively. 
The difference between the post-test scores of the two groups was not statistically significant (p=0.112). Students' satisfaction with the blended learning method was higher than with the lecture method. Conclusion: The results revealed that the blended method is effective in increasing the students' learning rate. E-learning can be used to teach some courses and may also be economical. Since the majority of students at the country's universities of medical sciences have Internet access and email addresses, e-learning could serve as a supplement to traditional teaching methods, or sometimes as an alternative, because this method of teaching increases students' knowledge, satisfaction and attention. PMID:25512938
Rios, Anthony; Kavuluru, Ramakanth
2017-11-01
The CEGS N-GRID 2016 Shared Task in Clinical Natural Language Processing (NLP) provided a set of 1000 neuropsychiatric notes to participants as part of a competition to predict psychiatric symptom severity scores. This paper summarizes our methods, results, and experiences based on our participation in the second track of the shared task. Classical methods of text classification usually fall into one of three problem types: binary, multi-class, and multi-label classification. In this effort, we study ordinal regression problems with text data, where misclassifications are penalized differently based on how far apart the ground truth and model predictions are on the ordinal scale. Specifically, we present our entries (methods and results) in the N-GRID shared task for predicting research domain criteria (RDoC) positive valence ordinal symptom severity scores (absent, mild, moderate, and severe) from psychiatric notes. We propose a novel convolutional neural network (CNN) model designed to handle ordinal regression tasks on psychiatric notes. Broadly speaking, our model combines an ordinal loss function, a CNN, and conventional feature engineering (wide features) into a single model which is learned end-to-end. Given that interpretability is an important concern with nonlinear models, we apply a recent approach called local interpretable model-agnostic explanations (LIME) to identify important words that lead to instance-specific predictions. Our best model entered into the shared task placed third among 24 teams and scored a macro mean absolute error (MMAE) based normalized score (100·(1-MMAE)) of 83.86. Since the competition, we improved our score (using basic ensembling) to 85.55, comparable with the winning shared task entry. Applying LIME to model predictions, we demonstrate the feasibility of instance-specific prediction interpretation by identifying words that led to a particular decision. 
In this paper, we present a method that successfully uses wide features and an ordinal loss function applied to convolutional neural networks for ordinal text classification specifically in predicting psychiatric symptom severity scores. Our approach leads to excellent performance on the N-GRID shared task and is also amenable to interpretability using existing model-agnostic approaches. Copyright © 2017 Elsevier Inc. All rights reserved.
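The normalized score 100·(1-MMAE) used to rank entries can be sketched as follows. This is a minimal illustration assuming severity labels coded 0-3 and per-class errors scaled by the maximum possible error; the shared task's exact normalization may differ.

```python
def macro_mae(y_true, y_pred, classes=(0, 1, 2, 3)):
    # Macro-averaged MAE: average the per-class MAE so that rare severity
    # levels count as much as common ones. Errors are scaled by the largest
    # possible error so the result lies in [0, 1] (an assumption here).
    max_err = max(classes) - min(classes)
    per_class = []
    for c in classes:
        errs = [abs(t - p) for t, p in zip(y_true, y_pred) if t == c]
        if errs:
            per_class.append(sum(errs) / len(errs) / max_err)
    return sum(per_class) / len(per_class)

def normalized_score(y_true, y_pred):
    # The headline metric: 100 * (1 - MMAE); 100 means perfect predictions.
    return 100.0 * (1.0 - macro_mae(y_true, y_pred))
```

Because the average is taken per class, mislabeling the few "severe" notes hurts the score as much as mislabeling the many "absent" ones, which is the point of the macro variant.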
Cavalcanti, Paulo Ernando Ferraz; Sá, Michel Pompeu Barros de Oliveira; dos Santos, Cecília Andrade; Esmeraldo, Isaac Melo; Chaves, Mariana Leal; Lins, Ricardo Felipe de Albuquerque; Lima, Ricardo de Carvalho
2015-01-01
Objective To determine whether stratification of complexity models in congenital heart surgery (RACHS-1, Aristotle basic score and STS-EACTS mortality score) fit to our center and determine the best method of discriminating hospital mortality. Methods Surgical procedures in congenital heart diseases in patients under 18 years of age were allocated to the categories proposed by the stratification of complexity methods currently available. The outcome hospital mortality was calculated for each category from the three models. Statistical analysis was performed to verify whether the categories presented different mortalities. The discriminatory ability of the models was determined by calculating the area under the ROC curve and a comparison between the curves of the three models was performed. Results 360 patients were allocated according to the three methods. There was a statistically significant difference between the mortality categories: RACHS-1 (1) - 1.3%, (2) - 11.4%, (3)-27.3%, (4) - 50 %, (P<0.001); Aristotle basic score (1) - 1.1%, (2) - 12.2%, (3) - 34%, (4) - 64.7%, (P<0.001); and STS-EACTS mortality score (1) - 5.5 %, (2) - 13.6%, (3) - 18.7%, (4) - 35.8%, (P<0.001). The three models had similar accuracy by calculating the area under the ROC curve: RACHS-1- 0.738; STS-EACTS-0.739; Aristotle- 0.766. Conclusion The three models of stratification of complexity currently available in the literature are useful with different mortalities between the proposed categories with similar discriminatory capacity for hospital mortality. PMID:26107445
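The area under the ROC curve used above to compare the three risk models has a simple rank-based interpretation. A minimal sketch (not the authors' statistical software), treating each patient's risk category (1-4) as the score:

```python
def roc_auc(scores, labels):
    # AUC = probability that a randomly chosen positive case (here, an
    # in-hospital death, label 1) receives a higher risk score than a
    # randomly chosen negative case (label 0), counting ties as one half.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Computing this once per stratification model (RACHS-1, Aristotle, STS-EACTS) yields directly comparable discrimination values such as the 0.738-0.766 range reported.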
Mata, Caio Augusto Sterse; Ota, Luiz Hirotoshi; Suzuki, Iunis; Telles, Adriana; Miotto, Andre; Leão, Luiz Eduardo Vilaça
2012-01-01
This study compares the traditional live lecture to a web-based approach in the teaching of bronchoscopy and evaluates the positive and negative aspects of both methods. We developed a web-based bronchoscopy curriculum that integrates text, images and animations. It was applied to a group of first-year interns, who were later administered a multiple-choice test. Another group of eight first-year interns received the traditional teaching method and the same test. The two groups were compared using Student's t-test. The mean score (±SD) of students who used the website was 14.63±1.41 (range 13-17). The test scores of the other group had the same range, with a mean score of 14.75±1. Student's t-test showed no difference between the test results. The positive point noted by both groups was the presence of multimedia content. The web group also cited as positive the ability to review the pages, and the lecture group the role of the teacher. Web-based bronchoscopy education showed effectiveness similar to the traditional live lecture.
Reinforce: An Ensemble Approach for Inferring PPI Network from AP-MS Data.
Tian, Bo; Duan, Qiong; Zhao, Can; Teng, Ben; He, Zengyou
2017-05-17
Affinity Purification-Mass Spectrometry (AP-MS) is one of the most important technologies for constructing protein-protein interaction (PPI) networks. In this paper, we propose an ensemble method, Reinforce, for inferring PPI network from AP-MS data set. The new algorithm named Reinforce is based on rank aggregation and false discovery rate control. Under the null hypothesis that the interaction scores from different scoring methods are randomly generated, Reinforce follows three steps to integrate multiple ranking results from different algorithms or different data sets. The experimental results show that Reinforce can get more stable and accurate inference results than existing algorithms. The source codes of Reinforce and data sets used in the experiments are available at: https://sourceforge.net/projects/reinforce/.
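Reinforce's aggregation step is only summarized above; as an illustrative stand-in, a simple mean-rank aggregation over several scoring methods could look like the sketch below. The names are hypothetical and this is not the published algorithm, which additionally applies false discovery rate control under the stated null hypothesis.

```python
def aggregate_ranks(rankings):
    # rankings: one dict per scoring method, mapping a candidate
    # protein-protein interaction to its rank (1 = most confident).
    # Interactions missing from a method get that method's worst rank + 1.
    items = set().union(*rankings)

    def mean_rank(item):
        return sum(r.get(item, len(r) + 1) for r in rankings) / len(rankings)

    # Return interactions ordered by average rank across all methods.
    return sorted(items, key=mean_rank)
```

Averaging ranks rather than raw scores sidesteps the problem that different scoring methods produce values on incompatible scales.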
Measuring the Interestingness of Articles in a Limited User Environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pon, R; Cardenas, A; Buttler, David
Search engines, such as Google, assign scores to news articles based on their relevance to a query. However, not all relevant articles for the query may be interesting to a user. For example, if the article is old or yields little new information, the article would be uninteresting. Relevance scores do not take into account what makes an article interesting, which would vary from user to user. Although methods such as collaborative filtering have been shown to be effective in recommendation systems, in a limited user environment there are not enough users to make collaborative filtering effective. A general framework, called iScore, is presented for defining and measuring the "interestingness" of articles, incorporating user feedback. iScore addresses the various aspects of what makes an article interesting, such as topic relevance, uniqueness, freshness, source reputation, and writing style. It employs various methods, such as multiple topic tracking, online parameter selection, language models, clustering, sentiment analysis, and phrase extraction to measure these features. Because users differ in their reasons for finding an article interesting, an online feature selection method in naïve Bayes is also used to improve recommendation results. iScore can outperform traditional IR techniques by as much as 50.7%. iScore and its components are evaluated on the news recommendation task using three datasets from Yahoo! News, actual users, and Digg.
Xing, Chao; Elston, Robert C
2006-07-01
The multipoint lod score and mod score methods have been advocated for their superior power in detecting linkage. However, little has been done to determine the distribution of multipoint lod scores or to examine the properties of mod scores. In this paper we study the distribution of multipoint lod scores both analytically and by simulation. We also study by simulation the distribution of maximum multipoint lod scores when maximized over different penetrance models. The multipoint lod score is approximately normally distributed, with mean and variance that depend on marker informativity, marker density, the specified genetic model, the number of pedigrees, pedigree structure, and the pattern of affection status. When multipoint lod scores are maximized over a set of assumed penetrance models, an excess of false positive indications of linkage appears under dominant analysis models with low penetrances and under recessive analysis models with high penetrances. Therefore, caution should be taken in interpreting results when employing multipoint lod score and mod score approaches, in particular when inferring the level of linkage significance and the mode of inheritance of a trait.
Fat scoring: Sources of variability
Krementz, D.G.; Pendleton, G.W.
1990-01-01
Fat scoring is a widely used nondestructive method of assessing total body fat in birds. This method has not been rigorously investigated. We investigated inter- and intraobserver variability in scoring as well as the predictive ability of fat scoring using five species of passerines. Between-observer variation in scoring was variable and great at times. Observers did not consistently score species higher or lower relative to other observers nor did they always score birds with more total body fat higher. We found that within-observer variation was acceptable but was dependent on the species being scored. The precision of fat scoring was species-specific and for most species, fat scores accounted for less than 50% of the variation in true total body fat. Overall, we would describe fat scoring as a fairly precise method of indexing total body fat but with limited reliability among observers.
ROCS: a Reproducibility Index and Confidence Score for Interaction Proteomics Studies
2012-01-01
Background Affinity-Purification Mass-Spectrometry (AP-MS) provides a powerful means of identifying protein complexes and interactions. Several important challenges exist in interpreting the results of AP-MS experiments. First, the reproducibility of AP-MS experimental replicates can be low, due both to technical variability and the dynamic nature of protein interactions in the cell. Second, the identification of true protein-protein interactions in AP-MS experiments is subject to inaccuracy due to high false negative and false positive rates. Several experimental approaches can be used to mitigate these drawbacks, including the use of replicated and control experiments and relative quantification to sensitively distinguish true interacting proteins from false ones. Methods To address the issues of reproducibility and accuracy of protein-protein interactions, we introduce a two-step method, called ROCS, which makes use of Indicator Prey Proteins to select reproducible AP-MS experiments, and of Confidence Scores to select specific protein-protein interactions. The Indicator Prey Proteins account for measures of protein identifiability as well as protein reproducibility, effectively allowing removal of outlier experiments that contribute noise and affect downstream inferences. The filtered set of experiments is then used in the Protein-Protein Interaction (PPI) scoring step. Prey protein scoring is done by computing a Confidence Score, which accounts for the probability of occurrence of prey proteins in the bait experiments relative to the control experiment, where the significance cutoff parameter is estimated by simultaneously controlling false positives and false negatives against metrics of false discovery rate and biological coherence, respectively. In summary, the ROCS method relies on automatic, objective criteria for parameter estimation and on error-controlled procedures. 
Results We illustrate the performance of our method by applying it to five previously published AP-MS experiments, each containing well characterized protein interactions, allowing for systematic benchmarking of ROCS. We show that our method may be used on its own to make accurate identification of specific, biologically relevant protein-protein interactions, or in combination with other AP-MS scoring methods to significantly improve inferences. Conclusions Our method addresses important issues encountered in AP-MS datasets, making ROCS a very promising tool for this purpose, either on its own or in conjunction with other methods. We anticipate that our methodology may be used more generally in proteomics studies and databases, where experimental reproducibility issues arise. The method is implemented in the R language, and is available as an R package called “ROCS”, freely available from the CRAN repository http://cran.r-project.org/. PMID:22682516
Credit scoring analysis using weighted k nearest neighbor
NASA Astrophysics Data System (ADS)
Mukid, M. A.; Widiharih, T.; Rusgiyono, A.; Prahutama, A.
2018-05-01
Credit scoring is a quantitative method to evaluate the credit risk of loan applications. Both statistical methods and artificial intelligence are often used by credit analysts to help them decide whether applicants are worthy of credit. These methods aim to predict future behavior in terms of credit risk based on past experience of customers with similar characteristics. This paper reviews the weighted k nearest neighbor (WKNN) method for credit assessment, considering the use of several kernels. We use credit data from a private bank in Indonesia. The results show that the Gaussian kernel and the rectangular kernel perform best, each correctly classifying 82.4% of cases.
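A minimal sketch of weighted k-nearest-neighbour classification with a Gaussian kernel, of the kind evaluated in the paper. This is illustrative only; the study's feature preprocessing and kernel bandwidths are not reproduced, and all names are hypothetical.

```python
import math

def gaussian_kernel(d):
    # Weight decays smoothly with distance; bandwidth fixed at 1 here.
    return math.exp(-0.5 * d * d)

def wknn_predict(x, train_X, train_y, k=3, kernel=gaussian_kernel):
    # Weighted k-nearest-neighbour vote: the k closest training applicants
    # vote for their class, closer neighbours carrying larger kernel weights.
    nearest = sorted(
        (math.dist(x, xi), yi) for xi, yi in zip(train_X, train_y)
    )[:k]
    votes = {}
    for d, y in nearest:
        votes[y] = votes.get(y, 0.0) + kernel(d)
    return max(votes, key=votes.get)
```

With a rectangular kernel (weight 1 inside the neighbourhood, 0 outside), the same function reduces to an ordinary unweighted kNN vote, which is why the paper can compare kernels within one framework.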
Deep convolutional neural networks as strong gravitational lens detectors
NASA Astrophysics Data System (ADS)
Schaefer, C.; Geiger, M.; Kuntzer, T.; Kneib, J.-P.
2018-03-01
Context. Future large-scale surveys with high-resolution imaging will provide us with approximately 10^5 new strong galaxy-scale lenses. These strong-lensing systems will be contained in large data volumes, however, which are beyond the capacity of human experts to classify visually in an unbiased way. Aims. We present a new strong gravitational lens finder based on convolutional neural networks (CNNs). The method was applied to the strong-lensing challenge organized by the Bologna Lens Factory. It achieved first and third place, respectively, on the space-based data set and the ground-based data set. The goal was to find a fully automated lens finder for ground-based and space-based surveys that minimizes human inspection. Methods: We compared the results of our CNN architecture and three new variations ("invariant", "views", and "residual") on the simulated data of the challenge. Each method was trained separately five times on 17 000 simulated images, cross-validated using 3000 images, and then applied to a test set with 100 000 images. We used two different metrics for evaluation, the area under the receiver operating characteristic curve (AUC) score and the recall with no false positive (Recall0FP). Results: For ground-based data, our best method achieved an AUC score of 0.977 and a Recall0FP of 0.50. For space-based data, our best method achieved an AUC score of 0.940 and a Recall0FP of 0.32. Adding dihedral invariance to the CNN architecture diminished the overall score on space-based data but achieved a higher no-contamination recall. We found that using committees of five CNNs produced the best recall at zero contamination and consistently scored better AUC than a single CNN. Conclusions: We found that for every variation of our CNN lens finder, we achieved AUC scores close to 1 within 6%. A deeper network did not outperform simpler CNN models either. This indicates that more complex networks are not needed to model the simulated lenses. 
To verify this, more realistic lens simulations with more lens-like structures (spiral galaxies or ring galaxies) are needed to compare the performance of deeper and shallower networks.
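The committee-of-CNNs result amounts to averaging the member networks' output probabilities before any thresholding. A trivial, hypothetical sketch, assuming each member outputs one lens probability per image:

```python
def committee_score(member_scores):
    # member_scores: one list of lens probabilities per CNN in the
    # committee, all evaluated over the same images in the same order.
    # The committee output is the plain average of the members'
    # probabilities for each image.
    return [sum(col) / len(col) for col in zip(*member_scores)]
```

Averaging tends to cancel individual networks' uncorrelated errors, which is consistent with the five-CNN committees scoring better AUC than any single network in the study.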
Document Level Assessment of Document Retrieval Systems in a Pairwise System Evaluation
ERIC Educational Resources Information Center
Rajagopal, Prabha; Ravana, Sri Devi
2017-01-01
Introduction: The use of averaged topic-level scores can result in the loss of valuable data and can cause misinterpretation of the effectiveness of system performance. This study aims to use the scores of each document to evaluate document retrieval systems in a pairwise system evaluation. Method: The chosen evaluation metrics are document-level…
From Gain Score t to ANCOVA F (and Vice Versa)
ERIC Educational Resources Information Center
Knapp, Thomas R.; Schafer, William D.
2009-01-01
Although they test somewhat different hypotheses, analysis of gain scores (or its repeated-measures analog) and analysis of covariance are both common methods that researchers use for pre-post data. The results of the two approaches yield non-comparable outcomes, but since the same generic data are used, it is possible to transform the test…
An Evaluation of Depressed Mood in Two Classes of Medical Students
ERIC Educational Resources Information Center
Levine, Ruth E.; Litwins, Stephanie D.; Frye, Ann W.
2006-01-01
Objective: To assess depression rates in contemporary medical students. Method: The Beck Depression Inventory (BDI) was administered anonymously to two medical school classes at matriculation, the end of first year, and the end of second year. Results: Median scores for both classes were low at all points. The proportion of students scoring in the…
ERIC Educational Resources Information Center
van Ginkel, Joost R.; van der Ark, L. Andries; Sijtsma, Klaas
2007-01-01
The performance of five simple multiple imputation methods for dealing with missing data were compared. In addition, random imputation and multivariate normal imputation were used as lower and upper benchmark, respectively. Test data were simulated and item scores were deleted such that they were either missing completely at random, missing at…
Assessment of first-year post-graduate residents: usefulness of multiple tools.
Yang, Ying-Ying; Lee, Fa-Yauh; Hsu, Hui-Chi; Huang, Chin-Chou; Chen, Jaw-Wen; Cheng, Hao-Min; Lee, Wen-Shin; Chuang, Chiao-Lin; Chang, Ching-Chih; Huang, Chia-Chang
2011-12-01
The Objective Structured Clinical Examination (OSCE) usually needs a large number of stations and a long test time, which often exceeds the resources available in a medical center. We aimed to determine the reliability of a combination of Direct Observation of Procedural Skills (DOPS), the Internal Medicine In-Training Examination (IM-ITE(®)) and the OSCE, and to verify the correlation between small-scale OSCE+DOPS+IM-ITE(®)-composited scores and 360-degree evaluation scores of first-year post-graduate (PGY(1)) residents. Between January 2007 and January 2010, two hundred and nine internal medicine PGY(1) residents completed DOPS, IM-ITE(®) and a small-scale OSCE at our hospital. Faculty members regularly completed a 12-item 360-degree evaluation for each of the PGY(1) residents. The small-scale OSCE scores correlated well with the 360-degree evaluation scores (r = 0.37, p < 0.021). Interestingly, the addition of DOPS scores to small-scale OSCE scores [small-scale OSCE+DOPS-composited scores] increased its correlation with 360-degree evaluation scores of PGY(1) residents (r = 0.72, p < 0.036). Further, combination of the IM-ITE(®) score with small-scale OSCE+DOPS scores [small-scale OSCE+DOPS+IM-ITE(®)-composited scores] markedly enhanced their correlation with 360-degree evaluation scores (r = 0.85, p < 0.016). The strong correlations between the 360-degree evaluation and small-scale OSCE+DOPS+IM-ITE(®)-composited scores suggested that both methods were measuring the same quality. Our results showed that the small-scale OSCE, when combined with both the DOPS and the IM-ITE(®), could be an important assessment method for PGY(1) residents. Copyright © 2011. Published by Elsevier B.V.
2014-01-01
Background The UK Clinical Aptitude Test (UKCAT) was designed to address issues identified with traditional methods of selection. This study aims to examine the predictive validity of the UKCAT and compare this to traditional selection methods in the senior years of medical school. This was a follow-up study of two cohorts of students from two medical schools who had previously taken part in a study examining the predictive validity of the UKCAT in first year. Methods The sample consisted of 4th and 5th Year students who commenced their studies at the University of Aberdeen or University of Dundee medical schools in 2007. Data collected were: demographics (gender and age group), UKCAT scores; Universities and Colleges Admissions Service (UCAS) form scores; admission interview scores; Year 4 and 5 degree examination scores. Pearson’s correlations were used to examine the relationships between admissions variables, examination scores, gender and age group, and to select variables for multiple linear regression analysis to predict examination scores. Results Ninety-nine and 89 students at Aberdeen medical school from Years 4 and 5 respectively, and 51 Year 4 students in Dundee, were included in the analysis. Neither UCAS form nor interview scores were statistically significant predictors of examination performance. Conversely, the UKCAT yielded statistically significant validity coefficients between .24 and .36 in four of five assessments investigated. Multiple regression analysis showed the UKCAT made a statistically significant unique contribution to variance in examination performance in the senior years. Conclusions Results suggest the UKCAT appears to predict performance better in the later years of medical school compared to earlier years and provides modest supportive evidence for the UKCAT’s role in student selection within these institutions. 
Further research is needed to assess the predictive validity of the UKCAT against professional and behavioural outcomes as the cohort commences working life. PMID:24762134
Local Linear Observed-Score Equating
ERIC Educational Resources Information Center
Wiberg, Marie; van der Linden, Wim J.
2011-01-01
Two methods of local linear observed-score equating for use with anchor-test and single-group designs are introduced. In an empirical study, the two methods were compared with the current traditional linear methods for observed-score equating. As a criterion, the bias in the equated scores relative to true equating based on Lord's (1980)…
AlHeresh, Rawan; LaValley, Michael P; Coster, Wendy; Keysor, Julie J
2017-06-01
To evaluate the construct validity and scoring methods of the World Health Organization Health and Work Performance Questionnaire (HPQ) for people with arthritis. Construct validity was examined through hypothesis testing using the recommended guidelines of the Consensus-based Standards for the selection of health Measurement INstruments (COSMIN). The HPQ using the absolute scoring method showed moderate construct validity, as four of the seven hypotheses were met. The HPQ using the relative scoring method had weak construct validity, as only one of the seven hypotheses was met. The absolute scoring method for the HPQ is therefore superior in construct validity to the relative scoring method in assessing work performance among people with arthritis and related rheumatic conditions; however, more research is needed to further explore other psychometric properties of the HPQ.
Scoring systems for the Clock Drawing Test: A historical review
Spenciere, Bárbara; Alves, Heloisa; Charchat-Fichman, Helenice
2017-01-01
The Clock Drawing Test (CDT) is a simple neuropsychological screening instrument that is well accepted by patients and has solid psychometric properties. Several different CDT scoring methods have been developed, but no consensus has been reached regarding which scoring method is the most accurate. This article reviews the literature on these scoring systems and the changes they have undergone over the years. Historically, different types of scoring systems emerged. Initially, the focus was on screening for dementia, and the methods were both quantitative and semi-quantitative. Later, the need for an early diagnosis called for a scoring system that can detect subtle errors, especially those related to executive function. Therefore, qualitative analyses began to be used for both differential and early diagnoses of dementia. A widely used qualitative method was proposed by Rouleau et al. (1992). Tracing the historical path of these scoring methods is important for developing additional scoring systems and furthering dementia prevention research. PMID:29213488
Do MCAT scores predict USMLE scores? An analysis on 5 years of medical student data
Gauer, Jacqueline L.; Wolff, Josephine M.; Jackson, J. Brooks
2016-01-01
Introduction The purpose of this study was to determine the associations and predictive values of Medical College Admission Test (MCAT) component and composite scores prior to 2015 with U.S. Medical Licensure Exam (USMLE) Step 1 and Step 2 Clinical Knowledge (CK) scores, with a focus on whether students scoring low on the MCAT were particularly likely to continue to score low on the USMLE exams. Method Multiple linear regression, correlation, and chi-square analyses were performed to determine the relationship between MCAT component and composite scores and USMLE Step 1 and Step 2 CK scores from five graduating classes (2011–2015) at the University of Minnesota Medical School (N=1,065). Results The multiple linear regression analyses were both significant (p<0.001). The three MCAT component scores together explained 17.7% of the variance in Step 1 scores (p<0.001) and 12.0% of the variance in Step 2 CK scores (p<0.001). In the chi-square analyses, significant, albeit weak associations were observed between almost all MCAT component scores and USMLE scores (Cramer's V ranged from 0.05 to 0.24). Discussion Each of the MCAT component scores was significantly associated with USMLE Step 1 and Step 2 CK scores, although the effect size was small. Being in the top or bottom scoring range of the MCAT exam was predictive of being in the top or bottom scoring range of the USMLE exams, although the strengths of the associations were weak to moderate. These results indicate that MCAT scores are predictive of student performance on the USMLE exams, but, given the small effect sizes, should be considered as part of the holistic view of the student. PMID:27702431
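The "percentage of variance explained" figures above are R² values from the regression. For a single predictor the statistic reduces to the squared Pearson correlation, as in this sketch (illustrative only, with made-up data handling; the study used multiple regression over three MCAT component scores):

```python
def r_squared(x, y):
    # Proportion of variance in y explained by a least-squares line on x.
    # With one predictor this equals the squared Pearson correlation.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    return (sxy * sxy) / (sxx * syy)
```

A value of 0.177, as reported for Step 1, means that 17.7% of the variance in USMLE scores is accounted for by the predictors and the remaining 82.3% by everything else, which is why the authors call the effect size small.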
Cao, Zhi-Juan; Chen, Yue; Wang, Shu-Mei
2014-01-10
Although multifaceted community-based programmes have been widely developed, there remains a paucity of evaluation of the effectiveness of multifaceted injury prevention programmes implemented in different settings in the community context. This study aimed to provide information for the evaluation of community-based health education programmes for injury prevention among high school students. The pre-intervention survey was conducted in November 2009. Health belief model (HBM) based health education for injury prevention started in January 2010 and continued until the end of 2011 among high school students in the community context in Shanghai, China. A post-intervention survey was conducted six weeks after the completion of the intervention. Injury-related health belief indicators were captured by a short questionnaire before and after the intervention. Health belief scores were calculated and compared using the simple sum score (SSS) method and the confirmatory factor analysis weighted score (CFAWS) method, respectively. The average reliability coefficient for the questionnaire was 0.89. The factor structure of the HBM was specified, and the data fit the HBM very well in confirmatory factor analysis (CFA). The CFA showed that Perceived Benefits of Taking Action (BEN) and Perceived Seriousness (SER) had the greatest impact on health belief, while Perceived Susceptibility (SUS) and Cues to Action (CTA) were the second and third most important components of the HBM, respectively. Barriers to Taking Action (BAR) had no notable impact on the HBM: its standardized path coefficient was only 0.35, with only a small impact on CTA. The health belief score was significantly higher after intervention (p < 0.001) under both the CFAWS and SSS methods; however, the 95% confidence interval under the CFAWS method was narrower than under the SSS method. The results of the CFA provide further empirical support for the HBM in injury intervention. 
The CFAWS method can be used to calculate the health belief scores and evaluate the injury related intervention. The community-based school health education might improve injury-related health belief among high school students; however, this preliminary observation needs to be confirmed in further research.
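The contrast between the two scoring methods can be sketched in a few lines. The HBM component names come from the abstract; the weights and responses below are purely illustrative, not the loadings estimated in the study:

```python
# Illustrative CFA loadings and one respondent's component scores
# (hypothetical values, NOT the study's estimates).
weights = {"SUS": 0.55, "SER": 0.70, "BEN": 0.75, "BAR": 0.10, "CTA": 0.45}
responses = {"SUS": 4, "SER": 5, "BEN": 4, "BAR": 2, "CTA": 3}

def simple_sum_score(resp):
    """SSS: every component contributes equally."""
    return sum(resp.values())

def cfa_weighted_score(resp, w):
    """CFAWS: weight each component by its CFA loading, rescaled so the
    weights sum to 1, keeping the score on the original response scale."""
    total_w = sum(w.values())
    return sum(resp[k] * w[k] / total_w for k in resp)

print(simple_sum_score(responses))                       # 18
print(round(cfa_weighted_score(responses, weights), 2))  # 4.02
```

Because the weighted score down-weights noisy components such as BAR, its sampling variance is smaller, which is consistent with the narrower confidence interval the abstract reports.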
Toomey, Elaine; Matthews, James; Hurley, Deirdre A
2017-01-01
Objectives and design Despite an increasing awareness of the importance of fidelity of delivery within complex behaviour change interventions, it is often poorly assessed. This mixed methods study aimed to establish the fidelity of delivery of a complex self-management intervention and explore the reasons for these findings using a convergent/triangulation design. Setting Feasibility trial of the Self-management of Osteoarthritis and Low back pain through Activity and Skills (SOLAS) intervention (ISRCTN49875385), delivered in primary care physiotherapy. Methods and outcomes 60 SOLAS sessions were delivered across seven sites by nine physiotherapists. Fidelity of delivery of prespecified intervention components was evaluated using (1) audio-recordings (n=60), direct observations (n=24) and self-report checklists (n=60) and (2) individual interviews with physiotherapists (n=9). Quantitatively, fidelity scores were calculated as percentage means and SDs of components delivered. Associations between fidelity scores and physiotherapist variables were analysed using Spearman's correlations. Interviews were analysed using thematic analysis to explore potential reasons for fidelity scores. Integration of quantitative and qualitative data occurred at the interpretation level using triangulation. Results Quantitatively, fidelity scores were high for all assessment methods, with self-report (92.7%) consistently higher than direct observations (82.7%) or audio-recordings (81.7%). There was significant variation between physiotherapists' individual scores (69.8% to 100%). Both qualitative and quantitative data (from physiotherapist variables) indicated that physiotherapists' knowledge (Spearman's correlation, p=0.003) and previous experience (p=0.008) influenced their fidelity. The qualitative data also identified participant-level (eg, individual needs) and programme-level factors (eg, resources) as additional elements that influenced fidelity.
Conclusion The intervention was delivered with high fidelity. This study contributes to the limited evidence regarding fidelity assessment methods within complex behaviour change interventions. The findings suggest a combination of quantitative methods is suitable for the assessment of fidelity of delivery. A mixed methods approach provided a more insightful understanding of fidelity and its influencing factors. Trial registration number ISRCTN49875385; Pre-results. PMID:28780544
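The Spearman correlation used above to relate therapist variables to fidelity scores can be computed directly. A minimal no-ties implementation; the knowledge and fidelity values below are illustrative, not the study data:

```python
def spearman_rho(x, y):
    """Spearman's rank correlation via the classic formula
    rho = 1 - 6*sum(d^2) / (n*(n^2-1)); valid when there are no ties."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, idx in enumerate(order, start=1):
            r[idx] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical knowledge scores and fidelity percentages for 5 therapists:
knowledge = [55, 60, 72, 80, 90]
fidelity = [70.0, 69.8, 85.5, 92.1, 100.0]
print(round(spearman_rho(knowledge, fidelity), 2))  # 0.9
```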
NASA Astrophysics Data System (ADS)
Srivastava, S.
2015-12-01
Gravity Recovery and Climate Experiment (GRACE) data are widely used for hydrological studies of large basins (≥100,000 sq km). GRACE data (Stokes coefficients or equivalent water height) used for hydrological studies are not direct observations but result from high-level processing of raw data from the GRACE mission. Partner agencies such as CSR, GFZ and JPL implement their own methodologies, and their processing methods are independent of each other. The primary sources of error in GRACE data are measurement and modeling errors and the processing strategies of these agencies. Because of the different processing methods, the final data from the partner agencies are inconsistent with each other at some epochs. GRACE data provide spatio-temporal variations in the Earth's gravity, which are mainly attributed to seasonal fluctuations in water storage at and below the Earth's surface. During the quantification of errors/uncertainties, several high positive and negative peaks were observed that do not correspond to any hydrological process but may emanate from a combination of primary error sources or from other geophysical processes (e.g., earthquakes, landslides) resulting in redistribution of the Earth's mass. Such peaks can be considered outliers for hydrological studies. In this work, an algorithm has been designed to extract outliers from the GRACE data for the Indo-Gangetic plain, which accounts for the seasonal variations and the trend in the data. Different outlier detection methods have been used, such as the Z-score, modified Z-score and adjusted boxplot. For verification, assimilated hydrological (GLDAS) and hydro-meteorological data are used as the reference. The results show that the consistency amongst all data sets improved significantly after the removal of outliers.
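Of the outlier detection methods named, the modified Z-score is easy to sketch: it replaces the mean and standard deviation with the median and median absolute deviation, making it robust to the very peaks it is meant to flag. The series and threshold below are illustrative, not GRACE data:

```python
import statistics

def modified_z_scores(values):
    """Modified Z-score (Iglewicz-Hoaglin): 0.6745 * (x - median) / MAD."""
    med = statistics.median(values)
    mad = statistics.median([abs(x - med) for x in values])
    return [0.6745 * (x - med) / mad for x in values]

def flag_outliers(values, threshold=3.5):
    """3.5 is the conventional cutoff for the modified Z-score."""
    return [x for x, z in zip(values, modified_z_scores(values))
            if abs(z) > threshold]

# Toy anomaly series standing in for detrended GRACE residuals:
series = [1.2, 0.9, 1.1, 1.0, 8.5, 1.3, 0.8, -7.9, 1.1]
print(flag_outliers(series))  # [8.5, -7.9]
```

In the application described, such a detector would be run on residuals after the seasonal cycle and trend are removed, so that genuine hydrological signal is not flagged.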
Validation of the tablet-administered Brief Assessment of Cognition (BAC App).
Atkins, Alexandra S; Tseng, Tina; Vaughan, Adam; Twamley, Elizabeth W; Harvey, Philip; Patterson, Thomas; Narasimhan, Meera; Keefe, Richard S E
2017-03-01
Computerized tests benefit from automated scoring procedures and standardized administration instructions. These methods can reduce the potential for rater error. However, especially in patients with severe mental illnesses, the equivalency of traditional and tablet-based tests cannot be assumed. The Brief Assessment of Cognition in Schizophrenia (BACS) is a pen-and-paper cognitive assessment tool that has been used in hundreds of research studies and clinical trials, and has normative data available for generating age- and gender-corrected standardized scores. A tablet-based version of the BACS, called the BAC App, has been developed. This study compared performance on the BACS and the BAC App in patients with schizophrenia and healthy controls. Test equivalency was assessed, and the applicability of paper-based normative data was evaluated. Results demonstrated that the distributions of standardized composite scores for the tablet-based BAC App and the pen-and-paper BACS were indistinguishable, and the between-methods mean differences were not statistically significant. The discrimination between patients and controls was similarly robust. The between-methods correlations for individual measures in patients were r>0.70 for most subtests. When data from the Token Motor Test were omitted, the between-methods correlation of composite scores was r=0.88 (df=48; p<0.001) in healthy controls and r=0.89 (df=46; p<0.001) in patients, consistent with the test-retest reliability of each measure. Taken together, these results indicate that the tablet-based BAC App generates results consistent with the traditional pen-and-paper BACS, and support the notion that the BAC App is appropriate for use in clinical trials and clinical practice. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Comparing strategies to assess multiple behavior change in behavioral intervention studies.
Drake, Bettina F; Quintiliani, Lisa M; Sapp, Amy L; Li, Yi; Harley, Amy E; Emmons, Karen M; Sorensen, Glorian
2013-03-01
Alternatives to individual behavior change methods have been proposed; however, little has been done to investigate how these methods compare. To explore four methods that quantify change in multiple risk behaviors targeting four common behaviors, we utilized data from two cluster-randomized, multiple behavior change trials conducted in two settings: small businesses and health centers. The methods used were: (1) summative; (2) z-score; (3) optimal linear combination; and (4) impact score. In the Small Business study, methods 2 and 3 revealed similar outcomes; however, physical activity did not contribute to method 3. In the Health Centers study, similar results were found with each of the methods. Multivitamin intake contributed significantly more to each of the summary measures than the other behaviors. Selection of methods to assess multiple behavior change in intervention trials must consider the study design and the targeted population when determining the appropriate method(s) to use.
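The z-score method (method 2) can be sketched directly: standardize each behavior across participants so behaviors measured on different scales contribute comparably, then sum per participant. The participants and behaviors below are hypothetical:

```python
import statistics

def z_score_summary(sample):
    """Per-participant sum of behavior Z-scores (sketch of method 2;
    the data below are illustrative, not the trial data)."""
    behaviors = list(sample[0])
    stats_ = {b: (statistics.mean([p[b] for p in sample]),
                  statistics.stdev([p[b] for p in sample]))
              for b in behaviors}
    return [sum((p[b] - stats_[b][0]) / stats_[b][1] for b in behaviors)
            for p in sample]

# Hypothetical participants measured on two of the targeted behaviors:
participants = [
    {"fruit_veg_servings": 2, "activity_min": 30},
    {"fruit_veg_servings": 5, "activity_min": 10},
    {"fruit_veg_servings": 3, "activity_min": 50},
]
scores = z_score_summary(participants)
print([round(s, 2) for s in scores])  # [-0.87, 0.09, 0.78]
```

By construction the summary scores are centered at zero, so a behavior with little variance cannot dominate the composite, unlike a raw summative score.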
Segmentation of breast cancer cells positive 1+ and 3+ immunohistochemistry
NASA Astrophysics Data System (ADS)
Labellapansa, Ause; Muhimmah, Izzati; Indrayanti
2016-03-01
Breast cancer is a disease that occurs as a result of uncontrolled cell growth. One method of examining breast cancer cells is immunohistochemistry (IHC), which determines the status of the Human Epidermal Growth Factor Receptor 2 (HER2) protein. This study helps anatomic pathologists determine HER2 scores by using image processing techniques to obtain the percentage of HER2-overexpression-positive area for scores of 1+ and 3+. Scores of 0 (HER2-negative cells) and 2+ (equivocal results) are excluded, because for those cases it cannot be determined whether targeted therapy is warranted. The HER2-positive area percentage is computed by dividing the positive area by the tumor area. To obtain a better estimate of the tumor area, lymphocyte regions, which are not tumor, are eliminated using morphological opening. Testing on 10 IHC images with scores of 1+ and 3+, and on 10 IHC images without removing the lymphocyte area from the tumor area, showed that the system provides overall correct classification in accordance with expert analysis. With the operation to remove non-tumor areas, classification was 100% correct for scores of 3+ and 65% correct for scores of 1+.
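The cleanup step described, removing small lymphocyte regions by morphological opening before computing the positive-area percentage, can be sketched with toy binary masks. All mask shapes and sizes here are illustrative:

```python
import numpy as np
from scipy import ndimage

# Toy binary masks standing in for segmentation output (True = positive
# pixel). The small isolated blob plays the role of a lymphocyte region
# that is not tumor and should be removed before the area ratio is taken.
tumor_mask = np.zeros((20, 20), dtype=bool)
tumor_mask[2:12, 2:12] = True          # large tumor region
tumor_mask[16:18, 16:18] = True        # small lymphocyte-sized blob
her2_positive = np.zeros((20, 20), dtype=bool)
her2_positive[2:8, 2:12] = True        # stained part of the tumor

# Morphological opening (erosion then dilation) removes structures smaller
# than the structuring element, here the 2x2 blob, while restoring the
# large region to its original extent.
cleaned = ndimage.binary_opening(tumor_mask, structure=np.ones((3, 3)))

pct_positive = 100.0 * (her2_positive & cleaned).sum() / cleaned.sum()
print(round(pct_positive, 1))  # 60.0
```

Without the opening step, the lymphocyte blob would inflate the denominator and bias the positive-area percentage downward.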
A comparison of three developmental stage scoring systems.
Dawson, Theo Linda
2002-01-01
In social psychological research the stage metaphor has fallen into disfavor due to concerns about bias, reliability, and validity. To address some of these issues, I employ a multidimensional partial credit analysis comparing moral judgment interviews scored with the Standard Issue Scoring System (SISS) (Colby and Kohlberg, 1987b), evaluative reasoning interviews scored with the Good Life Scoring System (GLSS) (Armon, 1984b), and Good Education interviews scored with the Hierarchical Complexity Scoring System (HCSS) (Commons, Danaher, Miller, and Dawson, 2000). A total of 209 participants between the ages of 5 and 86 were interviewed. The multidimensional model reveals that even though the scoring systems rely upon different criteria and the data were collected using different methods and scored by different teams of raters, the SISS, GLSS, and HCSS all appear to measure the same latent variable. The HCSS exhibits more internal consistency than the SISS and GLSS, and solves some methodological problems introduced by the content dependency of the SISS and GLSS. These results and their implications are elaborated.
A Review of Propensity-Score Methods and Their Use in Cardiovascular Research.
Deb, Saswata; Austin, Peter C; Tu, Jack V; Ko, Dennis T; Mazer, C David; Kiss, Alex; Fremes, Stephen E
2016-02-01
Observational studies using propensity-score methods have been increasing in the cardiovascular literature because randomized controlled trials are not always feasible or ethical. However, propensity-score methods can be confusing, and the general audience may not fully understand the importance of this technique. The objectives of this review are to describe (1) the fundamentals of propensity score methods, (2) the techniques to assess propensity-score model adequacy, (3) the 4 major methods for using the propensity score (matching, stratification, covariate adjustment, and inverse probability of treatment weighting [IPTW]) using examples from previously published cardiovascular studies, and (4) the strengths and weaknesses of these 4 techniques. Our review suggests that matching or IPTW using the propensity score has been shown to be most effective in reducing bias in the treatment effect. Copyright © 2016 Canadian Cardiovascular Society. Published by Elsevier Inc. All rights reserved.
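Of the four techniques, IPTW is the simplest to sketch: each subject is weighted by the inverse probability of the treatment actually received. The propensity values below are illustrative; in practice they would be estimated, e.g. by logistic regression of treatment on baseline covariates:

```python
def iptw_weights(treated, propensity):
    """IPTW: treated subjects get weight 1/e(x), controls 1/(1 - e(x)),
    where e(x) is the propensity score. The reweighted sample forms a
    pseudo-population in which treatment is independent of the measured
    covariates."""
    return [1.0 / p if t else 1.0 / (1.0 - p)
            for t, p in zip(treated, propensity)]

# Hypothetical subjects: treatment indicator and estimated propensity
treated = [1, 1, 0, 0]
propensity = [0.8, 0.5, 0.5, 0.2]
print(iptw_weights(treated, propensity))  # [1.25, 2.0, 2.0, 1.25]
```

Note that propensities near 0 or 1 produce extreme weights, which is why weight truncation or stabilization is often applied alongside IPTW.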
Perser, Karen; Godfrey, David; Bisson, Leslie
2011-05-01
Double-row rotator cuff repair methods have improved biomechanical performance when compared with single-row repairs. To review clinical outcomes of single-row versus double-row rotator cuff repair with the hypothesis that double-row rotator cuff repair will result in better clinical and radiographic outcomes. Published literature from January 1980 to April 2010. Key terms included rotator cuff, prospective studies, outcomes, and suture techniques. The literature was systematically searched, and 5 level I and II studies were found comparing clinical outcomes of single-row and double-row rotator cuff repair. Coleman methodology scores were calculated for each article. Meta-analysis was performed, with treatment effect between single row and double row for clinical outcomes and with odds ratios for radiographic results. The sample size necessary to detect a given difference in clinical outcome between the 2 methods was calculated. Three level I studies had Coleman scores of 80, 74, and 81, and two level II studies had scores of 78 and 73. There were 156 patients with single-row repairs and 147 patients with double-row repairs, both with an average follow-up of 23 months (range, 12-40 months). Double-row repairs resulted in a greater treatment effect for each validated outcome measure in 4 studies, but the differences were not clinically or statistically significant (range, 0.4-2.2 points; 95% confidence interval, -0.19, 4.68 points). Double-row repairs had better radiographic results, but the differences were also not statistically significant (P = 0.13). Two studies had adequate power to detect a 10-point difference between repair methods using the Constant score, and 1 study had power to detect a 5-point difference using the UCLA (University of California, Los Angeles) score. Double-row rotator cuff repair does not show a statistically significant improvement in clinical outcome or radiographic healing with short-term follow-up.
Lee, Sung-Jae; Brooks, Ronald; Bolan, Robert K.; Flynn, Risa
2013-01-01
Men who have sex with men (MSM) in the United States represent a vulnerable population with lower rates of HIV testing. There are various specific attributes of HIV testing that may impact willingness to test (WTT) for HIV. Identifying specific attributes influencing patients' decisions around WTT for HIV is critical to ensure improved HIV testing uptake. This study examined WTT for HIV by using conjoint analysis, an innovative method for systematically estimating consumer preferences across discrete attributes. WTT for HIV was assessed across eight hypothetical HIV testing scenarios varying across seven dichotomous attributes: location (home vs. clinic), price (free vs. $50), sample collection (finger prick vs. blood), timeliness of results (immediate vs. 1–2 weeks), privacy (anonymous vs. confidential), results given (by phone vs. in-person), and type of counseling (brochure vs. in-person). Seventy-five MSM were recruited from a community-based organization providing HIV testing services in Los Angeles to participate in conjoint analysis. WTT for HIV score was based on a 100-point scale. Scores ranged from 32.2 to 80.3 for the eight hypothetical HIV testing scenarios. Price of HIV testing (free vs. $50) had the highest impact on WTT (impact score=31.4, SD=29.2, p<.0001), followed by timeliness of results (immediate vs. 1–2 weeks) (impact score=13.9, SD=19.9, p<.0001) and testing location (home vs. clinic) (impact score=10.3, SD=22.8, p=.0002). Impacts of other HIV testing attributes were not significant. The conjoint analysis method enabled direct assessment of HIV testing preferences and identified specific attributes that significantly impact WTT for HIV among MSM. This method provided empirical evidence to support the potential uptake of the newly FDA-approved over-the-counter HIV home-test kit with immediate results, with a cautionary note on the cost of the kit. PMID:23651439
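For a dichotomous attribute, the impact score can be sketched as the difference in mean willingness-to-test between the attribute's two levels across the rated scenarios. This is a simplified view of the conjoint-analysis computation, and the scenarios and scores below are hypothetical:

```python
import statistics

# Each scenario: (attribute levels, WTT score on a 100-point scale).
# Only one attribute is shown here for brevity.
scenarios = [
    ({"price": "free"}, 80), ({"price": "free"}, 75),
    ({"price": "$50"}, 45), ({"price": "$50"}, 50),
]

def impact_score(scenarios, attribute, level_a, level_b):
    """Impact of a dichotomous attribute: mean WTT at level A minus mean
    WTT at level B (simplified sketch; a full conjoint design estimates
    all attribute utilities jointly from an orthogonal scenario set)."""
    def level_mean(lvl):
        return statistics.mean(
            s for attrs, s in scenarios if attrs[attribute] == lvl)
    return level_mean(level_a) - level_mean(level_b)

print(impact_score(scenarios, "price", "free", "$50"))  # 30.0
```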
Lizunov, A Y; Gonchar, A L; Zaitseva, N I; Zosimov, V V
2015-10-26
We analyzed the frequency with which intraligand contacts occurred in a set of 1300 protein-ligand complexes [Plewczynski et al., J. Comput. Chem. 2011, 32, 742-755]. Our analysis showed that flexible ligands often form intraligand hydrophobic contacts, while intraligand hydrogen bonds are rare. The test set was also thoroughly investigated and classified. We suggest a universal method for enhancing a scoring function based on a potential of mean force (PMF-based score) by adding a term accounting for intraligand interactions. The method was implemented via an in-house-developed program utilizing the Algo_score scoring function [Ramensky et al., Proteins: Struct., Funct., Genet. 2007, 69, 349-357] based on the Tarasov-Muryshev PMF [Muryshev et al., J. Comput.-Aided Mol. Des. 2003, 17, 597-605]. The enhancement was shown to significantly improve docking and scoring quality for flexible ligands in the test set of 1300 protein-ligand complexes. We then investigated the correlation of the docking results with two parameters of the intraligand interaction estimation: the weight of the intraligand interactions and the minimum number of bonds between ligand atoms required for their interaction to be taken into account.
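The second of the two parameters, the minimum bond-path separation, can be illustrated with a toy intraligand hydrophobic contact counter. The geometry, the 4.5 Å cutoff, and the path threshold of 4 bonds are all illustrative assumptions, not the paper's values:

```python
import math

def intraligand_contacts(coords, elements, bond_path, min_path=4, cutoff=4.5):
    """Count intraligand hydrophobic (carbon-carbon) contacts, skipping
    atom pairs separated by fewer than `min_path` bonds so that trivially
    adjacent atoms are not counted as contacts (illustrative thresholds)."""
    count = 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if (elements[i] == elements[j] == "C"
                    and bond_path[i][j] >= min_path
                    and math.dist(coords[i], coords[j]) <= cutoff):
                count += 1
    return count

# Toy 5-carbon chain curled so its ends approach each other:
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.2, 1.3, 0.0),
          (1.5, 2.6, 0.0), (0.0, 2.6, 0.0)]
elements = ["C"] * 5
bond_path = [[abs(i - j) for j in range(5)] for i in range(5)]
print(intraligand_contacts(coords, elements, bond_path))  # 1
```

An enhanced score in the spirit of the abstract would then add such a contact term to the PMF-based score with a tunable weight, the first of the two parameters studied.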
Developing a Conceptually Equivalent Type 2 Diabetes Risk Score for Indian Gujaratis in the UK
Patel, Naina; Stone, Margaret; Barber, Shaun; Gray, Laura; Davies, Melanie; Khunti, Kamlesh
2016-01-01
Aims. To apply and assess the suitability of a model consisting of commonly used cross-cultural translation methods to achieve a conceptually equivalent Gujarati language version of the Leicester self-assessment type 2 diabetes risk score. Methods. Implementation of the model involved multiple stages, including pretesting of the translated risk score by conducting semistructured interviews with a purposive sample of volunteers. Interviews were conducted on an iterative basis to enable findings to inform translation revisions and to elicit volunteers' ability to self-complete and understand the risk score. Results. The pretest stage was an essential component involving recruitment of a diverse sample of 18 Gujarati volunteers, many of whom gave detailed suggestions for improving the instructions for the calculation of the risk score and BMI table. Volunteers found the standard and level of Gujarati accessible and helpful in understanding the concept of risk, although many of the volunteers struggled to calculate their BMI. Conclusions. This is the first time that a multicomponent translation model has been applied to the translation of a type 2 diabetes risk score into another language. This project provides an invaluable opportunity to share learning about the transferability of this model for translation of self-completed risk scores in other health conditions. PMID:27703985
2011-01-01
Introduction To develop a scoring method for quantifying nutrition risk in the intensive care unit (ICU). Methods A prospective, observational study of patients expected to stay > 24 hours. We collected data for key variables considered for inclusion in the score, which included: age, baseline APACHE II score, baseline SOFA score, number of comorbidities, days from hospital admission to ICU admission, body mass index (BMI) < 20, estimated percentage oral intake in the week prior, weight loss in the last 3 months, and serum interleukin-6 (IL-6), procalcitonin (PCT), and C-reactive protein (CRP) levels. Approximate quintiles of each variable were assigned points based on the strength of their association with 28-day mortality. Results A total of 597 patients were enrolled in this study. Based on statistical significance in the multivariable model, the final score used all candidate variables except BMI, CRP, PCT, estimated percentage oral intake, and weight loss. As the score increased, so did the mortality rate and duration of mechanical ventilation. Logistic regression demonstrated that nutritional adequacy modifies the association between the score and 28-day mortality (p = 0.01). Conclusions This scoring algorithm may be helpful in identifying critically ill patients most likely to benefit from aggressive nutrition therapy. PMID:22085763
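The quintile-to-points construction can be sketched as a band lookup per variable, with per-variable points summed into the total score. The cutoffs and point values below are hypothetical, not the study's fitted values:

```python
def points_for(value, cutoffs, points):
    """Assign risk points for one variable: `cutoffs` are the upper bounds
    of successive bands, and `points` has one extra entry for the top band.
    The bandings used below are hypothetical, not the published score."""
    for cut, pts in zip(cutoffs, points):
        if value < cut:
            return pts
    return points[-1]

# Hypothetical bands: age (years) and baseline APACHE II
age_points = points_for(72, cutoffs=[50, 75], points=[0, 1, 2])
apache_points = points_for(29, cutoffs=[15, 20, 28], points=[0, 1, 2, 3])
print(age_points + apache_points)  # 4
```

In the study itself, the points per band were chosen from the strength of each variable's association with 28-day mortality, so the bands act as a piecewise-constant approximation of that risk relationship.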
Effect of storytelling on hopefulness in girl students
Shafieyan, Shima; Soleymani, Mohammad Reza; Samouei, Raheleh; Afshar, Mina
2017-01-01
BACKGROUND AND AIM: One of the methods that help students learn critical thinking and decision-making skills is storytelling. A story helps students place themselves in the same situation as the main protagonist, try different ways, and finally select and implement the best possible method. The goal of this study was to investigate the effect of storytelling on the hopefulness of students aged 8–11 in Isfahan's 2nd educational district. METHODS: This is an applied, quasi-experimental study. The study population comprised 34 randomly selected students attending one of the schools in Isfahan's 2nd educational district. The data gathering tool was the standard Kazdin hopefulness scale (α = 0.72), and data were gathered before and after 8 storytelling sessions for the intervention group. The gathered data were analyzed using descriptive and analytical statistics (paired and independent t-tests) with SPSS version 18. RESULTS: The study's findings showed a significant difference in the average hopefulness score of students in the study group between pre- and posttest (P = 0.04). Furthermore, independent t-test results showed a significant difference in hopefulness scores between the intervention and control groups (P = 0.001); the average hopefulness score of the intervention group after the storytelling sessions was higher than that of the control group. CONCLUSION: The results show the effectiveness of storytelling as a method for improving hopefulness in students. PMID:29296602
Leucht, Stefan; Fennema, Hein; Engel, Rolf R; Kaspers-Janssen, Marion; Lepping, Peter; Szegedi, Armin
2017-03-01
Little is known about the clinical relevance of the Montgomery Asberg Depression Rating Scale (MADRS) total scores. It is unclear how total scores translate into clinical severity, or how commonly used measures for response (reduction from baseline of ≥50% in the total score) translate into clinical relevance. Moreover, MADRS-based definitions of remission vary. We therefore compared: (a) the MADRS total score with the Clinical Global Impression - Severity score (CGI-S); (b) the percentage and absolute change in the MADRS total scores with the Clinical Global Impression - Improvement score (CGI-I); (c) the absolute and percentage change in the MADRS total scores with the CGI-S absolute change. The method used was equipercentile linking of MADRS and CGI ratings from 22 drug trials in patients with Major Depressive Disorder (MDD) (n=3288). Our results confirm the validity of the commonly used measures for response in MDD trials: a CGI-I score of 2 ('much improved') corresponded to a percentage MADRS reduction from baseline of 48-57%, and a CGI-I score of 1 ('very much improved') to a reduction of 80-84%. If a state of almost complete absence of symptoms were required for a definition of remission, the MADRS total score would be <8, because such scores corresponded to a CGI-S score of 2 ('borderline mentally ill'). Although our analysis is based on a large number of patients, the original trials were not specifically designed to examine our research question. The results might contribute to a better understanding and improved interpretation of clinical trial results in MDD. Copyright © 2017 Elsevier B.V. All rights reserved.
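Equipercentile linking maps a score on one scale to the equally ranked score on the other. A minimal unsmoothed sketch; real implementations presmooth both distributions, and the score lists below are illustrative, not the pooled trial data:

```python
import bisect

def equipercentile_link(score, from_scores, to_scores):
    """Map `score` on the source scale to the score with the same
    percentile rank on the target scale (unsmoothed sketch)."""
    from_sorted, to_sorted = sorted(from_scores), sorted(to_scores)
    # percentile rank of `score` in the source distribution
    rank = bisect.bisect_right(from_sorted, score) / len(from_sorted)
    # score at that percentile in the target distribution
    idx = min(int(rank * len(to_sorted)), len(to_sorted) - 1)
    return to_sorted[idx]

# Illustrative MADRS totals and CGI-S ratings (hypothetical values):
madrs = [4, 7, 12, 18, 24, 30, 36, 40]
cgi_s = [1, 2, 2, 3, 4, 5, 5, 6]
print(equipercentile_link(7, madrs, cgi_s))  # 2
```

With these made-up distributions a low MADRS total links to CGI-S 2, loosely mirroring the abstract's observation that MADRS <8 corresponded to 'borderline mentally ill'.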
The gender difference on the Mental Rotations test is not due to performance factors.
Masters, M S
1998-05-01
Men score higher than women on the Mental Rotations test (MRT), and the magnitude of this gender difference is the largest of that on any spatial test. Goldstein, Haldane, and Mitchell (1990) reported finding that the gender difference on the MRT disappears when "performance factors" are controlled--specifically, when subjects are allowed sufficient time to attempt all items on the test or when a scoring procedure that controls for the number of items attempted is used. The present experiment also explored whether eliminating these performance factors results in a disappearance of the gender difference on the test. Male and female college students were allowed a short time period or unlimited time on the MRT. The tests were scored according to three different procedures. The results showed no evidence that the gender difference on the MRT was affected by the scoring method or the time limit. Regardless of the scoring procedure, men scored higher than women, and the magnitude of the gender difference persisted undiminished when subjects completed all items on the test. Thus there was no evidence that performance factors produced the gender difference on the MRT. These results are consistent with the results of other investigators who have attempted to replicate Goldstein et al.'s findings.
Jafari, Rahim; Sadeghi, Mehdi; Mirzaie, Mehdi
2016-05-01
The approaches taken to represent and describe structural features of macromolecules are of major importance when developing computational methods for studying and predicting their structures and interactions. This study explores the significance of Delaunay tessellation for the definition of atomic interactions by evaluating its impact on scoring performance in protein-protein docking prediction. Two sets of knowledge-based scoring potentials were extracted from a training dataset of native protein-protein complexes. The potentials of the first set were derived using atomic interactions extracted from Delaunay-tessellated structures. The potentials of the second set were calculated conventionally, that is, using atom pairs whose interactions were determined by their separation distances. The scoring potentials were tested against two different docking decoy sets and their performances were compared. The results show that, if properly optimized, the Delaunay-based scoring potentials can achieve a higher success rate than the conventional scoring potentials. These results, together with the results of a previous study on the use of Delaunay-based potentials in protein fold recognition, point to the fact that Delaunay tessellation of protein structure can provide a more realistic definition of atomic interaction and therefore, if appropriately utilized, may improve the accuracy of pair potentials. Copyright © 2016 Elsevier Inc. All rights reserved.
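The two contact definitions being compared can be reproduced with scipy: edges of the Delaunay tessellation versus a plain distance cutoff. The toy coordinates and the 5.0 cutoff below are illustrative:

```python
import itertools
import numpy as np
from scipy.spatial import Delaunay

def delaunay_contacts(coords):
    """Atom pairs that share an edge in the Delaunay tessellation."""
    tri = Delaunay(coords)
    pairs = set()
    for simplex in tri.simplices:          # tetrahedra in 3-D
        for i, j in itertools.combinations(simplex, 2):
            pairs.add((min(i, j), max(i, j)))
    return pairs

def distance_contacts(coords, cutoff=5.0):
    """Conventional definition: pairs closer than a distance cutoff."""
    coords = np.asarray(coords)
    return {(i, j)
            for i in range(len(coords))
            for j in range(i + 1, len(coords))
            if np.linalg.norm(coords[i] - coords[j]) < cutoff}

# Five toy "atoms":
pts = np.array([[0., 0., 0.], [3., 0., 0.], [0., 3., 0.],
                [0., 0., 3.], [6., 6., 6.]])
print(len(delaunay_contacts(pts)), len(distance_contacts(pts)))
```

Note the two definitions need not agree: the tessellation links natural geometric neighbors regardless of absolute distance, while the cutoff links everything within a fixed radius.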
Bornstein, Robert F
2007-06-01
The degree to which projection plays a role in Rorschach (Rorschach, 1921/1942) responding remains controversial, in part because extant data have yielded inconclusive results. In this investigation, I examined the impact of social projection on Rorschach Oral Dependency (ROD) scores using methods adapted from social cognition research. In Study 1, I prescreened 85 college students (40 women and 45 men) with the ROD scale and a widely used self-report measure of dependency, the Interpersonal Dependency Inventory (IDI; Hirschfeld et al., 1977). Results show that informing participants who scored low on the IDI that they were in fact highly dependent led to significant increases in ROD scores; I did not obtain parallel ROD increases for participants who scored high on the IDI or for participants who received low-dependent feedback. In Study 2, I examined a separate sample of 80 prescreened college students (40 women and 40 men) and showed that providing low self-report participants an opportunity to attribute dependency to a fictional target person prior to Rorschach responding attenuated the impact of high-dependent feedback on ROD scores. These results suggest that projection played a role in at least one domain of Rorschach responding. I discuss theoretical, clinical, and empirical implications of these results.
Velissaris, Dimitrios; Karanikolas, Menelaos; Flaris, Nikolaos; Fligou, Fotini; Marangos, Markos; Filos, Kriton S
2012-01-01
Introduction. Severe leptospirosis, also known as Weil's disease, can cause multiorgan failure with high mortality. Scoring systems for disease severity have not been validated for leptospirosis, and there is no documented method to predict mortality. Methods. This is a case series of 10 patients admitted to the ICU for multiorgan failure from severe leptospirosis. Data were collected retrospectively, with approval from the Institution Ethics Committee. Results. Ten patients with severe leptospirosis were admitted to the Patras University Hospital ICU over a four-year period. Although, based on SOFA scores, predicted mortality was over 80%, seven of the 10 patients survived and were discharged from the hospital in good condition. There was no association between SAPS II or SOFA scores and mortality, but survivors had significantly lower APACHE II scores than nonsurvivors. Conclusion. Commonly used severity scores do not seem to be useful in predicting mortality in severe leptospirosis. Early ICU admission and resuscitation based on a goal-directed therapy protocol are recommended and may reduce mortality. However, this study is limited by retrospective data collection and small sample size. Data from large prospective studies are needed to validate our findings.
Measuring patient's expectation and the perception of quality in LASIK services
Lin, Deng-Juin; Sheu, Ing-Cheau; Pai, Jar-Yuan; Bair, Alex; Hung, Che-Yu; Yeh, Yuan-Hung; Chou, Ming-Jen
2009-01-01
Background LASIK is the use of excimer lasers to treat therapeutic and refractive visual disorders, ranging from superficial scars to nearsightedness (myopia), and from astigmatism to farsightedness (hyperopia). The purposes of this study were, first, to check the applicability and psychometric properties of the SERVQUAL instrument in a LASIK surgery population and, second, to use structural equation modeling (SEM) to investigate the relationships among loyalty, perceptions, and expectations in LASIK surgery. Methods This study was conducted through questionnaire development. A total of 463 consecutive patients attending LASIK surgery affiliated with Chung Shan Medical University Eye Center enrolled in this study. All participants were asked to complete revised SERVQUAL questionnaires. Student's t tests, correlation tests, ANOVA, and factor analyses were used to identify the characteristics and factors of service quality. Paired t tests were used to test the gap between expectation and perception scores, and structural equation modeling was used to examine relationships among satisfaction components. Results The effective response rate was 97.3%. Validity was verified by several methods, and the internal reliability (Cronbach's alpha) was > 0.958. Patients' scores were very high, with an overall score of 6.41 (0.66), expectations at 6.68 (0.47), and perceptions at 6.51 (0.57). However, the gap between expectations and perceptions was significant (t = 6.08). Furthermore, there were significant differences in expectation scores among the different occupations. The results also showed that the higher the education of the patient, the lower the perception score (r = -0.10). Factor analysis of the 22 SERVQUAL items yielded 5 factors, which explained 72.94% of the total variance in perception scores and 77.12% of the total variance in expectation scores.
The goodness-of-fit summary of the structural equation model showed the hypothesized relationships among expectations, perceptions, and loyalty. Conclusion The results of this research appear to show that the SERVQUAL instrument is a useful measurement tool for assessing and monitoring service quality in LASIK services, enabling staff to identify where improvements are needed from the patients' perspective. There were service quality gaps in reliability, assurance, and empathy. This study suggests that physicians should increase their discussions with patients, which has already been shown to be an effective way to increase patients' satisfaction with medical care, regardless of the procedure received. PMID:19591682
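The expectation-perception gap test used above is a paired t statistic over per-patient differences. A minimal sketch with made-up item scores, not the study data:

```python
import math
import statistics

def paired_t(expectations, perceptions):
    """Paired t statistic: mean gap divided by the standard error of the
    per-patient expectation-perception differences."""
    diffs = [e - p for e, p in zip(expectations, perceptions)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

exp_scores = [6.8, 6.5, 6.9, 6.6, 6.7]   # hypothetical expectation scores
per_scores = [6.5, 6.4, 6.6, 6.5, 6.4]   # hypothetical perception scores
print(round(paired_t(exp_scores, per_scores), 2))  # 4.49
```

A positive t here means expectations exceed perceptions, i.e. a negative SERVQUAL gap of the kind the study reports for reliability, assurance, and empathy.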
Multimodal biometric method that combines veins, prints, and shape of a finger
NASA Astrophysics Data System (ADS)
Kang, Byung Jun; Park, Kang Ryoung; Yoo, Jang-Hee; Kim, Jeong Nyeo
2011-01-01
Multimodal biometrics provides high recognition accuracy and population coverage by using various biometric features. A single finger contains finger veins, fingerprints, and finger geometry features; by using multimodal biometrics, information on these multiple features can be simultaneously obtained in a short time and their fusion can outperform the use of a single feature. This paper proposes a new finger recognition method based on the score-level fusion of finger veins, fingerprints, and finger geometry features. This research is novel in the following four ways. First, the performances of the finger-vein and fingerprint recognition are improved by using a method based on a local derivative pattern. Second, the accuracy of the finger geometry recognition is greatly increased by combining a Fourier descriptor with principal component analysis. Third, a fuzzy score normalization method is introduced; its performance is better than the conventional Z-score normalization method. Fourth, finger-vein, fingerprint, and finger geometry recognitions are combined by using three support vector machines and a weighted SUM rule. Experimental results showed that the equal error rate of the proposed method was 0.254%, which was lower than those of the other methods.
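The conventional Z-score normalization and weighted SUM rule mentioned above can be sketched as follows. The matcher scores and weights are hypothetical, and the paper's fuzzy normalization and SVM-based fusion are not reproduced here; this is only the baseline scheme the paper compares against.

```python
import numpy as np

def z_normalize(scores, mean, std):
    """Conventional Z-score normalization of matcher scores."""
    return (scores - mean) / std

def weighted_sum_fusion(score_lists, weights):
    """Score-level fusion by the weighted SUM rule."""
    normed = [z_normalize(s, s.mean(), s.std()) for s in score_lists]
    return sum(w * s for w, s in zip(weights, normed))

# Hypothetical matching scores from three matchers for 5 probe comparisons
# (indices 0, 2, 4 are genuine attempts; 1 and 3 are impostors).
vein = np.array([0.91, 0.15, 0.88, 0.20, 0.95])
fingerprint = np.array([0.85, 0.30, 0.80, 0.25, 0.90])
geometry = np.array([0.70, 0.40, 0.65, 0.35, 0.75])

fused = weighted_sum_fusion([vein, fingerprint, geometry], weights=[0.5, 0.3, 0.2])
decision = fused > 0  # accept as genuine if fused score is above threshold
print(decision)
```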
Towards accurate modeling of noncovalent interactions for protein rigidity analysis
2013-01-01
Background Protein rigidity analysis is an efficient computational method for extracting flexibility information from static, X-ray crystallography protein data. Atoms and bonds are modeled as a mechanical structure and analyzed with a fast graph-based algorithm, producing a decomposition of the flexible molecule into interconnected rigid clusters. The result depends critically on noncovalent atomic interactions, primarily on how hydrogen bonds and hydrophobic interactions are computed and modeled. Ongoing research points to the stringent need for benchmarking rigidity analysis software systems, towards the goal of increasing their accuracy and validating their results, both against each other and against biologically relevant (functional) parameters. We propose two new methods for modeling hydrogen bonds and hydrophobic interactions that more accurately reflect a mechanical model, without being computationally more intensive. We evaluate them using a novel scoring method, based on the B-cubed score from the information retrieval literature, which measures how well two cluster decompositions match. Results To evaluate the modeling accuracy of KINARI, our pebble-game rigidity analysis system, we use a benchmark data set of 20 proteins, each with multiple distinct conformations deposited in the Protein Data Bank. Cluster decompositions for them were previously determined with the RigidFinder method from Gerstein's lab and validated against experimental data. When KINARI's default tuning parameters are used, an improvement of the B-cubed score over a crude baseline is observed in 30% of this data set. With our new modeling options, improvements were observed in over 70% of the proteins in this data set. We investigate the sensitivity of the cluster decomposition score with case studies on pyruvate phosphate dikinase and calmodulin.
Conclusion To substantially improve the accuracy of protein rigidity analysis systems, thorough benchmarking must be performed on all current systems and future extensions. We have measured the gain in performance by comparing different modeling methods for noncovalent interactions. We showed that new criteria for modeling hydrogen bonds and hydrophobic interactions can significantly improve the results. The two new methods proposed here have been implemented and made publicly available in the current version of KINARI (v1.3), together with the benchmarking tools, which can be downloaded from our software's website, http://kinari.cs.umass.edu. PMID:24564209
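A minimal implementation of the B-cubed comparison of two cluster decompositions, as used for scoring above, might look like this. This is a naive O(n²) sketch of the standard B-cubed definition from the information retrieval literature, not KINARI's code; the input lists assign a cluster label to each item.

```python
def b_cubed(pred, gold):
    """B-cubed precision, recall and F1 for two cluster decompositions,
    given as lists of cluster labels aligned by item index."""
    n = len(pred)
    precision = recall = 0.0
    for i in range(n):
        # items sharing item i's predicted cluster / gold cluster
        same_pred = {j for j in range(n) if pred[j] == pred[i]}
        same_gold = {j for j in range(n) if gold[j] == gold[i]}
        overlap = len(same_pred & same_gold)
        precision += overlap / len(same_pred)
        recall += overlap / len(same_gold)
    precision /= n
    recall /= n
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Four items: prediction splits them 2+2, gold splits them 3+1
p, r, f = b_cubed([0, 0, 1, 1], [0, 0, 0, 1])
print(p, r, f)
```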
A test of Hartnett's revisions to the pubic symphysis and fourth rib methods on a modern sample.
Merritt, Catherine E
2014-05-01
Estimating age at death is one of the most important aspects of creating a biological profile. Most adult age estimation methods were developed on North American skeletal collections from the early to mid-20th century, and their applicability to modern populations has been questioned. In 2010, Hartnett used a modern skeletal collection from the Maricopa County Forensic Science Center to revise the Suchey-Brooks pubic symphysis method and the İşcan et al. fourth rib methods. The current study tests Hartnett's revised methods as well as the original Suchey-Brooks and İşcan et al. methods on a modern sample from the William Bass Skeletal Collection (N = 313, mean age = 58.5, range 19-92). Results show that the Suchey-Brooks and İşcan et al. methods assign individuals to the correct phase 70.8% and 57.5% of the time compared with Hartnett's revised methods at 58.1% and 29.7%, respectively, with correctness scores based on one standard deviation of the mean rather than the entire age range. Accuracy and bias scores are significantly improved for Hartnett's revised pubic symphysis method and marginally better for Hartnett's revised fourth rib method, suggesting that the revised mean ages at death of Hartnett's phases better reflect this modern population. Overall, both Hartnett's revised methods are reliable age estimation methods. For the pubic symphysis, there are significant improvements in accuracy and bias scores, especially for older individuals; however, for the fourth rib, the results are comparable to the original İşcan et al. methods, with some improvement for older individuals. © 2014 American Academy of Forensic Sciences.
An Investigation of Undefined Cut Scores with the Hofstee Standard-Setting Method
ERIC Educational Resources Information Center
Wyse, Adam E.; Babcock, Ben
2017-01-01
This article provides an overview of the Hofstee standard-setting method and illustrates several situations where the Hofstee method will produce undefined cut scores. The situations where the cut scores will be undefined involve cases where the line segment derived from the Hofstee ratings does not intersect the score distribution curve based on…
ERIC Educational Resources Information Center
Ossai, Peter Agbadobi Uloku
2016-01-01
This study examined the relationship between students' scores on Research Methods and Statistics and the undergraduate project in the final year. The purpose was to find out whether students matched knowledge of research with project-writing skill. The study adopted an ex post facto correlational design. Scores on Research Methods and Statistics for…
Using Generalizability Theory to Examine Different Concept Map Scoring Methods
ERIC Educational Resources Information Center
Cetin, Bayram; Guler, Nese; Sarica, Rabia
2016-01-01
Problem Statement: In addition to being teaching tools, concept maps can be used as effective assessment tools. The use of concept maps for assessment has raised the issue of scoring them. Concept maps generated and used in different ways can be scored via various methods. Holistic and relational scoring methods are two of them. Purpose of the…
NASA Astrophysics Data System (ADS)
Crawford, I.; Ruske, S.; Topping, D. O.; Gallagher, M. W.
2015-07-01
In this paper we present improved methods for discriminating and quantifying Primary Biological Aerosol Particles (PBAP) by applying hierarchical agglomerative cluster analysis to multi-parameter ultraviolet light-induced fluorescence (UV-LIF) spectrometer data. The methods employed in this study can be applied to data sets in excess of 1×10⁶ points on a desktop computer, allowing for each fluorescent particle in a dataset to be explicitly clustered. This reduces the potential for misattribution found in the subsampling and comparative attribution methods used in previous approaches, improving our capacity to discriminate and quantify PBAP meta-classes. We evaluate the performance of several hierarchical agglomerative cluster analysis linkages and data normalisation methods using laboratory samples of known particle types and an ambient dataset. Fluorescent and non-fluorescent polystyrene latex spheres were sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4), where the optical size, asymmetry factor and fluorescent measurements were used as inputs to the analysis package. It was found that the Ward linkage with z-score or range normalisation performed best, correctly attributing 98% and 98.1% of the data points respectively. The best performing methods were applied to the BEACHON-RoMBAS ambient dataset, where it was found that the z-score and range normalisation methods yield similar results, with each method producing clusters representative of fungal spores and bacterial aerosol, consistent with previous results. The z-score result was compared to clusters generated with previous approaches (WIBS AnalysiS Program, WASP), where we observe that the subsampling and comparative attribution method employed by WASP results in the overestimation of the fungal spore concentration by a factor of 1.5 and the underestimation of bacterial aerosol concentration by a factor of 5.
We suggest that this is likely due to errors arising from misattribution caused by poor centroid definition, and from failure to assign particles to a cluster as a result of the subsampling and comparative attribution method employed by WASP. The methods used here allow the entire fluorescent particle population to be analysed, yielding an explicit cluster attribution for each particle, improving cluster centroid definition and our capacity to discriminate and quantify PBAP meta-classes compared to previous approaches.
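The best-performing combination reported above, Ward linkage with z-score normalisation, can be sketched with SciPy on synthetic data. The two particle populations below are invented stand-ins for WIBS measurements (size, asymmetry, fluorescence), not the study's data:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)

# Two hypothetical, well-separated particle populations with
# UV-LIF-style features: optical size, asymmetry factor, fluorescence.
pop_a = rng.normal([2.0, 10.0, 200.0], [0.2, 1.0, 20.0], size=(100, 3))
pop_b = rng.normal([5.0, 30.0, 800.0], [0.5, 3.0, 50.0], size=(100, 3))
data = np.vstack([pop_a, pop_b])

# z-score normalisation so no single parameter dominates the distance metric
z = (data - data.mean(axis=0)) / data.std(axis=0)

# Hierarchical agglomerative clustering with the Ward linkage
tree = linkage(z, method="ward")
labels = fcluster(tree, t=2, criterion="maxclust")

# Each recovered cluster should correspond to one population
print(np.unique(labels[:100]), np.unique(labels[100:]))
```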
Bime, Christian; Wei, Christine Y.; Holbrook, Janet T.; Sockrider, Marianna M.; Revicki, Dennis A.; Wise, Robert A.
2012-01-01
Background The evaluation of asthma symptoms is a core outcome measure in asthma clinical research. The Asthma Symptom Utility Index (ASUI) was developed to assess frequency and severity of asthma symptoms. The psychometric properties of the ASUI are not well characterized and a minimal important difference (MID) is not established. Objectives We assessed the reliability, validity, and responsiveness to change of the ASUI in a population of adult asthma patients. We also sought to determine the MID for the ASUI. Methods Adult asthma patients (n = 1648) from two previously completed multicenter randomized trials were included. Demographic information, spirometry, ASUI scores, and other asthma questionnaire scores were obtained at baseline and during follow-up visits. Participants also kept a daily asthma diary. Results Internal consistency reliability of the ASUI was 0.74 (Cronbach’s alpha). Test-retest reliability was 0.76 (intra-class correlation). Construct validity was demonstrated by significant correlations between ASUI scores and Asthma Control Questionnaire (ACQ) scores (Spearman correlation r = −0.79, 95% CI [−0.85, −0.75], P<0.001) and Mini Asthma Quality of Life Questionnaire (Mini AQLQ) scores (r = 0.59, 95% CI [0.51, 0.61], P<0.001). Responsiveness to change was demonstrated, with significant differences between mean changes in ASUI score across groups of participants differing by 10% in the percent predicted FEV1 (P<0.001), and by 0.5 points in ACQ score (P < 0.001). Anchor-based methods and statistical methods support an MID for the ASUI of 0.09 points. Conclusions The ASUI is reliable, valid, and responsive to changes in asthma control over time. The MID of the ASUI (range of scores 0–1) is 0.09. PMID:23026499
Psoriasis image representation using patch-based dictionary learning for erythema severity scoring.
George, Yasmeen; Aldeen, Mohammad; Garnavi, Rahil
2018-06-01
Psoriasis is a chronic skin disease which can be life-threatening. Accurate severity scoring helps dermatologists to decide on the treatment. In this paper, we present a semi-supervised computer-aided system for automatic erythema severity scoring in psoriasis images. Firstly, the unsupervised stage includes a novel image representation method. We construct a dictionary, which is then used in the sparse representation for local feature extraction. To acquire the final image representation vector, an aggregation method is exploited over the local features. Secondly, the supervised phase is where various multi-class machine learning (ML) classifiers are trained for erythema severity scoring. Finally, we compare the proposed system with two popular unsupervised feature extractor methods, namely: bag of visual words model (BoVWs) and AlexNet pretrained model. Root mean square error (RMSE) and F1 score are used as performance measures for the learned dictionaries and the trained ML models, respectively. A psoriasis image set consisting of 676 images, is used in this study. Experimental results demonstrate that the use of the proposed procedure can provide a setup where erythema scoring is accurate and consistent. Also, it is revealed that dictionaries with large number of atoms and small patch sizes yield the best representative erythema severity features. Further, random forest (RF) outperforms other classifiers with F1 score 0.71, followed by support vector machine (SVM) and boosting with 0.66 and 0.64 scores, respectively. Furthermore, the conducted comparative studies confirm the effectiveness of the proposed approach with improvement of 9% and 12% over BoVWs and AlexNet based features, respectively. Crown Copyright © 2018. Published by Elsevier Ltd. All rights reserved.
Walking on a user similarity network towards personalized recommendations.
Gan, Mingxin
2014-01-01
Personalized recommender systems have been receiving more and more attention in addressing the serious problem of information overload accompanying the rapid evolution of the world-wide-web. Although traditional collaborative filtering approaches based on similarities between users have achieved remarkable success, it has been shown that the existence of popular objects may adversely influence the correct scoring of candidate objects, which leads to unreasonable recommendation results. Meanwhile, recent advances have demonstrated that approaches based on diffusion and random walk processes exhibit superior performance over collaborative filtering methods in both recommendation accuracy and diversity. Building on these results, we adopt three strategies (power-law adjustment, nearest neighbor, and threshold filtration) to build an adjusted user similarity network from user similarity scores calculated on historical data, and then propose a random walk with restart model on the constructed network to achieve personalized recommendations. We perform cross-validation experiments on two real data sets (MovieLens and Netflix) and compare the performance of our method against the existing state-of-the-art methods. Results show that our method outperforms existing methods in not only recommendation accuracy and diversity, but also retrieval performance.
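A minimal random-walk-with-restart scorer on a user similarity network might look like the sketch below. The similarity matrix and restart parameter are illustrative, and the paper's three network-adjustment strategies are not reproduced; the walker repeatedly teleports back to the target user, so stationary probabilities rank the other users by network proximity.

```python
import numpy as np

def random_walk_with_restart(A, seed, alpha=0.15, tol=1e-8):
    """Random walk with restart on a user similarity network.
    A: symmetric nonnegative similarity matrix; seed: target user index;
    alpha: restart (teleport) probability."""
    # column-normalize to obtain a column-stochastic transition matrix
    W = A / A.sum(axis=0, keepdims=True)
    r = np.full(len(A), 1.0 / len(A))
    e = np.zeros(len(A))
    e[seed] = 1.0
    while True:
        r_new = (1 - alpha) * W @ r + alpha * e
        if np.abs(r_new - r).sum() < tol:
            return r_new
        r = r_new

# Hypothetical 4-user similarity network (edge weights = similarity)
A = np.array([[0, 3, 1, 0],
              [3, 0, 2, 1],
              [1, 2, 0, 4],
              [0, 1, 4, 0]], dtype=float)
scores = random_walk_with_restart(A, seed=0)
print(scores)
```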
Methods of scaling threshold color difference using printed samples
NASA Astrophysics Data System (ADS)
Huang, Min; Cui, Guihua; Liu, Haoxue; Luo, M. Ronnier
2012-01-01
A series of printed samples, on a semi-gloss paper substrate and with color differences near the visual threshold, was prepared for scaling visual color difference and evaluating the performance of different methods. The probabilities of perceptibility were converted to Z-scores, and the color differences computed by each method were scaled against these Z-scores. The resulting visual color-difference scale was obtained and checked with the STRESS factor. The results indicated that only the absolute scales changed; the relative scales between pairs in the data were preserved.
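The probit step described above, converting perceptibility probabilities to Z-scores, can be sketched as follows. The proportions are hypothetical; the transform assumes perceptibility judgments follow a cumulative normal distribution.

```python
from statistics import NormalDist

# Hypothetical proportions of observers judging each sample pair as
# perceptibly different from its reference.
proportions = [0.16, 0.31, 0.50, 0.69, 0.84]

# Probit transform: map each proportion to a Z-score via the inverse
# cumulative normal distribution.
z_scores = [NormalDist().inv_cdf(p) for p in proportions]
print([round(z, 2) for z in z_scores])
```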
Leth, Peter Mygind; Ibsen, Marlene
2010-06-01
The purpose of this investigation is to evaluate the value of postmortem computerized tomography (CT) for Abbreviated Injury Scale (AIS) scoring and Injury Severity Scoring (ISS) of traffic fatalities. This is a prospective investigation of a consecutive series of 52 traffic fatalities from Southern Denmark that were CT scanned and autopsied. The AIS and ISS scores based on CT and autopsy (AU) were registered in a computer database and compared. Kappa values for reproducibility of AIS severity scores and ISS scores were calculated. On average, there was 94% agreement between AU and CT in detecting the presence or absence of lesions in the various anatomic regions, and the severity scores were the same in 90% of all cases (range, 75-100%). When different severity scoring was obtained, CT detected more lesions with a high severity score in the facial skeleton, pelvis, and extremities, whereas AU detected more lesions with high scores in the soft tissues (especially in the aorta), cranium, and ribs. The kappa value for reproducibility of AIS scores confirmed that the agreement between the two methods was good. The lowest kappa values (>0.6) were found for the facial skeleton, cerebellum, meninges, neck organs, lungs, kidneys, and gastrointestinal tract; in these areas, the kappa value indicated moderate agreement between CT and AU. For all other areas, there was substantial agreement between the two methods. The ISS scores obtained by CT and by AU showed no or only moderate variation in 85% of cases. Rupture of the aorta was often overlooked by CT, resulting in ISS scores that were too low. The most precise postmortem AIS and ISS scorings of traffic fatalities were obtained by a combination of AU and CT. If it is not possible to perform an AU, then CT may be used as an acceptable alternative for AIS scoring. We have identified one important obstacle for postmortem ISS scoring, namely that aorta ruptures are not easily detected by postmortem CT.
The impacts of speed cameras on road accidents: an application of propensity score matching methods.
Li, Haojie; Graham, Daniel J; Majumdar, Arnab
2013-11-01
This paper aims to evaluate the impacts of speed limit enforcement cameras on reducing road accidents in the UK by accounting for both confounding factors and the selection of proper reference groups. The propensity score matching (PSM) method is employed to do this. A naïve before-and-after approach and the empirical Bayes (EB) method are compared with the PSM method. A total of 771 treatment sites and 4787 potential reference sites were observed over a period of 9 years in England. Both the PSM and the EB methods show similar results: there are significant reductions in the number of accidents of all severities at speed camera sites. It is suggested that the propensity score can be used as the criterion for selecting the reference group in before-after control studies. Speed cameras were found to be most effective in reducing accidents up to 200 meters from camera sites, and no evidence of accident migration was found. Copyright © 2013 Elsevier Ltd. All rights reserved.
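The core PSM workflow, estimating propensity scores and matching each treated site to the untreated site with the nearest score, can be sketched as follows. The covariates and assignment mechanism are simulated, not the UK site data, and outcome analysis on the matched pairs is omitted.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(2)

# Hypothetical site-level confounders (e.g. baseline accident rate, traffic flow)
n = 1000
X = rng.normal(size=(n, 2))
# treatment (camera installation) depends on the confounders
treated = rng.random(n) < 1 / (1 + np.exp(-(X[:, 0] + 0.5 * X[:, 1])))

# Step 1: estimate propensity scores with logistic regression
ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# Step 2: match each treated site to the untreated site with the
# nearest propensity score (1-nearest-neighbor matching with replacement)
nn = NearestNeighbors(n_neighbors=1).fit(ps[~treated].reshape(-1, 1))
_, idx = nn.kneighbors(ps[treated].reshape(-1, 1))
matched_controls = np.flatnonzero(~treated)[idx.ravel()]

print(treated.sum(), "treated sites matched to", len(set(matched_controls)), "distinct controls")
```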
Vingerhoets, Johan; Nijs, Steven; Tambuyzer, Lotke; Hoogstoel, Annemie; Anderson, David; Picchio, Gaston
2012-01-01
The aims of this study were to compare various genotypic scoring systems commonly used to predict virological outcome to etravirine, and examine their concordance with etravirine phenotypic susceptibility. Six etravirine genotypic scoring systems were assessed: Tibotec 2010 (based on 20 mutations; TBT 20), Monogram, Stanford HIVdb, ANRS, Rega (based on 37, 30, 27 and 49 mutations, respectively) and virco(®)TYPE HIV-1 (predicted fold change based on genotype). Samples from treatment-experienced patients who participated in the DUET trials and with both genotypic and phenotypic data (n=403) were assessed using each scoring system. Results were retrospectively correlated with virological response in DUET. κ coefficients were calculated to estimate the degree of correlation between the different scoring systems. Correlation between the five scoring systems and the TBT 20 system was approximately 90%. Virological response by etravirine susceptibility was comparable regardless of which scoring system was utilized, with 70-74% of DUET patients determined as susceptible to etravirine by the different scoring systems achieving plasma viral load <50 HIV-1 RNA copies/ml. In samples classed as phenotypically susceptible to etravirine (fold change in 50% effective concentration ≤3), correlations with genotypic score were consistently high across scoring systems (≥70%). In general, the etravirine genotypic scoring systems produced similar results, and genotype-phenotype concordance was high. As such, phenotypic interpretations, and in their absence all genotypic scoring systems investigated, may be used to reliably predict the activity of etravirine.
Use of allele scores as instrumental variables for Mendelian randomization
Burgess, Stephen; Thompson, Simon G
2013-01-01
Background An allele score is a single variable summarizing multiple genetic variants associated with a risk factor. It is calculated as the total number of risk factor-increasing alleles for an individual (unweighted score), or the sum of weights for each allele corresponding to estimated genetic effect sizes (weighted score). An allele score can be used in a Mendelian randomization analysis to estimate the causal effect of the risk factor on an outcome. Methods Data were simulated to investigate the use of allele scores in Mendelian randomization where conventional instrumental variable techniques using multiple genetic variants demonstrate ‘weak instrument’ bias. The robustness of estimates using the allele score to misspecification (for example non-linearity, effect modification) and to violations of the instrumental variable assumptions was assessed. Results Causal estimates using a correctly specified allele score were unbiased with appropriate coverage levels. The estimates were generally robust to misspecification of the allele score, but not to instrumental variable violations, even if the majority of variants in the allele score were valid instruments. Using a weighted rather than an unweighted allele score increased power, but the increase was small when genetic variants had similar effect sizes. Naive use of the data under analysis to choose which variants to include in an allele score, or for deriving weights, resulted in substantial biases. Conclusions Allele scores enable valid causal estimates with large numbers of genetic variants. The stringency of criteria for genetic variants in Mendelian randomization should be maintained for all variants in an allele score. PMID:24062299
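Unweighted and weighted allele scores as defined above can be computed directly. The genotypes and weights below are hypothetical, and, as the paper stresses, weights should be derived from external data rather than from the data under analysis:

```python
import numpy as np

def allele_score(genotypes, weights=None):
    """Allele score across variants.
    genotypes: (n_individuals, n_variants) counts of risk-increasing
    alleles (0, 1 or 2). Returns the unweighted score (total count) if
    weights is None, otherwise the weighted sum using externally
    derived effect-size weights."""
    genotypes = np.asarray(genotypes)
    if weights is None:
        return genotypes.sum(axis=1)           # unweighted score
    return genotypes @ np.asarray(weights)     # weighted score

# Hypothetical genotypes for 3 individuals at 4 variants
G = [[0, 1, 2, 1],
     [1, 1, 0, 0],
     [2, 2, 1, 1]]
print(allele_score(G))                          # unweighted
print(allele_score(G, [0.3, 0.1, 0.2, 0.4]))    # weighted
```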
Aiding alternatives assessment with an uncertainty-tolerant hazard scoring method.
Faludi, Jeremy; Hoang, Tina; Gorman, Patrick; Mulvihill, Martin
2016-11-01
This research developed a single-score system to simplify and clarify decision-making in chemical alternatives assessment, accounting for uncertainty. Today, assessing alternatives to hazardous constituent chemicals is a difficult task: rather than comparing alternatives by a single definitive score, many independent toxicological variables must be considered at once, and data gaps are rampant. Thus, most hazard assessments are only comprehensible to toxicologists, but business leaders and politicians need simple scores to make decisions. In addition, they must balance hazard against other considerations, such as product functionality, and they must be aware of the high degrees of uncertainty in chemical hazard data. This research proposes a transparent, reproducible method to translate eighteen hazard endpoints into a simple numeric score with quantified uncertainty, alongside a similar product functionality score, to aid decisions between alternative products. The scoring method uses Clean Production Action's GreenScreen as a guide, but with a different method of score aggregation. It provides finer differentiation between scores than GreenScreen's four-point scale, and it displays uncertainty quantitatively in the final score. Displaying uncertainty also illustrates which alternatives are early in product development versus well-defined commercial products. This paper tested the proposed assessment method through a case study in the building industry, assessing alternatives to spray polyurethane foam insulation containing methylene diphenyl diisocyanate (MDI). The new hazard scoring method successfully identified trade-offs between different alternatives, showing finer resolution than GreenScreen Benchmarking. Sensitivity analysis showed that different weighting schemes in hazard scores had almost no effect on alternatives ranking, compared to uncertainty from data gaps. Copyright © 2016 Elsevier Ltd. All rights reserved.
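The abstract does not give the paper's exact aggregation formula. The following is only a generic sketch of the underlying idea, a weighted single score with quantified uncertainty, propagating data-gap ranges by Monte Carlo; the endpoint names, ranges, and weights are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical hazard endpoints scored 0 (benign) to 100 (severe);
# data gaps are represented as (low, high) uncertainty ranges.
endpoints = {
    "carcinogenicity": (20, 40),   # sparse data -> wide range
    "acute_toxicity": (10, 12),    # well characterized -> narrow range
    "aquatic_toxicity": (50, 90),  # data gap -> very wide range
}
weights = {"carcinogenicity": 0.5, "acute_toxicity": 0.3, "aquatic_toxicity": 0.2}

# Monte Carlo: sample each endpoint uniformly within its range and
# aggregate to a weighted single score with a quantified spread.
draws = np.array([
    sum(weights[k] * rng.uniform(lo, hi) for k, (lo, hi) in endpoints.items())
    for _ in range(5000)
])
print(f"hazard score = {draws.mean():.1f} +/- {draws.std():.1f}")
```

The spread of the final score is dominated by the widest endpoint ranges, which mirrors the paper's finding that data-gap uncertainty outweighs the choice of weighting scheme.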
Quality evaluation of Persian nutrition and diet therapy websites
Gholizadeh, Zahra; Papi, Ahmad; Ashrafi-rizi, Hasan; Shahrzadi, Leila; Hasanzadeh, Akbar
2017-01-01
INTRODUCTION: Nowadays, websites are among the most important information sources used by most people. With the spread of websites, especially those related to health issues, the number of their visitors also increases, and more than half of health-related searches concern nutritional information. Quality analysis of nutrition and diet therapy websites is therefore of utmost importance. This study aims to evaluate the quality of Persian nutrition and diet therapy websites. METHODS: The current work is an applied survey study. The statistical population consists of 51 Persian websites about nutrition and diet therapy, all of which were examined (census method). Data gathering was done using a checklist and a direct visit to each website. Descriptive and analytical statistics were used to analyse the gathered data with the help of SPSS 21 software. RESULTS: Findings showed that the content (66.7%), organization (82.4%), user-friendly interface (52.9%) and total quality (70.6%) of most websites had a mediocre score, while the design score for most of the websites (70.6%) was acceptable; organizational websites also had better design, organization and quality than private websites. The three websites with the highest general quality score were “Novel Diet Therapy,” “Behsite” and “Dr. BehdadiPour” (jointly), and “Dr. Kermani,” respectively. The highest-scoring factors were goal, relevance and credibility in the content dimension; color, text and sound, and pictures and videos in the design dimension; stability and indexing in the organization dimension; and confidentiality, credibility and personalization in the user-friendliness dimension. CONCLUSION: The results showed that the design score was higher than the other scores, while the general quality score of the websites was mediocre and not desirable; the websites also did not achieve suitable scores on every factor.
Since most people search the internet for nutritional and diet therapy information, the creators of these websites should endeavor to fix their shortcomings and improve the quality of their websites in several different areas. PMID:28616415
Rebar, Amanda L.; Ram, Nilam; Conroy, David E.
2014-01-01
Objective The Single-Category Implicit Association Test (SC-IAT) has been used as a method for assessing automatic evaluations of physical activity, but measurement artifact or consciously-held attitudes could be confounding the outcome scores of these measures. The objective of these two studies was to address these measurement concerns by testing the validity of a novel SC-IAT scoring technique. Design Study 1 was a cross-sectional study, and study 2 was a prospective study. Method In study 1, undergraduate students (N = 104) completed SC-IATs for physical activity, flowers, and sedentary behavior. In study 2, undergraduate students (N = 91) completed a SC-IAT for physical activity, self-reported affective and instrumental attitudes toward physical activity, physical activity intentions, and wore an accelerometer for two weeks. The EZ-diffusion model was used to decompose the SC-IAT into three process component scores including the information processing efficiency score. Results In study 1, a series of structural equation model comparisons revealed that the information processing score did not share variability across distinct SC-IATs, suggesting it does not represent systematic measurement artifact. In study 2, the information processing efficiency score was shown to be unrelated to self-reported affective and instrumental attitudes toward physical activity, and positively related to physical activity behavior, above and beyond the traditional D-score of the SC-IAT. Conclusions The information processing efficiency score is a valid measure of automatic evaluations of physical activity. PMID:25484621
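The EZ-diffusion decomposition mentioned above has a closed form (Wagenmakers et al., 2007), recovering three process components from summary statistics of a response-time task. The sketch below uses illustrative summary statistics rather than the study's SC-IAT data, and the study's information processing efficiency score is derived from components like these rather than being the raw drift rate shown here.

```python
import math

def ez_diffusion(prop_correct, rt_var, rt_mean, s=0.1):
    """EZ-diffusion model: recover drift rate v, boundary separation a,
    and non-decision time Ter from the proportion of correct responses,
    response-time variance, and mean response time."""
    L = math.log(prop_correct / (1 - prop_correct))
    x = L * (prop_correct**2 * L - prop_correct * L + prop_correct - 0.5) / rt_var
    v = math.copysign(1, prop_correct - 0.5) * s * x ** 0.25  # drift rate
    a = s**2 * L / v                                          # boundary separation
    y = -v * a / s**2
    mdt = (a / (2 * v)) * (1 - math.exp(y)) / (1 + math.exp(y))
    return v, a, rt_mean - mdt                                # Ter = MRT - MDT

# Illustrative summary statistics: 80% correct, RT variance 0.016 s^2, mean RT 0.5 s
v, a, ter = ez_diffusion(prop_correct=0.8, rt_var=0.016, rt_mean=0.5)
print(f"v = {v:.3f}, a = {a:.3f}, Ter = {ter:.3f}")
```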
Automatic coronary calcium scoring using noncontrast and contrast CT images
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Guanyu, E-mail: yang.list@seu.edu.cn; Chen, Yang; Shu, Huazhong
Purpose: Calcium scoring is widely used to assess the risk of coronary heart disease (CHD). Accurate coronary artery calcification detection in noncontrast CT images is a prerequisite step for coronary calcium scoring. Currently, calcified lesions in the coronary arteries are manually identified by radiologists in clinical practice. Thus, in this paper, a fully automatic calcium scoring method was developed to alleviate the workload of radiologists and cardiologists. Methods: The challenge of automatic coronary calcification detection is to discriminate the calcification in the coronary arteries from the calcification in the other tissues. Since the anatomy of the coronary arteries is difficult to observe in noncontrast CT images, the contrast CT image of the same patient is used to extract the regions of the aorta, heart, and coronary arteries. Then, a patient-specific region-of-interest (ROI) is generated in the noncontrast CT image according to the segmentation results in the contrast CT image. This patient-specific ROI focuses on the regions in the neighborhood of coronary arteries for calcification detection, which can eliminate the calcifications in the surrounding tissues. A support vector machine classifier is applied finally to refine the results by removing possible image noise. Furthermore, the calcified lesions in the noncontrast images belonging to the different main coronary arteries are identified automatically using the labeling results of the extracted coronary arteries. Results: Forty datasets from four different CT machine vendors, provided by the MICCAI 2014 Coronary Calcium Scoring (orCaScore) Challenge, were used to evaluate the algorithm. The sensitivity and positive predictive value for the volume of detected calcifications are 0.989 and 0.948.
Only one of the 40 patients was assigned to the wrong risk category, defined according to Agatston scores (0, 1–100, 101–300, >300), when compared with the ground truth. Conclusions: The calcified lesions in the noncontrast CT images can be detected automatically by using the segmentation results of the aorta, heart, and coronary arteries obtained in the contrast CT images, with very high accuracy.
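The final risk-category assignment against the Agatston score bands quoted above is straightforward to express; the category labels below are illustrative, only the band boundaries come from the abstract.

```python
def agatston_risk_category(score):
    """Map an Agatston calcium score to the four CHD risk bands
    used in the orCaScore challenge evaluation (0, 1-100, 101-300, >300)."""
    if score == 0:
        return "I (0)"
    if score <= 100:
        return "II (1-100)"
    if score <= 300:
        return "III (101-300)"
    return "IV (>300)"

print([agatston_risk_category(s) for s in (0, 57, 250, 912)])
```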
Rational and Experiential Decision-Making Preferences of Third-Year Student Pharmacists
McLaughlin, Jacqueline E.; Cox, Wendy C.; Williams, Charlene R.
2014-01-01
Objective. To examine the rational (systematic and rule-based) and experiential (fast and intuitive) decision-making preferences of student pharmacists, and to compare these preferences to the preferences of other health professionals and student populations. Methods. The Rational-Experiential Inventory (REI-40), a validated psychometric tool, was administered electronically to 114 third-year (P3) student pharmacists. Student demographics and preadmission data were collected. The REI-40 results were compared with student demographics and admissions data to identify possible correlations between these factors. Results. Mean REI-40 rational scores were higher than experiential scores. Rational scores for younger students were significantly higher than students aged 30 years and older (p<0.05). No significant differences were found based on gender, race, or the presence of a prior degree. All correlations between REI-40 scores and incoming grade point average (GPA) and Pharmacy College Admission Test (PCAT) scores were weak. Conclusion. Student pharmacists favored rational decision making over experiential decision making, which was similar to results of studies done of other health professions. PMID:25147392
Prediction of distal residue participation in enzyme catalysis
Brodkin, Heather R; DeLateur, Nicholas A; Somarowthu, Srinivas; Mills, Caitlyn L; Novak, Walter R; Beuning, Penny J; Ringe, Dagmar; Ondrechen, Mary Jo
2015-01-01
A scoring method for the prediction of catalytically important residues in enzyme structures is presented and used to examine the participation of distal residues in enzyme catalysis. Scores are based on the Partial Order Optimum Likelihood (POOL) machine learning method, using computed electrostatic properties, surface geometric features, and information obtained from the phylogenetic tree as input features. Predictions of distal residue participation in catalysis are compared with experimental kinetics data from the literature on variants of the featured enzymes; some additional kinetics measurements are reported for variants of Pseudomonas putida nitrile hydratase (ppNH) and for Escherichia coli alkaline phosphatase (AP). The multilayer active sites of P. putida nitrile hydratase and of human phosphoglucose isomerase are predicted by the POOL log ZP scores, as is the single-layer active site of P. putida ketosteroid isomerase. The log ZP score cutoff utilized here results in over-prediction of distal residue involvement in E. coli alkaline phosphatase. While fewer experimental data points are available for P. putida mandelate racemase and for human carbonic anhydrase II, the POOL log ZP scores properly predict the previously reported participation of distal residues. PMID:25627867
Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson
2010-01-01
Summary Objective Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this Review was to assess machine learning alternatives to logistic regression which may accomplish the same goals but with fewer assumptions or greater accuracy. Study Design and Setting We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. Results We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (CART), and meta-classifiers (in particular, boosting). Conclusion While the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and to a lesser extent decision trees (particularly CART) appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. PMID:20630332
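As a concrete baseline for the comparison above, propensity scores are typically estimated with logistic regression and then used, for example, in inverse-probability-of-treatment weighting (IPTW). The following is a minimal standard-library sketch of that pipeline, not the review's own code; the machine learning alternatives it discusses would simply replace `fit_logistic` with a boosted or tree-based estimator of P(T=1|X):

```python
import math

def fit_logistic(X, t, lr=0.5, epochs=3000):
    """Fit P(T=1|X) by batch gradient descent -- the standard propensity model."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, ti in zip(X, t):
            p = 1.0 / (1.0 + math.exp(-(sum(wj * xj for wj, xj in zip(w, xi)) + b)))
            for j in range(d):
                gw[j] += (p - ti) * xi[j]
            gb += p - ti
        w = [wj - lr * gwj / n for wj, gwj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def propensity_scores(X, w, b):
    """Predicted treatment probabilities for each unit."""
    return [1.0 / (1.0 + math.exp(-(sum(wj * xj for wj, xj in zip(w, xi)) + b)))
            for xi in X]

def iptw_ate(y, t, ps):
    """Inverse-probability-weighted estimate of the average treatment effect."""
    tw = sum(ti / pi for ti, pi in zip(t, ps))
    cw = sum((1 - ti) / (1 - pi) for ti, pi in zip(t, ps))
    treated = sum(yi * ti / pi for yi, ti, pi in zip(y, t, ps)) / tw
    control = sum(yi * (1 - ti) / (1 - pi) for yi, ti, pi in zip(y, t, ps)) / cw
    return treated - control
```

The review's point is that the logistic step embeds assumptions (linearity in the log-odds, no unmodeled interactions) that a meta-classifier such as boosting can relax.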
Nutritional risk assessment in critically ill cancer patients: systematic review
Fruchtenicht, Ana Valéria Gonçalves; Poziomyck, Aline Kirjner; Kabke, Geórgia Brum; Loss, Sérgio Henrique; Antoniazzi, Jorge Luiz; Steemburgo, Thais; Moreira, Luis Fernando
2015-01-01
Objective To systematically review the main methods for nutritional risk assessment used in critically ill cancer patients and present the methods that better assess risks and predict relevant clinical outcomes in this group of patients, as well as to discuss the pros and cons of these methods according to the current literature. Methods The study consisted of a systematic review based on analysis of manuscripts retrieved from the PubMed, LILACS and SciELO databases by searching for the key words “nutritional risk assessment”, “critically ill” and “cancer”. Results Only 6 (17.7%) of 34 initially retrieved papers met the inclusion criteria and were selected for the review. The main outcomes of these studies were that resting energy expenditure was associated with undernourishment and overfeeding. The high Patient-Generated Subjective Global Assessment score was significantly associated with low food intake, weight loss and malnutrition. In terms of biochemical markers, higher levels of creatinine, albumin and urea were significantly associated with lower mortality. The worst survival was found for patients with worse Eastern Cooperative Oncologic Group - performance status, high Glasgow Prognostic Score, low albumin, high Patient-Generated Subjective Global Assessment score and high alkaline phosphatase levels. Geriatric Nutritional Risk Index values < 87 were significantly associated with mortality. A high Prognostic Inflammatory and Nutritional Index score was associated with abnormal nutritional status in critically ill cancer patients. Among the reviewed studies that examined weight and body mass index alone, no significant clinical outcome was found. Conclusion None of the methods reviewed helped to define risk among these patients. 
Therefore, assessment by a combination of weight loss and serum measurements, preferably in combination with other methods using scores such as Eastern Cooperative Oncologic Group - performance status, Glasgow Prognostic Score and Patient-Generated Subjective Global Assessment, is suggested given that their use is simple, feasible and useful in such cases. PMID:26270855
Merz, Erin L; Kwakkenbos, Linda; Carrier, Marie-Eve; Gholizadeh, Shadi; Mills, Sarah D; Fox, Rina S; Jewett, Lisa R; Williamson, Heidi; Harcourt, Diana; Assassi, Shervin; Furst, Daniel E; Gottesman, Karen; Mayes, Maureen D; Moss, Tim P; Thombs, Brett D; Malcarne, Vanessa L
2018-01-01
Objective Valid measures of appearance concern are needed in systemic sclerosis (SSc), a rare, disfiguring autoimmune disease. The Derriford Appearance Scale-24 (DAS-24) assesses appearance-related distress related to visible differences. There is uncertainty regarding its factor structure, possibly due to its scoring method. Design Cross-sectional survey. Setting Participants with SSc were recruited from 27 centres in Canada, the USA and the UK. Participants who self-identified as having visible differences were recruited from community and clinical settings in the UK. Participants Two samples were analysed (n=950 participants with SSc; n=1265 participants with visible differences). Primary and secondary outcome measures The DAS-24 factor structure was evaluated using two scoring methods. Convergent validity was evaluated with measures of social interaction anxiety, depression, fear of negative evaluation, social discomfort and dissatisfaction with appearance. Results When items marked by respondents as ‘not applicable’ were scored as 0, per standard DAS-24 scoring, a one-factor model fit poorly; when treated as missing data, the one-factor model fit well. Convergent validity analyses revealed strong correlations that were similar across scoring methods. Conclusions Treating ‘not applicable’ responses as missing improved the measurement model, but did not substantively influence practical inferences that can be drawn from DAS-24 scores. Indications of item redundancy and poorly performing items suggest that the DAS-24 could be improved and potentially shortened. PMID:29511009
Shelf, Louay
2016-08-18
This research used diagnosis-related groups (DRGs) and the case mix index (CMI) to adjust medical waste production through the calculation of DRG and CMI scores. These scores were used to assess the performance of teaching hospitals in Damascus. The linear correlations between these scores, the annual amount of waste, and the DRG values were studied, and the differences between daily waste generation before and after the adjustment were determined. The highest DRG and CMI scores were obtained by the pediatric hospital and the Al Assad hospital, respectively; among the teaching hospitals in Damascus, Al Assad achieved the highest performance. Based on the results, the accuracy and homogeneity of medical waste generation rates were improved, which in turn leads to continuous improvement in the management of medical waste.
[Arthur Vick Prize 2017 of the German Society of Orthopaedic Rheumatology].
Bause, L; Niemeier, A; Krenn, V
2018-03-01
The German Society of Orthopaedic Rheumatology (DGORh) honored Prof. Dr. med. Veit Krenn (MVZ-ZHZMD-Trier) with the Arthur Vick Prize 2017. With this award, scientific results with high impact on the diagnosis, therapy and pathogenetic understanding of rheumatic diseases are honored. In cooperation with pathologists and colleagues from various clinical disciplines, Prof. Dr. med. Veit Krenn developed several histopathologic scoring systems which contribute to the diagnosis and pathogenetic understanding of degenerative and rheumatic diseases. These scores include the synovitis score, the meniscal degeneration score, the classification of periprosthetic tissues (SLIM classification), the arthrofibrosis score, the particle score and the CD15 focus score. Of highest relevance for orthopedic rheumatology is the synovitis score, a semiquantitative score for grading the immunological and inflammatory changes of synovitis. Based on this score, it is possible to divide results into low-grade and high-grade synovitis: a synovitis score of 1-4 is called low-grade synovitis and occurs, for example, in association with osteoarthritis (OA), trauma, meniscal lesions and hemochromatosis. A synovitis score of 5-9 is called high-grade synovitis and occurs, for example, in rheumatoid arthritis, psoriatic arthritis, Lyme arthritis, postinfectious and reactive arthritis, and peripheral arthritis in Bechterew's disease (sensitivity 61.7%, specificity 96.1%). The first publication (2002) and an associated subsequent publication (2006) of the synovitis score have led to national and international acceptance of this score as the standard for histopathological assessment of synovitis. The synovitis score provides a diagnostic, standardized and reproducible histopathological evaluation method for joint diseases, particularly when applied in the context of the joint pathology algorithm.
A Comparative Study of Standard-Setting Methods.
ERIC Educational Resources Information Center
Livingston, Samuel A.; Zieky, Michael J.
1989-01-01
The borderline group standard-setting method (BGSM), Nedelsky method (NM), and Angoff method (AM) were compared, using reading scores for 1,948 and mathematics scores for 2,191 sixth through ninth graders. The NM and AM were inconsistent with the BGSM. Passing scores were higher where students were more able. (SLD)
Negative Marking and the Student Physician–-A Descriptive Study of Nigerian Medical Schools
Ndu, Ikenna Kingsley; Ekwochi, Uchenna; Di Osuorah, Chidiebere; Asinobi, Isaac Nwabueze; Nwaneri, Michael Osita; Uwaezuoke, Samuel Nkachukwu; Amadi, Ogechukwu Franscesca; Okeke, Ifeyinwa Bernadette; Chinawa, Josephat Maduabuchi; Orjioke, Casmir James Ginikanwa
2016-01-01
Background There is considerable debate about the two most commonly used scoring methods, namely, formula scoring (popularly referred to as the negative marking method in our environment) and number right scoring. Although the negative marking system attempts to discourage students from guessing in order to increase test reliability and validity, there is the view that it is an excessive and unfair penalty that also increases anxiety. Feedback from students is part of the education process; thus, this study assessed the perception of medical students about the negative marking method for multiple choice question (MCQ) examination formats, and also the effect of gender and risk-taking behavior on scores obtained with this assessment method. Methods This was a prospective multicenter survey carried out among fifth-year medical students of Enugu State University and the University of Nigeria. A structured questionnaire was administered to 175 medical students from the two schools, while a class test was administered to the medical students from Enugu State University. Frequencies, percentages, and chi-square tests were used to analyze categorical variables; analysis of variance was used to analyze continuous variables. Results Inquiry into assessment format revealed that most of the respondents preferred MCQs (65.9%). One hundred and thirty students (74.3%) had an unfavorable perception of negative marking. Thirty-nine students (22.3%) agreed that negative marking reduces the tendency to guess and increases the validity of the MCQ examination format in testing the knowledge content of a subject, compared to 108 (61.3%) who disagreed with this assertion (χ2 = 23.0, df = 1, P < 0.001). The median score of the students who were not graded with negative marking was significantly higher than the score of the students graded with negative marking (P = 0.001).
There was no statistically significant difference in the risk-taking behavior between male and female students in their MCQ answering patterns with negative marking method (P = 0.618). Conclusions In the assessment of students, it is more desirable to adopt fair penalties for discouraging guessing rather than excessive penalties for incorrect answers, which could intimidate students in negative marking schemes. There is no consensus on the penalty for an incorrect answer. Thus, there is a need for continued research into an effective and objective assessment tool that will ensure that the students’ final score in a test truly represents their level of knowledge. PMID:29349304
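The two scoring rules debated above differ only in the penalty for a wrong answer. Under the classic formula-scoring rule, a wrong answer on a k-option item costs 1/(k-1) of a mark, which makes the expected gain from blind guessing zero. A minimal sketch (the exact penalty used by the surveyed schools is not stated in the abstract, so the 1/(k-1) convention here is the textbook one):

```python
def number_right_score(num_right: int, num_wrong: int) -> float:
    """Number-right scoring: wrong answers and omissions simply earn nothing."""
    return float(num_right)

def formula_score(num_right: int, num_wrong: int, options_per_item: int = 5) -> float:
    """Formula ('negative marking') scoring: each wrong answer costs 1/(k-1)
    of a mark, so a blind guesser's expected score is zero."""
    return num_right - num_wrong / (options_per_item - 1)
```

For example, 40 right and 20 wrong on 5-option items yields 40 under number-right scoring but 35 under formula scoring, which is why the negatively marked cohort's median score was lower.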
Analysis of Covariance: Is It the Appropriate Model to Study Change?
ERIC Educational Resources Information Center
Marston, Paul T.; Borich, Gary D.
The four main approaches to measuring treatment effects in schools (raw gain, residual gain, covariance, and true scores) were compared. A simulation study showed that true score analysis produced a large number of Type I errors. When corrected for this error, the method showed the least power of the four. This outcome was clearly the result of the…
The relationship between physical fitness and academic achievement among adolescents in South Korea.
Han, Gun-Soo
2018-04-01
[Purpose] The purpose of this study was to identify the relationship between physical fitness level and academic achievement in middle school students. [Subjects and Methods] A total of 236 students aged 13-15 from three middle schools in D city, South Korea, were selected using a random sampling method. Academic achievement was measured by students' 2014 fall-semester final exam scores and the level of physical fitness was determined according to the PAPS (Physical Activity Promotion System) score administrated by the Korean Ministry of Education. A Pearson correlation test with SPSS 20.0 was employed. [Results] The Pearson correlation test revealed a significant correlation between physical fitness and academic achievement. Specifically, students with higher levels of physical fitness tend to have higher academic performance. In addition, final exam scores of core subjects (e.g., English, mathematics, and science) were significantly related to the PAPS score. [Conclusion] Results of this study can be used to develop more effective physical education curricula. In addition, the data can also be applied to recreation and sport programs for other populations (e.g., children and adult) as well as existing national physical fitness data in various countries.
The Effect of Schooling and Ability on Achievement Test Scores. NBER Working Paper Series.
ERIC Educational Resources Information Center
Hansen, Karsten; Heckman, James J.; Mullen, Kathleen J.
This study developed two methods for estimating the effect of schooling on achievement test scores that control for the endogeneity of schooling by postulating that both schooling and test scores are generated by a common unobserved latent ability. The methods were applied to data on schooling and test scores. Estimates from the two methods are in…
ERIC Educational Resources Information Center
Woodruff, David; Traynor, Anne; Cui, Zhongmin; Fang, Yu
2013-01-01
Professional standards for educational testing recommend that both the overall standard error of measurement and the conditional standard error of measurement (CSEM) be computed on the score scale used to report scores to examinees. Several methods have been developed to compute scale score CSEMs. This paper compares three methods, based on…
Koohestani, Hamid Reza; Baghcheghi, Nayereh
2016-01-01
Background: Team-based learning is a structured type of cooperative learning that is becoming increasingly popular in nursing education. This study compares nursing students' perceptions of the psychosocial climate of the classroom between a conventional lecture group and a team-based learning group. Methods: In a quasi-experimental study with a pretest-posttest design, 38 second-year nursing students participated. Half of the 16 sessions of a cardiovascular disease nursing course were taught by lecture and the other half with team-based learning (TBL). The modified College and University Classroom Environment Inventory (CUCEI) was used to measure the perception of the classroom environment, completed after the final lecture and TBL sessions. Results: Results revealed a significant difference in the mean psychosocial climate scores for the TBL method (mean (SD): 179.8 (8.27)) versus the lecture method (mean (SD): 154.2 (13.44)). The results also showed significant differences between the two groups in the innovation (p<0.001), student cohesiveness (p=0.01), cooperation (p<0.001) and equity (p=0.03) subscale scores. Conclusion: This study provides evidence that team-based learning has a positive effect on nursing students' perceptions of the psychosocial climate of the classroom.
Mobile Visual Search Based on Histogram Matching and Zone Weight Learning
NASA Astrophysics Data System (ADS)
Zhu, Chuang; Tao, Li; Yang, Fan; Lu, Tao; Jia, Huizhu; Xie, Xiaodong
2018-01-01
In this paper, we propose a novel image retrieval algorithm for mobile visual search. First, a short visual codebook is generated from the descriptor database to represent the statistical information of the dataset. Then, an accurate local descriptor similarity score is computed by merging tf-idf weighted histogram matching with the weighting strategy in compact descriptors for visual search (CDVS). Finally, the global descriptor matching score and the local descriptor similarity score are summed to rerank the retrieval results according to the learned zone weights. The results show that the proposed approach outperforms the state-of-the-art image retrieval method in CDVS.
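The final fusion-and-rerank step can be sketched as follows. This is a deliberately simplified illustration: the paper computes the local score from tf-idf weighted histogram matching with CDVS weights and learns per-zone weights, which are collapsed here into two fixed scalar weights:

```python
def fuse_and_rerank(candidates, w_global=0.5, w_local=0.5):
    """Rerank retrieval candidates by a weighted sum of their global-descriptor
    matching score and local-descriptor similarity score.

    candidates: iterable of (image_id, global_score, local_score) tuples.
    Returns (image_id, fused_score) pairs, best match first."""
    fused = [(img_id, w_global * g + w_local * l) for img_id, g, l in candidates]
    return sorted(fused, key=lambda pair: pair[1], reverse=True)
```

With unequal weights, a candidate that is mediocre on the global descriptor but strong on local matching can still rise to the top of the ranked list, which is the point of the fusion.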
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yan, Shiju; Qian, Wei; Guan, Yubao
2016-06-15
Purpose: This study aims to investigate the potential to improve lung cancer recurrence risk prediction performance for stage I NSCLC patients by integrating oversampling, feature selection, and score fusion techniques and develop an optimal prediction model. Methods: A dataset involving 94 early stage lung cancer patients was retrospectively assembled, which includes CT images, nine clinical and biological (CB) markers, and outcome of 3-yr disease-free survival (DFS) after surgery. Among the 94 patients, 74 remained DFS and 20 had cancer recurrence. Applying a computer-aided detection scheme, tumors were segmented from the CT images and 35 quantitative image (QI) features were initially computed. Two normalized Gaussian radial basis function network (RBFN) based classifiers were built based on QI features and CB markers separately. To improve prediction performance, the authors applied a synthetic minority oversampling technique (SMOTE) and a BestFirst based feature selection method to optimize the classifiers and also tested fusion methods to combine QI and CB based prediction results. Results: Using a leave-one-case-out cross-validation method, the computed areas under a receiver operating characteristic curve (AUCs) were 0.716 ± 0.071 and 0.642 ± 0.061, when using the QI and CB based classifiers, respectively. By fusion of the scores generated by the two classifiers, the AUC significantly increased to 0.859 ± 0.052 (p < 0.05) with an overall prediction accuracy of 89.4%. Conclusions: This study demonstrated the feasibility of improving prediction performance by integrating SMOTE, feature selection, and score fusion techniques. Combining QI features and CB markers and performing SMOTE prior to feature selection in classifier training enabled the RBFN based classifiers to yield improved prediction accuracy.
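The SMOTE step used above to counter the 74/20 class imbalance works by interpolating each sampled minority-class point toward one of its nearest minority-class neighbours. A minimal standard-library sketch of the idea (the study itself applied SMOTE inside an RBFN training pipeline; this is not their implementation):

```python
import random

def smote(minority, n_synthetic, k=3, seed=0):
    """Generate n_synthetic minority-class points: repeatedly pick a real
    minority point, pick one of its k nearest minority neighbours, and
    interpolate a random fraction of the way between the two."""
    rng = random.Random(seed)

    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    out = []
    for _ in range(n_synthetic):
        x = minority[rng.randrange(len(minority))]
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: dist2(x, p))[:k]
        nb = neighbours[rng.randrange(len(neighbours))]
        gap = rng.random()  # fraction of the way from x toward its neighbour
        out.append([xi + gap * (ni - xi) for xi, ni in zip(x, nb)])
    return out
```

Because each synthetic point is a convex combination of two real minority points, the oversampled class stays inside the region the real data occupies rather than duplicating existing samples.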
Rah, Jeong-Eun; Manger, Ryan P; Yock, Adam D; Kim, Gwe-Ya
2016-12-01
To examine the abilities of a traditional failure mode and effects analysis (FMEA) and a modified healthcare FMEA (m-HFMEA) scoring method by comparing the degree of congruence in identifying high-risk failures. The authors applied the two prospective quality management methods to surface image guided, linac-based radiosurgery (SIG-RS). For the traditional FMEA, decisions on how to improve an operation were based on the risk priority number (RPN). The RPN is the product of three indices: occurrence, severity, and detectability. The m-HFMEA approach utilized two indices, severity and frequency. A risk inventory matrix was divided into four categories: very low, low, high, and very high. For high-risk events, an additional evaluation was performed: based upon the criticality of the process, it was decided whether additional safety measures were needed and what they should comprise. The two methods were independently compared to determine if the results and rated risks matched. The authors' results showed an agreement of 85% between the FMEA and m-HFMEA approaches for the top 20 risks of SIG-RS-specific failure modes. The main differences between the two approaches were the distribution of the values and the observation that failure modes (52, 54, 154) with high m-HFMEA scores do not necessarily have high FMEA-RPN scores. In the m-HFMEA analysis, once the risk score is determined, the failure mode should be investigated more thoroughly on the basis of the established HFMEA Decision Tree™. m-HFMEA is inductive because it requires the identification of consequences from causes, and semi-quantitative because it allows the prioritization of high risks and mitigation measures. It is therefore a useful tool for prospective risk analysis in radiotherapy.
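The two scoring schemes compared above can be made concrete. Traditional FMEA multiplies three indices into a risk priority number, while m-HFMEA maps a severity-by-frequency hazard score onto a four-level risk matrix. In this sketch the matrix bin boundaries are illustrative assumptions, not the paper's values:

```python
def fmea_rpn(occurrence: int, severity: int, detectability: int) -> int:
    """Traditional FMEA risk priority number: the product of three indices."""
    return occurrence * severity * detectability

def hfmea_risk_category(severity: int, frequency: int) -> str:
    """m-HFMEA-style risk inventory matrix over severity x frequency.
    The bin boundaries below are illustrative assumptions."""
    hazard = severity * frequency
    if hazard >= 12:
        return "very high"
    if hazard >= 8:
        return "high"
    if hazard >= 4:
        return "low"
    return "very low"
```

The paper's observation that some failure modes score high under m-HFMEA but not under FMEA follows directly from the structure: a severe, frequent failure that is easy to detect gets a small detectability index and hence a modest RPN, yet lands in a high severity-frequency cell.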
NASA Astrophysics Data System (ADS)
Baragona, Michelle
The purpose of this study was to investigate the interactions between multiple intelligence strengths and alternative teaching methods on student academic achievement, conceptual understanding and attitudes. The design was a quasi-experimental study, in which students enrolled in Principles of Anatomy and Physiology, a developmental biology course, received lecture only, problem-based learning with lecture, or peer teaching with lecture. These students completed the Multiple Intelligence Inventory to determine their intelligence strengths, the Students' Motivation Toward Science Learning questionnaire to determine student attitudes towards learning in science, multiple choice tests to determine academic achievement, and open-ended questions to determine conceptual understanding. Effects of intelligence types and teaching methods on academic achievement and conceptual understanding were determined statistically by repeated measures ANOVAs. No significance occurred in academic achievement scores due to lab group or due to teaching method used; however, significant interactions between group and teaching method did occur in students with strengths in logical-mathematical, interpersonal, kinesthetic, and intrapersonal intelligences. Post-hoc analysis using Tukey HSD tests revealed students with strengths in logical-mathematical intelligence and enrolled in Group Three scored significantly higher when taught by problem-based learning (PBL) as compared to peer teaching (PT). No significance occurred in conceptual understanding scores due to lab group or due to teaching method used; however, significant interactions between group and teaching method did occur in students with strengths in musical, kinesthetic, intrapersonal, and spatial intelligences. Post-hoc analysis using Tukey HSD tests revealed students with strengths in logical-mathematical intelligence and enrolled in Group Three scored significantly higher when taught by lecture as compared to PBL. 
Students with strengths in intrapersonal intelligence and enrolled in Group One scored significantly lower when taught by lecture as compared to PBL. Results of a repeated measures ANOVA for student attitudes showed significant increases in positive student attitudes toward science learning for all three types of teaching method between pretest and posttest; but there were no significant differences in posttest attitude scores by type of teaching method.
Benson, Nicholas F; Kranzler, John H; Floyd, Randy G
2016-10-01
Prior research examining relations between cognitive ability and academic achievement has been based on different theoretical models, has employed both latent variables and observed variables, and has used a variety of analytic methods. Not surprisingly, results have been inconsistent across studies. The aims of this study were to (a) examine how relations between psychometric g, Cattell-Horn-Carroll (CHC) broad abilities, and academic achievement differ across higher-order and bifactor models; (b) examine how well various types of observed scores correspond with latent variables; and (c) compare two types of observed scores (i.e., refined and non-refined factor scores) as predictors of academic achievement. Results suggest that cognitive-achievement relations vary across theoretical models and that both types of factor scores tend to correspond well with the models on which they are based. However, orthogonal refined factor scores (derived from a bifactor model) have the advantage of controlling for multicollinearity arising from the measurement of psychometric g across all measures of cognitive abilities. Results indicate that the refined factor scores provide more precise representations of their targeted constructs than non-refined factor scores and maintain close correspondence with the cognitive-achievement relations observed for latent variables. Thus, we argue that orthogonal refined factor scores provide more accurate representations of the relations between CHC broad abilities and achievement outcomes than non-refined scores do. Further, the use of refined factor scores addresses calls for the application of scores based on latent variable models. Copyright © 2016 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
Bonny, S P F; Hocquette, J-F; Pethick, D W; Legrand, I; Wierzbicki, J; Allen, P; Farmer, L J; Polkinghorne, R J; Gardner, G E
2017-08-01
Quantifying consumer responses to beef across a broad range of demographics, nationalities and cooking methods is vitally important for any system evaluating beef eating quality. On the basis of previous work, it was expected that consumer scores would be highly accurate in determining quality grades for beef, thereby providing evidence that such a technique could be used to form the basis of an eating quality grading system for beef. Following the Australian MSA (Meat Standards Australia) testing protocols, over 19 000 consumers from Northern Ireland, Poland, Ireland, France and Australia tasted cooked beef samples, then allocated them to a quality grade: unsatisfactory, good-every-day, better-than-every-day and premium. The consumers also scored beef samples for tenderness, juiciness, flavour-liking and overall-liking. The beef was sourced from all countries involved in the study and cooked by four different cooking methods and to three different degrees of doneness, with each experimental group in the study consisting of a single cooking doneness within a cooking method for each country. For each experimental group, and for the data set as a whole, a linear discriminant function was calculated, using the four sensory scores to predict the quality grade. This process was repeated using two conglomerate scores derived by weighting and combining the consumer sensory scores for tenderness, juiciness, flavour-liking and overall-liking: the original meat quality 4 score (oMQ4) (0.4, 0.1, 0.2, 0.3) and the current meat quality 4 score (cMQ4) (0.3, 0.1, 0.3, 0.3). From the results of these analyses, the optimal weightings of the sensory scores to generate an 'ideal meat quality 4 score (MQ4)' for each country were calculated, and the MQ4 values that reflected the boundaries between the four quality grades were determined.
The oMQ4 weightings were far more accurate in categorising European meat samples than the cMQ4 weightings, highlighting that tenderness is more important than flavour to the consumer when determining quality. The accuracy of the discriminant analysis to predict the consumer scored quality grades was similar across all consumer groups, 68%, and similar to previously reported values. These results demonstrate that this technique, as used in the MSA system, could be used to predict consumer assessment of beef eating quality and therefore to underpin a commercial eating quality guarantee for all European consumers.
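The two composite scores compared above are just different weightings of the same four consumer sensory scores. A minimal sketch using the weights given in the abstract (oMQ4 = 0.4/0.1/0.2/0.3 and cMQ4 = 0.3/0.1/0.3/0.3 for tenderness, juiciness, flavour-liking and overall-liking; the grade-boundary MQ4 values themselves are country-specific outputs of the study and are not reproduced here):

```python
def mq4(tenderness, juiciness, flavour, overall, weights=(0.4, 0.1, 0.2, 0.3)):
    """Weighted MQ4 composite of the four consumer sensory scores.
    Default weights are the oMQ4 weights; pass (0.3, 0.1, 0.3, 0.3) for cMQ4."""
    wt, wj, wf, wo = weights
    return wt * tenderness + wj * juiciness + wf * flavour + wo * overall
```

Because oMQ4 shifts weight from flavour-liking (0.2 vs 0.3) to tenderness (0.4 vs 0.3), a tender but blandly flavoured sample scores higher under oMQ4 than under cMQ4, which is the mechanism behind the finding that tenderness matters more to consumers.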
Kayataş, Semra; Özkaya, Enis; Api, Murat; Çıkman, Seyhan; Gürbüz, Ayşen; Eser, Ahmet
2017-01-01
Objective: The aim of the present study was to compare female sexual function between women who underwent conventional abdominal versus laparoscopic hysterectomy. Materials and Methods: Seventy-seven women who were scheduled to undergo hysterectomy without oophorectomy for benign gynecologic conditions were included in the study. The women were assigned to laparoscopic or open abdominal hysterectomy according to the surgeon's preference. Women with endometriosis and symptomatic prolapse were excluded. Female sexual function scores were obtained from each participant before and six months after the operation using validated questionnaires. Results: Pre- and postoperative scores on three different questionnaires were comparable in the group that underwent laparoscopic hysterectomy (p>0.05), and likewise comparable in the group that underwent laparotomic hysterectomy (p>0.05). Comparison of pre- and postoperative values between the two groups also revealed similar results with regard to all three scores (p>0.05). Conclusion: Our data showed comparable pre- and postoperative scores for the two different hysterectomy techniques, and the two groups had similar pre- and postoperative score values. PMID:28913149
Image-Guided Rendering with an Evolutionary Algorithm Based on Cloud Model
2018-01-01
The process of creating nonphotorealistic rendering images and animations can be enjoyable if a useful method is involved. We use an evolutionary algorithm to generate painterly styles of images. Given an input image as the reference target, a cloud-model-based evolutionary algorithm rerenders the target image with nonphotorealistic effects. The resulting animations have an interesting characteristic in which the target slowly emerges from a set of strokes. A number of experiments are performed, as well as visual comparisons, quantitative comparisons, and user studies. The average normalized scores for the feature-similarity measures (standard pixel-wise peak signal-to-noise ratio, mean structural similarity, feature similarity, and a gradient-similarity-based metric) are 0.486, 0.628, 0.579, and 0.640, respectively. The average normalized scores for the aesthetic measures (Benford's law, fractal dimension, global contrast factor, and Shannon's entropy) are 0.630, 0.397, 0.418, and 0.708, respectively. Compared with a similar method, the average score of the proposed method, except for peak signal-to-noise ratio, is higher by approximately 10%. The results suggest that the proposed method can generate appealing images and animations with different styles by choosing different strokes, and it would inspire graphic designers who may be interested in computer-based evolutionary art. PMID:29805440
Patel, Niyant V.; Wagner, Douglas S.
2015-01-01
Background: Venous thromboembolism (VTE) risk models including the Davison risk score and the 2005 Caprini risk assessment model have been validated in plastic surgery patients. However, their utility and predictive value in breast reconstruction has not been well described. We sought to determine the utility of current VTE risk models in this population and the VTE rate observed in various methods of breast reconstruction. Methods: A retrospective review of breast reconstructions by a single surgeon was performed. One hundred consecutive transverse rectus abdominis myocutaneous (TRAM) patients, 100 consecutive implant patients, and 100 consecutive latissimus dorsi patients were identified over a 10-year period. Patient demographics and presence of symptomatic VTE were collected. 2005 Caprini risk scores and Davison risk scores were calculated for each patient. Results: The TRAM reconstruction group was found to have a higher VTE rate (6%) than the implant (0%) and latissimus (0%) reconstruction groups (P < 0.01). Mean Davison risk scores and 2005 Caprini scores were similar across all reconstruction groups (P > 0.1). The vast majority of patients were stratified as high risk (87.3%) by the VTE risk models. However, only TRAM reconstruction patients demonstrated significant VTE risk. Conclusions: TRAM reconstruction appears to have a significantly higher risk of VTE than both implant and latissimus reconstruction. Current risk models do not effectively stratify breast reconstruction patients at risk for VTE. The method of breast reconstruction appears to have a significant role in patients’ VTE risk. PMID:26090287
Cosín Aguilar, J; Hernándiz Martínez, A; Rodríguez Padial, L; Zamorano Gómez, J L; Arístegui Urrestarazu, R; Armada Peláez, B; Aguilar Llopis, A; Masramon Morell, X
2006-04-01
Calculation of cardiovascular risk in populations allows intervention programs to be developed and assessed and health resources to be adapted. While the Framingham system has been used in the past, a group of European researchers has proposed a different method, the SCORE project. The purpose of this paper is to compare the value of both methods for assessing cardiovascular risk. In 6,775 evaluable hypertensive patients distributed over the 17 Spanish autonomous communities (ACs), the 10-year risk of experiencing a coronary event (CR) was calculated using the Framingham equation, while the risk of coronary death (RCD) and vascular death (RVD) was calculated using the SCORE project system, both at baseline and after one year of blood pressure control with amlodipine at the required dose. The two methods were compared on their capacity to detect risk differences between populations with known different risks, and within the same population as a result of blood pressure control. Both the SCORE and the Framingham systems detected the significant decrease in CR and in RCD or RVD at one year of application of the CORONARIA study protocol. The risk decrease measured by either method was significant (p < 0.05) overall, by gender, and by ACs. However, the SCORE system, unlike the Framingham system, could not detect the reported differences in the risk of coronary and vascular mortality between the ACs of the northern and south-eastern parts of Spain.
Fukunishi, Yoshifumi; Mikami, Yoshiaki; Nakamura, Haruki
2005-09-01
We developed a new method to evaluate the distances and similarities between receptor pockets or chemical compounds based on a multi-receptor versus multi-ligand docking affinity matrix. The receptors were classified by a cluster analysis based on calculations of the distance between receptor pockets. A set of receptors with low mutual homology that bind a similar compound could be classified into one cluster. Based on this line of reasoning, we proposed a new in silico screening method. According to this method, compounds in a database were docked to multiple targets. The new docking score was a slightly modified version of the multiple active site correction (MASC) score. Receptors at a set distance from the target receptor were excluded from the analysis, and the modified MASC scores were calculated for the remaining receptors. The choice of receptors is important for achieving a good screening result, and our clustering of receptors is useful for this purpose. This method was applied to the analysis of a set of 132 receptors and 132 compounds, and the results demonstrated that it achieves a high hit ratio compared to uniform sampling, using Sievgene, a newly developed receptor-ligand docking program with good docking performance that reconstructs 50.8% of complexes to within 2 Å RMSD.
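The original MASC scheme normalizes each compound's raw docking score against the distribution of that compound's scores across many receptors. The paper's exact modification is not given in this abstract, so the following is only a hedged sketch of the basic normalization, with the receptor-exclusion step represented by a simple index set:

```python
import statistics

def masc_scores(raw, exclude=()):
    """MASC-style normalization of a docking score matrix.
    raw[i][j] = docking score of compound i against receptor j.
    Each score becomes a z-score over the receptors kept for that
    compound; receptor indices in `exclude` are dropped, mirroring
    the paper's removal of selected receptors before scoring."""
    normalized = []
    for row in raw:
        kept = [s for j, s in enumerate(row) if j not in exclude]
        mu = statistics.mean(kept)
        sigma = statistics.stdev(kept)
        normalized.append([(s - mu) / sigma for s in kept])
    return normalized
```

A compound that docks unusually well to one receptor, relative to its own cross-receptor distribution, then stands out as a likely hit.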
Ambulatory orthopaedic surgery patients' emotions when using different patient education methods.
Heikkinen, Katja; Salanterä, Sanna; Leppänen, Tiina; Vahlberg, Tero; Leino-Kilpi, Helena
2012-07-01
A randomised controlled trial was used to evaluate elective ambulatory orthopaedic surgery patients' emotions during internet-based patient education or face-to-face education with a nurse. The internet-based patient education was designed for this study, and patients used the websites individually according to their needs. Patients in the control group participated individually in face-to-face patient education with a nurse in the ambulatory surgery unit. The theoretical basis for both types of education was the same. Ambulatory orthopaedic surgery patients scored their emotions rather low at intervals throughout the surgical process, although the scores varied over its course. Emotion scores did not decrease after patient education. No differences in patients' emotions were found between the two patient education methods.
NASA Astrophysics Data System (ADS)
Chaa, Mourad; Boukezzoula, Naceur-Eddine; Attia, Abdelouahab
2017-01-01
Two types of scores extracted from two-dimensional (2-D) and three-dimensional (3-D) palmprints are merged for personal recognition, and a local image descriptor for 2-D palmprint-based recognition systems, named bank of binarized statistical image features (B-BSIF), is introduced. The main idea of B-BSIF is that the histograms extracted from the binarized statistical image features (BSIF) code images (obtained by applying BSIF descriptors of different sizes with length 12) are concatenated into one to produce a large feature vector. A 3-D palmprint contains the depth information of the palm surface. The self-quotient image (SQI) algorithm is applied to reconstruct illumination-invariant 3-D palmprint images. To extract discriminative Gabor features from SQI images, Gabor wavelets are defined and used. Since dimensionality reduction methods have proven effective in biometric systems, a principal component analysis (PCA) + linear discriminant analysis (LDA) technique is employed. For the matching process, the cosine Mahalanobis distance is applied. Extensive experiments were conducted on a 2-D and 3-D palmprint database with 10,400 range images from 260 individuals, and the proposed algorithm was compared with other existing methods in the literature. Results clearly show that the proposed framework provides a higher correct recognition rate. Furthermore, the best results were obtained by merging the score of the B-BSIF descriptor with the score of the SQI + Gabor wavelets + PCA + LDA method, yielding an equal error rate of 0.00% and a rank-1 recognition rate of 100.00%.
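The B-BSIF feature vector is described as a concatenation of histograms computed from several BSIF code images. Leaving the BSIF filtering itself aside, the concatenation step can be sketched as follows (the per-image histogram normalization is an assumption, not stated in the abstract):

```python
def concat_histograms(code_images, n_bins):
    """Concatenate normalized histograms of several code images
    (each a flat list of integer codes in [0, n_bins)) into one
    feature vector, as in descriptor-bank approaches such as B-BSIF."""
    feature = []
    for codes in code_images:
        hist = [0] * n_bins
        for c in codes:
            hist[c] += 1
        total = float(len(codes))
        feature.extend(h / total for h in hist)
    return feature
```

Each BSIF filter size contributes its own n_bins-length histogram, so the final vector length is n_bins times the number of code images in the bank.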
A Comprehensive Critique and Review of Published Measures of Acne Severity
Furber, Gareth; Leach, Matthew; Segal, Leonie
2016-01-01
Objective: Acne vulgaris is a dynamic, complex condition that is notoriously difficult to evaluate. The authors set out to critically evaluate currently available measures of acne severity, particularly in terms of suitability for use in clinical trials. Design: A systematic review was conducted to identify methods used to measure acne severity, using MEDLINE, CINAHL, Scopus, and Wiley Online. Each method was critically reviewed and given a score out of 13 based on eight quality criteria under two broad groupings: psychometric testing, and suitability for research and evaluation. Results: Twenty-four methods for assessing acne severity were identified. Four scales received a quality score of zero, and 11 scored ≤3. The highest rated scales achieved a total score of 6. Six scales reported strong inter-rater reliability (ICC>0.75), and four reported strong intra-rater reliability (ICC>0.75). The poor overall performance of most scales, largely characterized by the absence of reliability testing or of evidence for independent assessment and validation, indicates that their application in clinical trials is generally not supported. Conclusion: This review and appraisal of instruments for measuring acne severity supports previously identified concerns regarding the quality of published measures. It highlights the need for a valid and reliable acne severity scale, especially for use in research and evaluation. The ideal scale would demonstrate adequate validation and reliability and be easily implemented for third-party analysis. The development of such a scale is critical to interpreting results of trials and facilitating the pooling of results for systematic reviews and meta-analyses. PMID:27672410
NASA Astrophysics Data System (ADS)
Powell, P. E.
Educators have recently come to consider inquiry-based instruction a more effective method of instruction than didactic instruction. Experience-based learning theory suggests that student performance is linked to teaching method. However, research is limited on inquiry teaching and its effectiveness in preparing students to perform well on standardized tests. The purpose of the study was to investigate whether one of these two teaching methodologies was more effective in increasing student performance on standardized science tests. The quasi-experimental quantitative study comprised two stages. Stage 1 used a survey to identify the teaching methods of a convenience sample of 57 teacher participants and determined the level of inquiry used in instruction to place participants into instructional groups (the independent variable). Stage 2 used analysis of covariance (ANCOVA) to compare posttest scores on a standardized exam by teaching method. Additional analyses examined differences in science achievement by ethnicity, gender, and socioeconomic status within each teaching methodology. Results demonstrated a statistically significant gain in test scores when students were taught using inquiry-based instruction. Subpopulation analyses indicated that all groups showed improved mean standardized test scores except African American students. The findings benefit teachers and students by presenting data supporting a method of content delivery that increases teacher efficacy and produces students with a greater cognition of science content, in keeping with the school's mission and goals.
Todd Rogers, W; Docherty, David; Petersen, Stewart
2014-01-01
The bookmark method for setting cut-scores was used to re-set the cut-score for the Canadian Forces Firefighter Physical Fitness Maintenance Evaluation (FF PFME). The time required to complete 10 tasks that together simulate a first-response firefighting emergency was accepted as a measure of work capacity. A panel of 25 Canadian Forces firefighter supervisors set cut-scores in three rounds. Each round involved independent evaluation of nine video work samples, where the times systematically increased from 400 seconds to 560 seconds. Results for Round 1 were discussed before moving to Round 2 and results for Round 2 were discussed before moving to Round 3. Accounting for the variability among panel members at the end of Round 3, a cut-score of 481 seconds (mean Round 3 plus 2 SEM) was recommended. Firefighters who complete the FF PFME in 481 seconds or less have the physical capacity to complete first-response firefighting work.
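The recommended cut-score was the Round 3 panel mean plus two standard errors of the mean. A minimal sketch of that adjustment, using the standard definition SEM = SD / sqrt(n) (the panel data in the test are hypothetical, not the study's):

```python
import math
import statistics

def bookmark_cut_score(panel_cut_scores, sem_multiplier=2.0):
    """Final cut-score as the panel mean plus a multiple of the
    standard error of the mean (SEM = SD / sqrt(n)), the adjustment
    the study applied to its Round 3 panel results."""
    n = len(panel_cut_scores)
    mean = statistics.mean(panel_cut_scores)
    sem = statistics.stdev(panel_cut_scores) / math.sqrt(n)
    return mean + sem_multiplier * sem
```

Adding 2 SEM builds the panel's residual variability into the standard, so a candidate is only failed when the time clearly exceeds what the panel judged acceptable.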
Automatic and Objective Assessment of Alternating Tapping Performance in Parkinson's Disease
Memedi, Mevludin; Khan, Taha; Grenholm, Peter; Nyholm, Dag; Westin, Jerker
2013-01-01
This paper presents the development and evaluation of a method for enabling quantitative and automatic scoring of alternating tapping performance of patients with Parkinson's disease (PD). Ten healthy elderly subjects and 95 patients in different clinical stages of PD have utilized a touch-pad handheld computer to perform alternate tapping tests in their home environments. First, a neurologist used a web-based system to visually assess impairments in four tapping dimensions (‘speed’, ‘accuracy’, ‘fatigue’ and ‘arrhythmia’) and a global tapping severity (GTS). Second, tapping signals were processed with time series analysis and statistical methods to derive 24 quantitative parameters. Third, principal component analysis was used to reduce the dimensions of these parameters and to obtain scores for the four dimensions. Finally, a logistic regression classifier was trained using a 10-fold stratified cross-validation to map the reduced parameters to the corresponding visually assessed GTS scores. Results showed that the computed scores correlated well to visually assessed scores and were significantly different across Unified Parkinson's Disease Rating Scale scores of upper limb motor performance. In addition, they had good internal consistency, had good ability to discriminate between healthy elderly and patients in different disease stages, had good sensitivity to treatment interventions and could reflect the natural disease progression over time. In conclusion, the automatic method can be useful to objectively assess the tapping performance of PD patients and can be included in telemedicine tools for remote monitoring of tapping. PMID:24351667
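The classifier was trained with 10-fold stratified cross-validation, i.e., folds that preserve the class balance of the visually assessed GTS labels. A minimal sketch of one way to build such folds (a simple round-robin assignment; the authors' actual implementation is not described in the abstract):

```python
from collections import defaultdict

def stratified_folds(labels, k=10):
    """Assign sample indices to k folds so that each class's samples
    are spread as evenly as possible across folds (stratification),
    as used when cross-validating a classifier on imbalanced labels."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    counter = 0
    for indices in by_class.values():
        for idx in indices:
            folds[counter % k].append(idx)
            counter += 1
    return folds
```

Each fold then serves once as the held-out test set while the classifier is trained on the other k-1 folds, giving every sample exactly one out-of-sample prediction.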
Chi-square-based scoring function for categorization of MEDLINE citations.
Kastrin, A; Peterlin, B; Hristovski, D
2010-01-01
Text categorization has been used in biomedical informatics for identifying documents containing relevant topics of interest. We developed a simple method that uses a chi-square-based scoring function to determine the likelihood that a MEDLINE citation contains a genetics-relevant topic. Our procedure requires construction of a genetic and a nongenetic domain document corpus. We used MeSH descriptors assigned to MEDLINE citations for this categorization task, comparing the frequencies of MeSH descriptors between the two corpora with the chi-square test. A MeSH descriptor was considered a positive indicator if its relative observed frequency in the genetic domain corpus was greater than its relative observed frequency in the nongenetic domain corpus. The output of the proposed method is a list of scores for all the citations, with the highest scores given to citations containing MeSH descriptors typical of the genetic domain. Validation was done on a set of 734 manually annotated MEDLINE citations. The method achieved a predictive accuracy of 0.87 with 0.69 recall and 0.64 precision. We evaluated it by comparison with three machine-learning algorithms (support vector machines, decision trees, naïve Bayes). Although the differences were not statistically significant, the results showed that our chi-square scoring performs as well as the compared machine-learning algorithms. We suggest that chi-square scoring is an effective solution for helping to categorize MEDLINE citations. The algorithm is implemented in the BITOLA literature-based discovery support system as a preprocessor for the gene symbol disambiguation process.
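The scoring function rests on the chi-square statistic for a 2x2 contingency table of descriptor occurrence in the two corpora. A hedged sketch of that statistic and of a simple citation score summing the scores of positive-indicator descriptors (the exact aggregation the authors use is not spelled out in the abstract):

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for a 2x2 contingency table:
    a = citations in the genetic corpus containing the descriptor,
    b = genetic citations without it, c = nongenetic with it,
    d = nongenetic without it."""
    n = a + b + c + d
    num = n * (a * d - b * c) ** 2
    den = (a + b) * (c + d) * (a + c) * (b + d)
    return num / den if den else 0.0

def citation_score(descriptors, indicator_scores):
    """Score a citation as the sum of chi-square scores of its MeSH
    descriptors flagged as positive (genetic-domain) indicators."""
    return sum(indicator_scores.get(d, 0.0) for d in descriptors)
```

Descriptors whose relative frequency is higher in the genetic corpus keep their chi-square value in `indicator_scores`; all others contribute zero, so citations rich in genetic-domain descriptors rise to the top of the ranked list.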
ADEPT, a dynamic next generation sequencing data error-detection program with trimming
DOE Office of Scientific and Technical Information (OSTI.GOV)
Feng, Shihai; Lo, Chien-Chi; Li, Po-E
2016-02-29
Illumina is the most widely used next generation sequencing technology and produces millions of short reads that contain errors. These sequencing errors constitute a major problem in applications such as de novo genome assembly, metagenomics analysis and single nucleotide polymorphism discovery. In this study, we present ADEPT, a dynamic error detection method based on the quality scores of each nucleotide and its neighboring nucleotides, together with their positions within the read, compared against the position-specific quality score distribution of all bases within the sequencing run. This method greatly improves upon other available methods in terms of the true positive rate of error discovery without affecting the false positive rate, particularly within the middle of reads. We conclude that ADEPT is the only tool to date that dynamically assesses errors within reads by comparing position-specific and neighboring base quality scores with the distribution of quality scores for the dataset being analyzed. The result is a method that is less prone to position-dependent under-prediction, which is one of the most prominent issues in error prediction. The outcome is that ADEPT improves upon prior efforts in identifying true errors, primarily within the middle of reads, while reducing the false positive rate.
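ADEPT's core idea is to compare a base's quality score with the position-specific quality distribution of the whole sequencing run. A simplified sketch of that comparison (the z-score cutoff and the omission of neighboring-base context are assumptions of this sketch; the real tool is more elaborate):

```python
def flag_suspect_bases(read_quals, position_stats, z_cutoff=-2.0):
    """Flag read positions whose quality score falls well below the
    position-specific quality distribution for the whole run.
    read_quals: quality scores of one read, by position.
    position_stats: list of (mean, stdev) per position, precomputed
    over all reads in the run. Returns the flagged position indices."""
    flagged = []
    for pos, q in enumerate(read_quals):
        mean, stdev = position_stats[pos]
        if stdev > 0 and (q - mean) / stdev < z_cutoff:
            flagged.append(pos)
    return flagged
```

Because the threshold is relative to each position's own distribution rather than a fixed quality floor, mid-read bases with locally depressed quality are caught even when their absolute scores look acceptable.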
Key Technical Aspects Influencing the Accuracy of Tablet Subdivision.
Teixeira, Maíra T; Sá-Barreto, Lívia C L; Gratieri, Taís; Gelfuso, Guilherme M; Silva, Izabel C R; Cunha-Filho, Marcílio S S
2017-05-01
Tablet subdivision is a common practice used mainly for dose adjustment. The aim of this study was to investigate how the technical aspects of production, as well as the method of tablet subdivision (employing a tablet splitter or a kitchen knife), influence the accuracy of this practice. Five drugs commonly used as subdivided tablets were selected. For each drug, the innovator drug product, a scored generic and a non-scored generic were investigated, for a total of fifteen drug products. Mechanical and physical tests, including image analysis, were performed. Additionally, comparisons were made between tablet subdivision methods, scores, shapes, diluent compositions and coatings. Image analysis based on surface area was a useful alternative assay for evaluating the accuracy of tablet subdivision. The tablet splitter demonstrated an advantage over the knife, showing better results in weight loss and friability tests. Oblong, coated and scored tablets had better results after subdivision than round, uncoated and non-scored tablets. The presence of elastic diluents such as starch and dibasic calcium phosphate dihydrate conferred more appropriate behaviour in the subdivision process than plastic materials such as microcrystalline cellulose and lactose. Finally, differences were observed between generics and their innovator products for all selected drugs with regard to the quality control assays on divided tablets, which highlights the need for health regulations to consider subdivision performance, at least in the marketing authorization of generic products.
Interobserver Reliability of the Total Body Score System for Quantifying Human Decomposition.
Dabbs, Gretchen R; Connor, Melissa; Bytheway, Joan A
2016-03-01
Several authors have tested the accuracy of the Total Body Score (TBS) method for quantifying decomposition, but none have examined the reliability of the method as a scoring system by testing interobserver error rates. Sixteen participants used the TBS system to score 59 observation packets including photographs and written descriptions of 13 human cadavers in different stages of decomposition (postmortem interval: 2-186 days). Data analysis used a two-way random model intraclass correlation in SPSS (v. 17.0). The TBS method showed "almost perfect" agreement between observers, with average absolute correlation coefficients of 0.990 and average consistency correlation coefficients of 0.991. While the TBS method may have sources of error, scoring reliability is not one of them. Individual component scores were examined, and the influences of education and experience levels were investigated. Overall, the trunk component scores were the least concordant. Suggestions are made to improve the reliability of the TBS method. © 2016 American Academy of Forensic Sciences.
Automatic evidence retrieval for systematic reviews.
Choong, Miew Keen; Galgani, Filippo; Dunn, Adam G; Tsafnat, Guy
2014-10-01
Snowballing involves recursively pursuing relevant references cited in the retrieved literature and adding them to the search results. Snowballing is an alternative approach to discover additional evidence that was not retrieved through conventional search. Snowballing's effectiveness makes it best practice in systematic reviews despite being time-consuming and tedious. Our goal was to evaluate an automatic method for citation snowballing's capacity to identify and retrieve the full text and/or abstracts of cited articles. Using 20 review articles that contained 949 citations to journal or conference articles, we manually searched Microsoft Academic Search (MAS) and identified 78.0% (740/949) of the cited articles that were present in the database. We compared the performance of the automatic citation snowballing method against the results of this manual search, measuring precision, recall, and F1 score. The automatic method was able to correctly identify 633 (as proportion of included citations: recall=66.7%, F1 score=79.3%; as proportion of citations in MAS: recall=85.5%, F1 score=91.2%) of citations with high precision (97.7%), and retrieved the full text or abstract for 490 (recall=82.9%, precision=92.1%, F1 score=87.3%) of the 633 correctly retrieved citations. The proposed method for automatic citation snowballing is accurate and is capable of obtaining the full texts or abstracts for a substantial proportion of the scholarly citations in review articles. By automating the process of citation snowballing, it may be possible to reduce the time and effort of common evidence surveillance tasks such as keeping trial registries up to date and conducting systematic reviews.
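The reported F1 scores follow directly from the stated precision and recall via the harmonic mean; for instance, precision 97.7% with recall 66.7% gives F1 ≈ 79.3%, matching the abstract:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, the F1 measure used to
    evaluate the automatic citation snowballing method."""
    return 2 * precision * recall / (precision + recall)
```

With precision 0.977, a recall of 0.667 yields 0.793 and a recall of 0.855 yields 0.912, reproducing the two F1 values reported for the identification step.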
Peine, Arne; Kabino, Klaus; Spreckelsen, Cord
2016-06-03
Modernised medical curricula in Germany (so-called "reformed study programs") rely increasingly on self-instructed learning forms such as e-learning and curriculum-guided self-study. However, there is a lack of evidence that these methods can outperform conventional teaching methods such as lectures and seminars. This study was conducted to compare traditional teaching methods with newer instruction forms in terms of learning effect and student satisfaction. In a randomised trial, 244 students of medicine in their third academic year were assigned to one of four study branches representing self-instructed learning forms (e-learning and curriculum-based self-study) and instructed learning forms (lectures and seminars). All groups participated in their respective learning module with standardised materials and instructions. Learning effect was measured with pre-test and post-test multiple-choice questionnaires. Student satisfaction and learning style were examined via self-assessment. Of 244 initial participants, 223 completed the respective module and were included in the study. In the pre-test, the groups showed relatively homogeneous scores. All students showed notable improvements compared with the pre-test results. Participants in the non-self-instructed learning groups reached scores of 14.71 (seminar) and 14.37 (lecture), while the self-instructed learners reached higher scores of 17.23 (e-learning) and 15.81 (self-study). All groups improved significantly (p < .001) in the post-test self-assessment, led by the e-learning group, whose self-assessment improved by 2.36. The study shows that students in modern study curricula learn better through modern self-instructed methods than through conventional methods. These methods should be used more widely, as they also show good levels of student acceptance and higher scores in personal self-assessment of knowledge.
The Quality of Methods Reporting in Parasitology Experiments
Flórez-Vargas, Oscar; Bramhall, Michael; Noyes, Harry; Cruickshank, Sheena; Stevens, Robert; Brass, Andy
2014-01-01
There is a growing concern both inside and outside the scientific community over the lack of reproducibility of experiments. The depth and detail of reported methods are critical to the reproducibility of findings, but also for making it possible to compare and integrate data from different studies. In this study, we evaluated in detail the methods reporting in a comprehensive set of trypanosomiasis experiments that should enable valid reproduction, integration and comparison of research findings. We evaluated a subset of other parasitic (Leishmania, Toxoplasma, Plasmodium, Trichuris and Schistosoma) and non-parasitic (Mycobacterium) experimental infections in order to compare the quality of method reporting more generally. A systematic review using PubMed (2000–2012) of all publications describing gene expression in cells and animals infected with Trypanosoma spp was undertaken based on PRISMA guidelines; 23 papers were identified and included. We defined a checklist of essential parameters that should be reported and have scored the number of those parameters that are reported for each publication. Bibliometric parameters (impact factor, citations and h-index) were used to look for association between Journal and Author status and the quality of method reporting. Trichuriasis experiments achieved the highest scores and included the only paper to score 100% in all criteria. The mean of scores achieved by Trypanosoma articles through the checklist was 65.5% (range 32–90%). Bibliometric parameters were not correlated with the quality of method reporting (Spearman's rank correlation coefficient <−0.5; p>0.05). Our results indicate that the quality of methods reporting in experimental parasitology is a cause for concern and it has not improved over time, despite there being evidence that most of the assessed parameters do influence the results. 
We propose that our set of parameters be used as guidelines to improve the quality of the reporting of experimental infection models as a pre-requisite for integrating and comparing sets of data. PMID:25076044
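The association between bibliometric parameters and reporting quality was assessed with Spearman's rank correlation, which is the Pearson correlation computed on the ranks of the data. A self-contained sketch (ties receive average ranks, the usual convention):

```python
def rank(values):
    """Average ranks (1-based); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)
```

A coefficient near zero, as between the journal metrics and the checklist scores here, means the rank orderings carry essentially no shared information.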
Lu, L N; He, X G; Zhu, J F; Xu, X; Zhang, R; Hu, X; Zou, H D
2016-11-11
Objective: To establish an assessment system, including indexes and scoring methods, that can be used to evaluate the performance of the provincial blindness prevention technical guidance groups properly and effectively. Methods: The indexes and scoring methods were set based on the core content of the "National Plan of Prevention and Treatment of Blindness (2012-2015)", the specific requirements and targets of the World Health Organization (WHO) "Universal Eye Health: A Global Action Plan (2014-2019)", and the current situation of China's provinces and autonomous regions. These indexes should be effective, feasible, comparable, instructive, and forward-looking. The framework of the system was built by qualitative assessment of candidate indicators identified through a literature review. The system was then revised and improved using the Delphi method. An empirical pilot study was then used to demonstrate feasibility, followed by a final qualitative analysis that established the "Chinese provincial blindness prevention technical guidance group performance evaluation system". Results: Through the literature review and qualitative assessment, a six-dimensional system framework was built, comprising 6 first-level indicators, 16 second-level indicators, and 29 third-level indicators after Delphi evaluation. With the variation coefficient method, the first-level index weights were calculated as: organization and management, 0.15; development and implementation of blindness prevention plans, 0.15; implementation of blindness prevention projects, 0.14; training, 0.17; health education, 0.18; and cooperation and exchanges, 0.21. The scoring methods for the system were confirmed as: data and file checks, field interviews, record interviews, and sampling investigation. The empirical pilot study was conducted in Jilin, Guizhou, and Gansu provinces, and the self-assessment results from local experts were consistent with the scores from the system.
Conclusion: The established system is appropriate at the current time, and it can effectively evaluate the performance of the Chinese provincial blindness prevention technical guidance groups. (Chin J Ophthalmol, 2016, 52:814-824).
Terslev, Lene; Naredo, Esperanza; Aegerter, Philippe; Wakefield, Richard J; Backhaus, Marina; Balint, Peter; Bruyn, George A W; Iagnocco, Annamaria; Jousse-Joulin, Sandrine; Schmidt, Wolfgang A; Szkudlarek, Marcin; Conaghan, Philip G; Filippucci, Emilio
2017-01-01
Objectives To test the reliability of new ultrasound (US) definitions and quantification of synovial hypertrophy (SH) and power Doppler (PD) signal, separately and in combination, in a range of joints in patients with rheumatoid arthritis (RA) using the European League Against Rheumatism–Outcomes Measures in Rheumatology (EULAR-OMERACT) combined score for PD and SH. Methods A stepwise approach was used: (1) scoring static images of metacarpophalangeal (MCP) joints in a web-based exercise and subsequently when scanning patients; (2) scoring static images of the wrist, proximal interphalangeal joints, knee and metatarsophalangeal joints in a web-based exercise and subsequently when scanning patients using different acquisitions (standardised vs usual practice). For reliability, kappa coefficients (κ) were used. Results Scoring MCP joints in static images showed substantial intraobserver variability but good to excellent interobserver reliability. In patients, intraobserver reliability was the same for the two acquisition methods. Interobserver reliability for SH (κ=0.87), PD (κ=0.79) and the EULAR-OMERACT combined score (κ=0.86) was better when using a ‘standardised’ scan. For the other joints, intraobserver reliability was excellent in static images for all scores (κ=0.8–0.97) and interobserver reliability marginally lower. When using standardised scanning in patients, intraobserver reliability was good (κ=0.64 for SH and the EULAR-OMERACT combined score, 0.66 for PD) and interobserver reliability was also good, especially for PD (κ range=0.41–0.92). Conclusion The EULAR-OMERACT score demonstrated moderate-good reliability in MCP joints using a standardised scan and is equally applicable in non-MCP joints. This scoring system should underpin improved reliability and consequently the responsiveness of US in RA clinical trials. PMID:28948984
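Interobserver agreement of the kind reported above is commonly summarized with Cohen's kappa, which corrects raw agreement for chance. The following is a minimal, hypothetical sketch of the statistic for two raters (the study's actual analysis, with multiple raters and exercises, would be richer):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # expected chance agreement from the raters' marginal category frequencies
    expected = sum(counts_a[c] * counts_b.get(c, 0) for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)

# two hypothetical raters grading the same 10 joints on a 0-3 scale
rater_1 = [0, 1, 2, 3, 1, 2, 0, 3, 2, 1]
rater_2 = [0, 1, 2, 3, 1, 2, 1, 3, 2, 2]
print(round(cohens_kappa(rater_1, rater_2), 3))  # 0.726
```

Values around 0.6-0.8, like those reported for the standardised scan, are conventionally read as good to substantial agreement.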
ERIC Educational Resources Information Center
Meijer, Rob R.
2004-01-01
Two new methods have been proposed to determine unexpected sum scores on sub-tests (testlets) both for paper-and-pencil tests and computer adaptive tests. A method based on a conservative bound using the hypergeometric distribution, denoted p, was compared with a method where the probability for each score combination was calculated using a…
An improved method to detect correct protein folds using partial clustering
2013-01-01
Background Structure-based clustering is commonly used to identify correct protein folds among candidate folds (also called decoys) generated by protein structure prediction programs. However, traditional clustering methods exhibit a poor runtime performance on large decoy sets. We hypothesized that a more efficient “partial” clustering approach in combination with an improved scoring scheme could significantly improve both the speed and performance of existing candidate selection methods. Results We propose a new scheme that performs rapid but incomplete clustering on protein decoys. Our method detects structurally similar decoys (measured using either Cα RMSD or GDT-TS score) and extracts representatives from them without assigning every decoy to a cluster. We integrated our new clustering strategy with several different scoring functions to assess both the performance and speed in identifying correct or near-correct folds. Experimental results on 35 Rosetta decoy sets and 40 I-TASSER decoy sets show that our method can improve the correct fold detection rate as assessed by two different quality criteria. This improvement is significantly better than two recently published clustering methods, Durandal and Calibur-lite. Speed and efficiency testing shows that our method can handle much larger decoy sets and is up to 22 times faster than Durandal and Calibur-lite. Conclusions The new method, named HS-Forest, avoids the computationally expensive task of clustering every decoy, yet still allows superior correct-fold selection. Its improved speed, efficiency and decoy-selection performance should enable structure prediction researchers to work with larger decoy sets and significantly improve their ab initio structure prediction performance. PMID:23323835
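The core "partial" clustering idea — comparing each decoy only against current cluster representatives, never against every other decoy — can be sketched as a greedy leader-style pass. This is an illustrative simplification under assumed toy inputs, not the HS-Forest algorithm itself:

```python
def partial_cluster(decoys, distance, threshold):
    """Greedy 'leader' clustering: each decoy is compared only to the
    current representatives, so runtime is O(n * k) for k clusters
    rather than the O(n^2) of all-pairs clustering."""
    representatives = []  # one exemplar per cluster
    sizes = []            # decoys absorbed by each exemplar
    for d in decoys:
        for i, rep in enumerate(representatives):
            if distance(d, rep) <= threshold:
                sizes[i] += 1
                break
        else:  # no representative close enough: start a new cluster
            representatives.append(d)
            sizes.append(1)
    # rank exemplars by cluster size, largest (most populated) first
    order = sorted(range(len(representatives)), key=lambda i: -sizes[i])
    return [representatives[i] for i in order], [sizes[i] for i in order]

# toy 1-D "decoys" with absolute difference standing in for Cα RMSD
reps, sizes = partial_cluster([0.0, 0.1, 5.0, 0.2, 5.1, 9.0],
                              lambda a, b: abs(a - b), 0.5)
print(reps, sizes)  # [0.0, 5.0, 9.0] [3, 2, 1]
```

In fold selection, the exemplar of the largest cluster (or the top few exemplars, rescored) would be taken forward as the candidate fold.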
The video-based test of communication skills: description, development, and preliminary findings.
Mazor, Kathleen M; Haley, Heather-Lyn; Sullivan, Kate; Quirk, Mark E
2007-01-01
The importance of assessing physician-patient communication skills is widely recognized, but assessment methods are limited. Objective structured clinical examinations are time-consuming and resource intensive. For practicing physicians, patient surveys may be useful, but these also require substantial resources. Clearly, it would be advantageous to develop alternative or supplemental methods for assessing the communication skills of medical students, residents, and physicians. The Video-based Test of Communication Skills (VTCS) is an innovative, computer-administered test consisting of 20 very short video vignettes. In each vignette, a patient makes a statement or asks a question. The examinee responds verbally, as if it were a real encounter and he or she were the physician. Responses are recorded for later scoring. Test administration takes approximately 1 h. Generalizability studies were conducted, and scores for two groups of physicians predicted to differ in their communication skills were compared. Preliminary results are encouraging; the estimated g coefficient for the communication score for the 20-vignette test (scored by five raters) is 0.79; g for the personal/affective score under the same conditions is 0.62. Differences between physicians were in the predicted direction, with physicians considered "at risk" for communication difficulties scoring lower than those not so identified. The VTCS is a short, portable test of communication skills. Results reported here suggest that scores reflect differences in skill levels and are generalizable. However, these findings are based on very small sample sizes and must be considered preliminary. Additional work is required before it will be possible to argue confidently that this test in particular, and this approach to testing communication skills in general, is valuable and likely to make a substantial contribution to assessment in medical education.
Thompson, Frances E; Midthune, Douglas; Kahle, Lisa; Dodd, Kevin W
2017-06-01
Background: Methods for improving the utility of short dietary assessment instruments are needed. Objective: We sought to describe the development of the NHANES Dietary Screener Questionnaire (DSQ), its scoring algorithms, and its performance. Methods: The 19-item DSQ assesses intakes of fruits and vegetables, whole grains, added sugars, dairy, fiber, and calcium. Two nonconsecutive 24-h dietary recalls and the DSQ were administered in NHANES 2009-2010 to respondents aged 2-69 y (n = 7,588). The DSQ frequency responses, coupled with sex- and age-specific portion size information, were regressed on intake from 24-h recalls by using the National Cancer Institute usual intake method to obtain scoring algorithms that estimate mean intake and the prevalence of reaching 2 a priori threshold levels. The resulting scoring algorithms were applied to the DSQ, and the estimates were compared with intakes estimated from the 24-h recall data only. The stability of the derived scoring algorithms was evaluated in repeated sampling. Finally, the scoring algorithms were applied to screener data, and these estimates were compared with those from multiple 24-h recalls in 3 external studies. Results: The DSQ and its scoring algorithms produced estimates of mean intake and prevalence that agreed closely with those from multiple 24-h recalls. The scoring algorithms were stable in repeated sampling. Differences in the means were <2%; differences in prevalence were <16%. In other studies, agreement between screener and 24-h recall estimates of fruit and vegetable intake varied. For example, among men in 2 studies, estimates from the screener were significantly lower than the 24-h recall estimates (3.2 compared with 3.8 and 3.2 compared with 4.1). In the third study, agreement between the screener and 24-h recall estimates was close among both men (3.2 compared with 3.1) and women (2.6 compared with 2.5). Conclusions: This approach to developing scoring algorithms is an advance in the use of screeners.
However, because these algorithms may not be generalizable to all studies, a pilot study in the proposed study population is advisable. Although more precise instruments such as 24-h dietary recalls are recommended in most research, the NHANES DSQ provides a less burdensome alternative when time and resources are constrained and interest is in a limited set of dietary factors. © 2017 American Society for Nutrition.
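The calibration step — regressing recall-based intake on screener-derived intake to obtain a scoring algorithm — can be illustrated with a one-predictor least-squares fit on hypothetical data. This is a deliberately minimal sketch; the actual NCI usual-intake method additionally models within-person variation and skewed intake distributions:

```python
def fit_scoring_algorithm(screener, recall_intake):
    """Ordinary least squares y = a + b*x: calibrate screener-derived
    intake (frequency x portion size) against 24-h recall intake."""
    n = len(screener)
    mx = sum(screener) / n
    my = sum(recall_intake) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(screener, recall_intake))
         / sum((x - mx) ** 2 for x in screener))
    a = my - b * mx
    return a, b  # predicted intake for a new respondent: a + b * screener_value

# toy calibration: screener estimates vs mean of two 24-h recalls
a, b = fit_scoring_algorithm([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
print(round(a, 2), round(b, 2))  # 0.5 1.0
```

Once fitted on a reference sample, the pair (a, b) is what gets carried over and applied to new screener data, which is why the abstract cautions that the algorithm may not transfer to populations unlike the calibration sample.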
Advanced Guidance and Control Methods for Reusable Launch Vehicles: Test Results
NASA Technical Reports Server (NTRS)
Hanson, John M.; Jones, Robert E.; Krupp, Don R.; Fogle, Frank R. (Technical Monitor)
2002-01-01
There are a number of approaches to advanced guidance and control (AG&C) that have the potential for achieving the goals of significantly increasing reusable launch vehicle (RLV) safety/reliability and reducing the cost. In this paper, we examine some of these methods and compare the results. We briefly introduce the various methods under test, list the test cases used to demonstrate that the desired results are achieved, show an automated test scoring method that greatly reduces the evaluation effort required, and display results of the tests. Results are shown for the algorithms that have entered testing so far.
Temperature Profiles of Different Cooling Methods in Porcine Pancreas Procurement
Weegman, Brad P.; Suszynski, Thomas M.; Scott, William E.; Ferrer, Joana; Avgoustiniatos, Efstathios S.; Anazawa, Takayuki; O’Brien, Timothy D.; Rizzari, Michael D.; Karatzas, Theodore; Jie, Tun; Sutherland, David ER.; Hering, Bernhard J.; Papas, Klearchos K.
2014-01-01
Background Porcine islet xenotransplantation is a promising alternative to human islet allotransplantation. Porcine pancreas cooling needs to be optimized to reduce the warm ischemia time (WIT) following donation after cardiac death, which is associated with poorer islet isolation outcomes. Methods This study examines the effect of 4 different cooling methods on core porcine pancreas temperature (n=24) and histopathology (n=16). All methods involved surface cooling with crushed ice and chilled irrigation. Method A, the standard for porcine pancreas procurement, used only surface cooling. Method B added an intravascular flush with cold solution through the pancreas arterial system. Method C added an intraductal infusion with cold solution through the major pancreatic duct, and Method D combined all 3 cooling approaches. Results Surface cooling alone (Method A) gradually decreased core pancreas temperature to <10 °C after 30 minutes. An intravascular flush (Method B) improved cooling throughout procurement, while an intraductal infusion (Method C) rapidly reduced core temperature by 15–20 °C within the first 2 minutes of cooling. Combining all methods (Method D) was the most effective at rapidly reducing temperature and providing sustained cooling throughout procurement, although the recorded WIT did not differ between methods (p=0.36). Histological scores differed between the cooling methods (p=0.02) and were worst with Method A. There were differences in histological scores between Methods A and C (p=0.02) and Methods A and D (p=0.02), but not between Methods C and D (p=0.95), which may highlight the importance of early cooling using an intraductal infusion. Conclusions In conclusion, surface cooling alone cannot rapidly cool large (porcine or human) pancreata.
Additional cooling with an intravascular flush and intraductal infusion improves core porcine pancreas temperature profiles and histopathology scores during procurement. These data may also have implications for human pancreas procurement, since use of an intraductal infusion is not common practice. PMID:25040217
Tan, York Kiat; Allen, John C; Lye, Weng Kit; Conaghan, Philip G; Chew, Li-Ching; Thumboo, Julian
2017-05-01
The aim of the study is to compare the responsiveness of two joint inflammation scoring systems (dichotomous scoring (DS) versus semi-quantitative scoring (SQS)) using novel individualized ultrasound joint selection methods and existing ultrasound joint selection methods. Responsiveness measured by the standardized response means (SRMs) using the DS and the SQS system (for both the novel and existing ultrasound joint selection methods) was derived using the baseline and the 3-month total inflammatory scores from 20 rheumatoid arthritis patients. The relative SRM gain ratios (SRM-Gains) for both scoring system (DS and SQS) comparing the novel to the existing methods were computed. Both scoring systems (DS and SQS) demonstrated substantial SRM-Gains (ranged from 3.31 to 5.67 for the DS system and ranged from 1.82 to 3.26 for the SQS system). The SRMs using the novel methods ranged from 0.94 to 1.36 for the DS system and ranged from 0.89 to 1.11 for the SQS system. The SRMs using the existing methods ranged from 0.24 to 0.32 for the DS system and ranged from 0.34 to 0.49 for the SQS system. The DS system appears to achieve high responsiveness comparable to SQS for the novel individualized ultrasound joint selection methods.
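The standardized response mean underlying these comparisons is simply the mean score change divided by the standard deviation of the changes. A minimal sketch with hypothetical patient data (not the study's own scores):

```python
from statistics import mean, stdev

def srm(baseline, followup):
    """Standardized response mean: mean change / SD of change.
    Negative values here reflect falling (improving) inflammation scores;
    responsiveness is usually judged on the magnitude."""
    changes = [f - b for b, f in zip(baseline, followup)]
    return mean(changes) / stdev(changes)

# hypothetical total inflammatory scores for 5 patients
baseline = [10, 12, 9, 14, 11]
month_3 = [6, 7, 6, 10, 8]
print(round(srm(baseline, month_3), 2))  # -4.54
```

The SRM-Gain reported in the abstract is then just the ratio of the SRM under the novel joint selection method to the SRM under the existing method.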
2009-01-01
Background Chronic kidney disease (CKD) is a serious public health problem in Taiwan and worldwide. The most effective, affordable treatments involve early prevention, detection, and intervention, which require screening. Successfully implementing CKD programs requires good patient participation, which is affected by patient perceptions of screening service quality. Service quality improvements can help make such programs more successful, so good tools for assessing service quality perceptions are important. Aim: To investigate the use of a modified SERVQUAL questionnaire in assessing patient expectations, perceptions, and loyalty toward kidney disease screening service quality. Method 1,595 kidney disease screening program patients in Taichung City were asked to complete and return a modified kidney disease screening SERVQUAL questionnaire; 1,187 did so. After 102 incomplete questionnaires were excluded, 1,085 were retained as valid. Paired t-tests, correlation tests, ANOVA, the LSD test, and factor analysis identified the characteristics and factors of service quality. The paired t-test tested gaps between expectation and perception scores. Structural equation modeling (SEM) examined the relationships among satisfaction-based components. Results The effective response rate was 91.4%. Several methods verified validity. Cronbach's alpha for internal reliability was above 0.902. On patient satisfaction, expectation scores were high, 6.50 (0.82), but perception scores were significantly lower, 6.14 (1.02). Older patients' perception scores were lower than younger patients'. Expectation and perception scores for patients with different types of jobs were significantly different. Patients with more education had lower scores for expectation (r = -0.09) and perception (r = -0.26). Factor analysis identified three factors in the 22-item SERVQUAL form, which account for 80.8% of the total variance in the expectation scores and 86.9% of the total variance in the satisfaction scores.
Expectation and perception score gaps in all 22 items are significant. The goodness-of-fit summary of the SEM results indicates that expectations and perceptions are positively correlated, perceptions and loyalty are positively correlated, but expectations and loyalty are not positively correlated. Conclusions The results of this research suggest that the SERVQUAL instrument is a useful measurement tool in assessing and monitoring service quality in kidney disease screening services, enabling the staff to identify where service improvements are needed from the patients' perspectives. PMID:20021684
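The gap tests reported here are paired t-tests on per-patient differences between expectation and perception scores. A sketch of the t statistic with hypothetical scores (the actual study had 1,085 respondents and 22 items):

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(x, y):
    """Paired t statistic: mean per-patient gap over its standard error."""
    diffs = [a - b for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# hypothetical expectation vs perception scores for 5 patients on one item
expectation = [7, 6, 7, 6, 7]
perception = [6, 6, 6, 5, 6]
print(round(paired_t(expectation, perception), 2))  # 4.0
```

A large positive t on an item flags a service dimension where patients' experience falls short of their expectations, which is exactly how SERVQUAL gap analysis prioritizes improvements.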
FHSA-SED: Two-Locus Model Detection for Genome-Wide Association Study with Harmony Search Algorithm
Tuo, Shouheng; Zhang, Junying; Yuan, Xiguo; Zhang, Yuanyuan; Liu, Zhaowen
2016-01-01
Motivation The two-locus model is a typical significant disease model to be identified in genome-wide association studies (GWAS). Due to the intensive computational burden and the diversity of disease models, existing methods suffer from low detection power, high computation cost, and a preference for some types of disease models. Method In this study, two scoring functions (the Bayesian-network-based K2-score and the Gini-score) are used to characterize a pair of SNP loci as a candidate model; the two criteria are adopted simultaneously to improve identification power and to tackle the preference for particular disease models. The harmony search algorithm (HSA) is improved to quickly find the most likely candidate models among all two-locus models; a local search algorithm with a two-dimensional tabu table is presented to avoid repeatedly evaluating disease models that have strong marginal effects. Finally, the G-test statistic is used to further test the candidate models. Results We investigated our method, named FHSA-SED, on 82 simulated datasets and a real AMD dataset, and compared it with two typical methods (MACOED and CSE) that were recently developed based on swarm intelligence search algorithms. The results of the simulation experiments indicate that our method outperforms the two compared algorithms in terms of detection power, computation time, evaluation times, sensitivity (TPR), specificity (SPC), positive predictive value (PPV) and accuracy (ACC). Our method identified two SNPs (rs3775652 and rs10511467) that may also be associated with disease in the AMD dataset. PMID:27014873
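One of the two criteria, the Gini-score, can be illustrated as the weighted Gini impurity of case/control labels across the 3x3 genotype table of a candidate SNP pair; lower impurity suggests a more informative pair. This is an illustrative definition on toy data, not necessarily the paper's exact formulation:

```python
from collections import defaultdict

def gini_score(snp_a, snp_b, labels):
    """Weighted Gini impurity of case/control labels (1 = case) across
    the genotype cells of a candidate SNP pair; lower = more informative."""
    cells = defaultdict(list)
    for ga, gb, y in zip(snp_a, snp_b, labels):
        cells[(ga, gb)].append(y)
    n = len(labels)
    score = 0.0
    for ys in cells.values():
        p = sum(ys) / len(ys)                     # case fraction in the cell
        score += (len(ys) / n) * 2 * p * (1 - p)  # impurity weighted by cell size
    return score

# a pair whose genotype combinations perfectly separate cases scores 0
print(gini_score([0, 0, 1, 1], [0, 1, 0, 1], [0, 0, 1, 1]))  # 0.0
```

In a search method like FHSA-SED, such a score is evaluated for each candidate pair proposed by the harmony search, and low-scoring pairs are carried forward to the confirmatory G-test.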
Intelligent query by humming system based on score level fusion of multiple classifiers
NASA Astrophysics Data System (ADS)
Pyo Nam, Gi; Thu Trang Luong, Thi; Ha Nam, Hyun; Ryoung Park, Kang; Park, Sung-Joo
2011-12-01
Recently, the necessity for content-based music retrieval that can return results even if a user does not know information such as the title or singer has increased. Query-by-humming (QBH) systems have been introduced to address this need, as they allow the user to simply hum snatches of the tune to find the right song. Even though there have been many studies on QBH, few have combined multiple classifiers based on various fusion methods. Here we propose a new QBH system based on the score level fusion of multiple classifiers. This research is novel in the following three respects: three local classifiers [quantized binary (QB) code-based linear scaling (LS), pitch-based dynamic time warping (DTW), and LS] are employed; local maximum and minimum point-based LS and pitch distribution feature-based LS are used as global classifiers; and the combination of local and global classifiers based on the score level fusion by the PRODUCT rule is used to achieve enhanced matching accuracy. Experimental results with the 2006 MIREX QBSH and 2009 MIR-QBSH corpus databases show that the performance of the proposed method is better than that of single classifier and other fusion methods.
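The PRODUCT-rule fusion combines per-candidate scores multiplicatively after bringing each classifier's scores onto a common scale. A hypothetical sketch, assuming higher scores mean better matches (distance-based classifiers such as DTW would first be converted to similarities):

```python
def product_fusion(score_lists):
    """Fuse per-candidate scores from several classifiers by the PRODUCT
    rule: min-max normalize each classifier's scores to [0, 1], then
    multiply across classifiers for each candidate."""
    normalized = []
    for scores in score_lists:
        lo, hi = min(scores), max(scores)
        normalized.append([(s - lo) / (hi - lo) if hi > lo else 1.0
                           for s in scores])
    fused = [1.0] * len(score_lists[0])
    for scores in normalized:
        fused = [f * s for f, s in zip(fused, scores)]
    return fused

# three candidate songs scored by two hypothetical classifiers
fused = product_fusion([[0.2, 0.9, 0.5], [0.1, 0.8, 0.9]])
best = max(range(len(fused)), key=fused.__getitem__)
print(best)  # 1: the candidate that is strong under both classifiers
```

The product rule rewards candidates that every classifier ranks highly and heavily penalizes any candidate that a single classifier scores near zero, which is one reason it can outperform sum-style fusion when the classifiers are complementary.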
Kumar, Ujwal; Tomar, Vinay; Yadav, Sher Singh; Priyadarshi, Shivam; Vyas, Nachiket; Agarwal, Neeraj; Dayal, Ram
2018-01-01
Purpose: The aim of the current study was to compare Guy's score and the STONE score in predicting the success and complication rates of percutaneous nephrolithotomy (PCNL). Materials and Methods: A total of 445 patients were included in the study between July 2015 and December 2016. The patients were assigned STONE score and Guy's Stone Score (GSS) grades based on preoperative CT scans, and intra- and postoperative complications were graded using the modified Clavien grading system. PCNL was performed using a standard technique in the prone position. Results: The success rate in our study was 86.29%, and both the GSS and the STONE score were significantly associated with the success rate of the procedure. Both scoring systems correlated with operative time and postoperative hospital stay. Of the total cases, 102 patients (22.92%) experienced complications. A correlation was also found between the STONE score stratified into low, moderate, and high nephrolithometry score risk groups (low scores 4–5, moderate scores 6–8, high scores 9–13) and complications (P = 0.04), but not between the GSS and the complication rate (P = 0.054). Conclusion: Both the GSS and the STONE score are equally effective in predicting the success rate of the procedure. PMID:29416280
Khan, Shah Alam; Kumar, Ashok
2010-09-01
We wanted to evaluate the efficacy of Ponseti's technique in neglected clubfoot in children more than 7 years of age. The results of Ponseti's method were evaluated in 21 children (25 feet) with neglected club feet. Patients were evaluated using the Dimeglio scoring system. All patients underwent percutaneous tenotomy of the Achilles tendon. The mean age at the time of treatment was 8.9 years. The mean follow-up period was 4.7 years. The average Dimeglio score at the start of the treatment was 14.2 compared with an average score of 0.95 at the end of the treatment at 1-year follow-up. Eighteen feet (85.7%) had full correction. Recurrence was seen in six feet (24%). At 4-year follow-up, the average Dimeglio score for 19 feet was 0.18. We recommend that Ponseti's method should be the preferred initial treatment modality for neglected clubfeet.
Can We Train Machine Learning Methods to Outperform the High-dimensional Propensity Score Algorithm?
Karim, Mohammad Ehsanul; Pang, Menglan; Platt, Robert W
2018-03-01
The use of retrospective health care claims datasets is frequently criticized for the lack of complete information on potential confounders. By utilizing patients' health status-related information from claims datasets as surrogates or proxies for mismeasured and unobserved confounders, the high-dimensional propensity score algorithm enables us to reduce bias. Using a previously published cohort study of postmyocardial infarction statin use (1998-2012), we compare the performance of the algorithm with a number of popular machine learning approaches for confounder selection in high-dimensional covariate spaces: random forest, least absolute shrinkage and selection operator, and elastic net. Our results suggest that, when the data analysis is done with epidemiologic principles in mind, machine learning methods perform as well as the high-dimensional propensity score algorithm. Using a plasmode framework that mimicked the empirical data, we also showed that a hybrid of machine learning and high-dimensional propensity score algorithms generally performs slightly better than both in terms of mean squared error, when a bias-based analysis is used.
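Whichever method selects the confounders, the treatment effect is then typically estimated by weighting on the fitted propensity scores. A minimal inverse-probability-of-treatment-weighting (IPTW) sketch on toy data, assuming the propensity scores have already been estimated (for example, by lasso-penalized logistic regression):

```python
def iptw_ate(outcomes, treated, propensity):
    """Normalized IPTW estimate of the average treatment effect.
    propensity[i] is the estimated P(treated | X_i); the estimation
    step itself is deliberately omitted from this sketch."""
    tw = [(y, 1 / p)
          for y, t, p in zip(outcomes, treated, propensity) if t]
    cw = [(y, 1 / (1 - p))
          for y, t, p in zip(outcomes, treated, propensity) if not t]
    mean_t = sum(y * w for y, w in tw) / sum(w for _, w in tw)
    mean_c = sum(y * w for y, w in cw) / sum(w for _, w in cw)
    return mean_t - mean_c

# toy data: with equal propensities this reduces to a difference in means
print(iptw_ate([3, 5, 2, 4], [1, 1, 0, 0], [0.5, 0.5, 0.5, 0.5]))  # 1.0
```

Down-weighting "expected" treatment assignments and up-weighting "surprising" ones is what balances the measured confounders between groups, so the quality of the estimate hinges on how well the selected covariates capture confounding, which is the point of the comparison above.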
[Evaluation of the factorial and metric equivalence of the Sexual Assertiveness Scale (SAS) by sex].
Sierra, Juan Carlos; Santos-Iglesias, Pablo; Vallejo-Medina, Pablo
2012-05-01
Sexual assertiveness refers to the ability to initiate sexual activity, refuse unwanted sexual activity, and use contraceptive methods to avoid sexually transmitted diseases, developing healthy sexual behaviors. The Sexual Assertiveness Scale (SAS) assesses these three dimensions. The purpose of this study is to evaluate, using structural equation modeling and differential item functioning, the equivalence of the scale between men and women. Standard scores are also provided. A total of 4,034 participants from 21 Spanish provinces took part in the study. Quota sampling method was used. Results indicate a strict equivalent dimensionality of the Sexual Assertiveness Scale across sexes. One item was flagged by differential item functioning, although it does not affect the scale. Therefore, there is no significant bias in the scale when comparing across sexes. Standard scores show similar Initiation assertiveness scores for men and women, and higher scores on Refusal and Sexually Transmitted Disease Prevention for women. This scale can be used on men and women with sufficient psychometric guarantees.
Millard, Heather A Towle; Millard, Ralph P; Constable, Peter D; Freeman, Lyn J
2014-02-01
To determine the relationships among traditional and laparoscopic surgical skills, spatial analysis skills, and video gaming proficiency of third-year veterinary students. Prospective, randomized, controlled study. A convenience sample of 29 third-year veterinary students. The students had completed basic surgical skills training with inanimate objects but had no experience with soft tissue, orthopedic, or laparoscopic surgery; the spatial analysis test; or the video games that were used in the study. Scores for traditional surgical, laparoscopic, spatial analysis, and video gaming skills were determined, and associations among these were analyzed by means of Spearman's rank order correlation coefficient (rs). A significant positive association (rs = 0.40) was detected between summary scores for video game performance and laparoscopic skills, but not between video game performance and traditional surgical skills scores. Spatial analysis scores were positively (rs = 0.30) associated with video game performance scores; however, that result was not significant. Spatial analysis scores were not significantly associated with laparoscopic surgical skills scores. Traditional surgical skills scores were not significantly associated with laparoscopic skills or spatial analysis scores. Results of this study indicated video game performance of third-year veterinary students was predictive of laparoscopic but not traditional surgical skills, suggesting that laparoscopic performance may be improved with video gaming experience. Additional studies would be required to identify methods for improvement of traditional surgical skills.
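Spearman's rank order correlation, used throughout this analysis, is the Pearson correlation computed on ranks (mid-ranks for ties). A self-contained sketch with toy score data (not the study's measurements):

```python
def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of mid-ranks."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        i = 0
        while i < len(order):
            j = i
            # extend j over any run of tied values
            while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
                j += 1
            mid = (i + j) / 2 + 1  # average rank for the tie group
            for k in range(i, j + 1):
                r[order[k]] = mid
            i = j + 1
        return r

    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# hypothetical video-game summary scores vs laparoscopic skills scores
games = [55, 72, 60, 88, 70]
laparo = [40, 65, 50, 80, 60]
print(round(spearman_rho(games, laparo), 2))  # 1.0: identical rank order
```

Because it works on ranks, the statistic captures monotone association without assuming linearity, which suits ordinal skill scores like those in this study.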
Wang, L Y; Peng, H; Huang, W N; Gao, B
2016-04-20
Objective: This study was designed to observe dizziness handicap inventory (DHI) scores in patients with BPPV (benign paroxysmal positional vertigo) before and after repositioning maneuvers, and to discuss the value of DHI scores in the diagnosis and treatment of BPPV. Method: Charts of 72 patients with BPPV diagnosed by positioning tests were reviewed. Four DHI scores were used: the total score (DHIT), the functional score (DHIF), the emotional score (DHIE), and the physical score (DHIP). We compared the pre-repositioning and post-repositioning DHI scores of the patients, and also compared the DHI scores of patients with and without residual dizziness. Result: All 72 patients underwent repositioning maneuvers and had their DHI scores recorded. The mean post-repositioning scores were dramatically decreased compared with the pre-repositioning scores, and the difference was significant (P < 0.01). The difference in DHIP scores between the residual dizziness group and the non-residual dizziness group was not significant, while the DHIF, DHIE, and DHIT scores were statistically different between the two groups. Conclusion: After repositioning maneuvers, the dizziness handicap of BPPV patients can be significantly improved. Subsequent treatment for patients with residual dizziness after successful repositioning could target functional and emotional dizziness. Copyright© by the Editorial Department of Journal of Clinical Otorhinolaryngology Head and Neck Surgery.
Is the NIHSS Certification Process Too Lenient?
Hills, Nancy K.; Josephson, S. Andrew; Lyden, Patrick D.; Johnston, S. Claiborne
2009-01-01
Background and Purpose The National Institutes of Health Stroke Scale (NIHSS) is a widely used measure of neurological function in clinical trials and patient assessment; inter-rater scoring variability could impact communications and trial power. The manner in which the rater certification test is scored yields multiple correct answers that have changed over time. We examined the range of possible total NIHSS scores from answers given in certification tests by over 7,000 individual raters who were certified. Methods We analyzed the results of all raters who completed one of two standard multiple-patient videotaped certification examinations between 1998 and 2004. The range for the correct score, calculated using NIHSS ‘correct answers’, was determined for each patient. The distribution of scores derived from those who passed the certification test was then examined. Results A total of 6,268 raters scored 5 patients on Test 1; 1,240 scored 6 patients on Test 2. Using a National Stroke Association (NSA) answer key, we found that the number of different correct total scores per patient ranged from 2 to as many as 12. Among raters who achieved a passing score and were therefore qualified to administer the NIHSS, score distributions were even wider, with 1 certification patient receiving 18 different correct total scores. Conclusions Allowing multiple acceptable answers for questions on the NIHSS certification test introduces scoring variability. It seems reasonable to assume that the wider the range of acceptable answers on the certification test, the greater the variability in the performance of the test in trials and clinical practice by certified examiners. Greater consistency may be achieved by deriving a set of ‘best’ answers through expert consensus on all questions where this is possible, then teaching raters how to derive these answers using a required interactive training module. PMID:19295205
[Variation trend and significance of adult tonsil size and tongue position].
Bin, X; Zhou, Y
2016-08-05
Objective: The aim of this study is to explore the changing trend and significance of adult tonsil size and tongue position by observing adults in different age groups. Method: The oropharyngeal cavities of 1060 adults who were undergoing health examinations and had no history of tonsil surgery were observed. Friedman tongue position (FTP) and tonsil size (TS) were scored according to Friedman's criteria, and the results were statistically analyzed to evaluate their pattern of change and significance. Result: Mean FTP scores increased significantly with age (P<0.01); the FTP score in males was lower than that in females (P<0.01). TS scores decreased significantly with age (P<0.05). The average TS score did not differ significantly between genders. Although not statistically significant, the total FTP score showed an increasing trend with age (P>0.05); total FTP scores differed between sexes (male 4.12±0.67, female 4.23±0.68, P<0.05). BMI was not statistically different as FTP scores, TS scores and total scores changed (P>0.05), but it showed an increasing trend with age (P<0.01). Conclusion: The width of the pharyngeal cavity in normal adults remains fairly stable, although it proves to be narrower in obese people. TS and FTP scores, which show opposite trends with age, can be regarded as a major factor in maintaining a stable width of the oropharyngeal cavity. Copyright© by the Editorial Department of Journal of Clinical Otorhinolaryngology Head and Neck Surgery.
ERIC Educational Resources Information Center
Weltman, David; Whiteside, Mary
2010-01-01
This research shows that active learning is not universally effective and, in fact, may inhibit learning for certain types of students. The results of this study show that as increased levels of active learning are utilized, student test scores decrease for those with a high grade point average. In contrast, test scores increase as active learning…
Association between eating behavior scores and obesity in Chilean children
2011-01-01
Background Inadequate eating behavior and physical inactivity contribute to the current epidemic of childhood obesity. The aim of this study was to assess the association between eating behavior scores and childhood obesity in Chilean children. Design and methods We recruited 126 obese, 44 overweight and 124 normal-weight Chilean children (6-12 years-old; both genders) according to the International Obesity Task Force (IOTF) criteria. Eating behavior scores were calculated using the Child Eating Behavior Questionnaire (CEBQ). Factorial analysis in the culturally-adapted questionnaire for the Chilean population was used to confirm the original eight-factor structure of the CEBQ. The Cronbach's alpha statistic (>0.7 in most subscales) was used to assess internal consistency. Non-parametric methods were used to assess case-control associations. Results Eating behavior scores were strongly associated with childhood obesity in Chilean children. Childhood obesity was directly associated with high scores in the subscales "enjoyment of food" (P < 0.0001), "emotional overeating" (P < 0.001) and "food responsiveness" (P < 0.0001). The food-avoidant subscales "satiety responsiveness" and "slowness in eating" were inversely associated with childhood obesity (P < 0.001). There was a graded relation in the magnitude of these eating behavior scores across the normal-weight, overweight and obese groups. Conclusion Our study shows a strong and graded association between specific eating behavior scores and childhood obesity in Chile. PMID:21985269
Pharmacophore-Based Similarity Scoring for DOCK
2015-01-01
Pharmacophore modeling incorporates geometric and chemical features of known inhibitors and/or targeted binding sites to rationally identify and design new drug leads. In this study, we have encoded a three-dimensional pharmacophore matching similarity (FMS) scoring function into the structure-based design program DOCK. Validation and characterization of the method are presented through pose reproduction, crossdocking, and enrichment studies. When used alone, FMS scoring dramatically improves pose reproduction success to 93.5% (∼20% increase) and reduces sampling failures to 3.7% (∼6% drop) compared to the standard energy score (SGE) across 1043 protein–ligand complexes. The combined FMS+SGE function further improves success to 98.3%. Crossdocking experiments using FMS and FMS+SGE scoring, for six diverse protein families, similarly showed improvements in success, provided proper pharmacophore references are employed. For enrichment, incorporating pharmacophores during sampling and scoring, in most cases, also yields improved outcomes when docking and rank-ordering libraries of known actives and decoys to 15 systems. Retrospective analyses of virtual screenings to three clinical drug targets (EGFR, IGF-1R, and HIVgp41) using X-ray structures of known inhibitors as pharmacophore references are also reported, including a customized FMS scoring protocol to bias on selected regions in the reference. Overall, the results and fundamental insights gained from this study should benefit the docking community in general, particularly researchers using the new FMS method to guide computational drug discovery with DOCK. PMID:25229837
NASA Astrophysics Data System (ADS)
Fikri Zanil, Muhamad; Nur Wahidah Nik Hashim, Nik; Azam, Huda
2017-11-01
Psychiatrists currently rely on questionnaires and interviews for psychological assessment. These conventional methods often miss true positives, which can lead to death, especially in cases where a patient experiencing suicidal predisposition is diagnosed only with major depressive disorder (MDD). With modern technology, an assessment tool might aid psychiatrists toward a more accurate diagnosis and thus help reduce casualties. This project explores the relationship between speech features of spoken audio (reading) in Bahasa Malaysia and Beck Depression Inventory (BDI-II) scores. The speech features used in this project were Power Spectral Density (PSD), Mel-frequency Cepstral Coefficients (MFCC), transition parameters, formants and pitch. According to the analysis, the optimum combination of speech features for predicting BDI-II scores comprises PSD, MFCC and transition parameters. A linear regression approach with sequential forward/backward feature selection was used to predict BDI-II scores from reading speech. The results showed a mean absolute error (MAE) of 0.4096 for female reading speech; for males, all predicted BDI-II scores were within 1 point of the true scores, with an MAE of 0.098437. A prediction system called the Depression Severity Evaluator (DSE) was developed. The DSE correctly predicted the scores of one out of five subjects; although the prediction rate was low, the system predicted each person's score to within a maximum difference of 4.93. This demonstrates that the scores are not random numbers.
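The regression approach described in this abstract can be illustrated with a small sketch. This is not the authors' code: the synthetic data, the helper name `forward_select`, and the feature dimensions are our assumptions; it only shows how sequential forward selection with a linear model, evaluated by mean absolute error (MAE), might look.

```python
import numpy as np

def forward_select(X, y, k):
    """Greedy sequential forward selection for linear regression,
    ranking candidate features by in-sample mean absolute error (MAE)."""
    selected, remaining = [], list(range(X.shape[1]))
    best_mae = np.inf
    for _ in range(k):
        best_j, best_j_mae = None, np.inf
        for j in remaining:
            cols = selected + [j]
            A = np.column_stack([X[:, cols], np.ones(len(y))])  # add intercept
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)        # least-squares fit
            mae = np.mean(np.abs(A @ coef - y))
            if mae < best_j_mae:
                best_j, best_j_mae = j, mae
        selected.append(best_j)
        remaining.remove(best_j)
        best_mae = best_j_mae
    return selected, best_mae

# Toy stand-in for speech features (e.g. PSD, MFCC, transition parameters):
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 10))
y = 2.0 * X[:, 0] + X[:, 3] + rng.normal(scale=0.1, size=80)  # synthetic target scores

features, mae = forward_select(X, y, k=3)  # picks the informative columns 0 and 3
```

The greedy loop mirrors the forward variant described in the abstract; a backward variant would start from all features and drop the least useful one per step.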
Using structure to explore the sequence alignment space of remote homologs.
Kuziemko, Andrew; Honig, Barry; Petrey, Donald
2011-10-01
Protein structure modeling by homology requires an accurate sequence alignment between the query protein and its structural template. However, sequence alignment methods based on dynamic programming (DP) are typically unable to generate accurate alignments for remote sequence homologs, thus limiting the applicability of modeling methods. A central problem is that the alignment that is "optimal" in terms of the DP score does not necessarily correspond to the alignment that produces the most accurate structural model. That is, the correct alignment based on structural superposition will generally have a lower score than the optimal alignment obtained from sequence. Variations of the DP algorithm have been developed that generate alternative alignments that are "suboptimal" in terms of the DP score, but these still encounter difficulties in detecting the correct structural alignment. We present here a new alternative sequence alignment method that relies heavily on the structure of the template. By initially aligning the query sequence to individual fragments in secondary structure elements and combining high-scoring fragments that pass basic tests for "modelability", we can generate accurate alignments within a small ensemble. Our results suggest that the set of sequences that can currently be modeled by homology can be greatly extended.
Booker, Simon; Alfahad, Nawaf; Scott, Martin; Gooding, Ben; Wallace, W Angus
2015-01-01
To investigate the shoulder scoring systems used in Europe and North America and how outcomes might be classified after shoulder joint replacement. All research papers published in four major journals in 2012 and 2013 were reviewed for the shoulder scoring systems used. A method of using outcome evaluations to categorize patients into fair, good, very good and excellent outcomes after shoulder arthroplasty was explored using data from patients treated in our own unit. A total of 174 research articles published in the four journals used some form of shoulder scoring system. The outcome of shoulder arthroplasty in our unit has been evaluated using the Constant Score (CS) and the Oxford Shoulder Score, and these scores have been used to evaluate individual patient outcomes: a CS of < 30 = unsatisfactory; 30-39 = fair; 40-59 = good; 60-69 = very good; and 70 and over = excellent. The most popular shoulder scoring systems in North America were the Simple Shoulder Test and the American Shoulder and Elbow Surgeons standard shoulder assessment form score, and in Europe the CS, Oxford Shoulder Score and DASH score. PMID:25793164
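The outcome bands quoted in this abstract map directly onto a small classification helper. A minimal sketch (the function name is ours; the cut-offs are those stated by the authors):

```python
def classify_constant_score(cs: int) -> str:
    """Map a Constant Score to the outcome bands quoted in the abstract:
    <30 unsatisfactory, 30-39 fair, 40-59 good, 60-69 very good, >=70 excellent."""
    if cs < 30:
        return "unsatisfactory"
    elif cs < 40:
        return "fair"
    elif cs < 60:
        return "good"
    elif cs < 70:
        return "very good"
    else:
        return "excellent"
```

For example, `classify_constant_score(45)` falls in the 40-59 band and returns `"good"`.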
SCMPSP: Prediction and characterization of photosynthetic proteins based on a scoring card method.
Vasylenko, Tamara; Liou, Yi-Fan; Chen, Hong-An; Charoenkwan, Phasit; Huang, Hui-Ling; Ho, Shinn-Ying
2015-01-01
Photosynthetic proteins (PSPs) greatly differ in their structure and function, as they are involved in numerous subprocesses that take place inside an organelle called the chloroplast. Few studies predict PSPs from sequences owing to their high variety of sequences and structures. This work aims to predict and characterize PSPs by establishing datasets of PSP and non-PSP sequences and developing prediction methods. A novel bioinformatics method of predicting and characterizing PSPs based on a scoring card method (SCMPSP) was used. First, a dataset was established consisting of 649 PSPs, obtained using the Gene Ontology term GO:0015979, and 649 non-PSPs from the SwissProt database with sequence identity <= 25%. Several prediction methods are presented based on support vector machine (SVM), decision tree J48, Bayes, BLAST, and SCM. The SVM method using dipeptide features performed well and yielded a test accuracy of 72.31%. The SCMPSP method uses the estimated propensity scores of 400 dipeptides as PSPs and has a test accuracy of 71.54%, comparable to that of the SVM method. The derived propensity scores of 20 amino acids were further used to identify informative physicochemical properties for characterizing PSPs. The analytical results reveal the following four characteristics of PSPs: 1) PSPs favour amino acids with hydrophobic side chains; 2) PSPs are composed of amino acids prone to form helices in membrane environments; 3) PSPs have low interaction with water; and 4) PSPs prefer amino acids with electron-reactive side chains. The SCMPSP method not only estimates the propensity of a sequence to be a PSP but also discovers characteristics that further improve understanding of PSPs. The SCMPSP source code and the datasets used in this study are available at http://iclab.life.nctu.edu.tw/SCMPSP/.
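The core idea of a scoring card method is simple: a sequence is scored by the propensities of its overlapping dipeptides. A minimal sketch, assuming a lookup-table form of the method; the propensity values below are invented placeholders, not the paper's derived scores:

```python
def scm_score(sequence: str, propensity: dict) -> float:
    """Mean propensity over overlapping dipeptides of a protein sequence.
    Dipeptides absent from the card score 0.0."""
    dipeptides = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    return sum(propensity.get(d, 0.0) for d in dipeptides) / len(dipeptides)

# Hypothetical propensity card (the real SCMPSP card holds 400 optimized
# dipeptide scores derived from the PSP/non-PSP training sets):
card = {"AL": 0.9, "LL": 0.8, "GK": 0.1}
score = scm_score("ALLGK", card)  # dipeptides: AL, LL, LG, GK -> 0.45
```

A sequence whose mean score exceeds a trained threshold would be labeled a PSP; threshold selection is part of the training procedure, not shown here.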
Comparison of two scores for allocating resources to doctors in deprived areas.
Hutchinson, A; Foy, C; Sandhu, B
1989-11-04
Current proposals in the general practitioner contract include additional payments to doctors working among deprived populations. The underprivileged area score will be used to identify local authority wards with the greatest levels of deprivation, thus acting as the basis for distributing considerable resources. Two methods of identifying deprived populations--the underprivileged area score and the material deprivation score--were compared to determine whether they result in similar allocation of resources to regions. Financial allocations to regions based on figures derived from the contract differed considerably if the material deprivation score was used instead of the underprivileged area score: Northern and Mersey regions gained over 50% of their allocation whereas East Anglia, Oxford, and South West Thames regions lost more than 30% of theirs. Such differences have considerable implications for doctors working among deprived populations as up to 60m pounds each year might be distributed by these payments.
Symptoms of maternal depression immediately after delivery predict unsuccessful breast feeding.
Gagliardi, Luigi; Petrozzi, Angela; Rusconi, Franca
2012-04-01
Postnatal depression may interfere with breast feeding. This study tested the ability of the Edinburgh Postnatal Depression Scale (EPDS) to predict later breast feeding problems, hypothesising that risk of unsuccessful breast feeding increased with increasing EPDS scores, even at low values. The authors administered the EPDS on days 2-3 after delivery to 592 mothers of a healthy baby. Feeding method was recorded at 12-14 weeks. Median EPDS score was 5 (IQR 2-8); 15.7% of women scored >9. At 12-14 weeks, 50.7% of infants received full breast feeding, 21.0% mixed breast feeding and 28.4% bottle feeding. Mothers with higher EPDS scores were more likely to bottle feed at 3 months; the odds of bottle feeding increased with EPDS result, even at low scores (OR 1.06, 95% CI 1.01 to 1.11). Higher EPDS scores immediately after delivery were associated with later breast feeding failure.
Nursing Activities Score: nursing workload in a burns Intensive Care Unit
Camuci, Marcia Bernadete; Martins, Júlia Trevisan; Cardeli, Alexandrina Aparecida Maciel; Robazzi, Maria Lúcia do Carmo Cruz
2014-01-01
Objective: to evaluate the nursing workload in a Burns Intensive Care Unit according to the Nursing Activities Score. Method: an exploratory, descriptive cross-sectional study with a quantitative approach. The Nursing Activities Score was used for data collection between October 2011 and May 2012, totalling 1,221 measurements obtained from 50 patients' hospital records. Qualitative variables were described in tables; quantitative variables were summarized using statistical measures. Results: the mean Nursing Activities Score was 70.4% and the median was 70.3%, corresponding to the percentage of time spent on direct patient care in 24 hours. Conclusion: the Nursing Activities Score provided information on the process of caring for patients hospitalized in a Burns Intensive Care Unit and indicated a high workload for the nursing team of the sector studied. PMID:26107842
Kenya, Amilliah W.; Hart, John F.; Vuyiya, Charles K.
2016-01-01
Objective: This study compared National Board of Chiropractic Examiners part I test scores between students who did and did not serve as tutors on the subject matter. Methods: Students who had a prior grade point average of 3.45 or above on a 4.0 scale just before taking part I of the board exams were eligible to participate. A 2-sample t-test was used to ascertain the difference in the mean scores on part I between the tutor group (n = 28) and nontutor (n = 29) group. Results: Scores were higher in all subjects for the tutor group compared to the nontutor group and the differences were statistically significant (p < .01) with large effect sizes. Conclusion: The tutors in this study performed better on part I of the board examination compared to nontutors, suggesting that tutoring results in an academic benefit for tutors themselves. PMID:26998665
Brennan, Paul M; Murray, Gordon D; Teasdale, Graham M
2018-06-01
OBJECTIVE Glasgow Coma Scale (GCS) scores and pupil responses are key indicators of the severity of traumatic brain damage. The aim of this study was to determine what information would be gained by combining these indicators into a single index and to explore the merits of different ways of achieving this. METHODS Information about early GCS scores, pupil responses, late outcomes on the Glasgow Outcome Scale, and mortality were obtained at the individual patient level by reviewing data from the CRASH (Corticosteroid Randomisation After Significant Head Injury; n = 9045) study and the IMPACT (International Mission for Prognosis and Clinical Trials in TBI; n = 6855) database. These data were combined into a pooled data set for the main analysis. Methods of combining the Glasgow Coma Scale and pupil response data varied in complexity from using a simple arithmetic score (GCS score [range 3-15] minus the number of nonreacting pupils [0, 1, or 2]), which we call the GCS-Pupils score (GCS-P; range 1-15), to treating each factor as a separate categorical variable. The content of information about patient outcome in each of these models was evaluated using Nagelkerke's R². RESULTS Separately, the GCS score and pupil response were each related to outcome. Adding information about the pupil response to the GCS score increased the information yield. The performance of the simple GCS-P was similar to the performance of more complex methods of evaluating traumatic brain damage. The relationship between decreases in the GCS-P and deteriorating outcome was seen across the complete range of possible scores. The additional 2 lowest points offered by the GCS-Pupils scale (GCS-P 1 and 2) extended the information about injury severity from a mortality rate of 51% and an unfavorable outcome rate of 70% at GCS score 3 to a mortality rate of 74% and an unfavorable outcome rate of 90% at GCS-P 1. The paradoxical finding that GCS score 4 was associated with a worse outcome than GCS score 3 was not seen when using the GCS-P. CONCLUSIONS A simple arithmetic combination of the GCS score and pupillary response, the GCS-P, extends the information provided about patient outcome to an extent comparable to that obtained using more complex methods. The greater range of injury severities that are identified and the smoothness of the stepwise pattern of outcomes across the range of scores may be useful in evaluating individual patients and identifying patient subgroups. The GCS-P may be a useful platform onto which information about other key prognostic features can be added in a simple format likely to be useful in clinical practice.
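The GCS-P is a plain arithmetic combination, so it reduces to a one-line calculation. A minimal sketch of the formula as stated in the abstract (the function name and the validation checks are ours):

```python
def gcs_pupils(gcs: int, nonreactive_pupils: int) -> int:
    """GCS-Pupils score (GCS-P): GCS score (range 3-15) minus the number
    of nonreacting pupils (0, 1, or 2), giving a range of 1-15."""
    if not 3 <= gcs <= 15:
        raise ValueError("GCS score must be between 3 and 15")
    if nonreactive_pupils not in (0, 1, 2):
        raise ValueError("number of nonreacting pupils must be 0, 1, or 2")
    return gcs - nonreactive_pupils
```

For example, a patient with GCS 3 and both pupils nonreacting scores GCS-P 1, the new lowest point on the extended scale.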
Jahangard, Leila; Rahmani, Anahita; Haghighi, Mohammad; Ahmadpanah, Mohammad; Sadeghi Bahmani, Dena; Soltanian, Ali R; Shirzadi, Shahriar; Bajoghli, Hafez; Gerber, Markus; Holsboer-Trachsler, Edith; Brand, Serge
2017-01-01
Background: In the present study, we explored the associations between hypomania, symptoms of depression, sleep complaints, physical activity and mental toughness. The latter construct has gained interest for its association with a broad variety of favorable behavior in both clinical and non-clinical samples. Subjects and Methods: The non-clinical sample consisted of 206 young adults (M = 21.3 years; age range: 18-24 years; 57.3% males). They completed questionnaires covering hypomania, mental toughness, symptoms of depression, physical activity, and sleep quality. Results: Higher hypomania scores were associated with higher mental toughness, increased physical activity, fewer symptoms of depression and fewer sleep complaints. No gender differences were observed. Higher hypomania scores were predicted by higher scores on the mental toughness subscales of control and challenge, and by physical activity. Conclusion: The pattern of results suggests that among a non-clinical sample of young adults, self-rated hypomania scores were associated with higher scores on mental toughness and physical activity, along with lower depression and fewer sleep complaints. The pattern of results further suggests that hypomania traits are associated with a broad range of favorable psychological, behavioral and sleep-related traits, at least among a non-clinical sample of young adults.
Do We Really Become Smarter When Our Fluid-Intelligence Test Scores Improve?
Hayes, Taylor R.; Petrov, Alexander A.; Sederberg, Per B.
2014-01-01
Recent reports of training-induced gains on fluid intelligence tests have fueled an explosion of interest in cognitive training—now a billion-dollar industry. The interpretation of these results is questionable because score gains can be dominated by factors that play marginal roles in the scores themselves, and because intelligence gain is not the only possible explanation for the observed control-adjusted far transfer across tasks. Here we present novel evidence that the test score gains used to measure the efficacy of cognitive training may reflect strategy refinement instead of intelligence gains. A novel scanpath analysis of eye movement data from 35 participants solving Raven’s Advanced Progressive Matrices on two separate sessions indicated that one-third of the variance of score gains could be attributed to test-taking strategy alone, as revealed by characteristic changes in eye-fixation patterns. When the strategic contaminant was partialled out, the residual score gains were no longer significant. These results are compatible with established theories of skill acquisition suggesting that procedural knowledge tacitly acquired during training can later be utilized at posttest. Our novel method and result both underline a reason to be wary of purported intelligence gains, but also provide a way forward for testing for them in the future. PMID:25395695
The score statistic of the LD-lod analysis: detecting linkage adaptive to linkage disequilibrium.
Huang, J; Jiang, Y
2001-01-01
We study the properties of a modified lod score method for testing linkage that incorporates linkage disequilibrium (LD-lod). By examination of its score statistic, we show that the LD-lod score method adaptively combines two sources of information: (a) the IBD sharing score which is informative for linkage regardless of the existence of LD and (b) the contrast between allele-specific IBD sharing scores which is informative for linkage only in the presence of LD. We also consider the connection between the LD-lod score method and the transmission-disequilibrium test (TDT) for triad data and the mean test for affected sib pair (ASP) data. We show that, for triad data, the recessive LD-lod test is asymptotically equivalent to the TDT; and for ASP data, it is an adaptive combination of the TDT and the ASP mean test. We demonstrate that the LD-lod score method has relatively good statistical efficiency in comparison with the ASP mean test and the TDT for a broad range of LD and the genetic models considered in this report. Therefore, the LD-lod score method is an interesting approach for detecting linkage when the extent of LD is unknown, such as in a genome-wide screen with a dense set of genetic markers. Copyright 2001 S. Karger AG, Basel
Robu, Maria R; Edwards, Philip; Ramalhinho, João; Thompson, Stephen; Davidson, Brian; Hawkes, David; Stoyanov, Danail; Clarkson, Matthew J
2017-07-01
Minimally invasive surgery offers advantages over open surgery due to a shorter recovery time, less pain and trauma for the patient. However, inherent challenges such as lack of tactile feedback and difficulty in controlling bleeding lower the percentage of suitable cases. Augmented reality can show a better visualisation of sub-surface structures and tumour locations by fusing pre-operative CT data with real-time laparoscopic video. Such augmented reality visualisation requires a fast and robust video to CT registration that minimises interruption to the surgical procedure. We propose to use view planning for efficient rigid registration. Given the trocar position, a set of camera positions are sampled and scored based on the corresponding liver surface properties. We implement a simulation framework to validate the proof of concept using a segmented CT model from a human patient. Furthermore, we apply the proposed method on clinical data acquired during a human liver resection. The first experiment motivates the viewpoint scoring strategy and investigates reliable liver regions for accurate registrations in an intuitive visualisation. The second experiment shows wider basins of convergence for higher scoring viewpoints. The third experiment shows that a comparable registration performance can be achieved by at least two merged high scoring views and four low scoring views. Hence, the focus could change from the acquisition of a large liver surface to a small number of distinctive patches, thereby giving a more explicit protocol for surface reconstruction. We discuss the application of the proposed method on clinical data and show initial results. The proposed simulation framework shows promising results to motivate more research into a comprehensive view planning method for efficient registration in laparoscopic liver surgery.
Sexual dimorphism in human cranial trait scores: effects of population, age, and body size.
Garvin, Heather M; Sholts, Sabrina B; Mosca, Laurel A
2014-06-01
Sex estimation from the skull is commonly performed by physical and forensic anthropologists using a five-trait scoring system developed by Walker. Despite the popularity of this method, validation studies evaluating its accuracy across a variety of samples are lacking. Furthermore, it remains unclear what other intrinsic or extrinsic variables are related to the expression of these traits. In this study, cranial trait scores and postcranial measurements were collected from four diverse population groups (U.S. Whites, U.S. Blacks, medieval Nubians, and Arikara Native Americans) following Walker's protocols (total n = 499). Univariate and multivariate analyses were utilized to evaluate the accuracy of these traits in sex estimation, and to test for the effects of population, age, and body size on trait expressions. Results revealed significant effects of population on all trait scores. Sample-specific correct sex classification rates ranged from 74% to 94%, with an overall accuracy of 85% for the pooled sample. Classification performance varied among the traits (best for glabella and mastoid scores and worst for nuchal scores). Furthermore, correlations between traits were weak or nonsignificant, suggesting that different factors may influence individual traits. Some traits displayed correlations with age and/or postcranial size that were significant but weak, and within-population analyses did not reveal any consistent relationships between these traits across all groups. These results indicate that neither age nor body size plays a large role in trait expression, and thus does not need to be incorporated into sex estimation methods. Copyright © 2014 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Young, Jerry Wayne
The purpose of this study was to determine the effects of four instructional methods (direct instruction, computer-aided instruction, video observation, and microcomputer-based lab activities), gender, and time of testing (pretest, immediate posttest for determining the immediate effect of instruction, and a delayed posttest two weeks later to determine the retained effect of the instruction) on the achievement of sixth graders who were learning to interpret graphs of displacement and velocity. The dependent variable of achievement was reflected in the scores earned by students on a testing instrument of established validity and reliability. The 107 students participating in the study were divided by gender and were then randomly assigned to the four treatment groups, each taught by a different teacher. Each group had approximately equal numbers of males and females. The students were pretested and then involved in two class periods of the instructional method which was unique to their group. Immediately following treatment they were posttested and two weeks later they were posttested again. The data in the form of test scores were analyzed with a two-way split-plot analysis of variance to determine if there was significant interaction among technique, gender, and time of testing. When significant interaction was indicated, the Tukey HSD test was used to determine specific mean differences. The results of the analysis indicated no gender effect. Only students in the direct instruction group and the microcomputer-based laboratory group had significantly higher posttest-1 scores than pretest scores. They also had significantly higher posttest-2 scores than pretest scores. This suggests that the learning was retained. The other groups experienced no significant differences among pretest, posttest-1, and posttest-2 scores. 
Based on these results, direct instruction and microcomputer-based laboratory activities are recommended as effective stand-alone methods for teaching sixth-grade students to interpret graphs of displacement and velocity, while video and computer instruction may serve as supplemental activities.
Herasevich, Vitaly
2017-01-01
Background The new sepsis definition has increased the need for frequent sequential organ failure assessment (SOFA) score recalculation, and the clerical burden of information retrieval makes this score ideal for automated calculation. Objective The aim of this study was to (1) estimate the clerical workload of manual SOFA score calculation through a time-motion analysis and (2) describe a user-centered design process for an electronic medical record (EMR) integrated, automated SOFA score calculator with a subsequent usability evaluation study. Methods First, we performed a time-motion analysis by recording time-to-task-completion for the manual calculation of 35 baseline and 35 current SOFA scores by 14 internal medicine residents over a 2-month period. Next, we used an agile development process to create a user interface for a previously developed automated SOFA score calculator. The final user interface usability was evaluated by clinician end users with the Computer Systems Usability Questionnaire. Results The overall mean (standard deviation, SD) time to complete a manual SOFA score calculation was 61.6 s (33). Among the 24% (12/50) usability survey respondents, our user-centered user interface design process resulted in >75% favorability of survey items in the domains of system usability, information quality, and interface quality. Conclusions Early stakeholder engagement in our agile design process resulted in a user interface for an automated SOFA score calculator that reduced clinician workload and met clinicians’ needs at the point of care. Emerging interoperable platforms may facilitate dissemination of similarly useful clinical score calculators and decision support algorithms as “apps.” A user-centered design process and usability evaluation should be considered during creation of these tools. PMID:28526675
Doshi, Neena Piyush
2017-01-01
Team-based learning (TBL) combines small and large group learning by incorporating multiple small groups in a large group setting. It is a teacher-directed method that encourages student-student interaction. This study compares student learning and teaching satisfaction between conventional lecture and TBL in the subject of pathology. The present study aimed to assess the effectiveness of the TBL method of teaching over the conventional lecture. The present study was conducted in the Department of Pathology, GMERS Medical College and General Hospital, Gotri, Vadodara, Gujarat. The study population comprised 126 students of second-year MBBS, in their third semester of the academic year 2015-2016. "Hemodynamic disorders" was taught by the conventional method and "transfusion medicine" by the TBL method. The effectiveness of both methods was assessed. A posttest of multiple choice questions was conducted at the end of "hemodynamic disorders." Assessment of TBL was based on individual score, team score, and each member's contribution to the success of the team. The individual score and overall score were compared with the posttest score on "hemodynamic disorders." Feedback was taken from the students regarding their experience with TBL. Tukey's multiple comparisons test and an ANOVA summary were used to find the significance of score differences between the didactic and TBL methods. Student feedback was taken using a "Student Satisfaction Scale" based on the Likert scoring method. The mean student scores by didactic teaching, Individual Readiness Assurance Test (score "A"), and overall (score "D") were 49.8% (standard deviation [SD] 14.8), 65.6% (SD 10.9), and 65.6% (SD 13.8), respectively. The study showed positive educational outcomes in terms of knowledge acquisition, participation and engagement, and team performance with TBL.
Cognitive-behavioral screening reveals prevalent impairment in a large multicenter ALS cohort
Factor-Litvak, Pam; Goetz, Raymond; Lomen-Hoerth, Catherine; Nagy, Peter L.; Hupf, Jonathan; Singleton, Jessica; Woolley, Susan; Andrews, Howard; Heitzman, Daragh; Bedlack, Richard S.; Katz, Jonathan S.; Barohn, Richard J.; Sorenson, Eric J.; Oskarsson, Björn; Fernandes Filho, J. Americo M.; Kasarskis, Edward J.; Mozaffar, Tahseen; Rollins, Yvonne D.; Nations, Sharon P.; Swenson, Andrea J.; Koczon-Jaremko, Boguslawa A.; Mitsumoto, Hiroshi
2016-01-01
Objectives: To characterize the prevalence of cognitive and behavioral symptoms using a cognitive/behavioral screening battery in a large prospective multicenter study of amyotrophic lateral sclerosis (ALS). Methods: Two hundred seventy-four patients with ALS completed 2 validated cognitive screening tests and 2 validated behavioral interviews with accompanying caregivers. We examined the associations between cognitive and behavioral performance, demographic and clinical data, and C9orf72 mutation data. Results: Based on the ALS Cognitive Behavioral Screen cognitive score, 6.5% of the sample scored below the cutoff score for frontotemporal lobar dementia, 54.2% scored in a range consistent with ALS with mild cognitive impairment, and 39.2% scored in the normal range. The ALS Cognitive Behavioral Screen behavioral subscale identified 16.5% of the sample scoring below the dementia cutoff score, with an additional 14.1% scoring in the ALS behavioral impairment range, and 69.4% scoring in the normal range. Conclusions: This investigation revealed high levels of cognitive and behavioral impairment in patients with ALS within 18 months of symptom onset, comparable to prior investigations. This investigation illustrates the successful use and scientific value of adding a cognitive-behavioral screening tool in studies of motor neuron diseases, to provide neurologists with an efficient method to measure these common deficits and to understand how they relate to key clinical variables, when extensive neuropsychological examinations are unavailable. These tools, developed specifically for patients with motor impairment, may be particularly useful in patient populations with multiple sclerosis and Parkinson disease, who are known to have comorbid cognitive decline. PMID:26802094
Air pollution and asthma severity in adults
Rage, Estelle; Siroux, Valérie; Künzli, Nino; Pin, Isabelle; Kauffmann, Francine
2009-01-01
Objectives There is evidence that exposure to air pollution affects asthma, but the effect of air pollution on asthma severity has not been addressed. The aim was to assess the relation between asthma severity during the past 12 months and home outdoor concentrations of air pollution. Methods Asthma severity over the last 12 months was assessed in two complementary ways among 328 adult asthmatics from the French Epidemiological study on the Genetics and Environment of Asthma (EGEA) examined between 1991 and 1995. The 4-class severity score integrated clinical events and type of treatment. The 5-level asthma score is based only on the occurrence of symptoms. Nitrogen dioxide (NO2), sulphur dioxide (SO2) and ozone (O3) concentrations were assigned to each residence using two different methods. The first was based on the closest monitor data from 1991–1995. The second consisted of spatial models that used geostatistical interpolations and then assigned air pollutants to the geo-coded residences (1998). Results A higher asthma severity score was significantly related to the 8-hour average of ozone during April-September (O3-8hr) and the number of days (O3-days) with 8-hour ozone averages above 110 μg·m−3 (for a 36-day increase, equivalent to the interquartile range, in O3-days, odds ratio (95% confidence interval) 2.22 (1.61–3.07) for one class difference in score). Adjustment for age, sex, smoking habits, occupational exposure, and educational level did not alter the results. Asthma severity was unrelated to NO2. Both exposure assessment methods and both severity scores resulted in very similar findings. SO2 correlated with severity but reached statistical significance only for the model-based assignment of exposure. Conclusions The observed associations between asthma severity and air pollution, in particular O3, support the hypothesis that air pollution at levels far below current standards increases asthma severity. PMID:19017701
Model diagnostics in reduced-rank estimation
Chen, Kun
2016-01-01
Reduced-rank methods are very popular in high-dimensional multivariate analysis for conducting simultaneous dimension reduction and model estimation. However, the commonly-used reduced-rank methods are not robust, as the underlying reduced-rank structure can be easily distorted by only a few data outliers. Anomalies are bound to exist in big data problems, and in some applications they themselves could be of the primary interest. While naive residual analysis is often inadequate for outlier detection due to potential masking and swamping, robust reduced-rank estimation approaches could be computationally demanding. Under Stein's unbiased risk estimation framework, we propose a set of tools, including leverage score and generalized information score, to perform model diagnostics and outlier detection in large-scale reduced-rank estimation. The leverage scores give an exact decomposition of the so-called model degrees of freedom to the observation level, which lead to exact decomposition of many commonly-used information criteria; the resulting quantities are thus named information scores of the observations. The proposed information score approach provides a principled way of combining the residuals and leverage scores for anomaly detection. Simulation studies confirm that the proposed diagnostic tools work well. A pattern recognition example with hand-writing digital images and a time series analysis example with monthly U.S. macroeconomic data further demonstrate the efficacy of the proposed approaches. PMID:28003860
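The paper's central device, decomposing the model degrees of freedom to the observation level via leverage scores, can be illustrated in the simplest setting. This is a hedged sketch for ordinary least squares, not the paper's reduced-rank formulas: the diagonal of the hat matrix gives per-observation leverage scores whose sum recovers the model degrees of freedom exactly.

```python
# Hedged OLS analogue of the abstract's leverage-score decomposition:
# H = X (X'X)^{-1} X' is the hat matrix; its diagonal entries are the
# per-observation leverage scores, and trace(H) equals the number of
# fitted parameters, i.e. the model degrees of freedom.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                 # 50 observations, 3 predictors
H = X @ np.linalg.inv(X.T @ X) @ X.T
leverage = np.diag(H)

print(round(leverage.sum(), 6))              # trace(H) == 3, the model df
```

Observations with unusually large leverage are candidates for the masking-resistant outlier diagnostics the paper builds on top of this decomposition.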
NASA Astrophysics Data System (ADS)
Jensen-Ruopp, Helga Spitko
A comparison of hands-on inquiry instruction with lecture instruction was presented to 134 Patterns and Process Biology students. Students participated in seven biology lessons that were selected from Biology Survey of Living Things (1992). A pre- and post- paper-and-pencil assessment was used as the data collecting instrument. The treatment group was taught using hands-on inquiry strategies while the non-treatment group was taught using the lecture method of instruction. The team teaching model was used as the mode of presentation to both the treatment group and the non-treatment group. Achievement levels using specific criteria, novice (0% to 50%), developing proficiency (51% to 69%), accomplished (70% to 84%) and exceptional or mastery level (85% to 100%), were used as a guideline to tabulate the results of the pre and post assessment. Rubric tabulation was done to interpret the testing results. The raw data were plotted as percentage change in test score totals versus reading level score by gender, as well as percentage change in test score totals versus auditory vocabulary score by gender. A comparative box-and-whisker plot of individual pre- and posttest scores for the treatment and non-treatment groups was produced. Analysis of covariance (ANCOVA) using MINITAB Statistical Software version 14.11 was run on the data from the seven lessons, as well as on results by gender (male results, individual and combined, and female results, individual and combined). Normal probability plots for total scores as well as individual test scores were performed. The results suggest that hands-on inquiry based instruction, when presented to special needs students (including at-risk, English as a second language, limited English proficiency, and special education inclusive students), may enhance individual student achievement.
Knight, Sarah; Heinrich, Antje
2017-01-01
Inhibition—the ability to suppress goal-irrelevant information—is thought to be an important cognitive skill in many situations, including speech-in-noise (SiN) perception. One way to measure inhibition is by means of Stroop tasks, in which one stimulus dimension must be named while a second, more prepotent dimension is ignored. The to-be-ignored dimension may be relevant or irrelevant to the target dimension, and the inhibition measure—Stroop interference (SI)—is calculated as the reaction time difference between the relevant and irrelevant conditions. Both SiN perception and inhibition are suggested to worsen with age, yet attempts to connect age-related declines in these two abilities have produced mixed results. We suggest that the inconsistencies between studies may be due to methodological issues surrounding the use of Stroop tasks. First, the relationship between SI and SiN perception may differ depending on the modality of the Stroop task; second, the traditional SI measure may not account for generalized slowing or sensory declines, and thus may not provide a pure interference measure. We investigated both claims in a group of 50 older adults, who performed two Stroop tasks (visual and auditory) and two SiN perception tasks. For each Stroop task, we calculated interference scores using both the traditional difference measure and methods designed to address its various problems, and compared the ability of these different scoring methods to predict SiN performance, alone and in combination with hearing sensitivity. Results from the two Stroop tasks were uncorrelated and had different relationships to SiN perception. Changing the scoring method altered the nature of the predictive relationship between Stroop scores and SiN perception, which was additionally influenced by hearing sensitivity. These findings raise questions about the extent to which different Stroop tasks and/or scoring methods measure the same aspect of cognition. 
They also highlight the importance of considering additional variables such as hearing ability when analyzing cognitive variables. PMID:28367129
NASA Astrophysics Data System (ADS)
Zeyl, Timothy; Yin, Erwei; Keightley, Michelle; Chau, Tom
2016-04-01
Objective. Error-related potentials (ErrPs) have the potential to guide classifier adaptation in BCI spellers, for addressing non-stationary performance as well as for online optimization of system parameters, by providing imperfect or partial labels. However, the usefulness of ErrP-based labels for BCI adaptation has not been established in comparison to other partially supervised methods. Our objective is to make this comparison by retraining a two-step P300 speller on a subset of confident online trials using naïve labels taken from speller output, where confidence is determined either by (i) ErrP scores, (ii) posterior target scores derived from the P300 potential, or (iii) a hybrid of these scores. We further wish to evaluate the ability of partially supervised adaptation and retraining methods to adjust to a new stimulus-onset asynchrony (SOA), a necessary step towards online SOA optimization. Approach. Eleven consenting able-bodied adults attended three online spelling sessions on separate days with feedback in which SOAs were set at 160 ms (sessions 1 and 2) and 80 ms (session 3). A post hoc offline analysis and a simulated online analysis were performed on sessions two and three to compare multiple adaptation methods. Area under the curve (AUC) and symbols spelled per minute (SPM) were the primary outcome measures. Main results. Retraining using supervised labels confirmed improvements of 0.9 percentage points (session 2, p < 0.01) and 1.9 percentage points (session 3, p < 0.05) in AUC using same-day training data over using data from a previous day, which supports classifier adaptation in general. Significance. Using posterior target score alone as a confidence measure resulted in the highest SPM of the partially supervised methods, indicating that ErrPs are not necessary to boost the performance of partially supervised adaptive classification. 
Partial supervision significantly improved SPM at a novel SOA, showing promise for eventual online SOA optimization.
An International Ki67 Reproducibility Study
2013-01-01
Background In breast cancer, immunohistochemical assessment of proliferation using the marker Ki67 has potential use in both research and clinical management. However, lack of consistency across laboratories has limited Ki67’s value. A working group was assembled to devise a strategy to harmonize Ki67 analysis and increase scoring concordance. Toward that goal, we conducted a Ki67 reproducibility study. Methods Eight laboratories received 100 breast cancer cases arranged into 1-mm core tissue microarrays—one set stained by the participating laboratory and one set stained by the central laboratory, both using antibody MIB-1. Each laboratory scored Ki67 as the percentage of positively stained invasive tumor cells using its own method. Six laboratories repeated scoring of 50 locally stained cases on 3 different days. Sources of variation were analyzed using random effects models with log2-transformed measurements. Reproducibility was quantified by the intraclass correlation coefficient (ICC), and approximate two-sided 95% confidence intervals (CIs) for the true intraclass correlation coefficients in these experiments were provided. Results Intralaboratory reproducibility was high (ICC = 0.94; 95% CI = 0.93 to 0.97). Interlaboratory reproducibility was only moderate (central staining: ICC = 0.71, 95% CI = 0.47 to 0.78; local staining: ICC = 0.59, 95% CI = 0.37 to 0.68). The geometric mean of Ki67 values for each laboratory across the 100 cases ranged from 7.1% to 23.9% with central staining and from 6.1% to 30.1% with local staining. Factors contributing to interlaboratory discordance included tumor region selection, counting method, and subjective assessment of staining positivity. Formal counting methods gave more consistent results than visual estimation. Conclusions Substantial variability in Ki67 scoring was observed among some of the world’s most experienced laboratories. 
Ki67 values and cutoffs for clinical decision-making cannot be transferred between laboratories without standardizing scoring methodology because analytical validity is limited. PMID:24203987
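For readers unfamiliar with the reproducibility metric reported above, here is a hedged sketch of a one-way random-effects ICC, a simpler relative of the random-effects models used in the study; the Ki67-style percentage scores are hypothetical.

```python
# Hedged sketch: one-way random-effects ICC, ICC(1), for n cases each scored
# by k raters. High values mean between-case variation dominates rater noise.
import numpy as np

def icc_oneway(scores):
    """scores: (n_cases, k_raters) array-like of measurements."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand = scores.mean()
    case_means = scores.mean(axis=1)
    # Between-case and within-case mean squares from one-way ANOVA
    msb = k * ((case_means - grand) ** 2).sum() / (n - 1)
    msw = ((scores - case_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical Ki67 % scores: 4 cases, each scored by 2 raters
scores = [[10, 12], [30, 28], [55, 60], [80, 75]]
print(round(icc_oneway(scores), 3))  # -> 0.992
```

The study's design (cases crossed with laboratories and days) calls for richer variance-component models, but the interpretation of the resulting ICCs is the same.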
Gelderman, H T; Boer, L; Naujocks, T; IJzermans, A C M; Duijst, W L J M
2018-05-01
The decomposition process of human remains can be used to estimate the post-mortem interval (PMI), but decomposition varies due to many factors. Temperature is believed to be the most important and can be connected to decomposition by using accumulated degree days (ADD). The aim of this research was to develop a decomposition scoring method and to develop a formula to estimate the PMI by using the developed decomposition scoring method and ADD. A decomposition scoring method and a Book of Reference (visual resource) were made. Ninety-one cases were used to develop a method to estimate the PMI. The photographs were scored using the decomposition scoring method. The temperature data were provided by the Royal Netherlands Meteorological Institute. The PMI was estimated using the total decomposition score (TDS) and using the TDS and ADD. The latter required an additional step, namely to calculate the ADD from the finding date back until the predicted day of death. The developed decomposition scoring method had a high interrater reliability. The TDS significantly estimates the PMI (R² = 0.67 and 0.80 for indoor and outdoor bodies, respectively). When using the ADD, the R² decreased to 0.66 and 0.56. The developed decomposition scoring method is a practical method to measure decomposition for human remains found on land. The PMI can be estimated using this method, but caution is advised in cases with a long PMI. The ADD does not account for all the heat present in decomposing remains and is therefore a possible source of bias.
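The "additional step" the abstract mentions, walking the ADD back from the finding date until the predicted day of death, can be sketched as follows. This is a hedged illustration, not the authors' formula: the 0 °C floor is a common degree-day convention, and the temperature series is hypothetical.

```python
# Hedged sketch of the ADD bookkeeping step: accumulate mean daily
# temperatures (floored at 0 C) backwards from the finding date until the
# total reaches the ADD predicted from the decomposition score; the number
# of days stepped back is the estimated PMI in days.
def pmi_from_add(target_add, daily_mean_temps_back_from_finding):
    """Temperatures in C, most recent day first (hypothetical series)."""
    total, days = 0.0, 0
    for t in daily_mean_temps_back_from_finding:
        if total >= target_add:
            break
        total += max(t, 0.0)  # sub-zero days contribute no degree days
        days += 1
    return days

temps = [12.0, 11.5, 9.0, 8.0, 10.0, 13.0, 12.5]  # hypothetical daily means
print(pmi_from_add(40.0, temps))  # -> 4
```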
Estimating Total-Test Scores from Partial Scores in a Matrix Sampling Design.
ERIC Educational Resources Information Center
Sachar, Jane; Suppes, Patrick
1980-01-01
The present study compared six methods, two of which utilize the content structure of items, to estimate total-test scores using 450 students and 60 items of the 110-item Stanford Mental Arithmetic Test. Three methods yielded fairly good estimates of the total-test score. (Author/RL)
Mining for class-specific motifs in protein sequence classification
2013-01-01
Background In protein sequence classification, identification of the sequence motifs or n-grams that can precisely discriminate between classes is a more interesting scientific question than the classification itself. A number of classification methods aim at accurate classification but fail to explain which sequence features indeed contribute to the accuracy. We hypothesize that sequences in lower denominations (n-grams) can be used to explore the sequence landscape and to identify class-specific motifs that discriminate between classes during classification. Discriminative n-grams are short peptide sequences that are highly frequent in one class but are either minimally present or absent in other classes. In this study, we present a new substitution-based scoring function for identifying discriminative n-grams that are highly specific to a class. Results We present a scoring function based on discriminative n-grams that can effectively discriminate between classes. The scoring function, initially, harvests the entire set of 4- to 8-grams from the protein sequences of different classes in the dataset. Similar n-grams of the same size are combined to form new n-grams, where the similarity is defined by positive amino acid substitution scores in the BLOSUM62 matrix. Substitution has resulted in a large increase in the number of discriminatory n-grams harvested. Due to the unbalanced nature of the dataset, the frequencies of the n-grams are normalized using a dampening factor, which gives more weightage to the n-grams that appear in fewer classes and vice-versa. After the n-grams are normalized, the scoring function identifies discriminative 4- to 8-grams for each class that are frequent enough to be above a selection threshold. By mapping these discriminative n-grams back to the protein sequences, we obtained contiguous n-grams that represent short class-specific motifs in protein sequences. 
Our method fared well compared to an existing motif-finding method known as Wordspy. We validated our enriched set of class-specific motifs against the functionally important motifs obtained from the NLSdb, Prosite and ELM databases. We demonstrate that this method is very generic and thus can be widely applied to detect class-specific motifs in many protein sequence classification tasks. Conclusion The proposed scoring function and methodology are able to identify class-specific motifs using discriminative n-grams derived from the protein sequences. The implementation of amino acid substitution scores for similarity detection, and the dampening factor to normalize the unbalanced datasets, have a significant effect on the performance of the scoring function. Our multipronged validation tests demonstrate that this method can detect class-specific motifs from a wide variety of protein sequence classes, with potential application to detecting proteome-specific motifs of different organisms. PMID:23496846
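A hedged sketch of the discriminative n-gram idea follows, simplified relative to the paper: there is no BLOSUM62 merging of similar n-grams, and a toy exponent stands in for the dampening factor. Class names and sequences are invented for illustration.

```python
# Hedged sketch: score each n-gram by its frequency in a class, dampened by
# the number of classes it appears in, so class-specific n-grams score high.
from collections import Counter

def ngrams(seq, n):
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]

def discriminative_scores(classes, n=4, damp=0.5):
    """classes: dict class_name -> list of sequences (toy data)."""
    counts = {c: Counter(g for s in seqs for g in ngrams(s, n))
              for c, seqs in classes.items()}
    scores = {}
    for c, cnt in counts.items():
        for g, freq in cnt.items():
            presence = sum(g in other for other in counts.values())
            scores.setdefault(c, {})[g] = freq / (presence ** damp)
    return scores

classes = {"kinase": ["MKKLLPTA", "AKKLLPSG"], "protease": ["MSTNPKPQ"]}
s = discriminative_scores(classes)
print(max(s["kinase"], key=s["kinase"].get))  # most kinase-specific 4-gram
```

In the paper, similar n-grams (positive BLOSUM62 substitution scores) are first merged, which greatly enlarges the pool of candidate discriminative n-grams before this kind of scoring is applied.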
NASA Astrophysics Data System (ADS)
Mardi Safitri, Dian; Arfi Nabila, Zahra; Azmi, Nora
2018-03-01
Musculoskeletal disorders (MSD) are one of the ergonomic risks arising from manual activity, non-neutral posture, and repetitive motion. The purpose of this study is to measure risk and implement ergonomic interventions to reduce the risk of MSD at the paper pallet assembly work station. Work posture was measured with the Ovako Working Posture Analysis System (OWAS) method and the Rapid Entire Body Assessment (REBA) method, while work repetitiveness was measured using the Strain Index (SI) method. Assembly process operators were identified as having the highest risk level: the OWAS score, Strain Index, and REBA values were 4, 20.25, and 11, respectively. Ergonomic improvements are needed to reduce that level of risk. Proposed improvements were developed using the Quality Function Deployment (QFD) method applied with the Axiomatic House of Quality (AHOQ) and a morphological chart. As a result, the risk level based on the OWAS and REBA scores decreased from 4 and 11 to 1 and 2, respectively. Biomechanical analysis of the operator also shows decreased values for L4-L5 moment, compression, joint shear, and joint moment strength.
Evaluation of a Teaching Assistant Program for Third-Year Pharmacy Students.
Bradley, Courtney L; Khanova, Julia; Scolaro, Kelly L
2016-11-25
Objectives. To determine if a teaching assistant (TA) program for third-year pharmacy students (PY3s) improves confidence in teaching abilities. Additionally, 3 assessment methods (faculty, student, and TA self-evaluations) were compared for similarities and correlations. Methods. An application and interview process was used to select 21 pharmacy students to serve as TAs for the Pharmaceutical Care Laboratory course for 2 semesters. Participants' self-perceived confidence in teaching abilities was assessed at the start, midpoint, and conclusion of the program. The relationships between the scores from the 3 assessment methods were analyzed. Results. All 21 TAs agreed to participate in the study and completed the 2 teaching semesters. The TAs' confidence in overall teaching abilities increased significantly (80.7 vs 91.4, p < 0.001). There was a significant difference between the three assessment scores in the fall (p = 0.027) and spring (p < 0.001) semesters. However, no correlation was found among the assessment scores. Conclusions. The TA program was effective in improving confidence in teaching abilities. The lack of correlation among the assessment methods highlights the importance of various forms of feedback.
NASA Astrophysics Data System (ADS)
Widyaningsih, Purnami; Retno Sari Saputro, Dewi; Nugrahani Putri, Aulia
2017-06-01
The GWOLR model combines geographically weighted regression (GWR) and ordinal logistic regression (OLR) models. Its parameter estimation employs maximum likelihood estimation. Such parameter estimation, however, yields a difficult-to-solve system of nonlinear equations, and therefore a numerical approximation approach is required. The iterative approximation approach, in general, uses the Newton-Raphson (NR) method. The NR method has a disadvantage: its Hessian matrix must be recomputed from the second derivatives at each iteration, so it does not always produce converging results. With regard to this matter, the NR method is modified by substituting the Fisher information matrix for its Hessian matrix, a modification termed Fisher scoring (FS). The present research seeks to determine GWOLR model parameter estimation using the Fisher scoring method and apply the estimation to data on the level of vulnerability to Dengue Hemorrhagic Fever (DHF) in Semarang. The research concludes that health facilities make the greatest contribution to the probability of the number of DHF sufferers in both villages. Based on the number of sufferers, the IR category of DHF in both villages can be determined.
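The Hessian-to-Fisher-information substitution described above is easiest to see in plain binary logistic regression, where for the canonical logit link the Fisher information X'WX coincides with the negative expected Hessian. This is a hedged sketch of that simpler case, not the GWOLR estimator itself; the simulated data and coefficients are invented.

```python
# Hedged sketch: Fisher scoring for binary logistic regression. Each update
# solves (X' W X) step = X'(y - p), using the Fisher information X' W X in
# place of the observed Hessian, which is the substitution the abstract
# describes; GWOLR layers geographic weights and ordinal categories on top.
import numpy as np

def fisher_scoring_logit(X, y, iters=25):
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1 - p)                      # diagonal of the weight matrix
        info = X.T @ (W[:, None] * X)        # Fisher information X' W X
        beta = beta + np.linalg.solve(info, X.T @ (y - p))
    return beta

rng = np.random.default_rng(1)
X = np.c_[np.ones(200), rng.normal(size=200)]   # intercept + one covariate
true_beta = np.array([-0.5, 1.0])
y = (rng.random(200) < 1 / (1 + np.exp(-X @ true_beta))).astype(float)
print(fisher_scoring_logit(X, y).round(2))  # estimates near [-0.5, 1.0]
```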
Meta-heuristic CRPS minimization for the calibration of short-range probabilistic forecasts
NASA Astrophysics Data System (ADS)
Mohammadi, Seyedeh Atefeh; Rahmani, Morteza; Azadi, Majid
2016-08-01
This paper deals with the probabilistic short-range temperature forecasts over synoptic meteorological stations across Iran using non-homogeneous Gaussian regression (NGR). NGR creates a Gaussian forecast probability density function (PDF) from the ensemble output. The mean of the normal predictive PDF is a bias-corrected weighted average of the ensemble members and its variance is a linear function of the raw ensemble variance. The coefficients for the mean and variance are estimated by minimizing the continuous ranked probability score (CRPS) during a training period. CRPS is a scoring rule for distributional forecasts. In the paper of Gneiting et al. (Mon Weather Rev 133:1098-1118, 2005), Broyden-Fletcher-Goldfarb-Shanno (BFGS) method is used to minimize the CRPS. Since BFGS is a conventional optimization method with its own limitations, we suggest using the particle swarm optimization (PSO), a robust meta-heuristic method, to minimize the CRPS. The ensemble prediction system used in this study consists of nine different configurations of the weather research and forecasting model for 48-h forecasts of temperature during autumn and winter 2011 and 2012. The probabilistic forecasts were evaluated using several common verification scores including Brier score, attribute diagram and rank histogram. Results show that both BFGS and PSO find the optimal solution and show the same evaluation scores, but PSO can do this with a feasible random first guess and much less computational complexity.
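The quantity being minimized has a closed form for a Gaussian predictive distribution, which is what makes NGR training tractable for either BFGS or PSO: the optimizer only searches over the coefficients that feed the mean and variance below. A hedged sketch of that closed-form CRPS:

```python
# Hedged sketch: closed-form CRPS of a Gaussian forecast N(mu, sigma^2) at
# observation y. NGR fits its mean/variance coefficients by minimizing the
# average of this quantity over a training period.
import math

def crps_gaussian(mu, sigma, y):
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)      # standard normal pdf
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))               # standard normal cdf
    return sigma * (z * (2 * cdf - 1) + 2 * pdf - 1 / math.sqrt(math.pi))

# Perfectly centred forecast: CRPS reduces to sigma*(2/sqrt(2*pi) - 1/sqrt(pi))
print(round(crps_gaussian(0.0, 1.0, 0.0), 4))  # -> 0.2337
```

CRPS is negatively oriented (smaller is better) and, unlike a pure accuracy score, rewards both calibration and sharpness of the predictive distribution.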
Prognostic Value of AIMS65 Score in Cirrhotic Patients with Upper Gastrointestinal Bleeding.
Gaduputi, Vinaya; Abdulsamad, Molham; Tariq, Hassan; Rafeeq, Ahmed; Abbas, Naeem; Kumbum, Kavitha; Chilimuri, Sridhar
2014-01-01
Introduction. Unlike the Rockall scoring system, AIMS65 is based only on clinical and laboratory features. In this study we investigated the correlation between the AIMS65 score and the Endoscopic Rockall score, in cirrhotic and noncirrhotic patients. Methods. This is a retrospective study of patients admitted with overt UGIB and undergoing esophagogastroduodenoscopy (EGD). AIMS65 and Rockall scores were calculated at the time of admission. We investigated the correlation between both scores along with stigmata of bleed seen on endoscopy. Results. A total of 1255 patients were studied; 152 patients were cirrhotic while 1103 patients were noncirrhotic. There was significant correlation between AIMS65 and Total Rockall scores in patients of both groups. There was significant correlation between the AIMS65 score and the Endoscopic Rockall score in noncirrhotics but not cirrhotics. AIMS65 scores in both cirrhotic and noncirrhotic groups were significantly higher in patients who died from UGIB than in patients who did not. Conclusion. We observed a statistically significant correlation between AIMS65 score and length of hospitalization and mortality in noncirrhotic patients. We found that the AIMS65 score paralleled the endoscopic grading of the lesion causing UGIB in noncirrhotics. The AIMS65 score correlated only with mortality, but not with length of hospitalization or endoscopic stigmata of bleed, in cirrhotics.
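The clinical appeal noted in the abstract, that AIMS65 needs only clinical and laboratory items available at admission (no endoscopy), is visible in how little the tally requires. A hedged sketch using the commonly cited AIMS65 criteria; the example values are invented:

```python
# Hedged sketch of the AIMS65 tally: one point each for Albumin < 3.0 g/dL,
# INR > 1.5, altered Mental status, Systolic BP <= 90 mmHg, and age > 65.
def aims65(albumin_g_dl, inr, altered_mental_status, sbp_mmhg, age_years):
    return (int(albumin_g_dl < 3.0)        # A: albumin < 3.0 g/dL
            + int(inr > 1.5)               # I: INR > 1.5
            + int(altered_mental_status)   # M: altered mental status
            + int(sbp_mmhg <= 90)          # S: systolic BP <= 90 mmHg
            + int(age_years > 65))         # 65: age > 65 years

print(aims65(2.8, 1.7, False, 100, 72))  # -> 3
```

Higher tallies predict higher in-hospital mortality; the Rockall score, by contrast, incorporates endoscopic findings in its total form.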
The Kernel Levine Equipercentile Observed-Score Equating Function. Research Report. ETS RR-13-38
ERIC Educational Resources Information Center
von Davier, Alina A.; Chen, Haiwen
2013-01-01
In the framework of the observed-score equating methods for the nonequivalent groups with anchor test design, there are 3 fundamentally different ways of using the information provided by the anchor scores to equate the scores of a new form to those of an old form. One method uses the anchor scores as a conditioning variable, such as the Tucker…
Comparison of Comorbidity Collection Methods
Kallogjeri, Dorina; Gaynor, Sheila M; Piccirillo, Marilyn L; Jean, Raymond A; Spitznagel, Edward L; Piccirillo, Jay F
2014-01-01
Background Multiple valid comorbidity indices exist to quantify the presence and role of comorbidities in cancer patient survival. Our goal was to compare the chart-based Adult Comorbidity Evaluation-27 index (ACE-27) and claims-based Charlson Comorbidity Index (CCI) methods of identifying comorbid ailments, and their prognostic ability. Study Design Prospective cohort study of 6138 newly-diagnosed cancer patients at 12 different institutions. Participating registrars were trained to collect comorbidities from the abstracted chart using the ACE-27 method. ACE-27 assessment was compared with comorbidities captured through hospital discharge face-sheets using ICD coding. The prognostic performance of each comorbidity method was examined using follow-up data assessed at 24 months after data abstraction. Results The distribution of the ACE-27 scores was: “None” for 1453 (24%) of the patients; “Mild” for 2388 (39%); “Moderate” for 1344 (22%); and “Severe” for 950 (15%) of the patients. Deyo’s adaptation of the Charlson Comorbidity Index (CCI) identified 4265 (69%) patients with a CCI score of 0, and the remaining 31% had CCI scores of 1 (n=1341, 22%), 2 (n=365, 6%), or 3 or more (n=167, 3%). Of the 4265 patients with a CCI score of 0, 394 (9%) were coded with severe comorbidities based on the ACE-27 method. A higher comorbidity score was significantly associated with higher risk of death for both comorbidity indices. The multivariable Cox model including both comorbidity indices had the best performance (Nagelkerke’s R-square=0.37) and the best discrimination (c-index=0.827). Conclusion The number, type, and overall severity of comorbid ailments identified by chart- and claims-based approaches in newly-diagnosed cancer patients were notably different. Both indices were prognostically significant and able to provide unique prognostic information. PMID:24933715
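Both indices ultimately reduce to weighted tallies of conditions. As a hedged sketch, here is the additive logic of the Charlson index with an abbreviated subset of the original Charlson weights; the study used Deyo's ICD-9-CM adaptation, whose code-to-condition mapping is omitted here.

```python
# Hedged sketch: the CCI is the sum of condition weights. This table is an
# abbreviated, illustrative subset of the original Charlson weights; a
# claims-based implementation first maps ICD codes to these conditions.
CHARLSON_WEIGHTS = {
    "myocardial_infarction": 1, "congestive_heart_failure": 1,
    "copd": 1, "diabetes": 1, "mild_liver_disease": 1,
    "hemiplegia": 2, "renal_disease": 2, "any_malignancy": 2,
    "moderate_severe_liver_disease": 3,
    "metastatic_solid_tumor": 6, "aids": 6,
}

def cci(conditions):
    """Sum the weights of the patient's recorded conditions."""
    return sum(CHARLSON_WEIGHTS[c] for c in conditions)

print(cci(["diabetes", "renal_disease"]))  # -> 3
```

The ACE-27, by contrast, grades each ailment's severity (none/mild/moderate/severe) from the chart rather than summing fixed weights, which is one reason the two methods disagreed on so many patients.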
Pilot study of an automated method to determine Melasma Area and Severity Index.
Tay, E Y; Gan, E Y; Tan, V W D; Lin, Z; Liang, Y; Lin, F; Wee, S; Thng, T G
2015-06-01
Objective outcome measures for melasma severity are essential for evaluating both severity and treatment results. The modified Melasma Area and Severity Index (mMASI) score is a validated tool for assessing melasma severity but is subject to inter-observer variability. Our aim was to develop and validate novel image analysis software designed to automatically calculate the area and degree of hyperpigmentation in melasma from computer analysis of whole-face digital photographs, thereby deriving an automated mMASI score (aMASI). The algorithm was developed in collaboration between dermatologists and image analysis experts. First, using an adaptive threshold method, the algorithm identifies and segments the involved regions and calculates their areas. It then calculates the degree of darkness. Finally, the derived area and darkness are used to calculate the mMASI. The scores derived from the algorithm were validated prospectively. Twenty-nine patients with melasma who were using depigmenting agents were recruited for validation. Three dermatologists scored the mMASI at baseline and post-treatment using standardized photographs. These scores were compared with aMASI scores derived from computer analysis. aMASI scores correlated well with clinical mMASI pre-treatment (r = 0·735, P < 0·001) and post-treatment (r = 0·608, P < 0·001). aMASI was reliable in detecting changes with treatment; these changes in aMASI scores correlated well with changes in clinician-assessed mMASI (r = 0·622, P < 0·001). This study proposes a novel approach to melasma scoring using digital image analysis. It holds promise as a tool that would enable clinicians worldwide to standardize melasma severity scoring and outcome measures in an easy and reproducible manner, enabling different treatment options to be compared accurately. © 2015 British Association of Dermatologists.
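The mMASI referenced above is, for each of four facial regions (forehead, left malar, right malar, chin), the product of an area score (0–6) and a darkness score (0–4), weighted by the region's share of the face. A minimal sketch of that weighted sum (function name and dictionary layout are ours; the weights follow the published mMASI definition):

```python
def mmasi(area, darkness):
    """Modified MASI: weighted sum of area (0-6) x darkness (0-4)
    over four facial regions: forehead (f), right malar (rm),
    left malar (lm), chin (c)."""
    weights = {"f": 0.3, "rm": 0.3, "lm": 0.3, "c": 0.1}
    return sum(weights[r] * area[r] * darkness[r] for r in weights)
```

The automated variant (aMASI) would replace the clinician-assigned area and darkness inputs with values measured by the image analysis algorithm.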
Towards accurate modeling of noncovalent interactions for protein rigidity analysis.
Fox, Naomi; Streinu, Ileana
2013-01-01
Protein rigidity analysis is an efficient computational method for extracting flexibility information from static, X-ray crystallography protein data. Atoms and bonds are modeled as a mechanical structure and analyzed with a fast graph-based algorithm, producing a decomposition of the flexible molecule into interconnected rigid clusters. The result depends critically on noncovalent atomic interactions, primarily on how hydrogen bonds and hydrophobic interactions are computed and modeled. Ongoing research points to the stringent need for benchmarking rigidity analysis software systems, towards the goal of increasing their accuracy and validating their results, both against each other and against biologically relevant (functional) parameters. We propose two new methods for modeling hydrogen bonds and hydrophobic interactions that more accurately reflect a mechanical model, without being computationally more intensive. We evaluate them using a novel scoring method, based on the B-cubed score from the information retrieval literature, which measures how well two cluster decompositions match. To evaluate the modeling accuracy of KINARI, our pebble-game rigidity analysis system, we use a benchmark data set of 20 proteins, each with multiple distinct conformations deposited in the Protein Data Bank. Cluster decompositions for them were previously determined with the RigidFinder method from Gerstein's lab and validated against experimental data. When KINARI's default tuning parameters are used, an improvement of the B-cubed score over a crude baseline is observed in 30% of this data. With our new modeling options, improvements were observed in over 70% of the proteins in this data set. We investigate the sensitivity of the cluster decomposition score with case studies on pyruvate phosphate dikinase and calmodulin.
To substantially improve the accuracy of protein rigidity analysis systems, thorough benchmarking must be performed on all current systems and future extensions. We have measured the gain in performance by comparing different modeling methods for noncovalent interactions. We showed that new criteria for modeling hydrogen bonds and hydrophobic interactions can significantly improve the results. The two new methods proposed here have been implemented and made publicly available in the current version of KINARI (v1.3), together with the benchmarking tools, which can be downloaded from our software's website, http://kinari.cs.umass.edu.
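The B-cubed comparison used above can be sketched as follows: for each item, take the cluster containing it in each decomposition and score the overlap; precision and recall are the per-item averages. Names and the exact averaging are ours; KINARI's variant may differ in detail:

```python
from collections import defaultdict

def b_cubed(pred, gold):
    """B-cubed precision/recall/F1 between two cluster decompositions,
    given as dicts mapping item -> cluster label."""
    def member_sets(assign):
        # invert item->label into item -> set of co-clustered items
        inv = defaultdict(set)
        for item, lab in assign.items():
            inv[lab].add(item)
        return {item: inv[lab] for item, lab in assign.items()}

    pc, gc = member_sets(pred), member_sets(gold)
    items = pred.keys() & gold.keys()
    prec = sum(len(pc[i] & gc[i]) / len(pc[i]) for i in items) / len(items)
    rec = sum(len(pc[i] & gc[i]) / len(gc[i]) for i in items) / len(items)
    return prec, rec, 2 * prec * rec / (prec + rec)
```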
The effect of kangaroo mother care on mental health of mothers with low birth weight infants
Badiee, Zohreh; Faramarzi, Salar; MiriZadeh, Tahereh
2014-01-01
Background: The mothers of premature infants are at risk of psychological stress because of separation from their infants. One of the methods influencing maternal mental health in the postpartum period is kangaroo mother care (KMC). This study was conducted to evaluate the effect of KMC of low birth weight infants on maternal mental health. Materials and Methods: The study was conducted in the Department of Pediatrics of Isfahan University of Medical Sciences, Isfahan, Iran. Premature infants were randomly allocated to two groups. The control group received standard care in the incubator. In the experimental group, infants received three 60-min sessions of KMC daily for 1 week. Mental health scores of the mothers were evaluated using the 28-item General Health Questionnaire. Statistical analysis was performed by analysis of covariance using SPSS. Results: In total, the scores of 50 mother-infant pairs were analyzed (25 in the KMC group and 25 in the standard care group). The covariance analysis showed a positive effect of KMC on maternal mental health scores: there were statistically significant differences between the mean scores of the experimental and control groups in the posttest period (P < 0.001). Conclusion: KMC for low birth weight infants is a safe way to improve maternal mental health and is suggested as a useful method for improving the mental health of mothers. PMID:25371871
ERIC Educational Resources Information Center
Rapp, John T.; Carroll, Regina A.; Stangeland, Lindsay; Swanson, Greg; Higgins, William J.
2011-01-01
The authors evaluated the extent to which interobserver agreement (IOA) scores, using the block-by-block method for events scored with continuous duration recording (CDR), were higher when the data from the same sessions were converted to discontinuous methods. Sessions with IOA scores of 89% or less with CDR were rescored using 10-s partial…
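Block-by-block IOA partitions a session into short blocks (10 s in the study above) and scores each block as the smaller of the two observers' values divided by the larger, averaging across blocks. A sketch for duration data (names and the both-zero convention are ours):

```python
def block_by_block_ioa(obs1, obs2):
    """Block-by-block interobserver agreement (percent) for duration
    data: obs1/obs2 give each observer's recorded seconds per block."""
    scores = []
    for d1, d2 in zip(obs1, obs2):
        if d1 == d2:                      # includes blocks both scored 0
            scores.append(1.0)
        else:
            scores.append(min(d1, d2) / max(d1, d2))
    return 100 * sum(scores) / len(scores)
```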
Pilger, Jens; Mazur, Adam; Monecke, Peter; Schreuder, Herman; Elshorst, Bettina; Bartoschek, Stefan; Langer, Thomas; Schiffer, Alexander; Krimm, Isabelle; Wegstroth, Melanie; Lee, Donghan; Hessler, Gerhard; Wendt, K-Ulrich; Becker, Stefan; Griesinger, Christian
2015-05-26
Structure-based drug design (SBDD) is a powerful and widely used approach to optimize the affinity of drug candidates. With the recently introduced INPHARMA method, the binding mode of small molecules to their protein target can be characterized even if no spectroscopic information about the protein is known. Here, we show that combining the spin-diffusion-based NMR methods INPHARMA, trNOE, and STD results in an accurate scoring function for docking modes and therefore for the determination of protein-ligand complex structures. Applications are shown on the model system protein kinase A and the drug targets glycogen phosphorylase and soluble epoxide hydrolase (sEH). Multiplexing several ligands improves the reliability of the scoring function further. In the case of sEH, the new score allowed the detection of two binding modes of the ligand in its binding site, which was corroborated by X-ray analysis. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Further evidence for the increased power of LOD scores compared with nonparametric methods.
Durner, M; Vieland, V J; Greenberg, D A
1999-01-01
In genetic analysis of diseases in which the underlying model is unknown, "model-free" methods-such as affected sib pair (ASP) tests-are often preferred over LOD-score methods, although LOD-score methods under the correct or even an approximately correct model are more powerful than ASP tests. However, there might be circumstances in which nonparametric methods will outperform LOD-score methods. Recently, Dizier et al. reported that, in some complex two-locus (2L) models, LOD-score methods with segregation-analysis-derived parameters had less power to detect linkage than ASP tests. We investigated whether these particular models in fact represent a situation in which ASP tests are more powerful than LOD scores. We simulated data according to the parameters specified by Dizier et al. and analyzed the data using (a) single-locus (SL) LOD-score analysis performed twice, under a simple dominant and a recessive mode of inheritance (MOI); (b) ASP methods; and (c) nonparametric linkage (NPL) analysis. We show that SL analysis performed twice and corrected for the increase in type I error due to multiple testing yields almost as much linkage information as an analysis under the correct 2L model and is more powerful than either the ASP method or the NPL method. We demonstrate that, even for complex genetic models, the most important condition for linkage analysis is that the assumed MOI at the disease locus being tested is approximately correct, not that the inheritance of the disease per se is correctly specified. In the analysis by Dizier et al., segregation analysis led to estimates of dominance parameters that were grossly misspecified for the locus tested in those models in which ASP tests appeared to be more powerful than LOD-score analyses.
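For a fully informative two-point case, the LOD score compares the likelihood of observing r recombinants among n meioses at recombination fraction θ with the likelihood under no linkage (θ = 0.5). A minimal sketch of that textbook form (the multilocus analyses above are more involved):

```python
import math

def lod_score(theta, n, r):
    """Two-point LOD: log10 of L(theta)/L(0.5), with
    L(theta) = theta**r * (1 - theta)**(n - r)."""
    return math.log10(theta**r * (1 - theta)**(n - r) / 0.5**n)
```

A LOD of 3 or more is the conventional threshold for declaring linkage.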
Hamann, Claus; Volkan, Kevin; Fishman, Mary B; Silvestri, Ronald C; Simon, Steven R; Fletcher, Suzanne W
2002-01-01
Background Little is known about using the Objective Structured Clinical Examination (OSCE) in physical diagnosis courses. The purpose of this study was to describe student performance on an OSCE in a physical diagnosis course. Methods Cross-sectional study at Harvard Medical School, 1997–1999, for 489 second-year students. Results Average total OSCE score was 57% (range 39–75%). Among clinical skills, students scored highest on patient interaction (72%), followed by examination technique (65%), abnormality identification (62%), history-taking (60%), patient presentation (60%), physical examination knowledge (47%), and differential diagnosis (40%) (p < .0001). Among 16 OSCE stations, scores ranged from 70% for arthritis to 29% for calf pain (p < .0001). Teaching sites accounted for larger adjusted differences in station scores, up to 28%, than in skill scores (9%) (p < .0001). Conclusions Students scored higher on interpersonal and technical skills than on interpretive or integrative skills. Station scores identified specific content that needs improved teaching. PMID:11888484
Pantazes, Robert J; Saraf, Manish C; Maranas, Costas D
2007-08-01
In this paper, we introduce and test two new sequence-based protein scoring systems (i.e. S1, S2) for assessing the likelihood that a given protein hybrid will be functional. By binning together amino acids with similar properties (i.e. volume, hydrophobicity and charge), the scoring systems S1 and S2 allow for the quantification of the severity of mismatched interactions in the hybrids. The S2 scoring system is found to significantly functionally enrich a cytochrome P450 library compared with other scoring methods. Given this scoring base, we subsequently constructed two separate optimization formulations (i.e. OPTCOMB and OPTOLIGO) for optimally designing protein combinatorial libraries involving recombination or mutations, respectively. Notably, two separate versions of OPTCOMB are generated (i.e. models M1 and M2), with the latter allowing for position-dependent parental fragment skipping. Computational benchmarking results demonstrate the efficacy of models OPTCOMB and OPTOLIGO in generating high-scoring libraries of a prespecified size.
Hypothesis Testing Using Factor Score Regression
Devlieger, Ines; Mayer, Axel; Rosseel, Yves
2015-01-01
In this article, an overview is given of four methods to perform factor score regression (FSR): regression FSR, Bartlett FSR, the bias-avoiding method of Skrondal and Laake, and the bias-correcting method of Croon. The bias-correcting method is extended to include a reliable standard error. The four methods are compared with each other and with structural equation modeling (SEM) by using analytic calculations and two Monte Carlo simulation studies to examine their finite sample characteristics. Several performance criteria are used, such as the bias using the unstandardized and standardized parameterization, efficiency, mean square error, standard error bias, type I error rate, and power. The results show that the bias-correcting method, with the newly developed standard error, is the only suitable alternative to SEM. While it has a higher standard error bias than SEM, it has comparable bias, efficiency, mean square error, power, and type I error rate. PMID:29795886
Four Bootstrap Confidence Intervals for the Binomial-Error Model.
ERIC Educational Resources Information Center
Lin, Miao-Hsiang; Hsiung, Chao A.
1992-01-01
Four bootstrap methods are identified for constructing confidence intervals for the binomial-error model. The extent to which similar results are obtained, the theoretical foundation of each method, and each method's relevance and range for modeling true-score uncertainty are discussed. (SLD)
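As an illustration of the general idea (a plain percentile bootstrap on a proportion-correct score; not necessarily one of the four methods the paper compares):

```python
import random

def percentile_bootstrap_ci(x_correct, n_items, n_boot=2000,
                            alpha=0.05, seed=1):
    """Percentile bootstrap CI for a proportion-correct score under a
    binomial error model: resample n_items Bernoulli(p_hat) trials."""
    rng = random.Random(seed)
    p_hat = x_correct / n_items
    boots = sorted(
        sum(rng.random() < p_hat for _ in range(n_items)) / n_items
        for _ in range(n_boot)
    )
    return (boots[int(alpha / 2 * n_boot)],
            boots[int((1 - alpha / 2) * n_boot) - 1])
```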
Measuring patient's expectation and the perception of quality in LASIK services.
Lin, Deng-Juin; Sheu, Ing-Cheau; Pai, Jar-Yuan; Bair, Alex; Hung, Che-Yu; Yeh, Yuan-Hung; Chou, Ming-Jen
2009-07-10
LASIK is the use of excimer lasers to treat therapeutic and refractive visual disorders, ranging from superficial scars to nearsightedness (myopia), and from astigmatism to farsightedness (hyperopia). The purposes of this study were, first, to check the applicability and psychometric properties of the SERVQUAL instrument in a LASIK surgery population and, second, to use SEM methods to investigate the relationships among loyalty, perceptions, and expectations in LASIK surgery. The study was conducted by questionnaire. A total of 463 consecutive patients attending LASIK surgery at the Chung Shan Medical University Eye Center enrolled in this study. All participants were asked to complete revised SERVQUAL questionnaires. Student's t tests, correlation tests, ANOVA, and factor analyses were used to identify the characteristics and factors of service quality. Paired t tests were used to test the gap between expectation and perception scores, and structural equation modeling was used to examine relationships among satisfaction components. The effective response rate was 97.3%. Validity was verified by several methods, and internal reliability (Cronbach's alpha) was > 0.958. Patients' scores were very high, with an overall score of 6.41 (0.66), expectations at 6.68 (0.47), and perceptions at 6.51 (0.57). The gap between expectations and perceptions was nevertheless significant (t = 6.08). Furthermore, there were significant differences in expectation scores among the different occupations, and the higher the education of the patient, the lower the perception score (r = -0.10). Factor analysis of the 22 SERVQUAL items yielded 5 factors.
The 5 perception factors explained 72.94% of the total variance, and the 5 expectation factors explained 77.12% of the total variance of satisfaction scores. The goodness-of-fit summary of the structural equation model showed the expected relationships among expectations, perceptions, and loyalty. The results of this research appear to show that the SERVQUAL instrument is a useful measurement tool for assessing and monitoring service quality in LASIK services, enabling staff to identify where improvements are needed from the patients' perspective. There were service quality gaps in reliability, assurance, and empathy. This study suggests that physicians should increase their discussions with patients, which has already been proven to be an effective way to increase patients' satisfaction with medical care, regardless of the procedure received.
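The gap test reported above is a paired t test on per-respondent gap scores (expectation minus perception); a sketch:

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(expect, perceive):
    """Paired t statistic for SERVQUAL gaps: mean gap divided by its
    standard error, where gap = expectation - perception per respondent."""
    gaps = [e - p for e, p in zip(expect, perceive)]
    return mean(gaps) / (stdev(gaps) / sqrt(len(gaps)))
```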
Waltrick, Renata; Possamai, Dimitri Sauter; de Aguiar, Fernanda Perito; Dadam, Micheli; de Souza Filho, Valmir João; Ramos, Lucas Rocker; Laurett, Renata da Silva; Fujiwara, Kênia; Caldeira Filho, Milton; Koenig, Álvaro; Westphal, Glauco Adrieno
2015-01-01
To evaluate the agreement between a new epidemiological surveillance method from the Centers for Disease Control and Prevention (CDC) and the clinical pulmonary infection score for detecting ventilator-associated pneumonia. This was a prospective cohort study that evaluated patients in the intensive care units of two hospitals who were intubated for more than 48 hours between August 2013 and June 2014. Patients were evaluated daily by a physical therapist using the clinical pulmonary infection score. A nurse independently applied the new surveillance method proposed by the CDC. The diagnostic agreement between the methods was evaluated. A clinical pulmonary infection score of ≥ 7 indicated a clinical diagnosis of ventilator-associated pneumonia, and a clinical pulmonary infection score ≥ 7 together with an isolated semiquantitative culture of ≥ 10⁴ colony-forming units indicated a definitive diagnosis. Of the 801 patients admitted to the intensive care units, 198 required mechanical ventilation. Of these, 168 were intubated for more than 48 hours. A total of 18 (10.7%) cases of ventilator-associated infectious conditions were identified, 14 (8.3%) of which were possible or probable ventilator-associated pneumonia, representing 35% (14/38) of ventilator-associated pneumonia cases. The CDC method identified cases of ventilator-associated pneumonia with a sensitivity of 0.37, specificity of 1.0, positive predictive value of 1.0, and negative predictive value of 0.84. The differences resulted in discrepancies in the incidence density of ventilator-associated pneumonia (CDC method, 5.2/1000 days of mechanical ventilation; clinical pulmonary infection score ≥ 7, 13.1/1000 days of mechanical ventilation). 
The Centers for Disease Control and Prevention method failed to detect ventilator-associated pneumonia cases and may not be satisfactory as a surveillance method.
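The reported operating characteristics follow from a standard 2x2 table. With counts reconstructed from the figures above (14 detected and 24 missed cases among the 38 score-defined pneumonias, no false positives among the remaining 130 patients; our reconstruction, not stated explicitly in the abstract), a sketch reproduces them:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

m = diagnostic_metrics(tp=14, fp=0, fn=24, tn=130)
# sensitivity ~0.37, specificity 1.0, PPV 1.0, NPV ~0.84
```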
Rasch analysis of the Edmonton Symptom Assessment System and research implications
Cheifetz, O.; Packham, T.L.; MacDermid, J.C.
2014-01-01
Background Reliable and valid assessment of the disease burden across all forms of cancer is critical to the evaluation of treatment effectiveness and patient progress. The Edmonton Symptom Assessment System (ESAS) is used for routine evaluation of people attending for cancer care. In the present study, we used Rasch analysis to explore the measurement properties of the ESAS and to determine the effect of using Rasch-proposed interval-level ESAS scoring compared with traditional scoring when evaluating the effects of an exercise program for cancer survivors. Methods Polytomous Rasch analysis (Andrich's rating-scale model) was applied to data from 26,645 ESAS questionnaires completed at the Juravinski Cancer Centre. The fit of the ESAS to the polytomous Rasch model was investigated, including evaluations of differential item functioning for sex, age, and disease group. The research implication was investigated by comparing the results of an observational research study previously analysed using a traditional approach with the results obtained by Rasch-proposed interval-level ESAS scoring. Results The Rasch reliability index was 0.73, falling short of the desired 0.80–0.90 level. However, the ESAS was found to fit the Rasch model, including the criteria for uni-dimensional data. The analysis suggests that the current ESAS scoring system of 0–10 could be collapsed to a 6-point scale. Use of the Rasch-proposed interval-level scoring yielded results that were different from those calculated using summarized ordinal-level ESAS scores. Differential item functioning was not found for sex, age, or diagnosis groups. Conclusions The ESAS is a moderately reliable uni-dimensional measure of cancer disease burden and can provide interval-level scaling with Rasch-based scoring. Further, our study indicates that, compared with the traditional scoring metric, Rasch-based scoring could result in substantive changes to conclusions. PMID:24764703
A Study of Mental Health Literacy Among North Korean Refugees in South Korea
Noh, Jin-Won; Kwon, Young Dae; Yu, Sieun; Park, Hyunchun; Woo, Jong-Min
2015-01-01
Objectives: This study aimed to investigate North Korean refugees' knowledge of mental illnesses and treatments and to analyze the factors affecting this knowledge. Methods: Subjects were selected via a snowball sampling method, and the survey outcomes of 152 North Korean refugee participants were analyzed. The factors affecting knowledge of mental illnesses were analyzed via regression analysis by constructing a multivariate model with mental illness knowledge score as the dependent variable. Results: The North Korean refugees' mental illness knowledge scores ranged from 3 to 24 points, with an average score of 13.0. Regarding the factors that influence mental illness knowledge, subjects with South Korean spouses and those who had spent more time in South Korea had higher knowledge scores. Furthermore, subjects who considered the mental health of North Korean refugees to be a serious issue revealed lower knowledge scores than those who did not. Subjects who visit psychiatric clinics showed higher knowledge scores than those who do not. Subjects with at least a college education exhibited higher scores than those without advanced education, and subjects who are satisfied with life in South Korea manifested higher mental illness knowledge scores than those who are not. Conclusions: This study is significant as the first to measure and evaluate North Korean refugees' knowledge of mental illnesses. In addition, evaluating that knowledge and its influencing factors among refugees residing in South Korea created basic data for efforts to enhance mental health literacy and provide proper mental health services. The results of this study can be utilized to address mental health problems that might occur during a future unification process of North and South Korea. PMID:25652712
2011-01-01
Background Nonparametric item response theory (IRT) was used to examine (a) the performance of the 30 Positive and Negative Syndrome Scale (PANSS) items and their options (levels of severity), (b) the effectiveness of various subscales in discriminating among differences in symptom severity, and (c) the development of an abbreviated PANSS (Mini-PANSS) based on IRT and a method to link scores to the original PANSS. Methods Baseline PANSS scores from 7,187 patients with schizophrenia or schizoaffective disorder who were enrolled between 1995 and 2005 in psychopharmacology trials were obtained. Option characteristic curves (OCCs) and item characteristic curves (ICCs) were constructed to examine the probability of rating each of seven options within each of the 30 PANSS items as a function of subscale severity, and summed-score linking was applied to items selected for the Mini-PANSS. Results The majority of items forming the Positive and Negative subscales (i.e. 19 items) performed very well and discriminated better along symptom severity than the General Psychopathology subscale. Six of the seven Positive Symptom items, six of the seven Negative Symptom items, and seven of the 16 General Psychopathology items were retained for inclusion in the Mini-PANSS. Summed-score linking and linear interpolation were able to produce a translation table for comparing total subscale scores of the Mini-PANSS to total subscale scores on the original PANSS. Results show that scores on the subscales of the Mini-PANSS can be linked to scores on the original PANSS subscales with very little bias. Conclusions The study demonstrated the utility of nonparametric IRT in examining the item properties of the PANSS and in selecting items for an abbreviated PANSS scale. The comparisons between the 30-item PANSS and the Mini-PANSS revealed that the shorter version is comparable to the 30-item PANSS and, when applying IRT, the Mini-PANSS is also a good indicator of illness severity. 
PMID:22087503
Fuglsang-Damgaard, David; Nielsen, Camilla Houlberg; Mandrup, Elisabeth; Fuursted, Kurt
2011-10-01
Matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF-MS) is promising as an alternative to more costly and cumbersome methods for direct identification in blood cultures. We wanted to evaluate a simplified pre-treatment method for using MALDI-TOF-MS directly on positive blood cultures from the BacT/Alert blood culture system, and to test an algorithm combining the result of the initial microscopy with the result suggested by MALDI-TOF-MS. Using the recommended cut-off score of 1.7, the best results were obtained among Gram-negative rods, with correct identification of 91% of Enterobacteriaceae and 83% of aerobic/non-fermentative Gram-negative rods, whereas results were more modest among Gram-positive cocci, with correct identification of 52% of staphylococci, 54% of enterococci and only 20% of streptococci. Combining the results of the Gram stain with the top reports by MALDI-TOF-MS increased the sensitivity from 91% to 93% in the score range from 1.5 to 1.7 and from 48% to 85% in the score range from 1.3 to 1.5. Thus, using this strategy and accepting a cut-off of 1.3 instead of the suggested 1.7, overall sensitivity could be increased from 88.1% to 96.3%. MALDI-TOF-MS is an efficient method for direct routine identification of bacterial isolates in blood culture, especially when combined with the result of the Gram stain. © 2011 The Authors. APMIS © 2011 APMIS.
Matsumoto, Hiroshi; Saito, Fumiyo; Takeyoshi, Masahiro
2015-12-01
Recently, the development of several gene expression-based prediction methods has been attempted in the field of toxicology. CARCINOscreen® is a gene expression-based screening method that predicts the carcinogenicity of chemicals targeting the liver with high accuracy. In this study, we investigated the applicability of the gene expression-based screening method to SD and Wistar rats by using CARCINOscreen®, originally developed with F344 rats, with two carcinogens, 2,4-diaminotoluene and thioacetamide, and two non-carcinogens, 2,6-diaminotoluene and sodium benzoate. After a 28-day repeated-dose test was conducted with each chemical in SD and Wistar rats, microarray analysis was performed using total RNA extracted from each liver. The obtained gene expression data were applied to CARCINOscreen®. Predictive scores obtained by CARCINOscreen® for the known carcinogens were > 2 in all strains of rats, while the non-carcinogens gave prediction scores below 0.5. These results suggest that the gene expression-based screening method CARCINOscreen® can be applied to SD and Wistar rats, strains widely used in toxicological studies, by setting an appropriate boundary line for the prediction score to classify chemicals into carcinogens and non-carcinogens.
Shimizu, Hironori; Isoda, Hiroyoshi; Ohno, Tsuyoshi; Yamashita, Rikiya; Kawahara, Seiya; Furuta, Akihiro; Fujimoto, Koji; Kido, Aki; Kusahara, Hiroshi; Togashi, Kaori
2015-01-01
To compare and evaluate images of non-contrast-enhanced magnetic resonance (MR) portography and hepatic venography acquired with two different fat suppression methods, the chemical shift selective (CHESS) method and the short tau inversion recovery (STIR) method. Twenty-two healthy volunteers were examined using respiratory-triggered three-dimensional true steady-state free-precession with two time-spatial labeling inversion pulses. The CHESS or STIR method was used for fat suppression. The relative signal-to-noise ratio and contrast-to-noise ratio (CNR) were quantified, and the quality of visualization was scored. Image acquisition was successful in all volunteers. The STIR method significantly improved the CNRs of MR portography and hepatic venography. The image quality scores of the main portal vein and right portal vein were higher with the STIR method, but the differences were not significant. The image quality scores of the right hepatic vein, middle hepatic vein, and left hepatic vein (LHV) were all higher, and the visualization of the LHV was significantly better (p<0.05). The STIR method contributes to further suppression of the background signal and improves visualization of the portal and hepatic veins. The results support using non-contrast-enhanced MR portography and hepatic venography in clinical practice. Copyright © 2014 Elsevier Inc. All rights reserved.
Walking on a User Similarity Network towards Personalized Recommendations
Gan, Mingxin
2014-01-01
Personalized recommender systems have been receiving more and more attention in addressing the serious problem of information overload accompanying the rapid evolution of the world-wide-web. Although traditional collaborative filtering approaches based on similarities between users have achieved remarkable success, it has been shown that the existence of popular objects may adversely influence the correct scoring of candidate objects, leading to unreasonable recommendation results. Meanwhile, recent advances have demonstrated that approaches based on diffusion and random walk processes exhibit superior performance over collaborative filtering methods in both recommendation accuracy and diversity. Building on these results, we adopt three strategies (power-law adjustment, nearest neighbor, and threshold filtration) to adjust a user similarity network derived from user similarity scores calculated on historical data, and then propose a random walk with restart model on the constructed network to achieve personalized recommendations. We perform cross-validation experiments on two real data sets (MovieLens and Netflix) and compare the performance of our method against existing state-of-the-art methods. Results show that our method outperforms existing methods in not only recommendation accuracy and diversity, but also retrieval performance. PMID:25489942
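A random walk with restart on an adjusted user similarity network can be sketched as follows (the column normalization, restart probability, and convergence test are illustrative choices, not the paper's exact settings):

```python
import numpy as np

def random_walk_with_restart(W, seed, restart=0.15, tol=1e-10):
    """Iterate p <- (1-c) P p + c e until convergence, where P is the
    column-normalized similarity matrix and e restarts at the target user."""
    P = W / W.sum(axis=0, keepdims=True)
    e = np.zeros(W.shape[0])
    e[seed] = 1.0
    p = e.copy()
    while True:
        p_next = (1 - restart) * P @ p + restart * e
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
```

The stationary scores rank candidate users (and, through their histories, candidate objects) by proximity to the target user.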
Novel and Practical Scoring Systems for the Diagnosis of Thyroid Nodules
Wei, Ying; Zhou, Xinrong; Liu, Siyue; Wang, Hong; Liu, Limin; Liu, Renze; Kang, Jinsong; Hong, Kai; Wang, Daowen; Yuan, Gang
2016-01-01
Objective The clinical management of patients with thyroid nodules that are biopsied by fine-needle aspiration cytology and yield indeterminate results remains unsettled. The BRAF V600E mutation has dubious diagnostic value due to its low sensitivity. Novel strategies are urgently needed to distinguish thyroid malignancies from benign thyroid nodules. Design This prospective study included 504 thyroid nodules diagnosed by ultrasonography from 468 patients; fine-needle aspiration cytology was performed under ultrasound guidance. Cytology and molecular analysis, including BRAF V600E, RET/PTC1 and RET/PTC3, were conducted simultaneously. The cytology, ultrasonography results, and mutational status were gathered and analyzed together. Predictive scoring systems were designed using combinations of diagnostic parameters from ultrasonography, cytology and genetic analysis. The utility of the scoring systems was analyzed and compared to detection using the individual methods alone or combined. Result The sensitivity of scoring system A (ultrasonography, cytology, BRAF V600E, RET/PTC) was nearly identical to that of scoring system B (ultrasonography, cytology, BRAF V600E): 91.0% and 90.2%, respectively. These sensitivities were significantly higher than those obtained using fine-needle aspiration cytology, genetic analysis and ultrasonography alone or combined, whose sensitivities were 63.9%, 70.7% and 87.2%, respectively. Scoring system C (ultrasonography, cytology) was slightly inferior to the former two scoring systems but still had relatively high sensitivity and specificity (80.5% and 95.1%, respectively), significantly superior to those of cytology, ultrasonography or genetic analysis alone. In nodules with uncertain cytology, scoring systems A, B and C elevated the malignancy detection rates to 69.7%, 69.7% and 63.6%, respectively. 
Conclusion These three scoring systems were quick for clinicians to master and could provide quantified information to predict the probability of malignant nodules. Scoring system B is recommended for improving the detection rate among nodules of uncertain cytology. PMID:27654865
Pletcher, Mark J; Tice, Jeffrey A; Pignone, Michael; McCulloch, Charles; Callister, Tracy Q; Browner, Warren S
2004-01-01
Background The coronary artery calcium (CAC) score is an independent predictor of coronary heart disease. We sought to combine information from the CAC score with information from conventional cardiac risk factors to produce post-test risk estimates, and to determine whether the score may add clinically useful information. Methods We measured the independent cross-sectional associations between conventional cardiac risk factors and the CAC score among asymptomatic persons referred for non-contrast electron beam computed tomography. Using the resulting multivariable models and published CAC score-specific relative risk estimates, we estimated post-test coronary heart disease risk in a number of different scenarios. Results Among 9341 asymptomatic study participants (age 35–88 years, 40% female), we found that conventional coronary heart disease risk factors including age, male sex, self-reported hypertension, diabetes and high cholesterol were independent predictors of the CAC score, and we used the resulting multivariable models for predicting post-test risk in a variety of scenarios. Our models predicted, for example, that a 60-year-old non-smoking non-diabetic woman with hypertension and high cholesterol would have a 47% chance of having a CAC score of zero, reducing her 10-year risk estimate from 15% (per Framingham) to 6–9%; if her score were over 100, however (a 17% chance), her risk estimate would be markedly higher (25–51% in 10 years). In low risk scenarios, the CAC score is very likely to be zero or low, and unlikely to change management. Conclusion Combining information from the CAC score with information from conventional risk factors can change assessment of coronary heart disease risk to an extent that may be clinically important, especially when the pre-test 10-year risk estimate is intermediate. The attached spreadsheet makes these calculations easy. PMID:15327691
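The pre-test-to-post-test update described above can be sketched with Bayes' rule on the odds scale. The likelihood-ratio value below is purely illustrative, not one of the published CAC-score-specific estimates:

```python
def post_test_risk(pre_test_prob, likelihood_ratio):
    """Update a pre-test disease probability with a test result's
    likelihood ratio via Bayes' rule on the odds scale."""
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)

# Hypothetical example: 15% 10-year Framingham risk, with a CAC score of
# zero assumed to carry a likelihood ratio of roughly 0.4 (illustrative
# value only; the study derives score-specific estimates from its models).
risk_after_zero_cac = post_test_risk(0.15, 0.4)
```

A likelihood ratio below 1 lowers the risk estimate and one above 1 raises it, which is the mechanism behind the score-dependent shifts (15% down to 6–9%, or up to 25–51%) reported in the abstract.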
Ferreira, António Miguel; Marques, Hugo; Tralhão, António; Santos, Miguel Borges; Santos, Ana Rita; Cardoso, Gonçalo; Dores, Hélder; Carvalho, Maria Salomé; Madeira, Sérgio; Machado, Francisco Pereira; Cardim, Nuno; de Araújo Gonçalves, Pedro
2016-11-01
Current guidelines recommend the use of the Modified Diamond-Forrester (MDF) method to assess the pre-test likelihood of obstructive coronary artery disease (CAD). We aimed to compare the performance of the MDF method with two contemporary algorithms derived from multicenter trials that additionally incorporate cardiovascular risk factors: the calculator-based 'CAD Consortium 2' method, and the integer-based CONFIRM score. We assessed 1069 consecutive patients without known CAD undergoing coronary CT angiography (CCTA) for stable chest pain. Obstructive CAD was defined as the presence of coronary stenosis ≥50% on 64-slice dual-source CT. The three methods were assessed for calibration, discrimination, net reclassification, and changes in proposed downstream testing based upon calculated pre-test likelihoods. The observed prevalence of obstructive CAD was 13.8% (n=147). Overestimations of the likelihood of obstructive CAD were 140.1%, 9.8%, and 18.8%, respectively, for the MDF, CAD Consortium 2 and CONFIRM methods. The CAD Consortium 2 showed greater discriminative power than the MDF method, with a C-statistic of 0.73 vs. 0.70 (p<0.001), while the CONFIRM score did not (C-statistic 0.71, p=0.492). Reclassification of pre-test likelihood using the 'CAD Consortium 2' or CONFIRM scores resulted in a net reclassification improvement of 0.19 and 0.18, respectively, which would change the diagnostic strategy in approximately half of the patients. Newer risk factor-encompassing models allow for a more precise estimation of pre-test probabilities of obstructive CAD than the guideline-recommended MDF method. Adoption of these scores may improve disease prediction and change the diagnostic pathway in a significant proportion of patients. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
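The net reclassification improvement used above to compare the pre-test likelihood scores can be computed from how subjects move between risk categories under the new score. A minimal sketch, assuming each subject's movement is coded +1 (reclassified up), -1 (down), or 0 (unchanged):

```python
def net_reclassification_improvement(event_moves, nonevent_moves):
    """NRI = (P(up|event) - P(down|event)) + (P(down|nonevent) - P(up|nonevent)).
    Upward reclassification is rewarded for subjects with the disease,
    downward reclassification for subjects without it."""
    def fraction(moves, direction):
        return sum(1 for m in moves if m == direction) / len(moves)

    nri_events = fraction(event_moves, +1) - fraction(event_moves, -1)
    nri_nonevents = fraction(nonevent_moves, -1) - fraction(nonevent_moves, +1)
    return nri_events + nri_nonevents
```

A positive NRI (such as the 0.19 and 0.18 reported above) indicates that, on balance, the new score moves diseased subjects up and disease-free subjects down in risk category.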
Prediction of distal residue participation in enzyme catalysis.
Brodkin, Heather R; DeLateur, Nicholas A; Somarowthu, Srinivas; Mills, Caitlyn L; Novak, Walter R; Beuning, Penny J; Ringe, Dagmar; Ondrechen, Mary Jo
2015-05-01
A scoring method for the prediction of catalytically important residues in enzyme structures is presented and used to examine the participation of distal residues in enzyme catalysis. Scores are based on the Partial Order Optimum Likelihood (POOL) machine learning method, using computed electrostatic properties, surface geometric features, and information obtained from the phylogenetic tree as input features. Predictions of distal residue participation in catalysis are compared with experimental kinetics data from the literature on variants of the featured enzymes; some additional kinetics measurements are reported for variants of Pseudomonas putida nitrile hydratase (ppNH) and for Escherichia coli alkaline phosphatase (AP). The multilayer active sites of P. putida nitrile hydratase and of human phosphoglucose isomerase are predicted by the POOL log ZP scores, as is the single-layer active site of P. putida ketosteroid isomerase. The log ZP score cutoff utilized here results in over-prediction of distal residue involvement in E. coli alkaline phosphatase. While fewer experimental data points are available for P. putida mandelate racemase and for human carbonic anhydrase II, the POOL log ZP scores properly predict the previously reported participation of distal residues. © 2015 The Authors. Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
Clark, D G; Kapur, P; Geldmacher, D S; Brockington, J C; Harrell, L; DeRamus, T P; Blanton, P D; Lokken, K; Nicholas, A P; Marson, D C
2014-06-01
We constructed random forest classifiers employing either the traditional method of scoring semantic fluency word lists or new methods. These classifiers were then compared in terms of their ability to diagnose Alzheimer disease (AD) or to prognosticate among individuals along the continuum from cognitively normal (CN) through mild cognitive impairment (MCI) to AD. Semantic fluency lists from 44 cognitively normal elderly individuals, 80 MCI patients, and 41 AD patients were transcribed into electronic text files and scored by four methods: traditional raw scores, clustering and switching scores, "generalized" versions of clustering and switching, and a method based on independent components analysis (ICA). Random forest classifiers based on raw scores were compared to "augmented" classifiers that incorporated newer scoring methods. Outcome variables included AD diagnosis at baseline, MCI conversion, increase in Clinical Dementia Rating-Sum of Boxes (CDR-SOB) score, or decrease in Financial Capacity Instrument (FCI) score. Receiver operating characteristic (ROC) curves were constructed for each classifier and the area under the curve (AUC) was calculated. We compared AUC between raw and augmented classifiers using DeLong's test and assessed validity and reliability of the augmented classifier. Augmented classifiers outperformed classifiers based on raw scores for the outcome measures AD diagnosis (AUC .97 vs. .95), MCI conversion (AUC .91 vs. .77), CDR-SOB increase (AUC .90 vs. .79), and FCI decrease (AUC .89 vs. .72). Measures of validity and stability over time support the use of the method. Latent information in semantic fluency word lists is useful for predicting cognitive and functional decline among elderly individuals at increased risk for developing AD. Modern machine learning methods may incorporate latent information to enhance the diagnostic value of semantic fluency raw scores.
These methods could yield information valuable for patient care and clinical trial design with a relatively small investment of time and money. Published by Elsevier Ltd.
ERIC Educational Resources Information Center
Stone, Clement A.; Tang, Yun
2013-01-01
Propensity score applications are often used to evaluate educational program impact. However, various options are available to estimate both propensity scores and construct comparison groups. This study used a student achievement dataset with commonly available covariates to compare different propensity scoring estimation methods (logistic…
Core, Cynthia; Hoff, Erika; Rumiche, Rosario; Señor, Melissa
2015-01-01
Purpose Vocabulary assessment holds promise as a way to identify young bilingual children at risk for language delay. This study compares 2 measures of vocabulary in a group of young Spanish–English bilingual children to a single-language measure used with monolingual children. Method Total vocabulary and conceptual vocabulary were used to measure mean vocabulary size and growth in 47 Spanish–English bilingually developing children from 22 to 30 months of age based on results from the MacArthur–Bates Communicative Development Inventory (CDI; Fenson et al., 1993) and the Inventario del Desarrollo de Habilidades Comunicativas (Jackson-Maldonado et al., 2003). Bilingual children’s scores of total vocabulary and conceptual vocabulary were compared with CDI scores for a control group of 56 monolingual children. Results The total vocabulary measure resulted in mean vocabulary scores and average rate of growth similar to monolingual growth, whereas conceptual vocabulary scores were significantly smaller and grew at a slower rate than total vocabulary scores. Total vocabulary identified the same proportion of bilingual children below the 25th percentile on monolingual norms as the CDI did for monolingual children. Conclusion These results support the use of total vocabulary as a means of assessing early language development in young bilingual Spanish–English speaking children. PMID:24023382
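The two bilingual measures contrasted above differ only in how cross-language synonyms are counted: total vocabulary counts every produced word, while conceptual vocabulary counts each lexicalized concept once regardless of language. A minimal sketch, using a hypothetical concept-to-languages mapping rather than the actual CDI item structure:

```python
def total_vocabulary(english_words, spanish_words):
    # every produced word counts, in either language
    return len(english_words) + len(spanish_words)

def conceptual_vocabulary(concepts):
    """concepts: mapping from concept label to the set of languages
    ("en"/"es") in which the child produces a word for it.
    Each concept with at least one word counts exactly once."""
    return sum(1 for languages in concepts.values() if languages)
```

For a child who says both "dog" and "perro", total vocabulary credits two words but conceptual vocabulary credits one concept, which is why conceptual scores run smaller, as the study found.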
A decision support tool for selecting the optimal sewage sludge treatment.
Turunen, Ville; Sorvari, Jaana; Mikola, Anna
2018-02-01
Sewage sludge contains significant amounts of resources, such as nutrients and organic matter. At the same time, the organic contaminants (OC) found in sewage sludge are of growing concern. Consequently, in many European countries incineration is currently favored over recycling in agriculture. This study presents a Multi-Attribute Value Theory (MAVT)-based decision support tool (DST) for facilitating sludge treatment decisions. Essential decision criteria were recognized and prioritized, i.e., weighted, by experts from water utilities. Since the fate of organic contaminants was in focus, a simple scoring method was developed to take their environmental risks into account. The final DST assigns each sludge treatment method a preference score expressing its superiority compared to alternative methods. The DST was validated by testing it with data from two Finnish municipal wastewater treatment plants (WWTP). The validation results of the first case study preferred sludge pyrolysis (preference score: 0.629) to the alternatives, composting and incineration (scores 0.580 and 0.484, respectively). The preference scores were influenced by WWTP-dependent factors, i.e., the operating environment and the weighting of the criteria. A lack of data emerged as the main practical limitation. Therefore, not all of the relevant criteria could be included in the value tree. More data are needed on the effects of treatment methods on the availability of nutrients, the quality of organic matter and sludge-borne OCs. Despite these shortcomings, the DST proved useful and adaptable in decision-making. It can also help achieve a more transparent, understandable and comprehensive decision-making process. Copyright © 2017 Elsevier Ltd. All rights reserved.
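In MAVT, a treatment option's preference score is typically an additive value: a weighted sum of criterion values normalized to [0, 1], with the expert-elicited weights summing to 1. The criteria names and numbers below are hypothetical, not those of the study's value tree:

```python
def preference_score(values, weights):
    """Additive MAVT value function. `values` maps each criterion to a
    normalized score in [0, 1]; `weights` maps the same criteria to
    expert-elicited weights summing to 1."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(weights[criterion] * values[criterion] for criterion in values)

# Hypothetical scoring of one sludge treatment option
pyrolysis_score = preference_score(
    values={"nutrient_recovery": 0.8, "oc_risk": 0.6, "cost": 0.4},
    weights={"nutrient_recovery": 0.5, "oc_risk": 0.3, "cost": 0.2},
)
```

Ranking options then reduces to comparing their scores, as with the 0.629 / 0.580 / 0.484 results reported above.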
Computerized summary scoring: crowdsourcing-based latent semantic analysis.
Li, Haiying; Cai, Zhiqiang; Graesser, Arthur C
2017-11-03
In this study we developed and evaluated a crowdsourcing-based latent semantic analysis (LSA) approach to computerized summary scoring (CSS). LSA is a frequently used mathematical component in CSS, where LSA similarity represents the extent to which the to-be-graded target summary is similar to a model summary or a set of exemplar summaries. Researchers have proposed different formulations of the model summary in previous studies, such as pregraded summaries, expert-generated summaries, or source texts. The former two methods, however, require substantial human time, effort, and costs in order to either grade or generate summaries. Using source texts does not require human effort, but it also does not predict human summary scores well. With human summary scores as the gold standard, in this study we evaluated the crowdsourcing LSA method by comparing it with seven other LSA methods that used sets of summaries from different sources (either experts or crowdsourced) of differing quality, along with source texts. Results showed that crowdsourcing LSA predicted human summary scores as well as expert-good and crowdsourcing-good summaries, and better than the other methods. A series of analyses with different numbers of crowdsourcing summaries demonstrated that the number (from 10 to 100) did not significantly affect performance. These findings imply that crowdsourcing LSA is a promising approach to CSS, because it saves human effort in generating the model summary while still yielding comparable performance. This approach to small-scale CSS provides a practical solution for instructors in courses, and also advances research on automated assessments in which student responses are expected to semantically converge on subject matter content.
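At its core, this style of CSS scores a target summary by its vector-space similarity to exemplar summaries. The sketch below uses raw term-frequency cosine similarity and omits the singular value decomposition step that LSA adds on top; it illustrates the scoring pipeline, not the authors' implementation:

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity of bag-of-words term-frequency vectors."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[term] * b[term] for term in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def summary_score(target, exemplar_summaries):
    # score a target summary by its mean similarity to a set of exemplars
    # (e.g., crowdsourced summaries, as in the approach described above)
    return (sum(cosine_similarity(target, e) for e in exemplar_summaries)
            / len(exemplar_summaries))
```

Swapping the exemplar set (pregraded, expert-generated, crowdsourced, or source text) changes only the reference vectors, which is exactly the comparison the study performs.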
Chung, Jinyong; Yoo, Kwangsun; Lee, Peter; Kim, Chan Mi; Roh, Jee Hoon; Park, Ji Eun; Kim, Sang Joon; Seo, Sang Won; Shin, Jeong-Hyeon; Seong, Joon-Kyung; Jeong, Yong
2017-10-01
The use of different 3D T1-weighted magnetic resonance (T1 MR) imaging protocols induces image incompatibility across multicenter studies, negating the many advantages of multicenter studies. A few methods have been developed to address this problem, but significant image incompatibility still remains. Thus, we developed a novel and convenient method to improve image compatibility. W-score standardization creates quality reference values by using a healthy group to obtain normalized disease values. We developed a protocol-specific w-score standardization to control the protocol effect, which is applied to each protocol separately. We used three data sets. In dataset 1, brain T1 MR images of normal controls (NC) and patients with Alzheimer's disease (AD) from two centers, acquired with different T1 MR protocols, were used (Protocol 1 and 2, n = 45/group). In dataset 2, data from six subjects, who underwent MRI with two different protocols (Protocol 1 and 2), were used with different repetition times, echo times, and slice thicknesses. In dataset 3, T1 MR images from a large number of healthy normal controls (Protocol 1: n = 148, Protocol 2: n = 343) were collected for w-score standardization. The protocol effect and disease effect on subjects' cortical thickness were analyzed before and after the application of protocol-specific w-score standardization. As expected, different protocols resulted in differing cortical thickness measurements in both NC and AD subjects. Different measurements were obtained for the same subject when imaged with different protocols. Multivariate pattern difference between measurements was observed between the protocols. Classification accuracy between two protocols was nearly 90%. After applying protocol-specific w-score standardization, the differences between the protocols substantially decreased. 
Most importantly, protocol-specific w-score standardization reduced both univariate and multivariate differences in the images while maintaining the AD disease effect. Compared to conventional regression methods, our method showed the best performance in terms of controlling the protocol effect while preserving disease information. Protocol-specific w-score standardization effectively resolved the concerns of conventional regression methods. It showed the best performance for improving the compatibility of a T1 MR post-processed feature, cortical thickness. Copyright © 2017 Elsevier Inc. All rights reserved.
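In its simplest covariate-free form, a protocol-specific w-score standardizes each subject's measurement against the healthy controls scanned with the same protocol; the full method additionally regresses out covariates such as age, which this sketch omits:

```python
def w_scores_by_protocol(measurements, controls):
    """measurements: list of (value, protocol) pairs for the subjects of
    interest; controls: protocol -> list of healthy-control values.
    Each value is standardized against its own protocol's control group,
    so protocol-driven offsets cancel out."""
    stats = {}
    for protocol, xs in controls.items():
        mean = sum(xs) / len(xs)
        sd = (sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)) ** 0.5
        stats[protocol] = (mean, sd)
    return [(value - stats[protocol][0]) / stats[protocol][1]
            for value, protocol in measurements]
```

Because each protocol supplies its own reference mean and SD, a systematic cortical-thickness offset between Protocol 1 and Protocol 2 drops out of the standardized scores while genuine disease-related deviations remain.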
NASA Astrophysics Data System (ADS)
Hirano, Mitsuharu; Tonosaki, Shozo; Ueno, Takahiro; Tanaka, Masato; Hasegawa, Takemi
2014-02-01
We report an improved method to visualize the lipid distribution in the axial and lateral directions within arterial vessel walls by spectroscopic spectral-domain Optical Coherence Tomography (OCT) at 1.7 μm wavelength, for identification of the lipid-rich plaque that is suspected to cause coronary events. In our previous method, an extended InGaAs-based line camera detects an OCT interferometric spectrum from 1607 to 1766 nm, which is then divided into twenty subbands, and an A-scan OCT profile is calculated for each subband, resulting in a tomographic spectrum. This tomographic spectrum is decomposed into a lipid spectrum having an attenuation peak at 1730 nm and a non-lipid spectrum independent of wavelength, and the weight of each spectrum, that is, the lipid and non-lipid scores, is calculated. In this paper, we present an improved algorithm in which we combine the lipid score and the non-lipid score to derive a corrected lipid score. We have found that the corrected lipid score is better than the raw lipid score in that the former is more robust against false positives occurring due to abrupt changes in reflectivity at the vessel surface. In addition, we have optimized the spatial smoothing filter and reduced false positives and false negatives due to detection noise and speckle. We have verified this improved algorithm using measured data of normal porcine coronary artery and lard as a model of lipid-rich plaque and confirmed that both the sensitivity and the specificity for lard are 92%.
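The lipid/non-lipid decomposition described above amounts to a least-squares fit of each tomographic spectrum onto a lipid basis spectrum plus a wavelength-independent offset. A minimal sketch with the 2x2 normal equations solved by hand; the basis values in the test are hypothetical, not the measured lard spectrum:

```python
def decompose(spectrum, lipid_basis):
    """Least-squares fit spectrum ~= a * lipid_basis + b * 1, solved via
    the 2x2 normal equations. Returns (a, b): the lipid score a is the
    weight of the lipid basis spectrum, and the non-lipid score b is the
    wavelength-independent component."""
    n = len(spectrum)
    s_ll = sum(l * l for l in lipid_basis)
    s_l1 = sum(lipid_basis)
    s_ls = sum(l * s for l, s in zip(lipid_basis, spectrum))
    s_1s = sum(spectrum)
    det = s_ll * n - s_l1 * s_l1
    a = (s_ls * n - s_l1 * s_1s) / det
    b = (s_ll * s_1s - s_l1 * s_ls) / det
    return a, b
```

The paper's corrected lipid score then combines a and b, since a high non-lipid weight (e.g., from an abrupt surface reflection) flags a likely false positive in the raw lipid score.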
Romm, H; Ainsbury, E; Bajinskis, A; Barnard, S; Barquinero, J F; Barrios, L; Beinke, C; Puig-Casanovas, R; Deperas-Kaminska, M; Gregoire, E; Oestreicher, U; Lindholm, C; Moquet, J; Rothkamm, K; Sommer, S; Thierens, H; Vral, A; Vandersickel, V; Wojcik, A
2014-05-01
In the case of a large scale radiation accident high throughput methods of biological dosimetry for population triage are needed to identify individuals requiring clinical treatment. The dicentric assay performed in web-based scoring mode may be a very suitable technique. Within the MULTIBIODOSE EU FP7 project a network is being established of 8 laboratories with expertise in dose estimations based on the dicentric assay. Here, the manual dicentric assay was tested in a web-based scoring mode. More than 23,000 high resolution images of metaphase spreads (only first mitosis) were captured by four laboratories and established as image galleries on the internet (cloud). The galleries included images of a complete dose effect curve (0-5.0 Gy) and three types of irradiation scenarios simulating acute whole body, partial body and protracted exposure. The blood samples had been irradiated in vitro with gamma rays at the University of Ghent, Belgium. Two laboratories provided image galleries from Fluorescence plus Giemsa stained slides (3 h colcemid) and the image galleries from the other two laboratories contained images from Giemsa stained preparations (24 h colcemid). Each of the 8 participating laboratories analysed 3 dose points of the dose effect curve (scoring 100 cells for each point) and 3 unknown dose points (50 cells) for each of the 3 simulated irradiation scenarios. At first all analyses were performed in a QuickScan Mode without scoring individual chromosomes, followed by conventional scoring (only complete cells, 46 centromeres). The calibration curves obtained using these two scoring methods were very similar, with no significant difference in the linear-quadratic curve coefficients. Analysis of variance showed a significant effect of dose on the yield of dicentrics, but no significant effect of the laboratories, different methods of slide preparation or different incubation times used for colcemid. 
The results obtained to date within the MULTIBIODOSE project by a network of 8 collaborating laboratories throughout Europe are very promising. The dicentric assay in the web based scoring mode as a high throughput scoring strategy is a useful application for biodosimetry in the case of a large scale radiation accident.
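Turning a scored dicentric yield into a dose estimate means inverting the fitted linear-quadratic calibration curve Y = c + αD + βD². A minimal sketch; the coefficients in the example are illustrative round numbers, not the MULTIBIODOSE fits:

```python
def dose_from_yield(dicentric_yield, c, alpha, beta):
    """Invert Y = c + alpha*D + beta*D**2: solve
    beta*D**2 + alpha*D + (c - Y) = 0 for the positive root."""
    discriminant = alpha ** 2 - 4.0 * beta * (c - dicentric_yield)
    return (-alpha + discriminant ** 0.5) / (2.0 * beta)

# Illustrative coefficients (dicentrics per cell): background yield c,
# linear term alpha, quadratic term beta -- assumed values for the sketch.
estimated_dose_gy = dose_from_yield(dicentric_yield=0.281,
                                    c=0.001, alpha=0.02, beta=0.06)
```

The finding above that QuickScan and conventional scoring gave statistically indistinguishable linear-quadratic coefficients is what allows either mode to feed the same inversion.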
Kernel Equating Under the Non-Equivalent Groups With Covariates Design
Wiberg, Marie; Bränberg, Kenny
2015-01-01
When equating two tests, the traditional approach is to use common test takers and/or common items. Here, the idea is to use variables correlated with the test scores (e.g., school grades and other test scores) as a substitute for common items in a non-equivalent groups with covariates (NEC) design. This is performed in the framework of kernel equating and with an extension of the method developed for post-stratification equating in the non-equivalent groups with anchor test design. Real data from a college admissions test were used to illustrate the use of the design. The equated scores from the NEC design were compared with equated scores from the equivalent group (EG) design, that is, equating with no covariates as well as with equated scores when a constructed anchor test was used. The results indicate that the NEC design can produce lower standard errors compared with an EG design. When covariates were used together with an anchor test, the smallest standard errors were obtained over a large range of test scores. The results obtained, that an EG design equating can be improved by adjusting for differences in test score distributions caused by differences in the distribution of covariates, are useful in practice because not all standardized tests have anchor tests. PMID:29881012
ERIC Educational Resources Information Center
Chen, Hanwei; Cui, Zhongmin; Zhu, Rongchun; Gao, Xiaohong
2010-01-01
The most critical feature of a common-item nonequivalent groups equating design is that the average score difference between the new and old groups can be accurately decomposed into a group ability difference and a form difficulty difference. Two widely used observed-score linear equating methods, the Tucker and the Levine observed-score methods,…
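Both the Tucker and the Levine observed-score methods ultimately apply a linear transformation that matches synthetic-population means and standard deviations across forms; the two methods differ in how those moments are estimated from the common items, not in the transformation itself. A sketch of that final step, taking the moments as given:

```python
def linear_equate(x, mean_x, sd_x, mean_y, sd_y):
    """Map a score x on form X onto the form-Y scale so that the
    equated scores have form Y's mean and standard deviation in the
    synthetic population (the shared final step of Tucker and Levine
    observed-score linear equating)."""
    return mean_y + (sd_y / sd_x) * (x - mean_x)

# Illustrative moments: a score of 30 on a form with mean 25, SD 5 maps
# to the scale of a form with mean 50, SD 10.
equated = linear_equate(30, mean_x=25, sd_x=5, mean_y=50, sd_y=10)
```

The decomposition discussed above matters because the group-ability and form-difficulty components enter these synthetic-population moments differently under the two methods.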
ERIC Educational Resources Information Center
Gelfand, Stanley A.; Gelfand, Jessica T.
2012-01-01
Method: Complete psychometric functions for phoneme and word recognition scores at 8 signal-to-noise ratios from -15 dB to 20 dB were generated for the first 10, 20, and 25, as well as all 50, three-word presentations of the Tri-Word or Computer Assisted Speech Recognition Assessment (CASRA) Test (Gelfand, 1998) based on the results of 12…
ERIC Educational Resources Information Center
Jin, Ying; Myers, Nicholas D.; Ahn, Soyeon
2014-01-01
Previous research has demonstrated that differential item functioning (DIF) methods that do not account for multilevel data structure could result in too frequent rejection of the null hypothesis (i.e., no DIF) when the intraclass correlation coefficient (ρ) of the studied item was the same as the ρ of the total score. The current study extended…
McDonald, Scott D; Thompson, NiVonne L; Stratton, Kelcey J; Calhoun, Patrick S
2014-03-01
Self-report questionnaires are frequently used to identify PTSD among U.S. military personnel and Veterans. Two common scoring methods used to classify PTSD include: (1) a cut score threshold and (2) endorsement of PTSD symptoms meeting DSM-IV-TR symptom cluster criteria (SCM). A third method requiring a cut score in addition to SCM has been proposed, but has received little study. The current study examined the diagnostic accuracy of three scoring methods for the Davidson Trauma Scale (DTS) among 804 Afghanistan and Iraq war-era military Service Members and Veterans. Data were weighted to approximate the prevalence of PTSD and other Axis I disorders in VA primary care. As expected, adding a cut score criterion to SCM improved specificity and positive predictive power. However, a cut score of 68-72 provided optimal diagnostic accuracy. The utility of the DTS, the role of baseline prevalence, and recommendations for future research are discussed. Published by Elsevier Ltd.
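The three scoring rules compared above can be sketched as follows. The symptom-cluster layout and the notion of an "endorsed" symptom count are simplified assumptions for illustration, not the DTS's actual item map or endorsement rule:

```python
# DSM-IV-TR minimum endorsed symptoms per cluster (assumed layout):
# B = re-experiencing, C = avoidance/numbing, D = hyperarousal
DSM_IV_MINIMA = {"B": 1, "C": 3, "D": 2}

def meets_scm(endorsed_counts):
    """Symptom cluster method: endorsed_counts maps each cluster letter
    to the number of symptoms the respondent endorsed."""
    return all(endorsed_counts.get(cluster, 0) >= minimum
               for cluster, minimum in DSM_IV_MINIMA.items())

def classify_ptsd(total_score, endorsed_counts, cut=68):
    """The three scoring methods: cut score alone, SCM alone, and the
    combined rule (cut score in addition to SCM)."""
    by_cut = total_score >= cut
    by_scm = meets_scm(endorsed_counts)
    return {"cut": by_cut, "scm": by_scm, "combined": by_cut and by_scm}
```

Because the combined rule is the conjunction of the other two, it can only tighten the positive classification, which is why it improves specificity and positive predictive power as the study found.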
Hurault, G; Schram, M E; Roekevisch, E; Spuls, P I; Tanaka, R J
2018-06-26
The Harmonizing Outcome Measures for Eczema (HOME) initiative recommended the Eczema Area and Severity Index (EASI) as the core outcome instrument for measuring the clinical signs of atopic dermatitis (AD). However, EASI may not have been used in previous clinical trials, and other scores, e.g. SCORAD (SCORing Atopic Dermatitis), the objective component of SCORAD (oSCORAD) and the Investigator Global Assessment (IGA), remain widely used. It is useful to establish a method to convert these scores into EASI to compare the results from different studies effectively. Indeed, EASI and oSCORAD have been found to be strongly correlated (Spearman's r = 0.92), suggesting that a relationship between the two scores can be found. This article is protected by copyright. All rights reserved.
Shahhosseini, Zohreh; Hamzehgardeshi, Zeinab
2015-01-01
Background: Since several factors affect nurses' participation in Continuing Education, and nurses' Continuing Education affects patients' and community health status, it is essential to know the facilitators of and barriers to participation in Continuing Education programs and to plan accordingly. This mixed-methods study aimed to investigate the facilitators and barriers of nurses' participation and to explore nurses' perceptions of the most common facilitators and barriers. Methods: An explanatory sequential mixed-methods design (follow-up explanations variant) was used: quantitative data were collected first (361 nurses), and the quantitative results were then explained through in-depth interviews in a qualitative study. Results: The results showed that the mean score of facilitators of nurses' participation in Continuing Education was significantly higher than the mean score of barriers (61.99±10.85 versus 51.17±12.83; p<0.001, t=12.23). The facilitator of nurses' participation in Continuing Education with the highest mean score was "Update my knowledge". In reviewing the written responses in the qualitative phase, two main themes, updating information and updating professional skills, were extracted as the most common facilitators, and lack of support as the most common barrier to nurses' participation in Continuing Education programs. Conclusion: Given the important role of Continuing Education in professional skills, nurse managers should facilitate nurses' participation in Continuing Education. PMID:25948439
Koohestani, Hamid Reza; Baghcheghi, Nayereh
2016-01-01
Background: Team-based learning is a structured type of cooperative learning that is becoming increasingly popular in nursing education. This study compares nursing students' perceptions of the psychosocial climate of the classroom between a conventional lecture group and a team-based learning group. Methods: In a quasi-experimental study with a pretest-posttest design, 38 second-year nursing students participated. One half of the 16 sessions of a cardiovascular disease nursing course was taught by lecture and the second half with team-based learning. The modified College and University Classroom Environment Inventory (CUCEI) was used to measure the perception of the classroom environment; it was completed after the final lecture and TBL sessions. Results: Results revealed a significant difference in the mean scores of psychosocial climate for the TBL method (Mean (SD): 179.8(8.27)) versus the lecture method (Mean (SD): 154.2(13.44)). Also, the results showed significant differences between the two groups in the innovation (p<0.001), student cohesiveness (p=0.01), cooperation (p<0.001) and equity (p=0.03) subscale scores. Conclusion: This study provides evidence that team-based learning has a positive effect on nursing students' perceptions of the psychosocial climate of the classroom. PMID:28210602
AUC-based biomarker ensemble with an application on gene scores predicting low bone mineral density.
Zhao, X G; Dai, W; Li, Y; Tian, L
2011-11-01
The area under the receiver operating characteristic (ROC) curve (AUC), long regarded as a 'golden' measure for the predictiveness of a continuous score, has propelled the need to develop AUC-based predictors. However, AUC-based ensemble methods are rather scant, largely because the associated objective function is neither continuous nor concave. Indeed, there is no reliable numerical algorithm for identifying the optimal combination of a set of biomarkers to maximize the AUC, especially when the number of biomarkers is large. We have proposed a novel AUC-based statistical ensemble method for combining multiple biomarkers to differentiate a binary response of interest. Specifically, we propose to replace the non-continuous and non-convex AUC objective function with a convex surrogate loss function, whose minimizer can be efficiently identified. Within the established framework, the lasso and other regularization techniques enable feature selection. Extensive simulations have demonstrated the superiority of the new methods over existing methods. The proposal has been applied to a gene expression dataset to construct gene expression scores that differentiate elderly women with low bone mineral density (BMD) from those with normal BMD. The AUCs of the resulting scores in the independent test dataset were satisfactory. By aiming directly to maximize the AUC, the proposed AUC-based ensemble method provides an efficient means of generating a stable combination of multiple biomarkers, which is especially useful under high-dimensional settings. Contact: lutian@stanford.edu. Supplementary data are available at Bioinformatics online.
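The quantity being maximized above has a simple empirical form: the AUC of a continuous score equals the probability that a randomly chosen case outranks a randomly chosen control, with ties counted half. A brute-force sketch of that Mann-Whitney formulation (the paper's contribution is to replace direct maximization of this non-concave objective with a convex surrogate, which this sketch does not implement):

```python
def auc(case_scores, control_scores):
    """Empirical AUC as the Mann-Whitney statistic: the fraction of
    (case, control) pairs in which the case scores higher, counting
    tied pairs as half a win. O(n*m); fine for illustration."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

The step function hidden in the pairwise comparison (`case > control`) is exactly what makes the objective non-continuous in the combination weights, motivating the surrogate-loss approach.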
Cheng, Shu-Fen; Rose, Susan
2009-01-01
This study investigated the technical adequacy of curriculum-based measures of written expression (CBM-W) in terms of writing prompts and scoring methods for deaf and hard-of-hearing students. Twenty-two students at the secondary school level completed 3-min essays within two weeks, which were scored with nine existing and alternative curriculum-based measurement (CBM) scoring methods. The technical features of the nine scoring methods were examined for interrater reliability, alternate-form reliability, and criterion-related validity. The existing CBM scoring method of number of correct minus incorrect word sequences yielded the highest reliability and validity coefficients. The findings from this study support the use of the CBM-W as a reliable and valid tool for assessing general writing proficiency with secondary students who are deaf or hard of hearing. The CBM alternative scoring methods that may serve as additional indicators of written expression include correct subject-verb agreements, correct clauses, and correct morphemes.
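The best-performing metric above, correct minus incorrect word sequences, scores every pair of adjacent words in the essay. Judging whether a given sequence is correct requires a trained scorer, so this sketch takes the per-sequence judgments as given inputs:

```python
def correct_minus_incorrect_word_sequences(judgments):
    """judgments: one boolean per adjacent word pair in the essay,
    True if the scorer judged that word sequence correct.
    Returns CWS - IWS, the CBM-W metric with the highest reliability
    and validity coefficients in the study above."""
    correct = sum(judgments)
    incorrect = len(judgments) - correct
    return correct - incorrect
```

Subtracting incorrect sequences (rather than counting correct ones alone) penalizes long but error-ridden writing, which is likely why this metric separates proficiency levels better than raw production counts.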
NASA Astrophysics Data System (ADS)
Crawford, I.; Ruske, S.; Topping, D. O.; Gallagher, M. W.
2015-11-01
In this paper we present improved methods for discriminating and quantifying primary biological aerosol particles (PBAPs) by applying hierarchical agglomerative cluster analysis to multi-parameter ultraviolet-light-induced fluorescence (UV-LIF) spectrometer data. The methods employed in this study can be applied to data sets in excess of 1 × 10^6 points on a desktop computer, allowing for each fluorescent particle in a data set to be explicitly clustered. This reduces the potential for misattribution found in subsampling and comparative attribution methods used in previous approaches, improving our capacity to discriminate and quantify PBAP meta-classes. We evaluate the performance of several hierarchical agglomerative cluster analysis linkages and data normalisation methods using laboratory samples of known particle types and an ambient data set. Fluorescent and non-fluorescent polystyrene latex spheres were sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4) where the optical size, asymmetry factor and fluorescent measurements were used as inputs to the analysis package. It was found that the Ward linkage with z-score or range normalisation performed best, correctly attributing 98.0% and 98.1% of the data points, respectively. The best-performing methods were applied to the BEACHON-RoMBAS (Bio-hydro-atmosphere interactions of Energy, Aerosols, Carbon, H2O, Organics and Nitrogen-Rocky Mountain Biogenic Aerosol Study) ambient data set, where it was found that the z-score and range normalisation methods yield similar results, with each method producing clusters representative of fungal spores and bacterial aerosol, consistent with previous results. 
The z-score result was compared to clusters generated with previous approaches (WIBS AnalysiS Program, WASP), where we observe that the subsampling and comparative attribution method employed by WASP results in the overestimation of the fungal spore concentration by a factor of 1.5 and the underestimation of the bacterial aerosol concentration by a factor of 5. We suggest that this is likely due to errors arising from misattribution due to poor centroid definition and failure to assign particles to a cluster as a result of the subsampling and comparative attribution method employed by WASP. The methods used here allow for the entire fluorescent population of particles to be analysed, yielding an explicit cluster attribution for each particle and improving cluster centroid definition and our capacity to discriminate and quantify PBAP meta-classes compared to previous approaches.
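The core pipeline described here, z-score normalisation followed by Ward-linkage hierarchical agglomerative clustering, can be sketched with SciPy. This is a toy illustration on synthetic "particle" data, not the WIBS analysis code; the three feature columns are placeholders for optical size, asymmetry factor and fluorescence.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def zscore(X):
    """Column-wise z-score normalisation (one of the two
    best-performing normalisations in the comparison)."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

# two synthetic particle types with distinct size/asymmetry/fluorescence
rng = np.random.default_rng(1)
type_a = rng.normal([1.0, 5.0, 0.2], 0.1, (100, 3))
type_b = rng.normal([3.0, 1.0, 0.8], 0.1, (100, 3))
X = zscore(np.vstack([type_a, type_b]))

Z = linkage(X, method="ward")                  # Ward linkage performed best
labels = fcluster(Z, t=2, criterion="maxclust")  # cut the tree into 2 clusters
```

With well-separated meta-classes, cutting the dendrogram at two clusters recovers the particle types exactly.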
Arane, Karen; Mendelsohn, Kerry; Mimouni, Michael; Mimouni, Francis; Koren, Yael; Simon, Dafna Brik; Bahat, Hilla; Helou, Mona Hanna; Mendelson, Amir; Hezkelo, Nofar; Glatstein, Miguel; Berkun, Yackov; Eisenstein, Eli; Aviel, Yonatan Butbul; Brik, Riva; Hashkes, Philip J; Uziel, Yosef; Harel, Liora; Amarilyo, Gil
2018-05-24
This study assessed the validity of using established Japanese risk scoring methods to predict intravenous immunoglobulin (IVIG) resistance in Kawasaki disease in Israeli children. We reviewed the medical records of 282 patients (70% male) with Kawasaki disease from six Israeli medical centres between 2004 and 2013. Their mean age was 2.5 years. The risk scores were calculated using the Kobayashi, Sano and Egami scoring methods and analysed to determine whether a higher risk score predicted IVIG resistance in this population. Factors that predicted a lack of response to the initial IVIG dose were identified. We found that 18% did not respond to the first IVIG dose. The three scoring methods were unable to reliably predict IVIG resistance, with sensitivities of 23-32% and specificities of 67-87%. Calculating a predictive score specific to this population was also unsuccessful. The factors that predicted a lack of response to the first IVIG dose included low albumin, elevated total bilirubin and ethnicity. The established risk scoring methods created for Japanese populations with Kawasaki disease were not suitable for predicting IVIG resistance in Caucasian Israeli children, and we were unable to create a specific scoring method that could do this. This article is protected by copyright. All rights reserved.
NASA Astrophysics Data System (ADS)
Tan, Maxine; Aghaei, Faranak; Wang, Yunzhi; Qian, Wei; Zheng, Bin
2016-03-01
Current commercialized CAD schemes have high false-positive (FP) detection rates and also correlate highly with radiologists in positive lesion detection. Thus, we recently investigated a new approach to improve the efficacy of applying CAD to assist radiologists in reading and interpreting screening mammograms. Namely, we developed a new global-feature-based CAD approach/scheme that can flag cases at high risk of being positive. In this study, we investigate the possibility of fusing global or case-based scores with the local or lesion-based CAD scores using an adaptive cueing method. We hypothesize that the information from global feature extraction (features extracted from the whole breast region) differs from, and can provide supplementary information to, the locally extracted features (computed from the segmented lesion regions only). On a large and diverse full-field digital mammography (FFDM) testing dataset with 785 cases (347 negative and 438 cancer cases with masses only), we ran our lesion-based and case-based CAD schemes "as is" on the whole dataset. To assess the supplementary information provided by the global features, we used an adaptive cueing method to adaptively adjust the original CAD-generated detection score (Sorg) of a detected suspicious mass region based on the computed case-based score (Scase) of the case associated with this region. Using the adaptive cueing method, better sensitivity was obtained at lower FP rates (<= 1 FP per image); namely, sensitivity increases (in the FROC curves) of up to 6.7% and 8.2% were obtained for the ROI- and case-based results, respectively.
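The abstract does not give the exact score-adjustment rule, so the following is a purely illustrative sketch of adaptive cueing under an assumed linear rule: each lesion-based score Sorg is nudged up or down according to how far the case-based score Scase sits from a neutral 0.5.

```python
import numpy as np

def adaptive_cueing(s_org, s_case, alpha=0.5):
    """Hypothetical fusion rule (not the study's actual formula):
    boost region scores on high-risk cases, damp them on low-risk
    cases, clipping to the [0, 1] score range."""
    return np.clip(s_org + alpha * (s_case - 0.5), 0.0, 1.0)
```

Any monotone fusion of this shape preserves the ranking of regions within a case while reordering regions across cases, which is the mechanism by which case-level context can trade false positives for sensitivity.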
Examining Classification Criteria: A Comparison of Three Cut Score Methods
ERIC Educational Resources Information Center
DiStefano, Christine; Morgan, Grant
2011-01-01
This study compared 3 different methods of creating cut scores for a screening instrument: T scores, receiver operating characteristic (ROC) curve analysis, and the Rasch rating scale method (RSM), for use with the Behavioral and Emotional Screening System (BESS) Teacher Rating Scale for Children and Adolescents (Kamphaus & Reynolds, 2007).…
Ryan, James G.; Barlas, David; Pollack, Simcha
2013-01-01
Background Medical knowledge (MK) in residents is commonly assessed by the in-training examination (ITE) and faculty evaluations of resident performance. Objective We assessed the reliability of clinical evaluations of residents by faculty and the relationship between faculty assessments of resident performance and ITE scores. Methods We conducted a cross-sectional, observational study at an academic emergency department with a postgraduate year (PGY)-1 to PGY-3 emergency medicine residency program, comparing summative, quarterly, faculty evaluation data for MK and overall clinical competency (OC) with annual ITE scores, accounting for PGY level. We also assessed the reliability of faculty evaluations using a random effects, intraclass correlation analysis. Results We analyzed data for 59 emergency medicine residents during a 6-year period. Faculty evaluations of MK and OC were highly reliable (κ = 0.99) and remained reliable after stratification by year of training (mean κ = 0.68–0.84). Assessments of resident performance (MK and OC) and the ITE increased with PGY level. The MK and OC results had high correlations with PGY level, and ITE scores correlated moderately with PGY. The OC and MK results had a moderate correlation with ITE score. When residents were grouped by PGY level, there was no significant correlation between MK as assessed by the faculty and the ITE score. Conclusions Resident clinical performance and ITE scores both increase with resident PGY level, but ITE scores do not predict resident clinical performance compared with peers at their PGY level. PMID:24455005
Harker, JO; Leung, JW; Siao-Salera, RM; Mann, SK; Ramirez, FC; Friedland, S; Amato, A; Radaelli, F; Paggi, S; Terruzzi, V; Hsieh, YH
2011-01-01
Introduction Variation in the outcomes of RCTs comparing water-related methods and air insufflation during the insertion phase of colonoscopy raises challenging questions regarding the approach. This report reviews the impact of water exchange on the variation in attenuation of pain during colonoscopy by water-related methods. Methods Medline (2008 to 2011) searches, abstracts of the 2011 Digestive Disease Week (DDW) and personal communications were considered to identify RCTs that compared water-related methods and air insufflation to aid insertion of the colonoscope. Results Since 2008, nine published and one submitted RCTs and five abstracts of RCTs presented at the 2011 DDW have been identified. Thirteen RCTs (nine published, one submitted and one abstract; n=1850) described a reduction of pain score during or after colonoscopy (eleven reported statistical significance); the remaining reports described lower doses of medication used, or a lower proportion of patients experiencing severe pain, in colonoscopy performed with water-related methods compared with air insufflation (Tables 1 and 2). The water-related methods differ notably in the timing of removal of the infused water: predominantly during insertion (water exchange) versus predominantly during withdrawal (water immersion). Use of water exchange was consistently associated with a greater attenuation of pain score in patients who did not receive full sedation (Table 3). Conclusion The comparative data reveal that a greater attenuation of pain was associated with water exchange than with water immersion during insertion. These intriguing results should be subjected to further evaluation by additional RCTs to elucidate the mechanism of the pain-alleviating impact of the water method. PMID:22163081
Score Equating and Item Response Theory: Some Practical Considerations.
ERIC Educational Resources Information Center
Cook, Linda L.; Eignor, Daniel R.
The purposes of this paper are fivefold: to discuss (1) when item response theory (IRT) equating methods should provide better results than traditional methods; (2) which IRT model, the three-parameter logistic or the one-parameter logistic (Rasch), is the most reasonable to use; (3) what unique contributions IRT methods can offer the equating…
Chiong, Terri; Cheow, Esther S. H.; Woo, Chin C.; Lin, Xiao Y.; Khin, Lay W.; Lee, Chuen N.; Hartman, Mikael; Sze, Siu K.; Sorokin, Vitaly A.
2016-01-01
Aims: The SYNTAX score correlates with major cardiovascular events post-revascularization, although the histopathological basis is unclear. We aimed to evaluate the association between the SYNTAX score and extracellular matrix histological characteristics of aortic punch tissue obtained during coronary artery bypass surgery (CABG). This analysis compares CABG patients with high and low SYNTAX scores who were followed up for a one-year period. Methods and Results: Patients with high (score ≥33; n=77) and low (score ≤22; n=71) SYNTAX scores undergoing elective CABG were recruited prospectively. Baseline clinical characteristics and surgical risks were well matched. At 1 year, EMACCE (the sum of cardiovascular death, stroke, congestive cardiac failure, and limb, gut and myocardial ischemia) was significantly elevated in the high SYNTAX group (P=0.022). Mass spectrometry (MS)-based quantitative iTRAQ proteomic results, validated on an independent cohort by immunohistochemistry (IHC), revealed that the high SYNTAX group had significantly elevated Collagen I (P<0.0001) and Elastin (P<0.0001) content in the ascending aortic wall. Conclusion: This study shows that the aortic extracellular matrix (ECM) differs between high and low SYNTAX groups, with up-regulation of Collagen I and Elastin levels in the high SYNTAX score group. This identifies aortic punches collected during CABG as another biomarker source related to atherosclerosis severity and possible clinical outcome. PMID:27347220
Schagemann, Jan C.; Rudert, Nicola; Taylor, Michelle E.; Sim, Sotcheadt; Quenneville, Eric; Garon, Martin; Klinger, Mathias; Buschmann, Michael D.; Mittelstaedt, Hagen
2016-01-01
Objective To compare the regenerative capacity of 2 distinct bilayer implants for the restoration of osteochondral defects in a preliminary sheep model. Methods Critical sized osteochondral defects were treated with a novel biomimetic poly-ε-caprolactone (PCL) implant (Treatment No. 2; n = 6) or a combination of Chondro-Gide and Orthoss (Treatment No. 1; n = 6). At 19 months postoperation, repair tissue (n = 5 each) was analyzed for histology and biochemistry. Electromechanical mappings (Arthro-BST) were performed ex vivo. Results Histological scores, electromechanical quantitative parameter values, dsDNA and sGAG contents measured at the repair sites were statistically lower than those obtained from the contralateral surfaces. Electromechanical mappings and higher dsDNA and sGAG/weight levels indicated better regeneration for Treatment No. 1. However, these differences were not significant. For both treatments, Arthro-BST revealed early signs of degeneration of the cartilage surrounding the repair site. The International Cartilage Repair Society II histological scores of the repair tissue were significantly higher for Treatment No. 1 (10.3 ± 0.38 SE) compared to Treatment No. 2 (8.7 ± 0.45 SE). The parameters cell morphology and vascularization scored highest whereas tidemark formation scored the lowest. Conclusion There was cell infiltration and regeneration of bone and cartilage. However, repair was incomplete and fibrocartilaginous. There were no significant differences in the quality of regeneration between the treatments except in some histological scoring categories. The results from Arthro-BST measurements were comparable to traditional invasive/destructive methods of measuring quality of cartilage repair. PMID:27688843
EVALUATION OF SAFETY IN A RADIATION ONCOLOGY SETTING USING FAILURE MODE AND EFFECTS ANALYSIS
Ford, Eric C.; Gaudette, Ray; Myers, Lee; Vanderver, Bruce; Engineer, Lilly; Zellars, Richard; Song, Danny Y.; Wong, John; DeWeese, Theodore L.
2013-01-01
Purpose Failure mode and effects analysis (FMEA) is a widely used tool for prospectively evaluating safety and reliability. We report our experiences in applying FMEA in the setting of radiation oncology. Methods and Materials We performed an FMEA analysis for our external beam radiation therapy service, which consisted of the following tasks: (1) create a visual map of the process; (2) identify possible failure modes and assign risk probability numbers (RPNs) to each failure mode based on tabulated scores for severity, frequency of occurrence, and detectability, each on a scale of 1 to 10; and (3) identify improvements that are both feasible and effective. The RPN scores can span a range of 1 to 1000, with higher scores indicating the relative importance of a given failure mode. Results Our process map consisted of 269 different nodes. We identified 127 possible failure modes with RPN scores ranging from 2 to 160. Fifteen of the top-ranked failure modes, representing RPN scores of 75 and more, were considered for process improvements. These specific improvement suggestions were incorporated into our practice with a review and implementation by each department team responsible for the process. Conclusions The FMEA technique provides a systematic method for finding vulnerabilities in a process before they result in an error. The FMEA framework can naturally incorporate further quantification and monitoring. A general-use system for incident and near miss reporting would be useful in this regard. PMID:19409731
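The RPN arithmetic described above is simple enough to sketch directly; the failure modes below are hypothetical examples for illustration, not items from the study's actual process map.

```python
def rpn(severity, occurrence, detectability):
    """Risk priority number: the product of three 1-10 ratings,
    giving a possible range of 1 to 1000."""
    for rating in (severity, occurrence, detectability):
        if not 1 <= rating <= 10:
            raise ValueError("each rating must be on a 1-10 scale")
    return severity * occurrence * detectability

# hypothetical failure modes: (severity, occurrence, detectability)
failure_modes = {
    "wrong patient plan loaded": (9, 2, 4),
    "couch position mis-set": (6, 3, 5),
    "beam-on without imaging check": (8, 2, 2),
}
# rank failure modes by descending RPN for improvement triage
ranked = sorted(failure_modes, key=lambda k: rpn(*failure_modes[k]), reverse=True)
```

Note how a moderately severe but frequent and hard-to-detect mode can outrank a more severe one, which is exactly why FMEA multiplies the three ratings rather than using severity alone.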
Sexual Self-concept and Its Relationship to Depression, Stress and Anxiety in Postmenopausal Women
Heidari, Mohammad; Rafiei, Hossein
2017-01-01
Objectives Women experience more mood swings during menopause than before it, and sexual self-concept and sexual aspects of self-knowledge appear to have a great impact on their mental health. This study aimed to investigate sexual self-concept and its relationship to depression, stress and anxiety in postmenopausal women. Methods In this descriptive correlational study, 300 postmenopausal women referred to healthcare and medical treatment centers in Abadeh city were selected by convenience sampling. Data were collected using the Multidimensional Sexual Self-Concept Questionnaire and the Depression Anxiety Stress Scale 21 (DASS-21) and analysed with SPSS 17. Results The mean positive sexual self-concept score was 41.03 ± 8.66 and the mean negative sexual self-concept score was 110.32 ± 43.05. Depression, stress and anxiety were at a severe level in 35.67%, 32.33% and 37.67% of the women, respectively. Positive and negative sexual self-concept scores were significantly correlated with stress, anxiety and depression scores in postmenopausal women (P < 0.05). Conclusions Given the severe levels of stress, anxiety and depression and the significant correlation between these outcomes and a negative or weak sexual self-concept, more careful attention to women's mental health and appropriate interventions are needed. PMID:28523258
Researching Group Assessment: Jazz in the Conservatoire
ERIC Educational Resources Information Center
Barratt, Elisabeth; Moore, Hilary
2005-01-01
This article presents the results of research into methods and scoring for jazz assessment at Trinity College of Music, London, focusing on the possibility of introducing group assessment. It considers the advantages of group assessment methods, contrasting these with the more traditional approach, firmly established in conservatoires, of…
Hamilton-Craig, Christian R; Chow, Clara K; Younger, John F; Jelinek, V M; Chan, Jonathan; Liew, Gary Yh
2017-10-16
Introduction This article summarises the Cardiac Society of Australia and New Zealand position statement on coronary artery calcium (CAC) scoring. CAC scoring is a non-invasive method for quantifying coronary artery calcification using computed tomography. It is a marker of atherosclerotic plaque burden and the strongest independent predictor of future myocardial infarction and mortality. CAC scoring provides incremental risk information beyond traditional risk calculators such as the Framingham Risk Score. Its use for risk stratification is confined to primary prevention of cardiovascular events, and can be considered as individualised coronary risk scoring for intermediate risk patients, allowing reclassification to low or high risk based on the score. Medical practitioners should carefully counsel patients before CAC testing, which should only be undertaken if an alteration in therapy, including embarking on pharmacotherapy, is being considered based on the test result. Main recommendations CAC scoring should primarily be performed on individuals without coronary disease aged 45-75 years (absolute 5-year cardiovascular risk of 10-15%) who are asymptomatic. CAC scoring is also reasonable in lower risk groups (absolute 5-year cardiovascular risk, < 10%) where risk scores traditionally underestimate risk (eg, family history of premature CVD) and in patients with diabetes aged 40-60 years. We recommend aspirin and a high efficacy statin in high risk patients, defined as those with a CAC score ≥ 400, or a CAC score of 100-399 and above the 75th percentile for age and sex. It is reasonable to treat patients with CAC scores ≥ 100 with aspirin and a statin. It is reasonable not to treat asymptomatic patients with a CAC score of zero. Changes in management as a result of this statement Cardiovascular risk is reclassified according to CAC score. High risk patients are treated with a high efficacy statin and aspirin. 
Very low risk patients (ie, CAC score of zero) do not benefit from treatment.
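The treatment thresholds in the recommendations above can be sketched as a simple decision function. This illustrates the stated cut-points only and is not clinical software; the sub-100 branch is an assumption, since the statement does not prescribe therapy in that range.

```python
def cac_recommendation(cac_score, above_75th_percentile=False):
    """Map a CAC score to the position statement's risk guidance."""
    if cac_score == 0:
        return "very low risk: no treatment benefit expected"
    if cac_score >= 400 or (cac_score >= 100 and above_75th_percentile):
        return "high risk: aspirin and high-efficacy statin"
    if cac_score >= 100:
        return "reasonable to treat with aspirin and a statin"
    # the statement does not prescribe therapy for scores of 1-99
    return "individualised assessment"
```

The branch order matters: a score of 100-399 is escalated to high risk only when it also exceeds the 75th percentile for age and sex.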
A Bayesian Scoring Technique for Mining Predictive and Non-Spurious Rules
Batal, Iyad; Cooper, Gregory; Hauskrecht, Milos
2015-01-01
Rule mining is an important class of data mining methods for discovering interesting patterns in data. The success of a rule mining method heavily depends on the evaluation function that is used to assess the quality of the rules. In this work, we propose a new rule evaluation score - the Predictive and Non-Spurious Rules (PNSR) score. This score relies on Bayesian inference to evaluate the quality of the rules and considers the structure of the rules to filter out spurious rules. We present an efficient algorithm for finding rules with high PNSR scores. The experiments demonstrate that our method is able to cover and explain the data with a much smaller rule set than existing methods. PMID:25938136
Lee, Onseok; Park, Sunup; Kim, Jaeyoung; Oh, Chilhwan
2017-11-01
The visual scoring method has been used as a subjective evaluation of pigmentary skin disorders. The severity of pigmentary skin disease, especially melasma, is evaluated using a visual scoring method, the MASI (Melasma Area and Severity Index). This study differentiates between epidermal and dermal pigmented disease and was undertaken to determine methods to quantitatively measure the severity of pigmentary skin disorders under ultraviolet illumination. The optical imaging system consists of illumination (white LED, UV-A lamp) and image acquisition (DSLR camera, air-cooled CMOS CCD camera). Each camera is equipped with a polarizing filter to remove glare. To analyze images under visible and UV light, images are divided into the frontal, cheek, and chin regions of melasma patients, and each image undergoes image processing. To reduce the curvature error in facial contours, a gradient mask is used. The new method of segmenting frontal and lateral facial images is more objective for face-area measurement than the MASI score. Image analysis of darkness and homogeneity is adequate to quantify the conventional MASI score. Under visible light, active lesion margins appear in both epidermal and dermal melanin, whereas under UV light melanin is found in the epidermis. This study objectively analyzes the severity of melasma and attempts to develop new methods of image analysis with ultraviolet optical imaging equipment. Based on the results of this study, our optical imaging system could be used as a valuable tool to assess the severity of pigmentary skin disease. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Improved score statistics for meta-analysis in single-variant and gene-level association studies.
Yang, Jingjing; Chen, Sai; Abecasis, Gonçalo
2018-06-01
Meta-analysis is now an essential tool for genetic association studies, allowing them to combine large studies and greatly accelerating the pace of genetic discovery. Although the standard meta-analysis methods perform equivalently to the more cumbersome joint analysis under ideal settings, they result in substantial power loss under unbalanced settings with various case-control ratios. Here, we investigate the power loss problem of the standard meta-analysis methods for unbalanced studies, and further propose novel meta-analysis methods that perform equivalently to the joint analysis under both balanced and unbalanced settings. We derive improved meta-score-statistics that can accurately approximate the joint-score-statistics computed from combined individual-level data, for both linear and logistic regression models, with and without covariates. In addition, we propose a novel approach to adjust for population stratification by correcting for known population structures through minor allele frequencies. In simulated gene-level association studies under unbalanced settings, our method recovered up to 85% of the power loss caused by the standard methods. We further showed the power gain of our methods in gene-level tests with 26 unbalanced studies of age-related macular degeneration. In addition, we took the meta-analysis of three unbalanced studies of type 2 diabetes as an example to discuss the challenges of meta-analyzing multi-ethnic samples. In summary, our improved meta-score-statistics with corrections for population stratification can be used to construct both single-variant and gene-level association studies, providing a useful framework for ensuring well-powered, convenient, cross-study analyses. © 2018 WILEY PERIODICALS, INC.
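For context, the standard score-statistic meta-analysis that this paper improves upon pools per-study score statistics U_k and their variances (information) V_k into a single Z statistic. The sketch below shows that conventional baseline, not the authors' improved statistics.

```python
import math

def meta_score_z(scores, variances):
    """Standard meta-analysis score statistic:
    Z = sum(U_k) / sqrt(sum(V_k)).

    Under ideal balanced settings this matches the joint analysis of
    pooled individual-level data; the paper's improved statistics
    correct its power loss under unbalanced case-control ratios."""
    return sum(scores) / math.sqrt(sum(variances))
```

For example, two studies contributing scores 2.0 and 3.0 with variances 1.0 and 3.0 pool to Z = 5.0 / 2.0 = 2.5.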
An alternative empirical likelihood method in missing response problems and causal inference.
Ren, Kaili; Drummond, Christopher A; Brewster, Pamela S; Haller, Steven T; Tian, Jiang; Cooper, Christopher J; Zhang, Biao
2016-11-30
Missing responses are a common problem in medical, social, and economic studies. When responses are missing at random, a complete-case data analysis may result in bias. A popular debiasing method is inverse probability weighting, proposed by Horvitz and Thompson. To improve efficiency, Robins et al. proposed an augmented inverse probability weighting method. The augmented inverse probability weighting estimator has a double-robustness property and achieves the semiparametric efficiency lower bound when the regression model and the propensity score model are both correctly specified. In this paper, we introduce an empirical likelihood-based estimator as an alternative to Qin and Zhang (2007). Our proposed estimator is also doubly robust and locally efficient. Simulation results show that the proposed estimator has better performance when the propensity score is correctly modeled. Moreover, the proposed method can be applied to the estimation of the average treatment effect in observational causal inference. Finally, we apply our method to an observational study of smoking, using data from the Cardiovascular Outcomes in Renal Atherosclerotic Lesions clinical trial. Copyright © 2016 John Wiley & Sons, Ltd.
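The augmented inverse probability weighting (AIPW) estimator discussed above can be sketched on simulated data with a response missing at random. This illustrates the classic Robins-style estimator (propensity model plus outcome model), not the authors' empirical-likelihood alternative.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20000
x = rng.uniform(-1.0, 1.0, n)
y = 2.0 + 3.0 * x + rng.normal(0.0, 1.0, n)               # true E[Y] = 2
r = rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-2.0 * x))  # MAR response indicator

# propensity model e(x): logistic regression of r on x via Newton steps
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    H = X.T @ ((p * (1.0 - p))[:, None] * X)              # Fisher information
    beta += np.linalg.solve(H, X.T @ (r - p))
e_hat = 1.0 / (1.0 + np.exp(-X @ beta))

# outcome model m(x): least squares fit on complete cases only
coef = np.linalg.lstsq(X[r], y[r], rcond=None)[0]
m_hat = X @ coef

# AIPW (doubly robust) estimate of E[Y]; the complete-case mean is biased
aipw = np.mean(r * y / e_hat - (r - e_hat) / e_hat * m_hat)
complete_case = y[r].mean()
```

Because responses are observed more often when x (and hence y) is large, the complete-case mean overshoots the true mean of 2, while the AIPW estimate remains close to it as long as either working model is correct.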
Verification of different forecasts of Hungarian Meteorological Service
NASA Astrophysics Data System (ADS)
Feher, B.
2009-09-01
In this paper I show the results of the forecasts made by the Hungarian Meteorological Service. I focus on the general short- and medium-range forecasts, which contain cloudiness, precipitation, wind speed and temperature for six regions of Hungary. I also show the results of some special forecasts, such as precipitation predictions made for the catchment areas of the Danube and Tisza rivers, and daily mean temperature predictions used by Hungarian energy companies. The product received by the user is made by the general forecaster, but these predictions are based on the ALADIN and ECMWF outputs; because of this, the products of the forecasters and the models were both verified. A method like this can show us which weather elements are more difficult to forecast or which regions have higher errors. During the verification procedure the basic errors (mean error, mean absolute error) are calculated. Precipitation amount is classified into five categories, and scores such as POD, TS and PC are computed from the contingency table determined by these categories. The procedure runs fully automatically; all forecasters have to do is print the daily result each morning. Besides the daily results, verification is also made for longer periods such as a week, a month or a year. Analysing the results over longer periods, we can say that the best predictions are made for the first few days and that precipitation forecasts are less accurate for mountainous areas; indeed, the scores of the forecasters are sometimes higher than those of the models. Since forecasters receive the results the next day, the feedback helps them reduce mistakes and learn the weaknesses of the models. This paper contains the verification scores, their trends, the method by which these scores are calculated, and some case studies of poorer forecasts.
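The categorical scores mentioned (POD, TS, PC) all derive from a contingency table of forecast versus observed events; for the dichotomous 2x2 case they can be sketched as:

```python
def verification_scores(hits, misses, false_alarms, correct_negatives):
    """Categorical verification scores from a 2x2 contingency table
    of forecast (yes/no) versus observed (yes/no) events."""
    total = hits + misses + false_alarms + correct_negatives
    return {
        "POD": hits / (hits + misses),                  # probability of detection
        "TS": hits / (hits + misses + false_alarms),    # threat score
        "PC": (hits + correct_negatives) / total,       # proportion correct
    }
```

With five precipitation categories, as in the paper, the same idea generalises to a 5x5 table, with PC computed from the diagonal and POD/TS computed per category.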
Fenwick, Eva K.; Xie, Jing; Rees, Gwyn; Finger, Robert P.; Lamoureux, Ecosse L.
2013-01-01
Objective In patients with Type 2 diabetes, to determine the factors associated with diabetes knowledge, derived from Rasch analysis, and compare results with a traditional raw scoring method. Research Design & Methods Participants in this cross-sectional study underwent a comprehensive clinical and biochemical assessment. Diabetes knowledge (main outcome) was assessed using the Diabetes Knowledge Test (DKT) which was psychometrically validated using Rasch analysis. The relationship between diabetes knowledge and risk factors identified during univariate analyses was examined using multivariable linear regression. The results using raw and Rasch-transformed methods were descriptively compared. Results 181 patients (mean age±standard deviation = 66.97±9.17 years; 113 (62%) male) were included. Using Rasch-derived DKT scores, those with greater education (β = 1.14; CI: 0.25,2.04, p = 0.013); had seen an ophthalmologist (β = 1.65; CI: 0.63,2.66, p = 0.002), and spoke English at home (β = 1.37; CI: 0.43,2.31, p = 0.005) had significantly better diabetes knowledge than those with less education, had not seen an ophthalmologist and spoke a language other than English, respectively. Patients who were members of the National Diabetes Service Scheme (NDSS) and had seen a diabetes educator also had better diabetes knowledge than their counterparts. Higher HbA1c level was independently associated with worse diabetes knowledge. Using raw measures, access to an ophthalmologist and NDSS membership were not independently associated with diabetes knowledge. Conclusions Sociodemographic, clinical and service use factors were independently associated with diabetes knowledge based on both raw scores and Rasch-derived scores, which supports the implementation of targeted interventions to improve patients' knowledge. Choice of psychometric analytical method can affect study outcomes and should be considered during intervention development. PMID:24312484
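Rasch-derived scores such as the transformed DKT measures place persons on a logit scale rather than summing raw item scores. A minimal sketch of the dichotomous Rasch person measure, assuming item difficulties are already known (real analyses estimate persons and items jointly):

```python
import math

def rasch_theta(responses, difficulties, iters=50):
    """Maximum-likelihood person measure (logits) for the dichotomous
    Rasch model via Newton-Raphson. Requires a mixed response pattern
    (not all correct or all incorrect)."""
    theta = 0.0
    for _ in range(iters):
        p = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
        score = sum(x - pi for x, pi in zip(responses, p))  # gradient of log-likelihood
        info = sum(pi * (1.0 - pi) for pi in p)             # Fisher information
        theta += score / info
    return theta
```

Unlike raw sum scores, these measures are on an interval scale, which is one reason the raw and Rasch-based regressions in the study can disagree on which factors reach significance.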
NASA Astrophysics Data System (ADS)
Park, Sang Cheol; Zheng, Bin; Wang, Xiao-Hui; Gur, David
2008-03-01
Digital breast tomosynthesis (DBT) has emerged as a promising imaging modality for screening mammography. However, visually detecting micro-calcification clusters depicted on DBT images is a difficult task. Computer-aided detection (CAD) schemes for detecting micro-calcification clusters depicted on mammograms can achieve high performance and the use of CAD results can assist radiologists in detecting subtle micro-calcification clusters. In this study, we compared the performance of an available 2D based CAD scheme with one that includes a new grouping and scoring method when applied to both projection and reconstructed DBT images. We selected a dataset involving 96 DBT examinations acquired on 45 women. Each DBT image set included 11 low dose projection images and a varying number of reconstructed image slices ranging from 18 to 87. In this dataset 20 true-positive micro-calcification clusters were visually detected on the projection images and 40 were visually detected on the reconstructed images, respectively. We first applied the CAD scheme that was previously developed in our laboratory to the DBT dataset. We then tested a new grouping method that defines an independent cluster by grouping the same cluster detected on different projection or reconstructed images. We then compared four scoring methods to assess the CAD performance. The maximum sensitivity level observed for the different grouping and scoring methods were 70% and 88% for the projection and reconstructed images with a maximum false-positive rate of 4.0 and 15.9 per examination, respectively. 
This preliminary study demonstrates that (1) among the maximum, the minimum, and the average CAD-generated scores, using the maximum score of the grouped cluster regions achieved the highest performance level; (2) the histogram-based scoring method is reasonably effective in reducing false-positive detections on the projection images, but the overall CAD sensitivity is lower due to the lower signal-to-noise ratio; and (3) CAD achieved both higher sensitivity and a higher false-positive rate (per examination) on the reconstructed images. We concluded that, without changing the detection threshold or performing pre-filtering to possibly increase detection sensitivity, current CAD schemes developed and optimized for 2D mammograms perform relatively poorly on DBT examinations; they need to be re-optimized using DBT datasets, and new grouping and scoring methods need to be incorporated into the schemes.
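The grouping-and-scoring idea in this abstract can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' implementation: the distance-based grouping rule and the `radius` parameter are assumptions.

```python
# Hypothetical sketch: group per-slice CAD detections into clusters and score
# each group. Detections are (slice_index, x, y, cad_score) tuples; here two
# detections are assumed to belong to the same cluster if their in-plane
# centers lie within `radius` pixels (a simplification, not the paper's rule).

def group_detections(detections, radius=10.0):
    groups = []
    for det in detections:
        _, x, y, _ = det
        for g in groups:
            if any((x - gx) ** 2 + (y - gy) ** 2 <= radius ** 2
                   for _, gx, gy, _ in g):
                g.append(det)
                break
        else:
            groups.append([det])
    return groups

def score_group(group, method="max"):
    scores = [s for _, _, _, s in group]
    if method == "max":      # the best-performing option in the study
        return max(scores)
    if method == "min":
        return min(scores)
    return sum(scores) / len(scores)  # average

dets = [(3, 100, 100, 0.62), (4, 102, 99, 0.81), (20, 300, 40, 0.55)]
groups = group_detections(dets)
# Slices 3-4 form one cluster; the slice-20 detection stands alone.
```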
Zhang, Xian; Yaseen, Zimri S.; Galynker, Igor I.; Hirsch, Joy; Winston, Arnold
2011-01-01
Objective Objective measurement of depression remains elusive. Depression has been associated with insecure attachment, and both have been associated with changes in brain reactivity in response to viewing standard emotional and neutral faces. In this study, we developed a method to calculate predicted scores for the Beck Depression Inventory II (BDI-II) using personalized stimuli: fMRI imaging of subjects viewing pictures of their own mothers. Methods 28 female subjects aged 18–30 (14 healthy controls and 14 unipolar depressed, diagnosed by MINI psychiatric interview) were scored on the BDI-II and the Adult Attachment Interview (AAI) coherence of mind scale of global attachment security. Subjects viewed pictures of Mother (M), Friend (F) and Stranger (S) during functional magnetic resonance imaging. Using a principal component regression (PCR) method, a predicted BDI-II score was obtained from activity patterns in the paracingulate gyrus (Brodmann area 32) and compared to clinical diagnosis and the measured BDI-II score. The same procedure was performed for AAI coherence of mind scores. Results Activity patterns in BA-32 identified depressed subjects. The categorical agreement between the derived BDI-II score (using the standard clinical cut-score of 14 on the BDI-II) and depression diagnosis by MINI psychiatric interview was 89%, with sensitivity 85.7% and specificity 92.8%. Predicted and measured BDI-II scores had a correlation of 0.55. Prediction of attachment security was not statistically significant. Conclusions Brain activity in response to viewing one's mother may be diagnostic of depression. Functional magnetic resonance imaging using personalized paradigms has the potential to provide objective assessments, even when behavioral measures are not informative.
Further, fMRI based diagnostic algorithms may enhance our understanding of the neural mechanisms of depression by identifying distinctive neural features of the illness. PMID:22180777
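A minimal version of the principal-component-regression step can be sketched with NumPy. This is a generic PCR illustration on synthetic data; the study's voxel selection, cross-validation, and feature construction are not reproduced.

```python
import numpy as np

# Generic principal component regression (PCR): project centered features
# onto their leading principal components, then fit ordinary least squares
# on the component scores. All data below are synthetic.

def pcr_fit_predict(X, y, n_components=2):
    Xc = X - X.mean(axis=0)                        # center the features
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    pcs = Xc @ Vt[:n_components].T                 # component scores
    A = np.column_stack([np.ones(len(y)), pcs])    # intercept + components
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # OLS on the components
    return A @ beta                                # fitted (predicted) scores

rng = np.random.default_rng(0)
y = rng.normal(size=28)                            # 28 subjects
X = rng.normal(scale=0.3, size=(28, 50))           # 50 "voxel" features
X[:, :10] += y[:, None]                            # embed signal in 10 voxels
pred = pcr_fit_predict(X, y, n_components=2)
r = np.corrcoef(pred, y)[0, 1]                     # in-sample correlation
```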
The Aristotle score: a complexity-adjusted method to evaluate surgical results.
Lacour-Gayet, F; Clarke, D; Jacobs, J; Comas, J; Daebritz, S; Daenen, W; Gaynor, W; Hamilton, L; Jacobs, M; Maruszsewski, B; Pozzi, M; Spray, T; Stellin, G; Tchervenkov, C; Mavroudis, C
2004-06-01
Quality control is difficult to achieve in Congenital Heart Surgery (CHS) because of the diversity of the procedures. It is particularly needed, considering the potential adverse outcomes associated with complex cases. The aim of this project was to develop a new method based on the complexity of the procedures. The Aristotle project, involving a panel of expert surgeons, started in 1999 and included 50 pediatric surgeons from 23 countries, representing the EACTS, STS, ECHSA and CHSS. The complexity was based on the procedures as defined by the STS/EACTS International Nomenclature and was undertaken in two steps: the first step was establishing the Basic Score, which adjusts only for the complexity of the procedures. It is based on three factors: the potential for mortality, the potential for morbidity and the anticipated technical difficulty. A questionnaire was completed by the 50 centers. The second step was the development of the Comprehensive Aristotle Score, which further adjusts the complexity according to the specific patient characteristics. It includes two categories of complexity factors, the procedure-dependent and procedure-independent factors. After considering the relationship between complexity and performance, the Aristotle Committee is proposing that: Performance = Complexity x Outcome. The Aristotle score allows precise scoring of the complexity of 145 CHS procedures. One interesting notion coming out of this study is that complexity is a constant value for a given patient regardless of the center where he or she is operated on. The Aristotle complexity score was further applied to 26 centers reporting to the EACTS congenital database. A new display of centers is presented based on the comparison of hospital survival to complexity and to our proposed definition of performance. A complexity-adjusted method named the Aristotle Score, based on the complexity of the surgical procedures, has been developed by an international group of experts.
The Aristotle score, electronically available, was introduced in the EACTS and STS databases. A validation process evaluating its predictive value is being developed.
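The proposed relation Performance = Complexity x Outcome can be illustrated numerically. The figures below are invented for illustration only; the paper's actual complexity scale and outcome measure are not reproduced here.

```python
# Numeric illustration of the proposed relation Performance = Complexity x Outcome,
# with invented numbers: a center taking on a more complex case mix can show
# better performance even with a slightly lower hospital survival rate.

def performance(mean_complexity, hospital_survival):
    return mean_complexity * hospital_survival

center_a = performance(6.0, 0.98)  # simpler case mix, higher survival: 5.88
center_b = performance(8.0, 0.96)  # more complex case mix: 7.68
```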
Marchick, Michael R; Setteducato, Michael L; Revenis, Jesse J; Robinson, Matthew A; Weeks, Emily C; Payton, Thomas F; Winchester, David E; Allen, Brandon R
2017-09-01
The History, Electrocardiography, Age, Risk factors, Troponin (HEART) score enables rapid risk stratification of emergency department patients presenting with chest pain. However, the subjectivity in scoring introduced by the history component has been criticized by some clinicians. We examined the association of 3 objective scoring models with the results of noninvasive cardiac testing. Medical records for all patients evaluated in the chest pain center of an academic medical center during a 1-year period were reviewed retrospectively. Each patient's history component score was calculated using 3 models developed by the authors. Differences in the distribution of HEART scores for each model, their degree of agreement with one another, and their relationship to the results of cardiac testing were analyzed. Seven hundred forty-nine patients were studied, 58 of whom had an abnormal stress test or computed tomography coronary angiography. The mean HEART scores for models 1, 2, and 3 were 2.97 (SD 1.17), 2.57 (SD 1.25), and 3.30 (SD 1.35), respectively, and were significantly different (P < 0.001). However, for each model, the likelihood of an abnormal cardiovascular test did not correlate with higher scores on the symptom component of the HEART score (P = 0.09, 0.41, and 0.86, respectively). While the objective scoring models produced different distributions of HEART scores, no model performed well with regard to identifying patients with abnormal advanced cardiac studies in this relatively low-risk cohort. Further studies in a broader cohort of patients, as well as comparison with the performance of subjective history scoring, are warranted before adoption of any of these objective models.
Busse, Jason W.; Bhandari, Mohit; Guyatt, Gordon H.; Heels-Ansdell, Diane; Kulkarni, Abhaya V.; Mandel, Scott; Sanders, David; Schemitsch, Emil; Swiontkowski, Marc; Tornetta, Paul; Wai, Eugene; Walter, Stephen D.
2011-01-01
Objective To explore the role of patients' beliefs in their likelihood of recovery from severe physical trauma. Methods We developed and validated an instrument designed to capture the impact of patients' beliefs on functional recovery from injury: the Somatic Pre-occupation and Coping (SPOC) questionnaire. At 6 weeks after surgical fixation, we administered the SPOC questionnaire to 359 consecutive patients with operatively managed tibial shaft fractures. We constructed multivariable regression models to explore the association between SPOC scores and functional outcome at 1 year, as measured by return to work and Short Form-36 (SF-36) physical component summary (PCS) and mental component summary (MCS) scores. Results In our adjusted multivariable regression models that included pre-injury SF-36 scores, SPOC scores at 6 weeks post-surgery accounted for 18% of the variation in SF-36 PCS scores and 18% of SF-36 MCS scores at 1 year. In both models, 6-week SPOC scores were a far more powerful predictor of functional recovery than age, gender, fracture type, smoking status, or the presence of multi-trauma. Our adjusted analysis found that for each 14-point increment in SPOC score at 6 weeks (14 was chosen as half a standard deviation of the SPOC scores), the odds of returning to work at 1 year decreased by 40% (odds ratio = 0.60; 95% CI = 0.50 to 0.73). Conclusion The SPOC questionnaire is a valid measure of illness beliefs in tibial fracture patients and is highly predictive of their long-term functional recovery. Future research should explore whether these results extend to other trauma populations and whether modification of unhelpful illness beliefs is feasible and would result in improved functional outcomes. PMID:22011635
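The reported odds ratio is tied to a 14-point SPOC increment; assuming the usual log-linear odds model, it can be rescaled to other increment sizes:

```python
# Rescaling the reported odds ratio (OR = 0.60 per 14-point SPOC increment)
# under the standard log-linear odds assumption: the per-point OR is the
# 14th root of 0.60, and a 28-point increment squares the reported OR.

or_per_14 = 0.60
or_per_point = or_per_14 ** (1 / 14)   # ~0.964 per SPOC point
or_per_28 = or_per_14 ** 2             # 0.36 per 28 points
```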
Crown, William H
2014-02-01
This paper examines the use of propensity score matching in economic analyses of observational data. Several excellent papers have previously reviewed practical aspects of propensity score estimation and other parts of the propensity score literature. The purpose of this paper is to compare the conceptual foundation of propensity score models with alternative estimators of treatment effects. References are provided to empirical comparisons among methods that have appeared in the literature. These comparisons are available for a subset of the methods considered in this paper. However, in some cases, no pairwise comparisons of particular methods are yet available, and there are no examples of comparisons across all of the methods surveyed here. Irrespective of the availability of empirical comparisons, the goal of this paper is to provide some intuition about the relative merits of alternative estimators in health economic evaluations where nonlinearity, sample size, availability of pre/post data, heterogeneity, and missing variables can have important implications for choice of methodology. Also considered is the potential combination of propensity score matching with alternative methods such as difference-in-differences and decomposition methods that have not yet appeared in the empirical literature.
Dretsch, Michael; Bleiberg, Joseph; Williams, Kathy; Caban, Jesus; Kelly, James; Grammer, Geoffrey; DeGraba, Thomas
2016-01-01
To examine the use of the Neurobehavioral Symptom Inventory to measure clinical changes over time in a population of US service members undergoing treatment of mild traumatic brain injury and comorbid psychological health conditions. A 4-week, 8-hour per day, intensive, outpatient, interdisciplinary, comprehensive treatment program at the National Intrepid Center of Excellence in Bethesda, Maryland. Three hundred fourteen active-duty service members being treated for combat-related comorbid mild traumatic brain injury and psychological health conditions. Repeated-measures, retrospective analysis of a single group using a pretest-posttest treatment design. Three Neurobehavioral Symptom Inventory scoring methods: (1) a total summated score, (2) the 3-factor method, and (3) the 4-factor method (with and without orphan items). All 3 scoring methods yielded statistically significant within-subject changes between admission and discharge. The evaluation of effect sizes indicated that the 3 different Neurobehavioral Symptom Inventory scoring methods were comparable. Findings indicate that the different scoring methods all have potential for assessing clinical changes in symptoms for groups of patients undergoing treatment, with no clear advantage for any one method.
Methods for interpreting change over time in patient-reported outcome measures.
Wyrwich, K W; Norquist, J M; Lenderking, W R; Acaster, S
2013-04-01
Interpretation guidelines are needed for patient-reported outcome (PRO) measures' change scores to evaluate efficacy of an intervention and to communicate PRO results to regulators, patients, physicians, and providers. The 2009 Food and Drug Administration (FDA) Guidance for Industry Patient-Reported Outcomes (PRO) Measures: Use in Medical Product Development to Support Labeling Claims (hereafter referred to as the final FDA PRO Guidance) provides some recommendations for the interpretation of change in PRO scores as evidence of treatment efficacy. This article reviews the evolution of the methods and the terminology used to describe and aid in the communication of meaningful PRO change score thresholds. Anchor- and distribution-based methods have played important roles, and the FDA has recently stressed the importance of cross-sectional patient global assessments of concept as anchor-based methods for estimation of the responder definition, which describes an individual-level treatment benefit. The final FDA PRO Guidance proposes the cumulative distribution function (CDF) of responses as a useful method to depict the effect of treatments across the study population. While CDFs serve an important role, they should not be a replacement for the careful investigation of a PRO's relevant responder definition using anchor-based methods and providing stakeholders with a relevant threshold for the interpretation of change over time.
A consensus algorithm for approximate string matching and its application to QRS complex detection
NASA Astrophysics Data System (ADS)
Alba, Alfonso; Mendez, Martin O.; Rubio-Rincon, Miguel E.; Arce-Santana, Edgar R.
2016-08-01
In this paper, a novel algorithm for approximate string matching (ASM) is proposed. The novelty resides in the fact that, unlike most other methods, the proposed algorithm is not based on the Hamming or Levenshtein distances, but instead computes a score for each symbol in the search text based on a consensus measure. Those symbols with sufficiently high scores will likely correspond to approximate instances of the pattern string. To demonstrate the usefulness of the proposed method, it has been applied to the detection of QRS complexes in electrocardiographic signals with competitive results when compared against the classic Pan-Tompkins (PT) algorithm. The proposed method outperformed PT in 72% of the test cases, with no extra computational cost.
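A toy version of score-based approximate matching can illustrate the general idea: each alignment of the pattern receives a score, and positions with sufficiently high scores are reported as approximate matches. The per-symbol agreement score below is only a stand-in; the paper's consensus measure is more sophisticated.

```python
# Illustrative score-based approximate string matching (ASM): slide the
# pattern over the text and score each alignment by the fraction of
# agreeing symbols. Not the authors' consensus measure.

def match_scores(text, pattern):
    m = len(pattern)
    return [sum(a == b for a, b in zip(text[i:i + m], pattern)) / m
            for i in range(len(text) - m + 1)]

def find_approximate(text, pattern, threshold=0.75):
    return [i for i, s in enumerate(match_scores(text, pattern))
            if s >= threshold]

hits = find_approximate("abqrxcdqrscd", "qrs")
# The exact "qrs" at index 7 clears the 0.75 threshold; the near match
# "qrx" at index 2 scores only 2/3 and is rejected.
```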
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lah, J; Manger, R; Kim, G
Purpose: To examine the ability of traditional Failure mode and effects analysis (FMEA) and a light version of Healthcare FMEA (HFMEA), called Scenario analysis of FMEA (SAFER), by comparing their outputs in terms of the risks identified and their severity rankings. Methods: We applied two prospective quality management methods to surface image guided, linac-based radiosurgery (SIG-RS). For the traditional FMEA, decisions on how to improve an operation are based on the risk priority number (RPN). The RPN is the product of three indices: occurrence, severity and detectability. The SAFER approach utilized two indices, frequency and severity, which were defined by a multidisciplinary team. A criticality matrix was divided into 4 categories: very low, low, high and very high. For high-risk events, an additional evaluation was performed. Based upon the criticality of the process, it was decided whether additional safety measures were needed and what they should comprise. Results: The two methods were independently compared to determine whether the results and rated risks matched. Our results showed an agreement of 67% between the FMEA and SAFER approaches for the 15 riskiest SIG-specific failure modes. The main differences between the two approaches were the distribution of the values and that failure modes (Nos. 52, 54, 154) with high SAFER scores do not necessarily have high FMEA RPN scores. In our results, there were additional risks identified by both methods with little correspondence. In SAFER, once the risk score is determined, the underlying decision tree or failure mode should be investigated further. Conclusion: The FMEA method takes into account the probability that an error passes without being detected.
SAFER is inductive because it requires the identification of consequences from causes, and semi-quantitative since it allows the prioritization of risks and mitigation measures, and thus is applicable to the clinical parts of radiotherapy.
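The two scoring schemes compared above can be sketched as follows. The RPN formula (occurrence x severity x detectability) is standard FMEA; the SAFER-style criticality cut points below are invented for illustration, not taken from the study.

```python
# Traditional FMEA risk priority number (RPN): the product of occurrence,
# severity, and detectability ratings (each typically rated 1-10).
def rpn(occurrence, severity, detectability):
    return occurrence * severity * detectability

# SAFER-style criticality: frequency x severity bucketed into four
# categories. The cut points here are made up for illustration.
def safer_category(frequency, severity):
    crit = frequency * severity
    if crit >= 12:
        return "very high"
    if crit >= 8:
        return "high"
    if crit >= 4:
        return "low"
    return "very low"

example_rpn = rpn(3, 7, 5)          # 3 * 7 * 5 = 105
example_cat = safer_category(3, 4)  # criticality 12 -> "very high"
```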
Chan, Hiok Yang; Chen, Jerry Yongqiang; Zainul-Abidin, Suraya; Ying, Hao; Koo, Kevin; Rikhraj, Inderjeet Singh
2017-05-01
The American Orthopaedic Foot & Ankle Society (AOFAS) score is one of the most common and widely adapted outcome scales in hallux valgus surgery. However, the AOFAS score is predominantly physician based rather than patient based. Although it may be straightforward to derive statistical significance, statistical significance may not equate to the true subjective benefit of the patient's experience. There is a paucity of literature defining the minimal clinically important difference (MCID) for the AOFAS score in hallux valgus surgery, although it could have a great impact on the accuracy of analyzing surgical outcomes. Hence, the primary aim of this study was to define the MCID for the AOFAS score in these patients, and the secondary aim was to correlate patients' demographics with the MCID. We conducted a retrospective cross-sectional study. A total of 446 patients were reviewed preoperatively and followed up for 2 years. An anchor question was asked 2 years postoperatively: "How would you rate the overall results of your treatment for your foot and ankle condition?" (excellent, very good, good, fair, poor, terrible). The MCID was derived using 4 methods, 3 from an anchor-based approach and 1 from a distribution-based approach. The anchor-based approaches were (1) the mean difference in 2-year AOFAS scores of patients who answered "good" versus "fair" on the anchor question; (2) the mean change in AOFAS score from preoperatively to the 2-year follow-up in patients who answered "good"; (3) the receiver operating characteristic (ROC) curve method, where the area under the curve (AUC) represented the likelihood that the scoring system would accurately discriminate these 2 groups of patients. The distribution-based approach used to calculate the MCID was the effect size method. There were 405 (90.8%) females and 41 (9.2%) males. Mean age was 51.2 (standard deviation [SD] = 13) years, and mean preoperative BMI was 24.2 (SD = 4.1).
Mean preoperative AOFAS score was 55.6 (SD = 16.8), with significant improvement to 85.7 (SD = 14.4) at 2 years (P < .001). There were no statistical differences between the demographics or preoperative AOFAS scores of patients with good versus fair satisfaction levels. At 2 years, patients with good satisfaction had higher AOFAS scores than those with fair satisfaction (83.9 vs 78.1, P < .001) and a higher mean change (30.2 vs 22.3, P = .015). The mean change in AOFAS score in patients with good satisfaction was 30.2 (SD = 19.8). The mean difference in change between good and fair satisfaction was 7.9. Using ROC analysis, the cut-off point was 29.0, with an AUC of 0.62. The effect size method derived an MCID of 8.4 with a moderate effect size of 0.5. Multiple linear regression demonstrated that increasing age (β = -0.129, CI = -0.245, -0.013, P = .030) and a higher preoperative AOFAS score (β = -0.874, CI = -0.644, -0.081, P < .001) significantly decreased the amount of change in the AOFAS score. The MCID of the AOFAS score in hallux valgus surgery ranged from 7.9 to 30.2, depending on the method used. The MCID can ensure clinical improvement from a patient's perspective and also aid in interpreting results from clinical trials and other studies. Level III, retrospective comparative series.
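Two of the reported MCID estimates can be reproduced directly from the summary statistics in the abstract: the distribution-based figure is a moderate effect size (0.5) times the preoperative SD, and the between-group anchor figure is the difference in mean 2-year change between satisfaction groups.

```python
# Reproducing the reported MCID estimates from the abstract's summary stats.

# Distribution-based (effect size method): 0.5 x preoperative SD.
preop_sd = 16.8
mcid_distribution = 0.5 * preop_sd   # = 8.4, matching the reported value

# Anchor-based between-group estimate: difference in mean 2-year change
# between patients reporting "good" (30.2) and "fair" (22.3) satisfaction.
mcid_anchor = 30.2 - 22.3            # = 7.9, matching the reported value
```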
The Telephone Interview for Cognitive Status: Creating a crosswalk with the Mini-Mental State Exam
Fong, Tamara G.; Fearing, Michael A.; Jones, Richard N.; Shi, Peilin; Marcantonio, Edward R.; Rudolph, James L.; Yang, Frances M.; Kiely, Dan K.; Inouye, Sharon K.
2009-01-01
Background Brief cognitive screening measures are valuable tools for both research and clinical applications. The most widely used instrument, the Mini-Mental State Examination (MMSE), is limited in that it must be administered face-to-face, cannot be used in participants with visual or motor impairments, and is protected by copyright. Alternative screening instruments, such as the Telephone Interview for Cognitive Status (TICS), have been developed and may provide a valid alternative with comparable cut point scores to rate global cognitive function. Methods MMSE, TICS-30, and TICS-40 scores from 746 community-dwelling elders who participated in the Aging, Demographics, and Memory Study (ADAMS) were analyzed with equipercentile equating, a statistical process of determining comparable scores based on percentile equivalents on different forms of an examination. Results Scores from the MMSE and the TICS-30 and TICS-40 corresponded well, and clinically relevant cut point scores were determined; for example, an MMSE score of 23 is equivalent to 17 and 20 on the TICS-30 and TICS-40, respectively. Conclusions These findings provide scores that can be used to link TICS and MMSE scores directly. Clinically relevant and important MMSE cut points and the respective ADAMS TICS-30 and TICS-40 cut point scores have been included to identify the degree of cognitive impairment among respondents with any type of cognitive disorder. These results will help with the widespread application of the TICS in both research and clinical practice. PMID:19647495
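Equipercentile equating itself is easy to sketch: a score on one form maps to the score with the same percentile rank on the other form. The score lists below are invented for illustration; the ADAMS analysis used the full samples and smoothed percentile distributions.

```python
import numpy as np

# Minimal equipercentile-equating sketch: a score on form A is mapped to the
# form-B score at the same percentile rank. Scores here are made up.

def equate(score, form_a, form_b):
    pct = np.mean(np.asarray(form_a) <= score)   # percentile rank on form A
    return float(np.quantile(form_b, pct))       # same percentile on form B

mmse = [18, 21, 23, 25, 27, 28, 29, 30]          # invented form-A scores
tics = [12, 15, 17, 20, 23, 25, 27, 29]          # invented form-B scores
equated = equate(23, mmse, tics)                 # interpolated TICS equivalent
```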
Initial Correction versus Negative Marking in Multiple Choice Examinations
ERIC Educational Resources Information Center
Van Hecke, Tanja
2015-01-01
Optimal assessment tools should measure, in a limited time, the knowledge of students in a correct and unbiased way. One method for automating the scoring is multiple choice scoring. This article compares scoring methods from a probabilistic point of view by modelling the probability to pass: number-right scoring, the initial correction (IC) and…
ERIC Educational Resources Information Center
Han, Turgay; Huang, Jinyan
2017-01-01
Using generalizability (G-) theory and rater interviews as both quantitative and qualitative approaches, this study examined the impact of scoring methods (i.e., holistic versus analytic scoring) on the scoring variability and reliability of an EFL institutional writing assessment at a Turkish university. Ten raters were invited to rate 36…
ERIC Educational Resources Information Center
Anderson, Paul S.
Initial experiences with computer-assisted reconsiderative scoring are described. Reconsiderative scoring occurs when student responses are received and reviewed by the teacher before points for correctness are assigned. Manually scored completion-style questions are reconsiderative. A new method of machine assistance produces an item analysis on…