Sample records for joinpoint regression analysis

  1. Trend Analysis of Cancer Mortality and Incidence in Panama, Using Joinpoint Regression Analysis.

    PubMed

    Politis, Michael; Higuera, Gladys; Chang, Lissette Raquel; Gomez, Beatriz; Bares, Juan; Motta, Jorge

    2015-06-01

    Cancer is one of the leading causes of death worldwide and its incidence is expected to increase in the future. In Panama, cancer is also one of the leading causes of death. In 1964, a nationwide cancer registry was started and it was restructured and improved in 2012. The aim of this study is to utilize Joinpoint regression analysis to study the trends of the incidence and mortality of cancer in Panama in the last decade. Cancer mortality was estimated from the Panamanian National Institute of Census and Statistics Registry for the period 2001 to 2011. Cancer incidence was estimated from the Panamanian National Cancer Registry for the period 2000 to 2009. The Joinpoint Regression Analysis program, version 4.0.4, was used to calculate trends by age-adjusted incidence and mortality rates for selected cancers. Overall, the trend of age-adjusted cancer mortality in Panama has declined over the last 10 years (-1.12% per year). The cancers for which there was a significant increase in the trend of mortality were female breast cancer and ovarian cancer; while the highest increases in incidence were shown for breast cancer, liver cancer, and prostate cancer. Significant decrease in the trend of mortality was evidenced for the following: prostate cancer, lung and bronchus cancer, and cervical cancer; with respect to incidence, only oral and pharynx cancer in both sexes had a significant decrease. Some cancers showed no significant trends in incidence or mortality. This study reveals contrasting trends in cancer incidence and mortality in Panama in the last decade. Although Panama is considered an upper middle income nation, this study demonstrates that some cancer mortality trends, like the ones seen in cervical and lung cancer, behave similarly to the ones seen in high income countries. In contrast, other types, like breast cancer, follow a pattern seen in countries undergoing a transition to a developed economy with its associated lifestyle, nutrition, and body weight

  2. Analysis of geographical disparities in temporal trends of health outcomes using space-time joinpoint regression

    NASA Astrophysics Data System (ADS)

    Goovaerts, Pierre

    2013-06-01

    Analyzing temporal trends in health outcomes can provide a more comprehensive picture of the burden of a disease like cancer and generate new insights about the impact of various interventions. In the United States such an analysis is increasingly conducted using joinpoint regression outside a spatial framework, which overlooks the existence of significant variation among U.S. counties and states with regard to the incidence of cancer. This paper presents several innovative ways to account for space in joinpoint regression: (1) prior filtering of noise in the data by binomial kriging and use of the kriging variance as measure of reliability in weighted least-square regression, (2) detection of significant boundaries between adjacent counties based on tests of parallelism of time trends and confidence intervals of annual percent change of rates, and (3) creation of spatially compact groups of counties with similar temporal trends through the application of hierarchical cluster analysis to the results of boundary analysis. The approach is illustrated using time series of proportions of prostate cancer late-stage cases diagnosed yearly in every county of Florida since 1980s. The annual percent change (APC) in late-stage diagnosis and the onset years for significant declines vary greatly across Florida. Most counties with non-significant average APC are located in the north-western part of Florida, known as the Panhandle, which is more rural than other parts of Florida. The number of significant boundaries peaked in the early 1990s when prostate-specific antigen (PSA) test became widely available, a temporal trend that suggests the existence of geographical disparities in the implementation and/or impact of the new screening procedure, in particular as it began available.

  3. Silent changes of tuberculosis in Iran (2005-2015): A joinpoint regression analysis.

    PubMed

    Marvi, Abolfazl; Asadi-Aliabadi, Mehran; Darabi, Mehdi; Rostami-Maskopaee, Fereshteh; Siamian, Hasan; Abedi, Ghasem

    2017-01-01

    Tuberculosis (TB) poses a severe risk to public health through the world but excessively distresses low-income nations. The aim of this study is to analyze silent changes of TB in Iran (2005-2015): A joinpoint regression analysis. This is a trend study conducted on all patients ( n = 70) that register in control disease center of Joibar (one of coastal cities and tourism destination in Northern Iran which was recognized as an independent town since 1998) during 2005-2015. The characteristics of patients imported to the SPSS 19 and variation in incidence rate of different forms of pulmonary TB (PTB) (PTB+ or PTB-) and extra-PTB (EPTB)/year was analyzed. Variation in incidence rate of TB for male and female groups and different age groups (0-14, 15-24, 25-34, 35-44, 45-54, 55-64, and above 65 years) was analyzed, variation in trend of this diseases for different groups was compared in intended years, and also, variation in incidence rate of TB was analyzed by Joinpoint Regression Software. The total number of TB was 70 cases during 2005-2015. The mean age of patients was 42.31 ± 21.26 years and median age was 40 years. About 71.4% of patients were PTB (55.7% for with PTB+ and 15.7% with PTB-) and rest of them (28.4%) were EPTB. In regard to classification of cases, 97.1% of them were new cases, 1.45% of them were relapsed cases, and 1.45% of them imported cases. In addition, history of hospitalization due to TB was observed in 44.3%. Despite recent developments of governmental health-care system in Iran and proper access to it and considering this fact that identification of TB cases with passive surveillance is possible. Hence, developing certain programs for sensitization of the covered population is essential.

  4. Mortality trends among Japanese dialysis patients, 1988-2013: a joinpoint regression analysis.

    PubMed

    Wakasugi, Minako; Kazama, Junichiro James; Narita, Ichiei

    2016-09-01

    Evaluation of mortality trends in dialysis patients is important for improving their prognoses. The present study aimed to examine temporal trends in deaths (all-cause, cardiovascular, noncardiovascular and the five leading causes) among Japanese dialysis patients. Mortality data were extracted from the Japanese Society of Dialysis Therapy registry. Age-standardized mortality rates were calculated by direct standardization against the 2013 dialysis population. The average annual percentage of change (APC) and the corresponding 95% confidence interval (CI) were computed for trends using joinpoint regression analysis. A total of 469 324 deaths occurred, of which 25.9% were from cardiac failure, 17.5% from infectious disease, 10.2% from cerebrovascular disorders, 8.6% from malignant tumors and 5.6% from cardiac infarction. The joinpoint trend for all-cause mortality decreased significantly, by -3.7% (95% CI -4.2 to -3.2) per year from 1988 through 2000, then decreased more gradually, by -1.4% (95% CI -1.7 to -1.2) per year during 2000-13. The improved mortality rates were mainly due to decreased deaths from cardiovascular disease, with mortality rates due to noncardiovascular disease outnumbering those of cardiovascular disease in the last decade. Among the top five causes of death, cardiac failure has shown a marked decrease in mortality rate. However, the rates due to infectious disease have remained stable during the study period [APC 0.1 (95% CI -0.2-0.3)]. Significant progress has been made, particularly with regard to the decrease in age-standardized mortality rates. The risk of cardiovascular death has decreased, while the risk of death from infection has remained unchanged for 25 years. © The Author 2016. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.

  5. Hysterectomy trends in Australia, 2000-2001 to 2013-2014: joinpoint regression analysis.

    PubMed

    Wilson, Louise F; Pandeya, Nirmala; Mishra, Gita D

    2017-10-01

    Hysterectomy is a common gynecological procedure, particularly in middle and high income countries. The aim of this paper was to describe and examine hysterectomy trends in Australia from 2000-2001 to 2013-2014. For women aged 25 years and over, data on the number of hysterectomies performed in Australia annually were sourced from the National Hospital and Morbidity Database. Age-specific and age-standardized hysterectomy rates per 10 000 women were estimated with adjustment for hysterectomy prevalence in the population. Using joinpoint regression analysis, we estimated the average annual percentage change over the whole study period (2000-2014) and the annual percentage change for each identified trend line segment. A total of 431 162 hysterectomy procedures were performed between 2000-2001 and 2013-2014; an annual average of 30 797 procedures (for women aged 25+ years). The age-standardized hysterectomy rate, adjusted for underlying hysterectomy prevalence, decreased significantly over the whole study period [average annual percentage change -2.8%; 95% confidence interval (CI) -3.5%, -2.2%]. The trend was not linear with one joinpoint detected in 2008-2009. Between 2000-2001 and 2008-2009 there was a significant decrease in incidence (annual percentage change -4.4%; 95% CI -5.2%, -3.7%); from 2008-2009 to 2013-2014 the decrease was minimal and not significantly different from zero (annual percentage change -0.1%; 95% CI -1.7%, 1.5%). A similar change in trend was seen in all age groups. Hysterectomy rates in Australian women aged 25 years and over have declined in the first decade of the 21st century. However, in the last 5 years, rates appear to have stabilized. © 2017 Nordic Federation of Societies of Obstetrics and Gynecology.

  6. Scientific Productivity on Research in Ethical Issues over the Past Half Century: A JoinPoint Regression Analysis.

    PubMed

    Long, Nguyen Phuoc; Huy, Nguyen Tien; Trang, Nguyen Thi Huyen; Luan, Nguyen Thien; Anh, Nguyen Hoang; Nghi, Tran Diem; Hieu, Mai Van; Hirayama, Kenji; Karbwang, Juntra

    2014-09-01

    Ethics is one of the main pillars in the development of science. We performed a JoinPoint regression analysis to analyze the trends of ethical issue research over the past half century. The question is whether ethical issues are neglected despite their importance in modern research. PubMed electronic library was used to retrieve publications of all fields and ethical issues. JoinPoint regression analysis was used to identify the significant time trends of publications of all fields and ethical issues, as well as the proportion of publications on ethical issues to all fields over the past half century. Annual percent changes (APC) were computed with their 95% confidence intervals, and a p-value < 0.05 was considered statistically significant. We found that publications of ethical issues increased during the period of 1965-1996 but slightly fell in recent years (from 1996 to 2013). When comparing the absolute number of ethics related articles (APEI) to all publications of all fields (APAF) on PubMed, the results showed that the proportion of APEI to APAF statistically increased during the periods of 1965-1974, 1974-1986, and 1986-1993, with APCs of 11.0, 2.1, and 8.8, respectively. However, the trend has gradually dropped since 1993 and shown a marked decrease from 2002 to 2013 with an annual percent change of -7.4%. Scientific productivity in ethical issues research on over the past half century rapidly increased during the first 30-year period but has recently been in decline. Since ethics is an important aspect of scientific research, we suggest that greater attention is needed in order to emphasize the role of ethics in modern research.

  7. Prostate cancer mortality in Serbia, 1991-2010: a joinpoint regression analysis.

    PubMed

    Ilic, Milena; Ilic, Irena

    2016-06-01

    The aim of this descriptive epidemiological study was to analyze the mortality trend of prostate cancer in Serbia (excluding the Kosovo and Metohia) from 1991 to 2010. The age-standardized prostate cancer mortality rates (per 100 000) were calculated by direct standardization, using the World Standard Population. Average annual percentage of change (AAPC) and the corresponding 95% confidence interval (CI) was computed for trend using the joinpoint regression analysis. Significantly increased trend in prostate cancer mortality was recorded in Serbia continuously from 1991 to 2010 (AAPC = +2.2, 95% CI = 1.6-2.9). Mortality rates for prostate cancer showed a significant upward trend in all men aged 50 and over: AAPC (95% CI) was +1.9% (0.1-3.8) in aged 50-59 years, +1.7% (0.9-2.6) in aged 60-69 years, +2.0% (1.2-2.9) in aged 70-79 years and +3.5% (2.4-4.6) in aged 80 years and over. According to comparability test, prostate cancer mortality trends in majority of age groups were parallel (final selected model failed to reject parallelism, P > 0.05). The increasing prostate cancer mortality trend implies the need for more effective measures of prevention, screening and early diagnosis, as well as prostate cancer treatment in Serbia. © The Author 2015. Published by Oxford University Press on behalf of Faculty of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  8. Diabetes mortality in Serbia, 1991-2015 (a nationwide study): A joinpoint regression analysis.

    PubMed

    Ilic, Milena; Ilic, Irena

    2017-02-01

    The aim of this study was to analyze the mortality trends of diabetes mellitus in Serbia (excluding the Autonomous Province of Kosovo and Metohia). A population-based cross sectional study analyzing diabetes mortality in Serbia in the period 1991-2015 was carried out based on official data. The age-standardized mortality rates (per 100,000) were calculated by direct standardization, using the European Standard Population. Average annual percentage of change (AAPC) and the corresponding 95% confidence interval (CI) were computed using the joinpoint regression analysis. More than 63,000 (about 27,000 of men and 36,000 of women) diabetes deaths occurred in Serbia from 1991 to 2015. Death rates from diabetes were almost equal in men and in women (about 24.0 per 100,000) and places Serbia among the countries with the highest diabetes mortality rates in Europe. Since 1991, mortality from diabetes in men significantly increased by +1.2% per year (95% CI 0.7-1.7), but non-significantly increased in women by +0.2% per year (95% CI -0.4 to 0.7). Increased trends in diabetes type 1 mortality rates were significant in both genders in Serbia. Trends in mortality for diabetes type 2 showed a significant decrease in both genders since 2010. Given that diabetes mortality trends showed different patterns during the studied period, our results imply that further observation of trend is needed. Copyright © 2016 Primary Care Diabetes Europe. Published by Elsevier Ltd. All rights reserved.

  9. Geographical, temporal and racial disparities in late-stage prostate cancer incidence across Florida: A multiscale joinpoint regression analysis

    PubMed Central

    2011-01-01

    Background Although prostate cancer-related incidence and mortality have declined recently, striking racial/ethnic differences persist in the United States. Visualizing and modelling temporal trends of prostate cancer late-stage incidence, and how they vary according to geographic locations and race, should help explaining such disparities. Joinpoint regression is increasingly used to identify the timing and extent of changes in time series of health outcomes. Yet, most analyses of temporal trends are aspatial and conducted at the national level or for a single cancer registry. Methods Time series (1981-2007) of annual proportions of prostate cancer late-stage cases were analyzed for non-Hispanic Whites and non-Hispanic Blacks in each county of Florida. Noise in the data was first filtered by binomial kriging and results were modelled using joinpoint regression. A similar analysis was also conducted at the state level and for groups of metropolitan and non-metropolitan counties. Significant racial differences were detected using tests of parallelism and coincidence of time trends. A new disparity statistic was introduced to measure spatial and temporal changes in the frequency of racial disparities. Results State-level percentage of late-stage diagnosis decreased 50% since 1981; a decline that accelerated in the 90's when Prostate Specific Antigen (PSA) screening was introduced. Analysis at the metropolitan and non-metropolitan levels revealed that the frequency of late-stage diagnosis increased recently in urban areas, and this trend was significant for white males. The annual rate of decrease in late-stage diagnosis and the onset years for significant declines varied greatly among counties and racial groups. Most counties with non-significant average annual percent change (AAPC) were located in the Florida Panhandle for white males, whereas they clustered in South-eastern Florida for black males. The new disparity statistic indicated that the spatial extent of

  10. Geographical, temporal and racial disparities in late-stage prostate cancer incidence across Florida: a multiscale joinpoint regression analysis.

    PubMed

    Goovaerts, Pierre; Xiao, Hong

    2011-12-05

    Although prostate cancer-related incidence and mortality have declined recently, striking racial/ethnic differences persist in the United States. Visualizing and modelling temporal trends of prostate cancer late-stage incidence, and how they vary according to geographic locations and race, should help explaining such disparities. Joinpoint regression is increasingly used to identify the timing and extent of changes in time series of health outcomes. Yet, most analyses of temporal trends are aspatial and conducted at the national level or for a single cancer registry. Time series (1981-2007) of annual proportions of prostate cancer late-stage cases were analyzed for non-Hispanic Whites and non-Hispanic Blacks in each county of Florida. Noise in the data was first filtered by binomial kriging and results were modelled using joinpoint regression. A similar analysis was also conducted at the state level and for groups of metropolitan and non-metropolitan counties. Significant racial differences were detected using tests of parallelism and coincidence of time trends. A new disparity statistic was introduced to measure spatial and temporal changes in the frequency of racial disparities. State-level percentage of late-stage diagnosis decreased 50% since 1981; a decline that accelerated in the 90's when Prostate Specific Antigen (PSA) screening was introduced. Analysis at the metropolitan and non-metropolitan levels revealed that the frequency of late-stage diagnosis increased recently in urban areas, and this trend was significant for white males. The annual rate of decrease in late-stage diagnosis and the onset years for significant declines varied greatly among counties and racial groups. Most counties with non-significant average annual percent change (AAPC) were located in the Florida Panhandle for white males, whereas they clustered in South-eastern Florida for black males. The new disparity statistic indicated that the spatial extent of racial disparities reached a

  11. Temporal trends in motor vehicle fatalities in the United States, 1968 to 2010 - a joinpoint regression analysis.

    PubMed

    Bandi, Priti; Silver, Diana; Mijanovich, Tod; Macinko, James

    2015-12-01

    In the past 40 years, a variety of factors might have impacted motor vehicle (MV) fatality trends in the US, including public health policies, engineering innovations, trauma care improvements, etc. These factors varied in their timing across states/localities, and many were targeted at particular population subgroups. In order to identify and quantify differential rates of change over time and differences in trend patterns between population subgroups, this study employed a novel analytic method to assess temporal trends in MV fatalities between 1968 and 2010, by age group and sex. Cause-specific MV fatality data from traffic injuries between 1968 and 2010, based on death certificates filed in the 50 states, and DC were obtained from Centers for Disease Control and Prevention Wide-ranging Online Data for Epidemiologic Research (CDC WONDER). Long-term (1968 to 2010) and short-term (log-linear piecewise segments) trends in fatality rates were compared for males and females overall and in four separate age groups using joinpoint regression. MV fatalities declined on average by 2.4% per year in males and 2.2% per year in females between 1968 and 2010, with significant declines observed in all age groups and in both sexes. In males overall and those 25 to 64 years, sharp declines between 1968 and mid-to-late 1990s were followed by a stalling until the mid-2000s, but rates in females experienced a long-term steady decline of a lesser magnitude than males during this time. Trends in those aged <1 to 14 years and 15 to 24 years were mostly steady over time, but males had a larger decline than females in the latter age group between 1968 and the mid-2000s. In ages 65+, short-term trends were similar between sexes. Despite significant long-term declines in MV fatalities, the application of Joinpoint Regression found that progress in young adult and middle-aged adult males stalled in recent decades and rates in males declined relatively more than in females in certain age

  12. Long-term trends of suicide by choice of method in Norway: a joinpoint regression analysis of data from 1969 to 2012.

    PubMed

    Puzo, Quirino; Qin, Ping; Mehlum, Lars

    2016-03-11

    Suicide mortality and the rates by specific methods in a population may change over time in response to concurrent changes in relevant factors in society. This study aimed to identify significant changing points in method-specific suicide mortality from 1969 to 2012 in Norway. Data on suicide mortality by specific methods and by sex and age were retrieved from the Norwegian Cause-of-Death Register. Long-term trends in age-standardized rates of suicide mortality were analyzed by using joinpoint regression analysis. The most frequently used suicide method in the total population was hanging, followed by poisoning and firearms. Men chose suicide by firearms more often than women, whereas poisoning and drowning were more frequently used by women. The joinpoint analysis revealed that the overall trend of suicide mortality significantly changed twice along the period of 1969 to 2012 for both sexes. The male age-standardized suicide rate increased by 3.1% per year until 1989, and decreased by 1.2% per year between 1994 and 2012. Among females the long-term suicide rate increased by 4.0% per year until 1988, decreased by 5.5% through 1995, and then stabilized. Both sexes experienced an upward trend for suicide by hanging during the 44-year observation period, with a particularly significant increase in 15-24 year old males. The most distinct change among men was seen for firearms after 1988 with a significant decrease through 2012 of around 5% per year. For women, significant reductions since 1985-88 were observed for suicide by drowning and poisoning. The present study demonstrates different time trends for different suicide methods with significant reductions in suicide by firearms, drowning and poisoning after the peak in the suicide rate in the late 1980s. Suicide by means of hanging continuously increased, but did not fully compensate for the reduced use of other methods. This lends some support for the effectiveness of method-specific suicide preventive measures

  13. Malignant Lymphatic and Hematopoietic Neoplasms Mortality in Serbia, 1991–2010: A Joinpoint Regression Analysis

    PubMed Central

    Ilic, Milena; Ilic, Irena

    2014-01-01

    Background Limited data on mortality from malignant lymphatic and hematopoietic neoplasms have been published for Serbia. Methods The study covered population of Serbia during the 1991–2010 period. Mortality trends were assessed using the joinpoint regression analysis. Results Trend for overall death rates from malignant lymphoid and haematopoietic neoplasms significantly decreased: by −2.16% per year from 1991 through 1998, and then significantly increased by +2.20% per year for the 1998–2010 period. The growth during the entire period was on average +0.8% per year (95% CI 0.3 to 1.3). Mortality was higher among males than among females in all age groups. According to the comparability test, mortality trends from malignant lymphoid and haematopoietic neoplasms in men and women were parallel (final selected model failed to reject parallelism, P = 0.232). Among younger Serbian population (0–44 years old) in both sexes: trends significantly declined in males for the entire period, while in females 15–44 years of age mortality rates significantly declined only from 2003 onwards. Mortality trend significantly increased in elderly in both genders (by +1.7% in males and +1.5% in females in the 60–69 age group, and +3.8% in males and +3.6% in females in the 70+ age group). According to the comparability test, mortality trend for Hodgkin's lymphoma differed significantly from mortality trends for all other types of malignant lymphoid and haematopoietic neoplasms (P<0.05). Conclusion Unfavourable mortality trend in Serbia requires targeted intervention for risk factors control, early diagnosis and modern therapy. PMID:25333862

  14. Colorectal cancer mortality trends in Serbia during 1991-2010: an age-period-cohort analysis and a joinpoint regression analysis.

    PubMed

    Ilic, Milena; Ilic, Irena

    2016-06-22

    For both men and women worldwide, colorectal cancer is among the leading causes of cancer-related death. This study aimed to assess the mortality trends of colorectal cancer in Serbia between 1991 and 2010, prior to the introduction of population-based screening. Joinpoint regression analysis was used to estimate average annual percent change (AAPC) with the corresponding 95% confidence interval (CI). Furthermore, age-period-cohort analysis was performed to examine the effects of birth cohort and calendar period on the observed temporal trends. We observed a significantly increased trend in colorectal cancer mortality in Serbia during the study period (AAPC = 1.6%, 95% CI 1.3%-1.8%). Colorectal cancer showed an increased mortality trend in both men (AAPC = 2.0%, 95% CI 1.7%-2.2%) and women (AAPC = 1.0%, 95% CI 0.6%-1.4%). The temporal trend of colorectal cancer mortality was significantly affected by birth cohort (P < 0.05), whereas the study period did not significantly affect the trend (P = 0.072). Colorectal cancer mortality increased for the first several birth cohorts in Serbia (from 1916 to 1955), followed by downward flexion for people born after the 1960s. According to comparability test, overall mortality trends for colon cancer and rectal and anal cancer were not parallel (the final selected model rejected parallelism, P < 0.05). We found that colorectal cancer mortality in Serbia increased considerably over the past two decades. Mortality increased particularly in men, but the trends were different according to age group and subsite. In Serbia, interventions to reduce colorectal cancer burden, especially the implementation of a national screening program, as well as treatment improvements and measures to encourage the adoption of a healthy lifestyle, are needed.

  15. Research Trends in Evidence-Based Medicine: A Joinpoint Regression Analysis of More than 50 Years of Publication Data

    PubMed Central

    Hung, Bui The; Long, Nguyen Phuoc; Hung, Le Phi; Luan, Nguyen Thien; Anh, Nguyen Hoang; Nghi, Tran Diem; Van Hieu, Mai; Trang, Nguyen Thi Huyen; Rafidinarivo, Herizo Fabien; Anh, Nguyen Ky; Hawkes, David; Huy, Nguyen Tien; Hirayama, Kenji

    2015-01-01

    Background Evidence-based medicine (EBM) has developed as the dominant paradigm of assessment of evidence that is used in clinical practice. Since its development, EBM has been applied to integrate the best available research into diagnosis and treatment with the purpose of improving patient care. In the EBM era, a hierarchy of evidence has been proposed, including various types of research methods, such as meta-analysis (MA), systematic review (SRV), randomized controlled trial (RCT), case report (CR), practice guideline (PGL), and so on. Although there are numerous studies examining the impact and importance of specific cases of EBM in clinical practice, there is a lack of research quantitatively measuring publication trends in the growth and development of EBM. Therefore, a bibliometric analysis was constructed to determine the scientific productivity of EBM research over decades. Methods NCBI PubMed database was used to search, retrieve and classify publications according to research method and year of publication. Joinpoint regression analysis was undertaken to analyze trends in research productivity and the prevalence of individual research methods. Findings Analysis indicates that MA and SRV, which are classified as the highest ranking of evidence in the EBM, accounted for a relatively small but auspicious number of publications. For most research methods, the annual percent change (APC) indicates a consistent increase in publication frequency. MA, SRV and RCT show the highest rate of publication growth in the past twenty years. Only controlled clinical trials (CCT) shows a non-significant reduction in publications over the past ten years. Conclusions Higher quality research methods, such as MA, SRV and RCT, are showing continuous publication growth, which suggests an acknowledgement of the value of these methods. This study provides the first quantitative assessment of research method publication trends in EBM. PMID:25849641

  16. Trends in Unintentional Fall-Related Traumatic Brain Injury Death Rates in Older Adults in the United States, 1980-2010: A Joinpoint Analysis.

    PubMed

    Sung, Kuan-Chin; Liang, Fu-Wen; Cheng, Tain-Junn; Lu, Tsung-Hsueh; Kawachi, Ichiro

    2015-07-15

    Unintentional fall-related traumatic brain injury (TBI) death rate is high in older adults in the United States, but little is known regarding trends of these death rates. We sought to examine unintentional fall-related TBI death rates by age and sex in older adults from 1980 through 2010 in the United States. We used multiple-cause mortality data from 1980 through 2010 (31 years of data) to identify fall-related TBI deaths. Using a joinpoint regression program, we determined the joinpoints (years at which trends change significantly) and annual percentage changes (APCs) in mortality trends. The fall-related TBI death rates (deaths per 100,000 population) in older adults ages 65-74, 75-84, and 85 years and above were 2.7, 9.2, and 21.5 for females and 8.5, 18.2, and 40.8 for males, respectively, in 1980. The rate was about the same in 1992, yet increased markedly to 5.9, 23.4, and 68.9 for females and 11.6, 41.2, and 112.4 for males, respectively, in 2010. For males all 65 years years of age and above, we found the first joinpoint in 1992, when the APC for 1980 through 1992, -0.8%, changed to 6.2% for 1992-2005. The second joinpoint occurred in 2005, when the APC decreased to 3.7% for 2005-2010. For all females 65 years of age and above, the first joinpoint was in 1993 when the APC for 1980 through 1993, -0.2%, changed to 7.6% from 1993 to 2005. The second joinpoint occurred in 2005 when the APC decreased to 3.8% for 2005-2010. This descriptive epidemiological study suggests increasing fall-related TBI death rates from 1992 to 2005 and then a slowdown of increasing trends between 2005 and 2010. Continued monitoring of fall-related TBI death rate trends is needed to determine the burden of this public health problem among older adults in the United States.

  17. Selecting the Final Model — Joinpoint Help System 4.4.0.0

    Cancer.gov

    Why doesn't the joinpoint program give me the best possible fit? I can see other models with more joinpoints that would fit better. Exactly how does the program decide which tests to perform and which joinpoint model is the final model?

  18. Consecutive Non-Significant Segments — Joinpoint Help System 4.4.0.0

    Cancer.gov

    Sometimes, the APC for one segment is significantly different from zero, but when an extra joinpoint in the segment is determined by the Joinpoint software, neither APCs for the two consecutive segments are significant. Why?

  19. Pancreatic cancer mortality in Serbia from 1991-2010 – a joinpoint analysis

    PubMed Central

    Ilić, Milena; Vlajinac, Hristina; Marinković, Jelena; Kocev, Nikola

    2013-01-01

    Aim To analyze the trends of pancreatic cancer mortality in Serbia. Methods The study covered the population of Serbia in the period 1991 to 2010. Mortality trends were assessed by the joinpoint regression analysis by age and sex. Results Age-standardized mortality rates ranged from 5.93 to 8.57 per 100 000 in men and from 3.51 to 5.79 per 100 000 in women. Pancreatic cancer mortality in all age groups was higher among men than among women. It was continuously increasing since 1991 by 1.6% (95% confidence interval [CI] 1.1 to 2.0) yearly in men and by 2.2% (95% CI 1.7 to 2.7) yearly in women. Changes in mortality were not significant in younger age groups for both sexes. In older men (≥55 years), mortality was increasing, although in age groups 70-74 and 80-84 the increase was not significant. In 65-69 years old men, the increase in mortality was significant only in the period 2004 to 2010. In ≥50 years old women, mortality significantly increased from 1991 onward. In 75-79 years old women, a non-significant decrease in the period 1991 to 2000 was followed by a significant increase from 2000 to 2010. Conclusion Serbia is one of the countries with the highest pancreatic cancer mortality in the world, with increasing mortality trend in both sexes and in most age groups. PMID:23986278

  20. Ischaemic heart disease mortality in Serbia, 1991-2013; a joinpoint analysis

    PubMed Central

    Ilic, Milena; Ilic, Irena

    2017-01-01

    Background & objectives: Ischaemic heart disease (IHD) has been one of the leading causes of mortality in the world. In many European countries the mortality rates due to IHD have been rising rapidly. This study was aimed to assess the IHD mortality trend in Serbia. Methods: A population-based cross-sectional study analyzing IHD mortality in Serbia in the period 1991-2013 was carried out based on official data. The age-standardized rates (ASRs, per 100,000) were calculated using the direct method, according to the European standard population. Joinpoint analysis was used to estimate the average annual percentage change (AAPC) with the corresponding 95 per cent confidence interval (CI). Results: More than 253,000 people (143,420 men and 110,276 women) died due to IHD in Serbia during the observed period, and most of them (over 160,000 people) were patients with myocardial infarction (MI). Average annual ASR for IHD was 113.6/100,000. There was no overall significant trend for mortality due to IHD (AAPC=+0.1%, 95% CI −0.8-1.0), but there was one joinpoint: the trend significantly increased by +2.3 per cent per year from 1991 to 2006 and then significantly decreased by −6.4 per cent from 2006 to onwards. Significantly decreased mortality trends for MI in both genders were observed: according to the comparability test, mortality trends in men and women were parallel (final selected model failed to reject parallelism, P=0.0567). Interpretation & conclusions: No significant trend for mortality due to IHD was observed in Serbia during the study period. The substantial decline of mortality from IHD seen in most developed countries during the past decades was not observed in Serbia. Further efforts are required to reduce mortality from IHD in Serbian population. PMID:29664033

  1. Trends in Lung Cancer Incidence in Delhi, India 1988-2012: Age-Period-Cohort and Joinpoint Analyses

    PubMed

    Malhotra, Rajeev Kumar; Manoharan, Nalliah; Nair, Omana; Deo, Suryanarayana; Rath, Goura Kishor

    2018-06-25

    Introduction: Lung cancer (LC) has been one of the most commonly diagnosed cancers worldwide, both in terms of new cases and mortality. Exponential growth of economic and industrial activities in recent decades in the Delhi urban area may have increased the incidence of LC. The primary objective of this study was to evaluate the time trend according to gender. Method: LC incidence data over 25 years were obtained from the population based urban Delhi cancer registry. Joinpoint regression analysis was applied for evaluating the time trend of age-standardized incidence rates. The age-period-cohort (APC) model was employed using Poisson distribution with a log link function and the intrinsic estimator method. Results: During the 25 years, 13,489 male and 3,259 female LC cases were registered, accounting for 9.78% of male and 2.53% of female total cancer cases. Joinpoint regression analysis revealed that LC incidence in males continued to increase during the entire period, a sharp acceleration being observed starting from 2009. In females the LC incidence rate remained a plateau during 1988-2002 and thereafter increased. The cumulative risks for 1988-2012 were 1.79% and 0.45%. The full APC (IE) model showed best fit for an age-period-cohort effect on LC incidence, with significant increase with age peaking at 70-74 years in males and 65-69 years in females. A rising period effect was observed after adjusting for age and cohort effects in both genders and a declining cohort effect was identified after controlling for age and period effects. Conclusion: The incidence of LC in urban Delhi showed increasing trend from 1988-2012. Known factors such as environmental conservation, tobacco control, physical activity awareness and medical security should be implemented more vigorously over the long term in our population. Creative Commons Attribution License

  2. Jump Model / Comparability Ratio Model — Joinpoint Help System 4.4.0.0

    Cancer.gov

    The Jump Model / Comparability Ratio Model in the Joinpoint software provides a direct estimation of trend data (e.g. cancer rates) where there is a systematic scale change, which causes a “jump” in the rates, but is assumed not to affect the underlying trend.

  3. Trends in gastrointestinal cancer incidence in Iran, 2001-2010: a joinpoint analysis

    PubMed Central

    Motlagh, Ali; Karimi Jaberi, Maryam

    2016-01-01

    OBJECTIVES The main purpose of this study was to evaluate changes in the time trends of stomach, colorectal, and esophageal cancer during the past decade in Iran. METHODS Cancer incidence data for the years 2001 to 2010 were obtained from the cancer registration of the Ministry of Health. All incidence rates were directly age-standardized to the world standard population. In order to identified significant changes in time trends, we performed a joinpoint analysis. The annual percent change (APC) for each segment of the trends was then calculated. RESULTS The incidence of stomach cancer increased from 4.18 and 2.41 per 100,000 population in men and women, respectively, in 2001 to 17.06 (APC, 16.7%) and 8.85 (APC, 16.2%) per 100,000 population in 2010 for men and women, respectively. The corresponding values for colorectal cancer were 2.12 and 2.00 per 100,000 population for men and women, respectively, in 2001 and 11.28 (APC, 20.0%) and 10.33 (APC, 20.0%) per 100,000 in 2010. For esophageal cancer, the corresponding increase was from 3.25 and 2.10 per 100,000 population in 2001 to 5.57 (APC, 12.0%) and 5.62 (APC, 11.2%) per 100,000 population among men and women, respectively. The incidence increased most rapidly for stomach cancer in men and women aged 80 years and older (APC, 23.7% for men; APC, 18.6% for women), for colorectal cancer in men aged 60 to 69 years (APC, 24.2%) and in women aged 50 to 59 years (APC, 25.1%), and for esophageal cancer in men and women aged 80 years and older (APC, 17.5% for men; APC,15.3% for women) over the period of the study. CONCLUSIONS The incidence of gastrointestinal cancer significantly increased during the past decade. Therefore, monitoring the trends of cancer incidence can assist efforts for cancer prevention and control. PMID:27923268

  4. Analysis of cerebrovascular disease mortality trends in Andalusia (1980-2014).

    PubMed

    Cayuela, A; Cayuela, L; Rodríguez-Domínguez, S; González, A; Moniche, F

    2017-03-15

    In recent decades, mortality rates for cerebrovascular diseases (CVD) have decreased significantly in many countries. This study analyses recent tendencies in CVD mortality rates in Andalusia (1980-2014) to identify any changes in previously observed sex and age trends. CVD mortality and population data were obtained from Spain's National Statistics Institute database. We calculated age-specific and age-standardised mortality rates using the direct method (European standard population). Joinpoint regression analysis was used to estimate the annual percentage change in rates and identify significant changes in mortality trends. We also estimated rate ratios between Andalusia and Spain. Standardised rates for both males and females showed 3 periods in joinpoint regression analysis: an initial period of significant decline (1980-1997), a period of rate stabilisation (1997-2003), and another period of significant decline (2003-2014). Between 1997 and 2003, age-standardised rates stabilised in Andalusia but continued to decrease in Spain as a whole. This increased in the gap between CVD mortality rates in Andalusia and Spain for both sexes and most age groups. Copyright © 2017 The Author(s). Publicado por Elsevier España, S.L.U. All rights reserved.

  5. Principal component regression analysis with SPSS.

    PubMed

    Liu, R X; Kuang, J; Gong, Q; Hou, X L

    2003-06-01

    The paper introduces all indices of multicollinearity diagnoses, the basic principle of principal component regression and determination of 'best' equation method. The paper uses an example to describe how to do principal component regression analysis with SPSS 10.0: including all calculating processes of the principal component regression and all operations of linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. The principal component regression analysis can be used to overcome disturbance of the multicollinearity. The simplified, speeded up and accurate statistical effect is reached through the principal component regression analysis with SPSS.

  6. Regression Analysis by Example. 5th Edition

    ERIC Educational Resources Information Center

    Chatterjee, Samprit; Hadi, Ali S.

    2012-01-01

    Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…

  7. Leukemia in Iran: Epidemiology and Morphology Trends.

    PubMed

    Koohi, Fatemeh; Salehiniya, Hamid; Shamlou, Reza; Eslami, Soheyla; Ghojogh, Ziyaeddin Mahery; Kor, Yones; Rafiemanesh, Hosein

    2015-01-01

    Leukemia accounts for 8% of total cancer cases and involves all age groups with different prevalence and incidence rates in Iran and the entire world and causes a significant death toll and heavy expenses for diagnosis and treatment processes. This study was done to evaluate epidemiology and morphology of blood cancer during 2003-2008. This cross- sectional study was carried out based on re- analysis of the Cancer Registry Center report of the Health Deputy in Iran during a 6-year period (2003 - 2008). Statistical analysis for incidence time trends and morphology change percentage was performed with joinpoint regression analysis using the software Joinpoint Regression Program. During the studied years a total of 18,353 hematopoietic and reticuloendothelial system cancers were recorded. Chi square test showed significant difference between sex and morphological types of blood cancer (P-value<0.001). Joinpoint analysis showed a significant increasing trend for the adjusted standard incidence rate (ASIR) for both sexes (P-value<0.05). Annual percent changes (APC) for women and men were 18.7 and 19.9, respectively. The most common morphological blood cancers were ALL, ALM, MM and CLL which accounted for 60% of total hematopoietic system cancers. Joinpoint analyze showed a significant decreasing trend for ALM in both sexes (P-value<0.05). Hematopoietic system cancers in Iran demonstrate an increasing trend for incidence rate and decreasing trend for ALL, ALM and CLL morphology.

  8. Functional Relationships and Regression Analysis.

    ERIC Educational Resources Information Center

    Preece, Peter F. W.

    1978-01-01

    Using a degenerate multivariate normal model for the distribution of organismic variables, the form of least-squares regression analysis required to estimate a linear functional relationship between variables is derived. It is suggested that the two conventional regression lines may be considered to describe functional, not merely statistical,…

  9. Regression Analysis and the Sociological Imagination

    ERIC Educational Resources Information Center

    De Maio, Fernando

    2014-01-01

    Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.

  10. Precision Efficacy Analysis for Regression.

    ERIC Educational Resources Information Center

    Brooks, Gordon P.

    When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross- validity approach to select sample sizes…

  11. Study of colorectal mortality in the Andalusian population.

    PubMed

    Cayuela, A; Rodríguez-Domínguez, S; Garzón-Benavides, M; Pizarro-Moreno, A; Giráldez-Gallego, A; Cordero-Fernández, C

    2011-06-01

    to provide up-to-date information and to analyze recent changes in colorectal cancer mortality trends in Andalusia during the period of 1980-2008 using joinpoint regression models. age- and sex-specific colorectal cancer deaths were taken from the official vital statistics published by the Instituto de Estadística de Andalucía for the years 1980 to 2008. We computed age-specific rates for each 5-year age group and calendar year and age-standardized mortality rates per 100,000 men and women. A joinpoint regression analysis was used for trend analysis of standardized rates. Joinpoint regression analysis was used to identify the years when a significant change in the linear slope of the temporal trend occurred. The best fitting points (the "join-points") are chosen where the rate significantly changes. mortality from colorectal cancer in Andalusia during the period studied has increased, from 277 deaths in 1980 to 1,227 in 2008 in men, and from 333 to 805 deaths in women. Adjusted overall colorectal cancer mortality rates increased from 7.7 to 17.0 deaths per 100,000 person-years in men and from 6.6 to 9.0 per 100,000 person-years in women Changes in mortality did not evolve similarly for men and women. Age-specific CRC mortality rates are lower in women than in men, which imply that women reach comparable levels of colorectal cancer mortality at higher ages than men. sex differences for colorectal cancer mortality have been widening in the last decade in Andalusia. In spite of the decreasing trends in age-adjusted mortality rates in women, incidence rates and the absolute numbers of deaths are still increasing, largely because of the aging of the population. Consequently, colorectal cancer still has a large impact on health care services, and this impact will continue to increase for many more years.

  12. Oral cavity cancer trends over the past 25 years in Hong Kong: a multidirectional statistical analysis.

    PubMed

    Ushida, Keisuke; McGrath, Colman P; Lo, Edward C M; Zwahlen, Roger A

    2015-07-24

    Even though oral cavity cancer (OCC; ICD 10 codes C01, C02, C03, C04, C05, and C06) ranks eleventh among the world's most common cancers, accounting for approximately 2 % of all cancers, a trend analysis of OCC in Hong Kong is lacking. Hong Kong has experienced rapid economic growth with socio-cultural and environmental change after the Second World War. This together with the collected data in the cancer registry provides interesting ground for an epidemiological study on the influence of socio-cultural and environmental factors on OCC etiology. A multidirectional statistical analysis of the OCC trends over the past 25 years was performed using the databases of the Hong Kong Cancer Registry. The age, period, and cohort (APC) modeling was applied to determine age, period, and cohort effects on OCC development. Joinpoint regression analysis was used to find secular trend changes of both age-standardized and age-specific incidence rates. The APC model detected that OCC development in men was mainly dominated by the age effect, whereas in women an increasing linear period effect together with an age effect became evident. The joinpoint regression analysis showed a general downward trend of age-standardized incidence rates of OCC for men during the entire investigated period, whereas women demonstrated a significant upward trend from 2001 onwards. The results suggest that OCC incidence in Hong Kong appears to be associated with cumulative risk behaviors of the population, despite considerable socio-cultural and environmental changes after the Second World War.

  13. Multicollinearity and Regression Analysis

    NASA Astrophysics Data System (ADS)

    Daoud, Jamal I.

    2017-12-01

    In regression analysis it is obvious to have a correlation between the response and predictor(s), but having correlation among predictors is something undesired. The number of predictors included in the regression model depends on many factors among which, historical data, experience, etc. At the end selection of most important predictors is something objective due to the researcher. Multicollinearity is a phenomena when two or more predictors are correlated, if this happens, the standard error of the coefficients will increase [8]. Increased standard errors means that the coefficients for some or all independent variables may be found to be significantly different from In other words, by overinflating the standard errors, multicollinearity makes some variables statistically insignificant when they should be significant. In this paper we focus on the multicollinearity, reasons and consequences on the reliability of the regression model.

  14. Multivariate Regression Analysis and Slaughter Livestock,

    DTIC Science & Technology

    AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS , ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY

  15. Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis

    PubMed Central

    Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

    2015-01-01

    Background: Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. Methods: In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. Results: The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Conclusion: Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended. PMID:26793655

  16. Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis.

    PubMed

    Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

    2015-01-01

    Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended.

  17. Regression Analysis: Legal Applications in Institutional Research

    ERIC Educational Resources Information Center

    Frizell, Julie A.; Shippen, Benjamin S., Jr.; Luna, Andrew L.

    2008-01-01

    This article reviews multiple regression analysis, describes how its results should be interpreted, and instructs institutional researchers on how to conduct such analyses using an example focused on faculty pay equity between men and women. The use of multiple regression analysis will be presented as a method with which to compare salaries of…

  18. Regression analysis using dependent Polya trees.

    PubMed

    Schörgendorfer, Angela; Branscum, Adam J

    2013-11-30

    Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.

  19. RAWS II: A MULTIPLE REGRESSION ANALYSIS PROGRAM,

    DTIC Science & Technology

    This memorandum gives instructions for the use and operation of a revised version of RAWS, a multiple regression analysis program. The program...of preprocessed data, the directed retention of variable, listing of the matrix of the normal equations and its inverse, and the bypassing of the regression analysis to provide the input variable statistics only. (Author)

  20. Common pitfalls in statistical analysis: Linear regression analysis

    PubMed Central

    Aggarwal, Rakesh; Ranganathan, Priya

    2017-01-01

    In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis. PMID:28447022

  1. Moderation analysis using a two-level regression model.

    PubMed

    Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott

    2014-10-01

    Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.

  2. Two Paradoxes in Linear Regression Analysis.

    PubMed

    Feng, Ge; Peng, Jing; Tu, Dongke; Zheng, Julia Z; Feng, Changyong

    2016-12-25

    Regression is one of the favorite tools in applied statistics. However, misuse and misinterpretation of results from regression analysis are common in biomedical research. In this paper we use statistical theory and simulation studies to clarify some paradoxes around this popular statistical method. In particular, we show that a widely used model selection procedure employed in many publications in top medical journals is wrong. Formal procedures based on solid statistical theory should be used in model selection.

  3. Two Paradoxes in Linear Regression Analysis

    PubMed Central

    FENG, Ge; PENG, Jing; TU, Dongke; ZHENG, Julia Z.; FENG, Changyong

    2016-01-01

    Summary Regression is one of the favorite tools in applied statistics. However, misuse and misinterpretation of results from regression analysis are common in biomedical research. In this paper we use statistical theory and simulation studies to clarify some paradoxes around this popular statistical method. In particular, we show that a widely used model selection procedure employed in many publications in top medical journals is wrong. Formal procedures based on solid statistical theory should be used in model selection. PMID:28638214

  4. Time Trend Analysis of Cancer‏ Incidence in Caspian Sea, 2004 - 2009: A Population-based Cancer Registries Study (northern Iran).

    PubMed

    Salehiniya, Hamid; Ghobadi Dashdebi, Sakineh; Rafiemanesh, Hosein; Mohammadian-Hafshejani, Abdollah; Enayatrad, Mostafa

    2016-01-01

    Cancer is a major public health problem in the world. In Iran especially after a transition to a dynamic and urban community, the pattern of cancer has changed significantly. An important change occurred regarding the incidence of cancer at the southern shores of the Caspian Sea, including Gilan, Mazandaran and Golestan province. This study was designed it investigate the epidemiology and changes in trend of cancer incidence in the geographic region of the Caspian Sea (North of Iran). Data were collected from Cancer Registry Center report of Iran health deputy. Trends of incidence were analyzed by joinpoint regression analysis. During the study period year (2004-2009), 33,807 cases of cancer had been recorded in three provinces of Gilan, Mazandran and Golstan. Joinpoint analysis indicated a significant increase in age-standardized incidence rates (ASR) with an average annual percentage change (AAPC) 10.3, 8.5 and 5.2 in Gilan, Mazandaran and Golestan, respectively. The most common cancer in these provinces were correspondingly cancer of stomach, breast, skin, colorectal and bladder, respectively. The incidence of cancer tends to be increasing in North of Iran. These findings warrant the epidemiologic studies are helpful in planning preventive programs and recognition of risk factors.

  5. Using Dominance Analysis to Determine Predictor Importance in Logistic Regression

    ERIC Educational Resources Information Center

    Azen, Razia; Traxel, Nicole

    2009-01-01

    This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R[superscript 2] analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…

  6. Regression Model Optimization for the Analysis of Experimental Data

    NASA Technical Reports Server (NTRS)

    Ulbrich, N.

    2009-01-01

    A candidate math model search algorithm was developed at Ames Research Center that determines a recommended math model for the multivariate regression analysis of experimental data. The search algorithm is applicable to classical regression analysis problems as well as wind tunnel strain gage balance calibration analysis applications. The algorithm compares the predictive capability of different regression models using the standard deviation of the PRESS residuals of the responses as a search metric. This search metric is minimized during the search. Singular value decomposition is used during the search to reject math models that lead to a singular solution of the regression analysis problem. Two threshold dependent constraints are also applied. The first constraint rejects math models with insignificant terms. The second constraint rejects math models with near-linear dependencies between terms. The math term hierarchy rule may also be applied as an optional constraint during or after the candidate math model search. The final term selection of the recommended math model depends on the regressor and response values of the data set, the user s function class combination choice, the user s constraint selections, and the result of the search metric minimization. A frequently used regression analysis example from the literature is used to illustrate the application of the search algorithm to experimental data.

  7. Linear regression analysis for comparing two measurers or methods of measurement: but which regression?

    PubMed

    Ludbrook, John

    2010-07-01

    1. There are two reasons for wanting to compare measurers or methods of measurement. One is to calibrate one method or measurer against another; the other is to detect bias. Fixed bias is present when one method gives higher (or lower) values across the whole range of measurement. Proportional bias is present when one method gives values that diverge progressively from those of the other. 2. Linear regression analysis is a popular method for comparing methods of measurement, but the familiar ordinary least squares (OLS) method is rarely acceptable. The OLS method requires that the x values are fixed by the design of the study, whereas it is usual that both y and x values are free to vary and are subject to error. In this case, special regression techniques must be used. 3. Clinical chemists favour techniques such as major axis regression ('Deming's method'), the Passing-Bablok method or the bivariate least median squares method. Other disciplines, such as allometry, astronomy, biology, econometrics, fisheries research, genetics, geology, physics and sports science, have their own preferences. 4. Many Monte Carlo simulations have been performed to try to decide which technique is best, but the results are almost uninterpretable. 5. I suggest that pharmacologists and physiologists should use ordinary least products regression analysis (geometric mean regression, reduced major axis regression): it is versatile, can be used for calibration or to detect bias and can be executed by hand-held calculator or by using the loss function in popular, general-purpose, statistical software.

  8. Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis

    ERIC Educational Resources Information Center

    Kim, Rae Seon

    2011-01-01

    When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…

  9. General Nature of Multicollinearity in Multiple Regression Analysis.

    ERIC Educational Resources Information Center

    Liu, Richard

    1981-01-01

    Discusses multiple regression, a very popular statistical technique in the field of education. One of the basic assumptions in regression analysis requires that independent variables in the equation should not be highly correlated. The problem of multicollinearity and some of the solutions to it are discussed. (Author)

  10. Trends in incidence of lung cancer in Croatia from 2001 to 2013: gender and regional differences

    PubMed Central

    Siroglavić, Katarina-Josipa; Polić Vižintin, Marina; Tripković, Ingrid; Šekerija, Mario; Kukulj, Suzana

    2017-01-01

    Aim To provide an overview of the lung cancer incidence trends in the City of Zagreb (Zagreb), Split-Dalmatia County (SDC), and Croatia in the period from 2001 to 2013. Method Incidence data were obtained from the Croatian National Cancer Registry. For calculating incidence rates per 100 000 population, we used population estimates for the period 2001-2013 from the Croatian Bureau of Statistics. Age-standardized rates of lung cancer incidence were calculated by the direct standardization method using the European Standard Population. To describe incidence trends, we used joinpoint regression analysis. Results Joinpoint analysis showed a statistically significant decrease in lung cancer incidence in men in all regions, with an annual percentage change (APC) of -2.2% for Croatia, 1.9% for Zagreb, and -2.0% for SDC. In women, joinpoint analysis showed a statistically significant increase in the incidence for Croatia, with APC of 1.4%, a statistically significant increase of 1.0% for Zagreb, and no significant change in trend for SDC. In both genders, joinpoint analysis showed a significant decrease in age-standardized incidence rates of lung cancer, with APC of -1.3% for Croatia, -1.1% for Zagreb, and -1.6% for SDC. Conclusion There was an increase in female lung cancer incidence rate and a decrease in male lung cancer incidence rate in Croatia in 2001-20013 period, with similar patterns observed in all the investigated regions. These results highlight the importance of smoking prevention and cessation policies, especially among women and young people. PMID:29094814

  11. Resting-state functional magnetic resonance imaging: the impact of regression analysis.

    PubMed

    Yeh, Chia-Jung; Tseng, Yu-Sheng; Lin, Yi-Ru; Tsai, Shang-Yueh; Huang, Teng-Yi

    2015-01-01

    To investigate the impact of regression methods on resting-state functional magnetic resonance imaging (rsfMRI). During rsfMRI preprocessing, regression analysis is considered effective for reducing the interference of physiological noise on the signal time course. However, it is unclear whether the regression method benefits rsfMRI analysis. Twenty volunteers (10 men and 10 women; aged 23.4 ± 1.5 years) participated in the experiments. We used node analysis and functional connectivity mapping to assess the brain default mode network by using five combinations of regression methods. The results show that regressing the global mean plays a major role in the preprocessing steps. When a global regression method is applied, the values of functional connectivity are significantly lower (P ≤ .01) than those calculated without a global regression. This step increases inter-subject variation and produces anticorrelated brain areas. rsfMRI data processed using regression should be interpreted carefully. The significance of the anticorrelated brain areas produced by global signal removal is unclear. Copyright © 2014 by the American Society of Neuroimaging.

  12. Development of a User Interface for a Regression Analysis Software Tool

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert Manfred; Volden, Thomas R.

    2010-01-01

    An easy-to -use user interface was implemented in a highly automated regression analysis tool. The user interface was developed from the start to run on computers that use the Windows, Macintosh, Linux, or UNIX operating system. Many user interface features were specifically designed such that a novice or inexperienced user can apply the regression analysis tool with confidence. Therefore, the user interface s design minimizes interactive input from the user. In addition, reasonable default combinations are assigned to those analysis settings that influence the outcome of the regression analysis. These default combinations will lead to a successful regression analysis result for most experimental data sets. The user interface comes in two versions. The text user interface version is used for the ongoing development of the regression analysis tool. The official release of the regression analysis tool, on the other hand, has a graphical user interface that is more efficient to use. This graphical user interface displays all input file names, output file names, and analysis settings for a specific software application mode on a single screen which makes it easier to generate reliable analysis results and to perform input parameter studies. An object-oriented approach was used for the development of the graphical user interface. This choice keeps future software maintenance costs to a reasonable limit. Examples of both the text user interface and graphical user interface are discussed in order to illustrate the user interface s overall design approach.

  13. Applying Regression Analysis to Problems in Institutional Research.

    ERIC Educational Resources Information Center

    Bohannon, Tom R.

    1988-01-01

    Regression analysis is one of the most frequently used statistical techniques in institutional research. Principles of least squares, model building, residual analysis, influence statistics, and multi-collinearity are described and illustrated. (Author/MSE)

  14. Method for nonlinear exponential regression analysis

    NASA Technical Reports Server (NTRS)

    Junkin, B. G.

    1972-01-01

    Two computer programs developed according to two general types of exponential models for conducting nonlinear exponential regression analysis are described. Least squares procedure is used in which the nonlinear problem is linearized by expanding in a Taylor series. Program is written in FORTRAN 5 for the Univac 1108 computer.

  15. Secular trend analysis of lung cancer incidence in Sihui city, China between 1987 and 2011.

    PubMed

    Du, Jin-Lin; Lin, Xiao; Zhang, Li-Fang; Li, Yan-Hua; Xie, Shang-Hang; Yang, Meng-Jie; Guo, Jie; Lin, Er-Hong; Liu, Qing; Hong, Ming-Huang; Huang, Qi-Hong; Liao, Zheng-Er; Cao, Su-Mei

    2015-07-31

    With industrial and econom ic development in recent decades in South China, cancer incidence may have changed due to the changing lifestyle and environment. However, the trends of lung cancer and the roles of smoking and other environmental risk factors in the development of lung cancer in rural areas of South China remain unclear. The purpose of this study was to explore the lung cancer incidence trends and the possible causes of these trends. Joinpoint regression analysis and the age-period-cohort (APC) model were used to analyze the lung cancer incidence trends in Sihui, Guangdong province, China between 1987 and 2011, and explore the possible causes of these trends. A total of 2,397 lung cancer patients were involved in this study. A 3-fold increase in the incidence of lung cancer in both sexes was observed over the 25-year period. Joinpoint regression analysis showed that while the incidence continued to increase steadily in females during the entire period, a sharp acceleration was observed in males starting in 2005. The full APC model was selected to describe age, period, and birth cohort effects on lung cancer incidence trends in Sihui. The age cohorts in both sexes showed a continuously significant increase in the relative risk (RR) of lung cancer, with a peak in the eldest age group (80-84 years). The RR of lung cancer showed a fluctuating curve in both sexes. The birth cohorts identified an increased trend in both males and females; however, males had a plateau in the youngest cohorts who were born during 1955-1969. Increasing trends of the incidence of lung cancer in Sihui were dominated by the effects of age and birth cohorts. Social aging, smoking, and environmental changes may play important roles in such trends.

  16. An improved multiple linear regression and data analysis computer program package

    NASA Technical Reports Server (NTRS)

    Sidik, S. M.

    1972-01-01

    NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.

  17. Increasing Rates of Brain Tumours in the Swedish National Inpatient Register and the Causes of Death Register

    PubMed Central

    Hardell, Lennart; Carlberg, Michael

    2015-01-01

    Radiofrequency emissions in the frequency range 30 kHz–300 GHz were evaluated to be Group 2B, i.e., “possibly”, carcinogenic to humans by the International Agency for Research on Cancer (IARC) at WHO in May 2011. The Swedish Cancer Register has not shown increasing incidence of brain tumours in recent years and has been used to dismiss epidemiological evidence on a risk. In this study we used the Swedish National Inpatient Register (IPR) and Causes of Death Register (CDR) to further study the incidence comparing with the Cancer Register data for the time period 1998–2013 using joinpoint regression analysis. In the IPR we found a joinpoint in 2007 with Annual Percentage Change (APC) +4.25%, 95% CI +1.98, +6.57% during 2007–2013 for tumours of unknown type in the brain or CNS. In the CDR joinpoint regression found one joinpoint in 2008 with APC during 2008–2013 +22.60%, 95% CI +9.68, +37.03%. These tumour diagnoses would be based on clinical examination, mainly CT and/or MRI, but without histopathology or cytology. No statistically significant increasing incidence was found in the Swedish Cancer Register during these years. We postulate that a large part of brain tumours of unknown type are never reported to the Cancer Register. Furthermore, the frequency of diagnosis based on autopsy has declined substantially due to a general decline of autopsies in Sweden adding further to missing cases. We conclude that the Swedish Cancer Register is not reliable to be used to dismiss results in epidemiological studies on the use of wireless phones and brain tumour risk. PMID:25854296

  18. Increasing rates of brain tumours in the Swedish national inpatient register and the causes of death register.

    PubMed

    Hardell, Lennart; Carlberg, Michael

    2015-04-03

    Radiofrequency emissions in the frequency range 30 kHz-300 GHz were evaluated to be Group 2B, i.e., "possibly", carcinogenic to humans by the International Agency for Research on Cancer (IARC) at WHO in May 2011. The Swedish Cancer Register has not shown increasing incidence of brain tumours in recent years and has been used to dismiss epidemiological evidence on a risk. In this study we used the Swedish National Inpatient Register (IPR) and Causes of Death Register (CDR) to further study the incidence comparing with the Cancer Register data for the time period 1998-2013 using joinpoint regression analysis. In the IPR we found a joinpoint in 2007 with Annual Percentage Change (APC) +4.25%, 95% CI +1.98, +6.57% during 2007-2013 for tumours of unknown type in the brain or CNS. In the CDR joinpoint regression found one joinpoint in 2008 with APC during 2008-2013 +22.60%, 95% CI +9.68, +37.03%. These tumour diagnoses would be based on clinical examination, mainly CT and/or MRI, but without histopathology or cytology. No statistically significant increasing incidence was found in the Swedish Cancer Register during these years. We postulate that a large part of brain tumours of unknown type are never reported to the Cancer Register. Furthermore, the frequency of diagnosis based on autopsy has declined substantially due to a general decline of autopsies in Sweden adding further to missing cases. We conclude that the Swedish Cancer Register is not reliable to be used to dismiss results in epidemiological studies on the use of wireless phones and brain tumour risk.

  19. Regression Commonality Analysis: A Technique for Quantitative Theory Building

    ERIC Educational Resources Information Center

    Nimon, Kim; Reio, Thomas G., Jr.

    2011-01-01

    When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…

  20. Regression analysis for solving diagnosis problem of children's health

    NASA Astrophysics Data System (ADS)

    Cherkashina, Yu A.; Gerget, O. M.

    2016-04-01

    The paper includes results of scientific researches. These researches are devoted to the application of statistical techniques, namely, regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, parameters of blood tests, the gestational age, vascular-endothelial growth factor) measured at 3-5 days of children's life. In this paper a detailed description of the studied medical data is given. A binary logistic regression procedure is discussed in the paper. Basic results of the research are presented. A classification table of predicted values and factual observed values is shown, the overall percentage of correct recognition is determined. Regression equation coefficients are calculated, the general regression equation is written based on them. Based on the results of logistic regression, ROC analysis was performed, sensitivity and specificity of the model are calculated and ROC curves are constructed. These mathematical techniques allow carrying out diagnostics of health of children providing a high quality of recognition. The results make a significant contribution to the development of evidence-based medicine and have a high practical importance in the professional activity of the author.

  1. Breast Cancer Trend in Iran from 2000 to 2009 and Prediction till 2020 using a Trend Analysis Method.

    PubMed

    Zahmatkesh, Bibihajar; Keramat, Afsaneh; Alavi, Nasrinossadat; Khosravi, Ahmad; Kousha, Ahmad; Motlagh, Ali Ghanbari; Darman, Mahboobeh; Partovipour, Elham; Chaman, Reza

    2016-01-01

    Breast cancer is the most common cancer in women worldwide with a rising incidence rate in most countries. Considering the increase in life expectancy and change in lifestyle of Iranian women, this study investigated the age-adjusted trend of breast cancer incidence during 2000-2009 and predicted its incidence to 2020. The 1997 and 2006 census results were used for the projection of female population by age through the cohort-component method over the studied years. Data from the Iranian cancer registration system were used to calculate the annual incidence rate of breast cancer. The age-adjusted incidence rate was then calculated using the WHO standard population distribution. The five-year-age-specific incidence rates were also obtained for each year and future incidence was determined using the trend analysis method. Annual percentage change (APC) was calculated through the joinpoint regression method. The bias adjusted incidence rate of breast cancer increased from 16.7 per 100,000 women in 2000 to 33.6 per 100,000 women in 2009. The incidence of breast cancer had a growing trend in almost all age groups above 30 years over the studied years. In this period, the age groups of 45-65 years had the highest incidence. Investigation into the joinpoint curve showed that the curve had a steep slope with an APC of 23.4% before the first joinpoint, but became milder after this. From 2005 to 2009, the APC was calculated as 2.7%, through which the incidence of breast cancer in 2020 was predicted as 63.0 per 100,000 women. The age-adjusted incidence rate of breast cancer continues to increas in Iranian women. It is predicted that this trend will continue until 2020. Therefore, it seems necessary to prioritize the prevention, control and care for breast cancer in Iran.

  2. A framework for longitudinal data analysis via shape regression

    NASA Astrophysics Data System (ADS)

    Fishbaugh, James; Durrleman, Stanley; Piven, Joseph; Gerig, Guido

    2012-02-01

    Traditional longitudinal analysis begins by extracting desired clinical measurements, such as volume or head circumference, from discrete imaging data. Typically, the continuous evolution of a scalar measurement is estimated by choosing a 1D regression model, such as kernel regression or fitting a polynomial of fixed degree. This type of analysis not only leads to separate models for each measurement, but there is no clear anatomical or biological interpretation to aid in the selection of the appropriate paradigm. In this paper, we propose a consistent framework for the analysis of longitudinal data by estimating the continuous evolution of shape over time as twice differentiable flows of deformations. In contrast to 1D regression models, one model is chosen to realistically capture the growth of anatomical structures. From the continuous evolution of shape, we can simply extract any clinical measurements of interest. We demonstrate on real anatomical surfaces that volume extracted from a continuous shape evolution is consistent with a 1D regression performed on the discrete measurements. We further show how the visualization of shape progression can aid in the search for significant measurements. Finally, we present an example on a shape complex of the brain (left hemisphere, right hemisphere, cerebellum) that demonstrates a potential clinical application for our framework.

  3. The Precision Efficacy Analysis for Regression Sample Size Method.

    ERIC Educational Resources Information Center

    Brooks, Gordon P.; Barcikowski, Robert S.

    The general purpose of this study was to examine the efficiency of the Precision Efficacy Analysis for Regression (PEAR) method for choosing appropriate sample sizes in regression studies used for precision. The PEAR method, which is based on the algebraic manipulation of an accepted cross-validity formula, essentially uses an effect size to…

  4. Background stratified Poisson regression analysis of cohort data.

    PubMed

    Richardson, David B; Langholz, Bryan

    2012-03-01

    Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models.

  5. Exact Analysis of Squared Cross-Validity Coefficient in Predictive Regression Models

    ERIC Educational Resources Information Center

    Shieh, Gwowen

    2009-01-01

    In regression analysis, the notion of population validity is of theoretical interest for describing the usefulness of the underlying regression model, whereas the presumably more important concept of population cross-validity represents the predictive effectiveness for the regression equation in future research. It appears that the inference…

  6. A Quality Assessment Tool for Non-Specialist Users of Regression Analysis

    ERIC Educational Resources Information Center

    Argyrous, George

    2015-01-01

    This paper illustrates the use of a quality assessment tool for regression analysis. It is designed for non-specialist "consumers" of evidence, such as policy makers. The tool provides a series of questions such consumers of evidence can ask to interrogate regression analysis, and is illustrated with reference to a recent study published…

  7. Increasing thyroid cancer incidence in Lithuania in 1978-2003.

    PubMed

    Smailyte, Giedre; Miseikyte-Kaubriene, Edita; Kurtinaitis, Juozas

    2006-12-11

    The aim of this paper is to analyze changes in thyroid cancer incidence trends in Lithuania during the period 1978-2003 using joinpoint regression models, with special attention to the period 1993-2003. The study was based on all cases of thyroid cancer reported to the Lithuanian Cancer Registry between 1978 and 2003. Age group-specific rates and standardized rates were calculated for each gender, using the direct method (world standard population). The joinpoint regression model was used to provide estimated annual percentage change and to detect points in time where significant changes in the trends occur. During the study period the age-standardized incidence rates increased in males from 0.7 to 2.5 cases per 100,000 and in females from 1.5 to 11.4 per 100,000. Annual percentage changes during this period in the age-standardized rates were 4.6% and 7.1% for males and females, respectively. Joinpoint analysis showed two time periods with joinpoint in the year 2000. A change in the trend occurred in which a significant increase changed to a dramatic increase in thyroid cancer incidence rates. Papillary carcinoma and stage I thyroid cancer increases over this period were mainly responsible for the pattern of changes in trend in recent years. A moderate increase in thyroid cancer incidence has been observed in Lithuania between the years 1978 and 2000. An accelerated increase in thyroid cancer incidence rates took place in the period 2000-2003. It seems that the increase in thyroid cancer incidence can be attributed mainly to the changes in the management of non palpable thyroid nodules with growing applications of ultrasound-guided fine needle aspiration biopsy in clinical practice.

  8. Understanding logistic regression analysis.

    PubMed

    Sperandei, Sandro

    2014-01-01

    Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using examples to make it as simple as possible. After definition of the technique, the basic interpretation of the results is highlighted and then some special issues are discussed.

  9. Significant social events and increasing use of life-sustaining treatment: trend analysis using extracorporeal membrane oxygenation as an example.

    PubMed

    Chen, Yen-Yuan; Chen, Likwang; Huang, Tien-Shang; Ko, Wen-Je; Chu, Tzong-Shinn; Ni, Yen-Hsuan; Chang, Shan-Chwen

    2014-03-04

    Most studies have examined the outcomes of patients supported by extracorporeal membrane oxygenation as a life-sustaining treatment. It is unclear whether significant social events are associated with the use of life-sustaining treatment. This study aimed to compare the trend of extracorporeal membrane oxygenation use in Taiwan with that in the world, and to examine the influence of significant social events on the trend of extracorporeal membrane oxygenation use in Taiwan. Taiwan's extracorporeal membrane oxygenation uses from 2000 to 2009 were collected from National Health Insurance Research Dataset. The number of the worldwide extracorporeal membrane oxygenation cases was mainly estimated using Extracorporeal Life Support Registry Report International Summary July 2012. The trend of Taiwan's crude annual incidence rate of extracorporeal membrane oxygenation use was compared with that of the rest of the world. Each trend of extracorporeal membrane oxygenation use was examined using joinpoint regression. The measurement was the crude annual incidence rate of extracorporeal membrane oxygenation use. Each of the Taiwan's crude annual incidence rates was much higher than the worldwide one in the same year. Both the trends of Taiwan's and worldwide crude annual incidence rates have significantly increased since 2000. Joinpoint regression selected the model of the Taiwan's trend with one joinpoint in 2006 as the best-fitted model, implying that the significant social events in 2006 were significantly associated with the trend change of extracorporeal membrane oxygenation use following 2006. In addition, significantly social events highlighted by the media are more likely to be associated with the increase of extracorporeal membrane oxygenation use than being fully covered by National Health Insurance. Significant social events, such as a well-known person's successful extracorporeal membrane oxygenation use highlighted by the mass media, are associated with the use of

  10. Classification and regression tree analysis vs. multivariable linear and logistic regression methods as statistical tools for studying haemophilia.

    PubMed

    Henrard, S; Speybroeck, N; Hermans, C

    2015-11-01

    Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two examples using CART analysis and previously published in this field are didactically explained in details. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.

  11. A study of home deaths in Japan from 1951 to 2002

    PubMed Central

    Yang, Limin; Sakamoto, Naoko; Marui, Eiji

    2006-01-01

    Background Several surveys in Japan have indicated that most terminally ill Japanese patients would prefer to die at home or in a homelike setting. However, there is a great disparity between this stated preference and the reality, since most Japanese die in hospital. We report here national changes in home deaths in Japan over the last 5 decades. Using prefecture data, we also examined the factors in the medical service associated with home death in Japan. Methods Published data on place of death was obtained from the vital statistics compiled by the Ministry of Health, Labor and Welfare of Japan. We analyzed trends of home deaths from 1951 to 2002, and describe the changes in the proportion of home deaths by region, sex, age, and cause of death. Joinpoint regression analysis was used for trend analysis. Logistic regression analysis was performed to identify secular trends in home deaths, and the impact of age, sex, year of deaths and cause of deaths on home death. We also examined the association between home death and medical service factors by multiple regression analysis, using home death rate by prefectures in 2002 as a dependent variable. Results A significant decrease in the percentage of patients dying at home was observed in the results of joinpoint regression analysis. Older patients and males were more likely to die at home. Patients who died from cancer were less likely to die at home. The results of multiple regression analysis indicated that home death was related to the number of beds in hospital, ratio of daily occupied beds in general hospital, the number of families in which the elderly were living alone, and dwelling rooms. Conclusion The pattern of the place of death has not only been determined by social and demographic characteristics of the decedent, but also associated with the medical service in the community. PMID:16524485

  12. Retro-regression--another important multivariate regression improvement.

    PubMed

    Randić, M

    2001-01-01

    We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.

  13. Analysis of Sting Balance Calibration Data Using Optimized Regression Models

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert; Bader, Jon B.

    2009-01-01

    Calibration data of a wind tunnel sting balance was processed using a search algorithm that identifies an optimized regression model for the data analysis. The selected sting balance had two moment gages that were mounted forward and aft of the balance moment center. The difference and the sum of the two gage outputs were fitted in the least squares sense using the normal force and the pitching moment at the balance moment center as independent variables. The regression model search algorithm predicted that the difference of the gage outputs should be modeled using the intercept and the normal force. The sum of the two gage outputs, on the other hand, should be modeled using the intercept, the pitching moment, and the square of the pitching moment. Equations of the deflection of a cantilever beam are used to show that the search algorithm s two recommended math models can also be obtained after performing a rigorous theoretical analysis of the deflection of the sting balance under load. The analysis of the sting balance calibration data set is a rare example of a situation when regression models of balance calibration data can directly be derived from first principles of physics and engineering. In addition, it is interesting to see that the search algorithm recommended the same regression models for the data analysis using only a set of statistical quality metrics.

  14. Linear regression analysis: part 14 of a series on evaluation of scientific publications.

    PubMed

    Schneider, Astrid; Hommel, Gerhard; Blettner, Maria

    2010-11-01

    Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.

  15. Poisson Regression Analysis of Illness and Injury Surveillance Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Frome E.L., Watkins J.P., Ellis E.D.

    2012-12-12

    The Department of Energy (DOE) uses illness and injury surveillance to monitor morbidity and assess the overall health of the work force. Data collected from each participating site include health events and a roster file with demographic information. The source data files are maintained in a relational data base, and are used to obtain stratified tables of health event counts and person time at risk that serve as the starting point for Poisson regression analysis. The explanatory variables that define these tables are age, gender, occupational group, and time. Typical response variables of interest are the number of absences duemore » to illness or injury, i.e., the response variable is a count. Poisson regression methods are used to describe the effect of the explanatory variables on the health event rates using a log-linear main effects model. Results of fitting the main effects model are summarized in a tabular and graphical form and interpretation of model parameters is provided. An analysis of deviance table is used to evaluate the importance of each of the explanatory variables on the event rate of interest and to determine if interaction terms should be considered in the analysis. Although Poisson regression methods are widely used in the analysis of count data, there are situations in which over-dispersion occurs. This could be due to lack-of-fit of the regression model, extra-Poisson variation, or both. A score test statistic and regression diagnostics are used to identify over-dispersion. A quasi-likelihood method of moments procedure is used to evaluate and adjust for extra-Poisson variation when necessary. Two examples are presented using respiratory disease absence rates at two DOE sites to illustrate the methods and interpretation of the results. In the first example the Poisson main effects model is adequate. In the second example the score test indicates considerable over-dispersion and a more detailed analysis attributes the over-dispersion to

  16. Analysis of Sting Balance Calibration Data Using Optimized Regression Models

    NASA Technical Reports Server (NTRS)

    Ulbrich, N.; Bader, Jon B.

    2010-01-01

    Calibration data of a wind tunnel sting balance was processed using a candidate math model search algorithm that recommends an optimized regression model for the data analysis. During the calibration the normal force and the moment at the balance moment center were selected as independent calibration variables. The sting balance itself had two moment gages. Therefore, after analyzing the connection between calibration loads and gage outputs, it was decided to choose the difference and the sum of the gage outputs as the two responses that best describe the behavior of the balance. The math model search algorithm was applied to these two responses. An optimized regression model was obtained for each response. Classical strain gage balance load transformations and the equations of the deflection of a cantilever beam under load are used to show that the search algorithm s two optimized regression models are supported by a theoretical analysis of the relationship between the applied calibration loads and the measured gage outputs. The analysis of the sting balance calibration data set is a rare example of a situation when terms of a regression model of a balance can directly be derived from first principles of physics. In addition, it is interesting to note that the search algorithm recommended the correct regression model term combinations using only a set of statistical quality metrics that were applied to the experimental data during the algorithm s term selection process.

  17. Quality of life in breast cancer patients--a quantile regression analysis.

    PubMed

    Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma

    2008-01-01

    Quality of life study has an important role in health care especially in chronic diseases, in clinical judgment and in medical resources supplying. Statistical tools like linear regression are widely used to assess the predictors of quality of life. But when the response is not normal the results are misleading. The aim of this study is to determine the predictors of quality of life in breast cancer patients, using quantile regression model and compare to linear regression. A cross-sectional study conducted on 119 breast cancer patients that admitted and treated in chemotherapy ward of Namazi hospital in Shiraz. We used QLQ-C30 questionnaire to assessment quality of life in these patients. A quantile regression was employed to assess the assocciated factors and the results were compared to linear regression. All analysis carried out using SAS. The mean score for the global health status for breast cancer patients was 64.92+/-11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In spite of linear regression, financial difficulties were not significant in quantile regression analysis and dyspnea was only significant for first quartile. Also emotion functioning and duration of disease statistically predicted the QOL score in the third quartile. The results have demonstrated that using quantile regression leads to better interpretation and richer inference about predictors of the breast cancer patient quality of life.

  18. The impact of OSHA recordkeeping regulation changes on occupational injury and illness trends in the US: a time-series analysis.

    PubMed

    Friedman, Lee S; Forst, Linda

    2007-07-01

    The Survey of Occupational Injuries and Illnesses (SOII), based on Occupational Safety and Health Administration (OSHA) logs, indicates that the number of occupational injuries and illnesses in the US has steadily declined by 35.8% between 1992-2003. However, major changes to the OSHA recordkeeping standard occurred in 1995 and 2001. The authors assessed the relation between changes in OSHA recordkeeping regulations and the trend in occupational injuries and illnesses. SOII data available from the Bureau of Labor Statistics for years 1992-2003 were collected. The authors assessed time series data using join-point regression models. Before the first major recordkeeping change in 1995, injuries and illnesses declined annually by 0.5%. In the period 1995-2000 the slope declined by 3.1% annually (95% CI -3.7% to -2.5%), followed by another more precipitous decline occurring in 2001-2003 (-8.3%; 95% CI -10.0% to -6.6%). When stratifying the data, the authors continued to observe significant changes occurring in 1995 and 2001. The substantial declines in the number of injuries and illnesses correspond directly with changes in OSHA recordkeeping rules. Changes in employment, productivity, OSHA enforcement activity and sampling error do not explain the large decline. Based on the baseline slope (join-point regression analysis, 1992-4), the authors expected a decline of 407 964 injuries and illnesses during the period of follow-up if no intervention occurred; they actually observed a decline of 2.4 million injuries and illnesses of which 2 million or 83% of the decline can be attributed to the change in the OSHA recordkeeping rules.

  19. Visual grading characteristics and ordinal regression analysis during optimisation of CT head examinations.

    PubMed

    Zarb, Francis; McEntee, Mark F; Rainford, Louise

    2015-06-01

    To evaluate visual grading characteristics (VGC) and ordinal regression analysis during head CT optimisation as a potential alternative to visual grading assessment (VGA), traditionally employed to score anatomical visualisation. Patient images (n = 66) were obtained using current and optimised imaging protocols from two CT suites: a 16-slice scanner at the national Maltese centre for trauma and a 64-slice scanner in a private centre. Local resident radiologists (n = 6) performed VGA followed by VGC and ordinal regression analysis. VGC alone indicated that optimised protocols had similar image quality as current protocols. Ordinal logistic regression analysis provided an in-depth evaluation, criterion by criterion allowing the selective implementation of the protocols. The local radiology review panel supported the implementation of optimised protocols for brain CT examinations (including trauma) in one centre, achieving radiation dose reductions ranging from 24 % to 36 %. In the second centre a 29 % reduction in radiation dose was achieved for follow-up cases. The combined use of VGC and ordinal logistic regression analysis led to clinical decisions being taken on the implementation of the optimised protocols. This improved method of image quality analysis provided the evidence to support imaging protocol optimisation, resulting in significant radiation dose savings. • There is need for scientifically based image quality evaluation during CT optimisation. • VGC and ordinal regression analysis in combination led to better informed clinical decisions. • VGC and ordinal regression analysis led to dose reductions without compromising diagnostic efficacy.

  20. Comparison of cranial sex determination by discriminant analysis and logistic regression.

    PubMed

    Amores-Ampuero, Anabel; Alemán, Inmaculada

    2016-04-05

    Various methods have been proposed for estimating dimorphism. The objective of this study was to compare sex determination results from cranial measurements using discriminant analysis or logistic regression. The study sample comprised 130 individuals (70 males) of known sex, age, and cause of death from San José cemetery in Granada (Spain). Measurements of 19 neurocranial dimensions and 11 splanchnocranial dimensions were subjected to discriminant analysis and logistic regression, and the percentages of correct classification were compared between the sex functions obtained with each method. The discriminant capacity of the selected variables was evaluated with a cross-validation procedure. The percentage accuracy with discriminant analysis was 78.2% for the neurocranium (82.4% in females and 74.6% in males) and 73.7% for the splanchnocranium (79.6% in females and 68.8% in males). These percentages were higher with logistic regression analysis: 85.7% for the neurocranium (in both sexes) and 94.1% for the splanchnocranium (100% in females and 91.7% in males).

  1. Regression Analysis: Instructional Resource for Cost/Managerial Accounting

    ERIC Educational Resources Information Center

    Stout, David E.

    2015-01-01

    This paper describes a classroom-tested instructional resource, grounded in principles of active learning and a constructivism, that embraces two primary objectives: "demystify" for accounting students technical material from statistics regarding ordinary least-squares (OLS) regression analysis--material that students may find obscure or…

  2. Using Robust Standard Errors to Combine Multiple Regression Estimates with Meta-Analysis

    ERIC Educational Resources Information Center

    Williams, Ryan T.

    2012-01-01

    Combining multiple regression estimates with meta-analysis has continued to be a difficult task. A variety of methods have been proposed and used to combine multiple regression slope estimates with meta-analysis, however, most of these methods have serious methodological and practical limitations. The purpose of this study was to explore the use…

  3. Robust Mediation Analysis Based on Median Regression

    PubMed Central

    Yuan, Ying; MacKinnon, David P.

    2014-01-01

    Mediation analysis has many applications in psychology and the social sciences. The most prevalent methods typically assume that the error distribution is normal and homoscedastic. However, this assumption may rarely be met in practice, which can affect the validity of the mediation analysis. To address this problem, we propose robust mediation analysis based on median regression. Our approach is robust to various departures from the assumption of homoscedasticity and normality, including heavy-tailed, skewed, contaminated, and heteroscedastic distributions. Simulation studies show that under these circumstances, the proposed method is more efficient and powerful than standard mediation analysis. We further extend the proposed robust method to multilevel mediation analysis, and demonstrate through simulation studies that the new approach outperforms the standard multilevel mediation analysis. We illustrate the proposed method using data from a program designed to increase reemployment and enhance mental health of job seekers. PMID:24079925

  4. Principal regression analysis and the index leverage effect

    NASA Astrophysics Data System (ADS)

    Reigneron, Pierre-Alain; Allez, Romain; Bouchaud, Jean-Philippe

    2011-09-01

    We revisit the index leverage effect, that can be decomposed into a volatility effect and a correlation effect. We investigate the latter using a matrix regression analysis, that we call ‘Principal Regression Analysis' (PRA) and for which we provide some analytical (using Random Matrix Theory) and numerical benchmarks. We find that downward index trends increase the average correlation between stocks (as measured by the most negative eigenvalue of the conditional correlation matrix), and makes the market mode more uniform. Upward trends, on the other hand, also increase the average correlation between stocks but rotates the corresponding market mode away from uniformity. There are two time scales associated to these effects, a short one on the order of a month (20 trading days), and a longer time scale on the order of a year. We also find indications of a leverage effect for sectorial correlations as well, which reveals itself in the second and third mode of the PRA.

  5. Impact of the global financial crisis on low birth weight in Portugal: a time-trend analysis.

    PubMed

    Kana, Musa Abubakar; Correia, Sofia; Peleteiro, Barbara; Severo, Milton; Barros, Henrique

    2017-01-01

    The 2007-2008 global financial crisis had adverse consequences on population health of affected European countries. Few contemporary studies have studied its effect on perinatal indicators with long-lasting influence on adult health. Therefore, in this study, we investigated the impact of the 2007-2008 global financial crisis on low birth weight (LBW) in Portugal. Data on 2 045 155 singleton births of 1995-2014 were obtained from Statistics Portugal. Joinpoint regression analysis was performed to identify the years in which changes in LBW trends occurred, and to estimate the annual per cent changes (APC). LBW risk by time period expressed as prevalence ratios were computed using the Poisson regression. Contextual changes in sociodemographic and economic factors were provided by their trends. The joinpoint analysis identified 3 distinct periods (2 jointpoints) with different APC in LBW, corresponding to 1995-1999 (APC=4.4; 95% CI 3.2 to 5.6), 2000-2006 (APC=0.1; 95% CI -050 to 0.7) and 2007-2014 (APC=1.6; 95% CI 1.2 to 2.0). For non-Portuguese, it was, respectively, 1995-1999 (APC=1.4; 95% CI -3.9 to 7.0%), 2000-2007 (APC=-4.2; 95% CI -6.4 to -2.0) and 2008-2014 (APC=3.1; 95% CI 0.8 to 5.5). Compared with 1995-1999, all specific maternal characteristics had a 10-15% increase in LBW risk in 2000-2006 and a 20-25% increase in 2007-2014, except among migrants, for which LBW risk remained lower than in 1995-1999 but increased after the crisis. The increasing LBW risk coincides with a deceleration in gross domestic product growth rate, reduction in health expenditure, social protection allocation on family/children support and sickness. The 2007-2008 global financial crisis was associated with a significant increase in LBW, particularly among infants of non-Portuguese mothers. We recommend strengthening social policies aimed at maternity protection for vulnerable mothers and health system maintenance of social equity in perinatal healthcare.

  6. Multiplication factor versus regression analysis in stature estimation from hand and foot dimensions.

    PubMed

    Krishan, Kewal; Kanchan, Tanuj; Sharma, Abhilasha

    2012-05-01

    Estimation of stature is an important parameter in identification of human remains in forensic examinations. The present study is aimed to compare the reliability and accuracy of stature estimation and to demonstrate the variability in estimated stature and actual stature using multiplication factor and regression analysis methods. The study is based on a sample of 246 subjects (123 males and 123 females) from North India aged between 17 and 20 years. Four anthropometric measurements; hand length, hand breadth, foot length and foot breadth taken on the left side in each subject were included in the study. Stature was measured using standard anthropometric techniques. Multiplication factors were calculated and linear regression models were derived for estimation of stature from hand and foot dimensions. Derived multiplication factors and regression formula were applied to the hand and foot measurements in the study sample. The estimated stature from the multiplication factors and regression analysis was compared with the actual stature to find the error in estimated stature. The results indicate that the range of error in estimation of stature from regression analysis method is less than that of multiplication factor method thus, confirming that the regression analysis method is better than multiplication factor analysis in stature estimation. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.

  7. Spectral Regression Discriminant Analysis for Hyperspectral Image Classification

    NASA Astrophysics Data System (ADS)

    Pan, Y.; Wu, J.; Huang, H.; Liu, J.

    2012-08-01

    Dimensionality reduction algorithms, which aim to select a small set of efficient and discriminant features, have attracted great attention for Hyperspectral Image Classification. The manifold learning methods are popular for dimensionality reduction, such as Locally Linear Embedding, Isomap, and Laplacian Eigenmap. However, a disadvantage of many manifold learning methods is that their computations usually involve eigen-decomposition of dense matrices which is expensive in both time and memory. In this paper, we introduce a new dimensionality reduction method, called Spectral Regression Discriminant Analysis (SRDA). SRDA casts the problem of learning an embedding function into a regression framework, which avoids eigen-decomposition of dense matrices. Also, with the regression based framework, different kinds of regularizes can be naturally incorporated into our algorithm which makes it more flexible. It can make efficient use of data points to discover the intrinsic discriminant structure in the data. Experimental results on Washington DC Mall and AVIRIS Indian Pines hyperspectral data sets demonstrate the effectiveness of the proposed method.

  8. The Analysis of the Regression-Discontinuity Design in R

    ERIC Educational Resources Information Center

    Thoemmes, Felix; Liao, Wang; Jin, Ze

    2017-01-01

    This article describes the analysis of regression-discontinuity designs (RDDs) using the R packages rdd, rdrobust, and rddtools. We discuss similarities and differences between these packages and provide directions on how to use them effectively. We use real data from the Carolina Abecedarian Project to show how an analysis of an RDD can be…

  9. Regression Analysis with Dummy Variables: Use and Interpretation.

    ERIC Educational Resources Information Center

    Hinkle, Dennis E.; Oliver, J. Dale

    1986-01-01

    Multiple regression analysis (MRA) may be used when both continuous and categorical variables are included as independent research variables. The use of MRA with categorical variables involves dummy coding, that is, assigning zeros and ones to levels of categorical variables. Caution is urged in results interpretation. (Author/CH)

  10. Multilayer perceptron for robust nonlinear interval regression analysis using genetic algorithms.

    PubMed

    Hu, Yi-Chung

    2014-01-01

    On the basis of fuzzy regression, computational models in intelligence such as neural networks have the capability to be applied to nonlinear interval regression analysis for dealing with uncertain and imprecise data. When training data are not contaminated by outliers, computational models perform well by including almost all given training data in the data interval. Nevertheless, since training data are often corrupted by outliers, robust learning algorithms employed to resist outliers for interval regression analysis have been an interesting area of research. Several approaches involving computational intelligence are effective for resisting outliers, but the required parameters for these approaches are related to whether the collected data contain outliers or not. Since it seems difficult to prespecify the degree of contamination beforehand, this paper uses multilayer perceptron to construct the robust nonlinear interval regression model using the genetic algorithm. Outliers beyond or beneath the data interval will impose slight effect on the determination of data interval. Simulation results demonstrate that the proposed method performs well for contaminated datasets.

  11. Regression analysis for LED color detection of visual-MIMO system

    NASA Astrophysics Data System (ADS)

    Banik, Partha Pratim; Saha, Rappy; Kim, Ki-Doo

    2018-04-01

    Color detection from a light emitting diode (LED) array using a smartphone camera is very difficult in a visual multiple-input multiple-output (visual-MIMO) system. In this paper, we propose a method to determine the LED color using a smartphone camera by applying regression analysis. We employ a multivariate regression model to identify the LED color. After taking a picture of an LED array, we select the LED array region, and detect the LED using an image processing algorithm. We then apply the k-means clustering algorithm to determine the number of potential colors for feature extraction of each LED. Finally, we apply the multivariate regression model to predict the color of the transmitted LEDs. In this paper, we show our results for three types of environmental light condition: room environmental light, low environmental light (560 lux), and strong environmental light (2450 lux). We compare the results of our proposed algorithm from the analysis of training and test R-Square (%) values, percentage of closeness of transmitted and predicted colors, and we also mention about the number of distorted test data points from the analysis of distortion bar graph in CIE1931 color space.

  12. A primer for biomedical scientists on how to execute model II linear regression analysis.

    PubMed

    Ludbrook, John

    2012-04-01

    1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.

  13. External Tank Liquid Hydrogen (LH2) Prepress Regression Analysis Independent Review Technical Consultation Report

    NASA Technical Reports Server (NTRS)

    Parsons, Vickie s.

    2009-01-01

    The request to conduct an independent review of regression models, developed for determining the expected Launch Commit Criteria (LCC) External Tank (ET)-04 cycle count for the Space Shuttle ET tanking process, was submitted to the NASA Engineering and Safety Center NESC on September 20, 2005. The NESC team performed an independent review of regression models documented in Prepress Regression Analysis, Tom Clark and Angela Krenn, 10/27/05. This consultation consisted of a peer review by statistical experts of the proposed regression models provided in the Prepress Regression Analysis. This document is the consultation's final report.

  14. Tutorial on Biostatistics: Linear Regression Analysis of Continuous Correlated Eye Data.

    PubMed

    Ying, Gui-Shuang; Maguire, Maureen G; Glynn, Robert; Rosner, Bernard

    2017-04-01

    To describe and demonstrate appropriate linear regression methods for analyzing correlated continuous eye data. We describe several approaches to regression analysis involving both eyes, including mixed effects and marginal models under various covariance structures to account for inter-eye correlation. We demonstrate, with SAS statistical software, applications in a study comparing baseline refractive error between one eye with choroidal neovascularization (CNV) and the unaffected fellow eye, and in a study determining factors associated with visual field in the elderly. When refractive error from both eyes were analyzed with standard linear regression without accounting for inter-eye correlation (adjusting for demographic and ocular covariates), the difference between eyes with CNV and fellow eyes was 0.15 diopters (D; 95% confidence interval, CI -0.03 to 0.32D, p = 0.10). Using a mixed effects model or a marginal model, the estimated difference was the same but with narrower 95% CI (0.01 to 0.28D, p = 0.03). Standard regression for visual field data from both eyes provided biased estimates of standard error (generally underestimated) and smaller p-values, while analysis of the worse eye provided larger p-values than mixed effects models and marginal models. In research involving both eyes, ignoring inter-eye correlation can lead to invalid inferences. Analysis using only right or left eyes is valid, but decreases power. Worse-eye analysis can provide less power and biased estimates of effect. Mixed effects or marginal models using the eye as the unit of analysis should be used to appropriately account for inter-eye correlation and maximize power and precision.

  15. Linear regression analysis of survival data with missing censoring indicators.

    PubMed

    Wang, Qihua; Dinse, Gregg E

    2011-04-01

    Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial.

  16. Non-stationary hydrologic frequency analysis using B-spline quantile regression

    NASA Astrophysics Data System (ADS)

    Nasri, B.; Bouezmarni, T.; St-Hilaire, A.; Ouarda, T. B. M. J.

    2017-11-01

    Hydrologic frequency analysis is commonly used by engineers and hydrologists to provide the basic information on planning, design and management of hydraulic and water resources systems under the assumption of stationarity. However, with increasing evidence of climate change, it is possible that the assumption of stationarity, which is prerequisite for traditional frequency analysis and hence, the results of conventional analysis would become questionable. In this study, we consider a framework for frequency analysis of extremes based on B-Spline quantile regression which allows to model data in the presence of non-stationarity and/or dependence on covariates with linear and non-linear dependence. A Markov Chain Monte Carlo (MCMC) algorithm was used to estimate quantiles and their posterior distributions. A coefficient of determination and Bayesian information criterion (BIC) for quantile regression are used in order to select the best model, i.e. for each quantile, we choose the degree and number of knots of the adequate B-spline quantile regression model. The method is applied to annual maximum and minimum streamflow records in Ontario, Canada. Climate indices are considered to describe the non-stationarity in the variable of interest and to estimate the quantiles in this case. The results show large differences between the non-stationary quantiles and their stationary equivalents for an annual maximum and minimum discharge with high annual non-exceedance probabilities.

  17. Multilayer Perceptron for Robust Nonlinear Interval Regression Analysis Using Genetic Algorithms

    PubMed Central

    2014-01-01

    On the basis of fuzzy regression, computational models in intelligence such as neural networks have the capability to be applied to nonlinear interval regression analysis for dealing with uncertain and imprecise data. When training data are not contaminated by outliers, computational models perform well by including almost all given training data in the data interval. Nevertheless, since training data are often corrupted by outliers, robust learning algorithms employed to resist outliers for interval regression analysis have been an interesting area of research. Several approaches involving computational intelligence are effective for resisting outliers, but the required parameters for these approaches are related to whether the collected data contain outliers or not. Since it seems difficult to prespecify the degree of contamination beforehand, this paper uses multilayer perceptron to construct the robust nonlinear interval regression model using the genetic algorithm. Outliers beyond or beneath the data interval will impose slight effect on the determination of data interval. Simulation results demonstrate that the proposed method performs well for contaminated datasets. PMID:25110755

  18. Testing and referral patterns in the years surrounding the US Preventive Services Task Force recommendation against prostate-specific antigen screening.

    PubMed

    Hutchinson, Ryan; Akhtar, Abdulhadi; Haridas, Justin; Bhat, Deepa; Roehrborn, Claus; Lotan, Yair

    2016-12-15

    Since the US Preventive Services Task Force (USPSTF) recommended against prostate-specific antigen (PSA) screening, there have been conflicting reports regarding the impact on the behavior of providers. This study analyzed real-world data on PSA ordering and referral practices in the years surrounding the recommendation. A whole-institution sample of entered PSA orders and urology referrals was obtained from the electronic medical record. The study was performed at a tertiary referral center with a catchment in the southern United States. PSA examinations were defined as screening when they were ordered by providers with appointments in internal medicine, family medicine, or general internal medicine. Linear and quadratic regression analyses were performed, and joinpoint regression was used to assess for trend inflection points. Between January 2010 and July 2015, there were 275,784 unique ambulatory visits for men. There were 63,722 raw PSA orders, and 54,684 were evaluable. Primary care providers ordered 17,315 PSA tests and 858 urology referrals. The number of PSA tests per ambulatory visit, the number of referrals per ambulatory visit, the age at the time of the urology referral, and the proportion of PSA tests performed outside the recommended age range did not significantly change. The PSA value at the time of referral increased significantly (P = .022). Joinpoint analysis revealed no joinpoints in the analysis of total PSA orders, screening PSA tests, or examinations per 100 visits. In the years surrounding the USPSTF recommendation, PSA behavior did not change significantly. Patients were referred at progressively higher average PSA levels. The implications for prostate cancer outcomes from these trends warrant further research into provider variables associated with actual PSA utilization. Cancer 2016;122:3785-3793. © 2016 American Cancer Society. © 2016 American Cancer Society.

  19. Tutorial on Biostatistics: Linear Regression Analysis of Continuous Correlated Eye Data

    PubMed Central

    Ying, Gui-shuang; Maguire, Maureen G; Glynn, Robert; Rosner, Bernard

    2017-01-01

    Purpose To describe and demonstrate appropriate linear regression methods for analyzing correlated continuous eye data. Methods We describe several approaches to regression analysis involving both eyes, including mixed effects and marginal models under various covariance structures to account for inter-eye correlation. We demonstrate, with SAS statistical software, applications in a study comparing baseline refractive error between one eye with choroidal neovascularization (CNV) and the unaffected fellow eye, and in a study determining factors associated with visual field data in the elderly. Results When refractive error from both eyes were analyzed with standard linear regression without accounting for inter-eye correlation (adjusting for demographic and ocular covariates), the difference between eyes with CNV and fellow eyes was 0.15 diopters (D; 95% confidence interval, CI −0.03 to 0.32D, P=0.10). Using a mixed effects model or a marginal model, the estimated difference was the same but with narrower 95% CI (0.01 to 0.28D, P=0.03). Standard regression for visual field data from both eyes provided biased estimates of standard error (generally underestimated) and smaller P-values, while analysis of the worse eye provided larger P-values than mixed effects models and marginal models. Conclusion In research involving both eyes, ignoring inter-eye correlation can lead to invalid inferences. Analysis using only right or left eyes is valid, but decreases power. Worse-eye analysis can provide less power and biased estimates of effect. Mixed effects or marginal models using the eye as the unit of analysis should be used to appropriately account for inter-eye correlation and maximize power and precision. PMID:28102741

  20. REGRESSION ANALYSIS OF SEA-SURFACE-TEMPERATURE PATTERNS FOR THE NORTH PACIFIC OCEAN.

    DTIC Science & Technology

    SEA WATER, *SURFACE TEMPERATURE, *OCEANOGRAPHIC DATA, PACIFIC OCEAN, REGRESSION ANALYSIS , STATISTICAL ANALYSIS, UNDERWATER EQUIPMENT, DETECTION, UNDERWATER COMMUNICATIONS, DISTRIBUTION, THERMAL PROPERTIES, COMPUTERS.

  1. A Note on the Relationship between the Number of Indicators and Their Reliability in Detecting Regression Coefficients in Latent Regression Analysis

    ERIC Educational Resources Information Center

    Dolan, Conor V.; Wicherts, Jelte M.; Molenaar, Peter C. M.

    2004-01-01

    We consider the question of how variation in the number and reliability of indicators affects the power to reject the hypothesis that the regression coefficients are zero in latent linear regression analysis. We show that power remains constant as long as the coefficient of determination remains unchanged. Any increase in the number of indicators…

  2. Multivariate Linear Regression and CART Regression Analysis of TBM Performance at Abu Hamour Phase-I Tunnel

    NASA Astrophysics Data System (ADS)

    Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.

    2017-12-01

    The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by Data Acquisition and Evaluation System. The authors coupled collected TBM drive data with available information on rock mass properties, cleansed, completed with secondary variables and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were performed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and responding variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.

  3. CatReg Software for Categorical Regression Analysis (May 2016)

    EPA Science Inventory

    CatReg 3.0 is a Microsoft Windows enhanced version of the Agency’s categorical regression analysis (CatReg) program. CatReg complements EPA’s existing Benchmark Dose Software (BMDS) by greatly enhancing a risk assessor’s ability to determine whether data from separate toxicologic...

  4. Analysis of regression methods for solar activity forecasting

    NASA Technical Reports Server (NTRS)

    Lundquist, C. A.; Vaughan, W. W.

    1979-01-01

    The paper deals with the potential use of the most recent solar data to project trends in the next few years. Assuming that a mode of solar influence on weather can be identified, advantageous use of that knowledge presumably depends on estimating future solar activity. A frequently used technique for solar cycle predictions is a linear regression procedure along the lines formulated by McNish and Lincoln (1949). The paper presents a sensitivity analysis of the behavior of such regression methods relative to the following aspects: cycle minimum, time into cycle, composition of historical data base, and unnormalized vs. normalized solar cycle data. Comparative solar cycle forecasts for several past cycles are presented as to these aspects of the input data. Implications for the current cycle, No. 21, are also given.

  5. Composite marginal quantile regression analysis for longitudinal adolescent body mass index data.

    PubMed

    Yang, Chi-Chuan; Chen, Yi-Hau; Chang, Hsing-Yi

    2017-09-20

    Childhood and adolescenthood overweight or obesity, which may be quantified through the body mass index (BMI), is strongly associated with adult obesity and other health problems. Motivated by the child and adolescent behaviors in long-term evolution (CABLE) study, we are interested in individual, family, and school factors associated with marginal quantiles of longitudinal adolescent BMI values. We propose a new method for composite marginal quantile regression analysis for longitudinal outcome data, which performs marginal quantile regressions at multiple quantile levels simultaneously. The proposed method extends the quantile regression coefficient modeling method introduced by Frumento and Bottai (Biometrics 2016; 72:74-84) to longitudinal data accounting suitably for the correlation structure in longitudinal observations. A goodness-of-fit test for the proposed modeling is also developed. Simulation results show that the proposed method can be much more efficient than the analysis without taking correlation into account and the analysis performing separate quantile regressions at different quantile levels. The application to the longitudinal adolescent BMI data from the CABLE study demonstrates the practical utility of our proposal. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  6. Predicting county-level cancer incidence rates and counts in the United States

    PubMed Central

    Yu, Binbing

    2018-01-01

    Many countries, including the United States, publish predicted numbers of cancer incidence and death in current and future years for the whole country. These predictions provide important information on the cancer burden for cancer control planners, policymakers and the general public. Based on evidence from several empirical studies, the joinpoint (segmented-line linear regression) model has been adopted by the American Cancer Society to estimate the number of new cancer cases in the United States and in individual states since 2007. Recently, cancer incidence in smaller geographic regions such as counties and FIPS code regions is of increasing interest by local policymakers. The natural extension is to directly apply the joinpoint model to county-level cancer incidence data. The direct application has several drawbacks and its performance has not been evaluated. To address the concerns, we developed a spatial random-effects joinpoint model for county-level cancer incidence data. The proposed model was used to predict both cancer incidence rates and counts at the county level. The standard joinpoint model and the proposed method were compared through a validation study. The proposed method out-performed the standard joinpoint model for almost all cancer sites, especially for moderate or rare cancer sites and for counties with small population sizes. As an application, we predicted county-level prostate cancer incidence rates and counts for the year 2011 in Connecticut. PMID:23670947

  7. Regression analysis of informative current status data with the additive hazards model.

    PubMed

    Zhao, Shishun; Hu, Tao; Ma, Ling; Wang, Peijie; Sun, Jianguo

    2015-04-01

    This paper discusses regression analysis of current status failure time data arising from the additive hazards model in the presence of informative censoring. Many methods have been developed for regression analysis of current status data under various regression models if the censoring is noninformative, and also there exists a large literature on parametric analysis of informative current status data in the context of tumorgenicity experiments. In this paper, a semiparametric maximum likelihood estimation procedure is presented and in the method, the copula model is employed to describe the relationship between the failure time of interest and the censoring time. Furthermore, I-splines are used to approximate the nonparametric functions involved and the asymptotic consistency and normality of the proposed estimators are established. A simulation study is conducted and indicates that the proposed approach works well for practical situations. An illustrative example is also provided.

  8. MULGRES: a computer program for stepwise multiple regression analysis

    Treesearch

    A. Jeff Martin

    1971-01-01

    MULGRES is a computer program source deck that is designed for multiple regression analysis employing the technique of stepwise deletion in the search for most significant variables. The features of the program, along with inputs and outputs, are briefly described, with a note on machine compatibility.

  9. Application of artificial neural network to fMRI regression analysis.

    PubMed

    Misaki, Masaya; Miyauchi, Satoru

    2006-01-15

    We used an artificial neural network (ANN) to detect correlations between event sequences and fMRI (functional magnetic resonance imaging) signals. The layered feed-forward neural network, given a series of events as inputs and the fMRI signal as a supervised signal, performed a non-linear regression analysis. This type of ANN is capable of approximating any continuous function, and thus this analysis method can detect any fMRI signals that correlated with corresponding events. Because of the flexible nature of ANNs, fitting to autocorrelation noise is a problem in fMRI analyses. We avoided this problem by using cross-validation and an early stopping procedure. The results showed that the ANN could detect various responses with different time courses. The simulation analysis also indicated an additional advantage of ANN over non-parametric methods in detecting parametrically modulated responses, i.e., it can detect various types of parametric modulations without a priori assumptions. The ANN regression analysis is therefore beneficial for exploratory fMRI analyses in detecting continuous changes in responses modulated by changes in input values.

  10. Quantile regression for the statistical analysis of immunological data with many non-detects.

    PubMed

    Eilers, Paul H C; Röder, Esther; Savelkoul, Huub F J; van Wijk, Roy Gerth

    2012-07-07

    Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Quantile regression, a generalization of percentiles to regression models, models the median or higher percentiles and tolerates very high numbers of non-detects. We present a non-technical introduction and illustrate it with an implementation to real data from a clinical trial. We show that by using quantile regression, groups can be compared and that meaningful linear trends can be computed, even if more than half of the data consists of non-detects. Quantile regression is a valuable addition to the statistical methods that can be used for the analysis of immunological datasets with non-detects.

  11. Mortality from cystic fibrosis in Europe: 1994-2010.

    PubMed

    Quintana-Gallego, Esther; Ruiz-Ramos, Miguel; Delgado-Pecellin, Isabel; Calero, Carmen; Soriano, Joan B; Lopez-Campos, Jose Luis

    2016-02-01

    To date, available mortality trends due to cystic fibrosis (CF) have been limited to the analysis of certain countries in different parts of the world showing that mortality trends have been constantly decreasing. However, no studies have examined Europe as a whole. The present study aims to analyze CF mortality trends by gender within the European Union (EU) and to quantify potential years of life lost (PYLL). Deaths from the 27 EU countries were obtained from the statistical office of the EU from the years 1994-2010. Crude and age-standardized mortality rates (ASR) were estimated for women and men using the standard European population, expressed in deaths per 1,000,000 persons. The PYLL from ages 0 up to 30 years were estimated. Trends were studied by a joinpoint regression analysis. During the study period, 5,130 deaths (2,443 in males and 2,687 in females) were identified. Females had a slightly higher mortality rate than males, with a downward trend observed for both genders. In males, the ASR changed from 1.34 in 1994 to 1.03 in 2010. In females, the ASR changed from 1.42 in 1994 to 0.92 in 2010. The mean age at death and PYLL increased for both genders. The joinpoint analysis did not identify any significant joinpoint for either gender for ASR or PYLL. Our data suggest a continued downward trend of CF mortality throughout the EU, with differences by country and gender. © 2015 Wiley Periodicals, Inc.

  12. Epistasis analysis for quantitative traits by functional regression model.

    PubMed

    Zhang, Futao; Boerwinkle, Eric; Xiong, Momiao

    2014-06-01

    The critical barrier in interaction analysis for rare variants is that most traditional statistical methods for testing interactions were originally designed for testing the interaction between common variants and are difficult to apply to rare variants because of their prohibitive computational time and poor ability. The great challenges for successful detection of interactions with next-generation sequencing (NGS) data are (1) lack of methods for interaction analysis with rare variants, (2) severe multiple testing, and (3) time-consuming computations. To meet these challenges, we shift the paradigm of interaction analysis between two loci to interaction analysis between two sets of loci or genomic regions and collectively test interactions between all possible pairs of SNPs within two genomic regions. In other words, we take a genome region as a basic unit of interaction analysis and use high-dimensional data reduction and functional data analysis techniques to develop a novel functional regression model to collectively test interactions between all possible pairs of single nucleotide polymorphisms (SNPs) within two genome regions. By intensive simulations, we demonstrate that the functional regression models for interaction analysis of the quantitative trait have the correct type 1 error rates and a much better ability to detect interactions than the current pairwise interaction analysis. The proposed method was applied to exome sequence data from the NHLBI's Exome Sequencing Project (ESP) and CHARGE-S study. We discovered 27 pairs of genes showing significant interactions after applying the Bonferroni correction (P-values < 4.58 × 10(-10)) in the ESP, and 11 were replicated in the CHARGE-S study. © 2014 Zhang et al.; Published by Cold Spring Harbor Laboratory Press.

  13. [Comparison of application of Cochran-Armitage trend test and linear regression analysis for rate trend analysis in epidemiology study].

    PubMed

    Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H

    2017-05-10

    We described the time trend of acute myocardial infarction (AMI) from 1999 to 2013 in Tianjin incidence rate with Cochran-Armitage trend (CAT) test and linear regression analysis, and the results were compared. Based on actual population, CAT test had much stronger statistical power than linear regression analysis for both overall incidence trend and age specific incidence trend (Cochran-Armitage trend P valueregression P value). The statistical power of CAT test decreased, while the result of linear regression analysis remained the same when population size was reduced by 100 times and AMI incidence rate remained unchanged. The two statistical methods have their advantages and disadvantages. It is necessary to choose statistical method according the fitting degree of data, or comprehensively analyze the results of two methods.

  14. A method for nonlinear exponential regression analysis

    NASA Technical Reports Server (NTRS)

    Junkin, B. G.

    1971-01-01

    A computer-oriented technique is presented for performing a nonlinear exponential regression analysis on decay-type experimental data. The technique involves the least squares procedure wherein the nonlinear problem is linearized by expansion in a Taylor series. A linear curve fitting procedure for determining the initial nominal estimates for the unknown exponential model parameters is included as an integral part of the technique. A correction matrix was derived and then applied to the nominal estimate to produce an improved set of model parameters. The solution cycle is repeated until some predetermined criterion is satisfied.

  15. [A SAS marco program for batch processing of univariate Cox regression analysis for great database].

    PubMed

    Yang, Rendong; Xiong, Jie; Peng, Yangqin; Peng, Xiaoning; Zeng, Xiaomin

    2015-02-01

    To realize batch processing of univariate Cox regression analysis for great database by SAS marco program. We wrote a SAS macro program, which can filter, integrate, and export P values to Excel by SAS9.2. The program was used for screening survival correlated RNA molecules of ovarian cancer. A SAS marco program could finish the batch processing of univariate Cox regression analysis, the selection and export of the results. The SAS macro program has potential applications in reducing the workload of statistical analysis and providing a basis for batch processing of univariate Cox regression analysis.

  16. A general framework for the use of logistic regression models in meta-analysis.

    PubMed

    Simmonds, Mark C; Higgins, Julian Pt

    2016-12-01

    Where individual participant data are available for every randomised trial in a meta-analysis of dichotomous event outcomes, "one-stage" random-effects logistic regression models have been proposed as a way to analyse these data. Such models can also be used even when individual participant data are not available and we have only summary contingency table data. One benefit of this one-stage regression model over conventional meta-analysis methods is that it maximises the correct binomial likelihood for the data and so does not require the common assumption that effect estimates are normally distributed. A second benefit of using this model is that it may be applied, with only minor modification, in a range of meta-analytic scenarios, including meta-regression, network meta-analyses and meta-analyses of diagnostic test accuracy. This single model can potentially replace the variety of often complex methods used in these areas. This paper considers, with a range of meta-analysis examples, how random-effects logistic regression models may be used in a number of different types of meta-analyses. This one-stage approach is compared with widely used meta-analysis methods including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-analyses of diagnostic test accuracy. © The Author(s) 2014.

  17. Factor analysis and multiple regression between topography and precipitation on Jeju Island, Korea

    NASA Astrophysics Data System (ADS)

    Um, Myoung-Jin; Yun, Hyeseon; Jeong, Chang-Sam; Heo, Jun-Haeng

    2011-11-01

    SummaryIn this study, new factors that influence precipitation were extracted from geographic variables using factor analysis, which allow for an accurate estimation of orographic precipitation. Correlation analysis was also used to examine the relationship between nine topographic variables from digital elevation models (DEMs) and the precipitation in Jeju Island. In addition, a spatial analysis was performed in order to verify the validity of the regression model. From the results of the correlation analysis, it was found that all of the topographic variables had a positive correlation with the precipitation. The relations between the variables also changed in accordance with a change in the precipitation duration. However, upon examining the correlation matrix, no significant relationship between the latitude and the aspect was found. According to the factor analysis, eight topographic variables (latitude being the exception) were found to have a direct influence on the precipitation. Three factors were then extracted from the eight topographic variables. By directly comparing the multiple regression model with the factors (model 1) to the multiple regression model with the topographic variables (model 3), it was found that model 1 did not violate the limits of statistical significance and multicollinearity. As such, model 1 was considered to be appropriate for estimating the precipitation when taking into account the topography. In the study of model 1, the multiple regression model using factor analysis was found to be the best method for estimating the orographic precipitation on Jeju Island.

  18. Bias due to two-stage residual-outcome regression analysis in genetic association studies.

    PubMed

    Demissie, Serkalem; Cupples, L Adrienne

    2011-11-01

    Association studies of risk factors and complex diseases require careful assessment of potential confounding factors. Two-stage regression analysis, sometimes referred to as residual- or adjusted-outcome analysis, has been increasingly used in association studies of single nucleotide polymorphisms (SNPs) and quantitative traits. In this analysis, first, a residual-outcome is calculated from a regression of the outcome variable on covariates and then the relationship between the adjusted-outcome and the SNP is evaluated by a simple linear regression of the adjusted-outcome on the SNP. In this article, we examine the performance of this two-stage analysis as compared with multiple linear regression (MLR) analysis. Our findings show that when a SNP and a covariate are correlated, the two-stage approach results in biased genotypic effect and loss of power. Bias is always toward the null and increases with the squared-correlation between the SNP and the covariate (). For example, for , 0.1, and 0.5, two-stage analysis results in, respectively, 0, 10, and 50% attenuation in the SNP effect. As expected, MLR was always unbiased. Since individual SNPs often show little or no correlation with covariates, a two-stage analysis is expected to perform as well as MLR in many genetic studies; however, it produces considerably different results from MLR and may lead to incorrect conclusions when independent variables are highly correlated. While a useful alternative to MLR under , the two -stage approach has serious limitations. Its use as a simple substitute for MLR should be avoided. © 2011 Wiley Periodicals, Inc.

  19. Advanced statistics: linear regression, part I: simple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.

  20. Creep-Rupture Data Analysis - Engineering Application of Regression Techniques. Ph.D. Thesis - North Carolina State Univ.

    NASA Technical Reports Server (NTRS)

    Rummler, D. R.

    1976-01-01

    The results are presented of investigations to apply regression techniques to the development of methodology for creep-rupture data analysis. Regression analysis techniques are applied to the explicit description of the creep behavior of materials for space shuttle thermal protection systems. A regression analysis technique is compared with five parametric methods for analyzing three simulated and twenty real data sets, and a computer program for the evaluation of creep-rupture data is presented.

  1. Functional Regression Models for Epistasis Analysis of Multiple Quantitative Traits.

    PubMed

    Zhang, Futao; Xie, Dan; Liang, Meimei; Xiong, Momiao

    2016-04-01

    To date, most genetic analyses of phenotypes have focused on analyzing single traits or analyzing each phenotype independently. However, joint epistasis analysis of multiple complementary traits will increase statistical power and improve our understanding of the complicated genetic structure of the complex diseases. Despite their importance in uncovering the genetic structure of complex traits, the statistical methods for identifying epistasis in multiple phenotypes remains fundamentally unexplored. To fill this gap, we formulate a test for interaction between two genes in multiple quantitative trait analysis as a multiple functional regression (MFRG) in which the genotype functions (genetic variant profiles) are defined as a function of the genomic position of the genetic variants. We use large-scale simulations to calculate Type I error rates for testing interaction between two genes with multiple phenotypes and to compare the power with multivariate pairwise interaction analysis and single trait interaction analysis by a single variate functional regression model. To further evaluate performance, the MFRG for epistasis analysis is applied to five phenotypes of exome sequence data from the NHLBI's Exome Sequencing Project (ESP) to detect pleiotropic epistasis. A total of 267 pairs of genes that formed a genetic interaction network showed significant evidence of epistasis influencing five traits. The results demonstrate that the joint interaction analysis of multiple phenotypes has a much higher power to detect interaction than the interaction analysis of a single trait and may open a new direction to fully uncovering the genetic structure of multiple phenotypes.

  2. Impact of the global financial crisis on low birth weight in Portugal: a time-trend analysis

    PubMed Central

    Kana, Musa Abubakar; Correia, Sofia; Peleteiro, Barbara; Severo, Milton; Barros, Henrique

    2017-01-01

    Background The 2007–2008 global financial crisis had adverse consequences on population health of affected European countries. Few contemporary studies have studied its effect on perinatal indicators with long-lasting influence on adult health. Therefore, in this study, we investigated the impact of the 2007–2008 global financial crisis on low birth weight (LBW) in Portugal. Methods Data on 2 045 155 singleton births of 1995–2014 were obtained from Statistics Portugal. Joinpoint regression analysis was performed to identify the years in which changes in LBW trends occurred, and to estimate the annual per cent changes (APC). LBW risk by time period expressed as prevalence ratios were computed using the Poisson regression. Contextual changes in sociodemographic and economic factors were provided by their trends. Results The joinpoint analysis identified 3 distinct periods (2 jointpoints) with different APC in LBW, corresponding to 1995–1999 (APC=4.4; 95% CI 3.2 to 5.6), 2000–2006 (APC=0.1; 95% CI −050 to 0.7) and 2007–2014 (APC=1.6; 95% CI 1.2 to 2.0). For non-Portuguese, it was, respectively, 1995–1999 (APC=1.4; 95% CI −3.9 to 7.0%), 2000–2007 (APC=−4.2; 95% CI −6.4 to −2.0) and 2008–2014 (APC=3.1; 95% CI 0.8 to 5.5). Compared with 1995–1999, all specific maternal characteristics had a 10–15% increase in LBW risk in 2000–2006 and a 20–25% increase in 2007–2014, except among migrants, for which LBW risk remained lower than in 1995–1999 but increased after the crisis. The increasing LBW risk coincides with a deceleration in gross domestic product growth rate, reduction in health expenditure, social protection allocation on family/children support and sickness. Conclusions The 2007–2008 global financial crisis was associated with a significant increase in LBW, particularly among infants of non-Portuguese mothers. We recommend strengthening social policies aimed at maternity protection for vulnerable mothers and health

  3. Forecasting urban water demand: A meta-regression analysis.

    PubMed

    Sebri, Maamar

    2016-12-01

    Water managers and planners require accurate water demand forecasts over the short-, medium- and long-term for many purposes. These range from assessing water supply needs over spatial and temporal patterns to optimizing future investments and planning future allocations across competing sectors. This study surveys the empirical literature on the urban water demand forecasting using the meta-analytical approach. Specifically, using more than 600 estimates, a meta-regression analysis is conducted to identify explanations of cross-studies variation in accuracy of urban water demand forecasting. Our study finds that accuracy depends significantly on study characteristics, including demand periodicity, modeling method, forecasting horizon, model specification and sample size. The meta-regression results remain robust to different estimators employed as well as to a series of sensitivity checks performed. The importance of these findings lies in the conclusions and implications drawn out for regulators and policymakers and for academics alike. Copyright © 2016. Published by Elsevier Ltd.

  4. A simple linear regression method for quantitative trait loci linkage analysis with censored observations.

    PubMed

    Anderson, Carl A; McRae, Allan F; Visscher, Peter M

    2006-07-01

    Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.

  5. The use of cognitive ability measures as explanatory variables in regression analysis

    PubMed Central

    Junker, Brian; Schofield, Lynne Steuerle; Taylor, Lowell J

    2015-01-01

    Cognitive ability measures are often taken as explanatory variables in regression analysis, e.g., as a factor affecting a market outcome such as an individual’s wage, or a decision such as an individual’s education acquisition. Cognitive ability is a latent construct; its true value is unobserved. Nonetheless, researchers often assume that a test score, constructed via standard psychometric practice from individuals’ responses to test items, can be safely used in regression analysis. We examine problems that can arise, and suggest that an alternative approach, a “mixed effects structural equations” (MESE) model, may be more appropriate in many circumstances. PMID:26998417

  6. The use of cognitive ability measures as explanatory variables in regression analysis.

    PubMed

    Junker, Brian; Schofield, Lynne Steuerle; Taylor, Lowell J

    2012-12-01

    Cognitive ability measures are often taken as explanatory variables in regression analysis, e.g., as a factor affecting a market outcome such as an individual's wage, or a decision such as an individual's education acquisition. Cognitive ability is a latent construct; its true value is unobserved. Nonetheless, researchers often assume that a test score , constructed via standard psychometric practice from individuals' responses to test items, can be safely used in regression analysis. We examine problems that can arise, and suggest that an alternative approach, a "mixed effects structural equations" (MESE) model, may be more appropriate in many circumstances.

  7. Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.

    PubMed

    Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A

    2016-01-01

    Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.

  8. Neighborhood social capital and crime victimization: comparison of spatial regression analysis and hierarchical regression analysis.

    PubMed

    Takagi, Daisuke; Ikeda, Ken'ichi; Kawachi, Ichiro

    2012-11-01

    Crime is an important determinant of public health outcomes, including quality of life, mental well-being, and health behavior. A body of research has documented the association between community social capital and crime victimization. The association between social capital and crime victimization has been examined at multiple levels of spatial aggregation, ranging from entire countries, to states, metropolitan areas, counties, and neighborhoods. In multilevel analysis, the spatial boundaries at level 2 are most often drawn from administrative boundaries (e.g., Census tracts in the U.S.). One problem with adopting administrative definitions of neighborhoods is that it ignores spatial spillover. We conducted a study of social capital and crime victimization in one ward of Tokyo city, using a spatial Durbin model with an inverse-distance weighting matrix that assigned each respondent a unique level of "exposure" to social capital based on all other residents' perceptions. The study is based on a postal questionnaire sent to 20-69 years old residents of Arakawa Ward, Tokyo. The response rate was 43.7%. We examined the contextual influence of generalized trust, perceptions of reciprocity, two types of social network variables, as well as two principal components of social capital (constructed from the above four variables). Our outcome measure was self-reported crime victimization in the last five years. In the spatial Durbin model, we found that neighborhood generalized trust, reciprocity, supportive networks and two principal components of social capital were each inversely associated with crime victimization. By contrast, a multilevel regression performed with the same data (using administrative neighborhood boundaries) found generally null associations between neighborhood social capital and crime. Spatial regression methods may be more appropriate for investigating the contextual influence of social capital in homogeneous cultural settings such as Japan. Copyright

  9. Functional mixture regression.

    PubMed

    Yao, Fang; Fu, Yuejiao; Lee, Thomas C M

    2011-04-01

    In functional linear models (FLMs), the relationship between the scalar response and the functional predictor process is often assumed to be identical for all subjects. Motivated by both practical and methodological considerations, we relax this assumption and propose a new class of functional regression models that allow the regression structure to vary for different groups of subjects. By projecting the predictor process onto its eigenspace, the new functional regression model is simplified to a framework that is similar to classical mixture regression models. This leads to the proposed approach named as functional mixture regression (FMR). The estimation of FMR can be readily carried out using existing software implemented for functional principal component analysis and mixture regression. The practical necessity and performance of FMR are illustrated through applications to a longevity analysis of female medflies and a human growth study. Theoretical investigations concerning the consistent estimation and prediction properties of FMR along with simulation experiments illustrating its empirical properties are presented in the supplementary material available at Biostatistics online. Corresponding results demonstrate that the proposed approach could potentially achieve substantial gains over traditional FLMs.

  10. Regression Analysis and Calibration Recommendations for the Characterization of Balance Temperature Effects

    NASA Technical Reports Server (NTRS)

    Ulbrich, N.; Volden, T.

    2018-01-01

    Analysis and use of temperature-dependent wind tunnel strain-gage balance calibration data are discussed in the paper. First, three different methods are presented and compared that may be used to process temperature-dependent strain-gage balance data. The first method uses an extended set of independent variables in order to process the data and predict balance loads. The second method applies an extended load iteration equation during the analysis of balance calibration data. The third method uses temperature-dependent sensitivities for the data analysis. Physical interpretations of the most important temperature-dependent regression model terms are provided that relate temperature compensation imperfections and the temperature-dependent nature of the gage factor to sets of regression model terms. Finally, balance calibration recommendations are listed so that temperature-dependent calibration data can be obtained and successfully processed using the reviewed analysis methods.

  11. Clinical evaluation of a novel population-based regression analysis for detecting glaucomatous visual field progression.

    PubMed

    Kovalska, M P; Bürki, E; Schoetzau, A; Orguel, S F; Orguel, S; Grieshaber, M C

    2011-04-01

    The distinction of real progression from test variability in visual field (VF) series may be based on clinical judgment, on trend analysis based on follow-up of test parameters over time, or on identification of a significant change related to the mean of baseline exams (event analysis). The aim of this study was to compare a new population-based method (Octopus field analysis, OFA) with classic regression analyses and clinical judgment for detecting glaucomatous VF changes. 240 VF series of 240 patients with at least 9 consecutive examinations available were included into this study. They were independently classified by two experienced investigators. The results of such a classification served as a reference for comparison for the following statistical tests: (a) t-test global, (b) r-test global, (c) regression analysis of 10 VF clusters and (d) point-wise linear regression analysis. 32.5 % of the VF series were classified as progressive by the investigators. The sensitivity and specificity were 89.7 % and 92.0 % for r-test, and 73.1 % and 93.8 % for the t-test, respectively. In the point-wise linear regression analysis, the specificity was comparable (89.5 % versus 92 %), but the sensitivity was clearly lower than in the r-test (22.4 % versus 89.7 %) at a significance level of p = 0.01. A regression analysis for the 10 VF clusters showed a markedly higher sensitivity for the r-test (37.7 %) than the t-test (14.1 %) at a similar specificity (88.3 % versus 93.8 %) for a significant trend (p = 0.005). In regard to the cluster distribution, the paracentral clusters and the superior nasal hemifield progressed most frequently. The population-based regression analysis seems to be superior to the trend analysis in detecting VF progression in glaucoma, and may eliminate the drawbacks of the event analysis. Further, it may assist the clinician in the evaluation of VF series and may allow better visualization of the correlation between function and structure owing to VF

  12. Selective principal component regression analysis of fluorescence hyperspectral image to assess aflatoxin contamination in corn

    USDA-ARS?s Scientific Manuscript database

    Selective principal component regression analysis (SPCR) uses a subset of the original image bands for principal component transformation and regression. For optimal band selection before the transformation, this paper used genetic algorithms (GA). In this case, the GA process used the regression co...

  13. Covariate Imbalance and Adjustment for Logistic Regression Analysis of Clinical Trial Data

    PubMed Central

    Ciolino, Jody D.; Martin, Reneé H.; Zhao, Wenle; Jauch, Edward C.; Hill, Michael D.; Palesch, Yuko Y.

    2014-01-01

    In logistic regression analysis for binary clinical trial data, adjusted treatment effect estimates are often not equivalent to unadjusted estimates in the presence of influential covariates. This paper uses simulation to quantify the benefit of covariate adjustment in logistic regression. However, International Conference on Harmonization guidelines suggest that covariate adjustment be pre-specified. Unplanned adjusted analyses should be considered secondary. Results suggest that that if adjustment is not possible or unplanned in a logistic setting, balance in continuous covariates can alleviate some (but never all) of the shortcomings of unadjusted analyses. The case of log binomial regression is also explored. PMID:24138438

  14. Declining Bias and Gender Wage Discrimination? A Meta-Regression Analysis

    ERIC Educational Resources Information Center

    Jarrell, Stephen B.; Stanley, T. D.

    2004-01-01

    The meta-regression analysis reveals that there is a strong tendency for discrimination estimates to fall and wage discrimination exist against the woman. The biasing effect of researchers' gender of not correcting for selection bias has weakened and changes in labor market have made it less important.

  15. Regression Analysis of Stage Variability for West-Central Florida Lakes

    USGS Publications Warehouse

    Sacks, Laura A.; Ellison, Donald L.; Swancar, Amy

    2008-01-01

    The variability in a lake's stage depends upon many factors, including surface-water flows, meteorological conditions, and hydrogeologic characteristics near the lake. An understanding of the factors controlling lake-stage variability for a population of lakes may be helpful to water managers who set regulatory levels for lakes. The goal of this study is to determine whether lake-stage variability can be predicted using multiple linear regression and readily available lake and basin characteristics defined for each lake. Regressions were evaluated for a recent 10-year period (1996-2005) and for a historical 10-year period (1954-63). Ground-water pumping is considered to have affected stage at many of the 98 lakes included in the recent period analysis, and not to have affected stage at the 20 lakes included in the historical period analysis. For the recent period, regression models had coefficients of determination (R2) values ranging from 0.60 to 0.74, and up to five explanatory variables. Standard errors ranged from 21 to 37 percent of the average stage variability. Net leakage was the most important explanatory variable in regressions describing the full range and low range in stage variability for the recent period. The most important explanatory variable in the model predicting the high range in stage variability was the height over median lake stage at which surface-water outflow would occur. Other explanatory variables in final regression models for the recent period included the range in annual rainfall for the period and several variables related to local and regional hydrogeology: (1) ground-water pumping within 1 mile of each lake, (2) the amount of ground-water inflow (by category), (3) the head gradient between the lake and the Upper Floridan aquifer, and (4) the thickness of the intermediate confining unit. Many of the variables in final regression models are related to hydrogeologic characteristics, underscoring the importance of ground

  16. Length bias correction in gene ontology enrichment analysis using logistic regression.

    PubMed

    Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H

    2012-01-01

    When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.

  17. Confirming the timing of phase-based costing in oncology studies: a case example in advanced melanoma.

    PubMed

    Atkins, Michael; Coutinho, Anna D; Nunna, Sasikiran; Gupte-Singh, Komal; Eaddy, Michael

    2018-02-01

    The utilization of healthcare services and costs among patients with cancer is often estimated by the phase of care: initial, interim, or terminal. Although their durations are often set arbitrarily, we sought to establish data-driven phases of care using joinpoint regression in an advanced melanoma population as a case example. A retrospective claims database study was conducted to assess the costs of advanced melanoma from distant metastasis diagnosis to death during January 2010-September 2014. Joinpoint regression analysis was applied to identify the best-fitting points, where statistically significant changes in the trend of average monthly costs occurred. To identify the initial phase, average monthly costs were modeled from metastasis diagnosis to death; and were modeled backward from death to metastasis diagnosis for the terminal phase. Points of monthly cost trend inflection denoted ending and starting points. The months between represented the interim phase. A total of 1,671 patients with advanced melanoma who died met the eligibility criteria. Initial phase was identified as the 5-month period starting with diagnosis of metastasis, after which there was a sharp, significant decline in monthly cost trend (monthly percent change [MPC] = -13.0%; 95% CI = -16.9% to -8.8%). Terminal phase was defined as the 5-month period before death (MPC = -14.0%; 95% CI = -17.6% to -10.2%). The claims-based algorithm may under-estimate patients due to misclassifications, and may over-estimate terminal phase costs because hospital and emergency visits were used as a death proxy. Also, recently approved therapies were not included, which may under-estimate advanced melanoma costs. In this advanced melanoma population, optimal duration of the initial and terminal phases of care was 5 months immediately after diagnosis of metastasis and before death, respectively. Joinpoint regression can be used to provide data-supported phase of cancer care durations, but

  18. Replica analysis of overfitting in regression models for time-to-event data

    NASA Astrophysics Data System (ADS)

    Coolen, A. C. C.; Barrett, J. E.; Paga, P.; Perez-Vicente, C. J.

    2017-09-01

    Overfitting, which happens when the number of parameters in a model is too large compared to the number of data points available for determining these parameters, is a serious and growing problem in survival analysis. While modern medicine presents us with data of unprecedented dimensionality, these data cannot yet be used effectively for clinical outcome prediction. Standard error measures in maximum likelihood regression, such as p-values and z-scores, are blind to overfitting, and even for Cox’s proportional hazards model (the main tool of medical statisticians), one finds in literature only rules of thumb on the number of samples required to avoid overfitting. In this paper we present a mathematical theory of overfitting in regression models for time-to-event data, which aims to increase our quantitative understanding of the problem and provide practical tools with which to correct regression outcomes for the impact of overfitting. It is based on the replica method, a statistical mechanical technique for the analysis of heterogeneous many-variable systems that has been used successfully for several decades in physics, biology, and computer science, but not yet in medical statistics. We develop the theory initially for arbitrary regression models for time-to-event data, and verify its predictions in detail for the popular Cox model.

  19. Predictors of postoperative outcomes of cubital tunnel syndrome treatments using multiple logistic regression analysis.

    PubMed

    Suzuki, Taku; Iwamoto, Takuji; Shizu, Kanae; Suzuki, Katsuji; Yamada, Harumoto; Sato, Kazuki

    2017-05-01

    This retrospective study was designed to investigate prognostic factors for postoperative outcomes for cubital tunnel syndrome (CubTS) using multiple logistic regression analysis with a large number of patients. Eighty-three patients with CubTS who underwent surgeries were enrolled. The following potential prognostic factors for disease severity were selected according to previous reports: sex, age, type of surgery, disease duration, body mass index, cervical lesion, presence of diabetes mellitus, Workers' Compensation status, preoperative severity, and preoperative electrodiagnostic testing. Postoperative severity of disease was assessed 2 years after surgery by Messina's criteria which is an outcome measure specifically for CubTS. Bivariate analysis was performed to select candidate prognostic factors for multiple linear regression analyses. Multiple logistic regression analysis was conducted to identify the association between postoperative severity and selected prognostic factors. Both bivariate and multiple linear regression analysis revealed only preoperative severity as an independent risk factor for poor prognosis, while other factors did not show any significant association. Although conflicting results exist regarding prognosis of CubTS, this study supports evidence from previous studies and concludes early surgical intervention portends the most favorable prognosis. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.

  20. Multiple linear regression analysis

    NASA Technical Reports Server (NTRS)

    Edwards, T. R.

    1980-01-01

    Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.

  1. Rex fortran 4 system for combinatorial screening or conventional analysis of multivariate regressions

    Treesearch

    L.R. Grosenbaugh

    1967-01-01

    Describes an expansible computerized system that provides data needed in regression or covariance analysis of as many as 50 variables, 8 of which may be dependent. Alternatively, it can screen variously generated combinations of independent variables to find the regression with the smallest mean-squared-residual, which will be fitted if desired. The user can easily...

  2. Understanding poisson regression.

    PubMed

    Hayat, Matthew J; Higgins, Melinda

    2014-04-01

    Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes. Copyright 2014, SLACK Incorporated.

  3. Analysis of the Influence of Quantile Regression Model on Mainland Tourists' Service Satisfaction Performance

    PubMed Central

    Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen

    2014-01-01

    It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models. PMID:24574916

  4. Analysis of the influence of quantile regression model on mainland tourists' service satisfaction performance.

    PubMed

    Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen

    2014-01-01

    It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models.

  5. Ultrasound-enhanced bioscouring of greige cotton: regression analysis of process factors

    USDA-ARS?s Scientific Manuscript database

    Process factors of enzyme concentration, time, power and frequency were investigated for ultrasound-enhanced bioscouring of greige cotton. A fractional factorial experimental design and subsequent regression analysis of the process factors were employed to determine the significance of each factor a...

  6. A rotor optimization using regression analysis

    NASA Technical Reports Server (NTRS)

    Giansante, N.

    1984-01-01

    The design and development of helicopter rotors is subject to the many design variables and their interactions that effect rotor operation. Until recently, selection of rotor design variables to achieve specified rotor operational qualities has been a costly, time consuming, repetitive task. For the past several years, Kaman Aerospace Corporation has successfully applied multiple linear regression analysis, coupled with optimization and sensitivity procedures, in the analytical design of rotor systems. It is concluded that approximating equations can be developed rapidly for a multiplicity of objective and constraint functions and optimizations can be performed in a rapid and cost effective manner; the number and/or range of design variables can be increased by expanding the data base and developing approximating functions to reflect the expanded design space; the order of the approximating equations can be expanded easily to improve correlation between analyzer results and the approximating equations; gradients of the approximating equations can be calculated easily and these gradients are smooth functions reducing the risk of numerical problems in the optimization; the use of approximating functions allows the problem to be started easily and rapidly from various initial designs to enhance the probability of finding a global optimum; and the approximating equations are independent of the analysis or optimization codes used.

  7. Meta-regression analysis of commensal and pathogenic Escherichia coli survival in soil and water.

    PubMed

    Franz, Eelco; Schijven, Jack; de Roda Husman, Ana Maria; Blaak, Hetty

    2014-06-17

    The extent to which pathogenic and commensal E. coli (respectively PEC and CEC) can survive, and which factors predominantly determine the rate of decline, are crucial issues from a public health point of view. The goal of this study was to provide a quantitative summary of the variability in E. coli survival in soil and water over a broad range of individual studies and to identify the most important sources of variability. To that end, a meta-regression analysis on available literature data was conducted. The considerable variation in reported decline rates indicated that the persistence of E. coli is not easily predictable. The meta-analysis demonstrated that for soil and water, the type of experiment (laboratory or field), the matrix subtype (type of water and soil), and temperature were the main factors included in the regression analysis. A higher average decline rate in soil of PEC compared with CEC was observed. The regression models explained at best 57% of the variation in decline rate in soil and 41% of the variation in decline rate in water. This indicates that additional factors, not included in the current meta-regression analysis, are of importance but rarely reported. More complete reporting of experimental conditions may allow future inference on the global effects of these variables on the decline rate of E. coli.

  8. Air Leakage of US Homes: Regression Analysis and Improvements from Retrofit

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chan, Wanyu R.; Joh, Jeffrey; Sherman, Max H.

    2012-08-01

    LBNL Residential Diagnostics Database (ResDB) contains blower door measurements and other diagnostic test results of homes in United States. Of these, approximately 134,000 single-family detached homes have sufficient information for the analysis of air leakage in relation to a number of housing characteristics. We performed regression analysis to consider the correlation between normalized leakage and a number of explanatory variables: IECC climate zone, floor area, height, year built, foundation type, duct location, and other characteristics. The regression model explains 68% of the observed variability in normalized leakage. ResDB also contains the before and after retrofit air leakage measurements of approximatelymore » 23,000 homes that participated in weatherization assistant programs (WAPs) or residential energy efficiency programs. The two types of programs achieve rather similar reductions in normalized leakage: 30% for WAPs and 20% for other energy programs.« less

  9. Canadian firearms legislation and effects on homicide 1974 to 2008.

    PubMed

    Langmann, Caillin

    2012-08-01

    Canada has implemented legislation covering all firearms since 1977 and presents a model to examine incremental firearms control. The effect of legislation on homicide by firearm and the subcategory, spousal homicide, is controversial and has not been well studied to date. Legislative effects on homicide and spousal homicide were analyzed using data obtained from Statistics Canada from 1974 to 2008. Three statistical methods were applied to search for any associated effects of firearms legislation. Interrupted time series regression, ARIMA, and Joinpoint analysis were performed. Neither were any significant beneficial associations between firearms legislation and homicide or spousal homicide rates found after the passage of three Acts by the Canadian Parliament--Bill C-51 (1977), C-17 (1991), and C-68 (1995)--nor were effects found after the implementation of licensing in 2001 and the registration of rifles and shotguns in 2003. After the passage of C-68, a decrease in the rate of the decline of homicide by firearm was found by interrupted regression. Joinpoint analysis also found an increasing trend in homicide by firearm rate post the enactment of the licensing portion of C-68. Other factors found to be associated with homicide rates were median age, unemployment, immigration rates, percentage of population in low-income bracket, Gini index of income equality, population per police officer, and incarceration rate. This study failed to demonstrate a beneficial association between legislation and firearm homicide rates between 1974 and 2008.

  10. Regression Model Term Selection for the Analysis of Strain-Gage Balance Calibration Data

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert Manfred; Volden, Thomas R.

    2010-01-01

    The paper discusses the selection of regression model terms for the analysis of wind tunnel strain-gage balance calibration data. Different function class combinations are presented that may be used to analyze calibration data using either a non-iterative or an iterative method. The role of the intercept term in a regression model of calibration data is reviewed. In addition, useful algorithms and metrics originating from linear algebra and statistics are recommended that will help an analyst (i) to identify and avoid both linear and near-linear dependencies between regression model terms and (ii) to make sure that the selected regression model of the calibration data uses only statistically significant terms. Three different tests are suggested that may be used to objectively assess the predictive capability of the final regression model of the calibration data. These tests use both the original data points and regression model independent confirmation points. Finally, data from a simplified manual calibration of the Ames MK40 balance is used to illustrate the application of some of the metrics and tests to a realistic calibration data set.

  11. Evaluating differential effects using regression interactions and regression mixture models

    PubMed Central

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This paper focuses on understanding regression mixture models, a relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their formulation, and their assumptions are compared using Monte Carlo simulations and real data analysis. The capabilities of regression mixture models are described and specific issues to be addressed when conducting regression mixtures are proposed. The paper aims to clarify the role that regression mixtures can take in the estimation of differential effects and increase awareness of the benefits and potential pitfalls of this approach. Regression mixture models are shown to be a potentially effective exploratory method for finding differential effects when these effects can be defined by a small number of classes of respondents who share a typical relationship between a predictor and an outcome. It is also shown that the comparison between regression mixture models and interactions becomes substantially more complex as the number of classes increases. It is argued that regression interactions are well suited for direct tests of specific hypotheses about differential effects and regression mixtures provide a useful approach for exploring effect heterogeneity given adequate samples and study design. PMID:26556903

  12. Robust analysis of trends in noisy tokamak confinement data using geodesic least squares regression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Verdoolaege, G., E-mail: geert.verdoolaege@ugent.be; Laboratory for Plasma Physics, Royal Military Academy, B-1000 Brussels; Shabbir, A.

    Regression analysis is a very common activity in fusion science for unveiling trends and parametric dependencies, but it can be a difficult matter. We have recently developed the method of geodesic least squares (GLS) regression that is able to handle errors in all variables, is robust against data outliers and uncertainty in the regression model, and can be used with arbitrary distribution models and regression functions. We here report on first results of application of GLS to estimation of the multi-machine scaling law for the energy confinement time in tokamaks, demonstrating improved consistency of the GLS results compared to standardmore » least squares.« less

  13. A use of regression analysis in acoustical diagnostics of gear drives

    NASA Technical Reports Server (NTRS)

    Balitskiy, F. Y.; Genkin, M. D.; Ivanova, M. A.; Kobrinskiy, A. A.; Sokolova, A. G.

    1973-01-01

    A study is presented of components of the vibration spectrum as the filtered first and second harmonics of the tooth frequency which permits information to be obtained on the physical characteristics of the vibration excitation process, and an approach to be made to comparison of models of the gearing. Regression analysis of two random processes has shown a strong dependence of the second harmonic on the first, and independence of the first from the second. The nature of change in the regression line, with change in loading moment, gives rise to the idea of a variable phase shift between the first and second harmonics.

  14. Quantile regression in the presence of monotone missingness with sensitivity analysis

    PubMed Central

    Liu, Minzhao; Daniels, Michael J.; Perri, Michael G.

    2016-01-01

    In this paper, we develop methods for longitudinal quantile regression when there is monotone missingness. In particular, we propose pattern mixture models with a constraint that provides a straightforward interpretation of the marginal quantile regression parameters. Our approach allows sensitivity analysis which is an essential component in inference for incomplete data. To facilitate computation of the likelihood, we propose a novel way to obtain analytic forms for the required integrals. We conduct simulations to examine the robustness of our approach to modeling assumptions and compare its performance to competing approaches. The model is applied to data from a recent clinical trial on weight management. PMID:26041008

  15. Forecasting municipal solid waste generation using prognostic tools and regression analysis.

    PubMed

    Ghinea, Cristina; Drăgoi, Elena Niculina; Comăniţă, Elena-Diana; Gavrilescu, Marius; Câmpean, Teofil; Curteanu, Silvia; Gavrilescu, Maria

    2016-11-01

    For an adequate planning of waste management systems the accurate forecast of waste generation is an essential step, since various factors can affect waste trends. The application of predictive and prognosis models are useful tools, as reliable support for decision making processes. In this paper some indicators such as: number of residents, population age, urban life expectancy, total municipal solid waste were used as input variables in prognostic models in order to predict the amount of solid waste fractions. We applied Waste Prognostic Tool, regression analysis and time series analysis to forecast municipal solid waste generation and composition by considering the Iasi Romania case study. Regression equations were determined for six solid waste fractions (paper, plastic, metal, glass, biodegradable and other waste). Accuracy Measures were calculated and the results showed that S-curve trend model is the most suitable for municipal solid waste (MSW) prediction. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. Standards for Standardized Logistic Regression Coefficients

    ERIC Educational Resources Information Center

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  17. CADDIS Volume 4. Data Analysis: PECBO Appendix - R Scripts for Non-Parametric Regressions

    EPA Pesticide Factsheets

    Script for computing nonparametric regression analysis. Overview of using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, statistical scripts.

  18. Regression-based adaptive sparse polynomial dimensional decomposition for sensitivity analysis

    NASA Astrophysics Data System (ADS)

    Tang, Kunkun; Congedo, Pietro; Abgrall, Remi

    2014-11-01

    Polynomial dimensional decomposition (PDD) is employed in this work for global sensitivity analysis and uncertainty quantification of stochastic systems subject to a large number of random input variables. Due to the intimate structure between PDD and Analysis-of-Variance, PDD is able to provide simpler and more direct evaluation of the Sobol' sensitivity indices, when compared to polynomial chaos (PC). Unfortunately, the number of PDD terms grows exponentially with respect to the size of the input random vector, which makes the computational cost of the standard method unaffordable for real engineering applications. In order to address this problem of curse of dimensionality, this work proposes a variance-based adaptive strategy aiming to build a cheap meta-model by sparse-PDD with PDD coefficients computed by regression. During this adaptive procedure, the model representation by PDD only contains few terms, so that the cost to resolve repeatedly the linear system of the least-square regression problem is negligible. The size of the final sparse-PDD representation is much smaller than the full PDD, since only significant terms are eventually retained. Consequently, a much less number of calls to the deterministic model is required to compute the final PDD coefficients.

  19. Aneurysmal subarachnoid hemorrhage prognostic decision-making algorithm using classification and regression tree analysis.

    PubMed

    Lo, Benjamin W Y; Fukuda, Hitoshi; Angle, Mark; Teitelbaum, Jeanne; Macdonald, R Loch; Farrokhyar, Forough; Thabane, Lehana; Levine, Mitchell A H

    2016-01-01

    Classification and regression tree analysis involves the creation of a decision tree by recursive partitioning of a dataset into more homogeneous subgroups. Thus far, there is scarce literature on using this technique to create clinical prediction tools for aneurysmal subarachnoid hemorrhage (SAH). The classification and regression tree analysis technique was applied to the multicenter Tirilazad database (3551 patients) in order to create the decision-making algorithm. In order to elucidate prognostic subgroups in aneurysmal SAH, neurologic, systemic, and demographic factors were taken into account. The dependent variable used for analysis was the dichotomized Glasgow Outcome Score at 3 months. Classification and regression tree analysis revealed seven prognostic subgroups. Neurological grade, occurrence of post-admission stroke, occurrence of post-admission fever, and age represented the explanatory nodes of this decision tree. Split sample validation revealed classification accuracy of 79% for the training dataset and 77% for the testing dataset. In addition, the occurrence of fever at 1-week post-aneurysmal SAH is associated with increased odds of post-admission stroke (odds ratio: 1.83, 95% confidence interval: 1.56-2.45, P < 0.01). A clinically useful classification tree was generated, which serves as a prediction tool to guide bedside prognostication and clinical treatment decision making. This prognostic decision-making algorithm also shed light on the complex interactions between a number of risk factors in determining outcome after aneurysmal SAH.

  20. Sperm Retrieval in Patients with Klinefelter Syndrome: A Skewed Regression Model Analysis.

    PubMed

    Chehrazi, Mohammad; Rahimiforoushani, Abbas; Sabbaghian, Marjan; Nourijelyani, Keramat; Sadighi Gilani, Mohammad Ali; Hoseini, Mostafa; Vesali, Samira; Yaseri, Mehdi; Alizadeh, Ahad; Mohammad, Kazem; Samani, Reza Omani

    2017-01-01

    The most common chromosomal abnormality due to non-obstructive azoospermia (NOA) is Klinefelter syndrome (KS) which occurs in 1-1.72 out of 500-1000 male infants. The probability of retrieving sperm as the outcome could be asymmetrically different between patients with and without KS, therefore logistic regression analysis is not a well-qualified test for this type of data. This study has been designed to evaluate skewed regression model analysis for data collected from microsurgical testicular sperm extraction (micro-TESE) among azoospermic patients with and without non-mosaic KS syndrome. This cohort study compared the micro-TESE outcome between 134 men with classic KS and 537 men with NOA and normal karyotype who were referred to Royan Institute between 2009 and 2011. In addition to our main outcome, which was sperm retrieval, we also used logistic and skewed regression analyses to compare the following demographic and hormonal factors: age, level of follicle stimulating hormone (FSH), luteinizing hormone (LH), and testosterone between the two groups. A comparison of the micro-TESE between the KS and control groups showed a success rate of 28.4% (38/134) for the KS group and 22.2% (119/537) for the control group. In the KS group, a significantly difference (P<0.001) existed between testosterone levels for the successful sperm retrieval group (3.4 ± 0.48 mg/mL) compared to the unsuccessful sperm retrieval group (2.33 ± 0.23 mg/mL). The index for quasi Akaike information criterion (QAIC) had a goodness of fit of 74 for the skewed model which was lower than logistic regression (QAIC=85). According to the results, skewed regression is more efficient in estimating sperm retrieval success when the data from patients with KS are analyzed. This finding should be investigated by conducting additional studies with different data structures.

  1. Temporal trends in sperm count: a systematic review and meta-regression analysis.

    PubMed

    Levine, Hagai; Jørgensen, Niels; Martino-Andrade, Anderson; Mendiola, Jaime; Weksler-Derri, Dan; Mindlis, Irina; Pinotti, Rachel; Swan, Shanna H

    2017-11-01

    Reported declines in sperm counts remain controversial today and recent trends are unknown. A definitive meta-analysis is critical given the predictive value of sperm count for fertility, morbidity and mortality. To provide a systematic review and meta-regression analysis of recent trends in sperm counts as measured by sperm concentration (SC) and total sperm count (TSC), and their modification by fertility and geographic group. PubMed/MEDLINE and EMBASE were searched for English language studies of human SC published in 1981-2013. Following a predefined protocol 7518 abstracts were screened and 2510 full articles reporting primary data on SC were reviewed. A total of 244 estimates of SC and TSC from 185 studies of 42 935 men who provided semen samples in 1973-2011 were extracted for meta-regression analysis, as well as information on years of sample collection and covariates [fertility group ('Unselected by fertility' versus 'Fertile'), geographic group ('Western', including North America, Europe Australia and New Zealand versus 'Other', including South America, Asia and Africa), age, ejaculation abstinence time, semen collection method, method of measuring SC and semen volume, exclusion criteria and indicators of completeness of covariate data]. The slopes of SC and TSC were estimated as functions of sample collection year using both simple linear regression and weighted meta-regression models and the latter were adjusted for pre-determined covariates and modification by fertility and geographic group. Assumptions were examined using multiple sensitivity analyses and nonlinear models. SC declined significantly between 1973 and 2011 (slope in unadjusted simple regression models -0.70 million/ml/year; 95% CI: -0.72 to -0.69; P < 0.001; slope in adjusted meta-regression models = -0.64; -1.06 to -0.22; P = 0.003). The slopes in the meta-regression model were modified by fertility (P for interaction = 0.064) and geographic group (P for interaction = 0.027). There was

  2. Use of generalized ordered logistic regression for the analysis of multidrug resistance data.

    PubMed

    Agga, Getahun E; Scott, H Morgan

    2015-10-01

    Statistical analysis of antimicrobial resistance data largely focuses on individual antimicrobial's binary outcome (susceptible or resistant). However, bacteria are becoming increasingly multidrug resistant (MDR). Statistical analysis of MDR data is mostly descriptive often with tabular or graphical presentations. Here we report the applicability of generalized ordinal logistic regression model for the analysis of MDR data. A total of 1,152 Escherichia coli, isolated from the feces of weaned pigs experimentally supplemented with chlortetracycline (CTC) and copper, were tested for susceptibilities against 15 antimicrobials and were binary classified into resistant or susceptible. The 15 antimicrobial agents tested were grouped into eight different antimicrobial classes. We defined MDR as the number of antimicrobial classes to which E. coli isolates were resistant ranging from 0 to 8. Proportionality of the odds assumption of the ordinal logistic regression model was violated only for the effect of treatment period (pre-treatment, during-treatment and post-treatment); but not for the effect of CTC or copper supplementation. Subsequently, a partially constrained generalized ordinal logistic model was built that allows for the effect of treatment period to vary while constraining the effects of treatment (CTC and copper supplementation) to be constant across the levels of MDR classes. Copper (Proportional Odds Ratio [Prop OR]=1.03; 95% CI=0.73-1.47) and CTC (Prop OR=1.1; 95% CI=0.78-1.56) supplementation were not significantly associated with the level of MDR adjusted for the effect of treatment period. MDR generally declined over the trial period. In conclusion, generalized ordered logistic regression can be used for the analysis of ordinal data such as MDR data when the proportionality assumptions for ordered logistic regression are violated. Published by Elsevier B.V.

  3. Multiple Logistic Regression Analysis of Cigarette Use among High School Students

    ERIC Educational Resources Information Center

    Adwere-Boamah, Joseph

    2011-01-01

    A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…

  4. An Analysis of San Diego's Housing Market Using a Geographically Weighted Regression Approach

    NASA Astrophysics Data System (ADS)

    Grant, Christina P.

    San Diego County real estate transaction data was evaluated with a set of linear models calibrated by ordinary least squares and geographically weighted regression (GWR). The goal of the analysis was to determine whether the spatial effects assumed to be in the data are best studied globally with no spatial terms, globally with a fixed effects submarket variable, or locally with GWR. 18,050 single-family residential sales which closed in the six months between April 2014 and September 2014 were used in the analysis. Diagnostic statistics including AICc, R2, Global Moran's I, and visual inspection of diagnostic plots and maps indicate superior model performance by GWR as compared to both global regressions.

  5. Noninvasive spectral imaging of skin chromophores based on multiple regression analysis aided by Monte Carlo simulation

    NASA Astrophysics Data System (ADS)

    Nishidate, Izumi; Wiswadarma, Aditya; Hase, Yota; Tanaka, Noriyuki; Maeda, Takaaki; Niizeki, Kyuichi; Aizu, Yoshihisa

    2011-08-01

    In order to visualize melanin and blood concentrations and oxygen saturation in human skin tissue, a simple imaging technique based on multispectral diffuse reflectance images acquired at six wavelengths (500, 520, 540, 560, 580 and 600nm) was developed. The technique utilizes multiple regression analysis aided by Monte Carlo simulation for diffuse reflectance spectra. Using the absorbance spectrum as a response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as predictor variables, multiple regression analysis provides regression coefficients. Concentrations of melanin and total blood are then determined from the regression coefficients using conversion vectors that are deduced numerically in advance, while oxygen saturation is obtained directly from the regression coefficients. Experiments with a tissue-like agar gel phantom validated the method. In vivo experiments with human skin of the human hand during upper limb occlusion and of the inner forearm exposed to UV irradiation demonstrated the ability of the method to evaluate physiological reactions of human skin tissue.

  6. Using Refined Regression Analysis To Assess The Ecological Services Of Restored Wetlands

    EPA Science Inventory

    A hierarchical approach to regression analysis of wetland water treatment was conducted to determine which factors are the most appropriate for characterizing wetlands of differing structure and function. We used this approach in an effort to identify the types and characteristi...

  7. Regression Analysis of Physician Distribution to Identify Areas of Need: Some Preliminary Findings.

    ERIC Educational Resources Information Center

    Morgan, Bruce B.; And Others

    A regression analysis was conducted of factors that help to explain the variance in physician distribution and which identify those factors that influence the maldistribution of physicians. Models were developed for different geographic areas to determine the most appropriate unit of analysis for the Western Missouri Area Health Education Center…

  8. No rationale for 1 variable per 10 events criterion for binary logistic regression analysis.

    PubMed

    van Smeden, Maarten; de Groot, Joris A H; Moons, Karel G M; Collins, Gary S; Altman, Douglas G; Eijkemans, Marinus J C; Reitsma, Johannes B

    2016-11-24

    Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth's correction, are compared. The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect ('separation'). We reveal that different approaches for identifying and handling separation leads to substantially different simulation results. We further show that Firth's correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.

  9. Logistic Regression: Concept and Application

    ERIC Educational Resources Information Center

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  10. Ecologic regression analysis and the study of the influence of air quality on mortality.

    PubMed Central

    Selvin, S; Merrill, D; Wong, L; Sacks, S T

    1984-01-01

    This presentation focuses entirely on the use and evaluation of regression analysis applied to ecologic data as a method to study the effects of ambient air pollution on mortality rates. Using extensive national data on mortality, air quality and socio-economic status regression analyses are used to study the influence of air quality on mortality. The analytic methods and data are selected in such a way that direct comparisons can be made with other ecologic regression studies of mortality and air quality. Analyses are performed by use of two types of geographic areas, age-specific mortality of both males and females and three pollutants (total suspended particulates, sulfur dioxide and nitrogen dioxide). The overall results indicate no persuasive evidence exists of a link between air quality and general mortality levels. Additionally, a lack of consistency between the present results and previous published work is noted. Overall, it is concluded that linear regression analysis applied to nationally collected ecologic data cannot be used to usefully infer a causal relationship between air quality and mortality which is in direct contradiction to other major published studies. PMID:6734568

  11. Spatial regression analysis of traffic crashes in Seoul.

    PubMed

    Rhee, Kyoung-Ah; Kim, Joon-Ki; Lee, Young-ihn; Ulfarsson, Gudmundur F

    2016-06-01

    Traffic crashes can be spatially correlated events and the analysis of the distribution of traffic crash frequency requires evaluation of parameters that reflect spatial properties and correlation. Typically this spatial aspect of crash data is not used in everyday practice by planning agencies and this contributes to a gap between research and practice. A database of traffic crashes in Seoul, Korea, in 2010 was developed at the traffic analysis zone (TAZ) level with a number of GIS developed spatial variables. Practical spatial models using available software were estimated. The spatial error model was determined to be better than the spatial lag model and an ordinary least squares baseline regression. A geographically weighted regression model provided useful insights about localization of effects. The results found that an increased length of roads with speed limit below 30 km/h and a higher ratio of residents below age of 15 were correlated with lower traffic crash frequency, while a higher ratio of residents who moved to the TAZ, more vehicle-kilometers traveled, and a greater number of access points with speed limit difference between side roads and mainline above 30 km/h all increased the number of traffic crashes. This suggests, for example, that better control or design for merging lower speed roads with higher speed roads is important. A key result is that the length of bus-only center lanes had the largest effect on increasing traffic crashes. This is important as bus-only center lanes with bus stop islands have been increasingly used to improve transit times. Hence the potential negative safety impacts of such systems need to be studied further and mitigated through improved design of pedestrian access to center bus stop islands. Copyright © 2016 Elsevier Ltd. All rights reserved.

  12. Regression analysis of mixed recurrent-event and panel-count data

    PubMed Central

    Zhu, Liang; Tong, Xinwei; Sun, Jianguo; Chen, Manhua; Srivastava, Deo Kumar; Leisenring, Wendy; Robison, Leslie L.

    2014-01-01

    In event history studies concerning recurrent events, two types of data have been extensively discussed. One is recurrent-event data (Cook and Lawless, 2007. The Analysis of Recurrent Event Data. New York: Springer), and the other is panel-count data (Zhao and others, 2010. Nonparametric inference based on panel-count data. Test 20, 1–42). In the former case, all study subjects are monitored continuously; thus, complete information is available for the underlying recurrent-event processes of interest. In the latter case, study subjects are monitored periodically; thus, only incomplete information is available for the processes of interest. In reality, however, a third type of data could occur in which some study subjects are monitored continuously, but others are monitored periodically. When this occurs, we have mixed recurrent-event and panel-count data. This paper discusses regression analysis of such mixed data and presents two estimation procedures for the problem. One is a maximum likelihood estimation procedure, and the other is an estimating equation procedure. The asymptotic properties of both resulting estimators of regression parameters are established. Also, the methods are applied to a set of mixed recurrent-event and panel-count data that arose from a Childhood Cancer Survivor Study and motivated this investigation. PMID:24648408

  13. Estimate the contribution of incubation parameters influence egg hatchability using multiple linear regression analysis

    PubMed Central

    Khalil, Mohamed H.; Shebl, Mostafa K.; Kosba, Mohamed A.; El-Sabrout, Karim; Zaki, Nesma

    2016-01-01

    Aim: This research was conducted to determine the most affecting parameters on hatchability of indigenous and improved local chickens’ eggs. Materials and Methods: Five parameters were studied (fertility, early and late embryonic mortalities, shape index, egg weight, and egg weight loss) on four strains, namely Fayoumi, Alexandria, Matrouh, and Montazah. Multiple linear regression was performed on the studied parameters to determine the most influencing one on hatchability. Results: The results showed significant differences in commercial and scientific hatchability among strains. Alexandria strain has the highest significant commercial hatchability (80.70%). Regarding the studied strains, highly significant differences in hatching chick weight among strains were observed. Using multiple linear regression analysis, fertility made the greatest percent contribution (71.31%) to hatchability, and the lowest percent contributions were made by shape index and egg weight loss. Conclusion: A prediction of hatchability using multiple regression analysis could be a good tool to improve hatchability percentage in chickens. PMID:27651666

  14. Sparse multivariate factor analysis regression models and its applications to integrative genomics analysis.

    PubMed

    Zhou, Yan; Wang, Pei; Wang, Xianlong; Zhu, Ji; Song, Peter X-K

    2017-01-01

    The multivariate regression model is a useful tool to explore complex associations between two kinds of molecular markers, which enables the understanding of the biological pathways underlying disease etiology. For a set of correlated response variables, accounting for such dependency can increase statistical power. Motivated by integrative genomic data analyses, we propose a new methodology-sparse multivariate factor analysis regression model (smFARM), in which correlations of response variables are assumed to follow a factor analysis model with latent factors. This proposed method not only allows us to address the challenge that the number of association parameters is larger than the sample size, but also to adjust for unobserved genetic and/or nongenetic factors that potentially conceal the underlying response-predictor associations. The proposed smFARM is implemented by the EM algorithm and the blockwise coordinate descent algorithm. The proposed methodology is evaluated and compared to the existing methods through extensive simulation studies. Our results show that accounting for latent factors through the proposed smFARM can improve sensitivity of signal detection and accuracy of sparse association map estimation. We illustrate smFARM by two integrative genomics analysis examples, a breast cancer dataset, and an ovarian cancer dataset, to assess the relationship between DNA copy numbers and gene expression arrays to understand genetic regulatory patterns relevant to the disease. We identify two trans-hub regions: one in cytoband 17q12 whose amplification influences the RNA expression levels of important breast cancer genes, and the other in cytoband 9q21.32-33, which is associated with chemoresistance in ovarian cancer. © 2016 WILEY PERIODICALS, INC.

  15. Advanced statistics: linear regression, part II: multiple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.

  16. Sibling dilution hypothesis: a regression surface analysis.

    PubMed

    Marjoribanks, K

    2001-08-01

    This study examined relationships between sibship size (the number of children in a family), birth order, and measures of academic performance, academic self-concept, and educational aspirations at different levels of family educational resources. As part of a national longitudinal study of Australian secondary school students data were collected from 2,530 boys and 2,450 girls in Years 9 and 10. Regression surfaces were constructed from models that included terms to account for linear, interaction, and curvilinear associations among the variables. Analysis suggests the general propositions (a) family educational resources have significant associations with children's school-related outcomes at different levels of sibling variables, the relationships for girls being curvilinear, and (b) sibling variables continue to have small significant associations with affective and cognitive outcomes, after taking into account variations in family educational resources. That is, the investigation provides only partial support for the sibling dilution hypothesis.

  17. Weighted functional linear regression models for gene-based association analysis.

    PubMed

    Belonogova, Nadezhda M; Svishcheva, Gulnara R; Wilson, James F; Campbell, Harry; Axenovich, Tatiana I

    2018-01-01

    Functional linear regression models are effectively used in gene-based association analysis of complex traits. These models combine information about individual genetic variants, taking into account their positions and reducing the influence of noise and/or observation errors. To increase the power of methods, where several differently informative components are combined, weights are introduced to give the advantage to more informative components. Allele-specific weights have been introduced to collapsing and kernel-based approaches to gene-based association analysis. Here we have for the first time introduced weights to functional linear regression models adapted for both independent and family samples. Using data simulated on the basis of GAW17 genotypes and weights defined by allele frequencies via the beta distribution, we demonstrated that type I errors correspond to declared values and that increasing the weights of causal variants allows the power of functional linear models to be increased. We applied the new method to real data on blood pressure from the ORCADES sample. Five of the six known genes with P < 0.1 in at least one analysis had lower P values with weighted models. Moreover, we found an association between diastolic blood pressure and the VMP1 gene (P = 8.18×10-6), when we used a weighted functional model. For this gene, the unweighted functional and weighted kernel-based models had P = 0.004 and 0.006, respectively. The new method has been implemented in the program package FREGAT, which is freely available at https://cran.r-project.org/web/packages/FREGAT/index.html.

  18. Detrended fluctuation analysis as a regression framework: Estimating dependence at different scales

    NASA Astrophysics Data System (ADS)

    Kristoufek, Ladislav

    2015-02-01

    We propose a framework combining detrended fluctuation analysis with standard regression methodology. The method is built on detrended variances and covariances and it is designed to estimate regression parameters at different scales and under potential nonstationarity and power-law correlations. The former feature allows for distinguishing between effects for a pair of variables from different temporal perspectives. The latter ones make the method a significant improvement over the standard least squares estimation. Theoretical claims are supported by Monte Carlo simulations. The method is then applied on selected examples from physics, finance, environmental science, and epidemiology. For most of the studied cases, the relationship between variables of interest varies strongly across scales.

  19. Statistical methods and regression analysis of stratospheric ozone and meteorological variables in Isfahan

    NASA Astrophysics Data System (ADS)

    Hassanzadeh, S.; Hosseinibalam, F.; Omidvari, M.

    2008-04-01

    Data of seven meteorological variables (relative humidity, wet temperature, dry temperature, maximum temperature, minimum temperature, ground temperature and sun radiation time) and ozone values have been used for statistical analysis. Meteorological variables and ozone values were analyzed using both multiple linear regression and principal component methods. Data for the period 1999-2004 are analyzed jointly using both methods. For all periods, temperature dependent variables were highly correlated, but were all negatively correlated with relative humidity. Multiple regression analysis was used to fit the meteorological variables using the meteorological variables as predictors. A variable selection method based on high loading of varimax rotated principal components was used to obtain subsets of the predictor variables to be included in the linear regression model of the meteorological variables. In 1999, 2001 and 2002 one of the meteorological variables was weakly influenced predominantly by the ozone concentrations. However, the model did not predict that the meteorological variables for the year 2000 were not influenced predominantly by the ozone concentrations that point to variation in sun radiation. This could be due to other factors that were not explicitly considered in this study.

  20. Local linear regression for function learning: an analysis based on sample discrepancy.

    PubMed

    Cervellera, Cristiano; Macciò, Danilo

    2014-11-01

    Local linear regression models, a kind of nonparametric structures that locally perform a linear estimation of the target function, are analyzed in the context of empirical risk minimization (ERM) for function learning. The analysis is carried out with emphasis on geometric properties of the available data. In particular, the discrepancy of the observation points used both to build the local regression models and compute the empirical risk is considered. This allows to treat indifferently the case in which the samples come from a random external source and the one in which the input space can be freely explored. Both consistency of the ERM procedure and approximating capabilities of the estimator are analyzed, proving conditions to ensure convergence. Since the theoretical analysis shows that the estimation improves as the discrepancy of the observation points becomes smaller, low-discrepancy sequences, a family of sampling methods commonly employed for efficient numerical integration, are also analyzed. Simulation results involving two different examples of function learning are provided.

  1. Recent lung cancer mortality trends in Europe: effect of national smoke-free legislation strengthening.

    PubMed

    López-Campos, Jose L; Ruiz-Ramos, Miguel; Fernandez, Esteve; Soriano, Joan B

    2018-07-01

    The impact of smoke-free legislation within European Union (EU) countries on lung cancer mortality has not been evaluated to date. We aimed to determine lung cancer mortality trends in the EU-27 by sex, age, and calendar year for the period of 1994 and 2012, and relate them with changes in tobacco legislation at the national level. Deaths by Eurostat in each European country were analyzed, focusing on ICD-10 codes C33 and C34 from the years 1994 to 2012. Age-standardized mortality rates (ASR) were estimated separately for women and men in the EU-27 total and within country for each one of the years studied, and the significance of changing trends was estimated by joinpoint regression analysis, exploring lag times after initiation of smoke-free legislation in every country, if any. From 1994 to 2012, there were 4 681 877 deaths from lung cancer in Europe (3 491 607 in men and 1 190 180 in women) and a nearly linear decrease in mortality rates because of lung cancer in men from was observed1994 to 2012, mirrored in women by an upward trend, narrowing the sex gap during the study period from 5.1 in 1994 to 2.8 in 2012. Joinpoint regression analysis identified a number of trend changes over time, but it appears that they were unrelated to the implementation of smoke-free legislations. A few years after the introduction of smoke-free legislations across Europe, trends of lung cancer mortality trends have not changed.

  2. Weighted regression analysis and interval estimators

    Treesearch

    Donald W. Seegrist

    1974-01-01

    A method for deriving the weighted least squares estimators for the parameters of a multiple regression model. Confidence intervals for expected values, and prediction intervals for the means of future samples are given.

  3. A menu-driven software package of Bayesian nonparametric (and parametric) mixed models for regression analysis and density estimation.

    PubMed

    Karabatsos, George

    2017-02-01

    Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected

  4. Application of Regression-Discontinuity Analysis in Pharmaceutical Health Services Research

    PubMed Central

    Zuckerman, Ilene H; Lee, Euni; Wutoh, Anthony K; Xue, Zhenyi; Stuart, Bruce

    2006-01-01

    Objective To demonstrate how a relatively underused design, regression-discontinuity (RD), can provide robust estimates of intervention effects when stronger designs are impossible to implement. Data Sources/Study Setting Administrative claims from a Mid-Atlantic state Medicaid program were used to evaluate the effectiveness of an educational drug utilization review intervention. Study Design Quasi-experimental design. Data Collection/Extraction Methods A drug utilization review study was conducted to evaluate a letter intervention to physicians treating Medicaid children with potentially excessive use of short-acting β2-agonist inhalers (SAB). The outcome measure is change in seasonally-adjusted SAB use 5 months pre- and postintervention. To determine if the intervention reduced monthly SAB utilization, results from an RD analysis are compared to findings from a pretest–posttest design using repeated-measure ANOVA. Principal Findings Both analyses indicated that the intervention significantly reduced SAB use among the high users. Average monthly SAB use declined by 0.9 canisters per month (p<.001) according to the repeated-measure ANOVA and by 0.2 canisters per month (p<.001) from RD analysis. Conclusions Regression-discontinuity design is a useful quasi-experimental methodology that has significant advantages in internal validity compared to other pre–post designs when assessing interventions in which subjects' assignment is based on cutoff scores for a critical variable. PMID:16584464

  5. Regression analysis of mixed recurrent-event and panel-count data.

    PubMed

    Zhu, Liang; Tong, Xinwei; Sun, Jianguo; Chen, Manhua; Srivastava, Deo Kumar; Leisenring, Wendy; Robison, Leslie L

    2014-07-01

    In event history studies concerning recurrent events, two types of data have been extensively discussed. One is recurrent-event data (Cook and Lawless, 2007. The Analysis of Recurrent Event Data. New York: Springer), and the other is panel-count data (Zhao and others, 2010. Nonparametric inference based on panel-count data. Test 20: , 1-42). In the former case, all study subjects are monitored continuously; thus, complete information is available for the underlying recurrent-event processes of interest. In the latter case, study subjects are monitored periodically; thus, only incomplete information is available for the processes of interest. In reality, however, a third type of data could occur in which some study subjects are monitored continuously, but others are monitored periodically. When this occurs, we have mixed recurrent-event and panel-count data. This paper discusses regression analysis of such mixed data and presents two estimation procedures for the problem. One is a maximum likelihood estimation procedure, and the other is an estimating equation procedure. The asymptotic properties of both resulting estimators of regression parameters are established. Also, the methods are applied to a set of mixed recurrent-event and panel-count data that arose from a Childhood Cancer Survivor Study and motivated this investigation. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  6. Regression Analysis of Mixed Panel Count Data with Dependent Terminal Events

    PubMed Central

    Yu, Guanglei; Zhu, Liang; Li, Yang; Sun, Jianguo; Robison, Leslie L.

    2017-01-01

    Event history studies are commonly conducted in many fields and a great deal of literature has been established for the analysis of the two types of data commonly arising from these studies: recurrent event data and panel count data. The former arises if all study subjects are followed continuously, while the latter means that each study subject is observed only at discrete time points. In reality, a third type of data, a mixture of the two types of the data above, may occur and furthermore, as with the first two types of the data, there may exist a dependent terminal event, which may preclude the occurrences of recurrent events of interest. This paper discusses regression analysis of mixed recurrent event and panel count data in the presence of a terminal event and an estimating equation-based approach is proposed for estimation of regression parameters of interest. In addition, the asymptotic properties of the proposed estimator are established and a simulation study conducted to assess the finite-sample performance of the proposed method suggests that it works well in practical situations. Finally the methodology is applied to a childhood cancer study that motivated this study. PMID:28098397

  7. Kernel analysis of partial least squares (PLS) regression models.

    PubMed

    Shinzawa, Hideyuki; Ritthiruangdej, Pitiporn; Ozaki, Yukihiro

    2011-05-01

    An analytical technique based on kernel matrix representation is demonstrated to provide further chemically meaningful insight into partial least squares (PLS) regression models. The kernel matrix condenses essential information about scores derived from PLS or principal component analysis (PCA). Thus, it becomes possible to establish the proper interpretation of the scores. A PLS model for the total nitrogen (TN) content in multiple Thai fish sauces is built with a set of near-infrared (NIR) transmittance spectra of the fish sauce samples. The kernel analysis of the scores effectively reveals that the variation of the spectral feature induced by the change in protein content is substantially associated with the total water content and the protein hydration. Kernel analysis is also carried out on a set of time-dependent infrared (IR) spectra representing transient evaporation of ethanol from a binary mixture solution of ethanol and oleic acid. A PLS model to predict the elapsed time is built with the IR spectra and the kernel matrix is derived from the scores. The detailed analysis of the kernel matrix provides penetrating insight into the interaction between the ethanol and the oleic acid.

  8. A refined method for multivariate meta-analysis and meta-regression

    PubMed Central

    Jackson, Daniel; Riley, Richard D

    2014-01-01

    Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects’ standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd. PMID:23996351

  9. A refined method for multivariate meta-analysis and meta-regression.

    PubMed

    Jackson, Daniel; Riley, Richard D

    2014-02-20

    Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects' standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. Copyright © 2013 John Wiley & Sons, Ltd.

  10. [Local Regression Algorithm Based on Net Analyte Signal and Its Application in Near Infrared Spectral Analysis].

    PubMed

    Zhang, Hong-guang; Lu, Jian-gang

    2016-02-01

    Abstract To overcome the problems of significant difference among samples and nonlinearity between the property and spectra of samples in spectral quantitative analysis, a local regression algorithm is proposed in this paper. In this algorithm, net signal analysis method(NAS) was firstly used to obtain the net analyte signal of the calibration samples and unknown samples, then the Euclidean distance between net analyte signal of the sample and net analyte signal of calibration samples was calculated and utilized as similarity index. According to the defined similarity index, the local calibration sets were individually selected for each unknown sample. Finally, a local PLS regression model was built on each local calibration sets for each unknown sample. The proposed method was applied to a set of near infrared spectra of meat samples. The results demonstrate that the prediction precision and model complexity of the proposed method are superior to global PLS regression method and conventional local regression algorithm based on spectral Euclidean distance.

  11. Criteria for the use of regression analysis for remote sensing of sediment and pollutants

    NASA Technical Reports Server (NTRS)

    Whitlock, C. H.; Kuo, C. Y.; Lecroy, S. R. (Principal Investigator)

    1982-01-01

    Data analysis procedures for quantification of water quality parameters that are already identified and are known to exist within the water body are considered. The liner multiple-regression technique was examined as a procedure for defining and calibrating data analysis algorithms for such instruments as spectrometers and multispectral scanners.

  12. The Variance Normalization Method of Ridge Regression Analysis.

    ERIC Educational Resources Information Center

    Bulcock, J. W.; And Others

    The testing of contemporary sociological theory often calls for the application of structural-equation models to data which are inherently collinear. It is shown that simple ridge regression, which is commonly used for controlling the instability of ordinary least squares regression estimates in ill-conditioned data sets, is not a legitimate…

  13. Prediction of hearing outcomes by multiple regression analysis in patients with idiopathic sudden sensorineural hearing loss.

    PubMed

    Suzuki, Hideaki; Tabata, Takahisa; Koizumi, Hiroki; Hohchi, Nobusuke; Takeuchi, Shoko; Kitamura, Takuro; Fujino, Yoshihisa; Ohbuchi, Toyoaki

    2014-12-01

    This study aimed to create a multiple regression model for predicting hearing outcomes of idiopathic sudden sensorineural hearing loss (ISSNHL). The participants were 205 consecutive patients (205 ears) with ISSNHL (hearing level ≥ 40 dB, interval between onset and treatment ≤ 30 days). They received systemic steroid administration combined with intratympanic steroid injection. Data were examined by simple and multiple regression analyses. Three hearing indices (percentage hearing improvement, hearing gain, and posttreatment hearing level [HLpost]) and 7 prognostic factors (age, days from onset to treatment, initial hearing level, initial hearing level at low frequencies, initial hearing level at high frequencies, presence of vertigo, and contralateral hearing level) were included in the multiple regression analysis as dependent and explanatory variables, respectively. In the simple regression analysis, the percentage hearing improvement, hearing gain, and HLpost showed significant correlation with 2, 5, and 6 of the 7 prognostic factors, respectively. The multiple correlation coefficients were 0.396, 0.503, and 0.714 for the percentage hearing improvement, hearing gain, and HLpost, respectively. Predicted values of HLpost calculated by the multiple regression equation were reliable with 70% probability with a 40-dB-width prediction interval. Prediction of HLpost by the multiple regression model may be useful to estimate the hearing prognosis of ISSNHL. © The Author(s) 2014.

  14. [Application of negative binomial regression and modified Poisson regression in the research of risk factors for injury frequency].

    PubMed

    Cao, Qingqing; Wu, Zhenqiang; Sun, Ying; Wang, Tiezhu; Han, Tengwei; Gu, Chaomei; Sun, Yehuan

    2011-11-01

    To Eexplore the application of negative binomial regression and modified Poisson regression analysis in analyzing the influential factors for injury frequency and the risk factors leading to the increase of injury frequency. 2917 primary and secondary school students were selected from Hefei by cluster random sampling method and surveyed by questionnaire. The data on the count event-based injuries used to fitted modified Poisson regression and negative binomial regression model. The risk factors incurring the increase of unintentional injury frequency for juvenile students was explored, so as to probe the efficiency of these two models in studying the influential factors for injury frequency. The Poisson model existed over-dispersion (P < 0.0001) based on testing by the Lagrangemultiplier. Therefore, the over-dispersion dispersed data using a modified Poisson regression and negative binomial regression model, was fitted better. respectively. Both showed that male gender, younger age, father working outside of the hometown, the level of the guardian being above junior high school and smoking might be the results of higher injury frequencies. On a tendency of clustered frequency data on injury event, both the modified Poisson regression analysis and negative binomial regression analysis can be used. However, based on our data, the modified Poisson regression fitted better and this model could give a more accurate interpretation of relevant factors affecting the frequency of injury.

  15. Neck-focused panic attacks among Cambodian refugees; a logistic and linear regression analysis.

    PubMed

    Hinton, Devon E; Chhean, Dara; Pich, Vuth; Um, Khin; Fama, Jeanne M; Pollack, Mark H

    2006-01-01

    Consecutive Cambodian refugees attending a psychiatric clinic were assessed for the presence and severity of current--i.e., at least one episode in the last month--neck-focused panic. Among the whole sample (N=130), in a logistic regression analysis, the Anxiety Sensitivity Index (ASI; odds ratio=3.70) and the Clinician-Administered PTSD Scale (CAPS; odds ratio=2.61) significantly predicted the presence of current neck panic (NP). Among the neck panic patients (N=60), in the linear regression analysis, NP severity was significantly predicted by NP-associated flashbacks (beta=.42), NP-associated catastrophic cognitions (beta=.22), and CAPS score (beta=.28). Further analysis revealed the effect of the CAPS score to be significantly mediated (Sobel test [Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182]) by both NP-associated flashbacks and catastrophic cognitions. In the care of traumatized Cambodian refugees, NP severity, as well as NP-associated flashbacks and catastrophic cognitions, should be specifically assessed and treated.

  16. Dose-Dependent Effects of Statins for Patients with Aneurysmal Subarachnoid Hemorrhage: Meta-Regression Analysis.

    PubMed

    To, Minh-Son; Prakash, Shivesh; Poonnoose, Santosh I; Bihari, Shailesh

    2018-05-01

    The study uses meta-regression analysis to quantify the dose-dependent effects of statin pharmacotherapy on vasospasm, delayed ischemic neurologic deficits (DIND), and mortality in aneurysmal subarachnoid hemorrhage. Prospective, retrospective observational studies, and randomized controlled trials (RCTs) were retrieved by a systematic database search. Summary estimates were expressed as absolute risk (AR) for a given statin dose or control (placebo). Meta-regression using inverse variance weighting and robust variance estimation was performed to assess the effect of statin dose on transformed AR in a random effects model. Dose-dependence of predicted AR with 95% confidence interval (CI) was recovered by using Miller's Freeman-Tukey inverse. The database search and study selection criteria yielded 18 studies (2594 patients) for analysis. These included 12 RCTs, 4 retrospective observational studies, and 2 prospective observational studies. Twelve studies investigated simvastatin, whereas the remaining studies investigated atorvastatin, pravastatin, or pitavastatin, with simvastatin-equivalent doses ranging from 20 to 80 mg. Meta-regression revealed dose-dependent reductions in Freeman-Tukey-transformed AR of vasospasm (slope coefficient -0.00404, 95% CI -0.00720 to -0.00087; P = 0.0321), DIND (slope coefficient -0.00316, 95% CI -0.00586 to -0.00047; P = 0.0392), and mortality (slope coefficient -0.00345, 95% CI -0.00623 to -0.00067; P = 0.0352). The present meta-regression provides weak evidence for dose-dependent reductions in vasospasm, DIND and mortality associated with acute statin use after aneurysmal subarachnoid hemorrhage. However, the analysis was limited by substantial heterogeneity among individual studies. Greater dosing strategies are a potential consideration for future RCTs. Copyright © 2018 Elsevier Inc. All rights reserved.

  17. Utility-Based Instruments for People with Dementia: A Systematic Review and Meta-Regression Analysis.

    PubMed

    Li, Li; Nguyen, Kim-Huong; Comans, Tracy; Scuffham, Paul

    2018-04-01

    Several utility-based instruments have been applied in cost-utility analysis to assess health state values for people with dementia. Nevertheless, concerns and uncertainty regarding their performance for people with dementia have been raised. To assess the performance of available utility-based instruments for people with dementia by comparing their psychometric properties and to explore factors that cause variations in the reported health state values generated from those instruments by conducting meta-regression analyses. A literature search was conducted and psychometric properties were synthesized to demonstrate the overall performance of each instrument. When available, health state values and variables such as the type of instrument and cognitive impairment levels were extracted from each article. A meta-regression analysis was undertaken and available covariates were included in the models. A total of 64 studies providing preference-based values were identified and included. The EuroQol five-dimension questionnaire demonstrated the best combination of feasibility, reliability, and validity. Meta-regression analyses suggested that significant differences exist between instruments, type of respondents, and mode of administration and the variations in estimated utility values had influences on incremental quality-adjusted life-year calculation. This review finds that the EuroQol five-dimension questionnaire is the most valid utility-based instrument for people with dementia, but should be replaced by others under certain circumstances. Although no utility estimates were reported in the article, the meta-regression analyses that examined variations in utility estimates produced by different instruments impact on cost-utility analysis, potentially altering the decision-making process in some circumstances. Copyright © 2018 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  18. Multiple regression analysis of anthropometric measurements influencing the cephalic index of male Japanese university students.

    PubMed

    Hossain, Md Golam; Saw, Aik; Alam, Rashidul; Ohtsuki, Fumio; Kamarul, Tunku

    2013-09-01

    Cephalic index (CI), the ratio of head breadth to head length, is widely used to categorise human populations. The aim of this study was to access the impact of anthropometric measurements on the CI of male Japanese university students. This study included 1,215 male university students from Tokyo and Kyoto, selected using convenient sampling. Multiple regression analysis was used to determine the effect of anthropometric measurements on CI. The variance inflation factor (VIF) showed no evidence of a multicollinearity problem among independent variables. The coefficients of the regression line demonstrated a significant positive relationship between CI and minimum frontal breadth (p < 0.01), bizygomatic breadth (p < 0.01) and head height (p < 0.05), and a negative relationship between CI and morphological facial height (p < 0.01) and head circumference (p < 0.01). Moreover, the coefficient and odds ratio of logistic regression analysis showed a greater likelihood for minimum frontal breadth (p < 0.01) and bizygomatic breadth (p < 0.01) to predict round-headedness, and morphological facial height (p < 0.05) and head circumference (p < 0.01) to predict long-headedness. Stepwise regression analysis revealed bizygomatic breadth, head circumference, minimum frontal breadth, head height and morphological facial height to be the best predictor craniofacial measurements with respect to CI. The results suggest that most of the variables considered in this study appear to influence the CI of adult male Japanese students.

  19. Logistic Regression and Path Analysis Method to Analyze Factors influencing Students’ Achievement

    NASA Astrophysics Data System (ADS)

    Noeryanti, N.; Suryowati, K.; Setyawan, Y.; Aulia, R. R.

    2018-04-01

    Students' academic achievement cannot be separated from the influence of two factors namely internal and external factors. The first factors of the student (internal factors) consist of intelligence (X1), health (X2), interest (X3), and motivation of students (X4). The external factors consist of family environment (X5), school environment (X6), and society environment (X7). The objects of this research are eighth grade students of the school year 2016/2017 at SMPN 1 Jiwan Madiun sampled by using simple random sampling. Primary data are obtained by distributing questionnaires. The method used in this study is binary logistic regression analysis that aims to identify internal and external factors that affect student’s achievement and how the trends of them. Path Analysis was used to determine the factors that influence directly, indirectly or totally on student’s achievement. Based on the results of binary logistic regression, variables that affect student’s achievement are interest and motivation. And based on the results obtained by path analysis, factors that have a direct impact on student’s achievement are students’ interest (59%) and students’ motivation (27%). While the factors that have indirect influences on students’ achievement, are family environment (97%) and school environment (37).

  20. Marital status integration and suicide: A meta-analysis and meta-regression.

    PubMed

    Kyung-Sook, Woo; SangSoo, Shin; Sangjin, Shin; Young-Jeon, Shin

    2018-01-01

    Marital status is an index of the phenomenon of social integration within social structures and has long been identified as an important predictor suicide. However, previous meta-analyses have focused only on a particular marital status, or not sufficiently explored moderators. A meta-analysis of observational studies was conducted to explore the relationships between marital status and suicide and to understand the important moderating factors in this association. Electronic databases were searched to identify studies conducted between January 1, 2000 and June 30, 2016. We performed a meta-analysis, subgroup analysis, and meta-regression of 170 suicide risk estimates from 36 publications. Using random effects model with adjustment for covariates, the study found that the suicide risk for non-married versus married was OR = 1.92 (95% CI: 1.75-2.12). The suicide risk was higher for non-married individuals aged <65 years than for those aged ≥65 years, and higher for men than for women. According to the results of stratified analysis by gender, non-married men exhibited a greater risk of suicide than their married counterparts in all sub-analyses, but women aged 65 years or older showed no significant association between marital status and suicide. The suicide risk in divorced individuals was higher than for non-married individuals in both men and women. The meta-regression showed that gender, age, and sample size affected between-study variation. The results of the study indicated that non-married individuals have an aggregate higher suicide risk than married ones. In addition, gender and age were confirmed as important moderating factors in the relationship between marital status and suicide. Copyright © 2017 Elsevier Ltd. All rights reserved.

  1. Regression analysis of mixed panel count data with dependent terminal events.

    PubMed

    Yu, Guanglei; Zhu, Liang; Li, Yang; Sun, Jianguo; Robison, Leslie L

    2017-05-10

    Event history studies are commonly conducted in many fields, and a great deal of literature has been established for the analysis of the two types of data commonly arising from these studies: recurrent event data and panel count data. The former arises if all study subjects are followed continuously, while the latter means that each study subject is observed only at discrete time points. In reality, a third type of data, a mixture of the two types of the data earlier, may occur and furthermore, as with the first two types of the data, there may exist a dependent terminal event, which may preclude the occurrences of recurrent events of interest. This paper discusses regression analysis of mixed recurrent event and panel count data in the presence of a terminal event and an estimating equation-based approach is proposed for estimation of regression parameters of interest. In addition, the asymptotic properties of the proposed estimator are established, and a simulation study conducted to assess the finite-sample performance of the proposed method suggests that it works well in practical situations. Finally, the methodology is applied to a childhood cancer study that motivated this study. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  2. Spatial regression analysis on 32 years of total column ozone data

    NASA Astrophysics Data System (ADS)

    Knibbe, J. S.; van der A, R. J.; de Laat, A. T. J.

    2014-08-01

    Multiple-regression analyses have been performed on 32 years of total ozone column data that was spatially gridded with a 1 × 1.5° resolution. The total ozone data consist of the MSR (Multi Sensor Reanalysis; 1979-2008) and 2 years of assimilated SCIAMACHY (SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY) ozone data (2009-2010). The two-dimensionality in this data set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on nonseasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Niño-Southern Oscillation (ENSO) and stratospheric alternative halogens which are parameterized by the effective equivalent stratospheric chlorine (EESC). For several explanatory variables, seasonally adjusted versions of these explanatory variables are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to that of a similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. The EESC has a significant depleting effect on ozone at mid- and high latitudes, the solar cycle affects ozone positively mostly in the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high northern latitudes, the effect of QBO is positive and negative in the tropics and mid- to high latitudes, respectively, and ENSO affects ozone negatively

  3. Automated particle identification through regression analysis of size, shape and colour

    NASA Astrophysics Data System (ADS)

    Rodriguez Luna, J. C.; Cooper, J. M.; Neale, S. L.

    2016-04-01

    Rapid point of care diagnostic tests and tests to provide therapeutic information are now available for a range of specific conditions from the measurement of blood glucose levels for diabetes to card agglutination tests for parasitic infections. Due to a lack of specificity these test are often then backed up by more conventional lab based diagnostic methods for example a card agglutination test may be carried out for a suspected parasitic infection in the field and if positive a blood sample can then be sent to a lab for confirmation. The eventual diagnosis is often achieved by microscopic examination of the sample. In this paper we propose a computerized vision system for aiding in the diagnostic process; this system used a novel particle recognition algorithm to improve specificity and speed during the diagnostic process. We will show the detection and classification of different types of cells in a diluted blood sample using regression analysis of their size, shape and colour. The first step is to define the objects to be tracked by a Gaussian Mixture Model for background subtraction and binary opening and closing for noise suppression. After subtracting the objects of interest from the background the next challenge is to predict if a given object belongs to a certain category or not. This is a classification problem, and the output of the algorithm is a Boolean value (true/false). As such the computer program should be able to "predict" with reasonable level of confidence if a given particle belongs to the kind we are looking for or not. We show the use of a binary logistic regression analysis with three continuous predictors: size, shape and color histogram. The results suggest this variables could be very useful in a logistic regression equation as they proved to have a relatively high predictive value on their own.

  4. CUSUM-Logistic Regression analysis for the rapid detection of errors in clinical laboratory test results.

    PubMed

    Sampson, Maureen L; Gounden, Verena; van Deventer, Hendrik E; Remaley, Alan T

    2016-02-01

    The main drawback of the periodic analysis of quality control (QC) material is that test performance is not monitored in time periods between QC analyses, potentially leading to the reporting of faulty test results. The objective of this study was to develop a patient based QC procedure for the more timely detection of test errors. Results from a Chem-14 panel measured on the Beckman LX20 analyzer were used to develop the model. Each test result was predicted from the other 13 members of the panel by multiple regression, which resulted in correlation coefficients between the predicted and measured result of >0.7 for 8 of the 14 tests. A logistic regression model, which utilized the measured test result, the predicted test result, the day of the week and time of day, was then developed for predicting test errors. The output of the logistic regression was tallied by a daily CUSUM approach and used to predict test errors, with a fixed specificity of 90%. The mean average run length (ARL) before error detection by CUSUM-Logistic Regression (CSLR) was 20 with a mean sensitivity of 97%, which was considerably shorter than the mean ARL of 53 (sensitivity 87.5%) for a simple prediction model that only used the measured result for error detection. A CUSUM-Logistic Regression analysis of patient laboratory data can be an effective approach for the rapid and sensitive detection of clinical laboratory errors. Published by Elsevier Inc.

  5. Monitoring heavy metal Cr in soil based on hyperspectral data using regression analysis

    NASA Astrophysics Data System (ADS)

    Zhang, Ningyu; Xu, Fuyun; Zhuang, Shidong; He, Changwei

    2016-10-01

    Heavy metal pollution in soils is one of the most critical problems in the global ecology and environment safety nowadays. Hyperspectral remote sensing and its application is capable of high speed, low cost, less risk and less damage, and provides a good method for detecting heavy metals in soil. This paper proposed a new idea of applying regression analysis of stepwise multiple regression between the spectral data and monitoring the amount of heavy metal Cr by sample points in soil for environmental protection. In the measurement, a FieldSpec HandHeld spectroradiometer is used to collect reflectance spectra of sample points over the wavelength range of 325-1075 nm. Then the spectral data measured by the spectroradiometer is preprocessed to reduced the influence of the external factors, and the preprocessed methods include first-order differential equation, second-order differential equation and continuum removal method. The algorithms of stepwise multiple regression are established accordingly, and the accuracy of each equation is tested. The results showed that the accuracy of first-order differential equation works best, which makes it feasible to predict the content of heavy metal Cr by using stepwise multiple regression.

  6. Multiple Linear Regression Analysis of Factors Affecting Real Property Price Index From Case Study Research In Istanbul/Turkey

    NASA Astrophysics Data System (ADS)

    Denli, H. H.; Koc, Z.

    2015-12-01

    Estimation of real properties depending on standards is difficult to apply in time and location. Regression analysis construct mathematical models which describe or explain relationships that may exist between variables. The problem of identifying price differences of properties to obtain a price index can be converted into a regression problem, and standard techniques of regression analysis can be used to estimate the index. Considering regression analysis for real estate valuation, which are presented in real marketing process with its current characteristics and quantifiers, the method will help us to find the effective factors or variables in the formation of the value. In this study, prices of housing for sale in Zeytinburnu, a district in Istanbul, are associated with its characteristics to find a price index, based on information received from a real estate web page. The associated variables used for the analysis are age, size in m2, number of floors having the house, floor number of the estate and number of rooms. The price of the estate represents the dependent variable, whereas the rest are independent variables. Prices from 60 real estates have been used for the analysis. Same price valued locations have been found and plotted on the map and equivalence curves have been drawn identifying the same valued zones as lines.

  7. Study on power grid characteristics in summer based on Linear regression analysis

    NASA Astrophysics Data System (ADS)

    Tang, Jin-hui; Liu, You-fei; Liu, Juan; Liu, Qiang; Liu, Zhuan; Xu, Xi

    2018-05-01

    The correlation analysis of power load and temperature is the precondition and foundation for accurate load prediction, and a great deal of research has been made. This paper constructed the linear correlation model between temperature and power load, then the correlation of fault maintenance work orders with the power load is researched. Data details of Jiangxi province in 2017 summer such as temperature, power load, fault maintenance work orders were adopted in this paper to develop data analysis and mining. Linear regression models established in this paper will promote electricity load growth forecast, fault repair work order review, distribution network operation weakness analysis and other work to further deepen the refinement.

  8. Prevalence of treponema species detected in endodontic infections: systematic review and meta-regression analysis.

    PubMed

    Leite, Fábio R M; Nascimento, Gustavo G; Demarco, Flávio F; Gomes, Brenda P F A; Pucci, Cesar R; Martinho, Frederico C

    2015-05-01

    This systematic review and meta-regression analysis aimed to calculate a combined prevalence estimate and evaluate the prevalence of different Treponema species in primary and secondary endodontic infections, including symptomatic and asymptomatic cases. The MEDLINE/PubMed, Embase, Scielo, Web of Knowledge, and Scopus databases were searched without starting date restriction up to and including March 2014. Only reports in English were included. The selected literature was reviewed by 2 authors and classified as suitable or not to be included in this review. Lists were compared, and, in case of disagreements, decisions were made after a discussion based on inclusion and exclusion criteria. A pooled prevalence of Treponema species in endodontic infections was estimated. Additionally, a meta-regression analysis was performed. Among the 265 articles identified in the initial search, only 51 were included in the final analysis. The studies were classified into 2 different groups according to the type of endodontic infection and whether it was an exclusively primary/secondary study (n = 36) or a primary/secondary comparison (n = 15). The pooled prevalence of Treponema species was 41.5% (95% confidence interval, 35.9-47.0). In the multivariate model of meta-regression analysis, primary endodontic infections (P < .001), acute apical abscess, symptomatic apical periodontitis (P < .001), and concomitant presence of 2 or more species (P = .028) explained the heterogeneity regarding the prevalence rates of Treponema species. Our findings suggest that Treponema species are important pathogens involved in endodontic infections, particularly in cases of primary and acute infections. Copyright © 2015 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.

  9. Regression Verification Using Impact Summaries

    NASA Technical Reports Server (NTRS)

    Backes, John; Person, Suzette J.; Rungta, Neha; Thachuk, Oksana

    2013-01-01

    Regression verification techniques are used to prove equivalence of syntactically similar programs. Checking equivalence of large programs, however, can be computationally expensive. Existing regression verification techniques rely on abstraction and decomposition techniques to reduce the computational effort of checking equivalence of the entire program. These techniques are sound but not complete. In this work, we propose a novel approach to improve scalability of regression verification by classifying the program behaviors generated during symbolic execution as either impacted or unimpacted. Our technique uses a combination of static analysis and symbolic execution to generate summaries of impacted program behaviors. The impact summaries are then checked for equivalence using an o-the-shelf decision procedure. We prove that our approach is both sound and complete for sequential programs, with respect to the depth bound of symbolic execution. Our evaluation on a set of sequential C artifacts shows that reducing the size of the summaries can help reduce the cost of software equivalence checking. Various reduction, abstraction, and compositional techniques have been developed to help scale software verification techniques to industrial-sized systems. Although such techniques have greatly increased the size and complexity of systems that can be checked, analysis of large software systems remains costly. Regression analysis techniques, e.g., regression testing [16], regression model checking [22], and regression verification [19], restrict the scope of the analysis by leveraging the differences between program versions. These techniques are based on the idea that if code is checked early in development, then subsequent versions can be checked against a prior (checked) version, leveraging the results of the previous analysis to reduce analysis cost of the current version. Regression verification addresses the problem of proving equivalence of closely related program

  10. Evaluation of logistic regression models and effect of covariates for case-control study in RNA-Seq analysis.

    PubMed

    Choi, Seung Hoan; Labadorf, Adam T; Myers, Richard H; Lunetta, Kathryn L; Dupuis, Josée; DeStefano, Anita L

    2017-02-06

    Next generation sequencing provides a count of RNA molecules in the form of short reads, yielding discrete, often highly non-normally distributed gene expression measurements. Although Negative Binomial (NB) regression has been generally accepted in the analysis of RNA sequencing (RNA-Seq) data, its appropriateness has not been exhaustively evaluated. We explore logistic regression as an alternative method for RNA-Seq studies designed to compare cases and controls, where disease status is modeled as a function of RNA-Seq reads using simulated and Huntington disease data. We evaluate the effect of adjusting for covariates that have an unknown relationship with gene expression. Finally, we incorporate the data adaptive method in order to compare false positive rates. When the sample size is small or the expression levels of a gene are highly dispersed, the NB regression shows inflated Type-I error rates but the Classical logistic and Bayes logistic (BL) regressions are conservative. Firth's logistic (FL) regression performs well or is slightly conservative. Large sample size and low dispersion generally make Type-I error rates of all methods close to nominal alpha levels of 0.05 and 0.01. However, Type-I error rates are controlled after applying the data adaptive method. The NB, BL, and FL regressions gain increased power with large sample size, large log2 fold-change, and low dispersion. The FL regression has comparable power to NB regression. We conclude that implementing the data adaptive method appropriately controls Type-I error rates in RNA-Seq analysis. Firth's logistic regression provides a concise statistical inference process and reduces spurious associations from inaccurately estimated dispersion parameters in the negative binomial framework.

  11. Selecting risk factors: a comparison of discriminant analysis, logistic regression and Cox's regression model using data from the Tromsø Heart Study.

    PubMed

    Brenn, T; Arnesen, E

    1985-01-01

    For comparative evaluation, discriminant analysis, logistic regression and Cox's model were used to select risk factors for total and coronary deaths among 6595 men aged 20-49 followed for 9 years. Groups with mortality between 5 and 93 per 1000 were considered. Discriminant analysis selected variable sets only marginally different from the logistic and Cox methods which always selected the same sets. A time-saving option, offered for both the logistic and Cox selection, showed no advantage compared with discriminant analysis. Analysing more than 3800 subjects, the logistic and Cox methods consumed, respectively, 80 and 10 times more computer time than discriminant analysis. When including the same set of variables in non-stepwise analyses, all methods estimated coefficients that in most cases were almost identical. In conclusion, discriminant analysis is advocated for preliminary or stepwise analysis, otherwise Cox's method should be used.

  12. What Satisfies Students?: Mining Student-Opinion Data with Regression and Decision Tree Analysis

    ERIC Educational Resources Information Center

    Thomas, Emily H.; Galambos, Nora

    2004-01-01

    To investigate how students' characteristics and experiences affect satisfaction, this study uses regression and decision tree analysis with the CHAID algorithm to analyze student-opinion data. A data mining approach identifies the specific aspects of students' university experience that most influence three measures of general satisfaction. The…

  13. Applied Multiple Linear Regression: A General Research Strategy

    ERIC Educational Resources Information Center

    Smith, Brandon B.

    1969-01-01

    Illustrates some of the basic concepts and procedures for using regression analysis in experimental design, analysis of variance, analysis of covariance, and curvilinear regression. Applications to evaluation of instruction and vocational education programs are illustrated. (GR)

  14. Ca analysis: an Excel based program for the analysis of intracellular calcium transients including multiple, simultaneous regression analysis.

    PubMed

    Greensmith, David J

    2014-01-01

    Here I present an Excel based program for the analysis of intracellular Ca transients recorded using fluorescent indicators. The program can perform all the necessary steps which convert recorded raw voltage changes into meaningful physiological information. The program performs two fundamental processes. (1) It can prepare the raw signal by several methods. (2) It can then be used to analyze the prepared data to provide information such as absolute intracellular Ca levels. Also, the rates of change of Ca can be measured using multiple, simultaneous regression analysis. I demonstrate that this program performs equally well as commercially available software, but has numerous advantages, namely creating a simplified, self-contained analysis workflow. Copyright © 2013 The Author. Published by Elsevier Ireland Ltd.. All rights reserved.

  15. Multiple Correlation versus Multiple Regression.

    ERIC Educational Resources Information Center

    Huberty, Carl J.

    2003-01-01

    Describes differences between multiple correlation analysis (MCA) and multiple regression analysis (MRA), showing how these approaches involve different research questions and study designs, different inferential approaches, different analysis strategies, and different reported information. (SLD)

  16. Support vector methods for survival analysis: a comparison between ranking and regression approaches.

    PubMed

    Van Belle, Vanya; Pelckmans, Kristiaan; Van Huffel, Sabine; Suykens, Johan A K

    2011-10-01

    To compare and evaluate ranking, regression and combined machine learning approaches for the analysis of survival data. The literature describes two approaches based on support vector machines to deal with censored observations. In the first approach the key idea is to rephrase the task as a ranking problem via the concordance index, a problem which can be solved efficiently in a context of structural risk minimization and convex optimization techniques. In a second approach, one uses a regression approach, dealing with censoring by means of inequality constraints. The goal of this paper is then twofold: (i) introducing a new model combining the ranking and regression strategy, which retains the link with existing survival models such as the proportional hazards model via transformation models; and (ii) comparison of the three techniques on 6 clinical and 3 high-dimensional datasets and discussing the relevance of these techniques over classical approaches fur survival data. We compare svm-based survival models based on ranking constraints, based on regression constraints and models based on both ranking and regression constraints. The performance of the models is compared by means of three different measures: (i) the concordance index, measuring the model's discriminating ability; (ii) the logrank test statistic, indicating whether patients with a prognostic index lower than the median prognostic index have a significant different survival than patients with a prognostic index higher than the median; and (iii) the hazard ratio after normalization to restrict the prognostic index between 0 and 1. Our results indicate a significantly better performance for models including regression constraints above models only based on ranking constraints. This work gives empirical evidence that svm-based models using regression constraints perform significantly better than svm-based models based on ranking constraints. Our experiments show a comparable performance for methods

  17. Efficiency Analysis: Enhancing the Statistical and Evaluative Power of the Regression-Discontinuity Design.

    ERIC Educational Resources Information Center

    Madhere, Serge

    An analytic procedure, efficiency analysis, is proposed for improving the utility of quantitative program evaluation for decision making. The three features of the procedure are explained: (1) for statistical control, it adopts and extends the regression-discontinuity design; (2) for statistical inferences, it de-emphasizes hypothesis testing in…

  18. Variable Selection in Logistic Regression.

    DTIC Science & Technology

    1987-06-01

    23 %. AUTIOR(.) S. CONTRACT OR GRANT NUMBE Rf.i %Z. D. Bai, P. R. Krishnaiah and . C. Zhao F49620-85- C-0008 " PERFORMING ORGANIZATION NAME AND AOORESS...d I7 IOK-TK- d 7 -I0 7’ VARIABLE SELECTION IN LOGISTIC REGRESSION Z. D. Bai, P. R. Krishnaiah and L. C. Zhao Center for Multivariate Analysis...University of Pittsburgh Center for Multivariate Analysis University of Pittsburgh Y !I VARIABLE SELECTION IN LOGISTIC REGRESSION Z- 0. Bai, P. R. Krishnaiah

  19. Differentiating regressed melanoma from regressed lichenoid keratosis.

    PubMed

    Chan, Aegean H; Shulman, Kenneth J; Lee, Bonnie A

    2017-04-01

    Distinguishing regressed lichen planus-like keratosis (LPLK) from regressed melanoma can be difficult on histopathologic examination, potentially resulting in mismanagement of patients. We aimed to identify histopathologic features by which regressed melanoma can be differentiated from regressed LPLK. Twenty actively inflamed LPLK, 12 LPLK with regression and 15 melanomas with regression were compared and evaluated by hematoxylin and eosin staining as well as Melan-A, microphthalmia transcription factor (MiTF) and cytokeratin (AE1/AE3) immunostaining. (1) A total of 40% of regressed melanomas showed complete or near complete loss of melanocytes within the epidermis with Melan-A and MiTF immunostaining, while 8% of regressed LPLK exhibited this finding. (2) Necrotic keratinocytes were seen in the epidermis in 33% regressed melanomas as opposed to all of the regressed LPLK. (3) A dense infiltrate of melanophages in the papillary dermis was seen in 40% of regressed melanomas, a feature not seen in regressed LPLK. In summary, our findings suggest that a complete or near complete loss of melanocytes within the epidermis strongly favors a regressed melanoma over a regressed LPLK. In addition, necrotic epidermal keratinocytes and the presence of a dense band-like distribution of dermal melanophages can be helpful in differentiating these lesions. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  20. Error analysis of leaf area estimates made from allometric regression models

    NASA Technical Reports Server (NTRS)

    Feiveson, A. H.; Chhikara, R. S.

    1986-01-01

    Biological net productivity, measured in terms of the change in biomass with time, affects global productivity and the quality of life through biochemical and hydrological cycles and by its effect on the overall energy balance. Estimating leaf area for large ecosystems is one of the more important means of monitoring this productivity. For a particular forest plot, the leaf area is often estimated by a two-stage process. In the first stage, known as dimension analysis, a small number of trees are felled so that their areas can be measured as accurately as possible. These leaf areas are then related to non-destructive, easily-measured features such as bole diameter and tree height, by using a regression model. In the second stage, the non-destructive features are measured for all or for a sample of trees in the plots and then used as input into the regression model to estimate the total leaf area. Because both stages of the estimation process are subject to error, it is difficult to evaluate the accuracy of the final plot leaf area estimates. This paper illustrates how a complete error analysis can be made, using an example from a study made on aspen trees in northern Minnesota. The study was a joint effort by NASA and the University of California at Santa Barbara known as COVER (Characterization of Vegetation with Remote Sensing).

  1. High-throughput quantitative biochemical characterization of algal biomass by NIR spectroscopy; multiple linear regression and multivariate linear regression analysis.

    PubMed

    Laurens, L M L; Wolfrum, E J

    2013-12-18

    One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.

  2. A tandem regression-outlier analysis of a ligand cellular system for key structural modifications around ligand binding.

    PubMed

    Lin, Ying-Ting

    2013-04-30

    A tandem technique of hard equipment is often used for the chemical analysis of a single cell to first isolate and then detect the wanted identities. The first part is the separation of wanted chemicals from the bulk of a cell; the second part is the actual detection of the important identities. To identify the key structural modifications around ligand binding, the present study aims to develop a counterpart of tandem technique for cheminformatics. A statistical regression and its outliers act as a computational technique for separation. A PPARγ (peroxisome proliferator-activated receptor gamma) agonist cellular system was subjected to such an investigation. Results show that this tandem regression-outlier analysis, or the prioritization of the context equations tagged with features of the outliers, is an effective regression technique of cheminformatics to detect key structural modifications, as well as their tendency of impact to ligand binding. The key structural modifications around ligand binding are effectively extracted or characterized out of cellular reactions. This is because molecular binding is the paramount factor in such ligand cellular system and key structural modifications around ligand binding are expected to create outliers. Therefore, such outliers can be captured by this tandem regression-outlier analysis.

  3. Quantitative characterization of the regressive ecological succession by fractal analysis of plant spatial patterns

    USGS Publications Warehouse

    Alados, C.L.; Pueyo, Y.; Giner, M.L.; Navarro, T.; Escos, J.; Barroso, F.; Cabezudo, B.; Emlen, J.M.

    2003-01-01

    We studied the effect of grazing on the degree of regression of successional vegetation dynamic in a semi-arid Mediterranean matorral. We quantified the spatial distribution patterns of the vegetation by fractal analyses, using the fractal information dimension and spatial autocorrelation measured by detrended fluctuation analyses (DFA). It is the first time that fractal analysis of plant spatial patterns has been used to characterize the regressive ecological succession. Plant spatial patterns were compared over a long-term grazing gradient (low, medium and heavy grazing pressure) and on ungrazed sites for two different plant communities: A middle dense matorral of Chamaerops and Periploca at Sabinar-Romeral and a middle dense matorral of Chamaerops, Rhamnus and Ulex at Requena-Montano. The two communities differed also in the microclimatic characteristics (sea oriented at the Sabinar-Romeral site and inland oriented at the Requena-Montano site). The information fractal dimension increased as we moved from a middle dense matorral to discontinuous and scattered matorral and, finally to the late regressive succession, at Stipa steppe stage. At this stage a drastic change in the fractal dimension revealed a change in the vegetation structure, accurately indicating end successional vegetation stages. Long-term correlation analysis (DFA) revealed that an increase in grazing pressure leads to unpredictability (randomness) in species distributions, a reduction in diversity, and an increase in cover of the regressive successional species, e.g. Stipa tenacissima L. These comparisons provide a quantitative characterization of the successional dynamic of plant spatial patterns in response to grazing perturbation gradient. ?? 2002 Elsevier Science B.V. All rights reserved.

  4. Role of regression analysis and variation of rheological data in calculation of pressure drop for sludge pipelines.

    PubMed

    Farno, E; Coventry, K; Slatter, P; Eshtiaghi, N

    2018-06-15

    Sludge pumps in wastewater treatment plants are often oversized due to uncertainty in calculation of pressure drop. This issue costs millions of dollars for industry to purchase and operate the oversized pumps. Besides costs, higher electricity consumption is associated with extra CO 2 emission which creates huge environmental impacts. Calculation of pressure drop via current pipe flow theory requires model estimation of flow curve data which depends on regression analysis and also varies with natural variation of rheological data. This study investigates impact of variation of rheological data and regression analysis on variation of pressure drop calculated via current pipe flow theories. Results compare the variation of calculated pressure drop between different models and regression methods and suggest on the suitability of each method. Copyright © 2018 Elsevier Ltd. All rights reserved.

  5. Linear regression in astronomy. II

    NASA Technical Reports Server (NTRS)

    Feigelson, Eric D.; Babu, Gutti J.

    1992-01-01

    A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.

  6. Determinants of orphan drugs prices in France: a regression analysis.

    PubMed

    Korchagina, Daria; Millier, Aurelie; Vataire, Anne-Lise; Aballea, Samuel; Falissard, Bruno; Toumi, Mondher

    2017-04-21

    The introduction of the orphan drug legislation led to the increase in the number of available orphan drugs, but the access to them is often limited due to the high price. Social preferences regarding funding orphan drugs as well as the criteria taken into consideration while setting the price remain unclear. The study aimed at identifying the determinant of orphan drug prices in France using a regression analysis. All drugs with a valid orphan designation at the moment of launch for which the price was available in France were included in the analysis. The selection of covariates was based on a literature review and included drug characteristics (Anatomical Therapeutic Chemical (ATC) class, treatment line, age of target population), diseases characteristics (severity, prevalence, availability of alternative therapeutic options), health technology assessment (HTA) details (actual benefit (AB) and improvement in actual benefit (IAB) scores, delay between the HTA and commercialisation), and study characteristics (type of study, comparator, type of endpoint). The main data sources were European public assessment reports, HTA reports, summaries of opinion on orphan designation of the European Medicines Agency, and the French insurance database of drugs and tariffs. A generalized regression model was developed to test the association between the annual treatment cost and selected covariates. A total of 68 drugs were included. The mean annual treatment cost was €96,518. In the univariate analysis, the ATC class (p = 0.01), availability of alternative treatment options (p = 0.02) and the prevalence (p = 0.02) showed a significant correlation with the annual cost. The multivariate analysis demonstrated significant association between the annual cost and availability of alternative treatment options, ATC class, IAB score, type of comparator in the pivotal clinical trial, as well as commercialisation date and delay between the HTA and commercialisation. The

  7. Assessing the Impact of Influential Observations on Multiple Regression Analysis on Human Resource Research.

    ERIC Educational Resources Information Center

    Bates, Reid A.; Holton, Elwood F., III; Burnett, Michael F.

    1999-01-01

    A case study of learning transfer demonstrates the possible effect of influential observation on linear regression analysis. A diagnostic method that tests for violation of assumptions, multicollinearity, and individual and multiple influential observations helps determine which observation to delete to eliminate bias. (SK)

  8. Logistic regression applied to natural hazards: rare event logistic regression with replications

    NASA Astrophysics Data System (ADS)

    Guns, M.; Vanacker, V.

    2012-06-01

    Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.

  9. Catching up with Harvard: Results from Regression Analysis of World Universities League Tables

    ERIC Educational Resources Information Center

    Li, Mei; Shankar, Sriram; Tang, Kam Ki

    2011-01-01

    This paper uses regression analysis to test if the universities performing less well according to Shanghai Jiao Tong University's world universities league tables are able to catch up with the top performers, and to identify national and institutional factors that could affect this catching up process. We have constructed a dataset of 461…

  10. Stepwise versus Hierarchical Regression: Pros and Cons

    ERIC Educational Resources Information Center

    Lewis, Mitzi

    2007-01-01

    Multiple regression is commonly used in social and behavioral data analysis. In multiple regression contexts, researchers are very often interested in determining the "best" predictors in the analysis. This focus may stem from a need to identify those predictors that are supportive of theory. Alternatively, the researcher may simply be interested…

  11. Intermediate and advanced topics in multilevel logistic regression analysis

    PubMed Central

    Merlo, Juan

    2017-01-01

    Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher‐level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within‐cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population‐average effect of covariates measured at the subject and cluster level, in contrast to the within‐cluster or cluster‐specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster‐level covariates. We describe the variance partition coefficient and the median odds ratio which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R 2 measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28543517

  12. Using regression analysis to predict emergency patient volume at the Indianapolis 500 mile race.

    PubMed

    Bowdish, G E; Cordell, W H; Bock, H C; Vukov, L F

    1992-10-01

    Emergency physicians often plan and provide on-site medical care for mass gatherings. Most of the mass gathering literature is descriptive. Only a few studies have looked at factors such as crowd size, event characteristics, or weather in predicting numbers and types of patients at mass gatherings. We used regression analysis to relate patient volume on Race Day at the Indianapolis Motor Speedway to weather conditions and race characteristics. Race Day weather data for the years 1983 to 1989 were obtained from the National Oceanic and Atmospheric Administration. Data regarding patients treated on 1983 to 1989 Race Days were obtained from the facility hospital (Hannah Emergency Medical Center) data base. Regression analysis was performed using weather factors and race characteristics as independent variables and number of patients seen as the dependent variable. Data from 1990 were used to test the validity of the model. There was a significant relationship between dew point (which is calculated from temperature and humidity) and patient load (P less than .01). Dew point, however, failed to predict patient load during the 1990 race. No relationships could be established between humidity, sunshine, wind, or race characteristics and number of patients. Although higher dew point was associated with higher patient load during the 1983 to 1989 races, dew point was a poor predictor of patient load during the 1990 race. Regression analysis may be useful in identifying relationships between event characteristics and patient load but is probably inadequate to explain the complexities of crowd behavior and too simplified to use as a prediction tool.

  13. Regression: The Apple Does Not Fall Far From the Tree.

    PubMed

    Vetter, Thomas R; Schober, Patrick

    2018-05-15

    Researchers and clinicians are frequently interested in either: (1) assessing whether there is a relationship or association between 2 or more variables and quantifying this association; or (2) determining whether 1 or more variables can predict another variable. The strength of such an association is mainly described by the correlation. However, regression analysis and regression models can be used not only to identify whether there is a significant relationship or association between variables but also to generate estimations of such a predictive relationship between variables. This basic statistical tutorial discusses the fundamental concepts and techniques related to the most common types of regression analysis and modeling, including simple linear regression, multiple regression, logistic regression, ordinal regression, and Poisson regression, as well as the common yet often underrecognized phenomenon of regression toward the mean. The various types of regression analysis are powerful statistical techniques, which when appropriately applied, can allow for the valid interpretation of complex, multifactorial data. Regression analysis and models can assess whether there is a relationship or association between 2 or more observed variables and estimate the strength of this association, as well as determine whether 1 or more variables can predict another variable. Regression is thus being applied more commonly in anesthesia, perioperative, critical care, and pain research. However, it is crucial to note that regression can identify plausible risk factors; it does not prove causation (a definitive cause and effect relationship). The results of a regression analysis instead identify independent (predictor) variable(s) associated with the dependent (outcome) variable. As with other statistical methods, applying regression requires that certain assumptions be met, which can be tested with specific diagnostics.

  14. Implementing informative priors for heterogeneity in meta-analysis using meta-regression and pseudo data.

    PubMed

    Rhodes, Kirsty M; Turner, Rebecca M; White, Ian R; Jackson, Dan; Spiegelhalter, David J; Higgins, Julian P T

    2016-12-20

    Many meta-analyses combine results from only a small number of studies, a situation in which the between-study variance is imprecisely estimated when standard methods are applied. Bayesian meta-analysis allows incorporation of external evidence on heterogeneity, providing the potential for more robust inference on the effect size of interest. We present a method for performing Bayesian meta-analysis using data augmentation, in which we represent an informative conjugate prior for between-study variance by pseudo data and use meta-regression for estimation. To assist in this, we derive predictive inverse-gamma distributions for the between-study variance expected in future meta-analyses. These may serve as priors for heterogeneity in new meta-analyses. In a simulation study, we compare approximate Bayesian methods using meta-regression and pseudo data against fully Bayesian approaches based on importance sampling techniques and Markov chain Monte Carlo (MCMC). We compare the frequentist properties of these Bayesian methods with those of the commonly used frequentist DerSimonian and Laird procedure. The method is implemented in standard statistical software and provides a less complex alternative to standard MCMC approaches. An importance sampling approach produces almost identical results to standard MCMC approaches, and results obtained through meta-regression and pseudo data are very similar. On average, data augmentation provides closer results to MCMC, if implemented using restricted maximum likelihood estimation rather than DerSimonian and Laird or maximum likelihood estimation. The methods are applied to real datasets, and an extension to network meta-analysis is described. The proposed method facilitates Bayesian meta-analysis in a way that is accessible to applied researchers. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  15. Valid Statistical Analysis for Logistic Regression with Multiple Sources

    NASA Astrophysics Data System (ADS)

    Fienberg, Stephen E.; Nardi, Yuval; Slavković, Aleksandra B.

    Considerable effort has gone into understanding issues of privacy protection of individual information in single databases, and various solutions have been proposed depending on the nature of the data, the ways in which the database will be used and the precise nature of the privacy protection being offered. Once data are merged across sources, however, the nature of the problem becomes far more complex and a number of privacy issues arise for the linked individual files that go well beyond those that are considered with regard to the data within individual sources. In the paper, we propose an approach that gives full statistical analysis on the combined database without actually combining it. We focus mainly on logistic regression, but the method and tools described may be applied essentially to other statistical models as well.

  16. Intermediate and advanced topics in multilevel logistic regression analysis.

    PubMed

    Austin, Peter C; Merlo, Juan

    2017-09-10

    Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher-level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within-cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population-average effect of covariates measured at the subject and cluster level, in contrast to the within-cluster or cluster-specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster-level covariates. We describe the variance partition coefficient and the median odds ratio which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R 2 measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.

  17. An Effect Size for Regression Predictors in Meta-Analysis

    ERIC Educational Resources Information Center

    Aloe, Ariel M.; Becker, Betsy Jane

    2012-01-01

    A new effect size representing the predictive power of an independent variable from a multiple regression model is presented. The index, denoted as r[subscript sp], is the semipartial correlation of the predictor with the outcome of interest. This effect size can be computed when multiple predictor variables are included in the regression model…

  18. Multivariate adaptive regression splines analysis to predict biomarkers of spontaneous preterm birth.

    PubMed

    Menon, Ramkumar; Bhat, Geeta; Saade, George R; Spratt, Heidi

    2014-04-01

    To develop classification models of demographic/clinical factors and biomarker data from spontaneous preterm birth in African Americans and Caucasians. Secondary analysis of biomarker data using multivariate adaptive regression splines (MARS), a supervised machine learning algorithm method. Analysis of data on 36 biomarkers from 191 women was reduced by MARS to develop predictive models for preterm birth in African Americans and Caucasians. Maternal plasma, cord plasma collected at admission for preterm or term labor and amniotic fluid at delivery. Data were partitioned into training and testing sets. Variable importance, a relative indicator (0-100%) and area under the receiver operating characteristic curve (AUC) characterized results. Multivariate adaptive regression splines generated models for combined and racially stratified biomarker data. Clinical and demographic data did not contribute to the model. Racial stratification of data produced distinct models in all three compartments. In African Americans maternal plasma samples IL-1RA, TNF-α, angiopoietin 2, TNFRI, IL-5, MIP1α, IL-1β and TGF-α modeled preterm birth (AUC train: 0.98, AUC test: 0.86). In Caucasians TNFR1, ICAM-1 and IL-1RA contributed to the model (AUC train: 0.84, AUC test: 0.68). African Americans cord plasma samples produced IL-12P70, IL-8 (AUC train: 0.82, AUC test: 0.66). Cord plasma in Caucasians modeled IGFII, PDGFBB, TGF-β1 , IL-12P70, and TIMP1 (AUC train: 0.99, AUC test: 0.82). Amniotic fluid in African Americans modeled FasL, TNFRII, RANTES, KGF, IGFI (AUC train: 0.95, AUC test: 0.89) and in Caucasians, TNF-α, MCP3, TGF-β3 , TNFR1 and angiopoietin 2 (AUC train: 0.94 AUC test: 0.79). Multivariate adaptive regression splines models multiple biomarkers associated with preterm birth and demonstrated racial disparity. © 2014 Nordic Federation of Societies of Obstetrics and Gynecology.

  19. Building Regression Models: The Importance of Graphics.

    ERIC Educational Resources Information Center

    Dunn, Richard

    1989-01-01

    Points out reasons for using graphical methods to teach simple and multiple regression analysis. Argues that a graphically oriented approach has considerable pedagogic advantages in the exposition of simple and multiple regression. Shows that graphical methods may play a central role in the process of building regression models. (Author/LS)

  20. Online Statistical Modeling (Regression Analysis) for Independent Responses

    NASA Astrophysics Data System (ADS)

    Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus

    2017-06-01

    Regression analysis (statistical analmodelling) are among statistical methods which are frequently needed in analyzing quantitative data, especially to model relationship between response and explanatory variables. Nowadays, statistical models have been developed into various directions to model various type and complex relationship of data. Rich varieties of advanced and recent statistical modelling are mostly available on open source software (one of them is R). However, these advanced statistical modelling, are not very friendly to novice R users, since they are based on programming script or command line interface. Our research aims to developed web interface (based on R and shiny), so that most recent and advanced statistical modelling are readily available, accessible and applicable on web. We have previously made interface in the form of e-tutorial for several modern and advanced statistical modelling on R especially for independent responses (including linear models/LM, generalized linier models/GLM, generalized additive model/GAM and generalized additive model for location scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including model using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/ MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web (interface) make the statistical modeling becomes easier to apply and easier to compare them in order to find the most appropriate model for the data.

  1. Robust Methods for Moderation Analysis with a Two-Level Regression Model.

    PubMed

    Yang, Miao; Yuan, Ke-Hai

    2016-01-01

    Moderation analysis has many applications in social sciences. Most widely used estimation methods for moderation analysis assume that errors are normally distributed and homoscedastic. When these assumptions are not met, the results from a classical moderation analysis can be misleading. For more reliable moderation analysis, this article proposes two robust methods with a two-level regression model when the predictors do not contain measurement error. One method is based on maximum likelihood with Student's t distribution and the other is based on M-estimators with Huber-type weights. An algorithm for obtaining the robust estimators is developed. Consistent estimates of standard errors of the robust estimators are provided. The robust approaches are compared against normal-distribution-based maximum likelihood (NML) with respect to power and accuracy of parameter estimates through a simulation study. Results show that the robust approaches outperform NML under various distributional conditions. Application of the robust methods is illustrated through a real data example. An R program is developed and documented to facilitate the application of the robust methods.

  2. Oil and gas pipeline construction cost analysis and developing regression models for cost estimation

    NASA Astrophysics Data System (ADS)

    Thaduri, Ravi Kiran

    In this study, cost data for 180 pipelines and 136 compressor stations have been analyzed. On the basis of the distribution analysis, regression models have been developed. Material, Labor, ROW and miscellaneous costs make up the total cost of a pipeline construction. The pipelines are analyzed based on different pipeline lengths, diameter, location, pipeline volume and year of completion. In a pipeline construction, labor costs dominate the total costs with a share of about 40%. Multiple non-linear regression models are developed to estimate the component costs of pipelines for various cross-sectional areas, lengths and locations. The Compressor stations are analyzed based on the capacity, year of completion and location. Unlike the pipeline costs, material costs dominate the total costs in the construction of compressor station, with an average share of about 50.6%. Land costs have very little influence on the total costs. Similar regression models are developed to estimate the component costs of compressor station for various capacities and locations.

  3. Linear regression analysis and its application to multivariate chromatographic calibration for the quantitative analysis of two-component mixtures.

    PubMed

    Dinç, Erdal; Ozdemir, Abdil

    2005-01-01

    Multivariate chromatographic calibration technique was developed for the quantitative analysis of binary mixtures enalapril maleate (EA) and hydrochlorothiazide (HCT) in tablets in the presence of losartan potassium (LST). The mathematical algorithm of multivariate chromatographic calibration technique is based on the use of the linear regression equations constructed using relationship between concentration and peak area at the five-wavelength set. The algorithm of this mathematical calibration model having a simple mathematical content was briefly described. This approach is a powerful mathematical tool for an optimum chromatographic multivariate calibration and elimination of fluctuations coming from instrumental and experimental conditions. This multivariate chromatographic calibration contains reduction of multivariate linear regression functions to univariate data set. The validation of model was carried out by analyzing various synthetic binary mixtures and using the standard addition technique. Developed calibration technique was applied to the analysis of the real pharmaceutical tablets containing EA and HCT. The obtained results were compared with those obtained by classical HPLC method. It was observed that the proposed multivariate chromatographic calibration gives better results than classical HPLC.

  4. Estimation of crown closure from AVIRIS data using regression analysis

    NASA Technical Reports Server (NTRS)

    Staenz, K.; Williams, D. J.; Truchon, M.; Fritz, R.

    1993-01-01

    Crown closure is one of the input parameters used for forest growth and yield modelling. Preliminary work by Staenz et al. indicates that imaging spectrometer data acquired with sensors such as the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) have some potential for estimating crown closure on a stand level. The objectives of this paper are: (1) to establish a relationship between AVIRIS data and the crown closure derived from aerial photography of a forested test site within the Interior Douglas Fir biogeoclimatic zone in British Columbia, Canada; (2) to investigate the impact of atmospheric effects and the forest background on the correlation between AVIRIS data and crown closure estimates; and (3) to improve this relationship using multiple regression analysis.

  5. Passing the Test: Ecological Regression Analysis in the Los Angeles County Case and Beyond.

    ERIC Educational Resources Information Center

    Lichtman, Allan J.

    1991-01-01

    Statistical analysis of racially polarized voting prepared for the Garza v County of Los Angeles (California) (1990) voting rights case is reviewed to demonstrate that ecological regression is a flexible, robust technique that illuminates the reality of ethnic voting, and superior to the neighborhood model supported by the defendants. (SLD)

  6. Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies

    PubMed Central

    Vatcheva, Kristina P.; Lee, MinJae; McCormick, Joseph B.; Rahbar, Mohammad H.

    2016-01-01

    The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013, illustrated the need for a greater attention to identifying and minimizing the effect of multicollinearity in analysis of data from epidemiologic studies. We used simulated datasets and real life data from the Cameron County Hispanic Cohort to demonstrate the adverse effects of multicollinearity in the regression analysis and encourage researchers to consider the diagnostic for multicollinearity as one of the steps in regression analysis. PMID:27274911

  7. Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies.

    PubMed

    Vatcheva, Kristina P; Lee, MinJae; McCormick, Joseph B; Rahbar, Mohammad H

    2016-04-01

    The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013, illustrated the need for a greater attention to identifying and minimizing the effect of multicollinearity in analysis of data from epidemiologic studies. We used simulated datasets and real life data from the Cameron County Hispanic Cohort to demonstrate the adverse effects of multicollinearity in the regression analysis and encourage researchers to consider the diagnostic for multicollinearity as one of the steps in regression analysis.

  8. Frequency-domain nonlinear regression algorithm for spectral analysis of broadband SFG spectroscopy.

    PubMed

    He, Yuhan; Wang, Ying; Wang, Jingjing; Guo, Wei; Wang, Zhaohui

    2016-03-01

    The resonant spectral bands of the broadband sum frequency generation (BB-SFG) spectra are often distorted by the nonresonant portion and the lineshapes of the laser pulses. Frequency domain nonlinear regression (FDNLR) algorithm was proposed to retrieve the first-order polarization induced by the infrared pulse and to improve the analysis of SFG spectra through simultaneous fitting of a series of time-resolved BB-SFG spectra. The principle of FDNLR was presented, and the validity and reliability were tested by the analysis of the virtual and measured SFG spectra. The relative phase, dephasing time, and lineshapes of the resonant vibrational SFG bands can be retrieved without any preset assumptions about the SFG bands and the incident laser pulses.

  9. Modified Regression Correlation Coefficient for Poisson Regression Model

    NASA Astrophysics Data System (ADS)

    Kaengthong, Nattacha; Domthong, Uthumporn

    2017-09-01

    This study gives attention to indicators in predictive power of the Generalized Linear Model (GLM) which are widely used; however, often having some restrictions. We are interested in regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, and defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was modifying regression correlation coefficient for Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables, and having multicollinearity in independent variables. The result shows that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient based on Bias and the Root Mean Square Error (RMSE).

  10. Isolating the Effects of Training Using Simple Regression Analysis: An Example of the Procedure.

    ERIC Educational Resources Information Center

    Waugh, C. Keith

    This paper provides a case example of simple regression analysis, a forecasting procedure used to isolate the effects of training from an identified extraneous variable. This case example focuses on results of a three-day sales training program to improve bank loan officers' knowledge, skill-level, and attitude regarding solicitation and sale of…

  11. Clinical benefit from pharmacological elevation of high-density lipoprotein cholesterol: meta-regression analysis.

    PubMed

    Hourcade-Potelleret, F; Laporte, S; Lehnert, V; Delmar, P; Benghozi, Renée; Torriani, U; Koch, R; Mismetti, P

    2015-06-01

    Epidemiological evidence that the risk of coronary heart disease is inversely associated with the level of high-density lipoprotein cholesterol (HDL-C) has motivated several phase III programmes with cholesteryl ester transfer protein (CETP) inhibitors. To assess alternative methods to predict clinical response of CETP inhibitors. Meta-regression analysis on raising HDL-C drugs (statins, fibrates, niacin) in randomised controlled trials. 51 trials in secondary prevention with a total of 167,311 patients for a follow-up >1 year where HDL-C was measured at baseline and during treatment. The meta-regression analysis showed no significant association between change in HDL-C (treatment vs comparator) and log risk ratio (RR) of clinical endpoint (non-fatal myocardial infarction or cardiac death). CETP inhibitors data are consistent with this finding (RR: 1.03; P5-P95: 0.99-1.21). A prespecified sensitivity analysis by drug class suggested that the strength of relationship might differ between pharmacological groups. A significant association for both statins (p<0.02, log RR=-0.169-0.0499*HDL-C change, R(2)=0.21) and niacin (p=0.02, log RR=1.07-0.185*HDL-C change, R(2)=0.61) but not fibrates (p=0.18, log RR=-0.367+0.077*HDL-C change, R(2)=0.40) was shown. However, the association was no longer detectable after adjustment for low-density lipoprotein cholesterol for statins or exclusion of open trials for niacin. Meta-regression suggested that CETP inhibitors might not influence coronary risk. The relation between change in HDL-C level and clinical endpoint may be drug dependent, which limits the use of HDL-C as a surrogate marker of coronary events. Other markers of HDL function may be more relevant. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  12. Selenium Exposure and Cancer Risk: an Updated Meta-analysis and Meta-regression

    PubMed Central

    Cai, Xianlei; Wang, Chen; Yu, Wanqi; Fan, Wenjie; Wang, Shan; Shen, Ning; Wu, Pengcheng; Li, Xiuyang; Wang, Fudi

    2016-01-01

    The objective of this study was to investigate the associations between selenium exposure and cancer risk. We identified 69 studies and applied meta-analysis, meta-regression and dose-response analysis to obtain available evidence. The results indicated that high selenium exposure had a protective effect on cancer risk (pooled OR = 0.78; 95%CI: 0.73–0.83). The results of linear and nonlinear dose-response analysis indicated that high serum/plasma selenium and toenail selenium had the efficacy on cancer prevention. However, we did not find a protective efficacy of selenium supplement. High selenium exposure may have different effects on specific types of cancer. It decreased the risk of breast cancer, lung cancer, esophageal cancer, gastric cancer, and prostate cancer, but it was not associated with colorectal cancer, bladder cancer, and skin cancer. PMID:26786590

  13. Quantile regression applied to spectral distance decay

    USGS Publications Warehouse

    Rocchini, D.; Cade, B.S.

    2008-01-01

    Remotely sensed imagery has long been recognized as a powerful support for characterizing and estimating biodiversity. Spectral distance among sites has proven to be a powerful approach for detecting species composition variability. Regression analysis of species similarity versus spectral distance allows us to quantitatively estimate the amount of turnover in species composition with respect to spectral and ecological variability. In classical regression analysis, the residual sum of squares is minimized for the mean of the dependent variable distribution. However, many ecological data sets are characterized by a high number of zeroes that add noise to the regression model. Quantile regressions can be used to evaluate trend in the upper quantiles rather than a mean trend across the whole distribution of the dependent variable. In this letter, we used ordinary least squares (OLS) and quantile regressions to estimate the decay of species similarity versus spectral distance. The achieved decay rates were statistically nonzero (p < 0.01), considering both OLS and quantile regressions. Nonetheless, the OLS regression estimate of the mean decay rate was only half the decay rate indicated by the upper quantiles. Moreover, the intercept value, representing the similarity reached when the spectral distance approaches zero, was very low compared with the intercepts of the upper quantiles, which detected high species similarity when habitats are more similar. In this letter, we demonstrated the power of using quantile regressions applied to spectral distance decay to reveal species diversity patterns otherwise lost or underestimated by OLS regression. ?? 2008 IEEE.

  14. A Practical Guide to Regression Discontinuity

    ERIC Educational Resources Information Center

    Jacob, Robin; Zhu, Pei; Somers, Marie-Andrée; Bloom, Howard

    2012-01-01

    Regression discontinuity (RD) analysis is a rigorous nonexperimental approach that can be used to estimate program impacts in situations in which candidates are selected for treatment based on whether their value for a numeric rating exceeds a designated threshold or cut-point. Over the last two decades, the regression discontinuity approach has…

  15. Interpreting Bivariate Regression Coefficients: Going beyond the Average

    ERIC Educational Resources Information Center

    Halcoussis, Dennis; Phillips, G. Michael

    2010-01-01

    Statistics, econometrics, investment analysis, and data analysis classes often review the calculation of several types of averages, including the arithmetic mean, geometric mean, harmonic mean, and various weighted averages. This note shows how each of these can be computed using a basic regression framework. By recognizing when a regression model…

  16. Logistic regression analysis of risk factors for postoperative recurrence of spinal tumors and analysis of prognostic factors.

    PubMed

    Zhang, Shanyong; Yang, Lili; Peng, Chuangang; Wu, Minfei

    2018-02-01

    The aim of the present study was to investigate the risk factors for postoperative recurrence of spinal tumors by logistic regression analysis and analysis of prognostic factors. In total, 77 male and 48 female patients with spinal tumor were selected in our hospital from January, 2010 to December, 2015 and divided into the benign (n=76) and malignant groups (n=49). All the patients underwent microsurgical resection of spinal tumors and were reviewed regularly 3 months after operation. The McCormick grading system was used to evaluate the postoperative spinal cord function. Data were subjected to statistical analysis. Of the 125 cases, 63 cases showed improvement after operation, 50 cases were stable, and deterioration was found in 12 cases. The improvement rate of patients with cervical spine tumor, which reached 56.3%, was the highest. Fifty-two cases of sensory disturbance, 34 cases of pain, 30 cases of inability to exercise, 26 cases of ataxia, and 12 cases of sphincter disorders were found after operation. Seventy-two cases (57.6%) underwent total resection, 18 cases (14.4%) received subtotal resection, 23 cases (18.4%) received partial resection, and 12 cases (9.6%) were only treated with biopsy/decompression. Postoperative recurrence was found in 57 cases (45.6%). The mean recurrence time of patients in the malignant group was 27.49±6.09 months, and the mean recurrence time of patients in the benign group was 40.62±4.34. The results were significantly different (P<0.001). Recurrence was found in 18 cases of the benign group and 39 cases of the malignant group, and results were significantly different (P<0.001). Tumor recurrence was shorter in patients with a higher McCormick grade (P<0.001). Recurrence was found in 13 patients with resection and all the patients with partial resection or biopsy/decompression. The results were significantly different (P<0.001). Logistic regression analysis of total resection-related factors showed that total resection

  17. Logistic regression analysis of risk factors for postoperative recurrence of spinal tumors and analysis of prognostic factors

    PubMed Central

    Zhang, Shanyong; Yang, Lili; Peng, Chuangang; Wu, Minfei

    2018-01-01

    The aim of the present study was to investigate the risk factors for postoperative recurrence of spinal tumors by logistic regression analysis and analysis of prognostic factors. In total, 77 male and 48 female patients with spinal tumor were selected in our hospital from January, 2010 to December, 2015 and divided into the benign (n=76) and malignant groups (n=49). All the patients underwent microsurgical resection of spinal tumors and were reviewed regularly 3 months after operation. The McCormick grading system was used to evaluate the postoperative spinal cord function. Data were subjected to statistical analysis. Of the 125 cases, 63 cases showed improvement after operation, 50 cases were stable, and deterioration was found in 12 cases. The improvement rate of patients with cervical spine tumor, which reached 56.3%, was the highest. Fifty-two cases of sensory disturbance, 34 cases of pain, 30 cases of inability to exercise, 26 cases of ataxia, and 12 cases of sphincter disorders were found after operation. Seventy-two cases (57.6%) underwent total resection, 18 cases (14.4%) received subtotal resection, 23 cases (18.4%) received partial resection, and 12 cases (9.6%) were only treated with biopsy/decompression. Postoperative recurrence was found in 57 cases (45.6%). The mean recurrence time of patients in the malignant group was 27.49±6.09 months, and the mean recurrence time of patients in the benign group was 40.62±4.34. The results were significantly different (P<0.001). Recurrence was found in 18 cases of the benign group and 39 cases of the malignant group, and results were significantly different (P<0.001). Tumor recurrence was shorter in patients with a higher McCormick grade (P<0.001). Recurrence was found in 13 patients with resection and all the patients with partial resection or biopsy/decompression. The results were significantly different (P<0.001). Logistic regression analysis of total resection-related factors showed that total resection

  18. Classification and regression tree analysis of acute-on-chronic hepatitis B liver failure: Seeing the forest for the trees.

    PubMed

    Shi, K-Q; Zhou, Y-Y; Yan, H-D; Li, H; Wu, F-L; Xie, Y-Y; Braddock, M; Lin, X-Y; Zheng, M-H

    2017-02-01

    At present, there is no ideal model for predicting the short-term outcome of patients with acute-on-chronic hepatitis B liver failure (ACHBLF). This study aimed to establish and validate a prognostic model by using the classification and regression tree (CART) analysis. A total of 1047 patients from two separate medical centres with suspected ACHBLF were screened in the study, which were recognized as derivation cohort and validation cohort, respectively. CART analysis was applied to predict the 3-month mortality of patients with ACHBLF. The accuracy of the CART model was tested using the area under the receiver operating characteristic curve, which was compared with the model for end-stage liver disease (MELD) score and a new logistic regression model. CART analysis identified four variables as prognostic factors of ACHBLF: total bilirubin, age, serum sodium and INR, and three distinct risk groups: low risk (4.2%), intermediate risk (30.2%-53.2%) and high risk (81.4%-96.9%). The new logistic regression model was constructed with four independent factors, including age, total bilirubin, serum sodium and prothrombin activity by multivariate logistic regression analysis. The performances of the CART model (0.896), similar to the logistic regression model (0.914, P=.382), exceeded that of MELD score (0.667, P<.001). The results were confirmed in the validation cohort. We have developed and validated a novel CART model superior to MELD for predicting three-month mortality of patients with ACHBLF. Thus, the CART model could facilitate medical decision-making and provide clinicians with a validated practical bedside tool for ACHBLF risk stratification. © 2016 John Wiley & Sons Ltd.

  19. Mixed kernel function support vector regression for global sensitivity analysis

    NASA Astrophysics Data System (ADS)

    Cheng, Kai; Lu, Zhenzhou; Wei, Yuhao; Shi, Yan; Zhou, Yicheng

    2017-11-01

    Global sensitivity analysis (GSA) plays an important role in exploring the respective effects of input variables on an assigned output response. Amongst the wide sensitivity analyses in literature, the Sobol indices have attracted much attention since they can provide accurate information for most models. In this paper, a mixed kernel function (MKF) based support vector regression (SVR) model is employed to evaluate the Sobol indices at low computational cost. By the proposed derivation, the estimation of the Sobol indices can be obtained by post-processing the coefficients of the SVR meta-model. The MKF is constituted by the orthogonal polynomials kernel function and Gaussian radial basis kernel function, thus the MKF possesses both the global characteristic advantage of the polynomials kernel function and the local characteristic advantage of the Gaussian radial basis kernel function. The proposed approach is suitable for high-dimensional and non-linear problems. Performance of the proposed approach is validated by various analytical functions and compared with the popular polynomial chaos expansion (PCE). Results demonstrate that the proposed approach is an efficient method for global sensitivity analysis.

  20. Logistic regression analysis of conventional ultrasonography, strain elastosonography, and contrast-enhanced ultrasound characteristics for the differentiation of benign and malignant thyroid nodules

    PubMed Central

    Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Liu, Weixiang

    2017-01-01

    The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast -enhanced ultrasound (CEUS). Those nodules’ 12 suspicious sonographic features were used to assess thyroid nodules. The significant features of diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to diagnosis of thyroid nodules, at a level of p < 0.05 were embodied in a logistic regression analysis model. The significant features in the logistic regression model of diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic, echogenicity, and elastography score. According to the results of logistic regression analysis, the formula that could predict whether or not thyroid nodules are malignant was established. The area under the receiver operating curve (ROC) was 0.930 and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79% respectively. PMID:29228030

  1. Logistic regression analysis of conventional ultrasonography, strain elastosonography, and contrast-enhanced ultrasound characteristics for the differentiation of benign and malignant thyroid nodules.

    PubMed

    Pang, Tiantian; Huang, Leidan; Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Gong, Xuehao; Liu, Weixiang

    2017-01-01

    The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast -enhanced ultrasound (CEUS). Those nodules' 12 suspicious sonographic features were used to assess thyroid nodules. The significant features of diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to diagnosis of thyroid nodules, at a level of p < 0.05 were embodied in a logistic regression analysis model. The significant features in the logistic regression model of diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic, echogenicity, and elastography score. According to the results of logistic regression analysis, the formula that could predict whether or not thyroid nodules are malignant was established. The area under the receiver operating curve (ROC) was 0.930 and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79% respectively.

  2. Nonlinear analysis of pupillary dynamics.

    PubMed

    Onorati, Francesco; Mainardi, Luca Tommaso; Sirca, Fabiola; Russo, Vincenzo; Barbieri, Riccardo

    2016-02-01

    Pupil size reflects autonomic response to different environmental and behavioral stimuli, and its dynamics have been linked to other autonomic correlates such as cardiac and respiratory rhythms. The aim of this study is to assess the nonlinear characteristics of pupil size of 25 normal subjects who participated in a psychophysiological experimental protocol with four experimental conditions, namely “baseline”, “anger”, “joy”, and “sadness”. Nonlinear measures, such as sample entropy, correlation dimension, and largest Lyapunov exponent, were computed on reconstructed signals of spontaneous fluctuations of pupil dilation. Nonparametric statistical tests were performed on surrogate data to verify that the nonlinear measures are an intrinsic characteristic of the signals. We then developed and applied a piecewise linear regression model to detrended fluctuation analysis (DFA). Two joinpoints and three scaling intervals were identified: slope α0, at slow time scales, represents a persistent nonstationary long-range correlation, whereas α1 and α2, at middle and fast time scales, respectively, represent long-range power-law correlations, similarly to DFA applied to heart rate variability signals. Of the computed complexity measures, α0 showed statistically significant differences among experimental conditions (p<0.001). Our results suggest that (a) pupil size at constant light condition is characterized by nonlinear dynamics, (b) three well-defined and distinct long-memory processes exist at different time scales, and (c) autonomic stimulation is partially reflected in nonlinear dynamics. (c) autonomic stimulation is partially reflected in nonlinear dynamics.

  3. Examination of influential observations in penalized spline regression

    NASA Astrophysics Data System (ADS)

    Türkan, Semra

    2013-10-01

    In parametric or nonparametric regression models, the results of regression analysis are affected by some anomalous observations in the data set. Thus, detection of these observations is one of the major steps in regression analysis. These observations are precisely detected by well-known influence measures. Pena's statistic is one of them. In this study, Pena's approach is formulated for penalized spline regression in terms of ordinary residuals and leverages. The real data and artificial data are used to see illustrate the effectiveness of Pena's statistic as to Cook's distance on detecting influential observations. The results of the study clearly reveal that the proposed measure is superior to Cook's Distance to detect these observations in large data set.

  4. Ca analysis: An Excel based program for the analysis of intracellular calcium transients including multiple, simultaneous regression analysis☆

    PubMed Central

    Greensmith, David J.

    2014-01-01

    Here I present an Excel based program for the analysis of intracellular Ca transients recorded using fluorescent indicators. The program can perform all the necessary steps which convert recorded raw voltage changes into meaningful physiological information. The program performs two fundamental processes. (1) It can prepare the raw signal by several methods. (2) It can then be used to analyze the prepared data to provide information such as absolute intracellular Ca levels. Also, the rates of change of Ca can be measured using multiple, simultaneous regression analysis. I demonstrate that this program performs equally well as commercially available software, but has numerous advantages, namely creating a simplified, self-contained analysis workflow. PMID:24125908

  5. Testing Different Model Building Procedures Using Multiple Regression.

    ERIC Educational Resources Information Center

    Thayer, Jerome D.

    The stepwise regression method of selecting predictors for computer assisted multiple regression analysis was compared with forward, backward, and best subsets regression, using 16 data sets. The results indicated the stepwise method was preferred because of its practical nature, when the models chosen by different selection methods were similar…

  6. Predicting Air Permeability of Handloom Fabrics: A Comparative Analysis of Regression and Artificial Neural Network Models

    NASA Astrophysics Data System (ADS)

    Mitra, Ashis; Majumdar, Prabal Kumar; Bannerjee, Debamalya

    2013-03-01

    This paper presents a comparative analysis of two modeling methodologies for the prediction of air permeability of plain woven handloom cotton fabrics. Four basic fabric constructional parameters namely ends per inch, picks per inch, warp count and weft count have been used as inputs for artificial neural network (ANN) and regression models. Out of the four regression models tried, interaction model showed very good prediction performance with a meager mean absolute error of 2.017 %. However, ANN models demonstrated superiority over the regression models both in terms of correlation coefficient and mean absolute error. The ANN model with 10 nodes in the single hidden layer showed very good correlation coefficient of 0.982 and 0.929 and mean absolute error of only 0.923 and 2.043 % for training and testing data respectively.

  7. Evaluation of Visual Field Progression in Glaucoma: Quasar Regression Program and Event Analysis.

    PubMed

    Díaz-Alemán, Valentín T; González-Hernández, Marta; Perera-Sanz, Daniel; Armas-Domínguez, Karintia

    2016-01-01

    To determine the sensitivity, specificity and agreement between the Quasar program, glaucoma progression analysis (GPA II) event analysis and expert opinion in the detection of glaucomatous progression. The Quasar program is based on linear regression analysis of both mean defect (MD) and pattern standard deviation (PSD). Each series of visual fields was evaluated by three methods; Quasar, GPA II and four experts. The sensitivity, specificity and agreement (kappa) for each method was calculated, using expert opinion as the reference standard. The study included 439 SITA Standard visual fields of 56 eyes of 42 patients, with a mean of 7.8 ± 0.8 visual fields per eye. When suspected cases of progression were considered stable, sensitivity and specificity of Quasar, GPA II and the experts were 86.6% and 70.7%, 26.6% and 95.1%, and 86.6% and 92.6% respectively. When suspected cases of progression were considered as progressing, sensitivity and specificity of Quasar, GPA II and the experts were 79.1% and 81.2%, 45.8% and 90.6%, and 85.4% and 90.6% respectively. The agreement between Quasar and GPA II when suspected cases were considered stable or progressing was 0.03 and 0.28 respectively. The degree of agreement between Quasar and the experts when suspected cases were considered stable or progressing was 0.472 and 0.507. The degree of agreement between GPA II and the experts when suspected cases were considered stable or progressing was 0.262 and 0.342. The combination of MD and PSD regression analysis in the Quasar program showed better agreement with the experts and higher sensitivity than GPA II.

  8. Incremental Net Effects in Multiple Regression

    ERIC Educational Resources Information Center

    Lipovetsky, Stan; Conklin, Michael

    2005-01-01

    A regular problem in regression analysis is estimating the comparative importance of the predictors in the model. This work considers the 'net effects', or shares of the predictors in the coefficient of the multiple determination, which is a widely used characteristic of the quality of a regression model. Estimation of the net effects can be a…

  9. What Happens to Students Placed into Developmental Education? A Meta-Analysis of Regression Discontinuity Studies

    ERIC Educational Resources Information Center

    Valentine, Jeffrey C.; Konstantopoulos, Spyros; Goldrick-Rab, Sara

    2017-01-01

    This article reports a systematic review and meta-analysis of studies that use regression discontinuity to examine the effects of placement into developmental education. Results suggest that placement into developmental education is associated with effects that are negative, statistically significant, and substantively large for three outcomes:…

  10. Regression analysis of mixed recurrent-event and panel-count data with additive rate models.

    PubMed

    Zhu, Liang; Zhao, Hui; Sun, Jianguo; Leisenring, Wendy; Robison, Leslie L

    2015-03-01

    Event-history studies of recurrent events are often conducted in fields such as demography, epidemiology, medicine, and social sciences (Cook and Lawless, 2007, The Statistical Analysis of Recurrent Events. New York: Springer-Verlag; Zhao et al., 2011, Test 20, 1-42). For such analysis, two types of data have been extensively investigated: recurrent-event data and panel-count data. However, in practice, one may face a third type of data, mixed recurrent-event and panel-count data or mixed event-history data. Such data occur if some study subjects are monitored or observed continuously and thus provide recurrent-event data, while the others are observed only at discrete times and hence give only panel-count data. A more general situation is that each subject is observed continuously over certain time periods but only at discrete times over other time periods. There exists little literature on the analysis of such mixed data except that published by Zhu et al. (2013, Statistics in Medicine 32, 1954-1963). In this article, we consider the regression analysis of mixed data using the additive rate model and develop some estimating equation-based approaches to estimate the regression parameters of interest. Both finite sample and asymptotic properties of the resulting estimators are established, and the numerical studies suggest that the proposed methodology works well for practical situations. The approach is applied to a Childhood Cancer Survivor Study that motivated this study. © 2014, The International Biometric Society.

  11. “Smooth” Semiparametric Regression Analysis for Arbitrarily Censored Time-to-Event Data

    PubMed Central

    Zhang, Min; Davidian, Marie

    2008-01-01

    Summary A general framework for regression analysis of time-to-event data subject to arbitrary patterns of censoring is proposed. The approach is relevant when the analyst is willing to assume that distributions governing model components that are ordinarily left unspecified in popular semiparametric regression models, such as the baseline hazard function in the proportional hazards model, have densities satisfying mild “smoothness” conditions. Densities are approximated by a truncated series expansion that, for fixed degree of truncation, results in a “parametric” representation, which makes likelihood-based inference coupled with adaptive choice of the degree of truncation, and hence flexibility of the model, computationally and conceptually straightforward with data subject to any pattern of censoring. The formulation allows popular models, such as the proportional hazards, proportional odds, and accelerated failure time models, to be placed in a common framework; provides a principled basis for choosing among them; and renders useful extensions of the models straightforward. The utility and performance of the methods are demonstrated via simulations and by application to data from time-to-event studies. PMID:17970813

  12. Latent profile analysis of regression-based norms demonstrates relationship of compounding MS symptom burden and negative work events.

    PubMed

    Frndak, Seth E; Smerbeck, Audrey M; Irwin, Lauren N; Drake, Allison S; Kordovski, Victoria M; Kunker, Katrina A; Khan, Anjum L; Benedict, Ralph H B

    2016-10-01

    We endeavored to clarify how distinct co-occurring symptoms relate to the presence of negative work events in employed multiple sclerosis (MS) patients. Latent profile analysis (LPA) was utilized to elucidate common disability patterns by isolating patient subpopulations. Samples of 272 employed MS patients and 209 healthy controls (HC) were administered neuroperformance tests of ambulation, hand dexterity, processing speed, and memory. Regression-based norms were created from the HC sample. LPA identified latent profiles using the regression-based z-scores. Finally, multinomial logistic regression tested for negative work event differences among the latent profiles. Four profiles were identified via LPA: a common profile (55%) characterized by slightly below average performance in all domains, a broadly low-performing profile (18%), a poor motor abilities profile with average cognition (17%), and a generally high-functioning profile (9%). Multinomial regression analysis revealed that the uniformly low-performing profile demonstrated a higher likelihood of reported negative work events. Employed MS patients with co-occurring motor, memory and processing speed impairments were most likely to report a negative work event, classifying them as uniquely at risk for job loss.

  13. Regression modeling of ground-water flow

    USGS Publications Warehouse

    Cooley, R.L.; Naff, R.L.

    1985-01-01

    Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)

  14. Predictions of biochar production and torrefaction performance from sugarcane bagasse using interpolation and regression analysis.

    PubMed

    Chen, Wei-Hsin; Hsu, Hung-Jen; Kumar, Gopalakrishnan; Budzianowski, Wojciech M; Ong, Hwai Chyuan

    2017-12-01

    This study focuses on the biochar formation and torrefaction performance of sugarcane bagasse, and they are predicted using the bilinear interpolation (BLI), inverse distance weighting (IDW) interpolation, and regression analysis. It is found that the biomass torrefied at 275°C for 60min or at 300°C for 30min or longer is appropriate to produce biochar as alternative fuel to coal with low carbon footprint, but the energy yield from the torrefaction at 300°C is too low. From the biochar yield, enhancement factor of HHV, and energy yield, the results suggest that the three methods are all feasible for predicting the performance, especially for the enhancement factor. The power parameter of unity in the IDW method provides the best predictions and the error is below 5%. The second order in regression analysis gives a more reasonable approach than the first order, and is recommended for the predictions. Copyright © 2017 Elsevier Ltd. All rights reserved.

  15. SPReM: Sparse Projection Regression Model For High-dimensional Linear Regression *

    PubMed Central

    Sun, Qiang; Zhu, Hongtu; Liu, Yufeng; Ibrahim, Joseph G.

    2014-01-01

    The aim of this paper is to develop a sparse projection regression modeling (SPReM) framework to perform multivariate regression modeling with a large number of responses and a multivariate covariate of interest. We propose two novel heritability ratios to simultaneously perform dimension reduction, response selection, estimation, and testing, while explicitly accounting for correlations among multivariate responses. Our SPReM is devised to specifically address the low statistical power issue of many standard statistical approaches, such as the Hotelling’s T2 test statistic or a mass univariate analysis, for high-dimensional data. We formulate the estimation problem of SPREM as a novel sparse unit rank projection (SURP) problem and propose a fast optimization algorithm for SURP. Furthermore, we extend SURP to the sparse multi-rank projection (SMURP) by adopting a sequential SURP approximation. Theoretically, we have systematically investigated the convergence properties of SURP and the convergence rate of SURP estimates. Our simulation results and real data analysis have shown that SPReM out-performs other state-of-the-art methods. PMID:26527844

  16. Influence diagnostics in meta-regression model.

    PubMed

    Shi, Lei; Zuo, ShanShan; Yu, Dalei; Zhou, Xiaohua

    2017-09-01

    This paper studies the influence diagnostics in meta-regression model including case deletion diagnostic and local influence analysis. We derive the subset deletion formulae for the estimation of regression coefficient and heterogeneity variance and obtain the corresponding influence measures. The DerSimonian and Laird estimation and maximum likelihood estimation methods in meta-regression are considered, respectively, to derive the results. Internal and external residual and leverage measure are defined. The local influence analysis based on case-weights perturbation scheme, responses perturbation scheme, covariate perturbation scheme, and within-variance perturbation scheme are explored. We introduce a method by simultaneous perturbing responses, covariate, and within-variance to obtain the local influence measure, which has an advantage of capable to compare the influence magnitude of influential studies from different perturbations. An example is used to illustrate the proposed methodology. Copyright © 2017 John Wiley & Sons, Ltd.

  17. Interactions between cadmium and decabrominated diphenyl ether on blood cells count in rats-Multiple factorial regression analysis.

    PubMed

    Curcic, Marijana; Buha, Aleksandra; Stankovic, Sanja; Milovanovic, Vesna; Bulat, Zorica; Đukić-Ćosić, Danijela; Antonijević, Evica; Vučinić, Slavica; Matović, Vesna; Antonijevic, Biljana

    2017-02-01

    The objective of this study was to assess toxicity of Cd and BDE-209 mixture on haematological parameters in subacutely exposed rats and to determine the presence and type of interactions between these two chemicals using multiple factorial regression analysis. Furthermore, for the assessment of interaction type, an isobologram based methodology was applied and compared with multiple factorial regression analysis. Chemicals were given by oral gavage to the male Wistar rats weighing 200-240g for 28days. Animals were divided in 16 groups (8/group): control vehiculum group, three groups of rats were treated with 2.5, 7.5 or 15mg Cd/kg/day. These doses were chosen on the bases of literature data and reflect relatively high Cd environmental exposure, three groups of rats were treated with 1000, 2000 or 4000mg BDE-209/kg/bw/day, doses proved to induce toxic effects in rats. Furthermore, nine groups of animals were treated with different mixtures of Cd and BDE-209 containing doses of Cd and BDE-209 stated above. Blood samples were taken at the end of experiment and red blood cells, white blood cells and platelets counts were determined. For interaction assessment multiple factorial regression analysis and fitted isobologram approach were used. In this study, we focused on multiple factorial regression analysis as a method for interaction assessment. We also investigated the interactions between Cd and BDE-209 by the derived model for the description of the obtained fitted isobologram curves. Current study indicated that co-exposure to Cd and BDE-209 can result in significant decrease in RBC count, increase in WBC count and decrease in PLT count, when compared with controls. Multiple factorial regression analysis used for the assessment of interactions type between Cd and BDE-209 indicated synergism for the effect on RBC count and no interactions i.e. additivity for the effects on WBC and PLT counts. On the other hand, isobologram based approach showed slight antagonism

  18. Skeletal height estimation from regression analysis of sternal lengths in a Northwest Indian population of Chandigarh region: a postmortem study.

    PubMed

    Singh, Jagmahender; Pathak, R K; Chavali, Krishnadutt H

    2011-03-20

    Skeletal height estimation from regression analysis of eight sternal lengths in the subjects of Chandigarh zone of Northwest India is the topic of discussion in this study. Analysis of eight sternal lengths (length of manubrium, length of mesosternum, combined length of manubrium and mesosternum, total sternal length and first four intercostals lengths of mesosternum) measured from 252 male and 91 female sternums obtained at postmortems revealed that mean cadaver stature and sternal lengths were more in North Indians and males than the South Indians and females. Except intercostal lengths, all the sternal lengths were positively correlated with stature of the deceased in both sexes (P < 0.001). The multiple regression analysis of sternal lengths was found more useful than the linear regression for stature estimation. Using multivariate regression analysis, the combined length of manubrium and mesosternum in both sexes and the length of manubrium along with 2nd and 3rd intercostal lengths of mesosternum in males were selected as best estimators of stature. Nonetheless, the stature of males can be predicted with SEE of 6.66 (R(2) = 0.16, r = 0.318) from combination of MBL+BL_3+LM+BL_2, and in females from MBL only, it can be estimated with SEE of 6.65 (R(2) = 0.10, r = 0.318), whereas from the multiple regression analysis of pooled data, stature can be known with SEE of 6.97 (R(2) = 0.387, r = 575) from the combination of MBL+LM+BL_2+TSL+BL_3. The R(2) and F-ratio were found to be statistically significant for almost all the variables in both the sexes, except 4th intercostal length in males and 2nd to 4th intercostal lengths in females. The 'major' sternal lengths were more useful than the 'minor' ones for stature estimation The universal regression analysis used by Kanchan et al. [39] when applied to sternal lengths, gave satisfactory estimates of stature for males only but female stature was comparatively better estimated from simple linear regressions. But they

  19. A novel simple QSAR model for the prediction of anti-HIV activity using multiple linear regression analysis.

    PubMed

    Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Markopoulos, John; Igglessi-Markopoulou, Olga

    2006-08-01

    A quantitative-structure activity relationship was obtained by applying Multiple Linear Regression Analysis to a series of 80 1-[2-hydroxyethoxy-methyl]-6-(phenylthio) thymine (HEPT) derivatives with significant anti-HIV activity. For the selection of the best among 37 different descriptors, the Elimination Selection Stepwise Regression Method (ES-SWR) was utilized. The resulting QSAR model (R (2) (CV) = 0.8160; S (PRESS) = 0.5680) proved to be very accurate both in training and predictive stages.

  20. Survival Data and Regression Models

    NASA Astrophysics Data System (ADS)

    Grégoire, G.

    2014-12-01

    We start this chapter by introducing some basic elements for the analysis of censored survival data. Then we focus on right censored data and develop two types of regression models. The first one concerns the so-called accelerated failure time models (AFT), which are parametric models where a function of a parameter depends linearly on the covariables. The second one is a semiparametric model, where the covariables enter in a multiplicative form in the expression of the hazard rate function. The main statistical tool for analysing these regression models is the maximum likelihood methodology and, in spite we recall some essential results about the ML theory, we refer to the chapter "Logistic Regression" for a more detailed presentation.

  1. Improved Regression Analysis of Temperature-Dependent Strain-Gage Balance Calibration Data

    NASA Technical Reports Server (NTRS)

    Ulbrich, N.

    2015-01-01

    An improved approach is discussed that may be used to directly include first and second order temperature effects in the load prediction algorithm of a wind tunnel strain-gage balance. The improved approach was designed for the Iterative Method that fits strain-gage outputs as a function of calibration loads and uses a load iteration scheme during the wind tunnel test to predict loads from measured gage outputs. The improved approach assumes that the strain-gage balance is at a constant uniform temperature when it is calibrated and used. First, the method introduces a new independent variable for the regression analysis of the balance calibration data. The new variable is designed as the difference between the uniform temperature of the balance and a global reference temperature. This reference temperature should be the primary calibration temperature of the balance so that, if needed, a tare load iteration can be performed. Then, two temperature{dependent terms are included in the regression models of the gage outputs. They are the temperature difference itself and the square of the temperature difference. Simulated temperature{dependent data obtained from Triumph Aerospace's 2013 calibration of NASA's ARC-30K five component semi{span balance is used to illustrate the application of the improved approach.

  2. Classification of Effective Soil Depth by Using Multinomial Logistic Regression Analysis

    NASA Astrophysics Data System (ADS)

    Chang, C. H.; Chan, H. C.; Chen, B. A.

    2016-12-01

    Classification of effective soil depth is a task of determining the slopeland utilizable limitation in Taiwan. The "Slopeland Conservation and Utilization Act" categorizes the slopeland into agriculture and husbandry land, land suitable for forestry and land for enhanced conservation according to the factors including average slope, effective soil depth, soil erosion and parental rock. However, sit investigation of the effective soil depth requires a cost-effective field work. This research aimed to classify the effective soil depth by using multinomial logistic regression with the environmental factors. The Wen-Shui Watershed located at the central Taiwan was selected as the study areas. The analysis of multinomial logistic regression is performed by the assistance of a Geographic Information Systems (GIS). The effective soil depth was categorized into four levels including deeper, deep, shallow and shallower. The environmental factors of slope, aspect, digital elevation model (DEM), curvature and normalized difference vegetation index (NDVI) were selected for classifying the soil depth. An Error Matrix was then used to assess the model accuracy. The results showed an overall accuracy of 75%. At the end, a map of effective soil depth was produced to help planners and decision makers in determining the slopeland utilizable limitation in the study areas.

  3. Computational Tools for Probing Interactions in Multiple Linear Regression, Multilevel Modeling, and Latent Curve Analysis

    ERIC Educational Resources Information Center

    Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.

    2006-01-01

    Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…

  4. Human allometry: adult bodies are more nearly geometrically similar than regression analysis has suggested.

    PubMed

    Burton, Richard F

    2010-01-01

    It is almost a matter of dogma that human body mass in adults tends to vary roughly in proportion to the square of height (stature), as Quetelet stated in 1835. As he realised, perfect isometry or geometric similarity requires that body mass varies with height cubed, so there seems to be a trend for tall adults to be relatively much lighter than short ones. Much evidence regarding component tissues and organs seems to accord with this idea. However, the hypothesis is presented that the proportions of the body are actually very much less size-dependent. Past evidence has mostly been obtained by least-squares regression analysis, but this cannot generally give a true picture of the allometric relationships. This is because there is considerable scatter in the data (leading to a low correlation between mass and height) and because neither variable causally determines the other. The relevant regression equations, though often formulated in logarithmic terms, effectively treat the masses as proportional to (body height)(b). Values of b estimated by regression must usually underestimate the true functional values, doing so especially when mass and height are poorly correlated. It is therefore telling support for the hypothesis that published estimates of b both for the whole body (which range between 1.0 and 2.5) and for its component tissues and organs (which vary even more) correlate with the corresponding correlation coefficients for mass and height. There is no simple statistical technique for establishing the true functional relationships, but Monte Carlo modelling has shown that the results obtained for total body mass are compatible with a true height exponent of three. Other data, on relationships between body mass and the girths of various body parts such as the thigh and chest, are also more consistent with isometry than regression analysis has suggested. This too is demonstrated by modelling. It thus seems that much of anthropometry needs to be re

  5. Variability and trends in the consumption of drugs for treating attention-deficit/hyperactivity disorder in Castile-La Mancha, Spain (1992-2015).

    PubMed

    Criado-Álvarez, J J; González González, J; Romo Barrientos, C; Mohedano Moriano, A; Montero Rubio, J C; Pérez Veiga, J P

    2016-09-16

    Attention-deficit/hyperactivity disorder (ADHD) is one of the most common behavioural disorders of childhood; its prevalence in Spain is estimated at 5-9%. Available treatments for this condition include methylphenidate, atomoxetine, and lisdexamfetamine, whose consumption increases each year. The prevalence of ADHD was estimated by calculating the defined daily dose per 1,000 population per day of each drug and the total doses (therapeutic group N06BA) between 1992 and 2015 in each of the provinces of Castile-La Mancha (Spain). Trends, joinpoints, and annual percentages of change were analysed using joinpoint regression models. The minimum prevalence of ADHD in the population of Castile-La Mancha aged 5 to 19 was estimated at 13.22 cases per 1,000 population per day; prevalence varied across provinces (p<.05). Overall consumption has increased from 1992 to 2015, with an annual percentages of change of 10.3% and several joinpoints (2000, 2009, and 2012). methylphenidate represents 89.6% of total drug consumption, followed by lisdexamfetamine at 8%. Analysing drug consumption enables us to estimate the distribution of ADHD patients in Castile-La Mancha. Our data show an increase in the consumption of these drugs as well as differences in drug consumption between provinces, which reflect differences in ADHD management in clinical practice. Copyright © 2016. Publicado por Elsevier España, S.L.U.

  6. Drug treatment rates with beta-blockers and ACE-inhibitors/angiotensin receptor blockers and recurrences in takotsubo cardiomyopathy: A meta-regression analysis.

    PubMed

    Brunetti, Natale Daniele; Santoro, Francesco; De Gennaro, Luisa; Correale, Michele; Gaglione, Antonio; Di Biase, Matteo

    2016-07-01

    In a recent paper Singh et al. analyzed the effect of drug treatment on recurrence of takotsubo cardiomyopathy (TTC) in a comprehensive meta-analysis. The study found that recurrence rates were independent of clinic utilization of BB prescription, but inversely correlated with ACEi/ARB prescription: authors therefore conclude that ACEi/ARB rather than BB may reduce risk of recurrence. We aimed to re-analyze data reported in the study, now weighted for populations' size, in a meta-regression analysis. After multiple meta-regression analysis, we found a significant regression between rates of prescription of ACEi and rates of recurrence of TTC; regression was not statistically significant for BBs. On the bases of our re-analysis, we confirm that rates of recurrence of TTC are lower in populations of patients with higher rates of treatment with ACEi/ARB. That could not necessarily imply that ACEi may prevent recurrence of TTC, but barely that, for example, rates of recurrence are lower in cohorts more compliant with therapy or more prescribed with ACEi because more carefully followed. Randomized prospective studies are surely warranted. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  7. Assessing risk factors for periodontitis using regression

    NASA Astrophysics Data System (ADS)

    Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa

    2013-10-01

    Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using regression, linear and logistic models, we can assess the relevance, as risk factors for periodontitis disease, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression analysis model was built to evaluate the influence of IVs on mean Attachment Loss (AL). Thus, the regression coefficients along with respective p-values will be obtained as well as the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the correspondent 95% confidence intervals.

  8. Regression Analysis of Mixed Recurrent-Event and Panel-Count Data with Additive Rate Models

    PubMed Central

    Zhu, Liang; Zhao, Hui; Sun, Jianguo; Leisenring, Wendy; Robison, Leslie L.

    2015-01-01

    Summary Event-history studies of recurrent events are often conducted in fields such as demography, epidemiology, medicine, and social sciences (Cook and Lawless, 2007; Zhao et al., 2011). For such analysis, two types of data have been extensively investigated: recurrent-event data and panel-count data. However, in practice, one may face a third type of data, mixed recurrent-event and panel-count data or mixed event-history data. Such data occur if some study subjects are monitored or observed continuously and thus provide recurrent-event data, while the others are observed only at discrete times and hence give only panel-count data. A more general situation is that each subject is observed continuously over certain time periods but only at discrete times over other time periods. There exists little literature on the analysis of such mixed data except that published by Zhu et al. (2013). In this paper, we consider the regression analysis of mixed data using the additive rate model and develop some estimating equation-based approaches to estimate the regression parameters of interest. Both finite sample and asymptotic properties of the resulting estimators are established, and the numerical studies suggest that the proposed methodology works well for practical situations. The approach is applied to a Childhood Cancer Survivor Study that motivated this study. PMID:25345405

  9. Algorithm For Solution Of Subset-Regression Problems

    NASA Technical Reports Server (NTRS)

    Verhaegen, Michel

    1991-01-01

    Reliable and flexible algorithm for solution of subset-regression problem performs QR decomposition with new column-pivoting strategy, enables selection of subset directly from originally defined regression parameters. This feature, in combination with number of extensions, makes algorithm very flexible for use in analysis of subset-regression problems in which parameters have physical meanings. Also extended to enable joint processing of columns contaminated by noise with those free of noise, without using scaling techniques.

  10. Regression Discontinuity Designs in Epidemiology

    PubMed Central

    Moscoe, Ellen; Mutevedzi, Portia; Newell, Marie-Louise; Bärnighausen, Till

    2014-01-01

    When patients receive an intervention based on whether they score below or above some threshold value on a continuously measured random variable, the intervention will be randomly assigned for patients close to the threshold. The regression discontinuity design exploits this fact to estimate causal treatment effects. In spite of its recent proliferation in economics, the regression discontinuity design has not been widely adopted in epidemiology. We describe regression discontinuity, its implementation, and the assumptions required for causal inference. We show that regression discontinuity is generalizable to the survival and nonlinear models that are mainstays of epidemiologic analysis. We then present an application of regression discontinuity to the much-debated epidemiologic question of when to start HIV patients on antiretroviral therapy. Using data from a large South African cohort (2007–2011), we estimate the causal effect of early versus deferred treatment eligibility on mortality. Patients whose first CD4 count was just below the 200 cells/μL CD4 count threshold had a 35% lower hazard of death (hazard ratio = 0.65 [95% confidence interval = 0.45–0.94]) than patients presenting with CD4 counts just above the threshold. We close by discussing the strengths and limitations of regression discontinuity designs for epidemiology. PMID:25061922

  11. A classical regression framework for mediation analysis: fitting one model to estimate mediation effects.

    PubMed

    Saunders, Christina T; Blume, Jeffrey D

    2017-10-26

    Mediation analysis explores the degree to which an exposure's effect on an outcome is diverted through a mediating variable. We describe a classical regression framework for conducting mediation analyses in which estimates of causal mediation effects and their variance are obtained from the fit of a single regression model. The vector of changes in exposure pathway coefficients, which we named the essential mediation components (EMCs), is used to estimate standard causal mediation effects. Because these effects are often simple functions of the EMCs, an analytical expression for their model-based variance follows directly. Given this formula, it is instructive to revisit the performance of routinely used variance approximations (e.g., delta method and resampling methods). Requiring the fit of only one model reduces the computation time required for complex mediation analyses and permits the use of a rich suite of regression tools that are not easily implemented on a system of three equations, as would be required in the Baron-Kenny framework. Using data from the BRAIN-ICU study, we provide examples to illustrate the advantages of this framework and compare it with the existing approaches. © The Author 2017. Published by Oxford University Press.

  12. Suicide in Greece 1992-2012: A time-series analysis.

    PubMed

    Papaslanis, Theodoros; Kontaxakis, Vassilis; Christodoulou, Christos; Konstantakopoulos, George; Kontaxaki, Maria-Irini; Papadimitriou, George N

    2016-08-01

    Since 2008, Greece has entered a long period of economic crisis with adverse effects on various aspects of daily life. In this frame, it is quite important to examine the suicide trends in Greece. Our analysis covered the period 1992-2012. 2012 was the last year for which official suicide data were available. The inclusion of data for pre-crisis period enabled us to assess trends in suicide preceding the economic crisis, starting in 2008. Trends in sex- and age-adjusted standardized suicide rates (SSR) were analyzed using joinpoint regression. Total SSR presented statistically significant annual decrease of 0.89% (95% confidence interval (CI): -1.7, -0.1) during the period 1992-2008. After 2009, the trend in total SSR increased statistically significant annual increase (12.48%; 95% CI: 0.3%, 26.1%). SSR in males presented an initial period of modest annual decrease (-0.84%; 95% CI: -1.6%, -0.1%), during the period 1992-2008. After 2009, an annual increase by 9.25% (95% CI: 2.7%, 16.3%) was revealed. No change in female SSR trend was observed during the studied period. According to the results of this study, there is clear evidence of an increase in the overall SSR and male SSR in Greece during the period of the current financial crisis. © The Author(s) 2016.

  13. A Simulation Investigation of Principal Component Regression.

    ERIC Educational Resources Information Center

    Allen, David E.

    Regression analysis is one of the more common analytic tools used by researchers. However, multicollinearity between the predictor variables can cause problems in using the results of regression analyses. Problems associated with multicollinearity include entanglement of relative influences of variables due to reduced precision of estimation,…

  14. Extrinsic local regression on manifold-valued data

    PubMed Central

    Lin, Lizhen; St Thomas, Brian; Zhu, Hongtu; Dunson, David B.

    2017-01-01

    We propose an extrinsic regression framework for modeling data with manifold valued responses and Euclidean predictors. Regression with manifold responses has wide applications in shape analysis, neuroscience, medical imaging and many other areas. Our approach embeds the manifold where the responses lie onto a higher dimensional Euclidean space, obtains a local regression estimate in that space, and then projects this estimate back onto the image of the manifold. Outside the regression setting both intrinsic and extrinsic approaches have been proposed for modeling i.i.d manifold-valued data. However, to our knowledge our work is the first to take an extrinsic approach to the regression problem. The proposed extrinsic regression framework is general, computationally efficient and theoretically appealing. Asymptotic distributions and convergence rates of the extrinsic regression estimates are derived and a large class of examples are considered indicating the wide applicability of our approach. PMID:29225385

  15. Cox regression analysis with missing covariates via nonparametric multiple imputation.

    PubMed

    Hsu, Chiu-Hsieh; Yu, Mandi

    2018-01-01

    We consider the situation of estimating Cox regression in which some covariates are subject to missing, and there exists additional information (including observed event time, censoring indicator and fully observed covariates) which may be predictive of the missing covariates. We propose to use two working regression models: one for predicting the missing covariates and the other for predicting the missing probabilities. For each missing covariate observation, these two working models are used to define a nearest neighbor imputing set. This set is then used to non-parametrically impute covariate values for the missing observation. Upon the completion of imputation, Cox regression is performed on the multiply imputed datasets to estimate the regression coefficients. In a simulation study, we compare the nonparametric multiple imputation approach with the augmented inverse probability weighted (AIPW) method, which directly incorporates the two working models into estimation of Cox regression, and the predictive mean matching imputation (PMM) method. We show that all approaches can reduce bias due to non-ignorable missing mechanism. The proposed nonparametric imputation method is robust to mis-specification of either one of the two working models and robust to mis-specification of the link function of the two working models. In contrast, the PMM method is sensitive to misspecification of the covariates included in imputation. The AIPW method is sensitive to the selection probability. We apply the approaches to a breast cancer dataset from Surveillance, Epidemiology and End Results (SEER) Program.

  16. Prevalence of rapid eye movement sleep behavior disorder (RBD) in Parkinson's disease: a meta and meta-regression analysis.

    PubMed

    Zhang, Xiaona; Sun, Xiaoxuan; Wang, Junhong; Tang, Liou; Xie, Anmu

    2017-01-01

    Rapid eye movement sleep behavior disorder (RBD) is thought to be one of the most frequent preceding symptoms of Parkinson's disease (PD). However, the prevalence of RBD in PD stated in the published studies is still inconsistent. We conducted a meta and meta-regression analysis in this paper to estimate the pooled prevalence. We searched the electronic databases of PubMed, ScienceDirect, EMBASE and EBSCO up to June 2016 for related articles. STATA 12.0 statistics software was used to calculate the available data from each research. The prevalence of RBD in PD patients in each study was combined to a pooled prevalence with a 95 % confidence interval (CI). Subgroup analysis and meta-regression analysis were performed to search for the causes of the heterogeneity. A total of 28 studies with 6869 PD cases were deemed eligible and included in our meta-analysis based on the inclusion and exclusion criteria. The pooled prevalence of RBD in PD was 42.3 % (95 % CI 37.4-47.1 %). In subgroup analysis and meta-regression analysis, we found that the important causes of heterogeneity were the diagnosis criteria of RBD and age of PD patients (P = 0.016, P = 0.019, respectively). The results indicate that nearly half of the PD patients are suffering from RBD. Older age and longer duration are risk factors for RBD in PD. We can use the minimal diagnosis criteria for RBD according to the International Classification of Sleep Disorders to diagnose RBD patients in our daily work if polysomnography is not necessary.

  17. Orthodontic bracket bonding without previous adhesive priming: A meta-regression analysis.

    PubMed

    Altmann, Aline Segatto Pires; Degrazia, Felipe Weidenbach; Celeste, Roger Keller; Leitune, Vicente Castelo Branco; Samuel, Susana Maria Werner; Collares, Fabrício Mezzomo

    2016-05-01

    To determine the consensus among studies that adhesive resin application improves the bond strength of orthodontic brackets and the association of methodological variables on the influence of bond strength outcome. In vitro studies were selected to answer whether adhesive resin application increases the immediate shear bond strength of metal orthodontic brackets bonded with a photo-cured orthodontic adhesive. Studies included were those comparing a group having adhesive resin to a group without adhesive resin with the primary outcome measurement shear bond strength in MPa. A systematic electronic search was performed in PubMed and Scopus databases. Nine studies were included in the analysis. Based on the pooled data and due to a high heterogeneity among studies (I(2)  =  93.3), a meta-regression analysis was conducted. The analysis demonstrated that five experimental conditions explained 86.1% of heterogeneity and four of them had significantly affected in vitro shear bond testing. The shear bond strength of metal brackets was not significantly affected when bonded with adhesive resin, when compared to those without adhesive resin. The adhesive resin application can be set aside during metal bracket bonding to enamel regardless of the type of orthodontic adhesive used.

  18. Regression-based statistical mediation and moderation analysis in clinical research: Observations, recommendations, and implementation.

    PubMed

    Hayes, Andrew F; Rockwood, Nicholas J

    2017-11-01

    There have been numerous treatments in the clinical research literature about various design, analysis, and interpretation considerations when testing hypotheses about mechanisms and contingencies of effects, popularly known as mediation and moderation analysis. In this paper we address the practice of mediation and moderation analysis using linear regression in the pages of Behaviour Research and Therapy and offer some observations and recommendations, debunk some popular myths, describe some new advances, and provide an example of mediation, moderation, and their integration as conditional process analysis using the PROCESS macro for SPSS and SAS. Our goal is to nudge clinical researchers away from historically significant but increasingly old school approaches toward modifications, revisions, and extensions that characterize more modern thinking about the analysis of the mechanisms and contingencies of effects. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. Simple estimation procedures for regression analysis of interval-censored failure time data under the proportional hazards model.

    PubMed

    Sun, Jianguo; Feng, Yanqin; Zhao, Hui

    2015-01-01

    Interval-censored failure time data occur in many fields including epidemiological and medical studies as well as financial and sociological studies, and many authors have investigated their analysis (Sun, The statistical analysis of interval-censored failure time data, 2006; Zhang, Stat Modeling 9:321-343, 2009). In particular, a number of procedures have been developed for regression analysis of interval-censored data arising from the proportional hazards model (Finkelstein, Biometrics 42:845-854, 1986; Huang, Ann Stat 24:540-568, 1996; Pan, Biometrics 56:199-203, 2000). For most of these procedures, however, one drawback is that they involve estimation of both regression parameters and baseline cumulative hazard function. In this paper, we propose two simple estimation approaches that do not need estimation of the baseline cumulative hazard function. The asymptotic properties of the resulting estimates are given, and an extensive simulation study is conducted and indicates that they work well for practical situations.

  20. Time series regression studies in environmental epidemiology.

    PubMed

    Bhaskaran, Krishnan; Gasparrini, Antonio; Hajat, Shakoor; Smeeth, Liam; Armstrong, Ben

    2013-08-01

    Time series regression studies have been widely used in environmental epidemiology, notably in investigating the short-term associations between exposures such as air pollution, weather variables or pollen, and health outcomes such as mortality, myocardial infarction or disease-specific hospital admissions. Typically, for both exposure and outcome, data are available at regular time intervals (e.g. daily pollution levels and daily mortality counts) and the aim is to explore short-term associations between them. In this article, we describe the general features of time series data, and we outline the analysis process, beginning with descriptive analysis, then focusing on issues in time series regression that differ from other regression methods: modelling short-term fluctuations in the presence of seasonal and long-term patterns, dealing with time varying confounding factors and modelling delayed ('lagged') associations between exposure and outcome. We finish with advice on model checking and sensitivity analysis, and some common extensions to the basic model.

  1. Application of principal component regression and partial least squares regression in ultraviolet spectrum water quality detection

    NASA Astrophysics Data System (ADS)

    Li, Jiangtong; Luo, Yongdao; Dai, Honglin

    2018-01-01

    Water is the source of life and the essential foundation of all life. With the development of industrialization, the phenomenon of water pollution is becoming more and more frequent, which directly affects the survival and development of human. Water quality detection is one of the necessary measures to protect water resources. Ultraviolet (UV) spectral analysis is an important research method in the field of water quality detection, which partial least squares regression (PLSR) analysis method is becoming predominant technology, however, in some special cases, PLSR's analysis produce considerable errors. In order to solve this problem, the traditional principal component regression (PCR) analysis method was improved by using the principle of PLSR in this paper. The experimental results show that for some special experimental data set, improved PCR analysis method performance is better than PLSR. The PCR and PLSR is the focus of this paper. Firstly, the principal component analysis (PCA) is performed by MATLAB to reduce the dimensionality of the spectral data; on the basis of a large number of experiments, the optimized principal component is extracted by using the principle of PLSR, which carries most of the original data information. Secondly, the linear regression analysis of the principal component is carried out with statistic package for social science (SPSS), which the coefficients and relations of principal components can be obtained. Finally, calculating a same water spectral data set by PLSR and improved PCR, analyzing and comparing two results, improved PCR and PLSR is similar for most data, but improved PCR is better than PLSR for data near the detection limit. Both PLSR and improved PCR can be used in Ultraviolet spectral analysis of water, but for data near the detection limit, improved PCR's result better than PLSR.

  2. Predictors of course in obsessive-compulsive disorder: logistic regression versus Cox regression for recurrent events.

    PubMed

    Kempe, P T; van Oppen, P; de Haan, E; Twisk, J W R; Sluis, A; Smit, J H; van Dyck, R; van Balkom, A J L M

    2007-09-01

    Two methods for predicting remissions in obsessive-compulsive disorder (OCD) treatment are evaluated. Y-BOCS measurements of 88 patients with a primary OCD (DSM-III-R) diagnosis were performed over a 16-week treatment period, and during three follow-ups. Remission at any measurement was defined as a Y-BOCS score lower than thirteen combined with a reduction of seven points when compared with baseline. Logistic regression models were compared with a Cox regression for recurrent events model. Logistic regression yielded different models at different evaluation times. The recurrent events model remained stable when fewer measurements were used. Higher baseline levels of neuroticism and more severe OCD symptoms were associated with a lower chance of remission, early age of onset and more depressive symptoms with a higher chance. Choice of outcome time affects logistic regression prediction models. Recurrent events analysis uses all information on remissions and relapses. Short- and long-term predictors for OCD remission show overlap.

  3. Deriving percentage study weights in multi-parameter meta-analysis models: with application to meta-regression, network meta-analysis and one-stage individual participant data models.

    PubMed

    Riley, Richard D; Ensor, Joie; Jackson, Dan; Burke, Danielle L

    2017-01-01

    Many meta-analysis models contain multiple parameters, for example due to multiple outcomes, multiple treatments or multiple regression coefficients. In particular, meta-regression models may contain multiple study-level covariates, and one-stage individual participant data meta-analysis models may contain multiple patient-level covariates and interactions. Here, we propose how to derive percentage study weights for such situations, in order to reveal the (otherwise hidden) contribution of each study toward the parameter estimates of interest. We assume that studies are independent, and utilise a decomposition of Fisher's information matrix to decompose the total variance matrix of parameter estimates into study-specific contributions, from which percentage weights are derived. This approach generalises how percentage weights are calculated in a traditional, single parameter meta-analysis model. Application is made to one- and two-stage individual participant data meta-analyses, meta-regression and network (multivariate) meta-analysis of multiple treatments. These reveal percentage study weights toward clinically important estimates, such as summary treatment effects and treatment-covariate interactions, and are especially useful when some studies are potential outliers or at high risk of bias. We also derive percentage study weights toward methodologically interesting measures, such as the magnitude of ecological bias (difference between within-study and across-study associations) and the amount of inconsistency (difference between direct and indirect evidence in a network meta-analysis).

  4. A Survey of UML Based Regression Testing

    NASA Astrophysics Data System (ADS)

    Fahad, Muhammad; Nadeem, Aamer

    Regression testing is the process of ensuring software quality by analyzing whether changed parts behave as intended, and unchanged parts are not affected by the modifications. Since it is a costly process, a lot of techniques are proposed in the research literature that suggest testers how to build regression test suite from existing test suite with minimum cost. In this paper, we discuss the advantages and drawbacks of using UML diagrams for regression testing and analyze that UML model helps in identifying changes for regression test selection effectively. We survey the existing UML based regression testing techniques and provide an analysis matrix to give a quick insight into prominent features of the literature work. We discuss the open research issues like managing and reducing the size of regression test suite, prioritization of the test cases that would be helpful during strict schedule and resources that remain to be addressed for UML based regression testing.

  5. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…

  6. Combining multiple regression and principal component analysis for accurate predictions for column ozone in Peninsular Malaysia

    NASA Astrophysics Data System (ADS)

    Rajab, Jasim M.; MatJafri, M. Z.; Lim, H. S.

    2013-06-01

    This study encompasses columnar ozone modelling in the peninsular Malaysia. Data of eight atmospheric parameters [air surface temperature (AST), carbon monoxide (CO), methane (CH4), water vapour (H2Ovapour), skin surface temperature (SSKT), atmosphere temperature (AT), relative humidity (RH), and mean surface pressure (MSP)] data set, retrieved from NASA's Atmospheric Infrared Sounder (AIRS), for the entire period (2003-2008) was employed to develop models to predict the value of columnar ozone (O3) in study area. The combined method, which is based on using both multiple regressions combined with principal component analysis (PCA) modelling, was used to predict columnar ozone. This combined approach was utilized to improve the prediction accuracy of columnar ozone. Separate analysis was carried out for north east monsoon (NEM) and south west monsoon (SWM) seasons. The O3 was negatively correlated with CH4, H2Ovapour, RH, and MSP, whereas it was positively correlated with CO, AST, SSKT, and AT during both the NEM and SWM season periods. Multiple regression analysis was used to fit the columnar ozone data using the atmospheric parameter's variables as predictors. A variable selection method based on high loading of varimax rotated principal components was used to acquire subsets of the predictor variables to be comprised in the linear regression model of the atmospheric parameter's variables. It was found that the increase in columnar O3 value is associated with an increase in the values of AST, SSKT, AT, and CO and with a drop in the levels of CH4, H2Ovapour, RH, and MSP. The result of fitting the best models for the columnar O3 value using eight of the independent variables gave about the same values of the R (≈0.93) and R2 (≈0.86) for both the NEM and SWM seasons. The common variables that appeared in both regression equations were SSKT, CH4 and RH, and the principal precursor of the columnar O3 value in both the NEM and SWM seasons was SSKT.

  7. Estimating the causes of traffic accidents using logistic regression and discriminant analysis.

    PubMed

    Karacasu, Murat; Ergül, Barış; Altin Yavuz, Arzu

    2014-01-01

    Factors that affect traffic accidents have been analysed in various ways. In this study, we use the methods of logistic regression and discriminant analysis to determine the damages due to injury and non-injury accidents in the Eskisehir Province. Data were obtained from the accident reports of the General Directorate of Security in Eskisehir; 2552 traffic accidents between January and December 2009 were investigated regarding whether they resulted in injury. According to the results, the effects of traffic accidents were reflected in the variables. These results provide a wealth of information that may aid future measures toward the prevention of undesired results.

  8. The Use of Multiple Regression and Trend Analysis to Understand Enrollment Fluctuations. AIR Forum 1979 Paper.

    ERIC Educational Resources Information Center

    Campbell, S. Duke; Greenberg, Barry

    The development of a predictive equation capable of explaining a significant percentage of enrollment variability at Florida International University is described. A model utilizing trend analysis and a multiple regression approach to enrollment forecasting was adapted to investigate enrollment dynamics at the university. Four independent…

  9. Stolon regression

    PubMed Central

    Cherry Vogt, Kimberly S

    2008-01-01

    Many colonial organisms encrust surfaces with feeding and reproductive polyps connected by vascular stolons. Such colonies often show a dichotomy between runner-like forms, with widely spaced polyps and long stolon connections, and sheet-like forms, with closely spaced polyps and short stolon connections. Generative processes, such as rates of polyp initiation relative to rates of stolon elongation, are typically thought to underlie this dichotomy. Regressive processes, such as tissue regression and cell death, may also be relevant. In this context, we have recently characterized the process of stolon regression in a colonial cnidarian, Podocoryna carnea. Stolon regression occurs naturally in these colonies. To characterize this process in detail, high levels of stolon regression were induced in experimental colonies by treatment with reactive oxygen and reactive nitrogen species (ROS and RNS). Either treatment results in stolon regression and is accompanied by high levels of endogenous ROS and RNS as well as morphological indications of cell death in the regressing stolon. The initiating step in regression appears to be a perturbation of normal colony-wide gastrovascular flow. This suggests more general connections between stolon regression and a wide variety of environmental effects. Here we summarize our results and further discuss such connections. PMID:19704785

  10. Quantifying trends in disease impact to produce a consistent and reproducible definition of an emerging infectious disease.

    PubMed

    Funk, Sebastian; Bogich, Tiffany L; Jones, Kate E; Kilpatrick, A Marm; Daszak, Peter

    2013-01-01

    The proper allocation of public health resources for research and control requires quantification of both a disease's current burden and the trend in its impact. Infectious diseases that have been labeled as "emerging infectious diseases" (EIDs) have received heightened scientific and public attention and resources. However, the label 'emerging' is rarely backed by quantitative analysis and is often used subjectively. This can lead to over-allocation of resources to diseases that are incorrectly labelled "emerging," and insufficient allocation of resources to diseases for which evidence of an increasing or high sustained impact is strong. We suggest a simple quantitative approach, segmented regression, to characterize the trends and emergence of diseases. Segmented regression identifies one or more trends in a time series and determines the most statistically parsimonious split(s) (or joinpoints) in the time series. These joinpoints in the time series indicate time points when a change in trend occurred and may identify periods in which drivers of disease impact change. We illustrate the method by analyzing temporal patterns in incidence data for twelve diseases. This approach provides a way to classify a disease as currently emerging, re-emerging, receding, or stable based on temporal trends, as well as to pinpoint the time when the change in these trends happened. We argue that quantitative approaches to defining emergence based on the trend in impact of a disease can, with appropriate context, be used to prioritize resources for research and control. Implementing this more rigorous definition of an EID will require buy-in and enforcement from scientists, policy makers, peer reviewers and journal editors, but has the potential to improve resource allocation for global health.

  11. On the use of regression analysis for the estimation of human biological age.

    PubMed

    Krøll, J; Saxtrup, O

    2000-01-01

    The present investigation compares three linear regression procedures for the definition of human biological age (bioage). As a model system for bioage definition is used the variations with age of blood hemoglobin (B-hemoglobin) in males in the age range 50-95 years. The bioage measures compared are: 1: P-bioage; defined from regression of chronological age on B-hemoglobin results. 2: AC-bioage; obtained by indirect regression, using in reverse the equation describing the regression of B-hemoglobin on age in a reference population. 3: BC-bioage; defined by orthogonal regression on the reference regression line of B-hemoglobin on age. It is demonstrated that the P-bioage measure gives an overestimation of the bioage in the younger and an underestimation in the older individuals. This 'regression to the mean' is avoided using the indirect regression procedures. Here the relatively low SD of the BC-bioage measure results from the inclusion of individual chronological age in the orthogonal regression procedure. Observations on male blood donors illustrates the variation of the AC- and BC-bioage measures in the individual.

  12. Gaussian Process Regression Model in Spatial Logistic Regression

    NASA Astrophysics Data System (ADS)

    Sofro, A.; Oktaviarina, A.

    2018-01-01

    Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to accommodate the issue. In this paper, we will focus on spatial modeling with GPR for binomial data with logit link function. The performance of the model will be investigated. We will discuss the inference of how to estimate the parameters and hyper-parameters and to predict as well. Furthermore, simulation studies will be explained in the last section.

  13. The microcomputer scientific software series 2: general linear model--regression.

    Treesearch

    Harold M. Rauscher

    1983-01-01

    The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...

  14. Secure and Efficient Regression Analysis Using a Hybrid Cryptographic Framework: Development and Evaluation

    PubMed Central

    Jiang, Xiaoqian; Aziz, Md Momin Al; Wang, Shuang; Mohammed, Noman

    2018-01-01

    Background Machine learning is an effective data-driven tool that is being widely used to extract valuable patterns and insights from data. Specifically, predictive machine learning models are very important in health care for clinical data analysis. The machine learning algorithms that generate predictive models often require pooling data from different sources to discover statistical patterns or correlations among different attributes of the input data. The primary challenge is to fulfill one major objective: preserving the privacy of individuals while discovering knowledge from data. Objective Our objective was to develop a hybrid cryptographic framework for performing regression analysis over distributed data in a secure and efficient way. Methods Existing secure computation schemes are not suitable for processing the large-scale data that are used in cutting-edge machine learning applications. We designed, developed, and evaluated a hybrid cryptographic framework, which can securely perform regression analysis, a fundamental machine learning algorithm using somewhat homomorphic encryption and a newly introduced secure hardware component of Intel Software Guard Extensions (Intel SGX) to ensure both privacy and efficiency at the same time. Results Experimental results demonstrate that our proposed method provides a better trade-off in terms of security and efficiency than solely secure hardware-based methods. Besides, there is no approximation error. Computed model parameters are exactly similar to plaintext results. Conclusions To the best of our knowledge, this kind of secure computation model using a hybrid cryptographic framework, which leverages both somewhat homomorphic encryption and Intel SGX, is not proposed or evaluated to this date. Our proposed framework ensures data security and computational efficiency at the same time. PMID:29506966

  15. Effects of measurement errors on psychometric measurements in ergonomics studies: Implications for correlations, ANOVA, linear regression, factor analysis, and linear discriminant analysis.

    PubMed

    Liu, Yan; Salvendy, Gavriel

    2009-05-01

    This paper aims to demonstrate the effects of measurement errors on psychometric measurements in ergonomics studies. A variety of sources can cause random measurement errors in ergonomics studies and these errors can distort virtually every statistic computed and lead investigators to erroneous conclusions. The effects of measurement errors on five most widely used statistical analysis tools have been discussed and illustrated: correlation; ANOVA; linear regression; factor analysis; linear discriminant analysis. It has been shown that measurement errors can greatly attenuate correlations between variables, reduce statistical power of ANOVA, distort (overestimate, underestimate or even change the sign of) regression coefficients, underrate the explanation contributions of the most important factors in factor analysis and depreciate the significance of discriminant function and discrimination abilities of individual variables in discrimination analysis. The discussions will be restricted to subjective scales and survey methods and their reliability estimates. Other methods applied in ergonomics research, such as physical and electrophysiological measurements and chemical and biomedical analysis methods, also have issues of measurement errors, but they are beyond the scope of this paper. As there has been increasing interest in the development and testing of theories in ergonomics research, it has become very important for ergonomics researchers to understand the effects of measurement errors on their experiment results, which the authors believe is very critical to research progress in theory development and cumulative knowledge in the ergonomics field.

  16. Spatial Bayesian Latent Factor Regression Modeling of Coordinate-based Meta-analysis Data

    PubMed Central

    Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D.; Nichols, Thomas E.

    2017-01-01

    Summary Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the paper are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to 1) identify areas of consistent activation; and 2) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterised as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. PMID:28498564

  17. Spatial Bayesian latent factor regression modeling of coordinate-based meta-analysis data.

    PubMed

    Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D; Nichols, Thomas E

    2018-03-01

    Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the article are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to (i) identify areas of consistent activation; and (ii) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterized as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. © 2017, The International Biometric Society.

  18. P300 Amplitude in Alzheimer's Disease: A Meta-Analysis and Meta-Regression.

    PubMed

    Hedges, Dawson; Janis, Rebecca; Mickelson, Stephen; Keith, Cierra; Bennett, David; Brown, Bruce L

    2016-01-01

    Alzheimer's disease accounts for 60% of all dementia. Numerous biomarkers have been developed that can help in making an early diagnosis. The P300 is an event-related potential that may be abnormal in Alzheimer's disease. Given the possible association between P300 amplitude and Alzheimer's disease and the need for biomarkers in early Alzheimer's disease, the main purpose of this meta-analysis and meta-regression was to characterize P300 amplitude in probable Alzheimer's disease compared to healthy controls. Using online search engines, we identified peer-reviewed articles containing amplitude measures for the P300 in response to a visual or auditory oddball stimulus in subjects with Alzheimer's disease and in a healthy control group and pooled effect sizes for differences in P300 amplitude between Alzheimer's disease and control groups to obtain summary effect sizes. We also used meta-regression to determine whether age, sex, educational attainment, or dementia severity affected the association between P300 amplitude and Alzheimer's disease. Twenty articles containing a total of 646 subjects met inclusion and exclusion criteria. The overall effect size from all electrode locations was 1.079 (95% confidence interval=0.745-1.412, P<.001). The pooled effect sizes for the Cz, Fz, and Pz locations were 1.226 (P<.001), 0.724 (P=.0007), and 1.430 (P<.001), respectively. Meta-regression showed an association between amplitude and educational attainment, but no association between amplitude and age, sex, and dementia severity. In conclusion, P300 amplitude is smaller in subjects with Alzheimer's disease than in healthy controls. © EEG and Clinical Neuroscience Society (ECNS) 2014.

  19. Fuzzy multinomial logistic regression analysis: A multi-objective programming approach

    NASA Astrophysics Data System (ADS)

    Abdalla, Hesham A.; El-Sayed, Amany A.; Hamed, Ramadan

    2017-05-01

    Parameter estimation for multinomial logistic regression is usually based on maximizing the likelihood function. For large well-balanced datasets, Maximum Likelihood (ML) estimation is a satisfactory approach. Unfortunately, ML can fail completely or at least produce poor results in terms of estimated probabilities and confidence intervals of parameters, specially for small datasets. In this study, a new approach based on fuzzy concepts is proposed to estimate parameters of the multinomial logistic regression. The study assumes that the parameters of multinomial logistic regression are fuzzy. Based on the extension principle stated by Zadeh and Bárdossy's proposition, a multi-objective programming approach is suggested to estimate these fuzzy parameters. A simulation study is used to evaluate the performance of the new approach versus Maximum likelihood (ML) approach. Results show that the new proposed model outperforms ML in cases of small datasets.

  20. CATEGORICAL REGRESSION ANALYSIS OF ACUTE INHALATION TOXICITY DATA FOR HYDROGEN SULFIDE

    EPA Science Inventory

    Categorical regression is one of the tools offered by the U.S. EPA for derivation of acute reference exposures (AREs), which are dose-response assessments for acute exposures to inhaled chemicals. Categorical regression is used as a meta-analytical technique to calculate probabi...

  1. Multivariate analysis of fMRI time series: classification and regression of brain responses using machine learning.

    PubMed

    Formisano, Elia; De Martino, Federico; Valente, Giancarlo

    2008-09-01

    Machine learning and pattern recognition techniques are being increasingly employed in functional magnetic resonance imaging (fMRI) data analysis. By taking into account the full spatial pattern of brain activity measured simultaneously at many locations, these methods allow detecting subtle, non-strictly localized effects that may remain invisible to the conventional analysis with univariate statistical methods. In typical fMRI applications, pattern recognition algorithms "learn" a functional relationship between brain response patterns and a perceptual, cognitive or behavioral state of a subject expressed in terms of a label, which may assume discrete (classification) or continuous (regression) values. This learned functional relationship is then used to predict the unseen labels from a new data set ("brain reading"). In this article, we describe the mathematical foundations of machine learning applications in fMRI. We focus on two methods, support vector machines and relevance vector machines, which are respectively suited for the classification and regression of fMRI patterns. Furthermore, by means of several examples and applications, we illustrate and discuss the methodological challenges of using machine learning algorithms in the context of fMRI data analysis.

  2. Regression: A Bibliography.

    ERIC Educational Resources Information Center

    Pedrini, D. T.; Pedrini, Bonnie C.

    Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…

  3. Selection of higher order regression models in the analysis of multi-factorial transcription data.

    PubMed

    Prazeres da Costa, Olivia; Hoffman, Arthur; Rey, Johannes W; Mansmann, Ulrich; Buch, Thorsten; Tresch, Achim

    2014-01-01

    Many studies examine gene expression data that has been obtained under the influence of multiple factors, such as genetic background, environmental conditions, or exposure to diseases. The interplay of multiple factors may lead to effect modification and confounding. Higher order linear regression models can account for these effects. We present a new methodology for linear model selection and apply it to microarray data of bone marrow-derived macrophages. This experiment investigates the influence of three variable factors: the genetic background of the mice from which the macrophages were obtained, Yersinia enterocolitica infection (two strains, and a mock control), and treatment/non-treatment with interferon-γ. We set up four different linear regression models in a hierarchical order. We introduce the eruption plot as a new practical tool for model selection complementary to global testing. It visually compares the size and significance of effect estimates between two nested models. Using this methodology we were able to select the most appropriate model by keeping only relevant factors showing additional explanatory power. Application to experimental data allowed us to qualify the interaction of factors as either neutral (no interaction), alleviating (co-occurring effects are weaker than expected from the single effects), or aggravating (stronger than expected). We find a biologically meaningful gene cluster of putative C2TA target genes that appear to be co-regulated with MHC class II genes. We introduced the eruption plot as a tool for visual model comparison to identify relevant higher order interactions in the analysis of expression data obtained under the influence of multiple factors. We conclude that model selection in higher order linear regression models should generally be performed for the analysis of multi-factorial microarray data.

  4. Physical and Cognitive-Affective Factors Associated with Fatigue in Individuals with Fibromyalgia: A Multiple Regression Analysis

    ERIC Educational Resources Information Center

    Muller, Veronica; Brooks, Jessica; Tu, Wei-Mo; Moser, Erin; Lo, Chu-Ling; Chan, Fong

    2015-01-01

    Purpose: The main objective of this study was to determine the extent to which physical and cognitive-affective factors are associated with fibromyalgia (FM) fatigue. Method: A quantitative descriptive design using correlation techniques and multiple regression analysis. The participants consisted of 302 members of the National Fibromyalgia &…

  5. MIXOR: a computer program for mixed-effects ordinal regression analysis.

    PubMed

    Hedeker, D; Gibbons, R D

    1996-03-01

    MIXOR provides maximum marginal likelihood estimates for mixed-effects ordinal probit, logistic, and complementary log-log regression models. These models can be used for analysis of dichotomous and ordinal outcomes from either a clustered or longitudinal design. For clustered data, the mixed-effects model assumes that data within clusters are dependent. The degree of dependency is jointly estimated with the usual model parameters, thus adjusting for dependence resulting from clustering of the data. Similarly, for longitudinal data, the mixed-effects approach can allow for individual-varying intercepts and slopes across time, and can estimate the degree to which these time-related effects vary in the population of individuals. MIXOR uses marginal maximum likelihood estimation, utilizing a Fisher-scoring solution. For the scoring solution, the Cholesky factor of the random-effects variance-covariance matrix is estimated, along with the effects of model covariates. Examples illustrating usage and features of MIXOR are provided.

  6. Cancer prevalence and education by cancer site: logistic regression analysis.

    PubMed

    Johnson, Stephanie; Corsten, Martin J; McDonald, James T; Gupta, Michael

    2010-10-01

    Previously, using the American National Health Interview Survey (NHIS) and a logistic regression analysis, we found that upper aerodigestive tract (UADT) cancer is correlated with low socioeconomic status (SES). The objective of this study was to determine if this correlation between low SES and cancer prevalence exists for other cancers. We again used the NHIS and employed education level as our main measure of SES. We controlled for potentially confounding factors, including smoking status and alcohol consumption. We found that only two cancer subsites shared the pattern of increased prevalence with low education level and decreased prevalence with high education level: UADT cancer and cervical cancer. UADT cancer and cervical cancer were the only two cancers identified that had a link between prevalence and lower education level. This raises the possibility that an associated risk factor for the two cancers is causing the relationship between lower education level and prevalence.

  7. Statistical model to perform error analysis of curve fits of wind tunnel test data using the techniques of analysis of variance and regression analysis

    NASA Technical Reports Server (NTRS)

    Alston, D. W.

    1981-01-01

    The considered research had the objective to design a statistical model that could perform an error analysis of curve fits of wind tunnel test data using analysis of variance and regression analysis techniques. Four related subproblems were defined, and by solving each of these a solution to the general research problem was obtained. The capabilities of the evolved true statistical model are considered. The least squares fit is used to determine the nature of the force, moment, and pressure data. The order of the curve fit is increased in order to delete the quadratic effect in the residuals. The analysis of variance is used to determine the magnitude and effect of the error factor associated with the experimental data.

  8. Investigating bias in squared regression structure coefficients

    PubMed Central

    Nimon, Kim F.; Zientek, Linda R.; Thompson, Bruce

    2015-01-01

    The importance of structure coefficients and analogs of regression weights for analysis within the general linear model (GLM) has been well-documented. The purpose of this study was to investigate bias in squared structure coefficients in the context of multiple regression and to determine if a formula that had been shown to correct for bias in squared Pearson correlation coefficients and coefficients of determination could be used to correct for bias in squared regression structure coefficients. Using data from a Monte Carlo simulation, this study found that squared regression structure coefficients corrected with Pratt's formula produced less biased estimates and might be more accurate and stable estimates of population squared regression structure coefficients than estimates with no such corrections. While our findings are in line with prior literature that identified multicollinearity as a predictor of bias in squared regression structure coefficients but not coefficients of determination, the findings from this study are unique in that the level of predictive power, number of predictors, and sample size were also observed to contribute bias in squared regression structure coefficients. PMID:26217273

  9. SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES

    PubMed Central

    Zhu, Liping; Huang, Mian; Li, Runze

    2012-01-01

    This paper is concerned with quantile regression for a semiparametric regression model, in which both the conditional mean and conditional variance function of the response given the covariates admit a single-index structure. This semiparametric regression model enables us to reduce the dimension of the covariates and simultaneously retains the flexibility of nonparametric regression. Under mild conditions, we show that the simple linear quantile regression offers a consistent estimate of the index parameter vector. This is a surprising and interesting result because the single-index model is possibly misspecified under the linear quantile regression. With a root-n consistent estimate of the index vector, one may employ a local polynomial regression technique to estimate the conditional quantile function. This procedure is computationally efficient, which is very appealing in high-dimensional data analysis. We show that the resulting estimator of the quantile function performs asymptotically as efficiently as if the true value of the index vector were known. The methodologies are demonstrated through comprehensive simulation studies and an application to a real dataset. PMID:24501536

  10. Performance of an Axisymmetric Rocket Based Combined Cycle Engine During Rocket Only Operation Using Linear Regression Analysis

    NASA Technical Reports Server (NTRS)

    Smith, Timothy D.; Steffen, Christopher J., Jr.; Yungster, Shaye; Keller, Dennis J.

    1998-01-01

    The all rocket mode of operation is shown to be a critical factor in the overall performance of a rocket based combined cycle (RBCC) vehicle. An axisymmetric RBCC engine was used to determine specific impulse efficiency values based upon both full flow and gas generator configurations. Design of experiments methodology was used to construct a test matrix and multiple linear regression analysis was used to build parametric models. The main parameters investigated in this study were: rocket chamber pressure, rocket exit area ratio, injected secondary flow, mixer-ejector inlet area, mixer-ejector area ratio, and mixer-ejector length-to-inlet diameter ratio. A perfect gas computational fluid dynamics analysis, using both the Spalart-Allmaras and k-omega turbulence models, was performed with the NPARC code to obtain values of vacuum specific impulse. Results from the multiple linear regression analysis showed that for both the full flow and gas generator configurations increasing mixer-ejector area ratio and rocket area ratio increase performance, while increasing mixer-ejector inlet area ratio and mixer-ejector length-to-diameter ratio decrease performance. Increasing injected secondary flow increased performance for the gas generator analysis, but was not statistically significant for the full flow analysis. Chamber pressure was found to be not statistically significant.

  11. Analysis of the labor productivity of enterprises via quantile regression

    NASA Astrophysics Data System (ADS)

    Türkan, Semra

    2017-07-01

    In this study, we have analyzed the factors that affect the performance of Turkey's Top 500 Industrial Enterprises using quantile regression. The variable about labor productivity of enterprises is considered as dependent variable, the variableabout assets is considered as independent variable. The distribution of labor productivity of enterprises is right-skewed. If the dependent distribution is skewed, linear regression could not catch important aspects of the relationships between the dependent variable and its predictors due to modeling only the conditional mean. Hence, the quantile regression, which allows modelingany quantilesof the dependent distribution, including the median,appears to be useful. It examines whether relationships between dependent and independent variables are different for low, medium, and high percentiles. As a result of analyzing data, the effect of total assets is relatively constant over the entire distribution, except the upper tail. It hasa moderately stronger effect in the upper tail.

  12. Neither fixed nor random: weighted least squares meta-regression.

    PubMed

    Stanley, T D; Doucouliagos, Hristos

    2017-03-01

    Our study revisits and challenges two core conventional meta-regression estimators: the prevalent use of 'mixed-effects' or random-effects meta-regression analysis and the correction of standard errors that defines fixed-effects meta-regression analysis (FE-MRA). We show how and explain why an unrestricted weighted least squares MRA (WLS-MRA) estimator is superior to conventional random-effects (or mixed-effects) meta-regression when there is publication (or small-sample) bias that is as good as FE-MRA in all cases and better than fixed effects in most practical applications. Simulations and statistical theory show that WLS-MRA provides satisfactory estimates of meta-regression coefficients that are practically equivalent to mixed effects or random effects when there is no publication bias. When there is publication selection bias, WLS-MRA always has smaller bias than mixed effects or random effects. In practical applications, an unrestricted WLS meta-regression is likely to give practically equivalent or superior estimates to fixed-effects, random-effects, and mixed-effects meta-regression approaches. However, random-effects meta-regression remains viable and perhaps somewhat preferable if selection for statistical significance (publication bias) can be ruled out and when random, additive normal heterogeneity is known to directly affect the 'true' regression coefficient. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.

  13. Logistic regression analysis of factors associated with avascular necrosis of the femoral head following femoral neck fractures in middle-aged and elderly patients.

    PubMed

    Ai, Zi-Sheng; Gao, You-Shui; Sun, Yuan; Liu, Yue; Zhang, Chang-Qing; Jiang, Cheng-Hua

    2013-03-01

    Risk factors for femoral neck fracture-induced avascular necrosis of the femoral head have not been elucidated clearly in middle-aged and elderly patients. Moreover, the high incidence of screw removal in China and its effect on the fate of the involved femoral head require statistical methods to reflect their intrinsic relationship. Ninety-nine patients older than 45 years with femoral neck fracture were treated by internal fixation between May 1999 and April 2004. Descriptive analysis, interaction analysis between associated factors, single factor logistic regression, multivariate logistic regression, and detailed interaction analysis were employed to explore potential relationships among associated factors. Avascular necrosis of the femoral head was found in 15 cases (15.2 %). Age × the status of implants (removal vs. maintenance) and gender × the timing of reduction were interactive according to two-factor interactive analysis. Age, the displacement of fractures, the quality of reduction, and the status of implants were found to be significant factors in single factor logistic regression analysis. Age, age × the status of implants, and the quality of reduction were found to be significant factors in multivariate logistic regression analysis. In fine interaction analysis after multivariate logistic regression analysis, implant removal was the most important risk factor for avascular necrosis in 56-to-85-year-old patients, with a risk ratio of 26.00 (95 % CI = 3.076-219.747). The middle-aged and elderly have less incidence of avascular necrosis of the femoral head following femoral neck fractures treated by cannulated screws. The removal of cannulated screws can induce a significantly high incidence of avascular necrosis of the femoral head in elderly patients, while a high-quality reduction is helpful to reduce avascular necrosis.

  14. Applications of statistics to medical science, III. Correlation and regression.

    PubMed

    Watanabe, Hiroshi

    2012-01-01

    In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.

  15. Deletion Diagnostics for Alternating Logistic Regressions

    PubMed Central

    Preisser, John S.; By, Kunthel; Perin, Jamie; Qaqish, Bahjat F.

    2013-01-01

    Deletion diagnostics are introduced for the regression analysis of clustered binary outcomes estimated with alternating logistic regressions, an implementation of generalized estimating equations (GEE) that estimates regression coefficients in a marginal mean model and in a model for the intracluster association given by the log odds ratio. The diagnostics are developed within an estimating equations framework that recasts the estimating functions for association parameters based upon conditional residuals into equivalent functions based upon marginal residuals. Extensions of earlier work on GEE diagnostics follow directly, including computational formulae for one-step deletion diagnostics that measure the influence of a cluster of observations on the estimated regression parameters and on the overall marginal mean or association model fit. The diagnostic formulae are evaluated with simulations studies and with an application concerning an assessment of factors associated with health maintenance visits in primary care medical practices. The application and the simulations demonstrate that the proposed cluster-deletion diagnostics for alternating logistic regressions are good approximations of their exact fully iterated counterparts. PMID:22777960

  16. Poisson Mixture Regression Models for Heart Disease Prediction.

    PubMed

    Mufudza, Chipo; Erol, Hamza

    2016-01-01

    Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model.

  17. Poisson Mixture Regression Models for Heart Disease Prediction

    PubMed Central

    Erol, Hamza

    2016-01-01

    Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model. PMID:27999611

  18. Stress Regression Analysis of Asphalt Concrete Deck Pavement Based on Orthogonal Experimental Design and Interlayer Contact

    NASA Astrophysics Data System (ADS)

    Wang, Xuntao; Feng, Jianhu; Wang, Hu; Hong, Shidi; Zheng, Supei

    2018-03-01

    A three-dimensional finite element box girder bridge and its asphalt concrete deck pavement were established by ANSYS software, and the interlayer bonding condition of asphalt concrete deck pavement was assumed to be contact bonding condition. Orthogonal experimental design is used to arrange the testing plans of material parameters, and an evaluation of the effect of different material parameters in the mechanical response of asphalt concrete surface layer was conducted by multiple linear regression model and using the results from the finite element analysis. Results indicated that stress regression equations can well predict the stress of the asphalt concrete surface layer, and elastic modulus of waterproof layer has a significant influence on stress values of asphalt concrete surface layer.

  19. Correlation and simple linear regression.

    PubMed

    Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G

    2003-06-01

    In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.

  20. Secure and Efficient Regression Analysis Using a Hybrid Cryptographic Framework: Development and Evaluation.

    PubMed

    Sadat, Md Nazmus; Jiang, Xiaoqian; Aziz, Md Momin Al; Wang, Shuang; Mohammed, Noman

    2018-03-05

    Machine learning is an effective data-driven tool that is being widely used to extract valuable patterns and insights from data. Specifically, predictive machine learning models are very important in health care for clinical data analysis. The machine learning algorithms that generate predictive models often require pooling data from different sources to discover statistical patterns or correlations among different attributes of the input data. The primary challenge is to fulfill one major objective: preserving the privacy of individuals while discovering knowledge from data. Our objective was to develop a hybrid cryptographic framework for performing regression analysis over distributed data in a secure and efficient way. Existing secure computation schemes are not suitable for processing the large-scale data that are used in cutting-edge machine learning applications. We designed, developed, and evaluated a hybrid cryptographic framework, which can securely perform regression analysis, a fundamental machine learning algorithm using somewhat homomorphic encryption and a newly introduced secure hardware component of Intel Software Guard Extensions (Intel SGX) to ensure both privacy and efficiency at the same time. Experimental results demonstrate that our proposed method provides a better trade-off in terms of security and efficiency than solely secure hardware-based methods. Besides, there is no approximation error. Computed model parameters are exactly similar to plaintext results. To the best of our knowledge, this kind of secure computation model using a hybrid cryptographic framework, which leverages both somewhat homomorphic encryption and Intel SGX, is not proposed or evaluated to this date. Our proposed framework ensures data security and computational efficiency at the same time. ©Md Nazmus Sadat, Xiaoqian Jiang, Md Momin Al Aziz, Shuang Wang, Noman Mohammed. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 05.03.2018.

  1. The comparison of robust partial least squares regression with robust principal component regression on a real

    NASA Astrophysics Data System (ADS)

    Polat, Esra; Gunay, Suleyman

    2013-10-01

    One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes the overestimation of the regression parameters and increase of the variance of these parameters. Hence, in case of multicollinearity presents, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are then performed. SIMPLS algorithm is the leading PLSR algorithm because of its speed, efficiency and results are easier to interpret. However, both of the CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) have been presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, firstly, a robust Principal Component Analysis (PCA) method for high-dimensional data on the independent variables is applied, then, the dependent variables are regressed on the scores using a robust regression method. RSIMPLS has been constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to show the usage of RPCR and RSIMPLS methods on an econometric data set, hence, making a comparison of two methods on an inflation model of Turkey. The considered methods have been compared in terms of predictive ability and goodness of fit by using a robust Root Mean Squared Error of Cross-validation (R-RMSECV), a robust R2 value and Robust Component Selection (RCS) statistic.

  2. Multiple regression for physiological data analysis: the problem of multicollinearity.

    PubMed

    Slinker, B K; Glantz, S A

    1985-07-01

    Multiple linear regression, in which several predictor variables are related to a response variable, is a powerful statistical tool for gaining quantitative insight into complex in vivo physiological systems. For these insights to be correct, all predictor variables must be uncorrelated. However, in many physiological experiments the predictor variables cannot be precisely controlled and thus change in parallel (i.e., they are highly correlated). There is a redundancy of information about the response, a situation called multicollinearity, that leads to numerical problems in estimating the parameters in regression equations; the parameters are often of incorrect magnitude or sign or have large standard errors. Although multicollinearity can be avoided with good experimental design, not all interesting physiological questions can be studied without encountering multicollinearity. In these cases various ad hoc procedures have been proposed to mitigate multicollinearity. Although many of these procedures are controversial, they can be helpful in applying multiple linear regression to some physiological problems.

  3. [Influencing factors on depression among medical staff in Hunan province under ordinal regression analysis].

    PubMed

    Liu, Zhi-yu; Zhong, Meng; Hai, Yan; Du, Qi-yun; Wang, Ai-hua; Xie, Dong-hua

    2012-11-01

    To understand the situation of depression and its related influencing factors among medical staff in Hunan province. Data were collected through random sampling with multi-stage stratified cluster. Wilcoxon rank sum test, Kruskal-Wallis H test and Ordinal regression analysis were used for data analysis by SPSS 17.0 software. This survey was including 16,000 medical personnel with 14, 988 valid questionnaires and the effective rate was 93.68%. from the single factor analysis showed that factors as: level of the hospital grading, gender, education background, age, occupation, title, departments, the number of continue education, income, working overtime every week, the frequency of night work, the number of patients treated in the emergency room etc., had statistical significances (P < 0.05). Data from ordinal regression showed that the probabilities related to depression that clinicians and nurses suffering from were 1.58 times more than the pharmacists (OR = 1.58, 95%CI: 1.30 - 1.92). The probability among those whose income was less than 2000 Yuan/month was 2.19 times of the ones whose earned more than 3000 Yuan/month (OR = 2.19, 95%CI: 2.05 - 2.35). The higher the numbers of days with working overtime every week, the frequencies of night work, and the numbers of patients being treated at the emergency room, with more probabilities of the people with depression seen in our study. Depression seemed to be common among doctors and nurses. We suggested that the government need to increase the monthly income and to reduce the workload and intensity, lessen the overworking time, etc.

  4. Regression Analysis of Combined Gene Expression Regulation in Acute Myeloid Leukemia

    PubMed Central

    Li, Yue; Liang, Minggao; Zhang, Zhaolei

    2014-01-01

    Gene expression is a combinatorial function of genetic/epigenetic factors such as copy number variation (CNV), DNA methylation (DM), transcription factors (TF) occupancy, and microRNA (miRNA) post-transcriptional regulation. At the maturity of microarray/sequencing technologies, large amounts of data measuring the genome-wide signals of those factors became available from Encyclopedia of DNA Elements (ENCODE) and The Cancer Genome Atlas (TCGA). However, there is a lack of an integrative model to take full advantage of these rich yet heterogeneous data. To this end, we developed RACER (Regression Analysis of Combined Expression Regulation), which fits the mRNA expression as response using as explanatory variables, the TF data from ENCODE, and CNV, DM, miRNA expression signals from TCGA. Briefly, RACER first infers the sample-specific regulatory activities by TFs and miRNAs, which are then used as inputs to infer specific TF/miRNA-gene interactions. Such a two-stage regression framework circumvents a common difficulty in integrating ENCODE data measured in generic cell-line with the sample-specific TCGA measurements. As a case study, we integrated Acute Myeloid Leukemia (AML) data from TCGA and the related TF binding data measured in K562 from ENCODE. As a proof-of-concept, we first verified our model formalism by 10-fold cross-validation on predicting gene expression. We next evaluated RACER on recovering known regulatory interactions, and demonstrated its superior statistical power over existing methods in detecting known miRNA/TF targets. Additionally, we developed a feature selection procedure, which identified 18 regulators, whose activities clustered consistently with cytogenetic risk groups. One of the selected regulators is miR-548p, whose inferred targets were significantly enriched for leukemia-related pathway, implicating its novel role in AML pathogenesis. Moreover, survival analysis using the inferred activities identified C-Fos as a potential AML

  5. Grades, Gender, and Encouragement: A Regression Discontinuity Analysis

    ERIC Educational Resources Information Center

    Owen, Ann L.

    2010-01-01

    The author employs a regression discontinuity design to provide direct evidence on the effects of grades earned in economics principles classes on the decision to major in economics and finds a differential effect for male and female students. Specifically, for female students, receiving an A for a final grade in the first economics class is…

  6. Increasing incidence of thyroid cancer in the Nordic countries with main focus on Swedish data.

    PubMed

    Carlberg, Michael; Hedendahl, Lena; Ahonen, Mikko; Koppel, Tarmo; Hardell, Lennart

    2016-07-07

    Radiofrequency radiation in the frequency range 30 kHz-300 GHz was evaluated to be Group 2B, i.e. 'possibly' carcinogenic to humans, by the International Agency for Research on Cancer (IARC) at WHO in May 2011. Among the evaluated devices were mobile and cordless phones, since they emit radiofrequency electromagnetic fields (RF-EMF). In addition to the brain, another organ, the thyroid gland, also receives high exposure. The incidence of thyroid cancer is increasing in many countries, especially the papillary type that is the most radiosensitive type. We used the Swedish Cancer Register to study the incidence of thyroid cancer during 1970-2013 using joinpoint regression analysis. In women, the incidence increased statistically significantly during the whole study period; average annual percentage change (AAPC) +1.19 % (95 % confidence interval (CI) +0.56, +1.83 %). Two joinpoints were detected, 1979 and 2001, with a high increase of the incidence during the last period 2001-2013 with an annual percentage change (APC) of +5.34 % (95 % CI +3.93, +6.77 %). AAPC for all men during 1970-2013 was +0.77 % (95 % CI -0.03, +1.58 %). One joinpoint was detected in 2005 with a statistically significant increase in incidence during 2005-2013; APC +7.56 % (95 % CI +3.34, +11.96 %). Based on NORDCAN data, there was a statistically significant increase in the incidence of thyroid cancer in the Nordic countries during the same time period. In both women and men a joinpoint was detected in 2006. The incidence increased during 2006-2013 in women; APC +6.16 % (95 % CI +3.94, +8.42 %) and in men; APC +6.84 % (95 % CI +3.69, +10.08 %), thus showing similar results as the Swedish Cancer Register. Analyses based on data from the Cancer Register showed that the increasing trend in Sweden was mainly caused by thyroid cancer of the papillary type. We postulate that the whole increase cannot be attributed to better diagnostic procedures. Increasing exposure to ionizing

  7. Detection of epistatic effects with logic regression and a classical linear regression model.

    PubMed

    Malina, Magdalena; Ickstadt, Katja; Schwender, Holger; Posch, Martin; Bogdan, Małgorzata

    2014-02-01

    To locate multiple interacting quantitative trait loci (QTL) influencing a trait of interest within experimental populations, usually methods as the Cockerham's model are applied. Within this framework, interactions are understood as the part of the joined effect of several genes which cannot be explained as the sum of their additive effects. However, if a change in the phenotype (as disease) is caused by Boolean combinations of genotypes of several QTLs, this Cockerham's approach is often not capable to identify them properly. To detect such interactions more efficiently, we propose a logic regression framework. Even though with the logic regression approach a larger number of models has to be considered (requiring more stringent multiple testing correction) the efficient representation of higher order logic interactions in logic regression models leads to a significant increase of power to detect such interactions as compared to a Cockerham's approach. The increase in power is demonstrated analytically for a simple two-way interaction model and illustrated in more complex settings with simulation study and real data analysis.

  8. Meta-analysis and meta-regression analysis of outcomes of carotid endarterectomy and stenting in the elderly.

    PubMed

    Antoniou, George A; Georgiadis, George S; Georgakarakos, Efstratios I; Antoniou, Stavros A; Bessias, Nikos; Smyth, John Vincent; Murray, David; Lazarides, Miltos K

    2013-12-01

    Uncertainty exists about the influence of advanced age on the outcomes of carotid revascularization. To undertake a comprehensive review of the literature and conduct an analysis of the outcomes of carotid interventions in the elderly. A systematic literature review was conducted to identify articles comparing early outcomes of carotid endarterectomy (CEA) or carotid stenting (CAS) in elderly and young patients. Combined overall effect sizes were calculated using fixed or random effects models. Meta-regression models were formed to explore potential heterogeneity as a result of changes in practice over time. RESULTS Our analysis comprised 44 studies reporting data on 512,685 CEA and 75,201 CAS procedures. Carotid stenting was associated with increased incidence of stroke in elderly patients compared with their young counterparts (odds ratio [OR], 1.56; 95% CI, 1.40-1.75), whereas CEA had equivalent cerebrovascular outcomes in old and young age groups (OR, 0.94; 95% CI, 0.88-0.99). Carotid stenting had similar peri-interventional mortality risks in old and young patients (OR, 0.86; 95% CI, 0.72-1.03), whereas CEA was associated with heightened mortality in elderly patients (OR, 1.62; 95% CI, 1.47-1.77). The incidence of myocardial infarction was increased in patients of advanced age in both CEA and CAS (OR, 1.64; 95% CI, 1.57-1.72 and OR, 1.30; 95% CI, 1.16-1.45, respectively). Meta-regression analyses revealed a significant effect of publication date on peri-interventional stroke (P = .003) and mortality (P < .001) in CAS. Age should be considered when planning a carotid intervention. Carotid stenting has an increased risk of adverse cerebrovascular events in elderly patients but mortality equivalent to younger patients. Carotid endarterectomy is associated with similar neurologic outcomes in elderly and young patients, at the expense of increased mortality.

  9. Risk factors for pedicled flap necrosis in hand soft tissue reconstruction: a multivariate logistic regression analysis.

    PubMed

    Gong, Xu; Cui, Jianli; Jiang, Ziping; Lu, Laijin; Li, Xiucun

    2018-03-01

    Few clinical retrospective studies have reported the risk factors of pedicled flap necrosis in hand soft tissue reconstruction. The aim of this study was to identify non-technical risk factors associated with pedicled flap perioperative necrosis in hand soft tissue reconstruction via a multivariate logistic regression analysis. For patients with hand soft tissue reconstruction, we carefully reviewed hospital records and identified 163 patients who met the inclusion criteria. The characteristics of these patients, flap transfer procedures and postoperative complications were recorded. Eleven predictors were identified. The correlations between pedicled flap necrosis and risk factors were analysed using a logistic regression model. Of 163 skin flaps, 125 flaps survived completely without any complications. The pedicled flap necrosis rate in hands was 11.04%, which included partial flap necrosis (7.36%) and total flap necrosis (3.68%). Soft tissue defects in fingers were noted in 68.10% of all cases. The logistic regression analysis indicated that the soft tissue defect site (P = 0.046, odds ratio (OR) = 0.079, confidence interval (CI) (0.006, 0.959)), flap size (P = 0.020, OR = 1.024, CI (1.004, 1.045)) and postoperative wound infection (P < 0.001, OR = 17.407, CI (3.821, 79.303)) were statistically significant risk factors for pedicled flap necrosis of the hand. Soft tissue defect site, flap size and postoperative wound infection were risk factors associated with pedicled flap necrosis in hand soft tissue defect reconstruction. © 2017 Royal Australasian College of Surgeons.

  10. Statistical research using the multiple regression analysis in areas of the cast hipereutectoid steel rolls manufacturing

    NASA Astrophysics Data System (ADS)

    Kiss, I.; Alexa, V.; Serban, S.; Rackov, M.; Čavić, M.

    2018-01-01

    The cast hipereutectoid steel (usually named Adamite) is a roll manufacturing destined material, having mechanical, chemical properties and Carbon [C] content of which stands between steelandiron, along-withitsalloyelements such as Nickel [Ni], Chrome [Cr], Molybdenum [Mo] and/or other alloy elements. Adamite Rolls are basically alloy steel rolls (a kind of high carbon steel) having hardness ranging from 40 to 55 degrees Shore C, with Carbon [C] percentage ranging from 1.35% until to 2% (usually between 1.2˜2.3%), the extra Carbon [C] and the special alloying element giving an extra wear resistance and strength. First of all the Adamite roll’s prominent feature is the small variation in hardness of the working surface, and has a good abrasion resistance and bite performance. This paper reviews key aspects of roll material properties and presents an analysis of the influences of chemical composition upon the mechanical properties (hardness) of the cast hipereutectoid steel rolls (Adamite). Using the multiple regression analysis (the double and triple regression equations), some mathematical correlations between the cast hipereutectoid steel rolls’ chemical composition and the obtained hardness are presented. In this work several results and evidence obtained by actual experiments are presented. Thus, several variation boundaries for the chemical composition of cast hipereutectoid steel rolls, in view the obtaining the proper values of the hardness, are revealed. For the multiple regression equations, correlation coefficients and graphical representations the software Matlab was used.

  11. An analysis of input errors in precipitation-runoff models using regression with errors in the independent variables

    USGS Publications Warehouse

    Troutman, Brent M.

    1982-01-01

    Errors in runoff prediction caused by input data errors are analyzed by treating precipitation-runoff models as regression (conditional expectation) models. Independent variables of the regression consist of precipitation and other input measurements; the dependent variable is runoff. In models using erroneous input data, prediction errors are inflated and estimates of expected storm runoff for given observed input variables are biased. This bias in expected runoff estimation results in biased parameter estimates if these parameter estimates are obtained by a least squares fit of predicted to observed runoff values. The problems of error inflation and bias are examined in detail for a simple linear regression of runoff on rainfall and for a nonlinear U.S. Geological Survey precipitation-runoff model. Some implications for flood frequency analysis are considered. A case study using a set of data from Turtle Creek near Dallas, Texas illustrates the problems of model input errors.

  12. The Use of Multiple Regression Models to Determine if Conjoint Analysis Should Be Conducted on Aggregate Data.

    ERIC Educational Resources Information Center

    Fraas, John W.; Newman, Isadore

    1996-01-01

    In a conjoint-analysis consumer-preference study, researchers must determine whether the product factor estimates, which measure consumer preferences, should be calculated and interpreted for each respondent or collectively. Multiple regression models can determine whether to aggregate data by examining factor-respondent interaction effects. This…

  13. Impact of Dobutamine in Patients With Septic Shock: A Meta-Regression Analysis.

    PubMed

    Nadeem, Rashid; Sockanathan, Shivani; Singh, Mukesh; Hussain, Tamseela; Kent, Patrick; AbuAlreesh, Sarah

    2017-05-01

    Septic shock frequently requires vasopressor agents. Conflicting evidence exists for use of inotropes in patients with septic shock. Data from English studies on human adult septic shock patients were collected. A total of 83 studies were reviewed, while 11 studies with 21 data sets including 239 patients were pooled for meta-regression analysis. For VO2, pooled difference in means (PDM) was 0.274. For cardiac index (CI), PDM was 0.783. For delivery of oxygen, PDM was -0.890. For heart rate, PDM was -0.714. For left ventricle stroke work index, PDM was 0.375. For mean arterial pressure, PDM was -0.204. For mean pulmonary artery pressure, PDM was 0.085. For O2 extraction, PDM was 0.647. For PaCO2, PDM was -0.053. For PaO2, PDM was 0.282. For pulmonary artery occlusive pressure, PDM was 0.270. For pulmonary capillary wedge pressure, PDM was 0.300. For PVO2, PDM was -0.492. For right atrial pressure, PDM was 0.246. For SaO2, PDM was 0.604. For stroke volume index, PDM was 0.446. For SvO2, PDM was -0.816. For systemic vascular resistance, PDM was -0.600. For systemic vascular resistance index, PDM was 0.319. Meta-regression analysis was performed for VO2, DO2, CI, and O2 extraction. Age was found to be significant confounding factor for CI, DO2, and O2 extraction. APACHE score was not found to be a significant confounding factor for any of the parameters. Dobutamine seems to have a positive effect on cardiovascular parameters in patients with septic shock. Prospective studies with larger samples are required to further validate this observation.

  14. Brain networks of temporal preparation: A multiple regression analysis of neuropsychological data.

    PubMed

    Triviño, Mónica; Correa, Ángel; Lupiáñez, Juan; Funes, María Jesús; Catena, Andrés; He, Xun; Humphreys, Glyn W

    2016-11-15

    There are only a few studies on the brain networks involved in the ability to prepare in time, and most of them followed a correlational rather than a neuropsychological approach. The present neuropsychological study performed multiple regression analysis to address the relationship between both grey and white matter (measured by magnetic resonance imaging in patients with brain lesion) and different effects in temporal preparation (Temporal orienting, Foreperiod and Sequential effects). Two versions of a temporal preparation task were administered to a group of 23 patients with acquired brain injury. In one task, the cue presented (a red versus green square) to inform participants about the time of appearance (early versus late) of a target stimulus was blocked, while in the other task the cue was manipulated on a trial-by-trial basis. The duration of the cue-target time intervals (400 versus 1400ms) was always manipulated within blocks in both tasks. Regression analysis were conducted between either the grey matter lesion size or the white matter tracts disconnection and the three temporal preparation effects separately. The main finding was that each temporal preparation effect was predicted by a different network of structures, depending on cue expectancy. Specifically, the Temporal orienting effect was related to both prefrontal and temporal brain areas. The Foreperiod effect was related to right and left prefrontal structures. Sequential effects were predicted by both parietal cortex and left subcortical structures. These findings show a clear dissociation of brain circuits involved in the different ways to prepare in time, showing for the first time the involvement of temporal areas in the Temporal orienting effect, as well as the parietal cortex in the Sequential effects. Copyright © 2016 Elsevier Inc. All rights reserved.

  15. FIRE: an SPSS program for variable selection in multiple linear regression analysis via the relative importance of predictors.

    PubMed

    Lorenzo-Seva, Urbano; Ferrando, Pere J

    2011-03-01

    We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the equation regression, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.

  16. Introduction to the use of regression models in epidemiology.

    PubMed

    Bender, Ralf

    2009-01-01

    Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.

  17. Quantile Regression in the Study of Developmental Sciences

    PubMed Central

    Petscher, Yaacov; Logan, Jessica A. R.

    2014-01-01

    Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of the outcome’s distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstrated and contrasted with linear regression when considering models with: (a) one continuous predictor, (b) one dichotomous predictor, (c) a continuous and a dichotomous predictor, and (d) a longitudinal application. Results from each example exhibited the differential inferences which may be drawn using linear or quantile regression. PMID:24329596

  18. Global Prevalence of Elder Abuse: A Meta-analysis and Meta-regression.

    PubMed

    Ho, C Sh; Wong, S Y; Chiu, M M; Ho, R Cm

    2017-06-01

    Elder abuse is increasingly recognised as a global public health and social problem. There has been limited inter-study comparison of the prevalence and risk factors for elder abuse. This study aimed to estimate the pooled and subtype prevalence of elder abuse worldwide and identify significant associated risk factors. We conducted a meta-analysis and meta-regression of 34 population-based and 17 non-population-based studies. The pooled prevalences of elder abuse were 10.0% (95% confidence interval, 5.2%-18.6%) and 34.3% (95% confidence interval, 22.9%-47.8%) in population-based studies and third party- or caregiver-reported studies, respectively. Being in a marital relationship was found to be a significant moderator using random-effects model. This meta-analysis revealed that third parties or caregivers were more likely to report abuse than older abused adults. Subgroup analyses showed that females and those resident in non-western countries were more likely to be abused. Emotional abuse was the most prevalent elder abuse subtype and financial abuse was less commonly reported by third parties or caregivers. Heterogeneity in the prevalence was due to the high proportion of married older adults in the sample. Subgroup analysis showed that cultural factors, subtypes of abuse, and gender also contributed to heterogeneity in the pooled prevalence of elder abuse.

  19. Digression and Value Concatenation to Enable Privacy-Preserving Regression.

    PubMed

    Li, Xiao-Bai; Sarkar, Sumit

    2014-09-01

    Regression techniques can be used not only for legitimate data analysis, but also to infer private information about individuals. In this paper, we demonstrate that regression trees, a popular data-analysis and data-mining technique, can be used to effectively reveal individuals' sensitive data. This problem, which we call a "regression attack," has not been addressed in the data privacy literature, and existing privacy-preserving techniques are not appropriate in coping with this problem. We propose a new approach to counter regression attacks. To protect against privacy disclosure, our approach introduces a novel measure, called digression , which assesses the sensitive value disclosure risk in the process of building a regression tree model. Specifically, we develop an algorithm that uses the measure for pruning the tree to limit disclosure of sensitive data. We also propose a dynamic value-concatenation method for anonymizing data, which better preserves data utility than a user-defined generalization scheme commonly used in existing approaches. Our approach can be used for anonymizing both numeric and categorical data. An experimental study is conducted using real-world financial, economic and healthcare data. The results of the experiments demonstrate that the proposed approach is very effective in protecting data privacy while preserving data quality for research and analysis.

  20. Regression Models for the Analysis of Longitudinal Gaussian Data from Multiple Sources

    PubMed Central

    O’Brien, Liam M.; Fitzmaurice, Garrett M.

    2006-01-01

    We present a regression model for the joint analysis of longitudinal multiple source Gaussian data. Longitudinal multiple source data arise when repeated measurements are taken from two or more sources, and each source provides a measure of the same underlying variable and on the same scale. This type of data generally produces a relatively large number of observations per subject; thus estimation of an unstructured covariance matrix often may not be possible. We consider two methods by which parsimonious models for the covariance can be obtained for longitudinal multiple source data. The methods are illustrated with an example of multiple informant data arising from a longitudinal interventional trial in psychiatry. PMID:15726666

  1. Identification of Sexually Abused Female Adolescents at Risk for Suicidal Ideations: A Classification and Regression Tree Analysis

    ERIC Educational Resources Information Center

    Brabant, Marie-Eve; Hebert, Martine; Chagnon, Francois

    2013-01-01

    This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression,…

  2. Relationship between the curve of Spee and craniofacial variables: A regression analysis.

    PubMed

    Halimi, Abdelali; Benyahia, Hicham; Azeroual, Mohamed-Faouzi; Bahije, Loubna; Zaoui, Fatima

    2018-06-01

    The aim of this regression analysis was to identify the determining factors, which impact the curve of Spee during its genesis, its therapeutic reconstruction, and its stability, within a continuously evolving craniofacial morphology throughout life. We selected a total of 107 patients, according to the inclusion criteria. A morphological and functional clinical examination was performed for each patient: plaster models, tracing of the curve of Spee, crowding, Angle's classification, overjet and overbite were thus recorded. Then, we made a cephalometric analysis based on the standardized lateral cephalograms. In the sagittal dimension, we measured the values of angles ANB, SNA, SNB, SND, I/i; and the following distances: AoBo, I/NA, i/NB, SE and SL. In the vertical dimension, we measured the values of angles FMA, GoGn/SN, the occlusal plane, and the following distances: SAr, ArD, Ar/Con, Con/Gn, GoPo, HFP, HFA and IF. The statistical analysis was performed using the SPSS software with a significance level of 0.05. Our sample including 107 subjects was composed of 77 female patients (71.3%) and 30 male patients (27.8%) 7 hypodivergent patients (6.5%), 56 hyperdivergent patients (52.3%) and 44 normodivergent patients (41.1%). Patients' mean age was 19.35±5.95 years. The hypodivergent patients presented more pronounced curves of Spee compared to the normodivergent and the hyperdivergent populations; patients in skeletal Class I presented less pronounced curves of Spee compared to patients in skeletal Class II and Class III. These differences were non significant (P>0.05). The curve of Spee was positively and moderately correlated with Angle's classification, overjet, overbite, sellion-articulare distance, and breathing type (P<0.05). We found no correlation between age, gender and the other parameters included in the study with the curve of Spee (P>0.05). Seventy five percent (75%) of the hyperdivergent patients with an oral breathing presented an overbite of 3mm

  3. Quantile Regression in the Study of Developmental Sciences

    ERIC Educational Resources Information Center

    Petscher, Yaacov; Logan, Jessica A. R.

    2014-01-01

    Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of…

  4. Interquantile Shrinkage in Regression Models

    PubMed Central

    Jiang, Liewen; Wang, Huixia Judy; Bondell, Howard D.

    2012-01-01

    Conventional analysis using quantile regression typically focuses on fitting the regression model at different quantiles separately. However, in situations where the quantile coefficients share some common feature, joint modeling of multiple quantiles to accommodate the commonality often leads to more efficient estimation. One example of common features is that a predictor may have a constant effect over one region of quantile levels but varying effects in other regions. To automatically perform estimation and detection of the interquantile commonality, we develop two penalization methods. When the quantile slope coefficients indeed do not change across quantile levels, the proposed methods will shrink the slopes towards constant and thus improve the estimation efficiency. We establish the oracle properties of the two proposed penalization methods. Through numerical investigations, we demonstrate that the proposed methods lead to estimations with competitive or higher efficiency than the standard quantile regression estimation in finite samples. Supplemental materials for the article are available online. PMID:24363546

  5. Local Linear Regression for Data with AR Errors.

    PubMed

    Li, Runze; Li, Yan

    2009-07-01

    In many statistical applications, data are collected over time, and they are likely correlated. In this paper, we investigate how to incorporate the correlation information into the local linear regression. Under the assumption that the error process is an auto-regressive process, a new estimation procedure is proposed for the nonparametric regression by using local linear regression method and the profile least squares techniques. We further propose the SCAD penalized profile least squares method to determine the order of auto-regressive process. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed procedure, and to compare the performance of the proposed procedures with the existing one. From our empirical studies, the newly proposed procedures can dramatically improve the accuracy of naive local linear regression with working-independent error structure. We illustrate the proposed methodology by an analysis of real data set.

  6. Using "Excel" for White's Test--An Important Technique for Evaluating the Equality of Variance Assumption and Model Specification in a Regression Analysis

    ERIC Educational Resources Information Center

    Berenson, Mark L.

    2013-01-01

    There is consensus in the statistical literature that severe departures from its assumptions invalidate the use of regression modeling for purposes of inference. The assumptions of regression modeling are usually evaluated subjectively through visual, graphic displays in a residual analysis but such an approach, taken alone, may be insufficient…

  7. National trends in the recommendation of radiotherapy after prostatectomy for prostate cancer before and after the reporting of a survival benefit in March 2009.

    PubMed

    Mahal, Brandon A; Hoffman, Karen E; Efstathiou, Jason A; Nguyen, Paul L

    2015-06-01

    Three randomized trials demonstrated that postprostatectomy adjuvant radiotherapy improves biochemical disease-free survival for patients with adverse pathologic features, and 1 trial found adjuvant radiotherapy improves overall survival. We sought to determine whether postprostatectomy radiotherapy (PPRT) utilization changed after publication of the survival benefit in March 2009. The Surveillance, Epidemiology, and End Results database was used to identify men diagnosed with prostate cancer from 2004 to 2011 who met criteria for enrollment in the randomized trials (positive margins and/or pT3-4 disease at radical prostatectomy). Joinpoint regression identified inflection points in PPRT utilization. Logistic regression was used to evaluate factors associated with PPRT recommendation. Of 35,361 men, 5104 (14.4%) received a recommendation for PPRT. In joinpoint regression, 2009 was the inflection point in PPRT utilization. In multivariable analysis, PPRT recommendations were more likely after March 2009 than before 15.8% vs. 13.5%, adjusted odds ratio (AOR; 1.09; 95% confidence interval [CI], 1.02-1.16; P = .008), in men with pT3 (vs. pT2, AOR, 2.81; 95% CI, 2.53-3.11; P < .001), pT4 (vs. pT2 AOR, 4.62; 95% CI, 3.85-5.54; P < .001), or margin positive (AOR, 1.46; 95% CI, 1.34-1.58; P < .001) disease and in men who were younger (per year decrease, AOR, 1.02; 95% CI, 1.02-1.03; P < .001), married (AOR, 1.10; 95% CI, 1.02-1.19; P = .01), or lived in metropolitan areas (AOR, 1.30; 95% CI, 1.16-1.47; P < .001). PPRT recommendations increased after the reporting of a survival benefit in March 2009, but absolute utilization rates remain low, suggesting that the oncologic community remains unconvinced that PPRT is needed for most patients with adverse features. Further work is needed to identify patients who might benefit most from PPRT. Copyright © 2015 Elsevier Inc. All rights reserved.

  8. [Spatial interpolation of soil organic matter using regression Kriging and geographically weighted regression Kriging].

    PubMed

    Yang, Shun-hua; Zhang, Hai-tao; Guo, Long; Ren, Yan

    2015-06-01

    Relative elevation and stream power index were selected as auxiliary variables based on correlation analysis for mapping soil organic matter. Geographically weighted regression Kriging (GWRK) and regression Kriging (RK) were used for spatial interpolation of soil organic matter and compared with ordinary Kriging (OK), which acts as a control. The results indicated that soil or- ganic matter was significantly positively correlated with relative elevation whilst it had a significantly negative correlation with stream power index. Semivariance analysis showed that both soil organic matter content and its residuals (including ordinary least square regression residual and GWR resi- dual) had strong spatial autocorrelation. Interpolation accuracies by different methods were esti- mated based on a data set of 98 validation samples. Results showed that the mean error (ME), mean absolute error (MAE) and root mean square error (RMSE) of RK were respectively 39.2%, 17.7% and 20.6% lower than the corresponding values of OK, with a relative-improvement (RI) of 20.63. GWRK showed a similar tendency, having its ME, MAE and RMSE to be respectively 60.6%, 23.7% and 27.6% lower than those of OK, with a RI of 59.79. Therefore, both RK and GWRK significantly improved the accuracy of OK interpolation of soil organic matter due to their in- corporation of auxiliary variables. In addition, GWRK performed obviously better than RK did in this study, and its improved performance should be attributed to the consideration of sample spatial locations.

  9. Identifying maternal and infant factors associated with newborn size in rural Bangladesh by partial least squares (PLS) regression analysis

    PubMed Central

    Rahman, Md. Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D. W.; Labrique, Alain B.; Rashid, Mahbubur; Christian, Parul; West, Keith P.

    2017-01-01

    Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 − -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset. PMID:29261760

  10. Identifying maternal and infant factors associated with newborn size in rural Bangladesh by partial least squares (PLS) regression analysis.

    PubMed

    Kabir, Alamgir; Rahman, Md Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D W; Labrique, Alain B; Rashid, Mahbubur; Christian, Parul; West, Keith P

    2017-01-01

    Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 - -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset.

  11. Serum S100B Is Related to Illness Duration and Clinical Symptoms in Schizophrenia—A Meta-Regression Analysis

    PubMed Central

    Schümberg, Katharina; Polyakova, Maryna; Steiner, Johann; Schroeter, Matthias L.

    2016-01-01

    S100B has been linked to glial pathology in several psychiatric disorders. Previous studies found higher S100B serum levels in patients with schizophrenia compared to healthy controls, and a number of covariates influencing the size of this effect have been proposed in the literature. Here, we conducted a meta-analysis and meta-regression analysis on alterations of serum S100B in schizophrenia in comparison with healthy control subjects. The meta-analysis followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement to guarantee a high quality and reproducibility. With strict inclusion criteria 19 original studies could be included in the quantitative meta-analysis, comprising a total of 766 patients and 607 healthy control subjects. The meta-analysis confirmed higher values of the glial serum marker S100B in schizophrenia if compared with control subjects. Meta-regression analyses revealed significant effects of illness duration and clinical symptomatology, in particular the total score of the Positive and Negative Syndrome Scale (PANSS), on serum S100B levels in schizophrenia. In sum, results confirm glial pathology in schizophrenia that is modulated by illness duration and related to clinical symptomatology. Further studies are needed to investigate mechanisms and mediating factors related to these findings. PMID:26941608

  12. Value of Information Analysis for Time-lapse Seismic Data by Simulation-Regression

    NASA Astrophysics Data System (ADS)

    Dutta, G.; Mukerji, T.; Eidsvik, J.

    2016-12-01

    A novel method to estimate the Value of Information (VOI) of time-lapse seismic data in the context of reservoir development is proposed. VOI is a decision analytic metric quantifying the incremental value that would be created by collecting information prior to making a decision under uncertainty. The VOI has to be computed before collecting the information and can be used to justify its collection. Previous work on estimating the VOI of geophysical data has involved explicit approximation of the posterior distribution of reservoir properties given the data and then evaluating the prospect values for that posterior distribution of reservoir properties. Here, we propose to directly estimate the prospect values given the data by building a statistical relationship between them using regression. Various regression techniques such as Partial Least Squares Regression (PLSR), Multivariate Adaptive Regression Splines (MARS) and k-Nearest Neighbors (k-NN) are used to estimate the VOI, and the results compared. For a univariate Gaussian case, the VOI obtained from simulation-regression has been shown to be close to the analytical solution. Estimating VOI by simulation-regression is much less computationally expensive since the posterior distribution of reservoir properties given each possible dataset need not be modeled and the prospect values need not be evaluated for each such posterior distribution of reservoir properties. This method is flexible, since it does not require rigid model specification of posterior but rather fits conditional expectations non-parametrically from samples of values and data.

  13. Prediction by regression and intrarange data scatter in surface-process studies

    USGS Publications Warehouse

    Toy, T.J.; Osterkamp, W.R.; Renard, K.G.

    1993-01-01

    Modeling is a major component of contemporary earth science, and regression analysis occupies a central position in the parameterization, calibration, and validation of geomorphic and hydrologic models. Although this methodology can be used in many ways, we are primarily concerned with the prediction of values for one variable from another variable. Examination of the literature reveals considerable inconsistency in the presentation of the results of regression analysis and the occurrence of patterns in the scatter of data points about the regression line. Both circumstances confound utilization and evaluation of the models. Statisticians are well aware of various problems associated with the use of regression analysis and offer improved practices; often, however, their guidelines are not followed. After a review of the aforementioned circumstances and until standard criteria for model evaluation become established, we recommend, as a minimum, inclusion of scatter diagrams, the standard error of the estimate, and sample size in reporting the results of regression analyses for most surface-process studies. ?? 1993 Springer-Verlag.

  14. Estimation of lung tumor position from multiple anatomical features on 4D-CT using multiple regression analysis.

    PubMed

    Ono, Tomohiro; Nakamura, Mitsuhiro; Hirose, Yoshinori; Kitsuda, Kenji; Ono, Yuka; Ishigaki, Takashi; Hiraoka, Masahiro

    2017-09-01

    To estimate the lung tumor position from multiple anatomical features on four-dimensional computed tomography (4D-CT) data sets using single regression analysis (SRA) and multiple regression analysis (MRA) approach and evaluate an impact of the approach on internal target volume (ITV) for stereotactic body radiotherapy (SBRT) of the lung. Eleven consecutive lung cancer patients (12 cases) underwent 4D-CT scanning. The three-dimensional (3D) lung tumor motion exceeded 5 mm. The 3D tumor position and anatomical features, including lung volume, diaphragm, abdominal wall, and chest wall positions, were measured on 4D-CT images. The tumor position was estimated by SRA using each anatomical feature and MRA using all anatomical features. The difference between the actual and estimated tumor positions was defined as the root-mean-square error (RMSE). A standard partial regression coefficient for the MRA was evaluated. The 3D lung tumor position showed a high correlation with the lung volume (R = 0.92 ± 0.10). Additionally, ITVs derived from SRA and MRA approaches were compared with ITV derived from contouring gross tumor volumes on all 10 phases of the 4D-CT (conventional ITV). The RMSE of the SRA was within 3.7 mm in all directions. Also, the RMSE of the MRA was within 1.6 mm in all directions. The standard partial regression coefficient for the lung volume was the largest and had the most influence on the estimated tumor position. Compared with conventional ITV, average percentage decrease of ITV were 31.9% and 38.3% using SRA and MRA approaches, respectively. The estimation accuracy of lung tumor position was improved by the MRA approach, which provided smaller ITV than conventional ITV. © 2017 The Authors. Journal of Applied Clinical Medical Physics published by Wiley Periodicals, Inc. on behalf of American Association of Physicists in Medicine.

  15. Valuing avoided morbidity using meta-regression analysis: what can health status measures and QALYs tell us about WTP?

    PubMed

    Van Houtven, George; Powers, John; Jessup, Amber; Yang, Jui-Chen

    2006-08-01

    Many economists argue that willingness-to-pay (WTP) measures are most appropriate for assessing the welfare effects of health changes. Nevertheless, the health evaluation literature is still dominated by studies estimating nonmonetary health status measures (HSMs), which are often used to assess changes in quality-adjusted life years (QALYs). Using meta-regression analysis, this paper combines results from both WTP and HSM studies applied to acute morbidity, and it tests whether a systematic relationship exists between HSM and WTP estimates. We analyze over 230 WTP estimates from 17 different studies and find evidence that QALY-based estimates of illness severity--as measured by the Quality of Well-Being (QWB) Scale--are significant factors in explaining variation in WTP, as are changes in the duration of illness and the average income and age of the study populations. In addition, we test and reject the assumption of a constant WTP per QALY gain. We also demonstrate how the estimated meta-regression equations can serve as benefit transfer functions for policy analysis. By specifying the change in duration and severity of the acute illness and the characteristics of the affected population, we apply the regression functions to predict average WTP per case avoided. Copyright 2006 John Wiley & Sons, Ltd.

  16. [The effect of increasing tobacco tax on tobacco sales in Japan].

    PubMed

    Ito, Yuri; Nakamura, Masakazu

    2013-09-01

    Since the special tobacco tax was established in 1998, the tobacco tax and price of tobacco have increased thrice, in 2003, 2006, and 2010, respectively. We evaluated the effect of increases in tax on the consumption and sales of tobacco in Japan using the annual data on the number of tobacco products sold and the total sales from Japan Tobacco, Inc. We applied the number of tobacco products sold and the total sales per year to a joinpoint regression model to examine the trends in the data. This model could help identify the year in which a decrease or increase was apparent from the data. In addition, we examined the effect of each tax increase while also considering other factors that may have caused a decrease in the levels of tobacco consumption using the method proposed by Hirano et al. According to the joinpoint regression analysis, the number of tobacco products sold started decreasing in 1998, and the trends of decrease accelerated to 5% per year, from 2005. Owing to the tax increase, tobacco sales reduced by -2.4%, -2.9%, and -10.1% (corrected for the effect of the Tohoku Great Earthquake), and price elasticity was estimated as -0.30, -0.27, and -0.28 (corrected) in 2003, 2006, and 2010, respectively. The effect of tobacco tax increase on the decrease in tobacco sales was greatest in 2010, while the price elasticity remained almost the same as it was during the previous tax increase. The sharp hike in tobacco tax in 2010 decreased the number of tobacco products sold, while the price elasticity in 2010 was similar to that in 2003 and 2006. Our findings suggest that further increase in tobacco tax is needed to reduce the damage caused by smoking in the people of Japan.

  17. What Satisfies Students? Mining Student-Opinion Data with Regression and Decision-Tree Analysis. AIR 2002 Forum Paper.

    ERIC Educational Resources Information Center

    Thomas, Emily H.; Galambos, Nora

    To investigate how students' characteristics and experiences affect satisfaction, this study used regression and decision-tree analysis with the CHAID algorithm to analyze student opinion data from a sample of 1,783 college students. A data-mining approach identifies the specific aspects of students' university experience that most influence three…

  18. A Method of Calculating Functional Independence Measure at Discharge from Functional Independence Measure Effectiveness Predicted by Multiple Regression Analysis Has a High Degree of Predictive Accuracy.

    PubMed

    Tokunaga, Makoto; Watanabe, Susumu; Sonoda, Shigeru

    2017-09-01

    Multiple linear regression analysis is often used to predict the outcome of stroke rehabilitation. However, the predictive accuracy may not be satisfactory. The objective of this study was to elucidate the predictive accuracy of a method of calculating motor Functional Independence Measure (mFIM) at discharge from mFIM effectiveness predicted by multiple regression analysis. The subjects were 505 patients with stroke who were hospitalized in a convalescent rehabilitation hospital. The formula "mFIM at discharge = mFIM effectiveness × (91 points - mFIM at admission) + mFIM at admission" was used. By including the predicted mFIM effectiveness obtained through multiple regression analysis in this formula, we obtained the predicted mFIM at discharge (A). We also used multiple regression analysis to directly predict mFIM at discharge (B). The correlation between the predicted and the measured values of mFIM at discharge was compared between A and B. The correlation coefficients were .916 for A and .878 for B. Calculating mFIM at discharge from mFIM effectiveness predicted by multiple regression analysis had a higher degree of predictive accuracy of mFIM at discharge than that directly predicted. Copyright © 2017 National Stroke Association. Published by Elsevier Inc. All rights reserved.

  19. Quantile Regression with Censored Data

    ERIC Educational Resources Information Center

    Lin, Guixian

    2009-01-01

    The Cox proportional hazards model and the accelerated failure time model are frequently used in survival data analysis. They are powerful, yet have limitation due to their model assumptions. Quantile regression offers a semiparametric approach to model data with possible heterogeneity. It is particularly powerful for censored responses, where the…

  20. Regression Simulation Model. Appendix X. Users Manual,

    DTIC Science & Technology

    1981-03-01

    change as the prediction equations become refined. Whereas no notice will be provided when the changes are made, the programs will be modified such that...NATIONAL BUREAU Of STANDARDS 1963 A ___,_ __ _ __ _ . APPENDIX X ( R4/ EGRESSION IMULATION ’jDEL. Ape’A ’) 7 USERS MANUA submitted to The Great River...regression analysis and to establish a prediction equation (model). The prediction equation contains the partial regression coefficients (B-weights) which

  1. Association between response rates and survival outcomes in patients with newly diagnosed multiple myeloma. A systematic review and meta-regression analysis.

    PubMed

    Mainou, Maria; Madenidou, Anastasia-Vasiliki; Liakos, Aris; Paschos, Paschalis; Karagiannis, Thomas; Bekiari, Eleni; Vlachaki, Efthymia; Wang, Zhen; Murad, Mohammad Hassan; Kumar, Shaji; Tsapas, Apostolos

    2017-06-01

    We performed a systematic review and meta-regression analysis of randomized control trials to investigate the association between response to initial treatment and survival outcomes in patients with newly diagnosed multiple myeloma (MM). Response outcomes included complete response (CR) and the combined outcome of CR or very good partial response (VGPR), while survival outcomes were overall survival (OS) and progression-free survival (PFS). We used random-effect meta-regression models and conducted sensitivity analyses based on definition of CR and study quality. Seventy-two trials were included in the systematic review, 63 of which contributed data in meta-regression analyses. There was no association between OS and CR in patients without autologous stem cell transplant (ASCT) (regression coefficient: .02, 95% confidence interval [CI] -0.06, 0.10), in patients undergoing ASCT (-.11, 95% CI -0.44, 0.22) and in trials comparing ASCT with non-ASCT patients (.04, 95% CI -0.29, 0.38). Similarly, OS did not correlate with the combined metric of CR or VGPR, and no association was evident between response outcomes and PFS. Sensitivity analyses yielded similar results. This meta-regression analysis suggests that there is no association between conventional response outcomes and survival in patients with newly diagnosed MM. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  2. Logistic regression for risk factor modelling in stuttering research.

    PubMed

    Reed, Phil; Wu, Yaqionq

    2013-06-01

    To outline the uses of logistic regression and other statistical methods for risk factor analysis in the context of research on stuttering. The principles underlying the application of a logistic regression are illustrated, and the types of questions to which such a technique has been applied in the stuttering field are outlined. The assumptions and limitations of the technique are discussed with respect to existing stuttering research, and with respect to formulating appropriate research strategies to accommodate these considerations. Finally, some alternatives to the approach are briefly discussed. The way the statistical procedures are employed are demonstrated with some hypothetical data. Research into several practical issues concerning stuttering could benefit if risk factor modelling were used. Important examples are early diagnosis, prognosis (whether a child will recover or persist) and assessment of treatment outcome. After reading this article you will: (a) Summarize the situations in which logistic regression can be applied to a range of issues about stuttering; (b) Follow the steps in performing a logistic regression analysis; (c) Describe the assumptions of the logistic regression technique and the precautions that need to be checked when it is employed; (d) Be able to summarize its advantages over other techniques like estimation of group differences and simple regression. Copyright © 2012 Elsevier Inc. All rights reserved.

  3. MIXREG: a computer program for mixed-effects regression analysis with autocorrelated errors.

    PubMed

    Hedeker, D; Gibbons, R D

    1996-05-01

    MIXREG is a program that provides estimates for a mixed-effects regression model (MRM) for normally-distributed response data including autocorrelated errors. This model can be used for analysis of unbalanced longitudinal data, where individuals may be measured at a different number of timepoints, or even at different timepoints. Autocorrelated errors of a general form or following an AR(1), MA(1), or ARMA(1,1) form are allowable. This model can also be used for analysis of clustered data, where the mixed-effects model assumes data within clusters are dependent. The degree of dependency is estimated jointly with estimates of the usual model parameters, thus adjusting for clustering. MIXREG uses maximum marginal likelihood estimation, utilizing both the EM algorithm and a Fisher-scoring solution. For the scoring solution, the covariance matrix of the random effects is expressed in its Gaussian decomposition, and the diagonal matrix reparameterized using the exponential transformation. Estimation of the individual random effects is accomplished using an empirical Bayes approach. Examples illustrating usage and features of MIXREG are provided.

  4. High-Dimensional Heteroscedastic Regression with an Application to eQTL Data Analysis

    PubMed Central

    Daye, Z. John; Chen, Jinbo; Li, Hongzhe

    2011-01-01

    Summary We consider the problem of high-dimensional regression under non-constant error variances. Despite being a common phenomenon in biological applications, heteroscedasticity has, so far, been largely ignored in high-dimensional analysis of genomic data sets. We propose a new methodology that allows non-constant error variances for high-dimensional estimation and model selection. Our method incorporates heteroscedasticity by simultaneously modeling both the mean and variance components via a novel doubly regularized approach. Extensive Monte Carlo simulations indicate that our proposed procedure can result in better estimation and variable selection than existing methods when heteroscedasticity arises from the presence of predictors explaining error variances and outliers. Further, we demonstrate the presence of heteroscedasticity in and apply our method to an expression quantitative trait loci (eQTLs) study of 112 yeast segregants. The new procedure can automatically account for heteroscedasticity in identifying the eQTLs that are associated with gene expression variations and lead to smaller prediction errors. These results demonstrate the importance of considering heteroscedasticity in eQTL data analysis. PMID:22547833

  5. A flexible count data regression model for risk analysis.

    PubMed

    Guikema, Seth D; Coffelt, Jeremy P; Goffelt, Jeremy P

    2008-02-01

    In many cases, risk and reliability analyses involve estimating the probabilities of discrete events such as hardware failures and occurrences of disease or death. There is often additional information in the form of explanatory variables that can be used to help estimate the likelihood of different numbers of events in the future through the use of an appropriate regression model, such as a generalized linear model. However, existing generalized linear models (GLM) are limited in their ability to handle the types of variance structures often encountered in using count data in risk and reliability analysis. In particular, standard models cannot handle both underdispersed data (variance less than the mean) and overdispersed data (variance greater than the mean) in a single coherent modeling framework. This article presents a new GLM based on a reformulation of the Conway-Maxwell Poisson (COM) distribution that is useful for both underdispersed and overdispersed count data and demonstrates this model by applying it to the assessment of electric power system reliability. The results show that the proposed COM GLM can provide as good of fits to data as the commonly used existing models for overdispered data sets while outperforming these commonly used models for underdispersed data sets.

  6. A Novel Multiobjective Evolutionary Algorithm Based on Regression Analysis

    PubMed Central

    Song, Zhiming; Wang, Maocai; Dai, Guangming; Vasile, Massimiliano

    2015-01-01

    As is known, the Pareto set of a continuous multiobjective optimization problem with m objective functions is a piecewise continuous (m − 1)-dimensional manifold in the decision space under some mild conditions. However, how to utilize the regularity to design multiobjective optimization algorithms has become the research focus. In this paper, based on this regularity, a model-based multiobjective evolutionary algorithm with regression analysis (MMEA-RA) is put forward to solve continuous multiobjective optimization problems with variable linkages. In the algorithm, the optimization problem is modelled as a promising area in the decision space by a probability distribution, and the centroid of the probability distribution is (m − 1)-dimensional piecewise continuous manifold. The least squares method is used to construct such a model. A selection strategy based on the nondominated sorting is used to choose the individuals to the next generation. The new algorithm is tested and compared with NSGA-II and RM-MEDA. The result shows that MMEA-RA outperforms RM-MEDA and NSGA-II on the test instances with variable linkages. At the same time, MMEA-RA has higher efficiency than the other two algorithms. A few shortcomings of MMEA-RA have also been identified and discussed in this paper. PMID:25874246

  7. Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert M.

    2013-01-01

    A new regression model search algorithm was developed that may be applied to both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The algorithm is a simplified version of a more complex algorithm that was originally developed for the NASA Ames Balance Calibration Laboratory. The new algorithm performs regression model term reduction to prevent overfitting of data. It has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a regression model search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression model. Therefore, the simplified algorithm is not intended to replace the original algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new search algorithm.

  8. Regression analysis for bivariate gap time with missing first gap time data.

    PubMed

    Huang, Chia-Hui; Chen, Yi-Hau

    2017-01-01

    We consider ordered bivariate gap time while data on the first gap time are unobservable. This study is motivated by the HIV infection and AIDS study, where the initial HIV contracting time is unavailable, but the diagnosis times for HIV and AIDS are available. We are interested in studying the risk factors for the gap time between initial HIV contraction and HIV diagnosis, and gap time between HIV and AIDS diagnoses. Besides, the association between the two gap times is also of interest. Accordingly, in the data analysis we are faced with two-fold complexity, namely data on the first gap time is completely missing, and the second gap time is subject to induced informative censoring due to dependence between the two gap times. We propose a modeling framework for regression analysis of bivariate gap time under the complexity of the data. The estimating equations for the covariate effects on, as well as the association between, the two gap times are derived through maximum likelihood and suitable counting processes. Large sample properties of the resulting estimators are developed by martingale theory. Simulations are performed to examine the performance of the proposed analysis procedure. An application of data from the HIV and AIDS study mentioned above is reported for illustration.

  9. Analysis of multi-layered films. [determining dye densities by applying a regression analysis to the spectral response of the composite transparency

    NASA Technical Reports Server (NTRS)

    Scarpace, F. L.; Voss, A. W.

    1973-01-01

    Dye densities of multi-layered films are determined by applying a regression analysis to the spectral response of the composite transparency. The amount of dye in each layer is determined by fitting the sum of the individual dye layer densities to the measured dye densities. From this, dye content constants are calculated. Methods of calculating equivalent exposures are discussed. Equivalent exposures are a constant amount of energy over a limited band-width that will give the same dye content constants as the real incident energy. Methods of using these equivalent exposures for analysis of photographic data are presented.

  10. Relationship between rice yield and climate variables in southwest Nigeria using multiple linear regression and support vector machine analysis

    NASA Astrophysics Data System (ADS)

    Oguntunde, Philip G.; Lischeid, Gunnar; Dietrich, Ottfried

    2018-03-01

    This study examines the variations of climate variables and rice yield and quantifies the relationships among them using multiple linear regression, principal component analysis, and support vector machine (SVM) analysis in southwest Nigeria. The climate and yield data used was for a period of 36 years between 1980 and 2015. Similar to the observed decrease ( P < 0.001) in rice yield, pan evaporation, solar radiation, and wind speed declined significantly. Eight principal components exhibited an eigenvalue > 1 and explained 83.1% of the total variance of predictor variables. The SVM regression function using the scores of the first principal component explained about 75% of the variance in rice yield data and linear regression about 64%. SVM regression between annual solar radiation values and yield explained 67% of the variance. Only the first component of the principal component analysis (PCA) exhibited a clear long-term trend and sometimes short-term variance similar to that of rice yield. Short-term fluctuations of the scores of the PC1 are closely coupled to those of rice yield during the 1986-1993 and the 2006-2013 periods thereby revealing the inter-annual sensitivity of rice production to climate variability. Solar radiation stands out as the climate variable of highest influence on rice yield, and the influence was especially strong during monsoon and post-monsoon periods, which correspond to the vegetative, booting, flowering, and grain filling stages in the study area. The outcome is expected to provide more in-depth regional-specific climate-rice linkage for screening of better cultivars that can positively respond to future climate fluctuations as well as providing information that may help optimized planting dates for improved radiation use efficiency in the study area.

  11. Continuous water-quality monitoring and regression analysis to estimate constituent concentrations and loads in the Sheyenne River, North Dakota, 1980-2006

    USGS Publications Warehouse

    Ryberg, Karen R.

    2007-01-01

    This report presents the results of a study by the U.S. Geological Survey, done in cooperation with the North Dakota State Water Commission, to estimate water-quality constituent concentrations at seven sites on the Sheyenne River, N. Dak. Regression analysis of water-quality data collected in 1980-2006 was used to estimate concentrations for hardness, dissolved solids, calcium, magnesium, sodium, and sulfate. The explanatory variables examined for the regression relations were continuously monitored streamflow, specific conductance, and water temperature. For the conditions observed in 1980-2006, streamflow was a significant explanatory variable for some constituents. Specific conductance was a significant explanatory variable for all of the constituents, and water temperature was not a statistically significant explanatory variable for any of the constituents in this study. The regression relations were evaluated using common measures of variability, including R2, the proportion of variability in the estimated constituent concentration explained by the explanatory variables and regression equation. R2 values ranged from 0.784 for calcium to 0.997 for dissolved solids. The regression relations also were evaluated by calculating the median relative percentage difference (RPD) between measured constituent concentration and the constituent concentration estimated by the regression equations. Median RPDs ranged from 1.7 for dissolved solids to 11.5 for sulfate. The regression relations also may be used to estimate daily constituent loads. The relations should be monitored for change over time, especially at sites 2 and 3 which have a short period of record. In addition, caution should be used when the Sheyenne River is affected by ice or when upstream sites are affected by isolated storm runoff. Almost all of the outliers and highly influential samples removed from the analysis were made during periods when the Sheyenne River might be affected by ice.

  12. Methods for identifying SNP interactions: a review on variations of Logic Regression, Random Forest and Bayesian logistic regression.

    PubMed

    Chen, Carla Chia-Ming; Schwender, Holger; Keith, Jonathan; Nunkesser, Robin; Mengersen, Kerrie; Macrossan, Paula

    2011-01-01

    Due to advancements in computational ability, enhanced technology and a reduction in the price of genotyping, more data are being generated for understanding genetic associations with diseases and disorders. However, with the availability of large data sets comes the inherent challenges of new methods of statistical analysis and modeling. Considering a complex phenotype may be the effect of a combination of multiple loci, various statistical methods have been developed for identifying genetic epistasis effects. Among these methods, logic regression (LR) is an intriguing approach incorporating tree-like structures. Various methods have built on the original LR to improve different aspects of the model. In this study, we review four variations of LR, namely Logic Feature Selection, Monte Carlo Logic Regression, Genetic Programming for Association Studies, and Modified Logic Regression-Gene Expression Programming, and investigate the performance of each method using simulated and real genotype data. We contrast these with another tree-like approach, namely Random Forests, and a Bayesian logistic regression with stochastic search variable selection.

  13. Lidocaine dose-response effect on postoperative cognitive deficit: meta-analysis and meta-regression.

    PubMed

    Habibi, Mohammad Reza; Habibi, Valiollah; Habibi, Ali; Soleimani, Aria

    2018-04-01

    The true influence of the perioperative intravenous lidocaine on the development of postoperative cognitive deficit (POCD) in coronary artery bypass grafting (CABG) remains controversial. The principal aim is to undertake a meta-regression to determine whether moderator variables mediate the relationship between lidocaine and POCD. Areas covered: We searched the Web of Science, PubMed database, Scopus and the Cochrane Library database (up to June 2017) and systematically reviewed a list of retrieved articles. Our final review includes only randomized controlled trials (RCTs) that compared infusion of lidocaine and placebo during cardiopulmonary bypass (CPB). Mantel-Haenszel risk ratio (MH RR) and corresponding 95% confidence interval (CI) was used to report the overall effect and meta-regression analysis. A total of 688 patients in five RCTs were included. POCD occurred in 34% of all cases. Perioperative lidocaine reduces POCD (MH RR 0.702 (95% CI: 0.541-0.909). Younger age, male gender, longer CPB and higher concentration of lidocaine significantly mediate the relationship between lidocaine and POCD in favour of the neuroprotective effect of lidocaine. Expert commentary: The neuroprotective effect of lidocaine on POCD is consistent in spite of longer CPB time. A higher concentration of lidocaine strengthened the neuroprotective effect of lidocaine.

  14. Regression Analysis of Optical Coherence Tomography Disc Variables for Glaucoma Diagnosis.

    PubMed

    Richter, Grace M; Zhang, Xinbo; Tan, Ou; Francis, Brian A; Chopra, Vikas; Greenfield, David S; Varma, Rohit; Schuman, Joel S; Huang, David

    2016-08-01

    To report diagnostic accuracy of optical coherence tomography (OCT) disc variables using both time-domain (TD) and Fourier-domain (FD) OCT, and to improve the use of OCT disc variable measurements for glaucoma diagnosis through regression analyses that adjust for optic disc size and axial length-based magnification error. Observational, cross-sectional. In total, 180 normal eyes of 112 participants and 180 eyes of 138 participants with perimetric glaucoma from the Advanced Imaging for Glaucoma Study. Diagnostic variables evaluated from TD-OCT and FD-OCT were: disc area, rim area, rim volume, optic nerve head volume, vertical cup-to-disc ratio (CDR), and horizontal CDR. These were compared with overall retinal nerve fiber layer thickness and ganglion cell complex. Regression analyses were performed that corrected for optic disc size and axial length. Area-under-receiver-operating curves (AUROC) were used to assess diagnostic accuracy before and after the adjustments. An index based on multiple logistic regression that combined optic disc variables with axial length was also explored with the aim of improving diagnostic accuracy of disc variables. Comparison of diagnostic accuracy of disc variables, as measured by AUROC. The unadjusted disc variables with the highest diagnostic accuracies were: rim volume for TD-OCT (AUROC=0.864) and vertical CDR (AUROC=0.874) for FD-OCT. Magnification correction significantly worsened diagnostic accuracy for rim variables, and while optic disc size adjustments partially restored diagnostic accuracy, the adjusted AUROCs were still lower. Axial length adjustments to disc variables in the form of multiple logistic regression indices led to a slight but insignificant improvement in diagnostic accuracy. Our various regression approaches were not able to significantly improve disc-based OCT glaucoma diagnosis. However, disc rim area and vertical CDR had very high diagnostic accuracy, and these disc variables can serve to complement

  15. Neuropsychometric tests in cross sectional and longitudinal studies - a regression analysis of ADAS - cog, SKT and MMSE.

    PubMed

    Ihl, R; Grass-Kapanke, B; Jänner, M; Weyer, G

    1999-11-01

    In clinical and drug studies, different neuropsychometric tests are used. So far, no empirical data have been published to compare studies using different tests. The purpose of this study was to calculate a regression formula allowing a comparison of cross-sectional and longitudinal data from three neuropsychometric tests that are frequently used in drug studies (Alzheimer's Disease Assessment Scale, ADAS-cog; Syndrom Kurz Test, SKT; Mini Mental State Examination, MMSE). 177 patients with dementia according to ICD10 criteria were studied for the cross sectional and 61 for the longitudinal analysis. Correlations and linear regressions were calculated between tests. Significance was proven with ANOVA and t-tests using the SPSS statistical package. Significant Spearman correlations and slopes in the regression occurred in the cross sectional analysis (ADAS-cog-SKT r(s) = 0.77, slope = 0.45, SKT-ADAS-cog slope = 1.3, r2 = 0.59; ADAS-cog-MMSE r2 = 0.76, slope = -0.42, MMSE-ADAS-cog slope = -1.5, r2 = 0.64; MMSE-SKT r(s) = -0.79, slope = -0.87, SKT-MMSE slope = -0.71, r2 = 0.62; p<0.001 after Bonferroni correction; N = 177) and in the longitudinal analysis (SKT-ADAS-cog, r(s) = 0.48, slope = 0.69, ADAS-cog-SKT slope = 0.69, p<0.001, r2 = 0.32, MMSE-SKT, r(s) = 0.44, slope = -0.41, SKT-MMSE, slope = -0.55, p<0.001, r2 = 0.21). The results allow calculation of ADAS-scores when SKT scores are given, and vice versa. In longitudinal studies or in the course of the disease, scores assessed with the ADAS-cog and the SKT may now be statistically compared. In all comparisons, bottom and ceiling effects of the tests have to be taken into account.

  16. Effective psychological and psychosocial approaches to reduce repetition of self-harm: a systematic review, meta-analysis and meta-regression

    PubMed Central

    Robinson, Jo; Spittal, Matthew J; Carter, Greg

    2016-01-01

    Objective To examine the efficacy of psychological and psychosocial interventions for reductions in repeated self-harm. Design We conducted a systematic review, meta-analysis and meta-regression to examine the efficacy of psychological and psychosocial interventions to reduce repeat self-harm in adults. We included a sensitivity analysis of studies with a low risk of bias for the meta-analysis. For the meta-regression, we examined whether the type, intensity (primary analyses) and other components of intervention or methodology (secondary analyses) modified the overall intervention effect. Data sources A comprehensive search of MEDLINE, PsycInfo and EMBASE (from 1999 to June 2016) was performed. Eligibility criteria for selecting studies Randomised controlled trials of psychological and psychosocial interventions for adult self-harm patients. Results Forty-five trials were included with data available from 36 (7354 participants) for the primary analysis. Meta-analysis showed a significant benefit of all psychological and psychosocial interventions combined (risk ratio 0.84; 95% CI 0.74 to 0.96; number needed to treat=33); however, sensitivity analyses showed that this benefit was non-significant when restricted to a limited number of high-quality studies. Meta-regression showed that the type of intervention did not modify the treatment effects. Conclusions Consideration of a psychological or psychosocial intervention over and above treatment as usual is worthwhile; with the public health benefits of ensuring that this practice is widely adopted potentially worth the investment. However, the specific type and nature of the intervention that should be delivered is not yet clear. Cognitive–behavioural therapy or interventions with an interpersonal focus and targeted on the precipitants to self-harm may be the best candidates on the current evidence. Further research is required. PMID:27660314

  17. Difficulties with Regression Analysis of Age-Adjusted Rates.

    DTIC Science & Technology

    1982-09-01

    variables used in those analyses, such as death rates in various states, have been age adjusted, whereas the predictor variables have not been age adjusted...The use of crude state death rates as the outcome variable with crude covariates and age as predictors can avoid the problem, at least under some...should be regressed on age-adjusted exposure Z+B+ Although age-specific death rates , Yas+’ may be available, it is often difficult to obtain age

  18. Ridge Regression Signal Processing

    NASA Technical Reports Server (NTRS)

    Kuhl, Mark R.

    1990-01-01

    The introduction of the Global Positioning System (GPS) into the National Airspace System (NAS) necessitates the development of Receiver Autonomous Integrity Monitoring (RAIM) techniques. In order to guarantee a certain level of integrity, a thorough understanding of modern estimation techniques applied to navigational problems is required. The extended Kalman filter (EKF) is derived and analyzed under poor geometry conditions. It was found that the performance of the EKF is difficult to predict, since the EKF is designed for a Gaussian environment. A novel approach is implemented which incorporates ridge regression to explain the behavior of an EKF in the presence of dynamics under poor geometry conditions. The basic principles of ridge regression theory are presented, followed by the derivation of a linearized recursive ridge estimator. Computer simulations are performed to confirm the underlying theory and to provide a comparative analysis of the EKF and the recursive ridge estimator.

  19. Multivariate logistic regression analysis of postoperative complications and risk model establishment of gastrectomy for gastric cancer: A single-center cohort report.

    PubMed

    Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing

    2016-01-01

    Reporting of surgical complications is common, but few provide information about the severity and estimate risk factors of complications. If have, but lack of specificity. We retrospectively analyzed data on 2795 gastric cancer patients underwent surgical procedure at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, established multivariate logistic regression model to predictive risk factors related to the postoperative complications according to the Clavien-Dindo classification system. Twenty-four out of 86 variables were identified statistically significant in univariate logistic regression analysis, 11 significant variables entered multivariate analysis were employed to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, introperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on logistic regression equation, p=Exp∑BiXi / (1+Exp∑BiXi), multivariate logistic regression predictive model that calculated the risk of postoperative morbidity was developed, p = 1/(1 + e((4.810-1.287X1-0.504X2-0.500X3-0.474X4-0.405X5-0.318X6-0.316X7-0.305X8-0.278X9-0.255X10-0.138X11))). The accuracy, sensitivity and specificity of the model to predict the postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model based on Clavien-Dindo grading severity of complications system and logistic regression analysis can predict severe morbidity specific to an individual patient's risk factors, estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool and may serve as a template for the development of risk models for other surgical groups.

  20. Analysis and Interpretation of Findings Using Multiple Regression Techniques

    ERIC Educational Resources Information Center

    Hoyt, William T.; Leierer, Stephen; Millington, Michael J.

    2006-01-01

    Multiple regression and correlation (MRC) methods form a flexible family of statistical techniques that can address a wide variety of different types of research questions of interest to rehabilitation professionals. In this article, we review basic concepts and terms, with an emphasis on interpretation of findings relevant to research questions…

  1. Evaluation of the comprehensive palatability of Japanese sake paired with dishes by multiple regression analysis based on subdomains.

    PubMed

    Nakamura, Ryo; Nakano, Kumiko; Tamura, Hiroyasu; Mizunuma, Masaki; Fushiki, Tohru; Hirata, Dai

    2017-08-01

    Many factors contribute to palatability. In order to evaluate the palatability of Japanese alcohol sake paired with certain dishes by integrating multiple factors, here we applied an evaluation method previously reported for palatability of cheese by multiple regression analysis based on 3 subdomain factors (rewarding, cultural, and informational). We asked 94 Japanese participants/subjects to evaluate the palatability of sake (1st evaluation/E1 for the first cup, 2nd/E2 and 3rd/E3 for the palatability with aftertaste/afterglow of certain dishes) and to respond to a questionnaire related to 3 subdomains. In E1, 3 factors were extracted by a factor analysis, and the subsequent multiple regression analyses indicated that the palatability of sake was interpreted by mainly the rewarding. Further, the results of attribution-dissections in E1 indicated that 2 factors (rewarding and informational) contributed to the palatability. Finally, our results indicated that the palatability of sake was influenced by the dish eaten just before drinking.

  2. Internet search query analysis can be used to demonstrate the rapidly increasing public awareness of palliative care in the USA.

    PubMed

    McLean, Sarah; Lennon, Paul; Glare, Paul

    2017-01-27

    A lack of public awareness of palliative care (PC) has been identified as one of the main barriers to appropriate PC access. Internet search query analysis is a novel methodology, which has been effectively used in surveillance of infectious diseases, and can be used to monitor public awareness of health-related topics. We aimed to demonstrate the utility of internet search query analysis to evaluate changes in public awareness of PC in the USA between 2005 and 2015. Google Trends provides a referenced score for the popularity of a search term, for defined regions over defined time periods. The popularity of the search term 'palliative care' was measured monthly between 1/1/2005 and 31/12/2015 in the USA and in the UK. Results were analysed using independent t-tests and joinpoint analysis. The mean monthly popularity of the search term increased between 2008-2009 (p<0.001), 2011-2012 (p<0.001), 2013-2014 (p=0.004) and 2014-2015 (p=0.002) in the USA. Joinpoint analysis was used to evaluate the monthly percentage change (MPC) in the popularity of the search term. In the USA, the MPC increase was 0.6%/month (p<0.05); in the UK the MPC of 0.05% was non-significant. Although internet search query surveillance is a novel methodology, it is freely accessible and has significant potential to monitor health-seeking behaviour among the public. PC is rapidly growing in the USA, and the rapidly increasing public awareness of PC as demonstrated in this study, in comparison with the UK, where PC is relatively well established is encouraging in increasingly ensuring appropriate PC access for all. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.

  3. Isolating and Examining Sources of Suppression and Multicollinearity in Multiple Linear Regression

    ERIC Educational Resources Information Center

    Beckstead, Jason W.

    2012-01-01

    The presence of suppression (and multicollinearity) in multiple regression analysis complicates interpretation of predictor-criterion relationships. The mathematical conditions that produce suppression in regression analysis have received considerable attention in the methodological literature but until now nothing in the way of an analytic…

  4. Predicting the aquatic toxicity mode of action using logistic regression and linear discriminant analysis.

    PubMed

    Ren, Y Y; Zhou, L C; Yang, L; Liu, P Y; Zhao, B W; Liu, H X

    2016-09-01

    The paper highlights the use of the logistic regression (LR) method in the construction of acceptable statistically significant, robust and predictive models for the classification of chemicals according to their aquatic toxic modes of action. Essentials accounting for a reliable model were all considered carefully. The model predictors were selected by stepwise forward discriminant analysis (LDA) from a combined pool of experimental data and chemical structure-based descriptors calculated by the CODESSA and DRAGON software packages. Model predictive ability was validated both internally and externally. The applicability domain was checked by the leverage approach to verify prediction reliability. The obtained models are simple and easy to interpret. In general, LR performs much better than LDA and seems to be more attractive for the prediction of the more toxic compounds, i.e. compounds that exhibit excess toxicity versus non-polar narcotic compounds and more reactive compounds versus less reactive compounds. In addition, model fit and regression diagnostics was done through the influence plot which reflects the hat-values, studentized residuals, and Cook's distance statistics of each sample. Overdispersion was also checked for the LR model. The relationships between the descriptors and the aquatic toxic behaviour of compounds are also discussed.

  5. Spectral distance decay: Assessing species beta-diversity by quantile regression

    USGS Publications Warehouse

    Rocchinl, D.; Nagendra, H.; Ghate, R.; Cade, B.S.

    2009-01-01

    Remotely sensed data represents key information for characterizing and estimating biodiversity. Spectral distance among sites has proven to be a powerful approach for detecting species composition variability. Regression analysis of species similarity versus spectral distance may allow us to quantitatively estimate how beta-diversity in species changes with respect to spectral and ecological variability. In classical regression analysis, the residual sum of squares is minimized for the mean of the dependent variable distribution. However, many ecological datasets are characterized by a high number of zeroes that can add noise to the regression model. Quantile regression can be used to evaluate trend in the upper quantiles rather than a mean trend across the whole distribution of the dependent variable. In this paper, we used ordinary least square (ols) and quantile regression to estimate the decay of species similarity versus spectral distance. The achieved decay rates were statistically nonzero (p < 0.05) considering both ols and quantile regression. Nonetheless, ols regression estimate of mean decay rate was only half the decay rate indicated by the upper quantiles. Moreover, the intercept value, representing the similarity reached when spectral distance approaches zero, was very low compared with the intercepts of upper quantiles, which detected high species similarity when habitats are more similar. In this paper we demonstrated the power of using quantile regressions applied to spectral distance decay in order to reveal species diversity patterns otherwise lost or underestimated by ordinary least square regression. ?? 2009 American Society for Photogrammetry and Remote Sensing.

  6. The process and utility of classification and regression tree methodology in nursing research

    PubMed Central

    Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda

    2014-01-01

    Aim This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Background Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Design Discussion paper. Data sources English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984–2013. Discussion Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Implications for Nursing Research Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Conclusion Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. PMID:24237048

  7. The process and utility of classification and regression tree methodology in nursing research.

    PubMed

    Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda

    2014-06-01

    This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Discussion paper. English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984-2013. Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. © 2013 The Authors. Journal of Advanced Nursing Published by John Wiley & Sons Ltd.

  8. Epstein-Barr Virus and Gastric Cancer Risk: A Meta-analysis With Meta-regression of Case-control Studies.

    PubMed

    Bae, Jong-Myon; Kim, Eun Hee

    2016-03-01

    Research on how the risk of gastric cancer increases with Epstein-Barr virus (EBV) infection is lacking. In a systematic review that investigated studies published until September 2014, the authors did not calculate the summary odds ratio (SOR) due to heterogeneity across studies. Therefore, we include here additional studies published until October 2015 and conduct a meta-analysis with meta-regression that controls for the heterogeneity among studies. Using the studies selected in the previously published systematic review, we formulated lists of references, cited articles, and related articles provided by PubMed. From the lists, only case-control studies that detected EBV in tissue samples were selected. In order to control for the heterogeneity among studies, subgroup analysis and meta-regression were performed. In the 33 case-control results with adjacent non-cancer tissue, the total number of test samples in the case and control groups was 5280 and 4962, respectively. In the 14 case-control results with normal tissue, the total number of test samples in case and control groups was 1393 and 945, respectively. Upon meta-regression, the type of control tissue was found to be a statistically significant variable with regard to heterogeneity. When the control tissue was normal tissue of healthy individuals, the SOR was 3.41 (95% CI, 1.78 to 6.51; I-squared, 65.5%). The results of the present study support the argument that EBV infection increases the risk of gastric cancer. In the future, age-matched and sex-matched case-control studies should be conducted.

  9. Biostatistics Series Module 6: Correlation and Linear Regression.

    PubMed

    Hazra, Avijit; Gogtay, Nithya

    2016-01-01

    Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient ( r ). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r 2 denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation ( y = a + bx ), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous.

  10. Biostatistics Series Module 6: Correlation and Linear Regression

    PubMed Central

    Hazra, Avijit; Gogtay, Nithya

    2016-01-01

    Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r2 denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous. PMID:27904175

  11. A note on variance estimation in random effects meta-regression.

    PubMed

    Sidik, Kurex; Jonkman, Jeffrey N

    2005-01-01

    For random effects meta-regression inference, variance estimation for the parameter estimates is discussed. Because estimated weights are used for meta-regression analysis in practice, the assumed or estimated covariance matrix used in meta-regression is not strictly correct, due to possible errors in estimating the weights. Therefore, this note investigates the use of a robust variance estimation approach for obtaining variances of the parameter estimates in random effects meta-regression inference. This method treats the assumed covariance matrix of the effect measure variables as a working covariance matrix. Using an example of meta-analysis data from clinical trials of a vaccine, the robust variance estimation approach is illustrated in comparison with two other methods of variance estimation. A simulation study is presented, comparing the three methods of variance estimation in terms of bias and coverage probability. We find that, despite the seeming suitability of the robust estimator for random effects meta-regression, the improved variance estimator of Knapp and Hartung (2003) yields the best performance among the three estimators, and thus may provide the best protection against errors in the estimated weights.

  12. Determination of boiling point of petrochemicals by gas chromatography-mass spectrometry and multivariate regression analysis of structural activity relationship.

    PubMed

    Fakayode, Sayo O; Mitchell, Breanna S; Pollard, David A

    2014-08-01

    Accurate understanding of analyte boiling points (BP) is of critical importance in gas chromatographic (GC) separation and crude oil refinery operation in petrochemical industries. This study reported the first combined use of GC separation and partial-least-square (PLS1) multivariate regression analysis of petrochemical structural activity relationship (SAR) for accurate BP determination of two commercially available (D3710 and MA VHP) calibration gas mix samples. The results of the BP determination using PLS1 multivariate regression were further compared with the results of traditional simulated distillation method of BP determination. The developed PLS1 regression was able to correctly predict analytes BP in D3710 and MA VHP calibration gas mix samples, with a root-mean-square-%-relative-error (RMS%RE) of 6.4%, and 10.8% respectively. In contrast, the overall RMS%RE of 32.9% and 40.4%, respectively obtained for BP determination in D3710 and MA VHP using a traditional simulated distillation method were approximately four times larger than the corresponding RMS%RE of BP prediction using MRA, demonstrating the better predictive ability of MRA. The reported method is rapid, robust, and promising, and can be potentially used routinely for fast analysis, pattern recognition, and analyte BP determination in petrochemical industries. Copyright © 2014 Elsevier B.V. All rights reserved.

  13. Outcomes of an intervention to improve hospital antibiotic prescribing: interrupted time series with segmented regression analysis.

    PubMed

    Ansari, Faranak; Gray, Kirsteen; Nathwani, Dilip; Phillips, Gabby; Ogston, Simon; Ramsay, Craig; Davey, Peter

    2003-11-01

    To evaluate an intervention to reduce inappropriate use of key antibiotics with interrupted time series analysis. The intervention is a policy for appropriate use of Alert Antibiotics (carbapenems, glycopeptides, amphotericin, ciprofloxacin, linezolid, piperacillin-tazobactam and third-generation cephalosporins) implemented through concurrent, patient-specific feedback by clinical pharmacists. Statistical significance and effect size were calculated by segmented regression analysis of interrupted time series of drug use and cost for 2 years before and after the intervention started. Use of Alert Antibiotics increased before the intervention started but decreased steadily for 2 years thereafter. The changes in slope of the time series were 0.27 defined daily doses/100 bed-days per month (95% CI 0.19-0.34) and pound 1908 per month (95% CI pound 1238- pound 2578). The cost of development, dissemination and implementation of the intervention ( pound 20133) was well below the most conservative estimate of the reduction in cost ( pound 133296), which is the lower 95% CI of effect size assuming that cost would not have continued to increase without the intervention. However, if use had continued to increase, the difference between predicted and actual cost of Alert Antibiotics was pound 572448 (95% CI pound 435696- pound 709176) over the 24 months after the intervention started. Segmented regression analysis of pharmacy stock data is a simple, practical and robust method for measuring the impact of interventions to change prescribing. The Alert Antibiotic Monitoring intervention was associated with significant decreases in total use and cost in the 2 years after the programme was implemented. In our hospital, the value of the data far exceeded the cost of processing and analysis.

  14. Vertebral artery injury associated with blunt cervical spine trauma: a multivariate regression analysis.

    PubMed

    Lebl, Darren R; Bono, Christopher M; Velmahos, George; Metkar, Umesh; Nguyen, Joseph; Harris, Mitchel B

    2013-07-15

    Retrospective analysis of prospective registry data. To determine the patient characteristics, risk factors, and fracture patterns associated with vertebral artery injury (VAI) in patients with blunt cervical spine injury. VAI associated with cervical spine trauma has the potential for catastrophical clinical sequelae. The patterns of cervical spine injury and patient characteristics associated with VAI remain to be determined. A retrospective review of prospectively collected data from the American College of Surgeons trauma registries at 3 level-1 trauma centers identified all patients with a cervical spine injury on multidetector computed tomographic scan during a 3-year period (January 1, 2007, to January 1, 2010). Fracture pattern and patient characteristics were recorded. Logistic multivariate regression analysis of independent predictors for VAI and subgroup analysis of neurological events related to VAI was performed. Twenty-one percent of 1204 patients with cervical injuries (n = 253) underwent screening for VAI by multidetector computed tomography angiogram. VAI was diagnosed in 17% (42 of 253), unilateral in 15% (38 of 253), and bilateral in 1.6% (4 of 253) and was associated with a lower Glasgow coma scale (P < 0.001), a higher injury severity score (P < 0.01), and a higher mortality (P < 0.001). VAI was associated with ankylosing spondylitis/diffuse idiopathic skeletal hyperosteosis (crude odds ratio [OR] = 8.04; 95% confidence interval [CI], 1.30-49.68; P = 0.034), and occipitocervical dissociation (P < 0.001) by univariate analysis and fracture displacement into the transverse foramen 1 mm or more (adjusted OR = 3.29; 95% CI, 1.15-9.41; P = 0.026), and basilar skull fracture (adjusted OR = 4.25; 95% CI, 1.25-14.47; P= 0.021), by multivariate regression model. Subgroup analyses of neurological events secondary to VAI occurred in 14% (6 of 42) and the stroke-related mortality rate was 4.8% (2 of 42). Neurological events were associated with male sex (P

  15. Benchmark dose analysis via nonparametric regression modeling

    PubMed Central

    Piegorsch, Walter W.; Xiong, Hui; Bhattacharya, Rabi N.; Lin, Lizhen

    2013-01-01

    Estimation of benchmark doses (BMDs) in quantitative risk assessment traditionally is based upon parametric dose-response modeling. It is a well-known concern, however, that if the chosen parametric model is uncertain and/or misspecified, inaccurate and possibly unsafe low-dose inferences can result. We describe a nonparametric approach for estimating BMDs with quantal-response data based on an isotonic regression method, and also study use of corresponding, nonparametric, bootstrap-based confidence limits for the BMD. We explore the confidence limits’ small-sample properties via a simulation study, and illustrate the calculations with an example from cancer risk assessment. It is seen that this nonparametric approach can provide a useful alternative for BMD estimation when faced with the problem of parametric model uncertainty. PMID:23683057

  16. Some Applied Research Concerns Using Multiple Linear Regression Analysis.

    ERIC Educational Resources Information Center

    Newman, Isadore; Fraas, John W.

    The intention of this paper is to provide an overall reference on how a researcher can apply multiple linear regression in order to utilize the advantages that it has to offer. The advantages and some concerns expressed about the technique are examined. A number of practical ways by which researchers can deal with such concerns as…

  17. Automating approximate Bayesian computation by local linear regression.

    PubMed

    Thornton, Kevin R

    2009-07-07

    In several biological contexts, parameter inference often relies on computationally-intensive techniques. "Approximate Bayesian Computation", or ABC, methods based on summary statistics have become increasingly popular. A particular flavor of ABC based on using a linear regression to approximate the posterior distribution of the parameters, conditional on the summary statistics, is computationally appealing, yet no standalone tool exists to automate the procedure. Here, I describe a program to implement the method. The software package ABCreg implements the local linear-regression approach to ABC. The advantages are: 1. The code is standalone, and fully-documented. 2. The program will automatically process multiple data sets, and create unique output files for each (which may be processed immediately in R), facilitating the testing of inference procedures on simulated data, or the analysis of multiple data sets. 3. The program implements two different transformation methods for the regression step. 4. Analysis options are controlled on the command line by the user, and the program is designed to output warnings for cases where the regression fails. 5. The program does not depend on any particular simulation machinery (coalescent, forward-time, etc.), and therefore is a general tool for processing the results from any simulation. 6. The code is open-source, and modular.Examples of applying the software to empirical data from Drosophila melanogaster, and testing the procedure on simulated data, are shown. In practice, the ABCreg simplifies implementing ABC based on local-linear regression.

  18. Hospital charges associated with motorcycle crash factors: a quantile regression analysis.

    PubMed

    Olsen, Cody S; Thomas, Andrea M; Cook, Lawrence J

    2014-08-01

    Previous studies of motorcycle crash (MC) related hospital charges use trauma registries and hospital records, and do not adjust for the number of motorcyclists not requiring medical attention. This may lead to conservative estimates of helmet use effectiveness. MC records were probabilistically linked with emergency department and hospital records to obtain total hospital charges. Missing data were imputed. Multivariable quantile regression estimated reductions in hospital charges associated with helmet use and other crash factors. Motorcycle helmets were associated with reduced median hospital charges of $256 (42% reduction) and reduced 98th percentile of $32,390 (33% reduction). After adjusting for other factors, helmets were associated with reductions in charges in all upper percentiles studied. Quantile regression models described homogenous and heterogeneous associations between other crash factors and charges. Quantile regression comprehensively describes associations between crash factors and hospital charges. Helmet use among motorcyclists is associated with decreased hospital charges. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  19. Detection of Cutting Tool Wear using Statistical Analysis and Regression Model

    NASA Astrophysics Data System (ADS)

    Ghani, Jaharah A.; Rizal, Muhammad; Nuawi, Mohd Zaki; Haron, Che Hassan Che; Ramli, Rizauddin

    2010-10-01

    This study presents a new method for detecting the cutting tool wear based on the measured cutting force signals. A statistical-based method called Integrated Kurtosis-based Algorithm for Z-Filter technique, called I-kaz was used for developing a regression model and 3D graphic presentation of I-kaz 3D coefficient during machining process. The machining tests were carried out using a CNC turning machine Colchester Master Tornado T4 in dry cutting condition. A Kistler 9255B dynamometer was used to measure the cutting force signals, which were transmitted, analyzed, and displayed in the DasyLab software. Various force signals from machining operation were analyzed, and each has its own I-kaz 3D coefficient. This coefficient was examined and its relationship with flank wear lands (VB) was determined. A regression model was developed due to this relationship, and results of the regression model shows that the I-kaz 3D coefficient value decreases as tool wear increases. The result then is used for real time tool wear monitoring.

  20. Multiple Regression Analysis of mRNA-miRNA Associations in Colorectal Cancer Pathway

    PubMed Central

    Wang, Fengfeng; Wong, S. C. Cesar; Chan, Lawrence W. C.; Cho, William C. S.; Yip, S. P.; Yung, Benjamin Y. M.

    2014-01-01

    Background. MicroRNA (miRNA) is a short and endogenous RNA molecule that regulates posttranscriptional gene expression. It is an important factor for tumorigenesis of colorectal cancer (CRC), and a potential biomarker for diagnosis, prognosis, and therapy of CRC. Our objective is to identify the related miRNAs and their associations with genes frequently involved in CRC microsatellite instability (MSI) and chromosomal instability (CIN) signaling pathways. Results. A regression model was adopted to identify the significantly associated miRNAs targeting a set of candidate genes frequently involved in colorectal cancer MSI and CIN pathways. Multiple linear regression analysis was used to construct the model and find the significant mRNA-miRNA associations. We identified three significantly associated mRNA-miRNA pairs: BCL2 was positively associated with miR-16 and SMAD4 was positively associated with miR-567 in the CRC tissue, while MSH6 was positively associated with miR-142-5p in the normal tissue. As for the whole model, BCL2 and SMAD4 models were not significant, and MSH6 model was significant. The significant associations were different in the normal and the CRC tissues. Conclusion. Our results have laid down a solid foundation in exploration of novel CRC mechanisms, and identification of miRNA roles as oncomirs or tumor suppressor mirs in CRC. PMID:24895601

  1. A general framework for the regression analysis of pooled biomarker assessments.

    PubMed

    Liu, Yan; McMahan, Christopher; Gallagher, Colin

    2017-07-10

    As a cost-efficient data collection mechanism, the process of assaying pooled biospecimens is becoming increasingly common in epidemiological research; for example, pooling has been proposed for the purpose of evaluating the diagnostic efficacy of biological markers (biomarkers). To this end, several authors have proposed techniques that allow for the analysis of continuous pooled biomarker assessments. Regretfully, most of these techniques proceed under restrictive assumptions, are unable to account for the effects of measurement error, and fail to control for confounding variables. These limitations are understandably attributable to the complex structure that is inherent to measurements taken on pooled specimens. Consequently, in order to provide practitioners with the tools necessary to accurately and efficiently analyze pooled biomarker assessments, herein, a general Monte Carlo maximum likelihood-based procedure is presented. The proposed approach allows for the regression analysis of pooled data under practically all parametric models and can be used to directly account for the effects of measurement error. Through simulation, it is shown that the proposed approach can accurately and efficiently estimate all unknown parameters and is more computational efficient than existing techniques. This new methodology is further illustrated using monocyte chemotactic protein-1 data collected by the Collaborative Perinatal Project in an effort to assess the relationship between this chemokine and the risk of miscarriage. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  2. Poor Smokers, Poor Quitters, and Cigarette Tax Regressivity

    PubMed Central

    Remler, Dahlia K.

    2004-01-01

    The traditional view that excise taxes are regressive has been challenged. I document the history of the term regressive tax, show that traditional definitions have always found cigarette taxes to be regressive, and illustrate the implications of the greater price responsiveness observed among the poor. I explain the different definitions of tax burden: accounting, welfare-based willingness to pay, and welfare-based time inconsistent. Progressivity (equity across income groups) is sensitive to the way in which tax burden is assessed. Analysis of horizontal equity (fairness within a given income group) shows that cigarette taxes heavily burden poor smokers who do not quit, no matter how tax burden is assessed. PMID:14759931

  3. A Comparison of Rule-based Analysis with Regression Methods in Understanding the Risk Factors for Study Withdrawal in a Pediatric Study.

    PubMed

    Haghighi, Mona; Johnson, Suzanne Bennett; Qian, Xiaoning; Lynch, Kristian F; Vehik, Kendra; Huang, Shuai

    2016-08-26

    Regression models are extensively used in many epidemiological studies to understand the linkage between specific outcomes of interest and their risk factors. However, regression models in general examine the average effects of the risk factors and ignore subgroups with different risk profiles. As a result, interventions are often geared towards the average member of the population, without consideration of the special health needs of different subgroups within the population. This paper demonstrates the value of using rule-based analysis methods that can identify subgroups with heterogeneous risk profiles in a population without imposing assumptions on the subgroups or method. The rules define the risk pattern of subsets of individuals by not only considering the interactions between the risk factors but also their ranges. We compared the rule-based analysis results with the results from a logistic regression model in The Environmental Determinants of Diabetes in the Young (TEDDY) study. Both methods detected a similar suite of risk factors, but the rule-based analysis was superior at detecting multiple interactions between the risk factors that characterize the subgroups. A further investigation of the particular characteristics of each subgroup may detect the special health needs of the subgroup and lead to tailored interventions.

  4. Non-ignorable missingness in logistic regression.

    PubMed

    Wang, Joanna J J; Bartlett, Mark; Ryan, Louise

    2017-08-30

    Nonresponses and missing data are common in observational studies. Ignoring or inadequately handling missing data may lead to biased parameter estimation, incorrect standard errors and, as a consequence, incorrect statistical inference and conclusions. We present a strategy for modelling non-ignorable missingness where the probability of nonresponse depends on the outcome. Using a simple case of logistic regression, we quantify the bias in regression estimates and show the observed likelihood is non-identifiable under non-ignorable missing data mechanism. We then adopt a selection model factorisation of the joint distribution as the basis for a sensitivity analysis to study changes in estimated parameters and the robustness of study conclusions against different assumptions. A Bayesian framework for model estimation is used as it provides a flexible approach for incorporating different missing data assumptions and conducting sensitivity analysis. Using simulated data, we explore the performance of the Bayesian selection model in correcting for bias in a logistic regression. We then implement our strategy using survey data from the 45 and Up Study to investigate factors associated with worsening health from the baseline to follow-up survey. Our findings have practical implications for the use of the 45 and Up Study data to answer important research questions relating to health and quality-of-life. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  5. Independent contrasts and PGLS regression estimators are equivalent.

    PubMed

    Blomberg, Simon P; Lefevre, James G; Wells, Jessie A; Waterhouse, Mary

    2012-05-01

    We prove that the slope parameter of the ordinary least squares regression of phylogenetically independent contrasts (PICs) conducted through the origin is identical to the slope parameter of the method of generalized least squares (GLSs) regression under a Brownian motion model of evolution. This equivalence has several implications: 1. Understanding the structure of the linear model for GLS regression provides insight into when and why phylogeny is important in comparative studies. 2. The limitations of the PIC regression analysis are the same as the limitations of the GLS model. In particular, phylogenetic covariance applies only to the response variable in the regression and the explanatory variable should be regarded as fixed. Calculation of PICs for explanatory variables should be treated as a mathematical idiosyncrasy of the PIC regression algorithm. 3. Since the GLS estimator is the best linear unbiased estimator (BLUE), the slope parameter estimated using PICs is also BLUE. 4. If the slope is estimated using different branch lengths for the explanatory and response variables in the PIC algorithm, the estimator is no longer the BLUE, so this is not recommended. Finally, we discuss whether or not and how to accommodate phylogenetic covariance in regression analyses, particularly in relation to the problem of phylogenetic uncertainty. This discussion is from both frequentist and Bayesian perspectives.

  6. Regression modeling and prediction of road sweeping brush load characteristics from finite element analysis and experimental results.

    PubMed

    Wang, Chong; Sun, Qun; Wahab, Magd Abdel; Zhang, Xingyu; Xu, Limin

    2015-09-01

    Rotary cup brushes mounted on each side of a road sweeper undertake heavy debris removal tasks but the characteristics have not been well known until recently. A Finite Element (FE) model that can analyze brush deformation and predict brush characteristics have been developed to investigate the sweeping efficiency and to assist the controller design. However, the FE model requires large amount of CPU time to simulate each brush design and operating scenario, which may affect its applications in a real-time system. This study develops a mathematical regression model to summarize the FE modeled results. The complex brush load characteristic curves were statistically analyzed to quantify the effects of cross-section, length, mounting angle, displacement and rotational speed etc. The data were then fitted by a multiple variable regression model using the maximum likelihood method. The fitted results showed good agreement with the FE analysis results and experimental results, suggesting that the mathematical regression model may be directly used in a real-time system to predict characteristics of different brushes under varying operating conditions. The methodology may also be used in the design and optimization of rotary brush tools. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. The global prevalence and correlates of skin bleaching: a meta-analysis and meta-regression analysis.

    PubMed

    Sagoe, Dominic; Pallesen, Ståle; Dlova, Ncoza C; Lartey, Margaret; Ezzedine, Khaled; Dadzie, Ophelia

    2018-06-11

    To estimate and investigate the global lifetime prevalence and correlates of skin bleaching. A meta-analysis and meta-regression analysis was performed based on a systematic and comprehensive literature search conducted in Google Scholar, ISI Web of Science, ProQuest, PsycNET, PubMed, and other relevant websites and reference lists. A total of 68 studies (67,665 participants) providing original data on the lifetime prevalence of skin bleaching were included. Publication bias was corrected using the trim and fill procedure. The pooled (imputed) lifetime prevalence of skin bleaching was 27.7% (95% CI: 19.6-37.5, I 2  = 99.6, P < 0.01). The highest significant prevalences were associated with: males (28.0%), topical corticosteroid use (51.8%), Africa (27.1%), persons aged ≤30 years (55.9%), individuals with only primary school education (31.6%), urban or semiurban residents (74.9%), patients (21.3%), data from 2010-2017 (26.8%), dermatological evaluation and testing-based assessment (24.9%), random sampling methods (29.2%), and moderate quality studies (32.3%). The proportion of females in study samples was significantly related to skin bleaching prevalence. Despite some limitations, our results indicate that the practice of skin bleaching is a serious global public health issue that should be addressed through appropriate public health interventions. © 2018 The International Society of Dermatology.

  8. A Systematic Review and Meta-Regression Analysis of Lung Cancer Risk and Inorganic Arsenic in Drinking Water.

    PubMed

    Lamm, Steven H; Ferdosi, Hamid; Dissen, Elisabeth K; Li, Ji; Ahn, Jaeil

    2015-12-07

    High levels (> 200 µg/L) of inorganic arsenic in drinking water are known to be a cause of human lung cancer, but the evidence at lower levels is uncertain. We have sought the epidemiological studies that have examined the dose-response relationship between arsenic levels in drinking water and the risk of lung cancer over a range that includes both high and low levels of arsenic. Regression analysis, based on six studies identified from an electronic search, examined the relationship between the log of the relative risk and the log of the arsenic exposure over a range of 1-1000 µg/L. The best-fitting continuous meta-regression model was sought and found to be a no-constant linear-quadratic analysis where both the risk and the exposure had been logarithmically transformed. This yielded both a statistically significant positive coefficient for the quadratic term and a statistically significant negative coefficient for the linear term. Sub-analyses by study design yielded results that were similar for both ecological studies and non-ecological studies. Statistically significant X-intercepts consistently found no increased level of risk at approximately 100-150 µg/L arsenic.

  9. A Systematic Review and Meta-Regression Analysis of Lung Cancer Risk and Inorganic Arsenic in Drinking Water

    PubMed Central

    Lamm, Steven H.; Ferdosi, Hamid; Dissen, Elisabeth K.; Li, Ji; Ahn, Jaeil

    2015-01-01

    High levels (> 200 µg/L) of inorganic arsenic in drinking water are known to be a cause of human lung cancer, but the evidence at lower levels is uncertain. We have sought the epidemiological studies that have examined the dose-response relationship between arsenic levels in drinking water and the risk of lung cancer over a range that includes both high and low levels of arsenic. Regression analysis, based on six studies identified from an electronic search, examined the relationship between the log of the relative risk and the log of the arsenic exposure over a range of 1–1000 µg/L. The best-fitting continuous meta-regression model was sought and found to be a no-constant linear-quadratic analysis where both the risk and the exposure had been logarithmically transformed. This yielded both a statistically significant positive coefficient for the quadratic term and a statistically significant negative coefficient for the linear term. Sub-analyses by study design yielded results that were similar for both ecological studies and non-ecological studies. Statistically significant X-intercepts consistently found no increased level of risk at approximately 100–150 µg/L arsenic. PMID:26690190

  10. Stature Estimation from Lower Limb Anthropometry using Linear Regression Analysis: A Study on the Malaysian Population.

    PubMed

    Abu Bakar, S N; Aspalilah, A; AbdelNasser, I; Nurliza, A; Hairuliza, M J; Swarhib, M; Das, S; Mohd Nor, F

    2017-01-01

    Stature is one of the characteristics that could be used to identify human, besides age, sex and racial affiliation. This is useful when the body found is either dismembered, mutilated or even decomposed, and helps in narrowing down the missing person's identity. The main aim of the present study was to construct regression functions for stature estimation by using lower limb bones in the Malaysian population. The sample comprised 87 adult individuals (81 males, 6 females) aged between 20 to 79 years. The parameters such as thigh length, lower leg length, leg length, foot length, foot height and foot breadth were measured. They were measured by a ruler and measuring tape. Statistical analysis involved independent t-test to analyse the difference between lower limbs in male and female. The Pearson's correlation test was used to analyse correlations between lower limb parameters and stature, and the linear regressions were used to form equations. The paired t-test was used to compare between actual stature and estimated stature by using the equations formed. Using independent t-test, there was a significant difference (p< 0.05) in the measurement between males and females with regard to leg length, thigh length, lower leg length, foot length and foot breadth. The thigh length, leg length and foot length were observed to have strong correlations with stature with p= 0.75, p= 0.81 and p= 0.69, respectively. Linear regressions were formulated for stature estimation. Paired t-test showed no significant difference between actual stature and estimated stature. It is concluded that regression functions can be used to estimate stature to identify skeletal remains in the Malaysia population.

  11. Robust mislabel logistic regression without modeling mislabel probabilities.

    PubMed

    Hung, Hung; Jou, Zhi-Yu; Huang, Su-Yun

    2018-03-01

    Logistic regression is among the most widely used statistical methods for linear discriminant analysis. In many applications, we only observe possibly mislabeled responses. Fitting a conventional logistic regression can then lead to biased estimation. One common resolution is to fit a mislabel logistic regression model, which takes into consideration of mislabeled responses. Another common method is to adopt a robust M-estimation by down-weighting suspected instances. In this work, we propose a new robust mislabel logistic regression based on γ-divergence. Our proposal possesses two advantageous features: (1) It does not need to model the mislabel probabilities. (2) The minimum γ-divergence estimation leads to a weighted estimating equation without the need to include any bias correction term, that is, it is automatically bias-corrected. These features make the proposed γ-logistic regression more robust in model fitting and more intuitive for model interpretation through a simple weighting scheme. Our method is also easy to implement, and two types of algorithms are included. Simulation studies and the Pima data application are presented to demonstrate the performance of γ-logistic regression. © 2017, The International Biometric Society.

  12. Subsonic Aircraft With Regression and Neural-Network Approximators Designed

    NASA Technical Reports Server (NTRS)

    Patnaik, Surya N.; Hopkins, Dale A.

    2004-01-01

    At the NASA Glenn Research Center, NASA Langley Research Center's Flight Optimization System (FLOPS) and the design optimization testbed COMETBOARDS with regression and neural-network-analysis approximators have been coupled to obtain a preliminary aircraft design methodology. For a subsonic aircraft, the optimal design, that is the airframe-engine combination, is obtained by the simulation. The aircraft is powered by two high-bypass-ratio engines with a nominal thrust of about 35,000 lbf. It is to carry 150 passengers at a cruise speed of Mach 0.8 over a range of 3000 n mi and to operate on a 6000-ft runway. The aircraft design utilized a neural network and a regression-approximations-based analysis tool, along with a multioptimizer cascade algorithm that uses sequential linear programming, sequential quadratic programming, the method of feasible directions, and then sequential quadratic programming again. Optimal aircraft weight versus the number of design iterations is shown. The central processing unit (CPU) time to solution is given. It is shown that the regression-method-based analyzer exhibited a smoother convergence pattern than the FLOPS code. The optimum weight obtained by the approximation technique and the FLOPS code differed by 1.3 percent. Prediction by the approximation technique exhibited no error for the aircraft wing area and turbine entry temperature, whereas it was within 2 percent for most other parameters. Cascade strategy was required by FLOPS as well as the approximators. The regression method had a tendency to hug the data points, whereas the neural network exhibited a propensity to follow a mean path. The performance of the neural network and regression methods was considered adequate. It was at about the same level for small, standard, and large models with redundancy ratios (defined as the number of input-output pairs to the number of unknown coefficients) of 14, 28, and 57, respectively. In an SGI octane workstation (Silicon Graphics

  13. Marginal regression analysis of recurrent events with coarsened censoring times.

    PubMed

    Hu, X Joan; Rosychuk, Rhonda J

    2016-12-01

    Motivated by an ongoing pediatric mental health care (PMHC) study, this article presents weakly structured methods for analyzing doubly censored recurrent event data where only coarsened information on censoring is available. The study extracted administrative records of emergency department visits from provincial health administrative databases. The available information of each individual subject is limited to a subject-specific time window determined up to concealed data. To evaluate time-dependent effect of exposures, we adapt the local linear estimation with right censored survival times under the Cox regression model with time-varying coefficients (cf. Cai and Sun, Scandinavian Journal of Statistics 2003, 30, 93-111). We establish the pointwise consistency and asymptotic normality of the regression parameter estimator, and examine its performance by simulation. The PMHC study illustrates the proposed approach throughout the article. © 2016, The International Biometric Society.

  14. Age-period-cohort analysis of the suicide rate in Korea.

    PubMed

    Park, Chiho; Jee, Yon Ho; Jung, Keum Ji

    2016-04-01

    The suicide rate has been increasing in Korea, and the country now has the highest rank in the world. This study aimed to present the long-term trends in Korea's suicide rate using Joinpoint analysis and age-period-cohort (APC) modeling. The population and the number of suicides for each five-year age group were obtained from the National Statistical Office for the period 1984-2013 for Koreans aged 10 years and older. We determined the changes in the trends in age-standardized mortality rates using Joinpoint. APC modeling was performed to describe the trends in the suicide rate using the intrinsic estimator method. The age-standardized suicide rate in men rapidly increased from 1989 to 2004, and slightly increased thereafter, whereas the suicide rate in women increased from 1989 to 2009 and then decreased thereafter. Within the same period, the suicide rate was higher among the older age groups than in the younger groups. Within the same birth cohort, the suicide rate of the older groups was also higher than that in the younger groups. Within the same age group, the suicide rate of the younger cohorts was higher than it was in the older cohorts. In the APC modeling, old age, recent period, and having been born before 1924 were associated with higher suicide rates. The accuracy and completeness of the suicide rate data may lead to bias. This study showed an increasing trend in the suicide rates for men and women after 1989. These trends may be mainly attributed to cohort effects. Copyright © 2016 Elsevier B.V. All rights reserved.

  15. Regression analysis on the variation in efficiency frontiers for prevention stage of HIV/AIDS.

    PubMed

    Kamae, Maki S; Kamae, Isao; Cohen, Joshua T; Neumann, Peter J

    2011-01-01

    To investigate how the cost effectiveness of preventing HIV/AIDS varies across possible efficiency frontiers (EFs) by taking into account potentially relevant external factors, such as prevention stage, and how the EFs can be characterized using regression analysis given uncertainty of the QALY-cost estimates. We reviewed cost-effectiveness estimates for the prevention and treatment of HIV/AIDS published from 2002-2007 and catalogued in the Tufts Medical Center Cost-Effectiveness Analysis (CEA) Registry. We constructed efficiency frontier (EF) curves by plotting QALYs against costs, using methods used by the Institute for Quality and Efficiency in Health Care (IQWiG) in Germany. We stratified the QALY-cost ratios by prevention stage, country of study, and payer perspective, and estimated EF equations using log and square-root models. A total of 53 QALY-cost ratios were identified for HIV/AIDS in the Tufts CEA Registry. Plotted ratios stratified by prevention stage were visually grouped into a cluster consisting of primary/secondary prevention measures and a cluster consisting of tertiary measures. Correlation coefficients for each cluster were statistically significant. For each cluster, we derived two EF equations - one based on the log model, and one based on the square-root model. Our findings indicate that stratification of HIV/AIDS interventions by prevention stage can yield distinct EFs, and that the correlation and regression analyses are useful for parametrically characterizing EF equations. Our study has certain limitations, such as the small number of included articles and the potential for study populations to be non-representative of countries of interest. Nonetheless, our approach could help develop a deeper appreciation of cost effectiveness beyond the deterministic approach developed by IQWiG.

  16. A comparison of regression methods for model selection in individual-based landscape genetic analysis.

    PubMed

    Shirk, Andrew J; Landguth, Erin L; Cushman, Samuel A

    2018-01-01

    Anthropogenic migration barriers fragment many populations and limit the ability of species to respond to climate-induced biome shifts. Conservation actions designed to conserve habitat connectivity and mitigate barriers are needed to unite fragmented populations into larger, more viable metapopulations, and to allow species to track their climate envelope over time. Landscape genetic analysis provides an empirical means to infer landscape factors influencing gene flow and thereby inform such conservation actions. However, there are currently many methods available for model selection in landscape genetics, and considerable uncertainty as to which provide the greatest accuracy in identifying the true landscape model influencing gene flow among competing alternative hypotheses. In this study, we used population genetic simulations to evaluate the performance of seven regression-based model selection methods on a broad array of landscapes that varied by the number and type of variables contributing to resistance, the magnitude and cohesion of resistance, as well as the functional relationship between variables and resistance. We also assessed the effect of transformations designed to linearize the relationship between genetic and landscape distances. We found that linear mixed effects models had the highest accuracy in every way we evaluated model performance; however, other methods also performed well in many circumstances, particularly when landscape resistance was high and the correlation among competing hypotheses was limited. Our results provide guidance for which regression-based model selection methods provide the most accurate inferences in landscape genetic analysis and thereby best inform connectivity conservation actions. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.

  17. Understanding Child Stunting in India: A Comprehensive Analysis of Socio-Economic, Nutritional and Environmental Determinants Using Additive Quantile Regression

    PubMed Central

    Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.

    2013-01-01

    Background Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839

  18. Understanding child stunting in India: a comprehensive analysis of socio-economic, nutritional and environmental determinants using additive quantile regression.

    PubMed

    Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A

    2013-01-01

    Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Using cross-sectional data for children aged 0-24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role.

  19. Semiparametric regression analysis of interval-censored competing risks data.

    PubMed

    Mao, Lu; Lin, Dan-Yu; Zeng, Donglin

    2017-09-01

    Interval-censored competing risks data arise when each study subject may experience an event or failure from one of several causes and the failure time is not observed directly but rather is known to lie in an interval between two examinations. We formulate the effects of possibly time-varying (external) covariates on the cumulative incidence or sub-distribution function of competing risks (i.e., the marginal probability of failure from a specific cause) through a broad class of semiparametric regression models that captures both proportional and non-proportional hazards structures for the sub-distribution. We allow each subject to have an arbitrary number of examinations and accommodate missing information on the cause of failure. We consider nonparametric maximum likelihood estimation and devise a fast and stable EM-type algorithm for its computation. We then establish the consistency, asymptotic normality, and semiparametric efficiency of the resulting estimators for the regression parameters by appealing to modern empirical process theory. In addition, we show through extensive simulation studies that the proposed methods perform well in realistic situations. Finally, we provide an application to a study on HIV-1 infection with different viral subtypes. © 2017, The International Biometric Society.

  20. A logistic regression analysis of factors related to the treatment compliance of infertile patients with polycystic ovary syndrome.

    PubMed

    Li, Saijiao; He, Aiyan; Yang, Jing; Yin, TaiLang; Xu, Wangming

    2011-01-01

    To investigate factors that can affect compliance with treatment of polycystic ovary syndrome (PCOS) in infertile patients and to provide a basis for clinical treatment, specialist consultation and health education. Patient compliance was assessed via a questionnaire based on the Morisky-Green test and the treatment principles of PCOS. Then interviews were conducted with 99 infertile patients diagnosed with PCOS at Renmin Hospital of Wuhan University in China, from March to September 2009. Finally, these data were analyzed using logistic regression analysis. Logistic regression analysis revealed that a total of 23 (25.6%) of the participants showed good compliance. Factors that significantly (p < 0.05) affected compliance with treatment were the patient's body mass index, convenience of medical treatment and concerns about adverse drug reactions. Patients who are obese, experience inconvenient medical treatment or are concerned about adverse drug reactions are more likely to exhibit noncompliance. Treatment education and intervention aimed at these patients should be strengthened in the clinic to improve treatment compliance. Further research is needed to better elucidate the compliance behavior of patients with PCOS.

  1. Demand analysis of flood insurance by using logistic regression model and genetic algorithm

    NASA Astrophysics Data System (ADS)

    Sidi, P.; Mamat, M. B.; Sukono; Supian, S.; Putra, A. S.

    2018-03-01

    Citarum River floods in the area of South Bandung Indonesia, often resulting damage to some buildings belonging to the people living in the vicinity. One effort to alleviate the risk of building damage is to have flood insurance. The main obstacle is not all people in the Citarum basin decide to buy flood insurance. In this paper, we intend to analyse the decision to buy flood insurance. It is assumed that there are eight variables that influence the decision of purchasing flood assurance, include: income level, education level, house distance with river, building election with road, flood frequency experience, flood prediction, perception on insurance company, and perception towards government effort in handling flood. The analysis was done by using logistic regression model, and to estimate model parameters, it is done with genetic algorithm. The results of the analysis shows that eight variables analysed significantly influence the demand of flood insurance. These results are expected to be considered for insurance companies, to influence the decision of the community to be willing to buy flood insurance.

  2. Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert Manfred

    2013-01-01

    A new regression model search algorithm was developed in 2011 that may be used to analyze both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The new algorithm is a simplified version of a more complex search algorithm that was originally developed at the NASA Ames Balance Calibration Laboratory. The new algorithm has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression models. Therefore, the simplified search algorithm is not intended to replace the original search algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm either fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new regression model search algorithm.

  3. An investigation of correlation between pilot scanning behavior and workload using stepwise regression analysis

    NASA Technical Reports Server (NTRS)

    Waller, M. C.

    1976-01-01

    An electro-optical device called an oculometer which tracks a subject's lookpoint as a time function has been used to collect data in a real-time simulation study of instrument landing system (ILS) approaches. The data describing the scanning behavior of a pilot during the instrument approaches have been analyzed by use of a stepwise regression analysis technique. A statistically significant correlation between pilot workload, as indicated by pilot ratings, and scanning behavior has been established. In addition, it was demonstrated that parameters derived from the scanning behavior data can be combined in a mathematical equation to provide a good representation of pilot workload.

  4. Frontier-based techniques in measuring hospital efficiency in Iran: a systematic review and meta-regression analysis

    PubMed Central

    2013-01-01

    Background In recent years, there has been growing interest in measuring the efficiency of hospitals in Iran and several studies have been conducted on the topic. The main objective of this paper was to review studies in the field of hospital efficiency and examine the estimated technical efficiency (TE) of Iranian hospitals. Methods Persian and English databases were searched for studies related to measuring hospital efficiency in Iran. Ordinary least squares (OLS) regression models were applied for statistical analysis. The PRISMA guidelines were followed in the search process. Results A total of 43 efficiency scores from 29 studies were retrieved and used to approach the research question. Data envelopment analysis was the principal frontier efficiency method in the estimation of efficiency scores. The pooled estimate of mean TE was 0.846 (±0.134). There was a considerable variation in the efficiency scores between the different studies performed in Iran. There were no differences in efficiency scores between data envelopment analysis (DEA) and stochastic frontier analysis (SFA) techniques. The reviewed studies are generally similar and suffer from similar methodological deficiencies, such as no adjustment for case mix and quality of care differences. The results of OLS regression revealed that studies that included more variables and more heterogeneous hospitals generally reported higher TE. Larger sample size was associated with reporting lower TE. Conclusions The features of frontier-based techniques had a profound impact on the efficiency scores among Iranian hospital studies. These studies suffer from major methodological deficiencies and were of sub-optimal quality, limiting their validity and reliability. It is suggested that improving data collection and processing in Iranian hospital databases may have a substantial impact on promoting the quality of research in this field. PMID:23945011

  5. Development of LACIE CCEA-1 weather/wheat yield models. [regression analysis

    NASA Technical Reports Server (NTRS)

    Strommen, N. D.; Sakamoto, C. M.; Leduc, S. K.; Umberger, D. E. (Principal Investigator)

    1979-01-01

    The advantages and disadvantages of the casual (phenological, dynamic, physiological), statistical regression, and analog approaches to modeling for grain yield are examined. Given LACIE's primary goal of estimating wheat production for the large areas of eight major wheat-growing regions, the statistical regression approach of correlating historical yield and climate data offered the Center for Climatic and Environmental Assessment the greatest potential return within the constraints of time and data sources. The basic equation for the first generation wheat-yield model is given. Topics discussed include truncation, trend variable, selection of weather variables, episodic events, strata selection, operational data flow, weighting, and model results.

  6. Beyond Multiple Regression: Using Commonality Analysis to Better Understand R[superscript 2] Results

    ERIC Educational Resources Information Center

    Warne, Russell T.

    2011-01-01

    Multiple regression is one of the most common statistical methods used in quantitative educational research. Despite the versatility and easy interpretability of multiple regression, it has some shortcomings in the detection of suppressor variables and for somewhat arbitrarily assigning values to the structure coefficients of correlated…

  7. Nonlinear Trimodal Regression Analysis of Radiodensitometric Distributions to Quantify Sarcopenic and Sequelae Muscle Degeneration

    PubMed Central

    Árnadóttir, Í.; Gíslason, M. K.; Carraro, U.

    2016-01-01

    Muscle degeneration has been consistently identified as an independent risk factor for high mortality in both aging populations and individuals suffering from neuromuscular pathology or injury. While there is much extant literature on its quantification and correlation to comorbidities, a quantitative gold standard for analyses in this regard remains undefined. Herein, we hypothesize that rigorously quantifying entire radiodensitometric distributions elicits more muscle quality information than average values reported in extant methods. This study reports the development and utility of a nonlinear trimodal regression analysis method utilized on radiodensitometric distributions of upper leg muscles from CT scans of a healthy young adult, a healthy elderly subject, and a spinal cord injury patient. The method was then employed with a THA cohort to assess pre- and postsurgical differences in their healthy and operative legs. Results from the initial representative models elicited high degrees of correlation to HU distributions, and regression parameters highlighted physiologically evident differences between subjects. Furthermore, results from the THA cohort echoed physiological justification and indicated significant improvements in muscle quality in both legs following surgery. Altogether, these results highlight the utility of novel parameters from entire HU distributions that could provide insight into the optimal quantification of muscle degeneration. PMID:28115982

  8. Regression analysis utilizing subjective evaluation of emotional experience in PET studies on emotions.

    PubMed

    Aalto, Sargo; Wallius, Esa; Näätänen, Petri; Hiltunen, Jaana; Metsähonkala, Liisa; Sipilä, Hannu; Karlsson, Hasse

    2005-09-01

    A methodological study on subject-specific regression analysis (SSRA) exploring the correlation between the neural response and the subjective evaluation of emotional experience in eleven healthy females is presented. The target emotions, i.e., amusement and sadness, were induced using validated film clips, regional cerebral blood flow (rCBF) was measured using positron emission tomography (PET), and the subjective intensity of the emotional experience during the PET scanning was measured using a category ratio (CR-10) scale. Reliability analysis of the rating data indicated that the subjects rated the intensity of their emotional experience fairly consistently on the CR-10 scale (Cronbach alphas 0.70-0.97). A two-phase random-effects analysis was performed to ensure the generalizability and inter-study comparability of the SSRA results. Random-effects SSRAs using Statistical non-Parametric Mapping 99 (SnPM99) showed that rCBF correlated with the self-rated intensity of the emotional experience mainly in the brain regions that were identified in the random-effects subtraction analyses using the same imaging data. Our results give preliminary evidence of a linear association between the neural responses related to amusement and sadness and the self-evaluated intensity of the emotional experience in several regions involved in the emotional response. SSRA utilizing subjective evaluation of emotional experience turned out a feasible and promising method of analysis. It allows versatile exploration of the neurobiology of emotions and the neural correlates of actual and individual emotional experience. Thus, SSRA might be able to catch the idiosyncratic aspects of the emotional response better than traditional subtraction analysis.

  9. Effect of Ankle Range of Motion (ROM) and Lower-Extremity Muscle Strength on Static Balance Control Ability in Young Adults: A Regression Analysis

    PubMed Central

    Kim, Seong-Gil

    2018-01-01

    Background The purpose of this study was to investigate the effect of ankle ROM and lower-extremity muscle strength on static balance control ability in young adults. Material/Methods This study was conducted with 65 young adults, but 10 young adults dropped out during the measurement, so 55 young adults (male: 19, female: 36) completed the study. Postural sway (length and velocity) was measured with eyes open and closed, and ankle ROM (AROM and PROM of dorsiflexion and plantarflexion) and lower-extremity muscle strength (flexor and extensor of hip, knee, and ankle joint) were measured. Pearson correlation coefficient was used to examine the correlation between variables and static balance ability. Simple linear regression analysis and multiple linear regression analysis were used to examine the effect of variables on static balance ability. Results In correlation analysis, plantarflexion ROM (AROM and PROM) and lower-extremity muscle strength (except hip extensor) were significantly correlated with postural sway (p<0.05). In simple correlation analysis, all variables that passed the correlation analysis procedure had significant influence (p<0.05). In multiple linear regression analysis, plantar flexion PROM with eyes open significantly influenced sway length (B=0.681) and sway velocity (B=0.011). Conclusions Lower-extremity muscle strength and ankle plantarflexion ROM influenced static balance control ability, with ankle plantarflexion PROM showing the greatest influence. Therefore, both contractile structures and non-contractile structures should be of interest when considering static balance control ability improvement. PMID:29760375

  10. Effect of Ankle Range of Motion (ROM) and Lower-Extremity Muscle Strength on Static Balance Control Ability in Young Adults: A Regression Analysis.

    PubMed

    Kim, Seong-Gil; Kim, Wan-Soo

    2018-05-15

    BACKGROUND The purpose of this study was to investigate the effect of ankle ROM and lower-extremity muscle strength on static balance control ability in young adults. MATERIAL AND METHODS This study was conducted with 65 young adults, but 10 young adults dropped out during the measurement, so 55 young adults (male: 19, female: 36) completed the study. Postural sway (length and velocity) was measured with eyes open and closed, and ankle ROM (AROM and PROM of dorsiflexion and plantarflexion) and lower-extremity muscle strength (flexor and extensor of hip, knee, and ankle joint) were measured. Pearson correlation coefficient was used to examine the correlation between variables and static balance ability. Simple linear regression analysis and multiple linear regression analysis were used to examine the effect of variables on static balance ability. RESULTS In correlation analysis, plantarflexion ROM (AROM and PROM) and lower-extremity muscle strength (except hip extensor) were significantly correlated with postural sway (p<0.05). In simple correlation analysis, all variables that passed the correlation analysis procedure had significant influence (p<0.05). In multiple linear regression analysis, plantar flexion PROM with eyes open significantly influenced sway length (B=0.681) and sway velocity (B=0.011). CONCLUSIONS Lower-extremity muscle strength and ankle plantarflexion ROM influenced static balance control ability, with ankle plantarflexion PROM showing the greatest influence. Therefore, both contractile structures and non-contractile structures should be of interest when considering static balance control ability improvement.

  11. Teaching the Concept of Breakdown Point in Simple Linear Regression.

    ERIC Educational Resources Information Center

    Chan, Wai-Sum

    2001-01-01

    Most introductory textbooks on simple linear regression analysis mention the fact that extreme data points have a great influence on ordinary least-squares regression estimation; however, not many textbooks provide a rigorous mathematical explanation of this phenomenon. Suggests a way to fill this gap by teaching students the concept of breakdown…

  12. Machine learning of swimming data via wisdom of crowd and regression analysis.

    PubMed

    Xie, Jiang; Xu, Junfu; Nie, Celine; Nie, Qing

    2017-04-01

    Every performance, in an officially sanctioned meet, by a registered USA swimmer is recorded into an online database with times dating back to 1980. For the first time, statistical analysis and machine learning methods are systematically applied to 4,022,631 swim records. In this study, we investigate performance features for all strokes as a function of age and gender. The variances in performance of males and females for different ages and strokes were studied, and the correlations of performances for different ages were estimated using the Pearson correlation. Regression analysis show the performance trends for both males and females at different ages and suggest critical ages for peak training. Moreover, we assess twelve popular machine learning methods to predict or classify swimmer performance. Each method exhibited different strengths or weaknesses in different cases, indicating no one method could predict well for all strokes. To address this problem, we propose a new method by combining multiple inference methods to derive Wisdom of Crowd Classifier (WoCC). Our simulation experiments demonstrate that the WoCC is a consistent method with better overall prediction accuracy. Our study reveals several new age-dependent trends in swimming and provides an accurate method for classifying and predicting swimming times.

  13. Risky decision making in Attention-Deficit/Hyperactivity Disorder: A meta-regression analysis.

    PubMed

    Dekkers, Tycho J; Popma, Arne; Agelink van Rentergem, Joost A; Bexkens, Anika; Huizenga, Hilde M

    2016-04-01

    ADHD has been associated with various forms of risky real life decision making, for example risky driving, unsafe sex and substance abuse. However, results from laboratory studies on decision making deficits in ADHD have been inconsistent, probably because of between study differences. We therefore performed a meta-regression analysis in which 37 studies (n ADHD=1175; n Control=1222) were included, containing 52 effect sizes. The overall analysis yielded a small to medium effect size (standardized mean difference=.36, p<.001, 95% CI [.22, .51]), indicating that groups with ADHD showed more risky decision making than control groups. There was a trend for a moderating influence of co-morbid Disruptive Behavior Disorders (DBD): studies including more participants with co-morbid DBD had larger effect sizes. No moderating influence of co-morbid internalizing disorders, age or task explicitness was found. These results indicate that ADHD is related to increased risky decision making in laboratory settings, which tended to be more pronounced if ADHD is accompanied by DBD. We therefore argue that risky decision making should have a more prominent role in research on the neuropsychological and -biological mechanisms of ADHD, which can be useful in ADHD assessment and intervention. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. Childhood Cancer Incidence Trends in Association With US Folic Acid Fortification (1986–2008)

    PubMed Central

    Johnson, Kimberly J.; Ross, Julie A.

    2012-01-01

    OBJECTIVE: Epidemiologic evidence indicates that prenatal vitamin supplementation reduces risk for some childhood cancers; however, a systematic evaluation of population-based childhood cancer incidence trends after fortification of enriched grain products with folic acid in the United States in 1996–1998 has not been previously reported. Here we describe temporal trends in childhood cancer incidence in association with US folic acid fortification. METHODS: Using Surveillance, Epidemiology, and End Results program data (1986–2008), we calculated incidence rate ratios and 95% confidence intervals to compare pre- and postfortification cancer incidence rates in children aged 0 to 4 years. Incidence trends were also evaluated by using joinpoint and loess regression models. RESULTS: From 1986 through 2008, 8829 children aged 0 to 4 years were diagnosed with malignancies, including 3790 and 3299 in utero during the pre- and postfortification periods, respectively. Pre- and postfortification incidence rates were similar for all cancers combined and for most specific cancer types. Rates of Wilms tumor (WT), primitive neuroectodermal tumors (PNETs), and ependymomas were significantly lower postfortification. Joinpoint regression models detected increasing WT incidence from 1986 through 1997 followed by a sizable decline from 1997 through 2008, and increasing PNET incidence from 1986 through 1993 followed by a sharp decrease from 1993 through 2008. Loess curves indicated similar patterns. CONCLUSIONS: These results provide support for a decrease in WT and possibly PNET incidence, but not other childhood cancers, after US folic acid fortification. PMID:22614769

  15. Childhood cancer incidence trends in association with US folic acid fortification (1986-2008).

    PubMed

    Linabery, Amy M; Johnson, Kimberly J; Ross, Julie A

    2012-06-01

    Epidemiologic evidence indicates that prenatal vitamin supplementation reduces risk for some childhood cancers; however, a systematic evaluation of population-based childhood cancer incidence trends after fortification of enriched grain products with folic acid in the United States in 1996-1998 has not been previously reported. Here we describe temporal trends in childhood cancer incidence in association with US folic acid fortification. Using Surveillance, Epidemiology, and End Results program data (1986-2008), we calculated incidence rate ratios and 95% confidence intervals to compare pre- and postfortification cancer incidence rates in children aged 0 to 4 years. Incidence trends were also evaluated by using joinpoint and loess regression models. From 1986 through 2008, 8829 children aged 0 to 4 years were diagnosed with malignancies, including 3790 and 3299 in utero during the pre- and postfortification periods, respectively. Pre- and postfortification incidence rates were similar for all cancers combined and for most specific cancer types. Rates of Wilms tumor (WT), primitive neuroectodermal tumors (PNETs), and ependymomas were significantly lower postfortification. Joinpoint regression models detected increasing WT incidence from 1986 through 1997 followed by a sizable decline from 1997 through 2008, and increasing PNET incidence from 1986 through 1993 followed by a sharp decrease from 1993 through 2008. Loess curves indicated similar patterns. These results provide support for a decrease in WT and possibly PNET incidence, but not other childhood cancers, after US folic acid fortification.

  16. Cytopathologic differential diagnosis of low-grade urothelial carcinoma and reactive urothelial proliferation in bladder washings: a logistic regression analysis.

    PubMed

    Cakir, Ebru; Kucuk, Ulku; Pala, Emel Ebru; Sezer, Ozlem; Ekin, Rahmi Gokhan; Cakmak, Ozgur

    2017-05-01

    Conventional cytomorphologic assessment is the first step to establish an accurate diagnosis in urinary cytology. In cytologic preparations, the separation of low-grade urothelial carcinoma (LGUC) from reactive urothelial proliferation (RUP) can be exceedingly difficult. The bladder washing cytologies of 32 LGUC and 29 RUP were reviewed. The cytologic slides were examined for the presence or absence of the 28 cytologic features. The cytologic criteria showing statistical significance in LGUC were increased numbers of monotonous single (non-umbrella) cells, three-dimensional cellular papillary clusters without fibrovascular cores, irregular bordered clusters, atypical single cells, irregular nuclear overlap, cytoplasmic homogeneity, increased N/C ratio, pleomorphism, nuclear border irregularity, nuclear eccentricity, elongated nuclei, and hyperchromasia (p ˂ 0.05), and the cytologic criteria showing statistical significance in RUP were inflammatory background, mixture of small and large urothelial cells, loose monolayer aggregates, and vacuolated cytoplasm (p ˂ 0.05). When these variables were subjected to a stepwise logistic regression analysis, four features were selected to distinguish LGUC from RUP: increased numbers of monotonous single (non-umbrella) cells, increased nuclear cytoplasmic ratio, hyperchromasia, and presence of small and large urothelial cells (p = 0.0001). By this logistic model of the 32 cases with proven LGUC, the stepwise logistic regression analysis correctly predicted 31 (96.9%) patients with this diagnosis, and of the 29 patients with RUP, the logistic model correctly predicted 26 (89.7%) patients as having this disease. There are several cytologic features to separate LGUC from RUP. Stepwise logistic regression analysis is a valuable tool for determining the most useful cytologic criteria to distinguish these entities. © 2017 APMIS. Published by John Wiley & Sons Ltd.

  17. Factors affecting the outcome of excimer laser photorefractive keratectomy: a preliminary multivariable regression analysis

    NASA Astrophysics Data System (ADS)

    Maguen, Ezra I.; Papaioannou, Thanassis; Nesburn, Anthony B.; Salz, James J.; Warren, Cathy; Grundfest, Warren S.

    1996-05-01

    Multivariable regression analysis was used to evaluate the combined effects of some preoperative and operative variables on the change of refraction following excimer laser photorefractive keratectomy for myopia (PRK). This analysis was performed on 152 eyes (at 6 months postoperatively) and 156 eyes (at 12 months postoperatively). The following variables were considered: intended refractive correction, patient age, treatment zone, central corneal thickness, average corneal curvature, and intraocular pressure. At 6 months after surgery, the cumulative R2 was 0.43 with 0.38 attributed to the intended correction and 0.06 attributed to the preoperative corneal curvature. At 12 months, the cumulative R2 was 0.37 where 0.33 was attributed to the intended correction, 0.02 to the preoperative corneal curvature, and 0.01 to both preoperative corneal thickness and to the patient age. Further model augmentation is necessary to account for the remaining variability and the behavior of the residuals.

  18. Variable Selection for Regression Models of Percentile Flows

    NASA Astrophysics Data System (ADS)

    Fouad, G.

    2017-12-01

    Percentile flows describe the flow magnitude equaled or exceeded for a given percent of time, and are widely used in water resource management. However, these statistics are normally unavailable since most basins are ungauged. Percentile flows of ungauged basins are often predicted using regression models based on readily observable basin characteristics, such as mean elevation. The number of these independent variables is too large to evaluate all possible models. A subset of models is typically evaluated using automatic procedures, like stepwise regression. This ignores a large variety of methods from the field of feature (variable) selection and physical understanding of percentile flows. A study of 918 basins in the United States was conducted to compare an automatic regression procedure to the following variable selection methods: (1) principal component analysis, (2) correlation analysis, (3) random forests, (4) genetic programming, (5) Bayesian networks, and (6) physical understanding. The automatic regression procedure only performed better than principal component analysis. Poor performance of the regression procedure was due to a commonly used filter for multicollinearity, which rejected the strongest models because they had cross-correlated independent variables. Multicollinearity did not decrease model performance in validation because of a representative set of calibration basins. Variable selection methods based strictly on predictive power (numbers 2-5 from above) performed similarly, likely indicating a limit to the predictive power of the variables. Similar performance was also reached using variables selected based on physical understanding, a finding that substantiates recent calls to emphasize physical understanding in modeling for predictions in ungauged basins. The strongest variables highlighted the importance of geology and land cover, whereas widely used topographic variables were the weakest predictors. Variables suffered from a high

  19. The role of empathy and emotional intelligence in nurses' communication attitudes using regression models and fuzzy-set qualitative comparative analysis models.

    PubMed

    Giménez-Espert, María Del Carmen; Prado-Gascó, Vicente Javier

    2018-03-01

    To analyse link between empathy and emotional intelligence as a predictor of nurses' attitudes towards communication while comparing the contribution of emotional aspects and attitudinal elements on potential behaviour. Nurses' attitudes towards communication, empathy and emotional intelligence are key skills for nurses involved in patient care. There are currently no studies analysing this link, and its investigation is needed because attitudes may influence communication behaviours. Correlational study. To attain this goal, self-reported instruments (attitudes towards communication of nurses, trait emotional intelligence (Trait Emotional Meta-Mood Scale) and Jefferson Scale of Nursing Empathy (Jefferson Scale Nursing Empathy) were collected from 460 nurses between September 2015-February 2016. Two different analytical methodologies were used: traditional regression models and fuzzy-set qualitative comparative analysis models. The results of the regression model suggest that cognitive dimensions of attitude are a significant and positive predictor of the behavioural dimension. The perspective-taking dimension of empathy and the emotional-clarity dimension of emotional intelligence were significant positive predictors of the dimensions of attitudes towards communication, except for the affective dimension (for which the association was negative). The results of the fuzzy-set qualitative comparative analysis models confirm that the combination of high levels of cognitive dimension of attitudes, perspective-taking and emotional clarity explained high levels of the behavioural dimension of attitude. Empathy and emotional intelligence are predictors of nurses' attitudes towards communication, and the cognitive dimension of attitude is a good predictor of the behavioural dimension of attitudes towards communication of nurses in both regression models and fuzzy-set qualitative comparative analysis. In general, the fuzzy-set qualitative comparative analysis models appear

  20. A Methodology for Generating Placement Rules that Utilizes Logistic Regression

    ERIC Educational Resources Information Center

    Wurtz, Keith

    2008-01-01

    The purpose of this article is to provide the necessary tools for institutional researchers to conduct a logistic regression analysis and interpret the results. Aspects of the logistic regression procedure that are necessary to evaluate models are presented and discussed with an emphasis on cutoff values and choosing the appropriate number of…

  1. Estimating regression coefficients from clustered samples: Sampling errors and optimum sample allocation

    NASA Technical Reports Server (NTRS)

    Kalton, G.

    1983-01-01

    A number of surveys were conducted to study the relationship between the level of aircraft or traffic noise exposure experienced by people living in a particular area and their annoyance with it. These surveys generally employ a clustered sample design which affects the precision of the survey estimates. Regression analysis of annoyance on noise measures and other variables is often an important component of the survey analysis. Formulae are presented for estimating the standard errors of regression coefficients and ratio of regression coefficients that are applicable with a two- or three-stage clustered sample design. Using a simple cost function, they also determine the optimum allocation of the sample across the stages of the sample design for the estimation of a regression coefficient.

  2. Principal components and iterative regression analysis of geophysical series: Application to Sunspot number (1750 2004)

    NASA Astrophysics Data System (ADS)

    Nordemann, D. J. R.; Rigozo, N. R.; de Souza Echer, M. P.; Echer, E.

    2008-11-01

    We present here an implementation of a least squares iterative regression method applied to the sine functions embedded in the principal components extracted from geophysical time series. This method seems to represent a useful improvement for the non-stationary time series periodicity quantitative analysis. The principal components determination followed by the least squares iterative regression method was implemented in an algorithm written in the Scilab (2006) language. The main result of the method is to obtain the set of sine functions embedded in the series analyzed in decreasing order of significance, from the most important ones, likely to represent the physical processes involved in the generation of the series, to the less important ones that represent noise components. Taking into account the need of a deeper knowledge of the Sun's past history and its implication to global climate change, the method was applied to the Sunspot Number series (1750-2004). With the threshold and parameter values used here, the application of the method leads to a total of 441 explicit sine functions, among which 65 were considered as being significant and were used for a reconstruction that gave a normalized mean squared error of 0.146.

  3. Parametric regression model for survival data: Weibull regression model as an example

    PubMed Central

    2016-01-01

    Weibull regression model is one of the most popular forms of parametric regression model that it provides estimate of baseline hazard function, as well as coefficients for covariates. Because of technical difficulties, Weibull regression model is seldom used in medical literature as compared to the semi-parametric proportional hazard model. To make clinical investigators familiar with Weibull regression model, this article introduces some basic knowledge on Weibull regression model and then illustrates how to fit the model with R software. The SurvRegCensCov package is useful in converting estimated coefficients to clinical relevant statistics such as hazard ratio (HR) and event time ratio (ETR). Model adequacy can be assessed by inspecting Kaplan-Meier curves stratified by categorical variable. The eha package provides an alternative method to model Weibull regression model. The check.dist() function helps to assess goodness-of-fit of the model. Variable selection is based on the importance of a covariate, which can be tested using anova() function. Alternatively, backward elimination starting from a full model is an efficient way for model development. Visualization of Weibull regression model after model development is interesting that it provides another way to report your findings. PMID:28149846

  4. The use of segmented regression in analysing interrupted time series studies: an example in pre-hospital ambulance care.

    PubMed

    Taljaard, Monica; McKenzie, Joanne E; Ramsay, Craig R; Grimshaw, Jeremy M

    2014-06-19

    An interrupted time series design is a powerful quasi-experimental approach for evaluating effects of interventions introduced at a specific point in time. To utilize the strength of this design, a modification to standard regression analysis, such as segmented regression, is required. In segmented regression analysis, the change in intercept and/or slope from pre- to post-intervention is estimated and used to test causal hypotheses about the intervention. We illustrate segmented regression using data from a previously published study that evaluated the effectiveness of a collaborative intervention to improve quality in pre-hospital ambulance care for acute myocardial infarction (AMI) and stroke. In the original analysis, a standard regression model was used with time as a continuous variable. We contrast the results from this standard regression analysis with those from segmented regression analysis. We discuss the limitations of the former and advantages of the latter, as well as the challenges of using segmented regression in analysing complex quality improvement interventions. Based on the estimated change in intercept and slope from pre- to post-intervention using segmented regression, we found insufficient evidence of a statistically significant effect on quality of care for stroke, although potential clinically important effects for AMI cannot be ruled out. Segmented regression analysis is the recommended approach for analysing data from an interrupted time series study. Several modifications to the basic segmented regression analysis approach are available to deal with challenges arising in the evaluation of complex quality improvement interventions.

  5. Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models

    ERIC Educational Resources Information Center

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…

  6. Alternatives for using multivariate regression to adjust prospective payment rates

    PubMed Central

    Sheingold, Steven H.

    1990-01-01

    Multivariate regression analysis has been used in structuring three of the adjustments to Medicare's prospective payment rates. Because the indirect-teaching adjustment, the disproportionate-share adjustment, and the adjustment for large cities are responsible for distributing approximately $3 billion in payments each year, the specification of regression models for these adjustments is of critical importance. In this article, the application of regression for adjusting Medicare's prospective rates is discussed, and the implications that differing specifications could have for these adjustments are demonstrated. PMID:10113271

  7. Quasi-experimental evidence on tobacco tax regressivity.

    PubMed

    Koch, Steven F

    2018-01-01

    Tobacco taxes are known to reduce tobacco consumption and to be regressive, such that tobacco control policy may have the perverse effect of further harming the poor. However, if tobacco consumption falls faster amongst the poor than the rich, tobacco control policy can actually be progressive. We take advantage of persistent and committed tobacco control activities in South Africa to examine the household tobacco expenditure burden. For the analysis, we make use of two South African Income and Expenditure Surveys (2005/06 and 2010/11) that span a series of such tax increases and have been matched across the years, yielding 7806 matched pairs of tobacco consuming households and 4909 matched pairs of cigarette consuming households. By matching households across the surveys, we are able to examine both the regressivity of the household tobacco burden, and any change in that regressivity, and since tobacco taxes have been a consistent component of tobacco prices, our results also relate to the regressivity of tobacco taxes. Like previous research into cigarette and tobacco expenditures, we find that the tobacco burden is regressive; thus, so are tobacco taxes. However, we find that over the five-year period considered, the tobacco burden has decreased, and, most importantly, falls less heavily on the poor. Thus, the tobacco burden and the tobacco tax is less regressive in 2010/11 than in 2005/06. Thus, increased tobacco taxes can, in at least some circumstances, reduce the financial burden that tobacco places on households. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. Fast Quantitative Analysis Of Museum Objects Using Laser-Induced Breakdown Spectroscopy And Multiple Regression Algorithms

    NASA Astrophysics Data System (ADS)

    Lorenzetti, G.; Foresta, A.; Palleschi, V.; Legnaioli, S.

    2009-09-01

    The recent development of mobile instrumentation, specifically devoted to in situ analysis and study of museum objects, allows the acquisition of many LIBS spectra in very short time. However, such large amount of data calls for new analytical approaches which would guarantee a prompt analysis of the results obtained. In this communication, we will present and discuss the advantages of statistical analytical methods, such as Partial Least Squares Multiple Regression algorithms vs. the classical calibration curve approach. PLS algorithms allows to obtain in real time the information on the composition of the objects under study; this feature of the method, compared to the traditional off-line analysis of the data, is extremely useful for the optimization of the measurement times and number of points associated with the analysis. In fact, the real time availability of the compositional information gives the possibility of concentrating the attention on the most `interesting' parts of the object, without over-sampling the zones which would not provide useful information for the scholars or the conservators. Some example on the applications of this method will be presented, including the studies recently performed by the researcher of the Applied Laser Spectroscopy Laboratory on museum bronze objects.

  9. Regression analysis of case K interval-censored failure time data in the presence of informative censoring.

    PubMed

    Wang, Peijie; Zhao, Hui; Sun, Jianguo

    2016-12-01

    Interval-censored failure time data occur in many fields such as demography, economics, medical research, and reliability and many inference procedures on them have been developed (Sun, 2006; Chen, Sun, and Peace, 2012). However, most of the existing approaches assume that the mechanism that yields interval censoring is independent of the failure time of interest and it is clear that this may not be true in practice (Zhang et al., 2007; Ma, Hu, and Sun, 2015). In this article, we consider regression analysis of case K interval-censored failure time data when the censoring mechanism may be related to the failure time of interest. For the problem, an estimated sieve maximum-likelihood approach is proposed for the data arising from the proportional hazards frailty model and for estimation, a two-step procedure is presented. In the addition, the asymptotic properties of the proposed estimators of regression parameters are established and an extensive simulation study suggests that the method works well. Finally, we apply the method to a set of real interval-censored data that motivated this study. © 2016, The International Biometric Society.

  10. Gradient descent for robust kernel-based regression

    NASA Astrophysics Data System (ADS)

    Guo, Zheng-Chu; Hu, Ting; Shi, Lei

    2018-06-01

    In this paper, we study the gradient descent algorithm generated by a robust loss function over a reproducing kernel Hilbert space (RKHS). The loss function is defined by a windowing function G and a scale parameter σ, which can include a wide range of commonly used robust losses for regression. There is still a gap between theoretical analysis and optimization process of empirical risk minimization based on loss: the estimator needs to be global optimal in the theoretical analysis while the optimization method can not ensure the global optimality of its solutions. In this paper, we aim to fill this gap by developing a novel theoretical analysis on the performance of estimators generated by the gradient descent algorithm. We demonstrate that with an appropriately chosen scale parameter σ, the gradient update with early stopping rules can approximate the regression function. Our elegant error analysis can lead to convergence in the standard L 2 norm and the strong RKHS norm, both of which are optimal in the mini-max sense. We show that the scale parameter σ plays an important role in providing robustness as well as fast convergence. The numerical experiments implemented on synthetic examples and real data set also support our theoretical results.

  11. Recent changes in the trends of teen birth rates, 1981-2006.

    PubMed

    Wingo, Phyllis A; Smith, Ruben A; Tevendale, Heather D; Ferré, Cynthia

    2011-03-01

    To explore trends in teen birth rates by selected demographics. We used birth certificate data and joinpoint regression to examine trends in teen birth rates by age (10-14, 15-17, and 18-19 years) and race during 1981-2006 and by age and Hispanic origin during 1990-2006. Joinpoint analysis describes changing trends over successive segments of time and uses annual percentage change (APC) to express the amount of increase or decrease within each segment. For teens younger than 18 years, the decline in birth rates began in 1994 and ended in 2003 (APC: -8.03% per year for ages 10-14 years; APC: -5.63% per year for ages 15-17 years). The downward trend for 18- and 19-year-old teens began earlier (1991) and ended 1 year later (2004) (APC: -2.37% per year). For each study population, the trend was approximately level during the most recent time segment, except for continuing declines for 18- and 19-year-old white and Asian/Pacific Islander teens. The only increasing trend in the most recent time segment was for 18- and 19-year-old Hispanic teens. During these declines, the age distribution of teens who gave birth shifted to slightly older ages, and the percentage whose current birth was at least their second birth decreased. Teen birth rates were generally level during 2003/2004-2006 after the long-term declines. Rates increased among older Hispanic teens. These results indicate a need for renewed attention to effective teen pregnancy prevention programs in specific populations. Copyright © 2011. Published by Elsevier Inc.

  12. Regression analysis of sparse asynchronous longitudinal data

    PubMed Central

    Cao, Hongyuan; Zeng, Donglin; Fine, Jason P.

    2015-01-01

    Summary We consider estimation of regression models for sparse asynchronous longitudinal observations, where time-dependent responses and covariates are observed intermittently within subjects. Unlike with synchronous data, where the response and covariates are observed at the same time point, with asynchronous data, the observation times are mismatched. Simple kernel-weighted estimating equations are proposed for generalized linear models with either time invariant or time-dependent coefficients under smoothness assumptions for the covariate processes which are similar to those for synchronous data. For models with either time invariant or time-dependent coefficients, the estimators are consistent and asymptotically normal but converge at slower rates than those achieved with synchronous data. Simulation studies evidence that the methods perform well with realistic sample sizes and may be superior to a naive application of methods for synchronous data based on an ad hoc last value carried forward approach. The practical utility of the methods is illustrated on data from a study on human immunodeficiency virus. PMID:26568699

  13. Regression analysis of sparse asynchronous longitudinal data.

    PubMed

    Cao, Hongyuan; Zeng, Donglin; Fine, Jason P

    2015-09-01

    We consider estimation of regression models for sparse asynchronous longitudinal observations, where time-dependent responses and covariates are observed intermittently within subjects. Unlike with synchronous data, where the response and covariates are observed at the same time point, with asynchronous data, the observation times are mismatched. Simple kernel-weighted estimating equations are proposed for generalized linear models with either time invariant or time-dependent coefficients under smoothness assumptions for the covariate processes which are similar to those for synchronous data. For models with either time invariant or time-dependent coefficients, the estimators are consistent and asymptotically normal but converge at slower rates than those achieved with synchronous data. Simulation studies evidence that the methods perform well with realistic sample sizes and may be superior to a naive application of methods for synchronous data based on an ad hoc last value carried forward approach. The practical utility of the methods is illustrated on data from a study on human immunodeficiency virus.

  14. Logistic regression analysis to predict Medical Licensing Examination of Thailand (MLET) Step1 success or failure.

    PubMed

    Wanvarie, Samkaew; Sathapatayavongs, Boonmee

    2007-09-01

    The aim of this paper was to assess factors that predict students' performance in the Medical Licensing Examination of Thailand (MLET) Step1 examination. The hypothesis was that demographic factors and academic records would predict the students' performance in the Step1 Licensing Examination. A logistic regression analysis of demographic factors (age, sex and residence) and academic records [high school grade point average (GPA), National University Entrance Examination Score and GPAs of the pre-clinical years] with the MLET Step1 outcome was accomplished using the data of 117 third-year Ramathibodi medical students. Twenty-three (19.7%) students failed the MLET Step1 examination. Stepwise logistic regression analysis showed that the significant predictors of MLET Step1 success/failure were residence background and GPAs of the second and third preclinical years. For students whose sophomore and third-year GPAs increased by an average of 1 point, the odds of passing the MLET Step1 examination increased by a factor of 16.3 and 12.8 respectively. The minimum GPAs for students from urban and rural backgrounds to pass the examination were estimated from the equation (2.35 vs 2.65 from 4.00 scale). Students from rural backgrounds and/or low-grade point averages in their second and third preclinical years of medical school are at risk of failing the MLET Step1 examination. They should be given intensive tutorials during the second and third pre-clinical years.

  15. A local equation for differential diagnosis of β-thalassemia trait and iron deficiency anemia by logistic regression analysis in Southeast Iran.

    PubMed

    Sargolzaie, Narjes; Miri-Moghaddam, Ebrahim

    2014-01-01

    The most common differential diagnosis of β-thalassemia (β-thal) trait is iron deficiency anemia. Several red blood cell equations were introduced during different studies for differential diagnosis between β-thal trait and iron deficiency anemia. Due to genetic variations in different regions, these equations cannot be useful in all population. The aim of this study was to determine a native equation with high accuracy for differential diagnosis of β-thal trait and iron deficiency anemia for the Sistan and Baluchestan population by logistic regression analysis. We selected 77 iron deficiency anemia and 100 β-thal trait cases. We used binary logistic regression analysis and determined best equations for probability prediction of β-thal trait against iron deficiency anemia in our population. We compared diagnostic values and receiver operative characteristic (ROC) curve related to this equation and another 10 published equations in discriminating β-thal trait and iron deficiency anemia. The binary logistic regression analysis determined the best equation for best probability prediction of β-thal trait against iron deficiency anemia with area under curve (AUC) 0.998. Based on ROC curves and AUC, Green & King, England & Frazer, and then Sirdah indices, respectively, had the most accuracy after our equation. We suggest that to get the best equation and cut-off in each region, one needs to evaluate specific information of each region, specifically in areas where populations are homogeneous, to provide a specific formula for differentiating between β-thal trait and iron deficiency anemia.

  16. Engine With Regression and Neural Network Approximators Designed

    NASA Technical Reports Server (NTRS)

    Patnaik, Surya N.; Hopkins, Dale A.

    2001-01-01

    At the NASA Glenn Research Center, the NASA engine performance program (NEPP, ref. 1) and the design optimization testbed COMETBOARDS (ref. 2) with regression and neural network analysis-approximators have been coupled to obtain a preliminary engine design methodology. The solution to a high-bypass-ratio subsonic waverotor-topped turbofan engine, which is shown in the preceding figure, was obtained by the simulation depicted in the following figure. This engine is made of 16 components mounted on two shafts with 21 flow stations. The engine is designed for a flight envelope with 47 operating points. The design optimization utilized both neural network and regression approximations, along with the cascade strategy (ref. 3). The cascade used three algorithms in sequence: the method of feasible directions, the sequence of unconstrained minimizations technique, and sequential quadratic programming. The normalized optimum thrusts obtained by the three methods are shown in the following figure: the cascade algorithm with regression approximation is represented by a triangle, a circle is shown for the neural network solution, and a solid line indicates original NEPP results. The solutions obtained from both approximate methods lie within one standard deviation of the benchmark solution for each operating point. The simulation improved the maximum thrust by 5 percent. The performance of the linear regression and neural network methods as alternate engine analyzers was found to be satisfactory for the analysis and operation optimization of air-breathing propulsion engines (ref. 4).

  17. Regression Analysis of Top of Descent Location for Idle-thrust Descents

    NASA Technical Reports Server (NTRS)

    Stell, Laurel; Bronsvoort, Jesper; McDonald, Greg

    2013-01-01

    In this paper, multiple regression analysis is used to model the top of descent (TOD) location of user-preferred descent trajectories computed by the flight management system (FMS) on over 1000 commercial flights into Melbourne, Australia. The independent variables cruise altitude, final altitude, cruise Mach, descent speed, wind, and engine type were also recorded or computed post-operations. Both first-order and second-order models are considered, where cross-validation, hypothesis testing, and additional analysis are used to compare models. This identifies the models that should give the smallest errors if used to predict TOD location for new data in the future. A model that is linear in TOD altitude, final altitude, descent speed, and wind gives an estimated standard deviation of 3.9 nmi for TOD location given the trajec- tory parameters, which means about 80% of predictions would have error less than 5 nmi in absolute value. This accuracy is better than demonstrated by other ground automation predictions using kinetic models. Furthermore, this approach would enable online learning of the model. Additional data or further knowl- edge of algorithms is necessary to conclude definitively that no second-order terms are appropriate. Possible applications of the linear model are described, including enabling arriving aircraft to fly optimized descents computed by the FMS even in congested airspace. In particular, a model for TOD location that is linear in the independent variables would enable decision support tool human-machine interfaces for which a kinetic approach would be computationally too slow.

  18. Categorical Variables in Multiple Regression: Some Cautions.

    ERIC Educational Resources Information Center

    O'Grady, Kevin E.; Medoff, Deborah R.

    1988-01-01

    Limitations of dummy coding and nonsense coding as methods of coding categorical variables for use as predictors in multiple regression analysis are discussed. The combination of these approaches often yields estimates and tests of significance that are not intended by researchers for inclusion in their models. (SLD)

  19. Normalization Ridge Regression in Practice I: Comparisons Between Ordinary Least Squares, Ridge Regression and Normalization Ridge Regression.

    ERIC Educational Resources Information Center

    Bulcock, J. W.

    The problem of model estimation when the data are collinear was examined. Though the ridge regression (RR) outperforms ordinary least squares (OLS) regression in the presence of acute multicollinearity, it is not a problem free technique for reducing the variance of the estimates. It is a stochastic procedure when it should be nonstochastic and it…

  20. Differential item functioning analysis with ordinal logistic regression techniques. DIFdetect and difwithpar.

    PubMed

    Crane, Paul K; Gibbons, Laura E; Jolley, Lance; van Belle, Gerald

    2006-11-01

    We present an ordinal logistic regression model for identification of items with differential item functioning (DIF) and apply this model to a Mini-Mental State Examination (MMSE) dataset. We employ item response theory ability estimation in our models. Three nested ordinal logistic regression models are applied to each item. Model testing begins with examination of the statistical significance of the interaction term between ability and the group indicator, consistent with nonuniform DIF. Then we turn our attention to the coefficient of the ability term in models with and without the group term. If including the group term has a marked effect on that coefficient, we declare that it has uniform DIF. We examined DIF related to language of test administration in addition to self-reported race, Hispanic ethnicity, age, years of education, and sex. We used PARSCALE for IRT analyses and STATA for ordinal logistic regression approaches. We used an iterative technique for adjusting IRT ability estimates on the basis of DIF findings. Five items were found to have DIF related to language. These same items also had DIF related to other covariates. The ordinal logistic regression approach to DIF detection, when combined with IRT ability estimates, provides a reasonable alternative for DIF detection. There appear to be several items with significant DIF related to language of test administration in the MMSE. More attention needs to be paid to the specific criteria used to determine whether an item has DIF, not just the technique used to identify DIF.

  1. Meta-analysis and meta-regression of omega-3 polyunsaturated fatty acid supplementation for major depressive disorder.

    PubMed

    Mocking, R J T; Harmsen, I; Assies, J; Koeter, M W J; Ruhé, H G; Schene, A H

    2016-03-15

    Omega-3 polyunsaturated fatty acid (PUFA) supplementation has been proposed as (adjuvant) treatment for major depressive disorder (MDD). In the present meta-analysis, we pooled randomized placebo-controlled trials assessing the effects of omega-3 PUFA supplementation on depressive symptoms in MDD. Moreover, we performed meta-regression to test whether supplementation effects depended on eicosapentaenoic acid (EPA) or docosahexaenoic acid dose, their ratio, study duration, participants' age, percentage antidepressant users, baseline MDD symptom severity, publication year and study quality. To limit heterogeneity, we only included studies in adult patients with MDD assessed using standardized clinical interviews, and excluded studies that specifically studied perinatal/perimenopausal or comorbid MDD. Our PubMED/EMBASE search resulted in 1955 articles, from which we included 13 studies providing 1233 participants. After taking potential publication bias into account, meta-analysis showed an overall beneficial effect of omega-3 PUFAs on depressive symptoms in MDD (standardized mean difference=0.398 (0.114-0.682), P=0.006, random-effects model). As an explanation for significant heterogeneity (I(2)=73.36, P<0.001), meta-regression showed that higher EPA dose (β=0.00037 (0.00009-0.00065), P=0.009), higher percentage antidepressant users (β=0.0058 (0.00017-0.01144), P=0.044) and earlier publication year (β=-0.0735 (-0.143 to 0.004), P=0.04) were significantly associated with better outcome for PUFA supplementation. Additional sensitivity analyses were performed. In conclusion, present meta-analysis suggested a beneficial overall effect of omega-3 PUFA supplementation in MDD patients, especially for higher doses of EPA and in participants taking antidepressants. Future precision medicine trials should establish whether possible interactions between EPA and antidepressants could provide targets to improve antidepressant response and its prediction. Furthermore, potential

  2. Local spatial variations analysis of smear-positive tuberculosis in Xinjiang using Geographically Weighted Regression model.

    PubMed

    Wei, Wang; Yuan-Yuan, Jin; Ci, Yan; Ahan, Alayi; Ming-Qin, Cao

    2016-10-06

    The spatial interplay between socioeconomic factors and tuberculosis (TB) cases contributes to the understanding of regional tuberculosis burdens. Historically, local Poisson Geographically Weighted Regression (GWR) has allowed for the identification of the geographic disparities of TB cases and their relevant socioeconomic determinants, thereby forecasting local regression coefficients for the relations between the incidence of TB and its socioeconomic determinants. Therefore, the aims of this study were to: (1) identify the socioeconomic determinants of geographic disparities of smear positive TB in Xinjiang, China (2) confirm if the incidence of smear positive TB and its associated socioeconomic determinants demonstrate spatial variability (3) compare the performance of two main models: one is Ordinary Least Square Regression (OLS), and the other local GWR model. Reported smear-positive TB cases in Xinjiang were extracted from the TB surveillance system database during 2004-2010. The average number of smear-positive TB cases notified in Xinjiang was collected from 98 districts/counties. The population density (POPden), proportion of minorities (PROmin), number of infectious disease network reporting agencies (NUMagen), proportion of agricultural population (PROagr), and per capita annual gross domestic product (per capita GDP) were gathered from the Xinjiang Statistical Yearbook covering a period from 2004 to 2010. The OLS model and GWR model were then utilized to investigate socioeconomic determinants of smear-positive TB cases. Geoda 1.6.7, and GWR 4.0 software were used for data analysis. Our findings indicate that the relations between the average number of smear-positive TB cases notified in Xinjiang and their socioeconomic determinants (POPden, PROmin, NUMagen, PROagr, and per capita GDP) were significantly spatially non-stationary. This means that in some areas more smear-positive TB cases could be related to higher socioeconomic determinant regression

  3. NCCS Regression Test Harness

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tharrington, Arnold N.

    2015-09-09

    The NCCS Regression Test Harness is a software package that provides a framework to perform regression and acceptance testing on NCCS High Performance Computers. The package is written in Python and has only the dependency of a Subversion repository to store the regression tests.

  4. Factors Affecting Regression-Discontinuity.

    ERIC Educational Resources Information Center

    Schumacker, Randall E.

    The regression-discontinuity approach to evaluating educational programs is reviewed, and regression-discontinuity post-program mean differences under various conditions are discussed. The regression-discontinuity design is used to determine whether post-program differences exist between an experimental program and a control group. The difference…

  5. MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION

    EPA Science Inventory

    Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...

  6. A regression-based 3-D shoulder rhythm.

    PubMed

    Xu, Xu; Lin, Jia-hua; McGorry, Raymond W

    2014-03-21

    In biomechanical modeling of the shoulder, it is important to know the orientation of each bone in the shoulder girdle when estimating the loads on each musculoskeletal element. However, because of the soft tissue overlying the bones, it is difficult to accurately derive the orientation of the clavicle and scapula using surface markers during dynamic movement. The purpose of this study is to develop two regression models which predict the orientation of the clavicle and the scapula. The first regression model uses humerus orientation and individual factors such as age, gender, and anthropometry data as the predictors. The second regression model includes only the humerus orientation as the predictor. Thirty-eight participants performed 118 static postures covering the volume of the right hand reach. The orientation of the thorax, clavicle, scapula and humerus were measured with a motion tracking system. Regression analysis was performed on the Euler angles decomposed from the orientation of each bone from 26 randomly selected participants. The regression models were then validated with the remaining 12 participants. The results indicate that for the first model, the r(2) of the predicted orientation of the clavicle and the scapula ranged between 0.31 and 0.65, and the RMSE obtained from the validation dataset ranged from 6.92° to 10.39°. For the second model, the r(2) ranged between 0.19 and 0.57, and the RMSE obtained from the validation dataset ranged from 6.62° and 11.13°. The derived regression-based shoulder rhythm could be useful in future biomechanical modeling of the shoulder. Copyright © 2014 The Authors. Published by Elsevier Ltd.. All rights reserved.

  7. A graphical method to evaluate spectral preprocessing in multivariate regression calibrations: example with Savitzky-Golay filters and partial least squares regression

    USDA-ARS?s Scientific Manuscript database

    In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly ...

  8. Regression and multivariate models for predicting particulate matter concentration level.

    PubMed

    Nazif, Amina; Mohammed, Nurul Izma; Malakahmad, Amirhossein; Abualqumboz, Motasem S

    2018-01-01

    The devastating health effects of particulate matter (PM 10 ) exposure by susceptible populace has made it necessary to evaluate PM 10 pollution. Meteorological parameters and seasonal variation increases PM 10 concentration levels, especially in areas that have multiple anthropogenic activities. Hence, stepwise regression (SR), multiple linear regression (MLR) and principal component regression (PCR) analyses were used to analyse daily average PM 10 concentration levels. The analyses were carried out using daily average PM 10 concentration, temperature, humidity, wind speed and wind direction data from 2006 to 2010. The data was from an industrial air quality monitoring station in Malaysia. The SR analysis established that meteorological parameters had less influence on PM 10 concentration levels having coefficient of determination (R 2 ) result from 23 to 29% based on seasoned and unseasoned analysis. While, the result of the prediction analysis showed that PCR models had a better R 2 result than MLR methods. The results for the analyses based on both seasoned and unseasoned data established that MLR models had R 2 result from 0.50 to 0.60. While, PCR models had R 2 result from 0.66 to 0.89. In addition, the validation analysis using 2016 data also recognised that the PCR model outperformed the MLR model, with the PCR model for the seasoned analysis having the best result. These analyses will aid in achieving sustainable air quality management strategies.

  9. Evolution of the use of noninvasive mechanical ventilation in chronic obstructive pulmonary disease in a Spanish region, 1997-2010.

    PubMed

    Carpe-Carpe, Bienvenida; Hernando-Arizaleta, Lauro; Ibáñez-Pérez, M Carmen; Palomar-Rodríguez, Joaquín A; Esquinas-Rodríguez, Antonio M

    2013-08-01

    Noninvasive mechanical ventilation (NIV) appeared in the 1980s as an alternative to invasive mechanical ventilation (IMV) in patients with acute respiratory failure. We evaluated the introduction of NIV and the results in patients with acute exacerbation of chronic obstructive pulmonary disease in the Region of Murcia (Spain). A retrospective observational study based on the minimum basic hospital discharge data of all patients hospitalised for this pathology in all public hospitals in the region between 1997 and 2010. We performed a time trend analysis on hospital attendance, the use of each ventilatory intervention and hospital mortality through joinpoint regression. We identified 30.027 hospital discharges. Joinpoint analysis: downward trend in attendance (annual percentage change [APC]=-3.4, 95% CI: - 4.8; -2.0, P <.05) and in the group without ventilatory intervention (APC=-4.2%, -5.6; -2.8, P <.05); upward trend in the use of NIV (APC=16.4, 12.0; 20. 9, P <.05), and downward trend that was not statistically significant in IMV (APC=-4.5%, -10.3; 1.7). We observed an upward trend without statistical significance in overall mortality (APC=0.5, -1.3; 2.4) and in the group without intervention (APC=0.1, -1.6; 1.9); downward trend with statistical significance in the NIV group (APC=-7.1, -11.7; -2.2, P <.05) and not statistically significant in the IMV group (APC=-0,8, -6, 1; 4.8). The mean stay did not change substantially. The introduction of NIV has reduced the group of patients not receiving assisted ventilation. No improvement in results was found in terms of mortality or length of stay. Copyright © 2012 SEPAR. Published by Elsevier Espana. All rights reserved.

  10. Trace analysis of acids and bases by conductometric titration with multiparametric non-linear regression.

    PubMed

    Coelho, Lúcia H G; Gutz, Ivano G R

    2006-03-15

    A chemometric method for analysis of conductometric titration data was introduced to extend its applicability to lower concentrations and more complex acid-base systems. Auxiliary pH measurements were made during the titration to assist the calculation of the distribution of protonable species on base of known or guessed equilibrium constants. Conductivity values of each ionized or ionizable species possibly present in the sample were introduced in a general equation where the only unknown parameters were the total concentrations of (conjugated) bases and of strong electrolytes not involved in acid-base equilibria. All these concentrations were adjusted by a multiparametric nonlinear regression (NLR) method, based on the Levenberg-Marquardt algorithm. This first conductometric titration method with NLR analysis (CT-NLR) was successfully applied to simulated conductometric titration data and to synthetic samples with multiple components at concentrations as low as those found in rainwater (approximately 10 micromol L(-1)). It was possible to resolve and quantify mixtures containing a strong acid, formic acid, acetic acid, ammonium ion, bicarbonate and inert electrolyte with accuracy of 5% or better.

  11. Interpreting Regression Results: beta Weights and Structure Coefficients are Both Important.

    ERIC Educational Resources Information Center

    Thompson, Bruce

    Various realizations have led to less frequent use of the "OVA" methods (analysis of variance--ANOVA--among others) and to more frequent use of general linear model approaches such as regression. However, too few researchers understand all the various coefficients produced in regression. This paper explains these coefficients and their…

  12. Estimation of diffusion coefficients from voltammetric signals by support vector and gaussian process regression

    PubMed Central

    2014-01-01

    Background Support vector regression (SVR) and Gaussian process regression (GPR) were used for the analysis of electroanalytical experimental data to estimate diffusion coefficients. Results For simulated cyclic voltammograms based on the EC, Eqr, and EqrC mechanisms these regression algorithms in combination with nonlinear kernel/covariance functions yielded diffusion coefficients with higher accuracy as compared to the standard approach of calculating diffusion coefficients relying on the Nicholson-Shain equation. The level of accuracy achieved by SVR and GPR is virtually independent of the rate constants governing the respective reaction steps. Further, the reduction of high-dimensional voltammetric signals by manual selection of typical voltammetric peak features decreased the performance of both regression algorithms compared to a reduction by downsampling or principal component analysis. After training on simulated data sets, diffusion coefficients were estimated by the regression algorithms for experimental data comprising voltammetric signals for three organometallic complexes. Conclusions Estimated diffusion coefficients closely matched the values determined by the parameter fitting method, but reduced the required computational time considerably for one of the reaction mechanisms. The automated processing of voltammograms according to the regression algorithms yields better results than the conventional analysis of peak-related data. PMID:24987463

  13. Two-Year versus One-Year Head Start Program Impact: Addressing Selection Bias by Comparing Regression Modeling with Propensity Score Analysis

    ERIC Educational Resources Information Center

    Leow, Christine; Wen, Xiaoli; Korfmacher, Jon

    2015-01-01

    This article compares regression modeling and propensity score analysis as different types of statistical techniques used in addressing selection bias when estimating the impact of two-year versus one-year Head Start on children's school readiness. The analyses were based on the national Head Start secondary dataset. After controlling for…

  14. Reduced COPD Exacerbation Risk Correlates With Improved FEV1: A Meta-Regression Analysis.

    PubMed

    Zider, Alexander D; Wang, Xiaoyan; Buhr, Russell G; Sirichana, Worawan; Barjaktarevic, Igor Z; Cooper, Christopher B

    2017-09-01

    The mechanism by which various classes of medication reduce COPD exacerbation risk remains unknown. We hypothesized a correlation between reduced exacerbation risk and improvement in airway patency as measured according to FEV 1 . By systematic review, COPD trials were identified that reported therapeutic changes in predose FEV 1 (dFEV 1 ) and occurrence of moderate to severe exacerbations. Using meta-regression analysis, a model was generated with dFEV 1 as the moderator variable and the absolute difference in exacerbation rate (RD), ratio of exacerbation rates (RRs), or hazard ratio (HR) as dependent variables. The analysis of RD and RR included 119,227 patients, and the HR analysis included 73,475 patients. For every 100-mL change in predose FEV 1 , the HR decreased by 21% (95% CI, 17-26; P < .001; R 2  = 0.85) and the absolute exacerbation rate decreased by 0.06 per patient per year (95% CI, 0.02-0.11; P = .009; R 2  = 0.05), which corresponded to an RR of 0.86 (95% CI, 0.81-0.91; P < .001; R 2  = 0.20). The relationship with exacerbation risk remained statistically significant across multiple subgroup analyses. A significant correlation between increased FEV 1 and lower COPD exacerbation risk suggests that airway patency is an important mechanism responsible for this effect. Copyright © 2017 American College of Chest Physicians. Published by Elsevier Inc. All rights reserved.

  15. Propensity score estimation: machine learning and classification methods as alternatives to logistic regression

    PubMed Central

    Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson

    2010-01-01

    Summary Objective Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this Review was to assess machine learning alternatives to logistic regression which may accomplish the same goals but with fewer assumptions or greater accuracy. Study Design and Setting We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. Results We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (CART), and meta-classifiers (in particular, boosting). Conclusion While the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and to a lesser extent decision trees (particularly CART) appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. PMID:20630332

  16. Melanin and blood concentration in human skin studied by multiple regression analysis: experiments

    NASA Astrophysics Data System (ADS)

    Shimada, M.; Yamada, Y.; Itoh, M.; Yatagai, T.

    2001-09-01

    Knowledge of the mechanism of human skin colour and measurement of melanin and blood concentration in human skin are needed in the medical and cosmetic fields. The absorbance spectrum from reflectance at the visible wavelength of human skin increases under several conditions such as a sunburn or scalding. The change of the absorbance spectrum from reflectance including the scattering effect does not correspond to the molar absorption spectrum of melanin and blood. The modified Beer-Lambert law is applied to the change in the absorbance spectrum from reflectance of human skin as the change in melanin and blood is assumed to be small. The concentration of melanin and blood was estimated from the absorbance spectrum reflectance of human skin using multiple regression analysis. Estimated concentrations were compared with the measured one in a phantom experiment and this method was applied to in vivo skin.

  17. The value of a statistical life: a meta-analysis with a mixed effects regression model.

    PubMed

    Bellavance, François; Dionne, Georges; Lebeau, Martin

    2009-03-01

    The value of a statistical life (VSL) is a very controversial topic, but one which is essential to the optimization of governmental decisions. We see a great variability in the values obtained from different studies. The source of this variability needs to be understood, in order to offer public decision-makers better guidance in choosing a value and to set clearer guidelines for future research on the topic. This article presents a meta-analysis based on 39 observations obtained from 37 studies (from nine different countries) which all use a hedonic wage method to calculate the VSL. Our meta-analysis is innovative in that it is the first to use the mixed effects regression model [Raudenbush, S.W., 1994. Random effects models. In: Cooper, H., Hedges, L.V. (Eds.), The Handbook of Research Synthesis. Russel Sage Foundation, New York] to analyze studies on the value of a statistical life. We conclude that the variability found in the values studied stems in large part from differences in methodologies.

  18. Hyperspectral analysis of soil organic matter in coal mining regions using wavelets, correlations, and partial least squares regression.

    PubMed

    Lin, Lixin; Wang, Yunjia; Teng, Jiyao; Wang, Xuchen

    2016-02-01

    Hyperspectral estimation of soil organic matter (SOM) in coal mining regions is an important tool for enhancing fertilization in soil restoration programs. The correlation--partial least squares regression (PLSR) method effectively solves the information loss problem of correlation--multiple linear stepwise regression, but results of the correlation analysis must be optimized to improve precision. This study considers the relationship between spectral reflectance and SOM based on spectral reflectance curves of soil samples collected from coal mining regions. Based on the major absorption troughs in the 400-1006 nm spectral range, PLSR analysis was performed using 289 independent bands of the second derivative (SDR) with three levels and measured SOM values. A wavelet-correlation-PLSR (W-C-PLSR) model was then constructed. By amplifying useful information that was previously obscured by noise, the W-C-PLSR model was optimal for estimating SOM content, with smaller prediction errors in both calibration (R(2) = 0.970, root mean square error (RMSEC) = 3.10, and mean relative error (MREC) = 8.75) and validation (RMSEV = 5.85 and MREV = 14.32) analyses, as compared with other models. Results indicate that W-C-PLSR has great potential to estimate SOM in coal mining regions.

  19. Predictors of exercise capacity following exercise-based rehabilitation in patients with coronary heart disease and heart failure: A meta-regression analysis.

    PubMed

    Uddin, Jamal; Zwisler, Ann-Dorthe; Lewinter, Christian; Moniruzzaman, Mohammad; Lund, Ken; Tang, Lars H; Taylor, Rod S

    2016-05-01

    The aim of this study was to undertake a comprehensive assessment of the patient, intervention and trial-level factors that may predict exercise capacity following exercise-based rehabilitation in patients with coronary heart disease and heart failure. Meta-analysis and meta-regression analysis. Randomized controlled trials of exercise-based rehabilitation were identified from three published systematic reviews. Exercise capacity was pooled across trials using random effects meta-analysis, and meta-regression used to examine the association between exercise capacity and a range of patient (e.g. age), intervention (e.g. exercise frequency) and trial (e.g. risk of bias) factors. 55 trials (61 exercise-control comparisons, 7553 patients) were included. Following exercise-based rehabilitation compared to control, overall exercise capacity was on average 0.95 (95% CI: 0.76-1.41) standard deviation units higher, and in trials reporting maximum oxygen uptake (VO2max) was 3.3 ml/kg.min(-1) (95% CI: 2.6-4.0) higher. There was evidence of a high level of statistical heterogeneity across trials (I(2) statistic > 50%). In multivariable meta-regression analysis, only exercise intervention intensity was found to be significantly associated with VO2max (P = 0.04); those trials with the highest average exercise intensity had the largest mean post-rehabilitation VO2max compared to control. We found considerable heterogeneity across randomized controlled trials in the magnitude of improvement in exercise capacity following exercise-based rehabilitation compared to control among patients with coronary heart disease or heart failure. Whilst higher exercise intensities were associated with a greater level of post-rehabilitation exercise capacity, there was no strong evidence to support other intervention, patient or trial factors to be predictive. © The European Society of Cardiology 2015.

  20. A systematic review and meta-regression analysis of mivacurium for tracheal intubation.

    PubMed

    Vanlinthout, L E H; Mesfin, S H; Hens, N; Vanacker, B F; Robertson, E N; Booij, L H D J

    2014-12-01

    We systematically reviewed factors associated with intubation conditions in randomised controlled trials of mivacurium, using random-effects meta-regression analysis. We included 29 studies of 1050 healthy participants. Four factors explained 72.9% of the variation in the probability of excellent intubation conditions: mivacurium dose, 24.4%; opioid use, 29.9%; time to intubation and age together, 18.6%. The odds ratio (95% CI) for excellent intubation was 3.14 (1.65-5.73) for doubling the mivacurium dose, 5.99 (2.14-15.18) for adding opioids to the intubation sequence, and 6.55 (6.01-7.74) for increasing the delay between mivacurium injection and airway insertion from 1 to 2 min in subjects aged 25 years and 2.17 (2.01-2.69) for subjects aged 70 years, p < 0.001 for all. We conclude that good conditions for tracheal intubation are more likely by delaying laryngoscopy after injecting a higher dose of mivacurium with an opioid, particularly in older people. © 2014 The Association of Anaesthetists of Great Britain and Ireland.