Megan Morano's profile

LRH: Health Social Determinants

Lakeland Regional Health
Business Analytics
Data Science
Identify Social Determinants of Health that Lead to Hospital Readmissions
Overview
In 2012, the Centers for Medicare & Medicaid Services (CMS) began reducing Medicare payments to hospitals with excessive readmission rates. In response, our team began examining Lakeland Regional’s data to find variables with a statistically significant effect on readmission rates. Using Lakeland Regional’s own patient data, we created a data dictionary and organized the data needed to investigate why patients are readmitted. We accessed the data through a MySQL database and planned to use RStudio and Stata to identify statistically significant social determinants and to visualize the data. Among our major findings, patients with heart conditions tend to be readmitted at a much higher rate. Income level also appears to play a large role in readmission rates, and many of the reasons for readmission trace back to LRH itself. We also found large discrepancies between the expected and actual values for average length of stay, readmission rates, and mortality rates. COVID-19 limited both our data gathering and the amount of information we could evaluate: we were only allowed to work with de-identified data that we could keep on our laptops. We also redefined our scope to focus on data visualizations instead of regression.
Literature Review 
To guide our query-based analysis, our team sought out scholarly articles that could help refine our search parameters. We eventually found a journal article, published roughly seven years ago, that contains significant information about the determinants that affect readmission rates. Its authors shared our team’s goal: lowering readmission rates and analyzing the determinants behind them. The information in the article should therefore help us identify the social determinants in our dataset that are correlated with high readmission rates. “Several interventions that involve multiple components (e.g., patient needs assessment, medication reconciliation, patient education, arranging timely outpatient appointments, and providing telephone follow-up), have successfully reduced readmission rates for patients discharged to home” (Kripalani, Theobald, Anctil, & Vasilevskis, 2013). Given the nature of the medical field, it stands to reason that these social determinants are still relevant in 2019.

Therefore, our team will attempt to run query-based searches that are similar to, or the same as, those discussed in the article. The article goes on to discuss potential ways to reduce readmission rates for patients who are discharged to post-acute care facilities. “For patients discharged to post-acute care facilities, multicomponent interventions have reduced readmissions through enhanced communication, medication safety, advanced care planning, and enhanced training to manage common medical conditions that commonly precipitate readmission” (Kripalani, Theobald, Anctil, & Vasilevskis, 2013). While these determinants may not be represented in the dataset provided by Lakeland Regional Health, they offer insight that may lead to breakthroughs in reducing readmission rates through other means. Our team will continue focusing on the previously listed social determinants before moving on to others in the dataset.

The most relevant portion of the article describes the highest 30-day readmission rates, dating back to 2012. “The highest 30-day readmission rates were observed for patients with heart failure (26.9%), psychosis (24.6%), recent vascular surgery (23.9%), chronic obstructive pulmonary disease (22.6%), and pneumonia (20.1%)” (Kripalani, Theobald, Anctil, & Vasilevskis, 2013). The article goes on to state that these 30-day readmission rates remained relatively constant over the preceding decade (2003 - 2013). If our team can locate the highest 30-day readmission rates for a time period reasonably close to 2019, we may be able to focus on a specific subset of patient readmissions. It would be most beneficial to Lakeland Regional Health if our team found relevant determinants among patients sent to post-acute care facilities, as these treatments are typically the most expensive.
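
To make the 30-day measure concrete, the sketch below flags a visit as an index admission when the same patient is admitted again within 30 days of discharge. It is a simplified illustration, not CMS's official measure and not the query our team ran: the patient identifiers, the tuple layout, and the function names are all hypothetical, and the records are made up.

```python
from datetime import date


def flag_30_day_readmissions(visits, window_days=30):
    """Return (patient_id, admit_date) for each stay followed by a
    readmission within `window_days` of its discharge date.

    `visits` is an iterable of (patient_id, admit_date, discharge_date)
    tuples. This layout is an assumption made for illustration.
    """
    by_patient = {}
    for patient_id, admit, discharge in visits:
        by_patient.setdefault(patient_id, []).append((admit, discharge))

    flagged = []
    for patient_id, stays in by_patient.items():
        stays.sort()  # chronological order by admit date
        # Compare each stay with the same patient's next stay.
        for (admit, discharge), (next_admit, _) in zip(stays, stays[1:]):
            gap = (next_admit - discharge).days
            if 0 <= gap <= window_days:
                flagged.append((patient_id, admit))
    return flagged


def readmission_rate(visits, window_days=30):
    """Share of all stays that were followed by a readmission."""
    visits = list(visits)
    if not visits:
        return 0.0
    return len(flag_30_day_readmissions(visits, window_days)) / len(visits)


# Made-up records for two hypothetical patients.
example_visits = [
    ("A", date(2019, 1, 1), date(2019, 1, 5)),
    ("A", date(2019, 1, 20), date(2019, 1, 22)),  # 15 days after discharge
    ("A", date(2019, 6, 1), date(2019, 6, 3)),    # well outside the window
    ("B", date(2019, 3, 1), date(2019, 3, 2)),
]

print(flag_30_day_readmissions(example_visits))
print(readmission_rate(example_visits))
```

Grouping the flagged stays by diagnosis would then give per-condition rates comparable in spirit to the figures quoted above, provided the dataset carries a reliable diagnosis field.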

Additionally, we read an article called “Risk Prediction Models for Hospital Readmission Rates”. The article discusses two uses for readmission rates. The first is to help design transitional care for patients who leave the hospital and return home. The second is to measure the quality of the hospital, which can help lower hospital fines. According to the article, the study examined ranges of readmission rates, numbers of readmissions, timeframes, and validation cohorts (Kansagara, Englander, & Salanitro, 2011).

The article identifies six potential categories: “medical comorbidity, mental health comorbidity, illness severity, prior use of medical services, overall health and function, and sociodemographic and social determinants of health” (Kansagara, Englander, & Salanitro, 2011). Our team will most likely not use the same six groups, given the information we have been provided. The article also reports using a 95% confidence interval to judge whether results are significant. This will help us decide which confidence level to use in our own data analysis. We will use RStudio to wrangle the data and examine the variables that may correlate with readmission rates.
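
As a worked illustration of what a 95% confidence interval means for a readmission rate, the sketch below computes a normal-approximation interval for a proportion. This is one common textbook method, not necessarily the one used in the article, and the counts are made up; the function name is hypothetical.

```python
import math


def proportion_ci(events, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.

    z=1.96 corresponds to a 95% confidence level. The approximation is
    reasonable for large n and rates that are not close to 0 or 1.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = events / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    # Clamp to [0, 1] because a proportion cannot leave that range.
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical counts: 120 readmissions out of 500 discharges.
low, high = proportion_ci(120, 500)
print(f"observed rate 0.240, 95% CI ({low:.3f}, {high:.3f})")
```

If the intervals for two patient groups do not overlap, that is a rough indication that their readmission rates differ by more than chance alone would explain.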

Methodology 
Our methodology remained consistent throughout the year as we worked towards our final milestone: identifying any correlation between social determinants and readmission rates. By defining data structures with database schemas, data dictionaries, and relational diagrams, we can visualize and reference the data sets to focus on. Cleaning and standardizing the data is an important step in ensuring that our analysis is accurate. Our team described each variable in the data dictionary. Following a simple format in Excel, we translated the variable names and provided brief descriptions to standardize the definitions for all Lakeland Regional departments that access the data.

In addition to completing the data dictionary, we performed query analysis in SQL. To do this, we first had to clean the data and make it usable. Much of the data was unreliable, and much of it was not useful. We narrowed the tables and variables down to those we found useful and condensed this data into an Excel spreadsheet. Alongside the data provided in the database, we gathered average income for the Lakeland area to serve as one of the significant variables in the team’s analysis.
Data Extraction and Preprocessing 
The database we used for this project was daunting. With over 3 million records in the patient demographic table alone, we knew we had our work cut out for us. From more than ten tables and 685 different variables, we meticulously whittled the data down to the variables relevant to our analysis. To support possible future analysis of this data, we created a data dictionary detailing everything in the database.

Extracting the relevant data was yet another challenge. As a team, we had to learn how the database was connected, as we were never given a database schema or an Entity Relationship Diagram. This process led us to a few discoveries. One of the main ones was that the demographics section contained no patient age. Age would likely have been one of the most relevant factors in determining who is readmitted, so we had to work around its absence. Another significant finding was that the database’s main tables (Demographics and Hospital visit info) were connected only through the Zip Code variables. This major inconvenience delayed our analysis, but as a team, we believe we found enough relevant information to help explain readmission rates.
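
Because Zip Code was the only link between the main tables, any combined view has to be built at the zip-code level rather than per patient. The sketch below illustrates that idea with an in-memory SQLite database standing in for MySQL. Every table name, column name, zip code, and value is a made-up placeholder, not the LRH schema or data.

```python
import sqlite3

# In-memory stand-in database; the real project used MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical visit table with a precomputed 30-day readmission flag.
    CREATE TABLE visits (
        visit_id       INTEGER PRIMARY KEY,
        zip_code       TEXT,
        readmitted_30d INTEGER
    );
    -- Hypothetical zip-level income table gathered from an outside source.
    CREATE TABLE zip_income (
        zip_code       TEXT PRIMARY KEY,
        average_income INTEGER
    );
    INSERT INTO visits (zip_code, readmitted_30d) VALUES
        ('00001', 1), ('00001', 0), ('00001', 1), ('00001', 0),
        ('00002', 0), ('00002', 0), ('00002', 1), ('00002', 0);
    INSERT INTO zip_income VALUES ('00001', 30000), ('00002', 60000);
""")

# Aggregate visits per zip code, then attach the zip-level income.
rows = conn.execute("""
    SELECT v.zip_code,
           i.average_income,
           COUNT(*)              AS n_visits,
           AVG(v.readmitted_30d) AS readmit_rate
    FROM visits AS v
    JOIN zip_income AS i ON i.zip_code = v.zip_code
    GROUP BY v.zip_code, i.average_income
    ORDER BY v.zip_code
""").fetchall()

for zip_code, income, n_visits, rate in rows:
    print(zip_code, income, n_visits, rate)
```

A zip-level join like this supports statements about areas, such as lower-income zip codes showing higher readmission rates, but it cannot attribute a demographic trait to an individual visit.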
Exploratory Data Analysis and Results
Figure 1: Distribution of Gender

Figure 1 shows that considerably more females than males are admitted to the hospital. With more than 3.4 million records in the dataset, the gap between the two genders is substantial.
Figure 2: Distribution of Race
Among the roughly 3.4 million records, the distribution of race (Figure 2) consists mostly of White and Black patients. These two groups therefore offer the greatest potential for identifying correlations with readmission rates within the provided data set.
Figure 3: Distribution of Marital Status 
Figure 3 shows the distribution of marital status among the 3.4 million records. The most common status is single, followed closely by married. The single and married categories should be given priority since they account for many of the records.
Figure 4: Distribution of Visit Types 
According to Figure 4, the most common visit type among the 3.4 million records is the emergency visit. The ambulatory visit, the second most common visit type in the data set, is another potential avenue of discovery.

Figure 5: Distribution of Medical Services
Figure 5 shows that the most common medical service among the nearly 3.4 million records was emergency medicine, followed by general medicine. This will help the future capstone team that takes over the project in identifying any correlation.
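
The distributions in Figures 1 through 5 all come down to counting how often each category occurs. A minimal sketch of that tabulation is shown below. The function name, the "Unknown" label for missing values, and the sample values are assumptions made for illustration, not output from the LRH data.

```python
from collections import Counter


def distribution(values):
    """Return (category, count, share) rows, most frequent first.

    Missing values (None or empty string) are grouped under "Unknown"
    so they stay visible instead of being silently dropped.
    """
    counts = Counter(
        value if value not in (None, "") else "Unknown" for value in values
    )
    total = sum(counts.values())
    if total == 0:
        return []
    return [(cat, n, n / total) for cat, n in counts.most_common()]


# Made-up visit types, including one missing value.
sample_visit_types = [
    "Emergency", "Emergency", "Ambulatory", "Inpatient",
    "Emergency", None, "Ambulatory",
]

visit_type_table = distribution(sample_visit_types)
for category, count, share in visit_type_table:
    print(f"{category:<12}{count:>3}{share:>8.1%}")
```

Reporting the share next to the raw count makes it easier for the next team to compare categories across variables with different numbers of missing values.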
Conclusion and Next Steps
We now understand which underlying conditions cause 30-day readmissions to rise, the penalties attached to these increased rates, the social determinants that make one patient more likely to be readmitted than another, and the benefits of having nearby clinics, doctors’ offices, and online medical services. Overall, the most meaningful lesson for our team was learning how much the medical field needs data scientists. Our progress will be carried forward by the next Lakeland Regional capstone group, and we are proud to give them a solid grounding in the data and some ideas for what they could do with it.