For my capstone project I analyzed a dataset on cardiovascular disease, or heart disease. Heart disease is preventable, yet it affects people around the world. In the United States, according to the Centers for Disease Control

One of the first variables I want to look at is weight and how it relates to cardiovascular disease. The first step is to see how weight is distributed in the dataset. Below I show how to visualize that distribution, which took a few steps.

First, since I am in the United States, I needed to convert weight from kilograms (kg) to pounds (lbs). I did that with one line of code:

df['weight'] *= 2.20462
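One thing to watch with an in-place conversion: if the cell is run twice in a notebook, the weights get multiplied twice. Here is a minimal sketch of an alternative that writes the pounds into a separate column instead. The sample rows and the `weight_lbs` column name are made up for illustration and are not part of the dataset.

```python
import pandas as pd

KG_TO_LBS = 2.20462  # pounds per kilogram, the same factor used above

# Hypothetical sample rows standing in for the real dataset
df = pd.DataFrame({"weight": [62.0, 85.0, 70.5]})

# Write the converted values to a new column (name is an assumption)
# so the original kilogram values are left untouched
df["weight_lbs"] = df["weight"] * KG_TO_LBS

print(df)
```

With this approach the cell can be rerun safely, because `weight` always stays in kilograms.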

Now I can look at the distribution. I chose a histogram, which in Python can be drawn with the following code:

import matplotlib.pyplot as plt

import seaborn as sns

fig, ax = plt.subplots(figsize=(6, 5))

sns.histplot(data=df, x='weight', bins=15, ax=ax)

ax.set_title('Distribution of Weight in Dataset (lbs)');

And this is the result:

For a more detailed look at the distribution, you can also run this code:

fig, ax = plt.subplots(figsize=(6, 5))

sns.histplot(data=df, x='weight', bins=15, ax=ax)

ax.set_title('Weight (lbs) in Dataset');

And obtain this visualization:

Now that you have seen how to create these visualizations of the weight distribution, you can use them to take the analysis in this project further.
