The data come from Our World in Data's COVID-19 dataset, which I have prepared a little for you.
Let's get started with a dataset called cross.csv, which you can load as follows:
cross=read.csv("https://raw.githubusercontent.com/mondpanther/datastorieshub/95d94862115819350247823f174a2633cde0236b/code/cross.csv")
names(cross)
[1] "X" "month" "iso_code" "date" "location"
[6] "total_cases" "total_deaths" "total_cases_per_million" "total_deaths_per_million" "total_tests"
[11] "total_tests_per_thousand" "total_vaccinations" "total_boosters" "total_vaccinations_per_hundred" "total_boosters_per_hundred"
[16] "population" "population_density" "continent" "cs" "L1vax"
[21] "vax" "lndeaths" "deaths" "lnvax" "period"
head(cross)
library(dplyr)
cross_eur=cross %>% filter(continent=="Europe")
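After filtering, it is worth checking that only the intended rows survived. The following is a minimal sketch on a made-up stand-in for `cross`; the data frame `toy`, its values, and the name `toy_eur` are hypothetical and only illustrate the pattern.

```r
library(dplyr)

# Toy stand-in for `cross` (made-up rows, only the columns needed here)
toy <- data.frame(
  location  = c("France", "Kenya", "Italy", "Peru"),
  continent = c("Europe", "Africa", "Europe", "South America")
)

# Same filter as above, applied to the toy data
toy_eur <- toy %>% filter(continent == "Europe")

nrow(toy_eur)               # number of rows kept by the filter
unique(toy_eur$continent)   # continents remaining after the filter
```

Running the same two checks on `cross_eur` should show how many European observations remain and confirm that no other continent slipped through.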
Start by plotting deaths against vaccinations:
library(ggplot2)
cross_eur %>% ggplot( aes(x=vax,y=deaths) ) + # set up the aesthetic mapping
  geom_point() # use it to plot points
# Note that above we broke the command over several lines, which can make it
# easier to read. However, the following is exactly the same
# command:
cross_eur %>% ggplot( aes(x=vax,y=deaths) ) + geom_point()
Let’s make that a bit nicer:
cross_eur %>% ggplot( aes(x=vax,y=deaths) ) +
  geom_point() +
  ylab("Total number of deaths per 1 million") +
  xlab("Total number of vaccinations per 100K") +
  theme_minimal() +
  geom_smooth(method="lm", se=FALSE)
`geom_smooth()` using formula 'y ~ x'
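The straight line drawn by `geom_smooth(method="lm")` is an ordinary least squares fit of `y ~ x`, so its slope can also be computed directly with `lm()`. Here is a minimal sketch on made-up numbers; the data frame `toy` and its values are hypothetical and are not taken from cross.csv.

```r
# Made-up numbers for illustration only; not taken from cross.csv
toy <- data.frame(
  vax    = c(10, 20, 30, 40, 50),
  deaths = c(12, 19, 33, 41, 48)
)

# Same kind of model that geom_smooth(method = "lm") fits: deaths = a + b * vax
fit <- lm(deaths ~ vax, data = toy)

coef(fit)   # intercept (a) and slope (b) of the fitted line
```

Replacing `toy` with `cross_eur` should give the intercept and slope of the line shown in the plot, which puts a number on how strongly the two variables move together.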
Vaccines seem to be associated with more deaths. Before we discuss this further, let’s