```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
### Usage and Adaptation of Data Carpentry Materials
Most material in this document has been adapted from [Data Carpentry](https://datacarpentry.org/r-socialsci/) materials, under the [Creative Commons Attribution license](https://creativecommons.org/licenses/by/4.0/). Minor amendments have been made for compatibility.

### Exercise 0
Load the `tidyverse` package as usual.
```{r Tidyverse and Here Loading}
library(tidyverse)
```
-------------
For this workshop, we will continue to look at the interview data. Run the code below to get a tidier version of the data for plotting. If you have time at the end, come back to it and see whether you can work out what each step does.
```{r DataImport}
interviews <- read_csv("https://raw.githubusercontent.com/datacarpentry/r-socialsci/main/episodes/data/SAFI_clean.csv")

interviews_plotting <- interviews %>%
  ## pivot wider by items_owned
  separate_longer_delim(items_owned, delim = ";") %>%
  ## if there were no items listed, changing NA to no_listed_items
  replace_na(list(items_owned = "no_listed_items")) %>%
  mutate(items_owned_logical = TRUE) %>%
  pivot_wider(names_from = items_owned,
              values_from = items_owned_logical,
              values_fill = list(items_owned_logical = FALSE)) %>%
  ## pivot wider by months_lack_food
  separate_longer_delim(months_lack_food, delim = ";") %>%
  mutate(months_lack_food_logical = TRUE) %>%
  pivot_wider(names_from = months_lack_food,
              values_from = months_lack_food_logical,
              values_fill = list(months_lack_food_logical = FALSE)) %>%
  ## add some summary columns
  mutate(number_months_lack_food = rowSums(select(., Jan:May))) %>%
  mutate(number_items = rowSums(select(., bicycle:car)))
```
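If you want to break the pipeline down, it can help to run the same steps on a tiny table first. The sketch below uses a made-up three-row tibble (`toy`, with hypothetical `key_ID` and `items_owned` values that are not taken from the interview data) to show how the split, fill, and widen steps turn one delimited text column into one logical column per item, and how `rowSums()` then counts the `TRUE` values in each row.

```r
library(tidyverse)

# Hypothetical toy data (not from the SAFI survey): items stored as ";"-separated text
toy <- tibble(
  key_ID = 1:3,
  items_owned = c("bicycle;radio", NA, "radio")
)

toy_wide <- toy %>%
  # one row per household-item pair
  separate_longer_delim(items_owned, delim = ";") %>%
  # households with no items listed get a placeholder value
  replace_na(list(items_owned = "no_listed_items")) %>%
  # marker column that becomes the cell values after widening
  mutate(items_owned_logical = TRUE) %>%
  # one logical column per item; missing combinations are filled with FALSE
  pivot_wider(names_from = items_owned,
              values_from = items_owned_logical,
              values_fill = list(items_owned_logical = FALSE)) %>%
  # count owned items per household (each TRUE counts as 1)
  mutate(number_items = rowSums(select(., bicycle:radio)))

toy_wide
```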
### Exercise 1
Create a scatter plot of `rooms` by `village` with the `respondent_wall_type` showing in different colours. Does this seem like a good way to display the relationship between these variables? What other kinds of plots might you use to show this type of data?
```{r Scatterplot Task}
interviews_plotting %>%
  ggplot(aes(x = village, y = rooms)) +
  geom_jitter(aes(color = respondent_wall_type), alpha = 0.3, width = 0.2, height = 0.2)
```
This is not a great way to show this type of data because it is difficult to distinguish between villages. What other plot types could help you visualize this relationship better?

-------------
### Exercise 2
Create a boxplot for `liv_count` for each wall type. Overlay the boxplot layer on a jitter layer to show actual measurements.
```{r Boxplot Task}
interviews_plotting %>%
  ggplot(aes(x = respondent_wall_type, y = liv_count)) +
  geom_boxplot(alpha = 0) +
  geom_jitter(alpha = 0.5, width = 0.2, height = 0.2)
```
-------------
### Exercise 3
Create a bar plot showing the proportion of respondents in each village who are or are not part of an irrigation association (`memb_assoc`). Include only respondents who answered that question in the calculations and plot. Which village had the lowest proportion of respondents in an irrigation association?

**Hint:** you will have to do some data wrangling to get the data you need for the bar chart.
```{r Barchart Task}
percent_memb_assoc <- interviews_plotting %>%
  filter(!is.na(memb_assoc)) %>%
  count(village, memb_assoc) %>%
  group_by(village) %>%
  mutate(percent = (n / sum(n)) * 100) %>%
  ungroup()

percent_memb_assoc %>%
  ggplot(aes(x = village, y = percent, fill = memb_assoc)) +
  geom_bar(stat = "identity", position = "dodge")
```
```
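As a possible alternative to computing percentages by hand, `geom_bar(position = "fill")` lets ggplot2 rescale the stacked counts within each bar so that they sum to 1. This is a different technique from the solution above, shown here only as a sketch: it produces stacked rather than side-by-side bars, and it uses a made-up tibble (`toy`, with hypothetical villages "A" and "B") instead of the interview data.

```r
library(tidyverse)

# Hypothetical toy data standing in for the filtered interview data
toy <- tibble(
  village = c("A", "A", "A", "A", "B", "B"),
  memb_assoc = c("yes", "no", "no", "no", "yes", "no")
)

p <- toy %>%
  ggplot(aes(x = village, fill = memb_assoc)) +
  # position = "fill" rescales each village's stacked counts to sum to 1
  geom_bar(position = "fill") +
  labs(y = "proportion")

p
```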