Module 27 A dplyr survey

Learning goals

  • This is a review exercise: apply the dplyr and ggplot skills introduced in the previous modules to analyze the results of a recent survey.

 

Use the dplyr verbs to answer these questions.

First, download the results of a recent survey by running this code.

1. Review the survey dataset and try to figure out what each row represents. “Each row is a ____.”

2. How many respondents are in the dataset?

3. Create a dataframe called old_people. This should include only people older than 20. Write code to calculate the number of rows in your new dataframe.

4. Create a dataframe called captivated. This should include all those people who find Joe’s moustache to be “deeply captivating”. Write code to calculate the number of rows in your new dataframe.

5. Create a dataframe called special_people. This should include only people who are taller than 175 cm, prefer cats over dogs, and consider themselves to be average at rock, paper, scissors.

6. In the full dataset (survey), do more people like cats or dogs? What about among “special” people?

7. Create a new variable in survey called “std_shoes” that standardizes shoe sizes by converting men’s shoe sizes to women’s. There is an approximate 1.5 size difference between men’s and women’s sizing (e.g., a men’s size 7 is roughly equivalent to a women’s size 8.5).

8. Calculate the average shoe size by sex (Male, Female, Prefer not to say).

9. Calculate the average age, height, and number of siblings by sex.

10. Do people who have ever had a moustache think, on average, that there will be more pandemics than those who have never had a moustache?

11. Do people who prefer cats have smaller feet on average than those who prefer dogs?

12. Is eyesight associated with moustache perception?

13. What percentage of people think they are better than average at rock, paper, scissors?

14. What percentage of men and what percentage of women think they are better than average at rock, paper, scissors?

15. How many people think money matters more than love?

16. Create a dataframe, grouped by whether or not people’s dads had moustaches, with variables showing each of the following: the maximum age, maximum height, minimum number of pandemics, and average number of siblings.

17. What percentage of men have terrible eyesight?

18. How many women have a shoe size of 9 or more?

19. Create a variable in the survey dataset named days_old. Use the bday variable and subtract it from Sys.Date().

20. What is the standard deviation of age?

21. Create a one-column dataset which contains the name(s) of the person(s) with the largest number of siblings (hint: use the following dplyr verbs in this order: filter, select).

22. Create a bar chart that compares the average age of those who prefer cats vs dogs

23. Create a scatter plot that shows the relationship between age and height (make it look nice!)

24. Create a scatter plot that shows the relationship between height and shoe size

25. Create a bar chart that shows the average number of siblings by dad’s moustache status.

26. Color the bar chart above and add some degree of transparency to the color. Add a title.

27. Create a bar chart that shows the average shoe size by sex and cat/dog preference.
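
As a warm-up for questions 7, 19, and 21, here is a minimal sketch of the same patterns on a tiny made-up data frame. Everything in it is an assumption for illustration: the data frame toy, its column names (name, sex, shoe_size, siblings, bday), and its values are invented and will not match the real survey, so check the actual variable names before adapting any of it.

```r
library(dplyr)

# Toy data: every column name and value here is made up for illustration
toy <- data.frame(
  name      = c("Ana", "Ben", "Cal", "Dee"),
  sex       = c("Female", "Male", "Male", "Female"),
  shoe_size = c(8, 7, 10, 9.5),
  siblings  = c(1, 3, 0, 3),
  bday      = as.Date(c("2000-01-15", "1998-06-30", "2001-11-02", "1999-03-21"))
)

toy <- toy %>%
  mutate(
    # Question 7 pattern: add 1.5 to men's sizes, leave other sizes unchanged
    std_shoes = if_else(sex == "Male", shoe_size + 1.5, shoe_size),
    # Question 19 pattern: subtracting a Date from Sys.Date() gives a
    # difference in days; as.numeric() drops the time-difference class
    days_old = as.numeric(Sys.Date() - bday)
  )

# Question 21 pattern: filter to the row(s) with the maximum, then keep one column
most_siblings <- toy %>%
  filter(siblings == max(siblings)) %>%
  select(name)

# Grouped summary pattern (questions 8, 9, 14): the mean of a logical
# vector is a proportion, so multiplying by 100 gives a percentage
by_sex <- toy %>%
  group_by(sex) %>%
  summarise(
    avg_std_shoes = mean(std_shoes),
    pct_size_9_up = 100 * mean(std_shoes >= 9)
  )

print(most_siblings)
print(by_sex)
```

Note that filter(siblings == max(siblings)) keeps ties, which is why question 21 asks for the name(s) of the person(s). If the real survey columns contain missing values, max() and mean() will likely need na.rm = TRUE.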