By the end of this assignment, you should be able to achieve the following tasks in R:
make plots with ggplot2, including boxplots, scatterplots, and line graphs.
These achievements belong to Learning Outcomes 4, 5, 6.
Click on any blue text to visit the external website.
You will make 10 plots for this assignment, using the skills you developed in the previous homework.
Note: If you contact me for help or (better yet) open an issue in the public discussion forum, please include the code that is not working and also tell me what you have tried.
Open your .Rproj project file in RStudio.
Create an hw05 folder inside the same folder as your project file.
Review R4ds Chapter 3: Data Visualisation as necessary to make the plots described below.
Create a new R Notebook called <lastname>_hw05.Rmd and save it in your hw05 folder.
Copy and paste the YAML header from HW04 and replace the default header in your new document. Change the title as appropriate.
Load the tidyverse package in your first code chunk.
Develop this habit for the remaining assignments: Open your Rproj file, download or create your new notebook as assigned, edit the YAML file, and then insert your first code chunk where you will load any packages needed for the assignment.
For this homework, you will use ggplot2 to make plots from some of the datasets that come with R and the tidyverse packages. I will give you the dataset to use, and other information to use for mapping, etc. I expect you will write and execute the code.
The first time you use a dataset, load it with the command data(<dataset name>) in your code chunk. For example, data(faithful) loads the Old Faithful dataset. Technically, you do not have to do this, but it is good coding practice. I expect that you will do this.
After you load the dataset, and only for the first time, enter ?<dataset name> in your code chunk to see the format of the data. For example, ?faithful will give you information about the Old Faithful dataset.
Note: You should always inspect your data visually. That is why I am telling you to do this step.
You only need to do these two steps the first time you use a dataset.
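The two first-time steps above can be sketched as follows, using the Old Faithful example from the text. The extra str(), head(), and summary() calls are not required by the assignment; they are shown here only as one possible way to inspect the data beyond the help page.

```r
# Step 1: load the dataset into the workspace (faithful ships with base R).
data(faithful)

# Step 2: open the help page. help("faithful") is equivalent to ?faithful.
# Guarded so the chunk also runs cleanly outside an interactive session.
if (interactive()) help("faithful")

# Optional extra checks (an assumption, not part of the assignment):
str(faithful)      # variable names and types
head(faithful)     # first six rows
summary(faithful)  # range and quartiles of each variable
```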
Run each code chunk, and write 1-2 sentences that describe any trends or patterns that you observe in the plot. In other words, think like a scientist!
Include the #### Plot <no.> header above each plot.
Apply some of your skills that you learned during Assignment 02. You will make two vectors, then combine them into a data frame for plotting. Review the assignments if necessary.
Make a vector called year for 1821 to 1934. Remember how to use : to make a sequence of numbers?
Look at the class() of the lynx dataset. The lynx dataset is a “time series” class (ts). You can convert the time series data to a vector by using the as.vector() function. Just put the dataset name inside the parentheses. Assign this to the variable pelts.
Make a data frame called lynx_pelts from these two vectors.
Make the line color maroon. Maroon is one of the default R colors.
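The same sequence of steps can be sketched on a different built-in time series, so the pattern is visible without giving away the lynx plot. This example assumes the Nile dataset (annual river flow, 1871 to 1970) as a stand-in; the names nile_year, flow, and nile_flow and the color "steelblue" are illustrative choices, not part of the assignment.

```r
library(ggplot2)

# Stand-in dataset (assumption): Nile is a "ts" object, like lynx.
data(Nile)
class(Nile)

# Build the year vector with the : operator, one value per observation.
nile_year <- 1871:1970

# Strip the time series attributes to get a plain numeric vector.
flow <- as.vector(Nile)

# The two vectors must be the same length before they can be combined.
stopifnot(length(nile_year) == length(flow))
nile_flow <- data.frame(year = nile_year, flow = flow)

# A fixed line color goes outside aes(), as an argument to geom_line().
nile_plot <- ggplot(nile_flow, aes(x = year, y = flow)) +
  geom_line(color = "steelblue")
nile_plot
```

Setting the color inside aes() instead would map it to a variable and add a legend, which is usually not what you want for a single fixed color.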
Add a labs layer to change the x- and y-axis labels so that they do not have periods in the names (i.e., Petal Length, Petal Width).
This requires two code chunks, which will be nearly identical
(geom_violin)
In your description, describe in your own words what violin plots display (you can search the interwebs), and explain the difference between the two versions of gray shading. Hint: the grays extend from gray0 to gray100. You can learn more about colors in R from this PDF file.
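As a rough illustration of two nearly identical chunks that differ only in gray shading, here is a minimal sketch. It assumes the built-in ToothGrowth dataset as a stand-in and picks gray30 and gray70 arbitrarily; the dataset, variables, and gray levels for your own plots come from the assignment, not from this example.

```r
library(ggplot2)

# Stand-in dataset (assumption): tooth length by vitamin C dose.
data(ToothGrowth)

# Treat dose as a category so each dose gets its own violin.
ToothGrowth$dose <- factor(ToothGrowth$dose)

# First version: one named gray as a fixed fill (outside aes()).
violin_a <- ggplot(ToothGrowth, aes(x = dose, y = len)) +
  geom_violin(fill = "gray30") +
  labs(x = "Dose", y = "Tooth Length")

# Second version: identical except for the gray level.
violin_b <- ggplot(ToothGrowth, aes(x = dose, y = len)) +
  geom_violin(fill = "gray70") +
  labs(x = "Dose", y = "Tooth Length")

violin_a
violin_b
```

Comparing the two plots side by side is one way to work out how the number after "gray" relates to the shade.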
There is no plot 6. And, there is no spoon.
coord_flip()
coord_flip but it is not required. Try both and choose the one you think looks best.
labs layer to change both axis labels so each starts with an upper-case letter.
Conservation. Make that change. (Do not try to change the actual legend entries like “cd” and “vu”). Note: This can be done a couple of different ways, but scale_color_discrete() is one good way.
Make two scatterplots of your choice, with the following constraints.
Add facet_wrap() to at least one of the plots using one of the nominal variables. You decide whether you use 2 or 3 columns. Hint: use one of the nominal variables with relatively few different types for wrapping. Explore: What happens if you use a nominal variable like genus, with lots of types?
Describe the patterns or trends you see in each graph.
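The layers mentioned above (coord_flip(), labs(), scale_color_discrete(), and facet_wrap()) can be sketched together on a stand-in dataset. This example assumes the mpg dataset from ggplot2; the variable choices, axis labels, and the legend title "Drive Train" are illustrative only and do not correspond to the plots you are asked to make.

```r
library(ggplot2)

# Stand-in dataset (assumption): fuel economy data bundled with ggplot2.
data(mpg)

# Boxplot with flipped coordinates so the category labels read horizontally.
box_flipped <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot() +
  coord_flip() +
  labs(x = "Vehicle Class", y = "Highway Miles per Gallon")

# Scatterplot with a renamed legend title and one panel per cylinder count.
# scale_color_discrete(name = ...) changes only the title, not the entries.
scatter_faceted <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  scale_color_discrete(name = "Drive Train") +
  facet_wrap(~ cyl, ncol = 2) +
  labs(x = "Engine Displacement", y = "Highway Miles per Gallon")

box_flipped
scatter_faceted
```

Faceting by cyl works here because it has only a few distinct values; swapping in a variable with many values, such as model, shows what happens when the panels multiply.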