Visualization and graphics in R

Visualization and graphics are strengths of R. There are currently three main – and unfortunately incompatible – ways of making graphs in R.

The first approach is to use the base R commands, with the plot function as the main workhorse, and related functions for more specialized plots (e.g. boxplot). You can make a lot of very good graphs using base R, but it often takes many lines of code and a good bit of fiddling.

The second approach is using the lattice package, and extensions that build on this package. lattice seems to have largely “lost” against option number 3 and I don’t see it used much anymore these days.

The third, and currently most popular approach is to make graphs using the ggplot2 package and various extensions to it. ggplot2 is one of the many R packages written by Hadley Wickham. ggplot2 and its extensions have by now become more or less the standard for good graphics in R. Note that while the package is called ggplot2, people interchangeably refer to it as ggplot or ggplot2. (No one uses ggplot “one” anymore.) I’ll do the same.

My recommendation for learning to make plots in R is to focus on ggplot2 and related packages.

For most graphs and figures you want for a publication, ggplot2 is probably your best choice. It, and its many add-on packages, product very high quality graphics with not too much code writing required. While you could use ggplot2 exclusively for any figures you make, it is also useful to know a little bit about how to make plots using base R, since they are often very good for quick and dirty plots, especially during the data exploration stage.

Practicing ggplot2

If you are new to ggplot2, a good and gentle introduction is the set of tutorials in the Visualize Data RStudio Primer section. Similar material, though a little bit more advanced, is covered in a non-interactive form in the Data Visualization, Exploratory Data Analysis and Graphics for Communication chapters of R4DS. A nice step-by-step tutorial can be found in this blog post. Another good source are the chapters in the Data Visualization section of IDS. Especially the Data visualization principles chapter is worth a read.

If you want to learn a bit more about ggplot2 in general, the ideas behind it, and also how to use it, you can check out Hadley’s free online ggplot2 book. I suggest you go through some of those materials and focus on the ones you find most useful.

For coding/R in general, and making plots in particular, it is generally a learning by doing approach. The most common approach - and one I still use a lot - is to search online for “how do I do X in ggplot”, and usually you find an example and code that is close enough to what you want so you can adjust it. The tricky bit is knowing what to search for. If you don’t know the type of plot you want, it is useful to learn a bit about different plots and what they are good for (something that was discussed in the previous unit). A good reference source for how do I do X with ggplot2 is the R graphics cookbook, which is meant to give you quick recipes for common questions and plots.

For inspiration and ideas, there is also the R graphics gallery where you can see all kinds of plots made in R, and you can also see the code. They are nicely organized into categories. It’s definitely worth browsing around.

Beyond static graphs

While you will likely always need regular figures for papers, reports, presentation slides, etc., it is becoming increasingly common to have interactive and dynamic visualizations. R has pretty good facilities for that, some in rapid development. The ggvis package is meant to be similar to ggplot2 but allow for interactive and dynamic visualizations. The plotly package also works well with R and allows interactive graphs.

Another great, and related set of tools is the htmlwidgets package which allows you to make a lot of very nice, interactive web-based graphs using R and various R packages.

For even richer, interactive graphical operations and visualizations, including writing full graphical user interfaces, you can use the shiny package and extensions to it. The shiny gallery has examples of apps. Some of my R packages, e.g. DSAIDE and DSAIRM use Shiny for the interactive graphical interface (and either ggplot2 or plotly to show graphs for different models).

Often, getting professional-looking interactive figures, dashboards, and other widgets in R is not that difficult. However, since those are all specialized products, they usually require commands and concepts that are somewhat different from basic or tidyverse R, which generally takes some time to learn.

Interactive graphs and apps are beyond what we’ll cover in this class, but it is something that might be worth learning, and you are certainly welcome to make use of some of these packages and tools for your class project 😃.

Further R visualization resources

There are of course a whole lot more resources available. Almost all books that teach data analysis with R contain some visualization chapter. There are likely also many other good books, websites, and R packages available. A few more are listed on the general resources page. I’m sure you’ll find many others yourself. If you come across any particularly nice sources, please let me know so I can add them.