
Beginner's guide to R: Painless data visualisation

June 7, 2013
One of the most appealing things about R is its ability to create data visualizations with just a couple of lines of code.

barplot(testscores, col=testcolors)

Note that the name of a color must be in quotation marks, but the name of a variable that holds a vector of colors must not be.
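
As a minimal sketch of that quoting rule, the lines below draw the same bars twice: once with a literal color name and once with a variable. The scores are the ones used later in this article; the rule that colors scores of 80 or higher blue and the rest red is an assumption made only for illustration.

```r
# Scores taken from later in the article.
testscores <- c(96, 71, 85, 92, 82, 78, 72, 81, 68, 61, 78, 86, 90)

# Assumed coloring rule (illustration only): 80 or higher is blue, otherwise red.
# ifelse() returns one color name per score, so testcolors is a character vector.
testcolors <- ifelse(testscores >= 80, "blue", "red")

# A literal color name is a string, so it goes in quotation marks.
barplot(testscores, col = "blue")

# A variable name is not a string, so it goes in without quotation marks.
barplot(testscores, col = testcolors)
```

Writing col="testcolors" with quotation marks would make R look for a color literally named testcolors and stop with an error.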

To add a headline to the graph, use the main argument:

barplot(testscores, col=testcolors, main="Test scores")

And have the y axis go from 0 to 100:

barplot(testscores, col=testcolors, main="Test scores", ylim=c(0,100))

Then use las=1 to make the axis labels horizontal instead of rotated 90 degrees:

barplot(testscores, col=testcolors, main="Test scores", ylim=c(0,100), las=1)
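
To see what las changes, here is a rough sketch that draws the same chart side by side with las=0 (the default, labels parallel to the axis) and las=1 (labels always horizontal). The coloring rule and the use of par(mfrow=...) to place two plots in one window are assumptions added for the comparison.

```r
testscores <- c(96, 71, 85, 92, 82, 78, 72, 81, 68, 61, 78, 86, 90)

# Assumed coloring rule (illustration only): 80 or higher is blue, otherwise red.
testcolors <- ifelse(testscores >= 80, "blue", "red")

# Split the plotting area into 1 row and 2 columns; keep the old settings to restore later.
old_par <- par(mfrow = c(1, 2))

# Default: y axis numbers run parallel to the axis, so they read sideways.
barplot(testscores, col = testcolors, main = "las = 0",
        ylim = c(0, 100), las = 0)

# las = 1: axis numbers are always horizontal.
barplot(testscores, col = testcolors, main = "las = 1",
        ylim = c(0, 100), las = 1)

# Restore the previous layout.
par(old_par)
```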

And you've got a color-coded bar graph.

By the way, if you wanted the scores sorted from highest to lowest, you could have set your original testscores variable to:

testscores <- sort(c(96, 71, 85, 92, 82, 78, 72, 81, 68, 61, 78, 86, 90), decreasing = TRUE)

The sort() function defaults to ascending sort; for descending sort you need the additional argument: decreasing = TRUE.

If that code above is starting to seem unwieldy to you as a beginner, break it into two lines for easier reading, and perhaps also set a new variable for the sorted version:

testscores <- c(96, 71, 85, 92, 82, 78, 72, 81, 68, 61, 78, 86, 90)

testscores_sorted <- sort(testscores, decreasing = TRUE)
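
A short sketch of both sort orders, using the same scores. The variable names ascending and descending are made up for this example. Note that sort() returns a new vector and leaves the original untouched, which is why saving the result under a new name works.

```r
testscores <- c(96, 71, 85, 92, 82, 78, 72, 81, 68, 61, 78, 86, 90)

# Default: lowest to highest.
ascending <- sort(testscores)

# decreasing = TRUE: highest to lowest.
descending <- sort(testscores, decreasing = TRUE)

print(ascending)
print(descending)

# The original vector still starts with 96, 71: sort() did not change it.
print(testscores)
```

If you color the bars by position, remember to build the color vector from the sorted scores, not the original ones, so that each color still lines up with its bar.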

If you had scores in a data frame called results with one column of student names called students and another column of scores called testscores, you could use the ggplot2 package's ggplot() function as well:

ggplot(results, aes(x=students, y=testscores)) + geom_bar(fill=testcolors, stat = "identity")

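Why stat = "identity"? By default, geom_bar() counts how many rows fall into each x value and uses that count as the bar height. Setting stat = "identity" tells it to use the numerical value in the y column as the bar height instead.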
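
Here is one way the data frame version might look end to end. The student names ("Student A" and so on) and the coloring rule are invented for illustration, and the sketch assumes the ggplot2 package is installed. It also shows geom_col(), which newer versions of ggplot2 provide as shorthand for geom_bar(stat = "identity").

```r
library(ggplot2)

# Hypothetical data frame: names are placeholders, scores are from this article.
results <- data.frame(
  students   = paste("Student", LETTERS[1:13]),
  testscores = c(96, 71, 85, 92, 82, 78, 72, 81, 68, 61, 78, 86, 90)
)

# Assumed coloring rule (illustration only): one color per row of the data frame.
testcolors <- ifelse(results$testscores >= 80, "blue", "red")

# stat = "identity": each bar's height is the testscores value in that row.
p1 <- ggplot(results, aes(x = students, y = testscores)) +
  geom_bar(fill = testcolors, stat = "identity")

# geom_col() draws the same chart without needing the stat argument.
p2 <- ggplot(results, aes(x = students, y = testscores)) +
  geom_col(fill = testcolors)

print(p1)
print(p2)
```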

ggplot2's qplot() also has easy ways to color bars by a factor, such as number of cylinders, and then automatically generate a legend. Here's an example of a graph counting the number of 4-, 6- and 8-cylinder cars in the mtcars data set:

qplot(factor(cyl), data=mtcars, geom="bar", fill=factor(cyl))
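
Recent ggplot2 releases mark qplot() as deprecated, so the sketch below shows what should be an equivalent chart built with ggplot() directly. The axis and legend labels are additions of mine, not part of the original example. The table() call prints the counts the bars represent, which is a quick way to check the chart.

```r
library(ggplot2)

# Count the cars for each cylinder value; these are the bar heights.
cyl_counts <- table(mtcars$cyl)
print(cyl_counts)

# No y aesthetic and no stat argument: geom_bar() counts rows per x value by default.
# Mapping fill to the same factor colors each bar and generates the legend.
p <- ggplot(mtcars, aes(x = factor(cyl), fill = factor(cyl))) +
  geom_bar() +
  labs(x = "Cylinders", y = "Number of cars", fill = "Cylinders")

print(p)
```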

But, as I said, we're getting somewhat beyond a beginner's overview of R when coloring by factor. For a few more examples and details for many of the themes covered here, you might want to see the online tutorial Producing Simple Graphs with R. For more on graphing with color, check out a source such as the R Graphics Cookbook. The ggplot2 documentation also has a lot of examples, such as this page for bar geometry.