Beginner's guide to R: Painless data visualisation

Sharon Machlis | June 7, 2013
One of the most appealing things about R is its ability to create data visualizations with just a couple of lines of code.

?rainbow

Now that you've got a list of colors, how do you get them in your graphic? Here's one way. Say you're drawing a 3-bar barchart using ggplot() and want to use 3 colors from the rainbow palette. You can create a 3-color vector like:

mycolors <- rainbow(3)

Or for the heat.colors palette:

mycolors <- heat.colors(3)
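
To see what these palette functions actually hand back, it can help to print the result. The following is a minimal sketch; the variable names rainbow_cols and heat_cols are chosen here for illustration only, so that mycolors from the text is left untouched.

```r
# Palette functions return ordinary character vectors of hex color codes.
rainbow_cols <- rainbow(3)
heat_cols <- heat.colors(3)

# Print the codes to see what will be passed on to a plotting function.
print(rainbow_cols)
print(heat_cols)

# Each vector holds one element per requested color.
length(rainbow_cols)
is.character(heat_cols)
```

Because the result is just a character vector, anything that accepts a vector of colors can use it directly.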

Now instead of using the geom_bar() function without any arguments, add fill=mycolors to geom_bar() like this:

ggplot(mtcars, aes(x=factor(cyl))) + geom_bar(fill=mycolors)

You don't need to put your list of colors in a separate variable, by the way; you can merge it all in a single line of code such as:

ggplot(mtcars, aes(x=factor(cyl))) + geom_bar(fill=rainbow(3))

But it may be easier to separate the colors out if you want to create your own list of colors instead of using one of the defaults.
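
As a rough sketch of that idea, a hand-picked color vector might look like the following. The three colors are arbitrary choices made for this example, and the plot is drawn only if the ggplot2 package happens to be installed, which is an assumption about your setup.

```r
# Hypothetical hand-picked palette: one color per bar
# (mtcars has three distinct cyl values, so three colors are needed).
mycolors <- c("steelblue", "darkorange", "#2E8B57")

# col2rgb() stops with an error if any entry is not a color R recognizes,
# so this doubles as a quick sanity check.
col2rgb(mycolors)

# Draw the chart only when ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(mtcars, aes(x = factor(cyl))) + geom_bar(fill = mycolors)
  print(p)
}
```

Color names and hex codes can be mixed freely in the same vector, as shown here.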

The basic R plotting functions can also accept a vector of colors, such as:

barplot(BOD$demand, col=rainbow(6))
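
The 6 above matches the number of values in BOD$demand. One way to avoid hard-coding that number is to count the values first, as in this small sketch (the names n, barcolors and midpoints are illustrative, not from the original article):

```r
# BOD is a built-in data set; count its demand values instead of typing 6.
n <- length(BOD$demand)
barcolors <- rainbow(n)

# barplot() returns the bar midpoints, which confirms one bar per value.
midpoints <- barplot(BOD$demand, col = barcolors)
length(midpoints) == n
```

If the color vector is shorter than the number of bars, barplot() recycles the colors, so matching the lengths keeps every bar distinct.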

If you want all the items to be one color other than the default gray, pass a single color name, such as:

barplot(BOD$demand, col="royalblue3")
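
If you are not sure whether a name such as "royalblue3" is one R knows, the built-in colors() function lists every recognized color name. A quick sketch of searching that list:

```r
# colors() returns all color names that base R recognizes.
allcolors <- colors()

# Find every built-in name containing "royalblue".
grep("royalblue", allcolors, value = TRUE)

# Check one specific name before using it in a plot.
"royalblue3" %in% allcolors
```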

Chances are, you'll want to use color to show certain characteristics of your data, as opposed to simply assigning random colors in a graphic. That goes a bit beyond beginning R, but to give one example, say you've got a vector of test scores:

testscores <- c(96, 71, 85, 92, 82, 78, 72, 81, 68, 61, 78, 86, 90)

You can do a simple barplot of those scores like this:

barplot(testscores)

And you can make all the bars blue like this:

barplot(testscores, col="blue")

But what if you want the scores 80 and above to be blue and the lower scores to be red? To do this, create a vector of colors of the same length and in the same order as your data, adding a color to the vector based on the data. In other words, since the first test score is 96, the first color in your color vector should be blue; since the second score is 71, the second color in your color vector should be red; and so on.

Of course, you don't want to create that color vector manually! Here's a statement that will do so:

testcolors <- ifelse(testscores >= 80, "blue", "red")

If you've got any programming experience, you might guess that this statement works through the testscores data and applies the conditional to each entry: 'If this entry in testscores is greater than or equal to 80, add "blue" to the testcolors vector; otherwise add "red" to the testcolors vector.'
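
To make that reading concrete, here is a sketch that spells out the same logic as an explicit loop and compares it with the ifelse() result. The loop and the name loopcolors are for illustration only; the one-line ifelse() version is the one used in the text.

```r
testscores <- c(96, 71, 85, 92, 82, 78, 72, 81, 68, 61, 78, 86, 90)

# Vectorized version from the text.
testcolors <- ifelse(testscores >= 80, "blue", "red")

# Explicit loop that applies the same test to one entry at a time.
loopcolors <- character(length(testscores))
for (i in seq_along(testscores)) {
  if (testscores[i] >= 80) {
    loopcolors[i] <- "blue"
  } else {
    loopcolors[i] <- "red"
  }
}

identical(testcolors, loopcolors)  # TRUE
table(testcolors)                  # 7 blue, 6 red
```

Both approaches produce one color per score, in the same order as the scores.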

Now that you've got the list of colors properly assigned to your list of scores, just add the testcolors vector as your desired color scheme:

barplot(testscores, col=testcolors)
