Subscribe / Unsubscribe Enewsletters | Login | Register

Pencil Banner

Beginner's guide to R: Syntax quirks you'll want to know

Sharon Machlis | June 7, 2013
Why x=3 doesn't always mean what you think it should, about data types and more.

I mentioned at the outset that R syntax is a bit quirky, especially if your frame of reference is, well, pretty much any other programming language. Here are some unusual traits of the language you may find useful to understand as you embark on your journey to learn R.

Assigning values to variables
In pretty much every other programming language I know, the equals sign assigns a certain value to a variable. You know, x = 3 means that x now holds the value of 3.

Not in R. At least, not necessarily.

In R, the primary assignment operator is <- as in:

x <- 3

But not:

x = 3

To add to the potential confusion, the equals sign actually can be used as an assignment operator in R -- but not all the time. When can you use it and when can you not?

The best way for a beginner to deal with this is to use the preferred assignment operator <- and forget that equals is ever allowed. Hey, if it's good enough for Google's R style guide -- they advise not using equals to assign values to variables -- it's good enough for me.

(If this isn't a good enough explanation for you, however, and you really really want to know the ins and outs of R's 5 -- yes, count 'em, 5 -- assignment options, check out the R manual's Assignment Operators page.)

One more note about variables: R is a case-sensitive language. So, variable x is not the same as X. That applies to pretty much everything in R; for example, the function subset() is not the same as Subset().

c is for combine (or concatenate, and sometimes convert/coerce.)
When you create an array in most programming languages, the syntax goes something like this:

myArray = array(1, 1, 2, 3, 5, 8);

Or:

int myArray = {1, 1, 2, 3, 5, 8};

Or maybe:

myArray = [1, 1, 2, 3, 5, 8]

In R, though, there's an extra piece: To put multiple values into a single variable, you need the c() function, such as:

my_vector <- c(1, 1, 2, 3, 5, 8)

If you forget that c, you'll get an error. When you're starting out in R, you'll probably see errors relating to leaving out that c() a lot. (At least I certainly did.)

And now that I've stressed the importance of that c() function, I (reluctantly) will tell you that there's a case when you can leave it out -- if you're referring to consecutive values in a range with a colon between minimum and maximum, like this:

 

1  2  3  4  5  6  Next Page 

Sign up for CIO Asia eNewsletters.