Beginner's guide to R: Syntax quirks you'll want to know

Sharon Machlis | June 7, 2013
Why x=3 doesn't always mean what you think it should, about data types and more.

As mentioned in the prior section, you can have a vector with multiple elements of the same type, such as:

1, 5, 7

or

"Bill", "Bob", "Sue"

A single number or character string is also a vector -- a vector of length 1. When you access the value of a variable that holds just one value, such as 73 or "Learn more about R at Computerworld.com", you'll also see this in your console before the value:

[1]

That's telling you that your screen printout is starting at vector item number one. If you've got a vector with lots of values so the printout runs across multiple lines, each line will start with a number in brackets, telling you which vector item number that particular line is starting with. (See the screen shot, below.)
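As a small illustration of the bracketed index, the sketch below prints a one-element vector and a longer one. The variable names and values are made up for the example, and where the longer printout wraps depends on your console width.

```r
# A single value is still a vector of length 1
x <- 73
length(x)   # 1
print(x)    # [1] 73

# A longer vector wraps across several console lines; each line
# begins with the index of the first element shown on that line.
long_vec <- 101:140
print(long_vec)
```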


If you want to mix numbers and strings or numbers and TRUE/FALSE types, you need a list. (If you don't create a list, you may be unpleasantly surprised: a vector created as c(3, 8, "small") is silently converted to a vector of characters, "3", "8", "small".)
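Here is a minimal sketch of that difference, using the values from the paragraph above; the variable names are invented for illustration.

```r
# c() builds a vector, so every element is coerced to one common type
mixed_vec <- c(3, 8, "small")
class(mixed_vec)            # "character"
mixed_vec                   # "3" "8" "small"

# list() keeps each element's own type
mixed_list <- list(3, 8, "small")
sapply(mixed_list, class)   # "numeric" "numeric" "character"

# Mixing numbers with TRUE/FALSE in a vector coerces the logical to a number
num_logical <- c(3, TRUE)
class(num_logical)          # "numeric"
```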

And by the way, R assumes that 3 is the same class as 3.0 -- numeric (i.e., with a decimal point). If you want the integer 3, you need to signify it as 3L or with the as.integer() function. In a situation where this matters to you, you can check what type of number you've got by using the class() function:

class(3)

class(3.0)

class(3L)

class(as.integer(3))
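For reference, this sketch stores the results of those calls and shows one place where the distinction becomes visible; the variable names are only for illustration.

```r
cls_plain   <- class(3)               # "numeric"
cls_decimal <- class(3.0)             # "numeric"
cls_suffix  <- class(3L)              # "integer"
cls_convert <- class(as.integer(3))   # "integer"

# The two kinds of 3 compare as equal in value...
same_value <- 3 == 3L                 # TRUE
# ...but they are not identical objects, because their types differ
same_object <- identical(3, 3L)       # FALSE
```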

There are several as.*() functions for converting one data type to another, including as.character(), as.list() and as.data.frame().
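A few of these conversions might look like the following; the input values are arbitrary examples rather than anything from this series.

```r
num_as_text <- as.character(3.5)    # "3.5"
text_as_int <- as.integer("42")     # 42L
truncated   <- as.integer(3.9)      # 3L: the fraction is dropped, not rounded

# Text that is not a number becomes NA (R also issues a warning)
not_a_number <- suppressWarnings(as.numeric("abc"))

# A three-element vector becomes a list with three items
vec_as_list <- as.list(c(1, 5, 7))

# A 2 x 2 matrix becomes a data frame with two columns
mat_as_df <- as.data.frame(matrix(1:4, nrow = 2))
```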

R also has vector and list types that are of particular interest when analyzing data, such as matrices and data frames. A matrix has rows and columns; you can find a matrix's dimensions with dim(), such as:

dim(my_matrix)

Every element of a matrix must be of the same data type, such as numbers everywhere.
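To make this concrete, the sketch below builds a small matrix and checks its dimensions. The contents of my_matrix are invented for the example, and char_matrix is a hypothetical name.

```r
# A 2-row, 3-column matrix of integers (filled column by column)
my_matrix <- matrix(1:6, nrow = 2, ncol = 3)
dim(my_matrix)        # 2 3  (rows, then columns)

# Combining a numeric column with a character column does not give a
# mixed matrix: everything is coerced to character
char_matrix <- cbind(c(1, 2), c("a", "b"))
class(char_matrix[1, 1])   # "character"
```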

Data frames are like matrices except one column can have a different data type from another column, and each column must have a name. If you've got data in a format that might work well as a database table (or well-formed spreadsheet table), it will also probably work well as an R data frame.

In a data frame, you can think of each row as similar to a database record and each column like a database field. There are lots of useful functions you can apply to data frames, some of which I've gone over in earlier sections, such as summary() and the psych package's describe().
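A small data frame along these lines might be built as follows. The names are the ones used earlier in this section; the ages and the variable name people are made up for illustration.

```r
# Each column has a name and its own type; each row is one record
people <- data.frame(
  name = c("Bill", "Bob", "Sue"),
  age  = c(31, 45, 27),          # hypothetical values
  stringsAsFactors = FALSE       # keep names as plain character strings
)

dim(people)             # 3 2  (three records, two fields)
sapply(people, class)   # name: "character", age: "numeric"
people[2, ]             # the second row, i.e. Bob's record
people$age              # the age column as a numeric vector
summary(people)         # per-column summary
```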

 
