And speaking of quirks: There are several ways to find an object's underlying data type, but not all of them return the same value. For example, class() and str() will return data.frame on a data frame object, but mode() returns the more generic list.
If you'd like to learn more details about data types in R, you can watch this video lecture by Roger Peng, associate professor of biostatistics at the Johns Hopkins Bloomberg School of Public Health:
Roger Peng, associate professor of biostatistics at the Johns Hopkins Bloomberg School of Public Health, explains data types in R.
One more useful concept to wrap up this section -- hang in there, we're almost done: factors. These represent categories in your data. So, if you've got a data frame with employees, their department and their salaries, salaries would be numerical data and employees would be characters (strings in many other languages); but you'd likely want department to be a factor -- in other words, a category you may want to group or model your data by. Factors can be unordered, such as department, or ordered, such as "poor", "fair", "good" and "excellent."
R command line differs from the Unix shell
When you start working in the R environment, it looks quite similar to a Unix shell. In fact, some R command-line actions behave as you'd expect if you come from a Unix environment, but others don't.
Want to cycle through your last few commands? The up arrow works in R just as it does in Unix -- keep hitting it to see prior commands.
The list function, ls(), will give you a list, but not of files as in Unix. Rather, it will provide a list of objects in your current R session.
Want to see your current working directory? pwd just throws an error; what you want is getwd().
rm(my_variable) will delete a variable from your current session.
R does include a Unix-like grep() function. For more on using grep in R, see this brief writeup on Regular Expressions with The R Language at regular-expressions.info.
Terminating your R expressions
R doesn't need semicolons to end a line of code (although it's possible to put multiple commands on a single line separated by semicolons, you don't see that very often). Instead, R uses line breaks (i.e., new line characters) to determine when an expression has ended.
What if you want one expression to go across multiple lines? The R interpreter tries to guess if you mean for it to continue to the next line: If you obviously haven't finished a command on one line, it will assume you want to continue instead of throwing an error. Open some parentheses without closing them, use an open quote without a closing one or end a line with an operator like + or - and R will wait to execute your command until it comes across the expected closing character and the command otherwise looks finished.
Sign up for CIO Asia eNewsletters.