% NOTE -- ONLY EDIT THE .Rnw FILE!!!  The .tex file is
% likely to be overwritten.

\documentclass[compress]{beamer}
\usepackage{SweaveBeamer}
\input{commondefs}

\SweaveOpts{prefix.string=figs/introduction,eps=FALSE,pdf=TRUE,keep.source=TRUE}

\title{A Quick Introduction to \R{}}
\subtitle{Objects, classes and functions}

\begin{document}

\begin{frame}
  \titlepage
\end{frame}

\begin{frame}[fragile]
  \frametitle{Objects}
  Objects in \R\ are anything that can be assigned to a variable. They
  could be
  \begin{itemize}
  \item Constants: \code{2, 13.005, "January"}
  \item Special symbols: \code{NA, TRUE, FALSE, NULL, NaN}
  \item Things already defined in \R{}: \code{seq, c} (functions),
    \code{month.name, letters} (character), \code{pi} (numeric)
  \item New objects we can create using existing objects (this is done
    by evaluating \emph{expressions}, e.g.,
    \code{1 / sin(seq(0, pi, length = 50))})
  \end{itemize}
\end{frame}

\begin{frame}[fragile]
  \frametitle{Different types of Objects}
  \R\ objects come in a variety of types. Given an object, the
  functions \code{mode} and \code{class} tell us about its type.
  \begin{itemize}
  \item \code{mode(object)} describes how the object is stored
  \item \code{class(object)} gives the \emph{class} of an object. The
    main purpose of the class is so that generic functions (like
    \code{print} and \code{plot}) know what to do with it.
  \item Often, objects of a particular class can be created by a
    function with the same name as the class.
<<>>=
obj <- numeric(5)
obj
mode(obj)
class(obj)
@
  \end{itemize}
\end{frame}

\begin{frame}[fragile]
  \frametitle{Mode and class}
  The mode of an object tells us how it is stored. Two objects may be
  stored in the same manner but have different classes. How an object
  is printed is determined by its class, not its mode.
<<>>=
x <- 1:12
y <- matrix(1:12, 2, 6)
class(x)
mode(x)
class(y)
mode(y)
@
\end{frame}

\begin{frame}[fragile]
  \frametitle{Mode and class}
<<>>=
print(x)
print(y)
@
\end{frame}

\begin{frame}[fragile]
  \frametitle{Classes and generic functions}
  This is achieved by the following mechanism:
  \begin{itemize}
  \item When \code{print(y)} is called, the \Rfunction{print} function
    determines that \code{class(y)} is \code{"matrix"}, so it looks
    for a function named \Rfunction{print.matrix}. Such a function
    exists, so the actual printing is done by \code{print.matrix(y)}
  \item When \code{print(x)} is called, the \Rfunction{print} function
    determines that \code{class(x)} is \code{"integer"}, so it looks
    for a function named \Rfunction{print.integer}. There is no such
    function, so instead the fallback is used, and the actual printing
    is done by \Rfunction{print.default}
  \item This happens only for \emph{generic functions} (those that
    call \code{UseMethod})
  \end{itemize}
  This is actually a simplified version of what really happens, but
  it's close enough for our purposes.
\end{frame}

\begin{frame}[fragile]
  \frametitle{Functions}
  \begin{itemize}
  \item Functions in \R\ are simply objects of a particular type\\
    (of mode \code{"function"}).
  \item Like other objects, they can be assigned to variables. Most of
    the time, they actually are assigned to a variable, and we refer
    to the function by the name of the variable it is assigned to.
  \item All standard functions (like \Rfunction{print},
    \Rfunction{plot}, \Rfunction{c}) are actually variables, whose
    value is an \R\ object of mode \code{"function"}.
    When we refer to the \Rfunction{seq} function, we actually mean
    the value of the variable \code{seq}
<<>>=
class(seq)
mode(seq)
print(seq)
@
  \end{itemize}
\end{frame}

\begin{frame}[fragile]
  \frametitle{Calling functions}
  \begin{itemize}
  \item To call a function, the function object is followed by a list
    of arguments in parentheses: \code{fun.obj( arglist )}
  \item Every function has a list of formal arguments \\
    (displayed by \code{args(fun.obj)})
  \item Arguments can be matched by
    \begin{itemize}
    \item position: e.g., \code{plot(v1, v2)}
    \item name: e.g., \code{plot(x = v1, y = v2)}
    \item defaults: many arguments have default values which are used
      when the arguments are not specified. For example,
      \Rfunction{plot} can be given arguments called
      \code{col, pch, lty, lwd}. Since they have not been specified in
      the calls above, the defaults are used.
    \end{itemize}
    This is best understood by looking at examples.
  \end{itemize}
\end{frame}

\begin{frame}[fragile]
  \frametitle{Functions are objects}
  Functions are regular \R\ objects, just like vectors, data frames
  and lists. So,
  \begin{itemize}
  \item variables can be assigned values which are functions
  \item functions can be passed as arguments to other functions. We
    have already seen examples of this, namely \Rfunction{lapply} and
    \Rfunction{sapply}, where one of the arguments is a function
    object.
  \end{itemize}
  What happens when a function is called ?
  \begin{itemize}
  \item Most functions return a new \R\ object, which is typically
    assigned to a variable or used as an argument to another function
  \item Some functions are more interesting for their
    \emph{side-effects}, e.g., \Rfunction{plot} produces a graphical
    plot, \Rfunction{write.table} writes some data to a file on disk.
  \end{itemize}
\end{frame}

\begin{frame}[fragile]
  \frametitle{Expressions}
  Before we talk more about functions, we need to know a bit about
  \emph{expressions}.\\~\\
  Expressions are statements which are evaluated to create an object.
  They consist of operators (\code{+, *, \^{}}) and other objects
  (variables and constants), e.g., \code{2 + 2}, \code{sin(x)\^{}2}
<<>>=
a <- 2 + 2
print(a)
b <- sin(a)^2
print(b)
@
\end{frame}

\begin{frame}[fragile]
  \frametitle{Expressions}
  \begin{itemize}
  \item \R\ has \emph{expression blocks} which consist of multiple
    expressions
  \item All individual expressions inside this composite block are
    evaluated one by one. All variable assignments are done, and any
    \code{print()} or \code{plot()} calls have the appropriate
    side-effect.
  \item But most importantly: this whole composite block \emph{can be
      treated as a single expression} whose value on evaluation will
    be the value of the \emph{last expression evaluated inside the
      block}
  \end{itemize}
<<>>=
a <- {
    tmp <- 1:50
    log.factorial <- sum(log(tmp))
    sum.all <- sum(tmp)
    log.factorial
}
print(a)
print(sum.all)
@
\end{frame}

\begin{frame}[fragile]
  \frametitle{Defining a function}
  A new function is created by a construct of the form \\
  \code{fun.name <- function( arglist ) expr} where
  \begin{itemize}
    \only<1>{\item \code{fun.name} is a variable where we store the
      function. This variable will be used to call the function later
    \item \code{arglist} is a list of formal arguments. This list
      \begin{itemize}
      \item can be empty (in which case the function takes no
        arguments)
      \item can have just some names (in which case these names become
        variables inside the function, whose values have to be
        supplied when the function is called)
      \item can have some arguments in \code{name = value} form, in
        which case the names are variables available inside the
        function, and the values are their default values
      \end{itemize}
    }
    \only<2>{\item \code{expr} is an expression (typically an
      \emph{expression block}) that can make use of the variables
      defined in the argument list.
      This part is also referred to as the \emph{body} of a function,
      and can be extracted by \code{body(fun.obj)}
    \item Inside functions, there can be a special \code{return(val)}
      call which exits the function and returns the value \code{val}
    }
  \end{itemize}
\end{frame}

\begin{frame}[fragile]
  \frametitle{Variables and Scope}
  When an expression involves a variable, how is the value of that
  variable determined ?
  \begin{itemize}
  \item When inside a function, the variable is first searched for
    \emph{inside} the function. This includes
    \begin{itemize}
    \item Variables defined as arguments of that function
    \item Variables defined inside the function
    \end{itemize}
    Variables that are defined inside a function remain in effect only
    inside the function. When the function completes, these variables
    can no longer be accessed.
  \item If a variable is not found inside a function, it is searched
    for \emph{outside} the function. The exact details of this are
    complicated, but all you need to know is that
    \begin{itemize}
    \item if you define a variable outside the function, it can be
      accessed inside the function as well. If two variables of the
      same name are defined both outside and inside the function, the
      \emph{one inside will be used}
    \end{itemize}
  \item If no variable with that name is found, an error is generated
  \end{itemize}
\end{frame}

\begin{frame}[fragile]
  \frametitle{Variables and Scope}
<<>>=
myvar <- 1
myfun1 <- function() {
    myvar <- 5
    print(myvar)
}
myfun1()
myvar
myfun2 <- function() {
    print(myvar)
}
myfun2()
@
\end{frame}

\begin{frame}[fragile]
  \frametitle{Arguments and body}
  \Rfunction{args} and \Rfunction{body} give the two components of a
  function:
<<>>=
args(myfun1)
body(myfun1)
@
\end{frame}

\begin{frame}[fragile]
  \frametitle{What happens to objects ?}
  An object is created whenever an expression is evaluated. Unless
  the result is assigned to some variable (or used as part of another
  expression), this object is lost.
  \\~\\
  In the following example, all the \code{i\^{}2} values are
  calculated, but nothing is stored and nothing is printed.
<<>>=
for (i in 1:10) i^2
@
\end{frame}

\begin{frame}[fragile]
  \frametitle{What happens to objects ?}
  What to do with an object after creating it depends on what the
  purpose is.
  \begin{itemize}
  \item If the only intent is to see the value, the object can be
    printed using the \code{print} function
<<>>=
for (i in 1:5) print(i^2)
@
  \item If the intent is to use these values later, they should be
    assigned to some variable
<<>>=
a <- numeric(10)
for (i in 1:10) a[i] <- i^2
print(a)
@
  \end{itemize}
\end{frame}

\begin{frame}[fragile]
  \frametitle{Automatic printing}
  Although we haven't made it explicit, it should be clear by now that
  objects are usually printed automatically when they are evaluated at
  the command prompt without being assigned to a variable.
  \begin{itemize}
  \item This is equivalent to calling \Rfunction{print} on the object
  \item This does not always happen (as in the \code{for} loop example
    above). \R\ has a mechanism for suppressing this printing, which
    is used in such cases.
  \item The automatic printing \emph{never} happens inside a function.
    So remember to deal with whatever objects you evaluate inside your
    functions.
  \item The suppression of automatic printing can sometimes lead to
    unexpected behaviour. To avoid surprises, use calls to
    \Rfunction{print} explicitly.
  \end{itemize}
\end{frame}

\end{document}