Chapter 3 Functions in R
A function in R can be thought of as a sequence of statements that takes some input, uses that input to compute something, and then returns a result.
Why do we need functions?
To modularize a task so that we can reuse the same code in many places.
To increase readability of code.
To reduce redundancy and reduce the number of errors.
3.1 Built-in R functions
Base R has many useful built-in functions that you can use.
- Other R packages are another source of useful functions.
R’s base package is loaded by default.
- A few other packages, including graphics, stats, and utils, are usually loaded by default as well.
The base package includes many widely-used functions, such as print() and sum().
There are many other R packages available which contain many useful functions.
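As a small illustration of calling built-in functions, the sketch below uses `sum()` and `print()` from the base package, and `search()` to list what is currently attached. The vector `scores` is a made-up example, and the exact output of `search()` depends on how your R session was started.

```r
## scores is a hypothetical example vector, not taken from the text
scores <- c(90, 85, 77)

total <- sum(scores)   ## sum() comes from the base package
print(total)           ## print() also comes from the base package

## search() lists the attached packages and environments;
## the exact contents depend on your R session
attached <- search()
print(attached)
```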
3.2 Constructing your own R functions
You can write your own functions as needed.
While base R and the many available R packages have a wide range of useful functions, being able to write your own functions gives you much greater flexibility when working with R.
3.2.1 Function Definition Syntax
- There are three key components of a function definition in R.
Function name: the name which will be used to call the function
Function arguments: values to pass to a function as input.
Return value: the value returned by a function as output.
- The general syntax for writing your own R function is:
function_name <- function(params) {
  ## function_name is the name of the function
  ## params is the name of the input variable within this function
  statement1   ## statements executed when the function is called;
  statement2   ## they convert params into the value to be returned
  ...
  return(return_value)   ## return the variable return_value
}
3.2.2 Example 1
- Let’s write a function that takes a number as an input and returns the square of that number.
## define a new function named square
square <- function(x) { ## function name: square, argument : x
  return(x*x) ## returns x*x
}
Once we have defined `square`, we can use it as many times as we would like. Each use of the function after it has been defined is referred to as calling the function.
Let’s try calling `square` with an input number of 10:
square(10)
## [1] 100
3.2.3 Example 2
Let’s write another function that takes a single number (assumed to be an integer) as input and outputs another number according to the following rule:
- if the input number is positive and even, return the number 2
- if the input number is positive and odd, return the number 1
- if the input number is not positive and even, return the number -2
- if the input number is not positive and odd, return the number -1
PositiveEven <- function(x) {
  if (x > 0 && x %% 2 == 0) {
    return_value <- 2
  } else if (x > 0 && x %% 2 == 1) {
    return_value <- 1
  } else if (x <= 0 && x %% 2 == 0) {
    return_value <- -2
  } else {
    return_value <- -1
  }
  return(return_value)
}
- Now, let’s look at a few examples of calling `PositiveEven`:
## [1] 1
## [1] -2
## [1] -2
## [1] 2
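The inputs behind the outputs above are not shown, so the calls below are illustrative choices that exercise each of the four branches of the rule. The definition is repeated so that the block runs on its own.

```r
PositiveEven <- function(x) {
  if (x > 0 && x %% 2 == 0) {
    return_value <- 2
  } else if (x > 0 && x %% 2 == 1) {
    return_value <- 1
  } else if (x <= 0 && x %% 2 == 0) {
    return_value <- -2
  } else {
    return_value <- -1
  }
  return(return_value)
}

## The inputs below are illustrative, not the ones used in the text
PositiveEven(8)    ## positive and even: 2
PositiveEven(3)    ## positive and odd: 1
PositiveEven(0)    ## not positive and even: -2
PositiveEven(-3)   ## not positive and odd: -1
```

Note that in R, `-3 %% 2` evaluates to 1 because the result of `%%` takes the sign of the divisor, which is why negative odd inputs fall through to the final `else` branch.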
- We could make our function `PositiveEven` a bit more user-friendly by having it throw an error whenever the user inputs a number that is not an integer.
PositiveEvenSafe <- function(x) { # Function named PositiveEvenSafe
  if (x %% 1 != 0) { # x %% 1 equals 0 if x is an integer
    # stop() halts execution of the function
    # and signals an error with the given message
    stop("x must be an integer")
  }
  if (x > 0 && x %% 2 == 0) {
    return_value <- 2
  } else if (x > 0 && x %% 2 == 1) {
    return_value <- 1
  } else if (x <= 0 && x %% 2 == 0) {
    return_value <- -2
  } else {
    return_value <- -1
  }
  return(return_value)
}
- Now, let’s see what happens if we call the function `PositiveEvenSafe` with the argument `x = 2` and then `x = 7.1`:
## [1] 2
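The call with `x = 7.1` stops with an error, so a script that makes this call would halt at that point. One way to observe the error without halting, sketched here on the assumption that you only want the message as a string, is to wrap the call in `tryCatch()`. The definition is repeated so that the block runs on its own.

```r
PositiveEvenSafe <- function(x) {
  if (x %% 1 != 0) {
    stop("x must be an integer")
  }
  if (x > 0 && x %% 2 == 0) {
    return_value <- 2
  } else if (x > 0 && x %% 2 == 1) {
    return_value <- 1
  } else if (x <= 0 && x %% 2 == 0) {
    return_value <- -2
  } else {
    return_value <- -1
  }
  return(return_value)
}

ok <- PositiveEvenSafe(x = 2)   ## integer input, returns 2

## tryCatch() intercepts the error raised by stop();
## conditionMessage() extracts the text that was passed to stop()
msg <- tryCatch(
  PositiveEvenSafe(x = 7.1),
  error = function(e) conditionMessage(e)
)

print(ok)
print(msg)   ## "x must be an integer"
```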
3.2.4 Rules for choosing function names
The rules for choosing variable names also apply to function names.
Examples of valid and invalid function names include:
Valid_Function_Names | Invalid_Function_Names |
---|---|
i | 2things |
my_function | location@ |
answer42 | _user.name |
.name | .3rd |
Another rule to keep in mind is that you cannot use a reserved word as a function name or variable name.
You can use built-in function names (for example, the print function) for your own functions, but this is NOT RECOMMENDED.
The following are the reserved words in R:
if else repeat while function
for in next break TRUE
FALSE NULL Inf NaN NA
NA_integer_ NA_real_ NA_complex_ NA_character_
- You can find the list of reserved words in R by typing a help query such as `?Reserved` directly in the R console.
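One way to check a candidate name programmatically, offered as a rough sketch rather than an official test, is to compare it with the output of base R's `make.names()`, which rewrites any string into a syntactically valid, non-reserved name. The helper `is_valid_name` is a hypothetical name introduced here; the example names come from the table above.

```r
## is_valid_name is a hypothetical helper, not part of base R.
## make.names() leaves a string unchanged only if it is already
## a syntactically valid name that is not a reserved word.
is_valid_name <- function(name) {
  make.names(name) == name
}

valid   <- c("i", "my_function", "answer42", ".name")
invalid <- c("2things", "location@", "_user.name", ".3rd")

is_valid_name(valid)     ## TRUE for every entry
is_valid_name(invalid)   ## FALSE for every entry
is_valid_name("if")      ## FALSE: if is a reserved word
is_valid_name("print")   ## TRUE: allowed, but it would mask the built-in
```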
3.3 Default argument values in functions
We can provide default values for function parameters/arguments by adding `= default_value` after the parameter name in the function definition.
If an argument is specified in the function call, the specified value is used; otherwise, the default value is used.
In the function definition, it is generally better (though not required) to put parameters without default values before those with default values.
When calling a function, you must supply an argument for every parameter that has no default value and is used by the function; otherwise R signals an error.
Unlike Python, in R you can mix parameters with and without default values in an arbitrary order (though I don’t recommend it).
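A short sketch of these rules follows. The function `power` and its parameter names are hypothetical, chosen only for illustration.

```r
## power is a hypothetical example; exponent has the default value 2
power <- function(base, exponent = 2) {
  return(base^exponent)
}

power(3)      ## exponent not given, so the default 2 is used: 9
power(3, 3)   ## exponent given, so 3 is used instead: 27

## base has no default value, so calling power() with no arguments
## signals an error as soon as base is needed
res <- tryCatch(power(), error = function(e) "error")
print(res)    ## "error"
```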
3.4 Specifying function arguments with keywords
We can match arguments to parameters not only by their position in the call but also by name, using keyword arguments.
Keyword arguments have to do with how you call a function, not with the function definition itself.
For example, we could call our function `add3` with keywords in the following way:
## [1] 5
## [1] 5
## [1] 5
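The definition of `add3` and the calls that produced the outputs above are not reproduced here, so the version below is an assumption: a function with three parameters, hypothetically named `a`, `b`, and `c`, that returns their sum. Under that assumption, the same values can be passed by position, by keyword, or by keyword in a different order.

```r
## Assumed definition: the original add3 is not shown in this section,
## and the parameter names a, b, c are hypothetical
add3 <- function(a, b, c) {
  return(a + b + c)
}

add3(1, 2, 2)               ## positional arguments: 5
add3(a = 1, b = 2, c = 2)   ## keyword arguments, same order: 5
add3(c = 2, a = 1, b = 2)   ## keyword arguments, different order: 5
```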
3.4.1 Example 1
- The function `foo` below has parameters `x`, `y`, `z`, and `w`.
- The default value of `z` is \(0\), and the default value of `w` is TRUE.
foo <- function(x, y, z=0, w=TRUE) {
  if(w) {
    1000*x + 100*y + 10*z ## this is equivalent to return(...)
  } else {
    1000*x - 100*y + 10*z
  }
}
- Let’s try calling `foo` with the arguments in the same positions as in the function definition.
## [1] 9350
## [1] 9350
## [1] 9300
- Now, let’s try calling `foo` using keyword arguments for `x` and `y`.
## foo(9) ## this will cause an error because y is missing
foo(x=9, y=5) ## specify x and y as keyword arguments
## [1] 9500
- We can even switch the positions of `x` and `y` when using keyword arguments:
## [1] 9500
- You can even mix which arguments you specify as positional and keyword:
## [1] 9500
## [1] 9530
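Most of the calls behind the outputs in this example are not shown, so the calls below are illustrative choices that cover positional, keyword, and mixed argument passing. The definition of `foo` is repeated so that the block runs on its own.

```r
foo <- function(x, y, z = 0, w = TRUE) {
  if (w) {
    1000 * x + 100 * y + 10 * z
  } else {
    1000 * x - 100 * y + 10 * z
  }
}

## Positional arguments, in the order of the definition
foo(9, 3, 5)            ## 9350
foo(9, 3, 5, TRUE)      ## 9350, w given explicitly
foo(9, 3)               ## 9300, z and w take their defaults

## Keyword arguments, in either order
foo(x = 9, y = 5)       ## 9500
foo(y = 5, x = 9)       ## 9500

## Mixing positional and keyword arguments
foo(9, y = 5)           ## 9500
foo(9, z = 3, y = 5)    ## 9530
foo(9, 5, w = FALSE)    ## 8500, the else branch subtracts 100*y
```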
3.5 More Examples of Writing Functions
3.5.1 Example 1: Pass/Fail from a Weighted Average
Let’s write an R function called `WeightedGrade` that computes a weighted average of a collection of test scores and outputs whether a student has “passed” or “failed” the course, using one of two possible Pass/Fail criteria.
We would like the R function to have the following structure:
- Function Input: The inputs `grades` and `weights` should both be numeric vectors of the same length.
- From `grades` and `weights`, the weighted grade average is the sum of the elements of `grades` multiplied by the corresponding elements of `weights`, divided by the sum of `weights`.
- All the elements of both `grades` and `weights` should be greater than or equal to \(0\) and less than or equal to \(100\).
- Function Output: If the sum of `weights` equals \(100\) and `scheme=="A"`, the function should return a character vector using the following rule: return `"Pass"` if the weighted grade average using `grades` and `weights` is greater than \(60\), and return `"Fail"` if the weighted grade average is less than or equal to \(60\).
- If the sum of `weights` equals \(100\) and `scheme=="B"`, the function should return a character vector using the following rule: return `"Pass"` if the weighted grade average using `grades` and `weights` is greater than \(80\), and return `"Fail"` if the weighted grade average is less than or equal to \(80\).
- If the sum of `weights` does not equal \(100\), the function should return the value `NA`.
- Here is one way to write the function so that it satisfies the above description:
WeightedGrade <- function(grades, weights, scheme="A") {
  if(sum(weights) != 100) {
    ## if sum(weights)!=100, return NA
    ans <- NA
  } else if(scheme=="A") {
    ## if sum(weights)==100 and scheme is A, use the
    ## following Pass/Fail rule
    weighted_avg <- sum(weights*grades)/sum(weights)
    if(weighted_avg > 60) {
      ans <- "Pass"
    } else {
      ans <- "Fail"
    }
  } else {
    ## if sum(weights)==100 and scheme is not A, use the
    ## following Pass/Fail rule
    weighted_avg <- sum(weights*grades)/sum(weights)
    if(weighted_avg > 80) {
      ans <- "Pass"
    } else {
      ans <- "Fail"
    }
  }
  return(ans)
}
- Now, let’s test out our function by doing a few example runs:
## [1] "Fail"
## [1] "Pass"
## Note that scheme is not specified, so its default value "A" is used
WeightedGrade(grades=c(60, 75, 80), weights=c(10, 40, 50) )
## [1] "Pass"
## [1] NA
## [1] "Pass"
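Most of the calls behind the outputs above are not shown, so the runs below use illustrative inputs. To keep the block short and self-contained, it uses a condensed variant of `WeightedGrade` that looks up the pass threshold from `scheme`; this is a sketch intended to follow the same rules as the description above, not the definition given in the text.

```r
## Condensed sketch of WeightedGrade: same rules as described above,
## but the threshold (60 or 80) is selected once from scheme
WeightedGrade <- function(grades, weights, scheme = "A") {
  if (sum(weights) != 100) {
    return(NA)
  }
  threshold <- if (scheme == "A") 60 else 80
  weighted_avg <- sum(weights * grades) / sum(weights)
  if (weighted_avg > threshold) "Pass" else "Fail"
}

## Illustrative inputs: the weighted average here is 76
g <- c(60, 75, 80)
w <- c(10, 40, 50)

WeightedGrade(g, w)                     ## 76 > 60: "Pass"
WeightedGrade(g, w, scheme = "B")       ## 76 <= 80: "Fail"
WeightedGrade(g, weights = c(10, 40, 40))   ## weights sum to 90: NA
WeightedGrade(c(90, 95, 85), w, "B")    ## 89.5 > 80: "Pass"
```

One caveat of the exact comparison `sum(weights) != 100` is that non-integer weights may fail the check because of floating-point rounding; comparing with a small tolerance is a possible alternative.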
3.6 Exercises
- Suppose we define the function `quiz` as
quiz <- function(bool_var1, x=0, bool_var2 = TRUE) {
  y <- 0
  if(bool_var1 & bool_var2) {
    y <- x + 2
  } else {
    if(bool_var1) {
      y <- x - 2
    }
  }
  return(y)
}
What value does the following function call return?
- Write an R function that implements the following mathematical function in R
The function should have user-provided arguments `x` and `y` and should return `NA` if `y` does not equal \(0\), \(1\), or \(2\).
- Write an R function called `PropGtZero` which returns the proportion of three entered numbers which are greater than \(0\). The function should have the following function definition
If `gt=TRUE`, then `PropGtZero` should return the proportion of the numbers `x`, `y`, `z` which are greater than \(0\).
If `gt=FALSE`, then `PropGtZero` should return the proportion of the numbers `x`, `y`, `z` which are less than or equal to \(0\).
If one or more of `x`, `y`, `z` is `NA`, the function should return `NA`.
For example, `PropGtZero(3,2,-2)` should return \(2/3\).