Chapter 3 Functions in R

A function in R can be thought of as a sequence of statements that takes some input, uses that input to compute something, and then returns a result.
Why do we need functions?
- To modularize a task so that we can reuse the same code in many places.
- To increase readability of code.
- To reduce redundancy and reduce the number of errors.

3.1 Built-in R functions

Base R has many useful built-in functions that you can use.
R’s base package is loaded by default.
- A few other packages, including graphics, stats, and utils, are usually loaded by default as well.
The base package includes many widely used functions, such as print() and sum().
- Other R packages are another source of useful functions.
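As a small illustration of the points above, the sketch below calls two base functions and one function from the stats package, then lists what is currently attached. The vector values is a made-up example, and the exact output of search() depends on your session.

```r
## values is a hypothetical example vector
values <- c(2, 4, 6)

total <- sum(values)            ## sum() is in the base package
print(total)                    ## print() is also in the base package

## median() lives in the stats package; the pkg::name form
## names the package explicitly
middle <- stats::median(values)

## search() lists the attached packages and environments;
## the exact contents vary from session to session
attached <- search()
print(attached)
```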

3.2 Constructing your own R functions

You can write your own functions as needed.
While base R and the many available R packages have a wide range of useful functions, being able to write your own functions gives you much greater flexibility when working with R.

3.2.1 Function Definition Syntax

There are three key components of a function definition in R.
- Function name: the name which will be used to call the function
- Function arguments: values to pass to a function as input.
- Return value: the value returned by a function as output.
The general syntax for writing your own R function is:

function_name <- function(params){ 
     ## function_name is the name of the function 
     ## params: name(s) of the input variable(s) within this function 
 
     statement1  ## statements executed when the function is called
     statement2  ## statements convert params into some value to be 
      ...        ## returned
     return(return_value)  ## return the variable return_value
}

3.2.2 Example 1

Let’s write a function that takes a number as an input and returns the square of that number.

## define a new function named square
square <- function(x) { ## function name: square, argument : x
   return(x*x)           ## returns x*x
}

Once we have defined square, we can use it as many times as we would like.
Each use of a function after it has been defined is referred to as calling the function.
Let’s try calling square with an input number of 10:

square(10)    ## example of using the function square

## [1] 100

3.2.3 Example 2

Let’s write another function that takes a single number (assumed to be an integer) as input and outputs another number according to the following rule:
- if the input number is positive and even, return the number 2
- if the input number is positive and odd, return the number 1
- if the input number is not positive and even, return the number -2
- if the input number is not positive and odd, return the number -1

PositiveEven  <- function(x) { 
   if( x > 0 & x%%2==0 ) {
       return_value <- 2
   } else if( x > 0 & x%%2==1 ){
       return_value <- 1
   } else if( x <= 0 & x%%2==0) {
       return_value <- -2
   } else {
       return_value <- -1
   }
   return( return_value )
}

Now, let’s look at a few examples of calling PositiveEven:

PositiveEven(3)

## [1] 1

PositiveEven(-6)

## [1] -2

PositiveEven(0)

## [1] -2

PositiveEven(4)

## [1] 2

We could make our function PositiveEven a bit more user-friendly by having our function throw an error whenever the user inputs a number that is not an integer.

PositiveEvenSafe <- function(x) { # Function named PositiveEvenSafe
   if( x%%1 != 0) { # x%%1 will equal 0 if x is an integer 
       stop("x must be an integer")  
       # The stop function halts execution of the
       # function and signals an error
       
   }
   if( x > 0 & x%%2==0 ) {
       return_value <- 2
   } else if( x > 0 & x%%2==1 ){
       return_value <- 1
   } else if( x <= 0 & x%%2==0) {
       return_value <- -2
   } else {
       return_value <- -1
   }
   return( return_value )
}

Now, let’s see what happens if we call the function PositiveEvenSafe with the argument x = 2 and then x = 7.1.

PositiveEvenSafe(2)

## [1] 2

PositiveEvenSafe(7.1)
Error in PositiveEvenSafe(7.1) : x must be an integer
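The checks in PositiveEven and PositiveEvenSafe rely on how R’s remainder operator %% behaves, in particular for negative and non-integer inputs. A brief sketch, with input values chosen only for illustration:

```r
## In R, the result of x %% 2 is never negative, so odd
## numbers give a remainder of 1 even when they are negative
pos_odd  <- 7 %% 2       ## 1
neg_odd  <- (-7) %% 2    ## 1
neg_even <- (-6) %% 2    ## 0

## x %% 1 is 0 exactly when x is a whole number
whole     <- 2 %% 1      ## 0
not_whole <- 7.1 %% 1    ## roughly 0.1, so not 0

print(c(pos_odd, neg_odd, neg_even, whole))
print(not_whole != 0)
```

This is why the branch x%%2==1 also catches negative odd inputs, and why x%%1 != 0 flags 7.1.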

3.2.4 Rules for choosing function names

The rules for choosing variable names also apply to function names.
Examples of valid and invalid function names include:

Valid function names	Invalid function names
i	2things
my_function	location@
answer42	_user.name
.name	.3rd

Another rule to keep in mind is that you cannot use a reserved word as a function name or variable name.
You can use built-in function names (for example, the print function) for your own functions, but this is NOT RECOMMENDED.
The following are the reserved words in R:

if else repeat while function for

in next break TRUE FALSE

NULL Inf NaN NA NA_integer_

NA_real_ NA_complex_ NA_character_

You can find the list of reserved words in R by typing

?reserved

directly in the R console.
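The difference between the two rules can be sketched as follows. Redefining a built-in name such as sum is allowed (though not recommended), while using a reserved word such as if as a name is an error. The replacement body for sum is purely illustrative, and the invalid code is kept in a string and evaluated inside try() so that the rest of the script keeps running.

```r
## Masking a built-in function: allowed, but not recommended.
## Our definition hides the built-in sum in the current environment.
sum <- function(...) {
    return("not the real sum")
}
masked_result <- sum(1, 2, 3)

## Removing our definition makes the built-in sum visible again
rm(sum)
restored_result <- sum(1, 2, 3)

## Using a reserved word as a function name fails
attempt <- try(eval(parse(text = "if <- function(x) x")), silent = TRUE)

print(masked_result)
print(restored_result)
print(inherits(attempt, "try-error"))
```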

3.3 Default argument values in functions

We can provide default values for function parameters/arguments by adding = default_value after the parameter.
- If an argument is specified in the function call, the specified value is used.
- Otherwise, the default argument value is used.
In the function definition, it is generally better (though not required) to put parameters without default arguments before those with default arguments.
- When calling a function, arguments must be specified for every parameter without default arguments.
- Unlike Python, in R you can mix arguments with/without default arguments in an arbitrary order (though I don’t recommend it).
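The last two points can be sketched with a small hypothetical function, scale_value, in which the parameter with a default value (multiplier) comes before the one without (x). The function and its parameter names are invented for this illustration only.

```r
## scale_value is a hypothetical function used only for illustration;
## multiplier has a default value, x does not
scale_value <- function(multiplier = 2, x) {
    return(multiplier * x)
}

by_name <- scale_value(x = 5)   ## multiplier falls back to 2
both    <- scale_value(3, 5)    ## positional: multiplier = 3, x = 5

## Omitting x, which has no default, gives an error when x is used;
## try() captures the error so the script keeps running
missing_x <- try(scale_value(3), silent = TRUE)

print(by_name)
print(both)
print(inherits(missing_x, "try-error"))
```

Because x has no default, it must be supplied by name whenever multiplier is omitted, which is one reason this ordering is less convenient than putting x first.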

3.3.1 Example 1

As an example, let’s write a function that adds 3 numbers and, as a default, sets one of these numbers to zero:

add3 <- function(x, y, z=0) {
    return(x + y + z)
}

The default value for z here is \(0\).

add3(1, 2)     ## omit z

## [1] 3

add3(1, 2, 0)  ## this should give the same as add3(1,2)

## [1] 3

add3(1, 2, 3)  ## set z to 3 instead of 0

## [1] 6

3.4 Specifying function arguments with keywords

We can specify how arguments are passed to parameters not only by their order but also by name, using keyword arguments.
Keyword arguments concern how you call a function, not the function definition itself.
For example, we could call our function add3 with keywords in the following way:

add3(2, 2, 1)      # Call function using original positions

## [1] 5

add3(x=2, y=2, z=1) # Call function using keywords

## [1] 5

add3(y=2, x=2, z=1) # With keywords, position does not matter

## [1] 5

3.4.1 Example 1

The function foo below has parameters x, y, z, and w.
- The default value of z is \(0\), and the default value of w is TRUE.

foo <- function (x, y, z=0, w=TRUE) { 
    if(w) {
       1000*x + 100*y + 10*z  ## this is equivalent to return(...)
    } else {
       1000*x - 100*y + 10*z
    }
}

Let’s try calling foo using the original position of the arguments in the function definition.

foo(9, 3, 5,TRUE) ## specify all arguments

## [1] 9350

foo(9, 3, 5)      ## omit argument w

## [1] 9350

foo(9, 3)       ## omit both z and w

## [1] 9300

Now, let’s try calling foo using keyword arguments.

## foo(9)      ## this will cause an error because y is missing
foo(x=9, y=5)   ## specify x and y as keyword arguments

## [1] 9500

We can even switch the positions of x and y when using keyword arguments:

foo(y=5, x=9)  ## when using keywords, argument order doesn't matter

## [1] 9500

You can even mix which arguments you specify as positional and keyword:

foo(9, y=5)     ## specify x as positional, y as keyword argument

## [1] 9500

foo(9, z=3, y=5)  ## y,z are keyword arguments, x is positional

## [1] 9530
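All of the calls above leave w at its default value of TRUE. As a sketch of the other branch, the block below repeats the definition of foo from this section so that it runs on its own, and then sets w = FALSE by keyword. The variable names holding the results are introduced here only for illustration.

```r
## Same definition of foo as earlier in this section
foo <- function(x, y, z = 0, w = TRUE) {
    if (w) {
        1000*x + 100*y + 10*z
    } else {
        1000*x - 100*y + 10*z
    }
}

## w given by keyword, z left at its default of 0
skip_z <- foo(9, 3, w = FALSE)      ## 9000 - 300 + 0

## all four arguments given positionally
all_pos <- foo(9, 3, 5, FALSE)      ## 9000 - 300 + 50

print(skip_z)
print(all_pos)
```

Using the keyword w = FALSE lets us change the last parameter without having to spell out a value for z.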

3.5 More Examples of Writing Functions

3.5.1 Example 1: Pass/Fail from a Weighted Average

Let’s write an R function called WeightedGrade that computes a weighted average of a collection of test scores and outputs whether or not a student has “passed” or “failed” the course - using two possible criteria for choosing Pass/Fail.
We would like the R function to have the following structure

WeightedGrade <- function(grades, weights, scheme="A") { 
   
}

Function Input: We want the inputs grades and weights to both be numeric vectors of the same length.
- The weighted grade average is the sum of the elementwise products of grades and weights, divided by the sum of weights.
- All the elements of both grades and weights should be greater than or equal to \(0\) and less than or equal to \(100\).

Function Output: If the sum of weights equals \(100\) and scheme=="A", the function should return a character vector using the following rule:
- "Pass" if the weighted grade average using grades and weights is greater than \(60\) and return "Fail" if the weighted grade average is less than or equal to \(60\).
If the sum of weights equals \(100\) and scheme=="B", the function should return a character vector using the following rule:
- "Pass" if the weighted grade average using grades and weights is greater than \(80\) and return "Fail" if the weighted grade average is less than or equal to \(80\).
If the sum of weights does not equal \(100\), the function should return the value NA.

Here is one way to write the function so that it satisfies the above description:

WeightedGrade <- function(grades, weights, scheme="A") {
    if(sum(weights) != 100) {
        ## if sum(weights)!=100, return NA
        ans <- NA
    } else if(scheme=="A") {
        ## if sum(weights)==100 and scheme is A, use the 
        ## following Pass/Fail rule
        weighted_avg <- sum(weights*grades)/sum(weights)
        if(weighted_avg > 60) {
            ans <- "Pass"
        } else {
            ans <- "Fail"
        }
    } else {
        ## if sum(weights)==100 and scheme is not A, use the 
        ## following Pass/Fail rule
        weighted_avg <- sum(weights*grades)/sum(weights)
        if(weighted_avg > 80) {
           ans <- "Pass"
        } else {
           ans <- "Fail"
        }
    }
    return(ans)
}

Now, let’s test out our function by doing a few example runs:

WeightedGrade(grades=c(60, 75, 80), weights=c(10, 40, 50), scheme="B")

## [1] "Fail"

WeightedGrade(grades=c(60, 75, 80), weights=c(10, 40, 50), scheme="A")

## [1] "Pass"

## Note that scheme has a default value ("A"), so it can be omitted
WeightedGrade(grades=c(60, 75, 80), weights=c(10, 40, 50) )

## [1] "Pass"

WeightedGrade(grades=c(60, 75, 80), weights=c(10, 30, 40) )

## [1] NA

WeightedGrade(grades=c(90, 95, 80, 86), weights=c(10, 10, 40, 40), scheme="B" )

## [1] "Pass"
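To check the arithmetic behind the first few example runs, the weighted average can be computed directly. The sketch below uses the same formula as inside WeightedGrade and compares it with weighted.mean() from the stats package; the variable names by_hand and built_in are introduced here only for illustration.

```r
grades  <- c(60, 75, 80)
weights <- c(10, 40, 50)

## Same formula as inside WeightedGrade:
## (10*60 + 40*75 + 50*80) / 100 = 7600 / 100
by_hand <- sum(weights * grades) / sum(weights)

## weighted.mean() from the stats package computes the same quantity
built_in <- weighted.mean(grades, weights)

print(by_hand)
print(built_in)
```

The weighted average is 76, which is greater than 60 but not greater than 80, matching the "Pass" result under scheme "A" and the "Fail" result under scheme "B".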

3.6 Exercises

Suppose we define the function quiz as

quiz <- function(bool_var1, x=0, bool_var2 = TRUE) {
    y <- 0
    if(bool_var1 & bool_var2) {
        y <- x + 2
    } else {
        if(bool_var1) {
            y <- x - 2
        }
    }
    return(y)
}

What value does the following function call return?

quiz(FALSE, 1.3)

Write an R function that implements the following mathematical function in R

\[ L(x, y) = \begin{cases} 0 & \text{if } x = 0 \text{ and } y = 0 \\ 1 & \text{if } x \neq 0 \text{ and } y = 0 \\ |x| & \text{if } y = 1 \\ x^{2} & \text{if } y = 2 \end{cases} \]

The function should have user-provided arguments x and y and should return NA if y does not equal \(0\), \(1\), or \(2\).

Write an R function called PropGtZero which returns the proportion of three entered numbers which are greater than \(0\). The function should have the following function definition:

PropGtZero <- function(x, y, z, gt=TRUE) {
  
}

If gt=TRUE, then PropGtZero should return the proportion of the numbers x, y, z which are greater than \(0\).
If gt=FALSE, then PropGtZero should return the proportion of the numbers x, y, z which are less than or equal to \(0\).
If one or more of x, y, z is NA, the function should return NA.
For example, PropGtZero(3,2,-2) should return \(2/3\).