Introduction to R and RStudio

Chapter 12 Functions

Functions are one of the fundamental building blocks of the R language. They are small pieces of reusable code that can be treated like any other R object.

If you’ve worked through any other part of this course, you’ve probably used some functions already. A function call is usually recognizable as the name of the function followed by parentheses.

Let’s try using a few basic functions just for fun. The Sys.Date() function returns an object of class Date representing today’s date, which R prints as a quoted string. Type Sys.Date() below and see what happens.

Sys.Date()
## [1] "2024-01-10"

Most functions in R return a value. Functions like Sys.Date() return a value based on your computer’s environment, while other functions manipulate input data in order to compute a return value.
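To make the idea of a return value concrete, the short sketch below stores the result of Sys.Date() in a variable and inspects it. The variable name today is only an illustration, and the date itself will differ on your machine.

```r
# Store the value returned by Sys.Date() so it can be reused.
today <- Sys.Date()

# The returned object has class "Date".
class(today)
## [1] "Date"

# Because it is an ordinary R object, it can be passed on to other functions.
# Here format() extracts the four-digit year as a character string.
format(today, "%Y")
```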

The mean() function takes a vector of numbers as input, and returns the average of all of the numbers in the input vector. Inputs to functions are often called arguments. Providing arguments to a function is also sometimes called passing arguments to that function. Arguments you want to pass to a function go inside the function’s parentheses. Try passing the argument c(2, 4, 5) to the mean() function.

mean(c(2, 4, 5))
## [1] 3.666667

Functions usually take arguments, which are the values that the function operates on. For example, the mean() function takes a vector as an argument, as in mean(c(2, 6, 8)). The mean() function then adds up all of the numbers in the vector and divides that sum by the length of the vector.
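The arithmetic described above can be checked step by step. This is only a sketch of the calculation, not how mean() is implemented internally; the variable name v is made up for the example.

```r
v <- c(2, 6, 8)

sum(v)      # adds up the numbers: 2 + 6 + 8
## [1] 16

length(v)   # counts the numbers in the vector
## [1] 3

sum(v) / length(v)   # 16 divided by 3
## [1] 5.333333

mean(v)              # the built-in function gives the same answer
## [1] 5.333333
```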

In the following question you will be asked to modify a script that will appear as soon as you move on from this question. When you have finished modifying the script, save your changes to the script and type submit() in the console so that the script will be evaluated. There will be some comments in the script that opens up, so be sure to read them! Note that the scripts will automatically be saved in a temporary files folder, so if you want, you can make a copy of the script and save it for later use (e.g. via copy-paste to a notepad file, or via File > Save As > …).

The last R expression to be evaluated in a function will become the return value of that function. We want this function to take one argument, x, and return x without modifying it. Delete the pound sign (hash, hashtag) so that x is returned without any modification. Make sure to save your script before you type submit().

boring_function <- function(x) {
  x
}

Now that you’ve created your first function, let’s test it! Type: boring_function('My first function!'). If your function works, it should just return the string: "My first function!"

boring_function('My first function!')
## [1] "My first function!"
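The rule that the last evaluated expression becomes the return value is easier to see in a function with more than one expression. The function name last_value_demo below is hypothetical and exists only for this illustration.

```r
last_value_demo <- function(x) {
  x + 1   # evaluated, but the result is discarded
  x * 2   # last expression: this becomes the return value
}

last_value_demo(5)
## [1] 10
```

Only the result of x * 2 comes back; the result of x + 1 is never stored or returned.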

Congratulations on writing your first function. By writing functions, you can gain serious insight into how R works. As John Chambers, the creator of the S language on which R is based, once said:

To understand computations in R, two slogans are helpful: 1. Everything that exists is an object. 2. Everything that happens is a function call.

If you want to see the source code for a function, just type the function name without any arguments or parentheses. (For functions implemented in compiled code, such as sum(), this prints only a short stub rather than R code.) Let’s try this out with the function you just created. Type: boring_function to view its source code.

boring_function
## function(x) {
##   x
## }
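Printing a function works because a function is an ordinary R object. As a small sketch that goes slightly beyond the lesson, base R also provides helpers for inspecting such an object. Built-in functions like sum() are implemented in compiled code, which is why is.primitive() reports TRUE for them.

```r
boring_function <- function(x) {
  x
}

# A function is an object like any other.
is.function(boring_function)
## [1] TRUE

# body() returns just the code between the braces.
body(boring_function)

# Functions written in R are not primitives; sum() is.
is.primitive(boring_function)
## [1] FALSE
is.primitive(sum)
## [1] TRUE
```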

Time to make a more useful function! We’re going to replicate the functionality of the mean() function by creating a function called: my_mean(). Remember that to calculate the average of all of the numbers in a vector you find the sum of all the numbers in the vector, and then divide that sum by the number of numbers in the vector.

Make sure to save your script before you type submit().

# You're free to implement the function my_mean however you want, as long as it
# returns the average of all of the numbers in `my_vector`.
#
# Hint #1: sum() returns the sum of a vector.
#   Ex: sum(c(1, 2, 3)) evaluates to 6
#
# Hint #2: length() returns the size of a vector.
#   Ex: length(c(1, 2, 3)) evaluates to 3
#
# Hint #3: The mean of all the numbers in a vector is equal to the sum of all of
#          the numbers in the vector divided by the size of the vector.
#
# Note for those of you feeling super clever: Please do not use the mean()
# function while writing this function. We're trying to teach you something 
# here!
#
# Be sure to save this script and type submit() in the console after you make 
# your changes.

my_mean <- function(my_vector) {
  # Write your code here!
  # Remember: the last expression evaluated will be returned! 
  sum(my_vector)/length(my_vector)
}

Now test out your my_mean() function by finding the mean of the vector c(4, 5, 10).

my_mean(c(4, 5, 10))
## [1] 6.333333
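One way to gain confidence in my_mean() is to compare it against the built-in mean() on the same input. The sketch below does that and also shows what happens with a missing value; the variable name test_vector is made up for the example.

```r
my_mean <- function(my_vector) {
  sum(my_vector) / length(my_vector)
}

test_vector <- c(4, 5, 10)

# all.equal() compares two numbers while tolerating tiny rounding differences.
all.equal(my_mean(test_vector), mean(test_vector))
## [1] TRUE

# A missing value (NA) propagates through sum(), so the result is NA too.
my_mean(c(4, NA, 10))
## [1] NA
```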

Next, let’s try writing a function with default arguments. You can set default values for a function’s arguments, and this can be useful if you think someone who uses your function will set a certain argument to the same value most of the time.

Make sure to save your script before you type submit().

# Let me show you an example of a function I'm going to make up called
# increment(). Most of the time I want to use this function to increase the
# value of a number by one. This function will take two arguments: "number" and
# "by" where "number" is the value I want to increment and "by" is the amount I
# want to increment "number" by. I've written the function below. 
#
# increment <- function(number, by = 1){
#     number + by
# }
#
# If you take a look in between the parentheses you can see that I've set
# "by" equal to 1. This means that the "by" argument will have the default
# value of 1.
#
# I can now use the increment function without providing a value for "by": 
# increment(5) will evaluate to 6. 
#
# However if I want to provide a value for the "by" argument I still can! The
# expression: increment(5, 2) will evaluate to 7. 
# 
# You're going to write a function called "remainder." remainder() will take
# two arguments: "num" and "divisor" where "num" is divided by "divisor" and
# the remainder is returned. Imagine that you usually want to know the remainder
# when you divide by 2, so set the default value of "divisor" to 2. Please be
# sure that "num" is the first argument and "divisor" is the second argument.
#
# Hint #1: You can use the modulus operator %% to find the remainder.
#   Ex: 7 %% 4 evaluates to 3. 
#
# Remember to set appropriate default values! Be sure to save this 
# script and type submit() in the console after you write the function.

remainder <- function(num, divisor = 2) {
  # Write your code here!
  # Remember: the last expression evaluated will be returned! 
  num %% divisor
}

Let’s do some testing of the remainder function. Run remainder(5) and see what happens.

remainder(5)
## [1] 1

Let’s take a moment to examine what just happened. You provided one argument to the function, and R matched that argument to ‘num’ since ‘num’ is the first argument. The default value for ‘divisor’ is 2, so the function used the default value you provided.
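The same matching rule applies to the increment() function described in the script comments. Here it is as runnable code, copied from those comments, so you can try both forms of the call.

```r
increment <- function(number, by = 1) {
  number + by
}

# Only "number" is supplied, so "by" falls back to its default of 1.
increment(5)
## [1] 6

# Both arguments are supplied, so the default is overridden.
increment(5, 2)
## [1] 7
```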

Now let’s test the remainder function by providing two arguments. Type: remainder(11, 5) and let’s see what happens.

remainder(11, 5)
## [1] 1

Once again, the arguments have been matched appropriately.

You can also explicitly specify arguments by name when you call a function. When you explicitly designate argument values by name, the ordering of the arguments becomes unimportant. You can try this out by typing: remainder(divisor = 11, num = 5).

remainder(divisor = 11, num = 5)
## [1] 5

As you can see, there is a significant difference between remainder(11, 5) and remainder(divisor = 11, num = 5)! The first call computes 11 %% 5, which is 1, while the second computes 5 %% 11, which is 5.
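Named and positional arguments can also be mixed in one call. As a sketch of how R resolves this: named arguments are matched first, and any remaining unnamed values fill the remaining arguments in order.

```r
remainder <- function(num, divisor = 2) {
  num %% divisor
}

# Positional: num = 11, divisor = 5
remainder(11, 5)
## [1] 1

# Named: order no longer matters, num = 5, divisor = 11
remainder(divisor = 11, num = 5)
## [1] 5

# Mixed: "divisor" is matched by name, so 11 fills the remaining argument "num"
remainder(divisor = 5, 11)
## [1] 1
```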

R can also partially match arguments. Try typing remainder(4, div = 2) to see this feature in action.

remainder(4, div = 2)
## [1] 0

A word of warning: in general you want to make your code as easy to understand as possible. Switching around the order of arguments by specifying their names, or using only partial argument names, can be confusing, so use these features with caution!
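One concrete reason for caution is that a partial name can match more than one argument, in which case R stops with an error. The function scale_value() and its arguments below are hypothetical, invented only to show this.

```r
# Two arguments, "factor" and "flip", both start with the letter f.
scale_value <- function(x, factor = 1, flip = FALSE) {
  result <- x * factor
  if (flip) -result else result
}

# "fac" matches only "factor", so this works.
scale_value(2, fac = 3)
## [1] 6

# "f" could mean "factor" or "flip", so R raises an error.
# tryCatch() captures the error object instead of stopping the script.
outcome <- tryCatch(scale_value(2, f = 3), error = function(e) e)
inherits(outcome, "error")
## [1] TRUE
```

Writing the full argument name avoids the problem entirely.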

With all of this talk about arguments, you may be wondering if there is a way you can see a function’s arguments (besides looking at the documentation). Thankfully, you can use the args() function! Type: args(remainder) to examine the arguments for the remainder function.

args(remainder)
## function (num, divisor = 2) 
## NULL
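If you need the arguments as data rather than as printed text, base R’s formals() function returns them as a named list. This is a small aside and not part of the original lesson.

```r
remainder <- function(num, divisor = 2) {
  num %% divisor
}

# The names of the arguments, in order: "num" and "divisor"
names(formals(remainder))

# The default value stored for "divisor"
formals(remainder)$divisor
## [1] 2
```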

You may not realize it but I just tricked you into doing something pretty interesting! args() is a function, remainder() is a function, yet remainder was an argument for args(). Yes it’s true: you can pass functions as arguments! This is a very powerful concept. Let’s write a script to see how it works.

Make sure to save your script before you type submit().

# You can pass functions as arguments to other functions just like you can pass
# data to functions. Let's say you define the following functions:
#
# add_two_numbers <- function(num1, num2){
#    num1 + num2
# }
#
# multiply_two_numbers <- function(num1, num2){
#   num1 * num2
# }
#
# some_function <- function(func){
#    func(2, 4)
# }
#
# As you can see, we use the argument name "func" like a function inside of
# "some_function()." Because functions can be passed as arguments,
# some_function(add_two_numbers) will evaluate to 6, while
# some_function(multiply_two_numbers) will evaluate to 8.
# 
# Finish the function definition below so that if a function is passed into the
# "func" argument and some data (like a vector) is passed into the dat argument
# the evaluate() function will return the result of dat being passed as an
# argument to func.
#
# Hints: This exercise is a little tricky so I'll provide a few examples of how
# evaluate() should act:
#    1. evaluate(sum, c(2, 4, 6)) should evaluate to 12
#    2. evaluate(median, c(7, 40, 9)) should evaluate to 9
#    3. evaluate(floor, 11.1) should evaluate to 11

evaluate <- function(func, dat){
  # Write your code here!
  # Remember: the last expression evaluated will be returned! 
  func(dat)
}

Let’s take your new evaluate() function for a spin! Use evaluate to find the standard deviation of the vector c(1.4, 3.6, 7.9, 8.8).

evaluate(sd, c(1.4, 3.6, 7.9, 8.8))
## [1] 3.514138

The idea of passing functions as arguments to other functions is an important and fundamental concept in programming.
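The example from the script comments can be run directly. The code below is taken from those comments; it shows the same function some_function() producing different results depending on which function it receives.

```r
add_two_numbers <- function(num1, num2) {
  num1 + num2
}

multiply_two_numbers <- function(num1, num2) {
  num1 * num2
}

# "func" is used like a function inside the body.
some_function <- function(func) {
  func(2, 4)
}

some_function(add_two_numbers)        # 2 + 4
## [1] 6
some_function(multiply_two_numbers)   # 2 * 4
## [1] 8
```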

You may be surprised to learn that you can pass a function as an argument without first assigning it a name. Functions that are not named are appropriately known as anonymous functions.

Let’s use the evaluate function to explore how anonymous functions work. For the first argument of the evaluate function we’re going to write a tiny function that fits on one line. In the second argument we’ll pass some data to the tiny anonymous function in the first argument.

Type the following command and then we’ll discuss how it works: evaluate(function(x){x+1}, 6)

evaluate(function(x){x+1}, 6)
## [1] 7

The first argument is a tiny anonymous function that takes one argument x and returns x+1. We passed the number 6 into this function so the entire expression evaluates to 7.
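An anonymous function behaves exactly like the same function stored under a name. The sketch below shows the two side by side; the name add_one is made up for the comparison.

```r
evaluate <- function(func, dat) {
  func(dat)
}

# Named version: define first, then pass the name.
add_one <- function(x) { x + 1 }
evaluate(add_one, 6)
## [1] 7

# Anonymous version: the definition is written directly in the call.
evaluate(function(x) { x + 1 }, 6)
## [1] 7

# An anonymous function can even be called on the spot,
# as long as the definition is wrapped in parentheses.
(function(x) { x + 1 })(6)
## [1] 7
```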

Try using evaluate() along with an anonymous function to return the first element of the vector c(8, 4, 0). Your anonymous function should only take one argument which should be a variable x.

evaluate(function(x){x[1]}, c(8, 4, 0))
## [1] 8

Now try using evaluate() along with an anonymous function to return the last element of the vector c(8, 4, 0). Your anonymous function should only take one argument which should be a variable x.

evaluate(function(x){x[length(x)]}, c(8, 4, 0))
## [1] 0
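Indexing with length(x) is one way to reach the last element. As an alternative sketch, the tail() function from the utils package (attached by default in a standard R session) does the same job.

```r
evaluate <- function(func, dat) {
  func(dat)
}

# tail(x, 1) returns the final element of x.
evaluate(function(x) { tail(x, 1) }, c(8, 4, 0))
## [1] 0

# Both approaches agree on this vector.
identical(
  evaluate(function(x) { x[length(x)] }, c(8, 4, 0)),
  evaluate(function(x) { tail(x, 1) }, c(8, 4, 0))
)
## [1] TRUE
```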

We’ve come to the end of our lesson! Go out there and write some great functions!

In the exercises above, each function returns its last evaluated expression. Although this works and is correct, it is better to state explicitly which object the function returns, using the return() function. For example, the following function, called myFun, takes objects x and y as input arguments and computes z as the product of x and y in the function body. As the last expression in the body, the value of z is explicitly returned:

myFun <- function(x, y) {
  z <- x * y
  return(z)
}
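Calling myFun works like calling any other function. The sketch below also shows a case where return() changes behaviour rather than just style: leaving a function early. The function safe_ratio() is hypothetical and only illustrates the idea.

```r
myFun <- function(x, y) {
  z <- x * y
  return(z)
}

myFun(3, 4)
## [1] 12

# return() ends the function immediately, so code after it is skipped.
safe_ratio <- function(x, y) {
  if (y == 0) {
    return(NA_real_)   # leave early instead of dividing by zero
  }
  return(x / y)
}

safe_ratio(10, 2)
## [1] 5
safe_ratio(10, 0)
## [1] NA
```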