R Environments

programming
Author

Michael Luu

Published

February 27, 2024

This blog post aims to demystify the concept of environments in R.

Example 1

Let’s start with a very simple example:

a <- 1
b <- 2
c <- 3

a + b + c
[1] 6

In this code chunk we define the objects a, b, and c, and then add them together. The important takeaway is that a, b, and c are all defined in the global environment.

We can check which environment we are in with the following function:

rlang::current_env()
<environment: R_GlobalEnv>

The current_env function returns the current environment, which in this case is the global environment. The global environment is the top-level workspace in R: objects created at the console or at the top level of a script are defined there by default. However, we can also define objects within other environments. For example, each time a function is called, R creates a new environment for that call, and objects defined in the function body live in that environment rather than in the global environment.
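To make that last point concrete, here is a minimal sketch in base R. It uses environment() and globalenv(), which behave like rlang::current_env() and rlang::global_env() for this purpose, so it runs without any packages. The function name show_env and the object x are made up for illustration.

```r
# show_env() is a hypothetical helper: it defines x and returns the
# environment that R created for this particular call.
show_env <- function() {
  x <- 10
  environment()
}

fn_env <- show_env()

# The function's environment is not the global environment.
identical(fn_env, globalenv())
#> [1] FALSE

# x lives inside the function's environment.
get("x", envir = fn_env)
#> [1] 10

# Every call gets a fresh environment of its own.
identical(show_env(), show_env())
#> [1] FALSE
```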

Besides calling current_env() directly, we can also capture the environment in an object:

env <- rlang::current_env()

Now that we have the environment captured, we can use the env object to access objects that were defined within the environment.

names(env)
[1] "env"                 "a"                   ".QuartoInlineRender"
[4] "b"                   "c"                   ".main"              
c(env$a, env$b, env$c)
[1] 1 2 3
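The same access pattern works on any environment, and it works for writing as well as reading. The sketch below uses a throwaway environment created with new.env(), so nothing in the global environment is touched; the names scratch and alias are made up. Note that assigning an environment to a second name does not copy it.

```r
scratch <- new.env()
scratch$a <- 1
scratch$b <- 2

# ls() returns the names in sorted order; names() on an environment is unsorted.
ls(scratch)
#> [1] "a" "b"

# mget() reads several objects at once and returns a named list.
values <- mget(c("a", "b"), envir = scratch)

# alias and scratch refer to the same environment, so a write through
# one name is visible through the other.
alias <- scratch
alias$a <- 100
scratch$a
#> [1] 100
```

This reference behaviour is what makes it possible to change an object in one environment from code running in another.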

Example 2

Now let’s try this again, except we’re going to define the objects within a local environment. I’m going to redefine the objects a, b, and c within a local environment, and then add them together.

local({
  a <- 4
  b <- 5
  c <- 6
  
  env <- rlang::current_env()
  
  values <- c(a, b, c)
  
  results <- a + b + c
  
  tibble::lst(env,
              values,
              results)
})
$env
<environment: 0x0000027c4a61f4d8>

$values
[1] 4 5 6

$results
[1] 15

The output shows the local environment, printed as a memory address because it has no name (the address will differ on every run), along with the values of a, b, and c defined within the local environment and the result of adding them together.
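One consequence worth checking is that the assignments inside local() never touch the objects outside it. A minimal sketch in base R, where the name inner is just for illustration:

```r
a <- 1

inner <- local({
  a <- 4  # defined in the local environment only
  list(env = environment(), value = a)
})

inner$value
#> [1] 4

# The outer a is unchanged.
a
#> [1] 1

# The local a is still reachable through the captured environment.
inner$env$a
#> [1] 4
```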

Example 3

What if we are working within a local environment that already defines the objects a, b, and c, but we would like to use the a, b, and c that are defined in the global environment?

Let’s take a look at the following example.

local({
  a <- 4
  b <- 5
  c <- 6
  
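  env <- rlang::current_env()
  
  values <- c(a, b, c)
  
  # capture the expression without evaluating it
  r_expression <- rlang::expr(a + b + c)
  
  # evaluate the captured expression in the global environment
  results <- rlang::eval_tidy(r_expression, env = rlang::global_env())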
  
  tibble::lst(
    env,
    values,
    results
  )
  
})
$env
<environment: 0x0000027c4a8623b8>

$values
[1] 4 5 6

$results
[1] 6

In this example, we define the objects a, b, and c within a local environment. We then use the expr function to capture the expression a + b + c without evaluating it, and the eval_tidy function to evaluate that expression in the global environment. This lets us use the global a, b, and c even though we are working inside the local environment, which is why results is 6 rather than 15.
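The same capture-then-evaluate pattern can be sketched in base R, with quote() in the role of rlang::expr() and eval() in the role of rlang::eval_tidy(). This is only a rough equivalent for a plain expression like this one, since eval_tidy also supports features such as data masks. The names sum_expr, outer_env, and inner_env are made up.

```r
a <- 1
b <- 2
c <- 3

# At the top level of a script or the console, this is the global environment.
outer_env <- environment()

# A separate environment holding its own a, b, and c.
inner_env <- new.env()
inner_env$a <- 4
inner_env$b <- 5
inner_env$c <- 6

# Capture the expression without evaluating it.
sum_expr <- quote(a + b + c)

# The same expression gives different results depending on where it is evaluated.
eval(sum_expr, envir = outer_env)
#> [1] 6
eval(sum_expr, envir = inner_env)
#> [1] 15
```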

Example 4

What if we want to use the object a from the global environment, but b and c from the local environment? We can simplify the code by capturing the global environment in an object called global, referring to a explicitly as global$a, and then adding the local b and c.

local({
  a <- 4
  b <- 5
  c <- 6
  
  env <- rlang::current_env()
  global <- rlang::global_env()
  
  results <- global$a + b + c
  
  values <- c(global$a, b, c)
  
  tibble::lst(
    env,
    values,
    results
  )
  
})
$env
<environment: 0x0000027c4aac9f50>

$values
[1] 1 5 6

$results
[1] 12

In this example, we define the objects a, b, and c within a local environment. We then use the global_env function to capture the global environment in an object called global. Finally, we add the local b and c to the a that is defined in the global environment, giving 1 + 5 + 6 = 12.
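The global$ prefix is only needed here because the local environment defines its own a. When a name is not defined locally, R looks it up in the enclosing environment automatically. A small sketch of that lookup rule in base R, with the name result made up for illustration:

```r
a <- 1
b <- 2
c <- 3

result <- local({
  a <- 4  # only a is redefined locally

  # b and c are not defined here, so R finds them in the enclosing environment
  c(a = a, b = b, c = c)
})

result
#> a b c 
#> 4 2 3 
```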

Example 5

Now that we have some of the basic fundamentals down, let’s get a little more complex.

A very typical workflow when creating new columns in a dataframe is to use the mutate function from dplyr. We would normally pipe the dataframe into mutate and define the new column. Because mutate returns a new dataframe rather than modifying its input, we assign the result back to the original name.

data <- tibble::tibble(i = 1:26)

data <- data |> 
  dplyr::mutate(
    column_a = rnorm(26)
  )

data
# A tibble: 26 × 2
       i column_a
   <int>    <dbl>
 1     1   0.455 
 2     2   0.226 
 3     3   0.0299
 4     4   0.0565
 5     5   1.27  
 6     6   0.962 
 7     7   1.77  
 8     8   0.313 
 9     9   1.41  
10    10   0.469 
# ℹ 16 more rows

What if we wanted to create 26 new columns, each with a different name and different values? Instead of manually calling mutate for all 26 columns, let’s use a functional programming approach to accomplish this goal.

In this example I’m going to use purrr::walk to iterate over the vector of column names. purrr::walk returns its input invisibly rather than a new result, so we invoke it for its side effects. What’s important here is that the code we pass to purrr::walk runs inside a function, and that function has its own environment. A plain assignment to data inside the function would therefore only create a local copy. Instead, we capture the global environment as env and access data through it: on each iteration we add the new column to env$data and assign the result back to env$data. When we then print data from the global environment, it contains all of the newly created columns.
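The difference between a plain assignment inside a function and an assignment through a captured environment can be seen in a small base R sketch with a simple counter. The names counter, bump_local, and bump_env are hypothetical, and environment() stands in for global_env(), which is what it returns when run at the top level.

```r
counter <- 0

# Plain assignment inside a function creates a new local object;
# the outer counter is untouched.
bump_local <- function() {
  counter <- counter + 1
  invisible(NULL)
}
bump_local()
after_local <- counter
after_local
#> [1] 0

# Assigning through a captured environment changes that environment in place.
target_env <- environment()
bump_env <- function(env) {
  env$counter <- env$counter + 1
  invisible(NULL)
}
bump_env(target_env)
counter
#> [1] 1
```

The walk call applies the second pattern to data.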

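data <- tibble::tibble(i = 1:26)

column_names <- paste0("column_", letters[1:26])

purrr::walk(column_names, \(x) {
  
  # capture the global environment so we can modify its data object
  env <- rlang::global_env()
  
  # !!x := uses the string stored in x as the name of the new column
  env$data <- env$data |> 
    dplyr::mutate(
      !!x := rnorm(26)
    )
  
})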

data
# A tibble: 26 × 27
       i column_a column_b column_c column_d column_e column_f column_g column_h
   <int>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
 1     1   2.54    -1.50     0.945    1.69    -0.0870  -1.07      0.533  -1.73  
 2     2  -0.566   -2.38    -0.951   -0.117    0.494   -0.172     0.106  -0.483 
 3     3  -0.489   -0.0235  -0.396    0.0982  -0.0762   0.653     0.684  -0.157 
 4     4  -1.05     1.92     1.71    -0.285    0.897   -2.10      0.234   0.141 
 5     5  -1.27    -1.07    -0.0825   1.52    -0.850    0.881    -0.632  -0.0621
 6     6  -0.803    0.422   -1.36    -2.60     0.132   -1.68      0.957  -0.567 
 7     7  -0.128   -0.680    0.297   -0.928   -0.722    0.856    -1.43    0.915 
 8     8  -0.0326   1.64     0.933    0.329    0.452    0.553    -0.380   0.608 
 9     9   0.117    1.27     0.240    1.50     0.658   -0.0955   -0.397  -1.31  
10    10   1.02    -1.12    -1.27     1.45    -0.532    0.321    -0.991  -1.53  
# ℹ 16 more rows
# ℹ 18 more variables: column_i <dbl>, column_j <dbl>, column_k <dbl>,
#   column_l <dbl>, column_m <dbl>, column_n <dbl>, column_o <dbl>,
#   column_p <dbl>, column_q <dbl>, column_r <dbl>, column_s <dbl>,
#   column_t <dbl>, column_u <dbl>, column_v <dbl>, column_w <dbl>,
#   column_x <dbl>, column_y <dbl>, column_z <dbl>
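Modifying a global object from inside a function is a side effect, and the same table can also be built without one. The sketch below is an alternative, not the approach used in this post: it uses base R's Reduce() with a plain data.frame, passing the data through each step and returning it. The helper add_random_column is a made-up name, and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible

data <- data.frame(i = 1:26)
column_names <- paste0("column_", letters[1:26])

# Hypothetical helper: takes a data frame and a column name,
# and returns the data frame with one new random column added.
add_random_column <- function(df, name) {
  df[[name]] <- rnorm(nrow(df))
  df
}

# Reduce() feeds the result of each step into the next, starting from data.
data <- Reduce(add_random_column, column_names, data)

dim(data)
#> [1] 26 27
```

Here nothing outside the function is modified. The trade-off is that the result has to be assigned back explicitly.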

Conclusion

In summary, being able to refer to objects in specific environments is extremely powerful. It allows us to combine objects that are defined in different environments within the same expression. Hopefully, this helps demystify the concept of environments in R.

Citation

BibTeX citation:
@online{luu2024,
  author = {Luu, Michael},
  title = {R {Environments}},
  date = {2024-02-27},
  langid = {en}
}
For attribution, please cite this work as:
Luu, Michael. 2024. “R Environments.” February 27, 2024.