rlang is a powerful R package that lets you write code that writes code. The amazing book Advanced R by Hadley Wickham covers this idea in much greater detail, but in this post I want to present a small example of my favorite function from the rlang package, parse_expr().
In short, rlang provides functions that let us delay the evaluation of expressions. Furthermore, we can manipulate those expressions with the tools in rlang, piecing several expressions together into a larger one.
I will be using the data from the palmerpenguins package.
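The loading step is not shown here. A minimal sketch, assuming the object called `data` in the rest of this post is simply the `penguins` tibble that ships with palmerpenguins:

```r
library(palmerpenguins)

# Assumption: `data` is the unmodified `penguins` tibble from the package
data <- penguins
data
```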
# A tibble: 344 × 8
   species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
   <fct>   <fct>              <dbl>         <dbl>             <int>       <int>
 1 Adelie  Torgersen           39.1          18.7               181        3750
 2 Adelie  Torgersen           39.5          17.4               186        3800
 3 Adelie  Torgersen           40.3          18                 195        3250
 4 Adelie  Torgersen           NA            NA                  NA          NA
 5 Adelie  Torgersen           36.7          19.3               193        3450
 6 Adelie  Torgersen           39.3          20.6               190        3650
 7 Adelie  Torgersen           38.9          17.8               181        3625
 8 Adelie  Torgersen           39.2          19.6               195        4675
 9 Adelie  Torgersen           34.1          18.1               193        3475
10 Adelie  Torgersen           42            20.2               190        4250
# ℹ 334 more rows
# ℹ 2 more variables: sex <fct>, year <int>
As a simple example, let’s say we would like to build a linear regression model for bill_length_mm, using all other variables as covariates.
fit <- lm(
  bill_length_mm ~ species + island + bill_depth_mm + flipper_length_mm + body_mass_g + sex + year,
  data = data
)
fit
Call:
lm(formula = bill_length_mm ~ species + island + bill_depth_mm +
    flipper_length_mm + body_mass_g + sex + year, data = data)

Coefficients:
      (Intercept)   speciesChinstrap      speciesGentoo        islandDream
       -3.893e+02          9.910e+00          6.487e+00         -4.624e-01
  islandTorgersen      bill_depth_mm  flipper_length_mm        body_mass_g
       -7.327e-02          3.272e-01          5.724e-02          1.136e-03
          sexmale               year
        2.054e+00          2.023e-01
This is a simple example, but what if there are many covariates we would like to include in the model? Typing them all out by hand quickly becomes tedious and error-prone. Instead, we can construct the formula as a string and then use parse_expr() to parse the string into an expression.
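The code that builds the formula is not shown here, so the following is only a sketch of one way to do it. The names `response`, `covariates` and `formula_string` are my own assumptions; `f` is the name used later in this post, and `data` is assumed to be the `penguins` tibble.

```r
library(rlang)
library(palmerpenguins)

# Assumption: `data` is the unmodified `penguins` tibble
data <- penguins

# Every column except the response becomes a covariate
response <- "bill_length_mm"
covariates <- setdiff(names(data), response)

# Build the formula as a plain character string ...
formula_string <- paste(response, "~", paste(covariates, collapse = " + "))

# ... then parse the string into an unevaluated expression
f <- parse_expr(formula_string)
f
```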
bill_length_mm ~ species + island + bill_depth_mm + flipper_length_mm +
body_mass_g + sex + year
As an example, we can construct an expression like so and delay its evaluation until later.
expr(lm("this is where the formula should be inserted", data = data))
lm("this is where the formula should be inserted", data = data)
Using the example from above, we can construct a new expression, and inject the previous expression into it using the !! (bang bang) operator.
lm_model <- expr(lm(!!f, data = data))
lm_model
lm(bill_length_mm ~ species + island + bill_depth_mm + flipper_length_mm +
body_mass_g + sex + year, data = data)
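The same injection works on much smaller expressions, which can make the mechanics easier to see. This is a toy sketch; the names `rhs` and `combined` and the symbols `x`, `y` and `z` are made up for illustration.

```r
library(rlang)

# Capture a small expression without evaluating it
rhs <- expr(x + y)

# Inject `rhs` into a larger expression; nothing is computed yet
combined <- expr(z * !!rhs)
combined

# Values are only needed once we evaluate: (1 + 2) * 3
result <- eval(combined, list(x = 1, y = 2, z = 3))
result
```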
As we can see, the !! operator allows us to piece together various expressions. Note, however, that the expression is still not evaluated. To evaluate it, we can pass the object lm_model into eval().
lm_model |> eval()
Call:
lm(formula = bill_length_mm ~ species + island + bill_depth_mm +
    flipper_length_mm + body_mass_g + sex + year, data = data)

Coefficients:
      (Intercept)   speciesChinstrap      speciesGentoo        islandDream
       -3.893e+02          9.910e+00          6.487e+00         -4.624e-01
  islandTorgersen      bill_depth_mm  flipper_length_mm        body_mass_g
       -7.327e-02          3.272e-01          5.724e-02          1.136e-03
          sexmale               year
        2.054e+00          2.023e-01
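The steps above could be wrapped into a small helper so that the response variable becomes a parameter. This is only a sketch: `build_lm_call()` is a hypothetical function of my own, not part of rlang, and it assumes a data frame named `data` exists wherever the call is eventually evaluated.

```r
library(rlang)
library(palmerpenguins)

# Hypothetical helper: returns an unevaluated lm() call that regresses
# `response` on all other entries of `columns`.
build_lm_call <- function(response, columns) {
  covariates <- setdiff(columns, response)
  f <- parse_expr(paste(response, "~", paste(covariates, collapse = " + ")))
  # `data` is not unquoted, so it stays a symbol and is looked up at eval() time
  expr(lm(!!f, data = data))
}

# Assumption: `data` is the unmodified `penguins` tibble
data <- penguins

model_call <- build_lm_call("body_mass_g", names(data))
model_call

# Evaluation happens only here
fit_mass <- eval(model_call)
```

Because the helper returns an expression rather than a fitted model, we can still inspect or modify the call before deciding to run it.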
Although this is a fun toy example, the ability to piece together expressions and delay their evaluation is a very powerful metaprogramming tool in R.