Debugging code in any language can be tricky. Even with the many debugging tools available, I know most of us still lean on the print("Here") trick. But I thought I’d share a little strategy that I like to use in R.

Particularly when you’re working on package development, certain bugs can be tough to reach. If you haven’t learned to use browser() in R, I highly recommend it! Even so, it can be frustrating to use, especially when you’re dealing with vectorized code or loops. Stopping the code at the problematic element or iteration can be a pain. And sometimes you need to work with the data right at that spot and spend some time experimenting with different ways to fix the problem.

So here’s what I like to do. Let’s say you have a buggy function:

library(dplyr)

buggy_function <- function(x, y) {
   step1 <- mtcars %>% 
      mutate(
         flag = if_else(gear > x && carb > y, TRUE, FALSE)
      )
   
   step1 %>% 
      group_by(flag) %>% 
      summarize(
         mean = mean(mpg),
         sd = sd(mpg)
      )
}
buggy_function(3,3)
## # A tibble: 1 × 3
##   flag   mean    sd
##   <lgl> <dbl> <dbl>
## 1 TRUE   20.1  6.03

Hm. Not what I expected. If you dig into mtcars, you’ll see that there should definitely be some FALSE records for the flag variable we created. The first step is to dig into the step1 dataset and see what’s happening. That’s easy enough using the browser() function.

buggy_function <- function(x, y) {
   step1 <- mtcars %>% 
      mutate(
         flag = if_else(gear > x && carb > y, TRUE, FALSE)
      )
   browser()
   
   step1 %>% 
      group_by(flag) %>% 
      summarize(
         mean = mean(mpg),
         sd = sd(mpg)
      )
}
buggy_function(3, 3)
Called from: buggy_function(3, 3)
Browse[1]> 

This lets us stop the function at this exact point and explore its environment. For this example, that’s pretty straightforward, but you will inevitably hit more complex scenarios. What if your issue happens on the 112th iteration of some vectorized function? Here I’m using mtcars, but what if the dataset that feeds step1 is dynamic? In those cases, testing out code while using browser() can be a pain. It’s easy to inadvertently exit the browser and lose the environment where you were trying to debug. At that point, you need to start it up again and navigate your way back to the problem section.
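For the "112th iteration" case, one option is to make the pause conditional so you land directly on the iteration you care about. This is only a sketch: `process_values()` and its `stop_at` argument are hypothetical names made up for illustration, and the `interactive()` guard is my own addition so the function still runs untouched in scripts and tests.

```r
# Hypothetical example: pause only on the iteration you want to inspect.
process_values <- function(values, stop_at = 112) {
  results <- numeric(length(values))
  for (i in seq_along(values)) {
    # Drop into the browser only on the chosen iteration, and only in an
    # interactive session so non-interactive runs are not interrupted.
    if (interactive() && i == stop_at) browser()
    results[i] <- sqrt(values[i])
  }
  results
}

out <- process_values(1:200)
length(out)
```

browser() also accepts an `expr` argument that can serve the same purpose as the `if ()` wrapper here.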

So here’s a pretty simple technique to work around this problem. Move the troublesome data into your global environment so you can experiment freely. How would we do that?

> buggy_function(3, 3)
Called from: buggy_function(3, 3)
Browse[1]> assign('step1', step1, envir=globalenv())

From within the browser, you can use the assign() function to copy an object from the executing environment into another one. Put simply, we’re grabbing that data so we can play with it outside of the browser. Here, I’m taking the object step1 from inside the function we’re debugging and assigning it to a variable, also named step1, in the global environment. From there, I can exit the browser and experiment freely.
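To see what that assign() call does without stepping through a browser session, here is a minimal sketch that runs the same call from inside a function body. The names `inner_function`, `local_data`, and `local_data_copy` are hypothetical and only there for illustration.

```r
# Hypothetical function standing in for the code you are debugging.
inner_function <- function() {
  local_data <- data.frame(id = 1:3, value = c(10, 20, 30))

  # The same kind of call you would type at the Browse[1]> prompt:
  # copy the local object into the global environment under a new name.
  assign("local_data_copy", local_data, envir = globalenv())

  invisible(NULL)
}

inner_function()

# The copy outlives the function call and can be explored freely.
exists("local_data_copy", envir = globalenv())
nrow(get("local_data_copy", envir = globalenv()))
```

Keep in mind that assign() overwrites any existing global object with the same name, so a distinct name for the copy can be a safer choice.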

step1 %>% select(gear, carb, flag) %>% head()
##    gear carb  flag
## 1     4    4  TRUE
## 2     4    4  TRUE
## 3     4    1  TRUE
## 4     3    1  TRUE
## 5     3    2  TRUE
## 6     3    1  TRUE

With a little experimenting, I can find my issue. Within my if_else() call, the condition I’m using is written incorrectly:

step1$gear > 3 && step1$carb > 3
## [1] TRUE

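Whoops! It returns a single `TRUE` instead of one value per row. That’s because `&` and `&&` work differently: `&` compares element by element, while `&&` is meant for a single length-one condition. In older versions of R, `&&` looked only at the first element of each vector; R 4.3.0 and later raise an error instead. Switching to `&`, I can see my mistake: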

step1$gear > 3 & step1$carb > 3
##  [1]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE
## [14] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [27] FALSE FALSE  TRUE  TRUE  TRUE FALSE
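To see the two operators side by side in isolation, here is a small sketch on made-up vectors (`gear_small` and `carb_small` are illustrative, not rows from mtcars). The `&&` call is wrapped in tryCatch() because its behavior with longer vectors depends on the R version.

```r
# Made-up illustrative vectors (not taken from mtcars)
gear_small <- c(4, 3, 5)
carb_small <- c(4, 1, 2)

# `&` compares element by element and returns one value per element
elementwise <- gear_small > 3 & carb_small > 3
print(elementwise)

# `&&` expects length-one inputs. With longer vectors the result depends on
# the R version (first element only in older versions, an error from 4.3.0),
# so tryCatch() keeps the example runnable either way.
scalar_attempt <- tryCatch(
  gear_small > 3 && carb_small > 3,
  error = function(e) paste("error:", conditionMessage(e))
)
print(scalar_attempt)

# If a single answer really is what you want, say so explicitly
any(gear_small > 3 & carb_small > 3)
all(gear_small > 3 & carb_small > 3)
```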

And finally I can fix my function!

buggy_function <- function(x, y) {
   step1 <- mtcars %>% 
      mutate(
         flag = if_else(gear > x & carb > y, TRUE, FALSE)
      )
   
   step1 %>% 
      group_by(flag) %>% 
      summarize(
         mean = mean(mpg),
         sd = sd(mpg)
      )
}
buggy_function(3, 3)
## # A tibble: 2 × 3
##   flag   mean    sd
##   <lgl> <dbl> <dbl>
## 1 FALSE  20.5  6.67
## 2 TRUE   18.5  2.40
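As an optional cross-check, the same grouping can be rebuilt in base R, with no dplyr involved, to confirm that both groups now exist. This is just a sketch: `flagged` and `check` are names I made up, and the thresholds are hard-coded to the 3 and 3 used in the call above.

```r
# Base R cross-check of the fixed logic; mtcars ships with R
flagged <- transform(mtcars, flag = gear > 3 & carb > 3)

# Both FALSE and TRUE groups should now be present
table(flagged$flag)

# Group means and standard deviations of mpg, for comparison with the
# tibble printed above
check <- aggregate(
  mpg ~ flag,
  data = flagged,
  FUN = function(v) c(mean = mean(v), sd = sd(v))
)
print(check)
```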

And I’m good to go! I’ve found this quite helpful and hope you do too. Do you have any favorite debugging techniques of your own that you’d like to share?

You can check out the repository for this post right here.
