esoteric R | Introducing Closures

Jeffrey A. Ryan
January 1, 2011

The R language provides object-oriented programming through two primary systems, known as S3 and S4. S3 implements a class-based dispatch mechanism, while S4 offers a more traditional object-oriented scheme. Both implementations utilize list-style constructs for objects and separate data from methods. A third mechanism, closures, offers the programmer the option of integrating methods within objects. This can be used as a lightweight object design with benefits that neither S3 nor S4 offers.

The Basics of a Closure


A closure in R is an object that contains functions bound to the environment the closure was created in. These functions maintain access to the scope in which they were defined, allowing for powerful design patterns that are difficult with the standard S3/S4 approach to objects in R.
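As a minimal sketch of the idea, the function below returns another function that keeps access to the scope it was created in. The name make_adder is hypothetical and used only for this illustration; it is not part of the stack example that follows.

```r
# make_adder() is a hypothetical helper used only for illustration.
make_adder <- function(n) {
  force(n)              # evaluate n now so the value is captured at creation time
  function(x) x + n     # the returned function remembers n
}

add2  <- make_adder(2)
add10 <- make_adder(10)

add2(5)     # 7
add10(5)    # 15

# Each returned function carries its own environment holding its own n
environment(add2)$n     # 2
```

Each call to make_adder creates a fresh environment, so add2 and add10 do not interfere with one another.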

To create closures, we use the environment object in R. This allows for data and methods to reside within the object instances, making self-aware behavior and selective inheritance easy. It's even possible to mix this with traditional R by assigning a class to the environment.

We'll start the exploration with an example of functionality found in other interpreted languages: the stack [1].

Example: A Stack in R

A stack implementation consists of three main components:

  1. a container variable --- a.k.a. the stack
  2. a push method to add elements
  3. a pop method to remove elements

The general idea is to be able to add elements to a container, and modify the container in-place. In R this is possible using some assignment tricks into the .GlobalEnv, but it can be fraught with unintended consequences. Closures offer us a perfect alternative to keep surprises to a minimum.

First, we'll create a single environment that will act as the container and then add into that environment a stack vector and the two methods, push and pop.

      s <- new.env()
  
      s$.Data <- vector()
      s$push <- function(x) .Data <<- c(.Data,x)
      s$pop  <- function() {
          tmp <- .Data[length(.Data)]
          .Data <<- .Data[-length(.Data)]
          return(tmp)
        }
      ls(s, all.names=TRUE)
      [1] ".Data" "pop" "push" 
    

Both methods use the double arrow <<- assignment operator, which searches the function's enclosing environments, starting with the parent of the function's own evaluation frame, for an existing variable to bind to. This allows for non-local modifications to our .Data variable. The push method appends new data to the stack, while pop removes the last element of the stack and returns it to the caller. We can use the $ operator to access the internal methods of our environment.

      s$push(1)
      Error in s$push(1) : object '.Data' not found
    

Oops, something is wrong. It turns out that push can't find the .Data object stored in the s object. A function's enclosing environment is the one it was defined in (here, the global environment), not the environment it happens to be stored in, so we haven't yet matched the environment of the function to the object's environment. R isn't starting its search for .Data in the correct location; it needs more information. The functions environment and as.environment work well here.

      environment(s$push) <- as.environment(s)
      environment(s$pop) <- as.environment(s)

      s$push(1)   # works now
      s$pop()
      [1] 1
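The difference between where a function is stored and where it looks up variables can be checked directly. The following is a small sketch using hypothetical names (e, value, get_value) that are not part of the stack example.

```r
# e, value and get_value are hypothetical names used only for illustration.
e <- new.env()
e$value <- 42
e$get_value <- function() value   # created outside of e, merely stored in e

# Storing a function in e does not make e its enclosing environment
identical(environment(e$get_value), e)    # FALSE

# Re-point the function's environment at e, as was done for push and pop
environment(e$get_value) <- e
identical(environment(e$get_value), e)    # TRUE

e$get_value()                             # 42
```

Once the function's environment is e, both ordinary lookups and <<- assignments start their search there.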
    

We can use S3 classes to create push and pop methods that make the calls look more like normal R:

      push <- function(x, value, ...) UseMethod("push")
      pop  <- function(x, ...) UseMethod("pop")
      push.stack <- function(x, value, ...) x$push(value)
      pop.stack  <- function(x, ...) x$pop()
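For dispatch to reach these methods, the environment needs a class attribute of "stack". The sketch below repeats the definitions so it runs on its own, and assumes the class is assigned by hand with class<-; the object is otherwise the one built step by step above.

```r
# Rebuild the environment-based stack so this block is self-contained
s <- new.env()
s$.Data <- vector()
s$push <- function(x) .Data <<- c(.Data, x)
s$pop  <- function() {
  tmp <- .Data[length(.Data)]
  .Data <<- .Data[-length(.Data)]
  tmp
}
environment(s$push) <- s
environment(s$pop)  <- s

# S3 generics and methods
push <- function(x, value, ...) UseMethod("push")
pop  <- function(x, ...) UseMethod("pop")
push.stack <- function(x, value, ...) x$push(value)
pop.stack  <- function(x, ...) x$pop()

# An environment can carry a class attribute like any other R object
class(s) <- "stack"

push(s, "a")
push(s, "b")
pop(s)              # "b"
is.environment(s)   # TRUE: still an environment underneath
```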
    

That completes our stack object. Unfortunately, we currently need to recreate most of the above code for each new "stack" object we'd like to create. A much better approach would be to functionalize this.

      new_stack <- function() { 
        stack <- new.env()
        stack$.Data <- vector()
        stack$push <- function(x) .Data <<- c(.Data,x)
        stack$pop  <- function() {
          tmp <- .Data[length(.Data)]
          .Data <<- .Data[-length(.Data)]
          return(tmp)
        }
        environment(stack$push) <- as.environment(stack)
        environment(stack$pop) <- as.environment(stack)
        class(stack) <- "stack"
        stack
      }
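A short usage sketch of the constructor follows. It repeats new_stack so the block runs on its own; the variable names a, b and alias are hypothetical. It illustrates that each call produces an independent stack, and that copies of a stack refer to the same underlying environment.

```r
# new_stack() repeated from above so this block is self-contained
new_stack <- function() {
  stack <- new.env()
  stack$.Data <- vector()
  stack$push <- function(x) .Data <<- c(.Data, x)
  stack$pop  <- function() {
    tmp <- .Data[length(.Data)]
    .Data <<- .Data[-length(.Data)]
    tmp
  }
  environment(stack$push) <- stack
  environment(stack$pop)  <- stack
  class(stack) <- "stack"
  stack
}

a <- new_stack()
b <- new_stack()

a$push(1:3)
b$push("x")

a$.Data     # 1 2 3
b$.Data     # "x"

# Environments are not copied on assignment: alias and a are the same stack
alias <- a
alias$push(4)
a$.Data     # 1 2 3 4
```

The last lines show the in-place modification that motivated the design: no reassignment of a is needed to see the change.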
    

Not only can we now create stacks easily, we can also use this to extend the class with new functionality via inheritance.

Example: Making a Better Stack

An interesting extension to our example comes from extending our stack object with additional "shift" and "unshift" methods. Using the new_stack constructor, we can extend the "stack" object to a new class called "betterstack".

      new_betterstack <- function() {
        stack <- new_stack()
        stack_env <- as.environment(stack)
        stack$shift   <- function(x) .Data <<- c(x, .Data)
        stack$unshift <- function() {
          tmp <- .Data[1]
          .Data <<- .Data[-1]
          return(tmp)
        }
        environment(stack$shift)   <- stack_env
        environment(stack$unshift) <- stack_env
        class(stack) <- c("betterstack", "stack")
        stack
      }
    

To make the experience more R-like, we again add S3 methods for shift and unshift like we did for push and pop. Putting it all together gets us a nice stack-like object for R.
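The S3 methods for shift and unshift are not listed in the text; one plausible form, mirroring push.stack and pop.stack, is sketched here. To keep the block runnable on its own it uses a small hand-built stand-in object (obj, a hypothetical name) instead of calling new_betterstack.

```r
# Generics and methods, modeled on push/pop (an assumed form)
shift   <- function(x, value, ...) UseMethod("shift")
unshift <- function(x, ...) UseMethod("unshift")
shift.betterstack   <- function(x, value, ...) x$shift(value)
unshift.betterstack <- function(x, ...) x$unshift()

# Minimal stand-in for a "betterstack" object, for demonstration only
obj <- new.env()
obj$.Data <- c(2, 3)
obj$shift   <- function(x) .Data <<- c(x, .Data)   # add to the front
obj$unshift <- function() {                        # remove from the front
  tmp <- .Data[1]
  .Data <<- .Data[-1]
  tmp
}
environment(obj$shift)   <- obj
environment(obj$unshift) <- obj
class(obj) <- c("betterstack", "stack")

shift(obj, 1)
obj$.Data       # 1 2 3
unshift(obj)    # 1
```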

      nb <- new_betterstack()
      push(nb, 1:3)
  
      nb$.Data
      [1] 1 2 3
  
      pop(nb) # from the back
      [1] 3
  
      unshift(nb) # from the front
      [1] 1
  
      shift(nb, 3)
      push(nb, 1)
      nb$.Data
      [1] 3 2 1
    

Conclusion

In this first installment on closures in R we covered a few of the basics: creating objects using environments, adding methods that act on private data, and even incorporating this into the traditional S3 landscape. Some simple usage patterns one may encounter include keeping track of 'static' data without relying on global variables (hint: create incr and decr methods for the .Data) or allowing for method overrides by instance.

In future articles we'll examine some of the more nuanced behavior of closures in general, as well as explore how R's implementation is different from implementations in other well-known programming languages.
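One way the incr and decr hint might be worked out is sketched below, following the same pattern as new_stack. The constructor name new_counter and the arguments start and by are assumptions made for this illustration.

```r
# new_counter(), start and by are hypothetical names for this sketch.
new_counter <- function(start = 0) {
  counter <- new.env()
  counter$.Data <- start
  counter$incr <- function(by = 1) .Data <<- .Data + by
  counter$decr <- function(by = 1) .Data <<- .Data - by
  environment(counter$incr) <- counter
  environment(counter$decr) <- counter
  class(counter) <- "counter"
  counter
}

hits <- new_counter()
hits$incr()
hits$incr()
hits$decr()
hits$.Data      # 1
```

The count lives inside the object, so no global variable is needed to keep it between calls.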

esotericR™ is edited and published by lemnica. It covers common parts of R in-depth, and examines the lesser known aspects of programming with R – from beginner to advanced. Submissions from authors, developers, and users are encouraged.

Source code and articles can be found at

www.lemnica.com/esotericR

keywords:
closures, object-oriented, environments

[1] A stack is a common data structure used in programming. It is based on the idea of last in, first out (LIFO). Typically stacks have methods that allow data to be pushed onto, and popped off of, the stack. A good visual analogy is that of a stack of dishes in a cafeteria line.

In functional languages like R, side-effects such as the in-place modification in a stack are discouraged — part of the notion of least surprise. Sometimes the reality of functionality must triumph over philosophy though.

Other articles explain S3 classes in detail, but for our needs it is sufficient to understand it as a lightweight mechanism used in R to provide function dispatch depending on the 'class' of an object.

Note that we needn't reimplement .Data, pop, or push. These are inherited from the original with new_stack(). This is a major benefit when dealing with complex structures that have many variables or methods. Stubs can be defined and methods can be overwritten by the child objects with ease.

Examples of both implementations can be found in the IBrokers package that interfaces the Interactive Brokers trading platform. See the twsConnect and eWrapper objects in the package on CRAN.

About the author

Jeffrey Ryan is the founder of lemnica corp., a Chicago firm specializing in statistical software, training, and on-demand support. He helps organize the R/Finance conference series [www.RinFinance.com], and is a frequent speaker on software-related topics. He is the author or co-author of a variety of R packages involving finance, large data, and visualizations including quantmod, xts, Defaults, IBrokers, RBerkeley, mmap, and indexing. He currently lives in Chicago, Illinois with his wife and three children.




Copyright 2011 lemnica, corp. All rights reserved.