Google
 

5/10/07

Function Objects in C++

Frequently we need to treat a function as a first class object in C++. One place these are frequently needed is in numerical methods code. For example a numerical integrator needs to somehow be handed a function to integrate. Some languages do this easily: for example functional languages such as Lisp and Haskell. But C++ is a little weaker in this regard.

Pointers to Functions

In the old days of C the way to do this was to hand in a pointer to a function. In fact this is exactly what is used throughout Numerical Recipes in C [2]. An example in that style is the following:


float f(float x) {
float y = 2;
float z = 3;
return x*x+y*y+z*z;
}

int main() {
. . .
. . .
integrate(&f,0,1);
. . .
}

But this has some problems. What if we have a function f() of three arguments but want to integrate it along the line y=2, z=3. We need to integrate a helper function

float f(float x,float y,float z) {
return x*x+y*y+z*z;
}

float g(float x) {
return f(x,y,z);
}

float y,z;

int main() {
. . .
. . .
y = 2;
z = 3;
integrate(&g,0,1);
. . .
}

Virtual Functions

That's a pretty ugly solution requiring a global variable. In order to lose the global variables a void * argument is added to the function to be integrated and to the integrate() function. The void * can be handed into the integrand which can dereference it to find a package containing extra arguments. A full example might look like:



struct Data {
float y,z;
};

float f(float x,void *d) {
Data *p = (Data *)d;
return x*x+p->y*p->y+p->z*p->z;
}

int main() {
. . .
. . .
Data d;
d.y = 2;
d.z = 3;
integrate(f,0,1,&d);
. . .
}

That's pretty horrible even though it's more or less the standard in the C world! In Scheme we can simply build a function on the fly like: (lambda (x) (f x y z)). Can we do something like this in C++?

C++ has a mechanism for handling pointers to functions that hides the fact that you are manipulating pointers to functions: virtual functions. Using these we can have our integrator function call a function whose identity isn't yet determined at runtime. Here's an example:



class Integrand {
public:
float evaluate(float) = 0;
float integrate(float x0,float x1) {
. . .
}
};

class F : public Integrand {
float y,z;
public:
Integrand(float y0,float z0) : y(y0), z(z0) {
}
float evaluate(float x) {
return x*x+y*y+z*z;
}
}

int main() {
. . .
. . .
F f(2,3);
integrate(f,0,1);
. . .
}

That's certainly much more elegant. But unfortunately it's authoritarian. It forces users of the integrate method to derive their functions from Integrand. What if you want to hand the same function to a numerical differentiator? Will you have to write the same function again, only derived from another base class? It would be nice if we could make completely generic function objects that can be used in multiple ways.

Generic Function Objects via Templates

The next approach is much more generic. It makes only one assumption about the function object that is passed in - that has operator() defined. (Incidentally, objects with such a method are often called functors or functoids. The latter is to be preferred as the former already has a perfectly good but quite different meaning.)


template
float integrate(const F &f,float x0,float x1) {
. . .
. . . f(x) . . .
. . .
}

class F {
float y,z;
public:
F(float y0,float z0) : y(y0), z(z0) { }
float operator()(float x) const {
return x*x+y*y+z*z;
}
};

int main() {
. . .
. . .
F f(2,3);
integrate(f,0,1);
. . .
}
This has many great advantages. The function object, f, can be pointer to a function: integrate(sin,0,1) would work for example Or it can be an object with operator() defined. One nice advantage over using virtual functions and function pointers is that it can use knowledge about the function object at runtime to optimise.

What we've actually made here is what computer scientists call a closure [3]. That's an aggregate made of a function and some of its arguments. We can't yet evaluate it because we don't yet know all of its arguments, so it's in a kind of suspended animation waiting for its final arguments.

Polymorphic Function Objects

But there's one more thing that would make the function objects even more useful. Suppose that our integral method had a whole battery of algorithms that it could throw at an integrand. Maybe one of those methods involves evaluating the function at complex arguments. As it stands this would be impossible. F only defines operator()(float) and so can only operate on floats. We'd like to write our function in a type independent way. What springs to mind is a polymorphic function object. Unfortunately there is no way to take the address of a polymorphic function. But there is a simple modification we can make to the class F above. Here it is:


class F {
float y,z;
public:
F(float y0,float z0) : y(y0), z(z0) { }
template
X operator()(X x) const {
return x*x+y*y+z*z;
}
};

With the help of a template member function F is truly polymorphic. Our integration algorithm can throw any type it likes at F, as long as the methods operator() uses understand. In fact, it's even possible for the integrate function to reconstruct a symbolic representation of the function for a large class of functions by applying operator() to symbolic types that have the arithmetic operators appropriately defined. With a little extra work we can do some real functional programming and even port Haskell code [1] directly to C++..

We have now come a long way from old-fashioned C function pointers.

References

[1] fc++: Functional Programming in C++
[2] Numerical Recipes
[3] Functions and Closures

No comments: