C++ Template Metaprogramming

C++ Template Metaprogramming (TMP)

Template metaprogramming is a metaprogramming technique in which templates are used by a compiler to generate temporary source code, which is merged by the compiler with the rest of the source code and then compiled. The output of these templates include compile-time constants, data structures, and complete functions. The use of templates can be thought of as compile-time execution.

The use of templates as a metaprogramming technique requires two distinct operations: a template must be defined, and a defined template must be instantiated. The template definition describes the generic form of the generated source code, and the instantiation causes a specific set of source code to be generated from the generic form in the template.

TMP is generally Turing-complete, meaning that any computation expressible by a computer program can be computed, in some form, by a template metaprogram. TMP have no mutable variables— that is, no variable can change value once it has been initialized, therefore template metaprogramming can be seen as a form of functional programming. In fact many template implementations only implement flow control through recursion.

C++ Template Metaprogramming: fundamentals

One of the key concepts  for writing Template Metaprogramming is to model your algorithm in recursion and it should be immutable. Let’s take evaluation of factorial, for example. We all know factorial of a number is the product of all positive integers till the number itself.

factorial(number) = 1 * 2 * 3 * … * (n-1) * n

Now being C++ programmer our first attempt would be to iterate:

static int factorial_imperative(int value)
{
    int factorial = 1;
    for(auto idx = value; idx > 1; --idx )
    {
        factorial *= idx;
    }
    return factorial;
}

This works, but has several limitations when it comes to converting it to metaprogram. First and foremost it mutates state, idx and factorial variables at line 4 and 6 respectively. Second it is not recursive, now recursion is primarily required if we want to avoid state mutation.

Keeping above considerations in mind, we will write another function that will be recursive in nature. Generally recursions are difficult to model and understand, primarily for two reasons:

  1. Very compact code, which is good as well as bad for it can be difficult to understand at times
  2. Understanding when to stop, specifying termination condition correctly

So here is the recursive and immutable version for evaluating factorial:

static int factorial_recursive(int value)
{
    return
            value <= 1 ? 1 :
            (value * factorial_recursive(value-1));
}

That was easy :), few observations though:

  1. we are not mutating any variable, even though we generate new values due to recursion but we are not mutating
  2. well defined termination condition, at line 4: value <= 1

Once we have recursive & immutable version of algorithm, it is very easy to convert that into TMP. I will first write out the TMP and then walk over various aspects of it and how it relates to previous algorithm.

//
//  in metaprogram every thing is a template-class/struct
//
//
template<int Num>
struct factorial_c
{
enum
    {
        value = Num * factorial_c<Num-1>::value
    };
};
//
//  terminating conditions
//
//  factorial of 1 is 1
//
template<>
struct factorial_c<1>
{
    enum
    {
        value = 1
    };
};
//
//
//  factorial of 0 is 1
//
template<>
struct factorial_c<0>
{
    enum
    {
        value = 1
    };
};
//
//
//
int main()
{
    auto factorial_5 = factorial_c<5>::value;
    std::cout << "factorial of 5 is "
              << factorial_5 << std::endl;
    return 0;
}

The above Template Metaprogram is almost equivalent to the recursive-immutable runtime version we wrote. Few observations:

  • Line 5 and 6 define a new template struct factorial_c which is parametrized by integers (C++ Templates can be parameterized by types as well as by integers).  Even though we have defined a template struct we are going to use it as if it were a function, see Line 43 for usage
  • Line 10: does the evaluation for us. Since C++ TMP  code is evaluated by compiler, which means every thing has to be compile time constant, we are forced to use enum or static const int, I prefer enums
  • We will be using this factorial_c TMP function (which is a struct as per what you see) whenever we want compiler to evaluate factorial of any number for us
  • Line 18, 19 & 23: This is where magic begins. Remember we talked about importance of termination conditions, this is our first termination condition for value of 1. What these lines means is we have already defined a general version of factorial_c but whenever compiler needs to evaluate factorial_c with Num being 1 it should pick this version, that’s why we have defined the enum with value in there, Line 23
  • Line 31 & 35: Similar to first termination condition we have one more here for evaluating factorial of 0
  • Line 43: This is where evaluation happens! This can be related to first call to recursive version, which will recurse till it’s termination conditions are met
  • Here is how the compiler evaluates the template recusions:

    factorial_5 = factorial_c<5>::value;
    factorial_5 = 5 * factorial_c<4>::value;
    factorial_5 = 5 * 4 * factorial_c<3>::value;
    factorial_5 = 5 * 4 * 3 * factorial_c<2>::value;
    factorial_5 = 5 * 4 * 3 * 2 * factorial_c<1>::value;
    factorial_5 = 5 * 4 * 3 * 2 * 1;
    factorial_5 = 120;
    

    It checks what is the number, it is 5 now, so none of special cases, known as template specializations, of factorial are applicable. As the number is not 0 or 1 it goes to general definition, which causes it to recurse as on Line 2. Same process is repeated till the Num with which factorial_c is parameterized is 1, Line 5, in which case the template specialization with Num being 1, Line 18,19 with previous code snippet, is invoked and the recursion stops.

    One must be wondering we have not reached 0, why do we want template specialization for 0 (Line 31 with previous code snippet), do I like to increase the number of lines of code (few idiots think that it is how one should measure how much you have done, that is sadistic approach to life and programming)? What happens if you want to evaluate factorial of 1 or 0?

    Few More Template Metaprograms

    Last section was short introduction to TMP, in this section I would like to put few more TMPs. Even though I am mostly writing algorithmic code here, by no means TMP is restricted to these. TMP is computation with types! not just numbers.

    Note: you cannot write TMPs with floats or doubles.

    gcd (greatest common divisor)

    In mathematics, the greatest common divisor (gcd), also known as the greatest common factor (gcf), or highest common factor (hcf), of two or more non-zero integers, is the largest positive integer that divides the numbers without a remainder. For example, the GCD of 8 and 12 is 4.

    Recursive & Immutable runtime code:

    static int gcd_recursive(int a, int b)
    {
        if( a == b )
        {
            return a;
        }
        //
        auto max = a > b ? a : b;
        auto min = a < b ? a : b;
        //
        //
        if( min == 0 )
        {
            return max;
        }
        return gcd_recursive(min, max % min);
    }
    

    TMP:

    //
    //  equivalent to max function
    //
    template<int One, int Two>
    struct  max_c
    {
        enum
        {
            value = One > Two ? One : Two
        };
    };
    //
    //
    //  equivalent to min function
    //
    template<int One, int Two>
    struct min_c
    {
        enum
        {
            value = One < Two ? One : Two
        };
    };
    //
    //
    //  TMP version of gcd
    //
    template<int One, int Two>
    struct gcd_c
    {
        enum
        {
            //  equivalent to:
            //  value = gcd(min, max % min)
            //
            value = gcd_c<  min_c<One, Two>::value,
                            max_c<One, Two>::value %
                            min_c<One,Two>::value>::value
        };
    };
    //
    //
    //  template specialization for
    //  Min being 0, this is our
    //  termination condition for recursion
    //
    template<int Max>
    struct gcd_c<Max, 0>
    {
        enum
        {
            value = Max
        };
    };
    

    lcm (least common multiple)

    In arithmetic and number theory, the least common multiple (also called the lowest common multiple or smallest common multiple) of two integers a and b, usually denoted by LCM(ab), is the smallest positive integer that is a multiple of both a and b.

    Formula for lcm is:

    lcm(one, two) = one * two / gcd(one, two)

    Recursive & Immutable runtime code:

    static int lcm_r(int one, int two)
    {
        return one * two / gcd_recursive(one, two);
    }
    

    TMP:

    template<int One, int Two>
    struct lcm_c
    {
        enum
        {
            value = One * Two / gcd_c<One, Two>::value
        };
    };
    

    As LCM can be evaluated using gcd also, we don’t care about terminating conditions specific to LCM, as it is not recursive at all! Who said that you couldn’t re-use TMPs. :).

    Parting Thoughts

    Well I don’t have summary for this post, as it is not yet finished. We have barely scratched the surface of this beast. C++ TMP has a complete language of its own, well thats what Boost.MPL library is all about. I will follow up on TMP with few more posts.