17 August, 2010

[C++ template] Are templates better?

I have written a lot of template code recently and used some amazing tricks to solve my problems. While I was studying those tricks, though, I sometimes felt that template programming was invented because C++ is not perfect. But... at other times I think the opposite: C++ is perfect because of its powerful template features.

The most attractive part of template programming to me is its run-time performance. The code is generated at compile time, which can eliminate a lot of branches at run time. If you compare template programming with C++ polymorphism, the two approaches differ from the very beginning: the first leverages the power of the compiler, while the second depends on run-time behavior.

So what benefits can template programming provide at compile time? To explain in detail, we need a real example.
Problem:
  1. Write a connection class with two types: client and server. The connection class has one method, "sayhi()": the server has to print out "server: hi" and the client has to print out "client: hi".
  2. Write a function whose parameter is a pointer to the connection class and whose job is to invoke the "sayhi()" method 10 times in a for-loop.
First, the straightforward answer, without using any template tricks.

Next, the same code rewritten with template programming techniques.

Let's see what the compiler can do for us. I compiled the source code at optimization level 3, and I will post part of the assembly code below for comparison:
ASM of the original source code:

At line #18 you can see a branch that corresponds to line #9 in the original source code. The branch condition sits inside the for-loop, and the compiler cannot optimize it away.
ASM of the template source code:

You can see that there is no branch in the template version, because the compiler can optimize it away. Moreover, the for-loop has been unrolled! The compiler did not unroll the first version, because inside the loop it cannot tell at compile time whether it is handling a server connection or a client connection.

This example shows how template programming reduces run-time overhead and makes the code easier for the compiler to optimize.
