A Motivating Discussion
A good friend of mine, Ken, has been giving interviews at his work, and expressed how many candidates struggled with basic questions about object-oriented programming in C++. He said they would be able to explain concepts like inheritance and polymorphism, but would choke when asked, for instance, "Why is the virtual keyword necessary? Why aren't all functions automatically virtual?".
There are a couple of valid answers to this question. One of them is related to software design: if a function is non-virtual in a base class, it cannot and should not be overridden in child classes. The other answer is related to performance: virtual functions are more costly to invoke than non-virtuals (via a base class pointer or reference). This second answer is one that interests me more because I believe many programmers don't really understand what makes virtual functions more costly.
When I first learned about virtual functions in C++, I remember being really impressed with this feature, because accomplishing something similar in C wasn't easy. A fairly simple form of polymorphism is possible in C by using structs of function pointers that would be set at runtime; good examples of these can be seen in Quake 2's source code
) where they are used to polymorphically invoke rendering functions via OpenGL or the software renderer. However, to mimic C++ inheritance and virtual functions in C is another story.
I eventually read about how C++ compilers generate virtual function tables to implement virtual functions; and with that mystery solved, I happily coded in C++ for years. But since this recent discussion with Ken, I thought it might be fun, and perhaps instructive, to try and implement basic inheritance and virtual functions in C. Bjarne Stroustrup's first version of C++ was actually called "C with Classes"
because he would first transform C++ code to C, then compile it with a regular C compiler. So I figured I could whip up a simple program in C that would show how virtual dispatch and inheritance could be implemented. I think understanding and being able to step through such code can give some insight into what the cost of virtual functions are.
There are two files: main.cpp
which contains a simple test case in C++, and main.c
which contains the equivalent implementation in C. Each can be compiled and run separately (tested: Visual Studio, GCC).
The code in main.cpp is simple: it declares 3 classes, Base, ChildOne which derives from Base, and ChildTwo which derives from ChildOne. There are 3 virtual functions declared and implemented in Base: the destructor, Func1, and Func2. ChildOne overrides Func1, while ChildTwo overrides Func2. Both child classes implement a destructor as well.
Here's the output from running an executable built from main.cpp:
The main C++ features that this sample demonstrates are:
- Inheritance: ChildTwo inherits from ChildOne, which inherits from Base, which means that ChildTwo's data members include those of its two bases.
- Constructor Chaining: when constructing, say, a ChildTwo instance, we see that Base's constructor gets invoked, followed by ChildOne's, and then ChildTwo's.
- (Virtual) Destructor chaining: destructors get automatically invoked in the opposite order of the constructors.
- Virtual Functions (Polymorphism): the most derived implementation of a virtual function gets called depending on the type of the instance pBase points to. For example, on an instance of ChildTwo, we see that calling Func1 invokes ChildOne's version, while calling Func2 invokes ChildTwo's version.
Now for the interesting part: the implementation in C, which can be seen in main.c. First, the output from running an executable built from main.c, which is nearly identical to the output from main.cpp:
Let's go through how each of the C++ features I listed above are implemented in C:
In C, there are no classes, only structs; and structs can only contain data, and cannot inherit from another type. So the three "classes" are implemented as structs:
Base base; // This is how we "inherit" from Base in C to have the same memory layout
ChildOne base; // This is how we "inherit" from Base in C to have the same memory layout
The important part is how ChildOne "derives" from Base by declaring a member of type Base as its first member, and ChildTwo "derives" from ChildOne in a similar fashion. This mimics how data gets laid out in memory for inherited types in C++, namely that data is laid out in the order its declared, starting from the most-base to the most-derived class. What this boils down to is pointers and address manipulation.
In C++, we can write the following:
ChildTwo* pChildTwo = new ChildTwo();
Base* pBase = pChildTwo;
We don't need to cast pChildTwo to Base* because semantically in C++, a derived type "is-a" base type. The fact that memory is laid out from base-most to child-most also means that the address of pChildTwo and that of pBase will be exactly the same. The compiler doesn't need to manipulate the address at all when assigning from child pointer to base pointer.
In C, by laying out the data to mimic C++, we can also assign from child to base, but must explicitly cast since the types are not actually related:
ChildTwo* pChildTwo = NewChildTwo();
Base* pBase = (Base*)pChildTwo;
This type of explicit up-casting is done in many places in main.c.
Constructors are special functions in C++ in that the compiler makes sure that parent constructors are invoked before child constructors up the chain. In fact, you can explicitly specify which parent constructor should be invoked via the initialization list of a child constructor:
Child::Child() : Base()
In C, there is no way to specify implicit function chains, so I mimic this behavior by explicitly invoking the parent constructor from the child constructor first:
void Base_Construct(Base* pThis)
pThis->a = 0;
pThis->b = 0.f;
void ChildOne_Construct(ChildOne* pThis)
Base_Construct((Base*)pThis); // Compiler would inject this here
pThis->c = 0;
void ChildTwo_Construct(ChildTwo* pThis)
ChildOne_Construct((ChildOne*)pThis); // Compiler would inject this here
pThis->c = 0;
pThis->d = 0;
As far as I know, all C++ compilers implement virtual functions by using the virtual function table, or vtable, mechanism. If a class declares or inherits at least one virtual function, the compiler adds a hidden member that is a pointer to a vtable. The vtable is simply an array of pointers to virtual functions. The compiler generates a vtable per class, where each entry points to the most derived implementation of a virtual function for that class. To mimic all this in C, I do pretty much the same thing a C++ compiler does, except I do it at runtime.
First some useful type declarations:
typedef void (*Destructor)(void*);
typedef void (*Func1)();
typedef float (*Func2)(int arg1);
The two enums are used to index the vtable by class and function respectively. The typedefs are used to cast the generic pointer type I use to hold the address of a function (ptrdiff_t*) to the specific function type (more on this below).
I then declare a pointer that will point to a table of all vtables, one per class:
ptrdiff_t*** g_allVTables = NULL;
That's a lot of asterisks! I use ptrdiff_t because it can portably hold an address; for instance, it will be 32 bits large when compiling on a 32 bit platform, and 64 bits large when compiling on a 64 bit platform. So we can use a ptrdiff_t* to point to a single virtual function. As we need a table, or array, of these pointers, we use a ptrdiff_t** per class. Finally, g_allVTables will point to the first entry in an array of all vtables, so it's a ptrdiff_t***.
Now to build our vtables, the very first function called in main is InitTables():
First we allocate a table of 3 vtable pointers and store that in the global variable g_allVTables:
// We need a pointer to a vtable per class
g_allVTables = (ptrdiff_t***)malloc( sizeof(ptrdiff_t**) * NUM_CLASSES );
Then we allocate each vtable - that is, 3 entries per table since we have 3 virtual functions:
// For each class, we allocate vtable - in our case, it's simple as we have a fixed number
// of virtual functions for each class.
for (i = 0; i < NUM_CLASSES; ++i)
g_allVTables[i] = (ptrdiff_t**)malloc(sizeof(ptrdiff_t**) * NUM_VFUNCS);
Finally, we populate the vtables by setting the most-derived version of the virtual function per class:
// Populate Base vtable entries
g_allVTables[CLASS_BASE][VFUNC_DESTRUCTOR] = (ptrdiff_t*)&Base_Destruct;
g_allVTables[CLASS_BASE][VFUNC_FUNC1] = (ptrdiff_t*)&Base_Func1;
g_allVTables[CLASS_BASE][VFUNC_FUNC2] = (ptrdiff_t*)&Base_Func2;
// Populate ChildOne vtable entries
g_allVTables[CLASS_CHILDONE][VFUNC_DESTRUCTOR] = (ptrdiff_t*)&ChildOne_Destruct;
g_allVTables[CLASS_CHILDONE][VFUNC_FUNC1] = (ptrdiff_t*)&ChildOne_Func1;
g_allVTables[CLASS_CHILDONE][VFUNC_FUNC2] = (ptrdiff_t*)&Base_Func2;
// Populate ChildTwo vtable entries
g_allVTables[CLASS_CHILDTWO][VFUNC_DESTRUCTOR] = (ptrdiff_t*)&ChildTwo_Destruct;
g_allVTables[CLASS_CHILDTWO][VFUNC_FUNC1] = (ptrdiff_t*)&ChildOne_Func1;
g_allVTables[CLASS_CHILDTWO][VFUNC_FUNC2] = (ptrdiff_t*)&ChildTwo_Func2;
Now that we have our vtables built, we need to make sure instances of each class use them. Let's take a look again at how the Base "class" was declared:
Notice there is a data member named vtable. Each instance of a Base, ChildOne, or ChildTwo will contain this member, and it must be set to point to the correct vtable for that class when the instance gets created. In C++, this happens when you invoke operator new like so:
pBase = new ChildTwo();
Operator new does a few things: allocates memory for the object, instantiates the vtable if necessary, and invokes the constructor. To do this in C, we write a function to instantiate, or "new", each type:
ChildTwo* pInstance = (ChildTwo*)malloc(sizeof(ChildTwo));
((Base*)pInstance)->vtable = g_allVTables[CLASS_CHILDTWO];
After allocating the memory for the instance, note how we initialize the vtable member with the corresponding class entry from the master table of vtables. We then invoke the class-specific constructor, and finally return the instance. The NewBase and NewChildOne functions are similarly implemented.
We now have everything we need to be able to invoke these virtual functions via a base pointer. For instance, in main, we 'new' a ChildTwo and invoke the 2 virtual functions on it like so:
pBase = (Base*)NewChildTwo(); // pBase = new ChildTwo();
((Func1)pBase->vtable[VFUNC_FUNC1])(); // pBase->Func1();
((Func2)pBase->vtable[VFUNC_FUNC2])(13); // pBase->Func2(13);
To invoke Func1, we first look up the address of the function in the instance's vtable, pBase->vtable[VFUNC_FUNC1]. This evaluates to a ptrdiff_t*, a pointer to the function address we setup in InitTables. We cast this pointer to a Func1 function pointer type so that we can invoke the function using the call operator (). You may recall that in InitTables, for ChildTwo, we set the VFUNC_FUNC1 entry in its vtable to point to ChildOne_Func1, which is the most-derived implementation of Func1 for ChildTwo; so that's the function that gets invoked. For the second call, ChildTwo_Func2 gets invoked.
Virtual Destructor Chaining
In C++, the destructor for a type gets invoked when the instance gets destroyed via operator delete. If the type has a vtable, the destructor should be declared virtual so that the most derived destructor gets invoked. Similar to constructors, the destructors are automatically chained, except destruction happens in the opposite order: children before parents.
In my C code, the destructor is just another virtual function added to the vtables in InitTables, for example:
g_allVTables[CLASS_CHILDTWO][VFUNC_DESTRUCTOR] = (ptrdiff_t*)&ChildTwo_Destruct;
The analog to operator delete is a single function used to delete all 3 class types:
void DeleteBase(Base* pThis)
if (pThis != 0)
The invocation of the most-derived destructor is handled exactly the same way as for any virtual function. As for chaining destructors, we must do it manually within the destructor functions like so:
void ChildTwo_Destruct(ChildTwo* pThis)
ChildOne_Destruct((ChildOne*)pThis); // Compiler would inject this here
Notice how the parent destructor, ChildOne_Destruct, is invoked last to mimic the C++-style chaining. ChildOne_Destruct, in turn, invokes Base_Destruct in a similar fashion.
With this very basic implementation of C++ in C, we can now see more clearly what the extra cost of invoking a virtual function is. With a regular function call, the compiler (or linker) already knows which function to call, so it simply injects instructions to jump to the specific function address. On the other hand, invoking a virtual function requires 2 extra pointer dereferences, or lookups: first we dereference the vtable pointer to access the class-specific vtable, and then we dereference the function pointer for the function being invoked to get its address. So the compiler must inject code to perform these two lookups, followed by the jump to the dynamically resolved function address.
On its own, two extra memory lookups may not seem very expensive, but they can be when they involve instruction cache misses. I'm not a computer architecture expert, but I know that on certain platforms, cache misses can cost orders of magnitude more cycles than a regular instruction. So hopefully, this helps answer the question "Why is the virtual keyword necessary? Why aren't all functions automatically virtual?".
Thanks for reading!
Posted By Antonio Maiorano to Video Game Coding
at 3/24/2014 08:54:00 AM