Implementing Polymorphism (behind the scenes)
Despite the general opinion it is not a Dr.Strange kind of magic.
As a matter of fact it’s quite simple and clever way of exploiting the language features to create a powerful weapon.
The long story short; polymorphism is implemented by a bunch of pointers behind the scenes.
Don’t be startled it is quite simple actually. Let’s dive in!
Since it’s a mid level object oriented programming language, I’ll use the C++ to depict the concepts better.
As you know, all of the base classes are considered as an “interface” to an object in oop. By deriving this interface, you can constitute new classes and objects from it so that “at least” as the same size as the base class and does the same or similar jobs. But you are limited to what the interface gives you.(that’s the key point to remember)
In oop. we are allowed to change the inherited method’s behavior in general and it is called overriding. Furthermore it’s quite common to have such a scenario that a base class is derived by multiple sub-classes and the default behavior of the interface’s methods have been changed by the more appropriate ones per sub-classes. In this scenario when we use the interface to access to the sub-class, we’d except that the overrided methods to be called instead of the defined ones in the interface. But how this works, really?
In theory, we could use pointers to point to the appropriate methods of the objects and we could use pointers to call the functions instead of the direct function calls.
Indeed, this is how polymorphism works behind the scenes, the compiler itself creates such pointers for us for the methods that’s declared as “virtual”. Look at the code below!
#include <iostream>using namespace std;class Base
{
public:
virtual void function1() {cout<<"BASE::f1()" << endl;};
virtual void function2() {cout<<"BASE::f2()" << endl;};
};
class D1: public Base
{
public:
void function1() override {cout<<"D1::f1()" << endl;};
};class D2: public Base
{
public:
void function2() override {cout<<"D2::f2()" << endl;};
};
int main()
{
Base base;
base.function1(); // calls the Base::function1();
base.function2(); // calls the Base::function2();
D1 d1;
d1.function1(); // calls the D1::function1();
d1.function2(); // calls the Base::function2();
D2 d2;
d2.function1(); // calls the Base::function1();
d2.function2(); // calls the D2::function2();
return 0;
}
The way that this could work is done by the compiler itself.
There is a concept called “virtual-lookup-table” which is created for each classes that has “virtual” functions declared and for its derived classes.
This lookup table simply consist of <function_name — pointer> pairs.
When a function is called by an object, the compiler look for the method name in the virtual-look-up table (if defined one) and later in the class itself. If it’s found in the table, it fetches the address(location) of the method and calls it.
At the beginning I said that “we are limited to the interface.” So the best place to define the pointer to the lookup table is inside the “interface”.
class Base
{
public:
VirtualTable* __vptr;
virtual void function1() {};
virtual void function2() {};
};
class D1: public Base
{
public:
void function1() override {};
};
class D2: public Base
{
public:
void function2() override {};
};
here the “__vptr” is inherited all of the derived classes (D1 and D2 here).
- When we create an object from Base, compiler also defines a lookup table for Base(only) and assign the address of the table to the __vptr and then fills the table with the names of function1() and function2() with their corresponding function addresses by applying the name-lookup.
- The compiler does the same operation for D1 and D2 here but when it fills the table it places the function addresses by looking up from the current class scope to its base classes and places the first encountered function address. This is simply called name-lookup.
D1 d1;
Base* pd1 = &d1; // pd1->__vptr points to the D1’s virtual tableD2 d2;
Base* pd2 = &d2; // pd1->__vptr points to the D2’s virtual table
Efficiency Issues
As you may expect, virtual functions comes with a small cost compared to its benefits.
Memory Complexity
- Lookup-tables only created for the classes that has a function defined virtual, so you not need to worry about other classes.
- Where it is defined, it increases the size of the class +1 pointer size(__vptr) and allocates the memory for virtual-lookup-table that consist of all (virtual_function_name <—> address) pairs that’s defined or inherited by the class.
Time Complexity
- Accessing functions other than virtual simply cost a function call overhead.
- Accessing virtual functions costs 1 search operation and 2 address resolutions. First for the lookup-table, we access it via __vptr and we look for the function name in the table(linear search) and another address resolution for the corresponding function address to call.
References
[20.3.2 Virtual Functions] https://www.amazon.com/C-Programming-Language-4th/dp/0321563840
https://en.wikipedia.org/wiki/Virtual_method_table
https://www.learncpp.com/cpp-tutorial/the-virtual-table/