People have been using C++ for a long time now and most of us think that we are pretty well versed with it. Interestingly enough, as we spend more time with something, we keep discovering more things about it. I love it when the thing you have been using for so long turns out to have powerful hidden features. I encounter a lot of C++ in my day-to-day life and so I end up spending a lot of time with it. Over time, as I delved deeper into C++, I came across some abstruse features, some of which really surprised me! So I thought I should write about them. Let’s see what they are.
What’s so “vexing” about the most vexing parse?
This one is fairly well known in the programming world. The term “most vexing parse” was coined by Scott Meyers for an ambiguity in C++ declaration syntax that leads to counter-intuitive behavior. Consider the following piece of code:
int myName(int (x));
This can be interpreted in two ways:
- Variable: myName is a variable of type ‘int’ initialized to ‘x’ (which is cast to an int)
- Function: myName is a function that returns an int and has one argument, which is an int named ‘x’
The C++ standard requires it to be interpreted as the second. Another slightly more complicated example would be:
class Bank {
public:
    Bank();
};

class Accountant {
public:
    Accountant(const Bank& myBank);
    int get_status();
};

int main() {
    Accountant myAccountant(Bank());
    return myAccountant.get_status();
}
Here, we are declaring two classes, and the constructor of Accountant takes a Bank. In main, we try to construct an Accountant from a temporary Bank. This looks fine, right? What can be the problem? Well, consider the following line:
Accountant myAccountant(Bank());
This could be interpreted as one of the following two cases:
- A variable definition for the variable myAccountant of class Accountant, being passed an anonymous instance of class Bank
- A function declaration for a function myAccountant which returns an object of type Accountant. It takes a single unnamed argument which is a function returning type Bank (and taking no input).
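Most programmers expect the first, but the C++ standard requires the second. As a result, any later attempt to call a member function on myAccountant fails to compile, because myAccountant names a function, not an object. This ambiguity likely accounts for many developer hours spent hunting for the cause of such errors.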
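A common way around the ambiguity is to write the initializer so that it cannot be read as a parameter declaration. The following is a minimal sketch, assuming C++11 or later for the brace form. The empty constructor bodies, the value returned by get_status(), and the helper name status_both_ways are placeholders made up for illustration.

```cpp
#include <cassert>

class Bank {
public:
    Bank() {}
};

class Accountant {
public:
    explicit Accountant(const Bank&) {}
    int get_status() const { return 0; }  // placeholder value, not from the original
};

int status_both_ways() {
    // Brace initialization (C++11): braces never start a parameter list,
    // so this can only be an object definition.
    Accountant withBraces{Bank{}};

    // Older alternative: the extra parentheses turn the argument into an
    // expression, which cannot be parsed as a parameter declaration.
    Accountant withParens((Bank()));

    assert(withBraces.get_status() == withParens.get_status());
    return withBraces.get_status();
}
```

Both variables here are real objects, so calling get_status() on them compiles.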
The Pointer Surprise
As we all know, we can access the fifth element of an array using ptr[4]. Accessing an element like this is actually just shorthand for *(ptr+4). But we knew that already! Now the interesting thing is that, because addition is commutative, this can be equivalently written as *(4+ptr) and therefore as 4[ptr]. This turns out to be completely valid code. Check it out yourself if you want; the compiler won’t throw any kind of error. For example, you can compile the following code:
#include <iostream>
using namespace std;

int main() {
    char s[6] = "hello";  // five letters plus the terminating '\0'
    cout << s[0] << endl;
    cout << 0[s] << endl;
    return 0;
}
When you compile this, you won’t get any error. You will get the letter ‘h’ printed on the terminal twice.
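The equivalence can also be checked on addresses rather than on printed characters. This is a small sketch; the function names all_forms_agree and fifth_letter are made up for the example.

```cpp
#include <cassert>
#include <cstddef>

// Returns true when all four spellings refer to the same element.
bool all_forms_agree(const char* s, std::size_t i) {
    return &s[i] == &i[s] && &s[i] == s + i && &s[i] == i + s;
}

char fifth_letter() {
    const char s[] = "hello";
    assert(all_forms_agree(s, 4));
    return 4[s];  // the same element as s[4]
}
```

The built-in subscript operator is defined in terms of pointer addition, which is why the operands can be swapped. This does not carry over to classes that overload operator[], such as std::vector or std::string.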
An evil programmer’s friend: Redefining keywords
First of all, let me go on record and say that redefining keywords is bad programming practice. The standard does not permit macros named after keywords in a translation unit that includes standard library headers, but compilers generally allow it in practice. This lets you do fun bug-introducing stuff like #define true false or #define else.
#define int float
#define float char
We can do things like these and get away with it. Now why on earth would a programming language allow that? Well, as it turns out, this is actually useful in some situations. Let’s say we are using a large library and we need to get at members that it declares private. So we basically need to override the C++ access protection mechanism. One way to do this would be to just patch the library. But let’s say we don’t want to do that. How would we solve this issue? Well, we can just turn off access protection before including the headers for the library. Although we need to remember that we should turn the protection back on once we are done!
#define private public
#include "mylibrary.h"
#undef private
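Here is a self-contained sketch of the same trick, with a made-up class Vault standing in for the contents of the library header. It relies on behavior that compilers tolerate rather than on anything the standard guarantees, so treat it as a demonstration only.

```cpp
#include <cassert>  // standard headers go before the macro, never after it

// Stand-in for the contents of "mylibrary.h" (hypothetical library class).
#define private public
class Vault {
private:
    int secret_ = 42;
};
#undef private

int peek(const Vault& vault) {
    // Compiles only because "private:" above was rewritten to "public:".
    return vault.secret_;
}
```

Every translation unit that sees the class has to see it with the same macro in effect; mixing the two views of the same class in one program is asking for trouble.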
I hear there’s a “new” alternative
There is an alternate syntax for the new operator that constructs an object in place, in memory that has already been allocated. This is called “placement new”. The memory in question is assumed to be of the correct size and to have the correct alignment. Placement new allocates nothing itself; it only calls the constructor, which for polymorphic types also involves setting up the vtable pointer.
#include <cstdlib>
#include <iostream>
#include <new>
using namespace std;

struct Animal {
    int numLegs;
    Animal() { cout << "Animal::Animal()" << endl; }
    ~Animal() { cout << "Animal::~Animal()" << endl; }
};

int main() {
    // Allocating our own memory
    Animal *ptr = (Animal *)malloc(sizeof(Animal));
    // Use placement new
    new (ptr) Animal;
    // We need to call the destructor ourselves
    ptr->~Animal();
    // We need to release the memory ourselves
    free(ptr);
    return 0;
}
This seems to be a roundabout way of doing the same thing, right? We can just use “new” and move on with our lives. When is it ever useful? Well, placement new is used when we are writing custom allocators for performance-critical systems. Let’s say we have a slab allocator that starts with a single large chunk of memory. If we allocated every object separately using malloc, we would run into the problems of memory fragmentation and the overhead of heap traversal. So we use placement new to construct objects sequentially within the chunk, which helps us avoid those problems.
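To make the idea concrete, here is a minimal sketch of a fixed-size bump arena, which is a simpler cousin of a slab allocator and not a full implementation of one. The class Arena, its 256-byte buffer, and the Animal constructor taking a leg count are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// A fixed-size bump arena: objects are constructed back to back in one buffer.
class Arena {
public:
    // Constructs a T in the next suitably aligned slot, or returns nullptr
    // when the buffer has no room left.
    template <typename T, typename... Args>
    T* create(Args&&... args) {
        static_assert(alignof(T) <= alignof(std::max_align_t), "over-aligned type");
        const std::size_t start = (used_ + alignof(T) - 1) / alignof(T) * alignof(T);
        if (start + sizeof(T) > sizeof(buffer_)) return nullptr;
        used_ = start + sizeof(T);
        return new (buffer_ + start) T(std::forward<Args>(args)...);  // placement new
    }

private:
    alignas(std::max_align_t) unsigned char buffer_[256];
    std::size_t used_ = 0;
};

struct Animal {
    int numLegs;
    explicit Animal(int legs) : numLegs(legs) {}
};

int total_legs() {
    Arena arena;
    Animal* cat = arena.create<Animal>(4);
    Animal* bird = arena.create<Animal>(2);
    assert(cat != nullptr && bird != nullptr);
    const int total = cat->numLegs + bird->numLegs;
    // The arena only owns raw bytes, so destructors are still called by hand.
    bird->~Animal();
    cat->~Animal();
    return total;
}
```

Each call to create() only bumps an offset, so there is no per-object heap bookkeeping. The price is that individual objects cannot be freed; the whole buffer goes away at once.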
Did you say variable declaration inside a conditional statement?
C++ contains a syntactical shorthand for simultaneously declaring a variable and branching on its value. What this means is that you can declare a single variable inside the condition of an ‘if’ statement. The branch is taken if the variable converts to true, and the variable stays in scope for the rest of the if/else chain.
#include <iostream>
using namespace std;

struct Books { virtual ~Books() {} };
struct Drama : Books { int x, y; };
struct Fantasy : Books { int key; };

void log(Books *books) {
    if (Drama *drama = dynamic_cast<Drama *>(books))
        cout << "Drama " << drama->x << " " << drama->y << endl;
    else if (Fantasy *fantasy = dynamic_cast<Fantasy *>(books))
        cout << "Fantasy " << fantasy->key << endl;
    else
        cout << "Books" << endl;
}
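The same shorthand works with any function that returns a pointer, not just dynamic_cast. The sketch below uses a made-up settings table and a hypothetical helper find_setting to show the usual "look it up, use it if found" pattern.

```cpp
#include <cassert>
#include <cstring>

struct Setting {
    const char* name;
    int value;
};

static const Setting kSettings[] = {{"width", 80}, {"height", 24}};

// Hypothetical lookup helper: returns a pointer to the setting, or nullptr.
const Setting* find_setting(const char* name) {
    assert(name != nullptr);
    for (const Setting& s : kSettings)
        if (std::strcmp(s.name, name) == 0) return &s;
    return nullptr;
}

int value_or(const char* name, int fallback) {
    // 'found' exists only inside this if/else, so no stale pointer leaks out.
    if (const Setting* found = find_setting(name))
        return found->value;
    else
        return fallback;
}
```

Keeping the pointer scoped to the branch that tested it makes it harder to dereference it by accident later in the function.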
Overloading a member function based on value category, not the input args
In C++, you can overload a member function based on the value category of the object it is called on. You can achieve this using a ref-qualifier, available since C++11. It sits in the same position as a cv-qualifier and affects overload resolution depending on whether the object that this points to is an lvalue or an rvalue:
#include <iostream>

class MyClass {
public:
    void myFunc() & { std::cout << "lvalue" << std::endl; }
    void myFunc() && { std::cout << "rvalue" << std::endl; }
};

int main() {
    MyClass myClass;
    myClass.myFunc();    // Prints "lvalue"
    MyClass().myFunc();  // Prints "rvalue"
    return 0;
}
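One place where this pays off is a getter that can give its data away when the object is a temporary. The following is a sketch under that assumption; the class Message and the two helper functions are invented for the example.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Message {
public:
    explicit Message(std::string text) : text_(std::move(text)) {}

    // Called on lvalues: hand out a reference, since the object lives on.
    const std::string& text() const & { return text_; }

    // Called on rvalues: the object is about to go away, so move the string out.
    std::string text() && { return std::move(text_); }

private:
    std::string text_;
};

std::string text_of_temporary() {
    return Message("rvalue path").text();  // picks the && overload
}

std::string text_of_named() {
    Message message("lvalue path");
    const std::string& view = message.text();  // picks the const & overload
    assert(&view == &message.text());          // same stored string, no copy
    return view;
}
```

Without the && overload, calling text() on a temporary would return a reference into an object that is destroyed at the end of the full expression, which is easy to misuse.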
Members need pointers too
Let’s say we need to describe a pointer to a member of any instance of a class. How do we do it? As we all know, we use the pointer-to-member operators. There are two pointer-to-member operators, .* for values and ->* for pointers:
#include <iostream>
using namespace std;

class MyClass {
public:
    int num;
    void myFunc() {}
};
For the record, I don’t condone making variables public in a class. This is for demonstration purposes only, and I didn’t want to obscure it by creating a bunch of getters and setters.
// We have the extra "MyClass::" in the pointer type
int MyClass::*ptrNum = &MyClass::num;
void (MyClass::*ptrFunc)() = &MyClass::myFunc;

int main() {
    MyClass myClass;
    MyClass *ptr = new MyClass;

    // Call the stored member function
    (myClass.*ptrFunc)();
    (ptr->*ptrFunc)();

    // Set the variable in the stored member slot
    myClass.*ptrNum = 1;
    ptr->*ptrNum = 2;

    delete ptr;
    return 0;
}
This looks eerily similar to a regular function pointer, right? I mean, how is it any different? Well, there is actually a difference between member function pointers and regular function pointers. Casting between a member function pointer and a regular function pointer will not work. Also, member function pointers may be up to four times larger than regular pointers. Depending on the compiler, a member function pointer may need to store the address of the function body, the offset to the correct base during multiple inheritance, the index of another offset in the vtable during virtual inheritance, and even the offset of the vtable inside the object itself for forward declared types. Pointers to members are particularly useful for writing libraries.
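As an illustration of the library use case, a pointer to a data member lets one function work on whichever field the caller picks. This is a small sketch; the struct Stats and the functions total and season_total are made up for the example.

```cpp
#include <cassert>
#include <cstddef>

struct Stats {
    int wins;
    int losses;
};

// Sums one field over an array; the caller chooses the field.
int total(const Stats* items, std::size_t count, int Stats::*field) {
    assert(items != nullptr || count == 0);
    int sum = 0;
    for (std::size_t i = 0; i < count; ++i)
        sum += items[i].*field;
    return sum;
}

int season_total(int Stats::*field) {
    const Stats season[] = {{3, 1}, {2, 2}, {4, 0}};
    return total(season, 3, field);
}
```

Calling season_total(&Stats::wins) and season_total(&Stats::losses) reuses the same loop for both fields, with no need for a separate function per member.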