From Simple Techniques to Design Patterns in C++ — Part I

Published Feb 26, 2021. Last updated Aug 24, 2021.

This series of articles introduces a simple technique that most experienced developers use frequently and proposes multiple ways to expand it into compelling design patterns that solve various problems. This first article's target audience is the beginner C++ developer who has already started implementing real projects (even small ones) and noticed that things quickly get complicated as code expands…
Things will get more advanced in the next articles. While exploring the main subject, we will often drift and talk about good practices and other techniques. At the end of the article, I summarize all the covered topics.
Experienced developers will probably find this introductory material too simple. The next articles in the series will build on this one to present more advanced concepts and links with Design Patterns in a practical way.

The next article is online at this location: Part II. It explains the Abstract Factory pattern: why, when, and how to use it. In the process, it introduces Bob, the product manager, and how to deal with ever-changing requirements.

Introducing the Pimpl

The pimpl, also known as the Compiler Firewall or Envelope/Letter idiom, was introduced by James O. Coplien in the 1992 book Advanced C++ Programming Styles and Idioms. Pimpl is short for "Private Implementation" or "Pointer to Implementation". The idea seems pretty awkward or counterproductive at first. But the benefits are tremendous when the size of the project grows. It offers many opportunities to introduce very interesting Design Patterns. It is quite popular and used extensively in the Qt framework, for instance. We will see in this first article how it works and what kind of problem it solves. The following articles will explore new software design opportunities and expand on it to solve various design problems elegantly.

If you were not exposed to this technique before, your first reaction would probably be "But… WHY?!?!!". It indeed looks like we are introducing some complexity instead of making things cleaner. But bear with me, there are many excellent reasons to do that. I will explain them in detail in the article. At the end, you will be able to make a sound decision about your class, when to use the pimpl, and how to implement it efficiently.

Just a Few Words...

Before reading this (rather long) article, I would like to clarify a few things. Words are essential, as Albert Camus said:

To name things wrongly is to add to the misfortune of the world.

I like this quote and how it applies to computer programming. We all know how hard it is to find meaningful names for variables (I think it is a science by itself 😃 ). But here, I would like to define a few words that I will reuse to be sure that they won't cause any misunderstanding.

Declaration

A declaration is a statement that introduces an entity (a function, struct, class, or namespace, for instance) without providing its implementation.
It provides information about its type and other characteristics (the arguments of a function, for example). For instance:

void display( Shape & s );

... is the declaration of a function. Every entity must be declared before being used.
Often, declarations are in the header file (.h). But not always: header files can also provide some implementation, and some declarations go into the CPP file. In that case, you can only use them inside that CPP file.

Definition

A definition of an entity, such as a function, is the code itself of the function: its body. For functions, definitions are typically stored in the CPP file (or inlined in the header).
In the case of classes, the class definition is often in the header file, and its implementation in the CPP file.
There are many exceptions: inline functions or templates are frequently defined in headers.

In case of a class:

struct Point;

is a declaration, and:

struct Point {
  float x;
  float y;
  void translate( float dx, float dy );
};

...is the definition of the class.

This definition has the declaration of the translate function. Its definition (implementation) could be in the CPP file.
There are no hard rules on where the definitions and the declarations should be located. Try to keep your headers as small as possible. And keep in mind that the game's rule is to include as few files as possible.
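As a minimal sketch of the vocabulary above, the following single file plays the role of a header and a CPP file at once. The free function moveRight and its precondition are hypothetical and only serve the illustration.

```cpp
#include <cassert>

struct Point;                            // Declaration only: Point is an incomplete type here.

// A declaration of Point is enough to mention it through a reference.
void moveRight( Point & p, float dx );   // Function declaration (no body).

struct Point {                           // Definition of the class: its layout is now known.
    float x;
    float y;
    void translate( float dx, float dy );   // Declaration of a member function.
};

// Definition of the member function; it would typically live in the CPP file.
void Point::translate( float dx, float dy ) {
    x += dx;
    y += dy;
}

// Definition of the free function declared above.
void moveRight( Point & p, float dx ) {
    assert( dx >= 0.0f );                // Hypothetical precondition, only for the sketch.
    p.translate( dx, 0.0f );
}
```

Note that moveRight could be declared before Point was defined, because a reference to an incomplete type is allowed in a declaration. Its definition, which calls translate, needs the full class definition.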

Composition

An object A uses composition when it embeds other objects. All the embedded objects are destroyed when A is destroyed. They belong to A, and their lifecycle depends on A. This embedding can be done by declaring those objects directly as members of A. A can also create an instance of another object in its constructor and delete it in its destructor. For instance:

#include <B.h>
class A {
public:
   ...
private:
  B m_b;
};

or

#include <B.h>
class A{
public:
     A() : m_b(new B) {}
     ~A() { delete m_b; }
private:
    B *m_b;   // Owned by A: created in the constructor, deleted in the destructor.
};

We say that composition models a "has a" relationship. In our case, A has a B.

Aggregation

Aggregation is a way for an object to reuse another object without controlling its lifetime. An object can take another object as a pointer or reference in the constructor. For instance:

class B;
class A
{
public:
     A( B * b ): m_b( b ) {}
private:
      B *m_b;   // A does not own m_b.
};

or, using a reference:

class B;
class A
{
public:
      A( B & b ) : m_b( b ) {}
private:
    B &m_b;   // A does not own m_b.
};

It models a "uses a" relationship. We can say that A uses a B.

Delegation

Delegation is the simple fact for a class to delegate its work to another class. It can use Aggregation or composition.
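Here is a small sketch of delegation in both flavors. The Engine, Car, and Dashboard classes are hypothetical names chosen for the illustration; they are not part of the Rectangle example used in the rest of the article.

```cpp
#include <cassert>

// Hypothetical worker class that does the actual job.
class Engine {
public:
    void setRpm( int rpm ) {
        assert( rpm >= 0 );   // Hypothetical precondition for the sketch.
        m_rpm = rpm;
    }
    int rpm() const { return m_rpm; }
private:
    int m_rpm = 0;
};

// Delegation through composition: Car owns its Engine.
class Car {
public:
    void accelerate( int rpm ) { m_engine.setRpm( rpm ); }   // The work is delegated.
    int rpm() const { return m_engine.rpm(); }
private:
    Engine m_engine;          // Lives and dies with the Car.
};

// Delegation through aggregation: Dashboard only uses an Engine owned elsewhere.
class Dashboard {
public:
    explicit Dashboard( const Engine & e ) : m_engine( e ) {}
    int displayedRpm() const { return m_engine.rpm(); }      // Same delegation.
private:
    const Engine & m_engine;  // Not owned: the Engine must outlive the Dashboard.
};
```

In both cases the outer class exposes a function and forwards the call. Only the ownership of the worker object differs.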

Ok, enough with vocabulary. Let's move on!

Basic Rectangle Class

Let's introduce our friend, the Rectangle. It is a simple class modeling a simple axis-aligned rectangle. Its implementation details are not very important. I will only propose a few functions to get an idea about how it works. And yes, I know, the technique I will present is overkill for such a toy class, but I am sure you can extrapolate to a much bigger class you already had to implement. But as an example for this article, it is a perfect fit.
Here is the header:

#include <Point.h> // Necessary include, we have a Point in the private section.

class Rectangle
{
public: 
    Rectangle( Point p, float w, float h ); // Create with a point, width and height.
    ~Rectangle();
    
    void render() const;
    void translate( float dx, float dy );
    
    float getWidth() const;

    // Some other rectangle function...
    // ...

private:
    Point m_p; // Upper-left point.
    float m_w; // width
    float m_h; // height
};

And here is the CPP:

#include "Rectangle.h"

Rectangle::Rectangle( Point p, float w, float h )
    : m_p(p)
    , m_w(w)
    , m_h(h)
{}
    
void Rectangle::render() const {
    // implement rendering with OpenGL, for instance.
}
void Rectangle::translate( float dx, float dy ) {
    m_p.x += dx;
    m_p.y += dy;
}

float Rectangle::getWidth() const {
    return m_w;
}

// Some other rectangle function...
// ...

There are many ways to model a Rectangle: we chose to store the upper-left point, the width, and the height. We could also store the center, the half-width, and the half-height. At this point, we have decided on our design and implemented this first version in the project.

The class using a Rectangle (let's call it the Client class) could use it by composition or aggregation. It would have either a Rectangle, a pointer (smart or not) to a Rectangle, or a reference to a Rectangle. For instance:

#include "Rectangle.h" // I need to include because I use a rectangle.
class Client
{
public:
   // ...
private:
   Rectangle m_rect;  // Here, I directly use a rectangle (composition).
};

We have no choice: we must #include "Rectangle.h" in our header because we use the Rectangle directly. Because the client embeds a Rectangle, the compiler must know its size and how to create and destroy one. So the client must have access to the Rectangle definition (the header).

If we choose to use a reference (aggregation):

class Rectangle; // Here, no include because we use a pointer or a reference.
                 // We just declare Rectangle.
class Client
{
public:
    Client( Rectangle & r ); // Will be used to initialize m_rectRef.
    
    /// ... Client Functions....
private:
    Rectangle &m_rectRef; // Initialized with r in constructor. 
};

And the implementation:

#include "Client.h"
#include "Rectangle.h" // Here we use Rectangle functions, 
                       // we need its definition.
Client::Client( Rectangle & r )
    : m_rectRef( r )
{}
/// More functions of Client Using Rectangle...

A pointer is like a reference in this regard: a simple declaration is necessary, and the include must be on the CPP file.
Anyways, the client somehow makes use of our Rectangle:

[Figure: It seems simple enough...]

The implementation of our Rectangle seems simple enough. We are using it all over our big codebase. Then, someone changes the way Rectangle works internally for some good reason. Without changing the interface, the implementation is modified to store 2 points, upper left and lower right, instead of upper left, width, and height. Nothing changes for the clients. Still, your whole codebase recompiles…

Know Your Enemy

Before discussing dependencies and other rather technical things, let's see one of the problem's roots: coupling.

Reducing coupling is a general principle in software architecture, and you will make many decisions to keep it to a minimum. Coupling is bad, really bad. Tight coupling between 2 classes means that a change in one class has consequences for the other. There are many kinds of relationships between two entities: a function call, access to members of a class, including headers, inheriting, friendship, and so on...

An object needs to know at least a little about another one it uses to do its job: its API, or the class it inherits from, for instance. The less it needs to know, the better. We will discuss the coupling created when an object needs to use another. We will not cover inheritance here; it could be an interesting topic for another article.

The Simple Pimpl, an Example.

Instead of merely implementing a class with a public interface and a private implementation, you split it in two to have an interface class with a public interface and a private pointer to an implementation class to forward all its function calls.

Here is a pimpl version of our Rectangle class:


#include <memory>     // std::unique_ptr
#include "Point.h"    // Point is passed by value in the constructor.

class RectangleImpl;  // We only declare it; we do not include its definition!

class Rectangle
{
public:
    Rectangle( Point p, float w, float h );
    ~Rectangle();  
    
    void render() const;
    void translate( float dx, float dy );
    
    float getWidth() const;

    // Some other rectangle function...
    // ...
      
private:
    std::unique_ptr< RectangleImpl > m_pimpl; // Here is the pointer
                                               // to the implementation.
};

Then the implementation:

#include "Rectangle.h"       // Here I need to include the header.

// The implementation can be in its own header/cpp files.
// I put the whole class here with inline implementation for brevity. 
class RectangleImpl
{
public:
    RectangleImpl( Point p, float w, float h )
        : m_p(p)
        , m_w(w)
        , m_h(h)
    {}
    
    void render() const {
        // implement rendering with OpenGL, for instance.
    }
    void translate( float dx, float dy ) {
        m_p.x += dx;
        m_p.y += dy;
    }
    
    float getWidth() const {
        return m_w;
    }

    // Some other rectangle function...
    // ...
     
private:
    Point m_p;
    float m_w;
    float m_h;
};


// Here is the implementation of the Rectangle class.
// It just forwards all calls to its implementation.

Rectangle::Rectangle( Point p, float w, float h )
  : m_pimpl( std::make_unique< RectangleImpl >( p, w, h ) ) // Implementation creation.
{}

Rectangle::~Rectangle() = default; // We will explain later why the destructor is defined here.

void Rectangle::render() const
{
    m_pimpl->render();    // Using delegation.
}

void Rectangle::translate( float dx, float dy )  // Using delegation.
{
    m_pimpl->translate( dx, dy );
}

float Rectangle::getWidth() const
{
    return m_pimpl->getWidth(); // Using delegation.
}
 
 // Now you get the idea for the other functions.
 // ...

The client using your Rectangle class would use it either by composition or aggregation. This straightforward schema would depict it:

[Figure: pimpl-rect.png]

The class in the middle hides its implementation thoroughly but behaves the same as the basic Rectangle presented at the beginning. The CPP file has two essential things:

  • the implementation of the Rectangle, named RectangleImpl,
  • and the Rectangle itself that forwards all the calls to its implementation.

Now, we will answer the central question:
why would you want to do that?

The Compiler Firewall

As I mentioned in the introduction, another name for this technique is the Compiler Firewall. This section will explain what it is and why you would need it.

The Dependency Entropy

Dependencies are bad. They are necessary, and you have to live with them. You will spend some part of your engineering time managing them.

There are multiple levels of dependencies. And it is essential to keep those in mind to know the long-term implications of using them. The first one is library dependency.

[Figure: cable-ecureuil.png. Things get complicated without proper dependency management.]

Library Dependencies

Have you already experienced problems because you link with a lot of libraries? Those libraries themselves use more libraries, and some of them are interdependent. You want to update one of them for good reasons, but you are stuck with another library using it indirectly and requiring an old version. When some libraries update, it triggers another update for yet another library, and so on.

Some languages and frameworks are particularly vulnerable to this kind of entropy. In a certain way, C++ is no exception.

We are talking about the libraries we integrate into our project and the libraries we create internally for our project.

If you have worked on a project split into multiple libraries, you know that modifying a single small file and seeing your whole project recompile is a real pain!

To use a library, you do two things:

  • you include a header,
  • and you link with the library.

The first step happens at compile time. If an included file has changed, then the files using it are recompiled. The second step occurs at link time. If you link with a library, you import all its dependencies. This is a kind of macroscopic dependency, at the level of whole libraries.

Class Dependencies

On a smaller scale, classes use some other classes, which creates another dependency type. If you split your project into multiple internal libraries, some class inter-dependencies can be kept in the same library. Changing a class would then only trigger a recompilation of its own library.

But in real cases, class dependencies cross library boundaries and can trigger the recompilation and relinking of multiple libraries.

If each class header includes the definitions of the classes it uses systematically, here is what could happen:

[Figure: LibraryDeps.png]

A simple change in the F header will trigger the recompilation of F, E, D, B, and A. On a small scale, it seems manageable, but on larger projects… recompiling the whole codebase can take hours (or even days in some extreme cases). You don't want to be the guy that drove the entire R&D to a halt with your latest commit. 😭
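The ripple can be sketched as a walk over the "is included by" relation. The graph below is hypothetical: the original diagram is not reproduced here, so this is only one layout consistent with the sentence above (A includes B and C, B includes D, D includes E, E includes F). The helper names recompiled and exampleGraph are mine.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// includedBy[X] lists the files that directly include X.
using IncludeGraph = std::map< std::string, std::vector< std::string > >;

// Collect every file that must be recompiled when `changed` is modified.
std::set< std::string > recompiled( const IncludeGraph & includedBy,
                                    const std::string & changed )
{
    std::set< std::string > dirty;
    std::vector< std::string > todo{ changed };
    while ( !todo.empty() ) {
        const std::string file = todo.back();
        todo.pop_back();
        if ( !dirty.insert( file ).second )
            continue;                          // Already visited.
        const auto it = includedBy.find( file );
        if ( it != includedBy.end() )
            todo.insert( todo.end(), it->second.begin(), it->second.end() );
    }
    return dirty;
}

// Hypothetical graph, assumed for the sketch only.
inline IncludeGraph exampleGraph()
{
    IncludeGraph g;
    g[ "F" ] = { "E" };   // E includes F
    g[ "E" ] = { "D" };   // D includes E
    g[ "D" ] = { "B" };   // B includes D
    g[ "B" ] = { "A" };   // A includes B
    g[ "C" ] = { "A" };   // A includes C
    return g;
}

// Usage sketch: a change in F never reaches C in this graph.
inline bool changeInFSparesC()
{
    const std::set< std::string > dirty = recompiled( exampleGraph(), "F" );
    assert( dirty.count( "F" ) == 1 );
    return dirty.count( "C" ) == 0;
}
```

Every include removed from a header cuts an edge of this graph, which is exactly what a forward declaration, and later the pimpl, achieves.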

[Figure: compiling.png. OK, I know, you saw this thousands of times...]

The change propagation is like a ripple that spreads all over your code on complex projects. That should trigger a warning in the software developer's head. If some modification in one of my functions or classes implies a lot of recompilation, it means that if I change its behavior slightly, it will impact a lot of code. The probability of side effects and regression bugs can rise very quickly, which is pretty bad on a big project. On medical software, for instance, providing a change propagation analysis to evaluate the impact of changes in critical parts is mandatory. So you want to keep change propagation to a minimum, restricting the impact on the rest of the code and limiting the user tests that must be performed.

Let's go back to our first simple implementation of the Rectangle.

All the functions are implemented inside the Rectangle.cpp file. It seems simple and straightforward.

It is efficient, with no use of a pointer to an implementation.

You put a lot of thought into your public interface and put your variables and functions in your header's private part. It is not accessible to the user. Seems like good enough protection, right?

Wrong… the users of the class cannot use its private parts. But they can see them because they are declared in the Rectangle.h file. They must include it in their CPP files (or worse, in their header files) to use the class. Every modification you make to the private part of the header will trigger a recompilation. It is even worse if the include is in a header that is itself included by another header used in the whole codebase...

Let's say that you want to modify the implementation of Rectangle to represent it with two opposite Points. Your class would become:

class Rectangle {
public:
     Rectangle( Point p, float w, float h ); // Same argument constructor.
...
... // Same as previous.
private:
    Point m_p1;
    Point m_p2;
};

The constructor would have the same interface but a different implementation:

Rectangle::Rectangle( Point p, float w, float h ) // Change the implementation.
    : m_p1(p)
    , m_p2(p.x + w, p.y + h)
{}

float Rectangle::getWidth() const { return std::abs( m_p2.x -  m_p1.x ); } // Needs <cmath>.

// All other functions are adapted to use p1 and p2 instead of p, w and h.

The implementation is hidden from the user if you put it in the CPP file. But it is not enough since the header still "shows" the data structure (Point, w, h) and private implementation (private functions).

Since nothing has changed for your class user, why must they recompile?

Just because they "saw" your implementation in the private part…

The pimpl is a solution to this problem. It allows your class user to include the Rectangle header, without seeing the implementation. This way, changes to the implementation are entirely hidden.

The only visible thing is a pointer to an implementation, and a pointer only requires a declaration of the pointed-to class. The compiler does not need to know anything about RectangleImpl beyond the fact that Rectangle holds a pointer to some class with that name, and a pointer has a fixed size (8 bytes on most 64-bit platforms) whatever it points to. In our case, we wrap the pointer in a std::unique_ptr, but it makes almost no difference (we'll see that later). It is enough knowledge for the compiler to allow the client to use the Rectangle.

Internally the compiler must know two things to be able to use a class:

  • The list of functions (public, protected, and private),
  • The layout of embedded variables.

If one of those two things changes, all the class users must recompile.

The RectangleImpl definition is necessary for the Rectangle.cpp, but your class's users do not include this file. They just link with it.

So you can freely change your implementation. It will not affect your users as long as your exposed public interface does not change.
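To make this concrete, here is a single-file sketch of the pimpl with the two-point representation hidden behind the unchanged interface. It assumes a plain aggregate Point and collapses what would be Rectangle.h and Rectangle.cpp into one listing; the widthAfterMove helper is hypothetical and only shows client code.

```cpp
#include <cassert>
#include <cmath>
#include <memory>

struct Point { float x; float y; };        // Assumed aggregate for the sketch.

// --- What would be Rectangle.h: no data layout is exposed. ---
class RectangleImpl;                       // Declaration only.

class Rectangle {
public:
    Rectangle( Point p, float w, float h );
    ~Rectangle();                          // Defined where RectangleImpl is complete.
    float getWidth() const;
    void translate( float dx, float dy );
private:
    std::unique_ptr< RectangleImpl > m_pimpl;
};

// --- What would be Rectangle.cpp: free to change without touching the header. ---
class RectangleImpl {
public:
    RectangleImpl( Point p, float w, float h )
        : m_p1( p )
        , m_p2{ p.x + w, p.y + h }         // Now stored as two opposite corners.
    {}
    float getWidth() const { return std::abs( m_p2.x - m_p1.x ); }
    void translate( float dx, float dy ) {
        m_p1.x += dx; m_p1.y += dy;
        m_p2.x += dx; m_p2.y += dy;
    }
private:
    Point m_p1;
    Point m_p2;
};

Rectangle::Rectangle( Point p, float w, float h )
    : m_pimpl( std::make_unique< RectangleImpl >( p, w, h ) )
{}
Rectangle::~Rectangle() = default;
float Rectangle::getWidth() const { return m_pimpl->getWidth(); }
void Rectangle::translate( float dx, float dy ) { m_pimpl->translate( dx, dy ); }

// On mainstream implementations, Rectangle is exactly as big as the smart pointer,
// whatever RectangleImpl stores.
static_assert( sizeof( Rectangle ) == sizeof( std::unique_ptr< RectangleImpl > ),
               "Rectangle only holds the pointer" );

// Usage sketch: client code is the same as with the width/height version.
inline float widthAfterMove() {
    Rectangle r( Point{ 1.0f, 2.0f }, 10.0f, 5.0f );
    r.translate( 3.0f, 0.0f );
    assert( r.getWidth() == 10.0f );
    return r.getWidth();
}
```

Switching RectangleImpl back to a point plus width and height would only touch the second half of the listing, the part clients never include.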

The std::unique_ptr Based Implementation Details

We will see later that we can implement many cool things from this simple idea of having an interface and a separate implementation.

The safest and most straightforward way to implement a pimpl is to use the std::unique_ptr.

It provides many advantages:

  • The std::unique_ptr, and the RectangleImpl it owns, are destroyed automatically when your Rectangle is destroyed. No memory leaks.
  • It is a "zero-cost abstraction", which means that the generated code is almost equivalent to the one you would write yourself using raw pointers, with a new RectangleImpl in the Rectangle constructor, and the proper delete in the destructor. Ok, I admit, there is a small cost in certain circumstances, but most of the time, it is negligible compared to the benefits.

And you can be sure that even under extreme pressure, with a passed deadline, and after a loooong period of work, the compiler won't forget to insert the deletion code.
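The automatic cleanup can be observed with a small sketch. The Widget and WidgetImpl classes and the instance counter are hypothetical and exist only to make the destruction visible.

```cpp
#include <cassert>
#include <memory>

// Hypothetical implementation class that counts its live instances.
class WidgetImpl {
public:
    WidgetImpl()  { ++s_alive; }
    ~WidgetImpl() { --s_alive; }
    static int alive() { return s_alive; }
private:
    static int s_alive;
};
int WidgetImpl::s_alive = 0;

class Widget {
public:
    Widget() : m_pimpl( std::make_unique< WidgetImpl >() ) {}
    // No user-written delete anywhere: std::unique_ptr releases the WidgetImpl.
private:
    std::unique_ptr< WidgetImpl > m_pimpl;
};

// Returns how many extra WidgetImpl objects remain once a Widget has left its scope.
inline int leakedAfterScope() {
    const int before = WidgetImpl::alive();
    {
        Widget w;
        assert( WidgetImpl::alive() == before + 1 );   // The Widget created its implementation.
    }                                                  // w is destroyed here, with its WidgetImpl.
    return WidgetImpl::alive() - before;
}
```

Because everything lives in one file here, WidgetImpl is complete when the Widget destructor is generated. In a real header/CPP split, the destructor has to be declared in the header and defined in the CPP file, as the Rectangle example does.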

Copying a Pimpl

But there is a catch. Using a std::unique_ptr member disables the copy of your Rectangle, because std::unique_ptr itself is not copyable (but it is movable; I might write another article on that). It is another difference between the std::unique_ptr and a raw pointer. Let's see that in detail.

Imagine a raw pointer implementation of the pimpl. The Rectangle constructor and destructor would look like this (I don't show the header, it is always the same):

Rectangle::Rectangle( Point p1, float w, float h )
  : m_pimpl( new RectangleImpl( p1, w, h ) )
{}

Rectangle::~Rectangle()
{
  delete m_pimpl;  // If you forget that: memory leak!
}

The compiler would let you copy your Rectangle without complaining, since the copy constructor would be generated by default for you. You could pass your Rectangle as an argument to a function:

float area( Rectangle r ) {                // Pass by copy.
    return r.getWidth() * r.getHeight(); 
} // Here is the end of r scope: it is destroyed,
  // the pimpl should be destroyed as well.
...
// You would use the function like this:
Rectangle rect( Point(0, 0), 10, 5 );
std::cout << area( rect ); // Here rect is copied, and the
                           // pimpl pointer is copied,
                           // not the RectangleImpl itself!
                           // But at this point, r has deleted the RectangleImpl...

...
// crash at the end of the scope or end of the program when rect destructor is called, 
// The pimpl is destroyed twice!

Here is what happens:

  1. The Rectangle rect is copied when passed to the function area(). In this process, the pointer to RectangleImpl it contains is copied. As a result, the Rectangle in the calling code and the Rectangle copied as an argument share the same RectangleImpl. You can already guess that it will not finish with a happy end...
  2. The end of the area function is reached, the r argument goes out of scope. The RectangleImpl is destroyed.
  3. At the end of the code, rect is out of scope, and its destructor function is called. It tries to delete the already destroyed RectangleImpl. Double delete: BOOM!

Keep in mind that a std::unique_ptr cannot be shared. That is why, to copy a Rectangle, you must copy the RectangleImpl it points to rather than the pointer. It is a critical design point. In the next article, we will see how to create shareable pimpls. It requires a lot of care. The use of std::unique_ptr protects you from a very nasty bug that is very easy to make if you use a raw pointer. The trouble is that the compiler generates the copy assignment operator and the copy constructor by default by copying the pointer, not the object it points to.

The problem with this bug is that it will explode at run time. Your compiler will not tell you that there is a problem, and worse: the program might run a few times before crashing. If the memory used by the deleted RectangleImpl is not immediately reused, it might still hold a few values that make sense for your software. But at some point, it will be overwritten. You will get crashes or strange bugs that seem unrelated, as your software uses memory belonging to another object.

The use of std::unique_ptr prevents this problem because you cannot copy it. It disables the generation of the copy constructor and the copy operator of the Rectangle.
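This compile-time protection can be checked with type traits. The sketch below uses a placeholder RectangleImpl and a default constructor for brevity; both are simplifications of mine, not the article's class.

```cpp
#include <memory>
#include <type_traits>

class RectangleImpl {};                    // Placeholder implementation for the sketch.

class Rectangle {
public:
    Rectangle() : m_pimpl( std::make_unique< RectangleImpl >() ) {}
    ~Rectangle();                          // Declared here and defined below, as in the article.
private:
    std::unique_ptr< RectangleImpl > m_pimpl;
};
Rectangle::~Rectangle() = default;

// The std::unique_ptr member makes the implicit copy operations deleted.
static_assert( !std::is_copy_constructible< Rectangle >::value, "no copy constructor" );
static_assert( !std::is_copy_assignable< Rectangle >::value,    "no copy assignment" );

// Because the destructor is user-declared, no move operations are generated either
// until they are declared explicitly.
static_assert( !std::is_move_constructible< Rectangle >::value, "no implicit move" );

// float area( Rectangle r );   // Declaring this is still fine...
// area( rect );                // ...but calling it with a copy no longer compiles.
```

The double delete described above therefore turns into a compiler error at the call site instead of a crash at run time.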

If you want to be able to copy your Rectangle, you have to provide two additional functions:

class Rectangle 
{
...
  Rectangle( const Rectangle & r );
  Rectangle& operator=( const Rectangle & r );
...
};

With the implementation:


Rectangle::Rectangle( const Rectangle & r )
    : m_pimpl( std::make_unique< RectangleImpl >( *r.m_pimpl ) )
{}
  
Rectangle& Rectangle::operator=( const Rectangle & r ) 
{
    m_pimpl = std::make_unique< RectangleImpl >( *r.m_pimpl );
    return *this;
}

What happens here? It is straightforward:

  • The first function, the copy constructor, creates its own RectangleImpl using the copy constructor of the provided RectangleImpl.

  • The second one, the copy operator, does the same thing. It creates its own RectangleImpl by copy constructing the provided one. This way, you can pass Rectangle by copy. It will work.

Of course, it is possible to implement those two functions if you implement your pimpl with a raw pointer. But using std::unique_ptr prevents you from forgetting that, in addition to automatically deleting the implementation when destructing your Rectangle.
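Here is a compact, single-file sketch of the deep copy at work. It assumes an aggregate Point, inlines the member functions for brevity, and adds a hypothetical left() accessor so that the independence of the copies can be observed.

```cpp
#include <cassert>
#include <memory>

struct Point { float x; float y; };        // Assumed aggregate for the sketch.

class RectangleImpl {                      // Its compiler-generated copy constructor is fine.
public:
    RectangleImpl( Point p, float w, float h ) : m_p( p ), m_w( w ), m_h( h ) {}
    void translate( float dx, float dy ) { m_p.x += dx; m_p.y += dy; }
    float left() const { return m_p.x; }   // Hypothetical accessor for the sketch.
    float getWidth() const { return m_w; }
    float getHeight() const { return m_h; }
private:
    Point m_p;
    float m_w;
    float m_h;
};

class Rectangle {
public:
    Rectangle( Point p, float w, float h )
        : m_pimpl( std::make_unique< RectangleImpl >( p, w, h ) ) {}

    // Deep copy: each Rectangle gets its own RectangleImpl.
    Rectangle( const Rectangle & r )
        : m_pimpl( std::make_unique< RectangleImpl >( *r.m_pimpl ) ) {}

    Rectangle & operator=( const Rectangle & r ) {
        assert( r.m_pimpl != nullptr );    // The source must still own an implementation.
        m_pimpl = std::make_unique< RectangleImpl >( *r.m_pimpl );
        return *this;
    }

    void translate( float dx, float dy ) { m_pimpl->translate( dx, dy ); }
    float left() const { return m_pimpl->left(); }
    float getWidth() const { return m_pimpl->getWidth(); }
    float getHeight() const { return m_pimpl->getHeight(); }
private:
    std::unique_ptr< RectangleImpl > m_pimpl;
};

// Passing by copy is now safe: r owns a separate RectangleImpl.
inline float area( Rectangle r ) {
    return r.getWidth() * r.getHeight();
}
```

Modifying a copy leaves the original untouched, and each RectangleImpl is destroyed exactly once by its own std::unique_ptr. In a real project, the function bodies would move to the CPP file as in the listings above.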

For the past ten years, since smart pointers became available in the standard, I have been writing a lot of code for various companies. I only used raw pointers in specific circumstances: for hardware libraries that still use them, or for libraries implemented in C. Besides those few exceptional cases, I don't use new and delete anymore.

Moving a Pimpl

I mentioned before that the std::unique_ptr is movable. Move semantics were introduced in C++11. Since this article is already very long, I will not talk about them here. But for the sake of completeness, if you want a movable Rectangle, you must define the move constructor and the move assignment operator.

In your header:

class Rectangle
{
public:
   ...
   Rectangle(Rectangle && op) noexcept;              // Now, it is movable
   Rectangle& operator=(Rectangle && op) noexcept;   //
   ...
};

And the following default implementation in the CPP file (for the same reasons as the destructor):

Rectangle::Rectangle(Rectangle &&) noexcept = default;
Rectangle& Rectangle::operator=(Rectangle &&) noexcept = default;

Note that those two functions are noexcept. The compiler generates them from the move constructor and the move assignment operator of std::unique_ptr, which are also noexcept. This allows some more optimizations, since the compiler knows that those functions will not throw any exceptions.
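A short sketch of the movable version follows. To keep it small, the Rectangle here only stores a width, and hasImpl() is a hypothetical helper added to observe the moved-from state; neither is part of the article's class.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

class RectangleImpl {
public:
    explicit RectangleImpl( float w ) : m_w( w ) {}
    float getWidth() const { return m_w; }
private:
    float m_w;
};

class Rectangle {
public:
    explicit Rectangle( float w ) : m_pimpl( std::make_unique< RectangleImpl >( w ) ) {}
    ~Rectangle();
    Rectangle( Rectangle && ) noexcept;                // Declared in the "header" part...
    Rectangle & operator=( Rectangle && ) noexcept;

    float getWidth() const {
        assert( m_pimpl != nullptr );   // A moved-from Rectangle has no implementation.
        return m_pimpl->getWidth();
    }
    bool hasImpl() const { return m_pimpl != nullptr; }   // Hypothetical helper.
private:
    std::unique_ptr< RectangleImpl > m_pimpl;
};

// ...and defaulted in the "CPP" part, where RectangleImpl is complete.
Rectangle::~Rectangle() = default;
Rectangle::Rectangle( Rectangle && ) noexcept = default;
Rectangle & Rectangle::operator=( Rectangle && ) noexcept = default;

static_assert( std::is_nothrow_move_constructible< Rectangle >::value,
               "moving a Rectangle cannot throw" );
```

Moving only transfers the pointer, so no RectangleImpl is copied. The moved-from Rectangle is left with a null pimpl and must not be used except to destroy it or assign to it.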

The Pimpl and the Destructor

There is another small catch with the std::unique_ptr… You may have noticed that we declared the rectangle destructor in the Rectangle header, then in the CPP file, we just defined it with a default implementation.

It is very tempting just to define the default implementation in the header.

#include "RectangleImpl.h" 
class Rectangle {
    ... constructors...
    ~Rectangle() = default;   // Problem! Here the compiler needs the 
                              // RectangleImpl destructor, so you must include
                              // its definition. Don't do that!
    ...
};

But sadly, it does not work. This definition with =default is equivalent to letting the compiler generate the destructor for you. To do so, it needs to destroy the RectangleImpl. Remember that it is automatic. But as I said, there is no magic. To generate this Rectangle destructor function, the compiler needs to know about the RectangleImpl destructor. So you need to include RectangleImpl… Wait! No! This whole article is about avoiding making your implementation visible to your users. What is the point of including it in the end?

There is a simple solution. You just have to define your destructor implementation in your CPP file. And that's all!

Even if you use the default implementation, you just have to do that:

class Rectangle {
    ... constructors...
    ~Rectangle();   // just declare, do not define it here...
  ...
};

and,

Rectangle::~Rectangle() = default; // Use default implementation
                                   // or define your custom implementation,
                                   // but remember that you don't have
                                   // to delete the m_pimpl smart pointer.

Well, this is a little annoying. I concede. It would have been perfect if you just had to let your compiler deal with that without the need to include or define the destructor in the CPP file. But it is a small price to pay for all the features and security provided by the std::unique_ptr. And the best thing is that this problem triggers an error at compile time, so you just have to remember how to fix it by adding a one-line destructor in your CPP file instead of including your implementation in your Rectangle.h.

The Cost of the Pimpl

Here's probably the essential point of the article. So important that I put it in its own section. This pimpl technique is cool, but it has a cost. And it is mandatory to understand it to make good design decisions.

When you call a Rectangle method, it delegates the call to its implementation. This small pointer hop and second function call are probably negligible and will have close to no impact on your code. But if the function is called in a tight loop, the cost might become substantial. For instance, a Point with just X and Y coordinates, stored by the millions in vectors, would probably not benefit from a pimpl. Iterating over the vector to call a translate function on each point in a tight loop would make the cost of the pimpl too high.
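The difference can be sketched as follows. PimplPoint is a hypothetical pimpl'd variant of Point written for this comparison; the sketch makes no timing claim, it only shows where the extra allocation and indirection come from.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Plain value type: a std::vector stores these contiguously.
struct Point {
    float x;
    float y;
    void translate( float dx, float dy ) { x += dx; y += dy; }
};

// Hypothetical pimpl'd variant: every element owns a separate heap allocation.
class PimplPoint {
public:
    PimplPoint( float x, float y ) : m_pimpl( std::make_unique< Point >( Point{ x, y } ) ) {}
    void translate( float dx, float dy ) { m_pimpl->translate( dx, dy ); }   // Pointer hop.
    float x() const { return m_pimpl->x; }
private:
    std::unique_ptr< Point > m_pimpl;
};

// Same loop and same result; only the memory access pattern differs.
inline float sumXAfterTranslate( std::size_t n ) {
    std::vector< Point > plain( n, Point{ 0.0f, 0.0f } );
    for ( Point & p : plain ) p.translate( 1.0f, 0.0f );         // Linear walk through memory.

    std::vector< PimplPoint > hopped;
    hopped.reserve( n );
    for ( std::size_t i = 0; i < n; ++i ) hopped.emplace_back( 0.0f, 0.0f );   // n allocations.
    for ( PimplPoint & p : hopped ) p.translate( 1.0f, 0.0f );   // One indirection per element.

    float a = 0.0f;
    float b = 0.0f;
    for ( const Point & p : plain ) a += p.x;
    for ( const PimplPoint & p : hopped ) b += p.x();
    assert( a == b );                                            // Both versions agree.
    return a;
}
```

The plain vector is a single block of memory that the loop walks linearly, while the pimpl'd version chases one pointer per element to objects scattered on the heap. Measuring both on your own data is the only reliable way to know how much it matters.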

Using pimpls everywhere for their own sake is not the point of this article. The Rectangle class is probably too simple to be implemented with a pimpl in real projects (Qt, which is heavily using pimpl, does not use it for QRect, for instance, as it is a core class used in performance-critical loops).

Wrap Up

There are a lot of things that we can build on top of what we have just seen in this article, but it is already quite long! We saw that as a developer, you should limit dependencies to a minimum: be it library or class dependencies.

We saw the cost of including headers, when includes are required and when they are not, and how to avoid them.

I presented the most basic technique to solve this problem: the pimpl. I proposed a straightforward implementation using the very efficient std::unique_ptr smart pointer, paying attention to details such as how the copy operations are deleted by default and how to handle the small problem with the destructor declaration and definition.

One of the most critical points of the article is when NOT to use the pimpl. It is mandatory to fully understand a given technique's tradeoffs before generalizing its use.

The next article will build on this technique and propose interesting "features". Some variations of the pimpl are very flexible, efficient, and elegant. It will be an opportunity to introduce some Design Patterns.

Until then, take care, and think before coding!

Guillaume Schmid.
Kaer Labs CTO and Co-Founder

Introduction picture under Creative Commons license:
https://commons.wikimedia.org/wiki/File:Gehry_Las_Vegas.jpg
"Compiling" comic by xkcd: https://xkcd.com/303/
I took the squirrel on telecom cables picture myself in Saigon.
