Smart Pointers #
We begin by explaining why we need smart pointers, then describe the principle of how they work.
Why Do We Need Smart Pointers? #
Shared Access #
Sometimes we have an object that’s used (mostly read-only) by multiple other objects. One such example is the spatial discretization of the domain (called the grid). It’s usually created once at the beginning of the simulation, but many different pieces of the simulation want to read it.
#include <vector>

struct Grid {
    std::vector<double> x;
    std::vector<double> y;
};
class FluxLoop {
    // ...
private:
    Grid grid;  // each FluxLoop stores its own copy
};
class InsituVisualization {
    // ...
private:
    Grid grid;  // a second, independent copy
};
One could argue that InsituVisualization doesn’t need to store the grid, since
it could be passed in along with the values; but let’s assume there’s a good
reason to store it.
While the above is sound in terms of RAII, both FluxLoop and
InsituVisualization hold their own copy of the Grid object, which wastes
memory.
We could use:
class FluxLoop {
    // ...
private:
    const Grid& grid;  // non-owning reference
};
or
class InsituVisualization {
    // ...
private:
    Grid const * const grid;  // non-owning pointer
};
Both have the problem that if the grid is deallocated before the FluxLoop or
InsituVisualization object, there’s a dangling reference or pointer. We are
highly motivated to avoid the possibility of dangling pointers, and we almost
never store a reference in an object.
Virtual Classes #
Virtual classes are very convenient. They’re great for implementing different flavours of an algorithm that solves the same problem.
They have one big annoyance: we inherently want to allocate them dynamically and be able to return them from functions. The following looks desirable:
ODESolver * solver = make_ode_solver("rk4");
However, we now have to remember to deallocate solver. This is a violation of
RAII because there’s a resource, the memory for storing the ODESolver object,
that’s not strictly tied to the lifetime of the object, here an ODESolver*,
used to access the resource.
This will inevitably lead to leaks. It is a prime example of a class of difficulties in C++ that we refuse to deal with in all but the most dire circumstances.
What is the Principle of a Smart Pointer? #
A smart pointer consists of a pointer and a way of tracking ownership of the object pointed to. By ownership we mean controlling the lifetime of the object: making sure the object is kept alive sufficiently long, but also taking responsibility for freeing the resource once it’s not needed anymore.
The std::shared_ptr #
A std::shared_ptr is a reference-counted smart pointer. As the name suggests,
it’s for objects with shared ownership, meaning multiple objects have
independent opinions about how long the shared object must be kept alive. The
idea is to deallocate the object when the last owner is destroyed, i.e. when
the last shared_ptr holding that object goes out of scope.
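As a minimal sketch of shared ownership, the grid from the earlier example could be held through a `std::shared_ptr<const Grid>` instead of being copied. The constructors, the `n_points` accessor and the helper `owners_while_shared` are assumptions made for illustration; the original classes do not show them.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Grid {
    std::vector<double> x;
    std::vector<double> y;
};

// Both classes store a shared_ptr to the same read-only Grid instead of a copy.
class FluxLoop {
public:
    explicit FluxLoop(std::shared_ptr<const Grid> g) : grid(std::move(g)) {}
    std::size_t n_points() const { return grid->x.size(); }
private:
    std::shared_ptr<const Grid> grid;
};

class InsituVisualization {
public:
    explicit InsituVisualization(std::shared_ptr<const Grid> g) : grid(std::move(g)) {}
    std::size_t n_points() const { return grid->x.size(); }
private:
    std::shared_ptr<const Grid> grid;
};

// Hypothetical helper: reports the owner count while all three owners exist
// and checks that the Grid survives after the local handle lets go.
long owners_while_shared() {
    std::shared_ptr<const Grid> grid =
        std::make_shared<Grid>(Grid{{0.0, 0.5, 1.0}, {0.0, 0.5, 1.0}});
    FluxLoop flux(grid);
    InsituVisualization vis(grid);
    long owners = grid.use_count();  // grid, flux and vis each hold one reference
    grid.reset();                    // the local handle gives up ownership ...
    assert(flux.n_points() == 3);    // ... but the Grid is still alive
    assert(vis.n_points() == 3);
    return owners;
}   // flux and vis are destroyed here; the last one frees the Grid
```

Holding a pointer to `const Grid` reflects the mostly read-only use described earlier. Note that `use_count()` is mainly useful for illustration and debugging; program logic rarely needs it.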
Conceptually this means a shared_ptr consists of a pointer to the object and a
pointer to an integer counting the number of shared_ptrs referring to this
object. The count is incremented and decremented atomically, which implies
that a shared_ptr can be copied safely in a multi-threaded context. Note that
this does not imply that using the object is thread safe.
The atomic increment/decrement is a source of concern w.r.t. performance.
However, in a typical HPC application shared_ptrs aren’t copied in the
innermost loop.
Conceptually, it could be implemented as follows:
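#include <atomic>
#include <cstddef>

template<class T>
struct shared_ptr {
public:
    ~shared_ptr() {
        // Act on the value returned by the atomic decrement. Re-reading the
        // counter afterwards could let two threads both observe zero.
        if (--block->shared_ptr_count == 0) {
            delete ptr;
            // Conceptual only: a real implementation also has to guard this
            // check against weak pointers being destroyed concurrently.
            if (block->weak_ptr_count == 0) {
                delete block;
            }
        }
    }
    shared_ptr(const shared_ptr& other) {
        ptr = other.ptr;
        block = other.block;
        ++block->shared_ptr_count;
    }
private:
    struct ControlBlock {
        std::atomic<std::size_t> shared_ptr_count;
        std::atomic<std::size_t> weak_ptr_count;
    };
    T* ptr;
    ControlBlock* block;
};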
The important part is that there’s a (dynamically allocated) control block
that’s shared by all copies of the shared_ptr and all weak pointers to this
resource. The resource is freed when the last shared pointer is destroyed. The
control block is only deleted when there are no more shared or weak pointers
left that point to this resource.
The std::unique_ptr #
An std::unique_ptr can hold objects that shall only have one owner. A
unique_ptr can be moved but not copied. Just like the shared_ptr it ties
the lifetime of the contained object to the lifetime of the smart pointer.
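To connect this with the factory from the section on virtual classes, here is a minimal sketch in which `make_ode_solver` returns a `std::unique_ptr<ODESolver>` instead of a raw pointer. The `ODESolver` interface, its `name` method, the two solver classes and the helper `solver_name_after_move` are hypothetical; the text does not show the real definitions.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical interface; the real ODESolver is not shown in the text.
class ODESolver {
public:
    virtual ~ODESolver() = default;  // needed to delete through a base pointer
    virtual std::string name() const = 0;
};

class ForwardEuler : public ODESolver {
public:
    std::string name() const override { return "forward_euler"; }
};

class RK4 : public ODESolver {
public:
    std::string name() const override { return "rk4"; }
};

// The return type documents that the caller becomes the sole owner.
std::unique_ptr<ODESolver> make_ode_solver(const std::string& key) {
    if (key == "rk4") return std::make_unique<RK4>();
    if (key == "forward_euler") return std::make_unique<ForwardEuler>();
    throw std::invalid_argument("unknown ODE solver: " + key);
}

// Ownership is transferred explicitly with std::move; copying would not compile.
std::string solver_name_after_move() {
    std::unique_ptr<ODESolver> solver = make_ode_solver("rk4");
    std::unique_ptr<ODESolver> new_owner = std::move(solver);
    assert(solver == nullptr);  // the moved-from unique_ptr is empty
    return new_owner->name();
}   // new_owner is destroyed here and deletes the RK4 object
```

With this signature there is nothing for the caller to remember: the solver is freed when the `unique_ptr` that holds it is destroyed.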
Conceptually, you can imagine it as:
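template<class T>
struct unique_ptr {
    explicit unique_ptr(T* ptr) : ptr(ptr) {}
    ~unique_ptr() { delete ptr; }

    // Movable but not copyable: a copy would lead to a double delete.
    unique_ptr(const unique_ptr&) = delete;
    unique_ptr& operator=(const unique_ptr&) = delete;
    unique_ptr(unique_ptr&& other) noexcept : ptr(other.ptr) { other.ptr = nullptr; }

    T * ptr;
};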
along with:
- the ability to inject a custom deleter in case delete ptr; is not the correct way of freeing the resource.
- API to make unique_ptr behave like a T* in terms of accessing methods, dereferencing, etc.
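Both points can be illustrated with a small sketch. It assumes memory obtained from `std::malloc`, which has to be released with `std::free` rather than `delete`. The `FreeDeleter` type, its call counter and the helper `count_deleter_calls` are hypothetical and exist only to make the deleter call observable.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Hypothetical deleter: releases the memory with std::free and records the
// call in a counter owned by the caller.
struct FreeDeleter {
    int* calls;
    void operator()(double* p) const noexcept {
        std::free(p);
        ++*calls;
    }
};

int count_deleter_calls() {
    int calls = 0;
    {
        std::unique_ptr<double, FreeDeleter> value(
            static_cast<double*>(std::malloc(sizeof(double))), FreeDeleter{&calls});
        assert(value != nullptr);        // comparable to nullptr like a raw pointer
        *value = 42.0;                   // dereferences like a double*
        assert(*value == 42.0);
        assert(value.get() != nullptr);  // get() exposes the raw pointer, ownership stays
    }   // value goes out of scope: FreeDeleter runs exactly once
    return calls;
}
```

The deleter is part of the type, `std::unique_ptr<double, FreeDeleter>`, so a stateless deleter typically adds no storage overhead compared to a raw pointer.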
Leaks With Smart Pointers #
Unfortunately, it’s still possible to create a resource leak when using shared pointers. Recall that the resource is released when the last owner goes out of scope, i.e. when the last shared pointer is destroyed.
If one were to ever create a cycle of shared_ptrs, then the reference count
would never drop to zero, even though the smart pointers can’t be accessed
anymore. This is where a std::weak_ptr comes into play. A weak_ptr refers to
an object managed by shared_ptrs without owning it, so using it for one link
of the cycle lets the count drop to zero. It also avoids the problem of a
dangling pointer. The object a weak_ptr refers to can be destroyed while the
weak_ptr is alive, but unlike a raw pointer, the weak_ptr knows about it:
expired() returns true and lock() returns an empty shared_ptr once the object
has been destroyed. Therefore, it’s possible to avoid accessing the freed
memory, which is still a massive improvement over a dangling pointer, which
can’t be recognized as dangling until it’s too late.
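The following sketch contrasts an owning cycle with a weak back link. The `Node` type, its live-object counter and both helper functions are hypothetical and only serve to make the behaviour observable. To keep the sketch itself free of leaks, the first function breaks the cycle by hand after measuring it.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node type; `alive` counts how many nodes currently exist.
struct Node {
    explicit Node(int& alive) : alive(alive) { ++alive; }
    ~Node() { --alive; }
    std::shared_ptr<Node> strong_peer;  // owning link
    std::weak_ptr<Node> weak_peer;      // non-owning link
    int& alive;
};

// Two nodes that own each other stay alive after all outside owners are gone.
int nodes_alive_after_strong_cycle() {
    int alive = 0;
    std::weak_ptr<Node> watch;
    {
        auto a = std::make_shared<Node>(alive);
        auto b = std::make_shared<Node>(alive);
        a->strong_peer = b;
        b->strong_peer = a;  // cycle: a owns b, b owns a
        watch = a;
    }
    int leaked = alive;  // neither destructor has run
    // Clean-up for this sketch only: break the cycle by hand.
    if (auto first = watch.lock()) first->strong_peer.reset();
    assert(alive == 0);
    return leaked;
}

// With a weak back link, both nodes are destroyed at the end of the scope.
bool weak_link_releases_nodes() {
    int alive = 0;
    std::weak_ptr<Node> watch;
    {
        auto a = std::make_shared<Node>(alive);
        auto b = std::make_shared<Node>(alive);
        a->strong_peer = b;
        b->weak_peer = a;                  // the back link does not own a
        watch = a;
        assert(b->weak_peer.lock() == a);  // lock() yields an owner while a is alive
    }
    // The object is gone, and the weak_ptr can tell: no dangling access.
    return alive == 0 && watch.expired() && watch.lock() == nullptr;
}
```

A common convention, assumed here rather than taken from the text, is that links pointing "down" an ownership hierarchy are `shared_ptr` or `unique_ptr`, while links pointing back "up" are `weak_ptr`.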
Further Reading #
When implementing custom smart pointers, I’d recommend Chapter 7 in Andrei Alexandrescu’s Modern C++ Design (ISBN 0-201-70431-5). It discusses the matter in sufficient detail. The book is from 2001, but I feel the design-related content remains clear and relevant in 2024. That said, it is also a nice book for appreciating how much compilers and the language have improved since.