A bit of post-modern C++ refactoring

This is a preview

This playground version isn't public and is work in progress.

A functional touch

Post-modern C++ programmers master more than one paradigm and, without a doubt, functional paradigm is one to know.

We cannot really cover functional programming in C++, it would need more than a workshop! However, we could introduce a few interesting concepts and try them out.

Lambdas, lambdas everywhere

Lambdas, lambdas everywhere

C++11 introduced lambda expressions (often called lambdas) that are, basically, nested anonymous functions:

  • nested because we generally create them inside another function,
  • anonymous because their type and name is automatically generated by the compiler.

Typically, lambdas are used to encapsulate a few lines of code that are passed to algorithms or asynchronous methods:

vector<int> v = {10, 20, 30, 41, 50, 67};
auto firstOdd = std::find(begin(v), end(v), [](int i) { return i%2 == 1; });

This expression [](int i) { return i%2 == 1; } generates a lambda.

C++ lambdas come with some details to master. I'll show you the most important ones and leave the others to further readings.

What's a lambda, in short

Basically, a lambda will result in generating a callable object (or functor, in the C++ slang).

For this lambda:

[](int i) { return i%2 == 1; }

the compiler could generate something very close to:

struct lambda12384950_t
{
    bool operator()(int i) const
    {
        return i%2 == 1;
    }
} lambda12384950;

Such lambdas are stateless and they are automatically convertible to function pointers:

using FunPtr = bool(*)(int);

FunPtr ptr = [](int i) { return i%2 == 1; };

This kind of lambdas is comparable to creating static strings:

int main()
{
    auto someString = "this is a string";
    cout << someString;
}

In this case:

[](auto i) { return i%2 == 1; }

The call operator is templatized:

struct lambda12384950_t
{
    template<typename T>
    bool operator()(T i) const
    {
        return i%2 == 1;
    }
};

As you will learn in a moment, lambdas can capture variables in outer and global scopes, becoming stateful:

vector<int> prices = {1,2,3};
auto isInPrices = [v](int i) { return find(begin(v), end(v), i) != end(v); };

The lambda above may be turned into:

struct lambda12384950_t
{
    lambda12384950_t(const vector<int>& field)
        : _private_field_1(field)
    {}

    bool operator()(T i) const
    {
        return find(begin(_private_field_1), end(_private_field_1), i) != end(_private_field_1);
    }
    
private:
    vector<int> _private_field_1;
};

Clearly, this kind of lambda cannot be converted to function pointers.

Parameter list

Just as ordinary functions, lambdas can accept input parameters:

auto l = [](int i) { return i%2 == 1; };

From C++14 we can leave the compiler deduce parameters automatically (adopting templates deduction rules):

auto l = [](auto i) { cout << i << "\n"; };

The parameter list is optional when the lambda is parameterless:

auto l = [] { cout << "hello" << "\n"; };

Lambda capture

A lambda can introduce new variables in its body (in C++14), and it can also access (capture) variables from the surrounding scope. A lambda begins with the capture clause ([]), which specifies which variables are captured, and whether the capture is by value or by reference. Variables that have the ampersand (&) prefix are accessed by reference and variables that do not have it are accessed by value.

We can use the default capture mode to indicate how to capture any outside variables that are referenced in the lambda: [&] means all variables that we refer to are captured by reference, and [=] means they are captured by value. We can even mix captures:

[&total, factor] // total by reference, factor by value
[factor, &total] // same as before
[&, factor] //factor by value, everything else by reference (only variables the lambda actually accesses)
[factor, &] // same as before
[=, &total] //total by reference, everything else by value (only variables the lambda actually accesses)
[&total, =] // same as before
[&, this] // methods and data members of the enclosing class (this) and other variables by reference
[*this] // the enclosing class is copied into the lambda (the closure becomes self-contained)

In addition, from C++14 we can even introduce and initialize new variables in the capture clause, without the need to have those variables exist in the lambda function’s enclosing scope:

pNums = make_unique<vector<int>>(nums);  
//...  
auto a = [ptr = move(pNums)]() {  
   // the lambda owns ptr
};  

Return type

The return type of a lambda expression is automatically deduced:

auto x1 = [](int i){ return i; }; // return type is int

Lambda return type deduction is the same as auto return type deduction that follows the rules of template argument deduction.

In other words, assume lambdas return by value and when we need to return differently we could specify what to return by hand:

vector<int> v = {1,2,3};
auto get0    = [&]           { return v[0]; }; // return type is int
auto get0Ref = [&]() -> int& { return v[0]; }; // return type is int&

Such syntax is called trailing return type.

How to hold and pass lambdas around?

Because each lambda function is implemented by creating a separate class, as we saw earlier, even single lambda function is really a different type - even if the two functions have the same arguments and the same return value!

To store and pass lambdas around we have a few options supported by the standard:

auto and Templates

auto is always our friend when we need to declare variables whose type we don't know:

auto lam = [](int i, int j) { return i+j; };

This is the most efficient way to store lambdas. Similarly, we can pass lambdas by using templates (this is the approach adopted by the standard library):

template<typename Callable>
void MyAlgo(Callable callable);

MyAlgo(lam);

Sometimes we have to commit to a non-generic type, for different reasons.

Function pointers

Committing stateless lambdas to function pointers is automatic:

using FunPtr = void(*)(int, int);

void MyAlgo(FunPtr);

auto lam = [](int i, int j) { ... };
MyAlgo(lam); // automatic conversion

What about stateful lambdas?

Functional polymorphic wrapper: std::function

C++11 introduces a convenient wrapper for storing any kind of function (lambda function, functor, or function pointer): std::function. It allows you to specify the exact types for the argument list and the return value in the template:

std::function<int(int, int)> lam = [](int i, int j) { return i+j; };

This is a great way of passing around lambdas both as parameters and as return values.

void MyAlgo(std::function<int(int, int)> lam);

std::function is able to store any kin of function, including stateful callables:

vector<int> values = {1,2,3};
std::function<int(int)> lam = [&](int i) { return v[i]; };

Note: invoking a default-constructed std::function results in throwing an exception. You can check if a std::function holds a callable:

 std::function<void(int)> f;
 if (!f)
 {
    cout << "empty";
 }

The snippet above will print empty.

Continue Reading:

Hands on!

An open question is dividing the company: how to access all the available urls? A group of devs sustains we should just have a GetUrls function returning a vector with url information; on the other hand, another faction supports a different idea: we should make our service visitable by adding support for internal iteration. That is, instead of controlling the iteration (e.g. by iterating on the resulting data structure), the client just provides the code which must be executed for all/some of data elements. We take (the service) care of how the elements are accessed.

Some benefits of internal iteration:

  • declarative style ("what" opposed to "what" and "how")
  • internal details are hidden (we can change them afterwards with no impact on the client)

Your boss decides to embrace this latter option to give a little functional touch to the service. Your task consists in:

  • implementing a visit function which invokes a callable object for each stored urls,
  • write the test to check that it works properly.

You have free rein! Choose the option you like the most (templates, std::function, ...):

Implement a visit function
Do you really give up? :(

Template-based solution is probably the simplest and also most effective:

template<typename Action>
void MicroUrlService::VisitMicroUrls(Action action) const
{
	for (auto&[id, url] : m_idToUrl)
		action(url);
}

Have you notices we cannot ignore id? This can result in a "unused variable" warning, a licit reason to avoid structure bindings here:

template<typename Action>
void MicroUrlService::VisitMicroUrls(Action action) const
{
	for (auto& p: m_idToUrl)
		action(p.second);
}

Bonus: passing stateful visitors

Encouraged by the growing interest generated by your new function, Tom implemented a visitor to find the most popular url shortned by our service. Although the logic seems correct, the test he wrote is not passing.

Can you help him? (just copy and paste your previous implementation of visitation and then accommodate the test)

Fix the test
Do you really give up? :(

A simple idiom to pass stateful visitors by value is using std::ref, which creates a value-type that is actually a reference:

service.VisitMicroUrls(std::ref(visitor));

std::ref creates a std::reference_wrapper, which provides a call operator (operator()) proxying to the wrapped object.

This approach is very useful to invoke functions taking generic callables by value and we want to pass them by reference. To give you an example, pretty much every function of the STL takes such objects by value:

thread t1 { fun }; // fun is by value
thread t1 { std::ref(fun) }; // not copied (be careful with the scope...)

Another usage of such idiom, probably less common, is to pass polymorphic objects to an STL algorithm.

Open Source Your Knowledge: become a Contributor and help others learn. Create New Content