A bit of post-modern C++ refactoring

This is a preview

This playground version isn't public and is work in progress.

Viewing strings

std::string_view in a nutshell

string_view gives us the ability to refer to an existing string in a non-owning way.

An instance of the string_view class can be thought of as a "view" into an existing character buffer. Specifically, a string_view consists of only a pointer and a length, identifying a section of character data that is not owned by the string_view and cannot be modified by the view. Consequently, making a copy of a string_view is a shallow operation: no string data is copied.

string_view has implicit conversion constructors from both const char* and const string&, and since string_view doesn’t copy, there is no O(n) memory penalty for making a hidden copy. In the case where a const string& is passed, the constructor runs in O(1) time. In the case where a const char* is passed, the constructor invokes a strlen() automatically (or you can use the two-parameter string_view constructor).

string_view's interface recalls string's. For example:

string_view sv = "this is a string view";
sv = sv.substr(0, 4); // sv is "this"

The snippet above does not allocate data, instead it just returns a new string_view.

Many operation can be performed just by narrowing the view on the original string. Here is how we can left-trim:

string_view sv = "    trim me";
sv.remove_prefix(std::min(sv.find_first_not_of(" "), sv.size())); // sv is "trim me"

Remember that string_view is not necessarily NUL-terminated.

Continue reading:

Hands on!

Jane, our Chief Marketing Officer, has just came back from CppCon where she heard about std::string_view during a coffee break. She is keen on selling to our investors that Gugol is already using such C++17 feature into our codebase.

All things considered, this request is not so silly. The service's interface seems a bit inconsistent: it takes const char* but in Stats that takes const string&. std::string_view could be really a good fit to improve the interface and also to avoid copying...

Work on MicroUrlService and uniform the parameters of its public functions. Decomment some lines in StringViewTest.cpp and make it work.

Replace const char* with std::string_view on our service's interface

This exercise is just to let you familiarize with std::string_view, in general this guideline is insightful:

Guideline

If your API only needs to reference the string data during a single call, and doesn’t need to modify the data, accepting a string_view is sufficient. If you need to reference the data later or need to modify the data, you can explicitly convert to a C++ string object.

Adding string_view into an existing codebase is not always the right answer: changing parameters to pass by string_view can be inefficient if those are then passed to a function requiring a string or a NUL-terminated const char*. It is best to adopt string_view starting at the utility code and working upward, or with complete consistency when starting a new project.

Do you really give up? :(

Using std::string_view is as easy as using std::string. For example:

std::string MicroUrlService::ClickUrl(std::string_view microUrl)
{
	auto secret = microUrl.substr(microUrl.find_last_of('/') + 1);
	auto& url = m_idToUrl[Ext::Shortener::shortURLtoID(secret.data())];
	url.Clicks++;
	return url.OriginalUrl;
}

Remember the simple idiom to construct a std::string from std::string_view:

std::string_view strView = ...;
std::string str {strView.data(), strView.size()};

It's very common to have such utility somewhere in our codebases:

std::string to_string(std::string_view sv)
{
    return {sv.data(), sv.size()};
}

Bonus Track: avoid temporary strings in map lookups

IT is worried that our service accepts too many requests per second and they decided to implement a very simple load balancing strategy to split the work among several service instances. You know it cannot scale but you decide to help optimize it a bit, in the meantime some people of your team will develop a better strategy.

The load balancing strategy is very naive. Basically, the job is sent to a certain instance of the service depending on the first letter of the url. Since IT people regularly attend Coding Gym, they know this lookup can be implemented very easily with std::map. The only problem is that their function takes std::string_view, triggering a conversion to std::string for every lookup.

Can you help them avoid such useless conversion? Accommodate LoadBalancer here below:

Lookup map of strings by using string_view
Solution

C++14 introduced transparent comparators to perform heterogeneous lookup on associative containers using keys that are not necessarily the same as the associative container key type.

For example:

struct Book
{
    string title;
    string author;
    int id;
};

struct BookComparator
{
    bool operator()(const Book &x, const Book &y) const
    {
        return x.title < y.title; // we just use title
    }
};

set<Book> library = {...};

library.find({"title", "", ""});

It would be better to pass only the title, wouldn't it?

In C++14 we can turn BookComparator into a transparent comparator just by declaring a type is_transparent:

struct BookComparator
{
    typedef void is_transparent;

    bool operator()(const Book &x, const Book &y) const
    {
        return std::tie(x.title, x.author) < std::tie(y.title, y.author); // std::tie idiom
    }

    bool operator()(const Book &x, string_view title) const
    {
        return x.title < title;
    }
    
    bool operator()(string_view title, const Book &y) const
    {
        return title < y.title;
    }
    
    // other comparisons, as needed
};

library.find("title");

In the C++ library, std::less<void> or simply std::less<> (from C++14) is special: it is a specialization of std::less with parameter and return type deduced and it has is_transparent declared. This comparator automatically enables operator< comparisons among different types, when supported.

In the exercise above, std::less<> can be used to lookup an instance of std::string_view without incurring in creating a temporary std::string:

//...
    
std::string_view MicroUrlServiceIpFor(std::string_view s)
{
	auto it = PrefixToIp.lower_bound(s); // now takes string_view directly
	return std::prev(it)->second;
}
private:
                           // look here ---v
std::map<std::string, std::string, std::less<>> PrefixToIp;

Continue Reading:

is_transparent: How to search a C++ set with another type than its key

Open Source Your Knowledge: become a Contributor and help others learn. Create New Content