The magic of non-constant constant-expressions in C++
Disclaimer: The technique described in this post is primarily meant as "just another clever hack, diving into the dark corners of C++". I do not recommend anyone to incorporate the contents of this post into production code without considering its caveats (further described in upcoming posts).
The disclaimer was added after reading comments about this post on other sites, where the purpose of the post has been misunderstood. I do not endorse usage of this technique outside your bedroom (without considering the consequences of such usage).
Introduction
Implementing f() to make the following snippet compile without the
static_assert being fired looks impossible, doesn't it?
// <insert solution here>
int main () {
constexpr int a = f ();
constexpr int b = f ();
static_assert (a != b, "fail");
}
We all "know" that an expression that is usable in a constant-expression can neither depend on, nor change, the global state. This lack of dependency must inherently mean that it will yield the same value upon each invocation having the same set of arguments... right?
Up until about a week ago I "knew" this to be true, and I honestly thought the above was not possible to implement without triggering undefined-behavior (ie. writing a program that the ISO C++ Standard would consider ill-formed).
As it turns out, I was very wrong.
Note: The contents of this posts addresses C++11, some wording in the C++14 ISO Standard has changed, but the wording of the relevant sections are effectively the same (ie. the technique described in this post is also legal C++14).
Why is this an interesting problem?
By solving the problem described in the previous section, we will successfully add state to the world of translation (meaning that we can write stateful meta-programs).
This might seem like a small and innocent concept at first glance, but the technique described in this post allows us to solve a number of problems that has previously required either very complex code, or that have been entirely unsolvable.
Where would we end up if we had a solution?
A solution to the previously described problem would make us reconsider if it is really impossible to do the "impossible", it would grant us increased power during the phase of translation, and it could - potentially - give us the ability to express what follows.
A compile-time counter
using C1 = ...;
int constexpr a = C1::next (); // = 1
int constexpr b = C1::next (); // = 2
int constexpr c = C1::next (); // = 3
A compile-time container of meta-types
using LX = ...;
LX::push<void, void, void, void> ();
LX::set<0, class Hello> ();
LX::set<2, class World> ();
LX::pop ();
LX::value<> x; // type_list<class Hello, void, class World>
Other ideas
- How to implement a trait to check if class has been instantiated?
- How to get address of called function after overload resolution?
- How can I add reflection to a C++ application?
Prerequisites
Note: The contents of this section is meant to be used as a detailed introduction to the technical aspects related to the solution at the end of this post, as such it could potentially be skipped by more experienced (or eager) readers.
If you are only here for the cake, please, skip to the solution.
Note: It is recommended to at least skim through the different parts if you are interested in knowing exactly how, and why, the solution works (and is legal C++).
The friend keyword
There are more to friends than the simple act of making some entity have
access to the protected and/or private parts of some other entity. Please
consider the following (very trivial) example:
class A;
void touch (A&);
class A {
friend void touch (A&);
int member;
};
void touch (A& ref) {
ref.member = 123; // ok, `void touch(A&)` is a friend of `class A`
}
int main () {
A a; a.member = 123; // ill-formed, `A::member` is private
A b; touch (b);
}
What is important to note is that we first declare void touch (A&) in the
global scope, after which we say that it is a friend of A (through the
friend-declaration), and lastly we redeclare and define it in the global
scope.
We could, just as well, have provided a declaration and definition of
void touch (A&) directly inside class A - leaving out the other steps
completely - as in the following example:
class A {
friend void touch (A& ref) {
ref.member = 123;
}
int member;
};
int main () {
A b; touch (b); // ok
}
It is very important to note that the two approaches are not directly equivalent (even though it may seem like it in this particular example).
In the latter snippet, void touch(A&) will be injected into the innermost
namespace scope surrounding class A, but the function will only be accessible
through
Argument-Dependent Lookup
(ADL).
class A {
public:
A (int);
friend void touch (A) { ... }
};
int main () {
A b = 0; touch (b); // ok, `void touch (A)` found through ADL
touch (0); // ill-formed
}
References:
[class.friend]p1,6-7
The name of a friend
The below section of the standard further proves what was mentioned earlier, but I would like you to focus on what is written between the lines (because that is equally important):
7.3.1.2/3Namespace member definitions[namespace.memdef]p3Every name first declared in a namespace is a member of that namespace. If a
frienddeclaration in a non-local class first declares a class, function, class template or function template the friend is a member of the innermost enclosing namespace. Thefrienddeclaration does not by itself make the name visible to unqualified lookup (3.4.1) or qualified lookup (3.4.3).
Notice that nowhere is it stated that the name introduced by a
friend-declaration must have any particular relation to name of the class it
is declared and/or defined in, or any particular relation to the class at all
(for that matter).
#include <iostream>
class A {
public:
A (int) { }
friend void f (A);
};
void g (A);
class B {
friend void f (A) {
std::cout << "hello world!" << std::endl;
}
class C {
friend void g (A) {
std::cout << "!dlrow olleh" << std::endl;
}
};
};
int main () {
A a (0); f (a); g (1);
}
Note: It should be fairly trivial to reason about the snippet above, but if you are uncertain about the outcome I encourage you to whip out your compiler of choice and play with the program.
References:
[namespace.memdef]p3
The rules of constant-expressions
There are a lot of rules associated with constexpr, and I have been meaning to
write a rather detailed introduction to the concept, but in short:
- a variable marked
constexprmust be initialized by a constant-expression, and; - a
constexprfunction can only invoke other functions markedconstexpr, and every expression involved (this includes creating objects) must be such that can appear in a constant-expression, and; - a constant-expression;
- can only include objects having literal type, and;
- can only access objects whose lifetime either began within the evaluation of said constant-expression, or that are previously declared in a way which makes them constant-expression friendly, and;
- must not contain any expression that has undefined-behavior, and;
- must not invoke an undefined
constexprfunction or constructor.
Note: The above list is by no means complete, but the rules stated can make it easy to reason about
constexprand constant-expressions in most contexts.
References:
[expr.const]p2-3
Instead of diving into a heavy session explaining every aspect of
constant-expressions, we will focus on the rule that (explicitly) states that
we cannot invoke an undefined constexpr function in a constant-expression.
constexpr int f ();
void indirection ();
int main () {
constexpr int n = f (); // ill-formed, `int f ()` is not yet defined
indirection ();
}
constexpr int f () {
return 0;
}
void indirection () {
constexpr int n = f (); // ok
}
Note: In order to understand the example you must grasp the concept of
int f ()being undefined inmain, but having a definition insideindirection(since by then the definition is available).
How to check if an expression is a constant-expression
There are a number of ways to check whether a certain expression is allowed to be used where a constant-expression is required, and some are more complicated than others.
An experienced C++ developer might directly see that proper use of Substitution Failure Is Not An Error (SFINAE) could grant us these powers, and he/she would be correct; but the powers granted by SFINAE comes with the substantial cost of writing some rather complex piece of code.
A far easier approach is to use the noexcept-operator, which shall yield
true if certain conditions are met, such as if the expression checked is a
constant-expression:
constexpr int f ();
void indirection ();
int main () {
constexpr bool b = noexcept (f()); // false, `f()` is not a constant-expression
} // since it lacks a definition
constexpr int f () {
return 0;
}
void indirection () {
constexpr bool b = noexcept (f()); // true
}
Note:
clangcurrently has a bug wherenoexceptdoes not yieldtrueeven though the expression checked is a constant-expression. A workaround is available in the appendix of this post.
References:
[expr.unary.noexcept]p1-3
The semantics of template instantiation
If there is one section of the ISO C++ Standard that is often considered to be a real challenge, it is the wording related to templates. If I were to explain every aspect of template instantiation, this post would quickly grow out of proportion, and you would be stuck reading for at least another set of hours.
Since I doubt that is what any of us want, I will instead try to explain the fundamental parts of template instantiation that is required to understand the solution that appears further down in this post.
Note: Please note that the contents of this section is by no means meant to be a complete reference for template instantiation. There are exceptions to the rules mentioned, as well as things that I have intentionally left out because they are too far beyond the scope of this post.
The Fundamental
What follows is a very condensed list of the most pertinent (in relation to the contents of this post) parts of template instantiation:
A template specialization will be implicitly instantiated if, and only if, it has not yet been explicitly specialized or explicitly instantiated.
A function template specialization will be implicitly instantiated when it is referenced in a context that would require its definition.
A class template specialization will be implicitly instantiated when it is referenced in a context that requires a completely-defined object type, or when the completeness of the class type affects the semantics of the program.
Instantiation of a class template specialization will implicitly instantiate the declarations of its members, but not their definition, unless the member is either;
- a static data-member, in which case no instantiation takes place, or;
- an unscoped enumeration or an anonymous union, in which case their declaration and definition will be instantiated.
The definition of a member of a class template specialization is implicitly instantiated when it is required, but not sooner.
If a class template specialization contains
friend-declarations, the names of itsfriendsare treated as if an explicit specialization had been declared at the point of instantiation.
References:
[temp.friend]p1,[temp.spec]p1-6,[temp.inst]p1-5,8-11
Point of Instantiation
Whenever a template specialization is referenced in a context that requires instantiation, that context gives birth to a "point of instantiation" (which effectively denotes a location where the compiler is allowed to generate code for the referenced template specialization).
If a template specialization X is referenced in a context that depends on a
template-parameter of some surrounding template Y, the given point of
instantation depends on the point of instantation of Y.
- If
Xis a function template specialization, the point of instantiation is that ofY. - If
Xis a class template specialization, the point of instantiation is immediately before the point of instantiation ofY.
Otherwise, the given point of instantiation is tied to the location of the
namespace scope declaration/definition (D) which contains the statement
referring to X.
- If
Xis a function template specialization, the point of instantiation is immediately afterD. - If
Xis a class template specialization, the point of instantiation is immediately beforeD.
References:
[temp.point]p1-7
Generation of function template specialization
Function template specializations can have any number of points of instantiation, but the expressions within the specialization must yield the same meaning no matter at what point the compiler choose to generate it, otherwise the program is ill-formed (no diagnostic required).
For added simplicity, the ISO C++ Standard considers any instantiated function template specialization to have an additional point of instantiation at the end of the translation unit (TU).
namespace N {
struct X { /* intentionally left blank */ };
void func (X, int);
}
template<class T>
void call_func (T val) {
func (val, 3.14f);
}
int main () {
call_func (N::X {});
}
namespace N {
float func (X, float);
}
We have two points of instantiation of void call_func<N::X> (N::X) in the
above example. The first is right after the definition of main (because of the
function call therein), and the second is at the end of the TU.
The snippet is ill-formed since the behavior of call_func<N::X> changes
depending on where the compiler choose to generate its definition:
If it happens right after the definition of
main,void N::func (N::X, int)will be called (since there is no other overload at this point).If it happens at the end of the TU,
float N::func (N::X, float)is a better match than the previous overload, and as such it will be called.
References:
[temp.point]p1-7
Generation of class template specialization
Class template specializations has at most one point of instantiation, which effectively means that the compiler must generate the specialization the first time it is instantiated.
namespace N {
struct X { /* intentionally left blank */ };
void func (X, int);
}
template<class T> struct A { using type = decltype (func (T{}, 3.14f)); };
template<class T> struct B { using type = decltype (func (T{}, 3.14f)); };
int main () {
A<N::X> a;
}
namespace N {
float func (X, float);
}
void g () {
A<N::X>::type a; // ill-formed, type would be `void`
B<N::X>::type b; // ok, type is `float`
}
Note:
A<N::X>point of instantiation is immediately beforemain, whereasB<N::X>does not have its point of instantiation untilgis reached.
References:
[temp.point]p3
Putting it all together
The rules associated with friend-declarations in class templates means that
the upcoming snippet is semantically equivalent to an implementation that would
declare an explicit specialization matching A<short>, and another one
matching A<float>, at their respective point of instantiation.
constexpr int func (short);
constexpr int func (float);
template<class T>
struct A {
friend constexpr int func (T) { return 0; }
};
template<class T>
A<T> indirection () {
return {};
}
// (1)
int main () {
indirection<short> (); // (2)
indirection<float> (); // (3)
}
Note: When an evaluated expression contains a call to a specialization of
indirection<T>, the return-type of the function template specialization must be completely defined; which inherently causes an implicit instantiation ofA<T>inmain.
Note: Please note that the functions
func (short)andfunc (float)are undefined untilA<short>andA<float>have been instantiated, respectively.
Note: It is important to note that a point of instantation
(1)denotes in what context the compiler may generate the relevant code. What triggers the instantiation is where the change actually takes effect ((2)and(3)).
Solution
Hopefully the prerequisites have touched on every aspect associated with the solution presented further down in this section.
As a quick reminder, you should have knowledge about the following topics in order to fully understand how and why the solution works:
- The
friendkeyword, and; - The rules of constant-expressions, and;
- The rules of template instantiation.
The Implementation
constexpr int flag (int);
template<class Tag>
struct writer {
friend constexpr int flag (Tag) {
return 0;
}
};
template<bool B, class Tag = int>
struct dependent_writer : writer<Tag> { };
template<
bool B = noexcept (flag (0)),
int = sizeof (dependent_writer<B>)
>
constexpr int f () {
return B;
}
int main () {
constexpr int a = f ();
constexpr int b = f ();
static_assert (a != b, "fail");
}
Note:
clangincorrectly shows the wrong behavior, a workaround is available in the appendix.
But, how does it work?
Having read the prerequisites, the solution presented in the previous section should be quite easy to understand, but even then; a detailed explanation of the different parts might be of interest.
This section will walk through the source code, step by step, and provide a short description and rationale for each segment.
The "variable"
A constexpr function can be in either one of two states; either it is usable
in a constant-expression, or it isn't - if it lacks a definition it
automatically falls in the latter category - there is no other state (unless we
consider undefined behavior).
Normally, constexpr functions should be treated exactly as what they are;
functions, but we can also think of them as individual handles to "variables"
having a type similar to bool, where each "variable" can have one of two
values; usable or not-usable.
constexpr int flag (int);
In our program it helps if you consider flag to be just that; a handle (not a
function). The reason is that we will never actually call flag in an evaluated
context, we are only interested in its current state.
The Writer
writer is a class template which, upon instantiation, will create a
definition for a function in its surrounding namespace (having the signature
int flag (Tag), where Tag is a template-parameter).
template<class Tag>
struct writer {
friend constexpr int flag (Tag) {
return 0;
}
};
If we, once again, think of constexpr functions as handles to some variable,
we can treat an instantiation of writer<T> as an unconditional write of the
value usable to the variable behind the function in the
friend-declaration.
The Proxy
template<bool B, class Tag = int>
struct dependent_writer : writer<Tag> { };
I would not be surprised if you think dependent_writer looks like a rather
pointless indirection; why not directly instantiate writer<Tag> where we want
to use it, instead of going through dependent_writer?
- Instantiation of
writer<int>must depend on something to prevent immediate instantiation, and; dependent_writeris used in a context where a value of typeboolcan be used as dependency.
Note: When writing the implementation I considered many different alternatives, trying to find an implementation which was easy to grasp at first glance. I honestly hope that this step is not too confusing.
The Magic
template<
bool B = noexcept (flag (0)), // (1)
int = sizeof (dependent_writer<B>) // (2)
>
constexpr int f () {
return B;
}
The above might look a little weird, but it's really quite simple;
(1)will setB = trueifflag(0)is a constant-expression, otherwiseB = false, and;(2)implicitly instantiatesdependent_writer<B>(sizeofrequires a completely-defined type).
The behavior can be expressed with the following pseudo-code:
IF [ `int flag (int)` has not yet been defined ]:
SET `B` = `false`
INSTANTIATE `dependent_writer<false>`
RETURN `false`
ELSE:
SET `B` = `true`
INSTANTIATE `dependent_writer<true>`
RETURN `true`
Conclusion
That people keep discovering crazy ways to do new things in C++ (that were previously considered impossible) is both amazing and horrible. — Mara Bos
This post has explained the basic idea behind adding state to constant-expressions, in other words; the common theory (often expressed by myself) of constant-expressions being "constant" has been shattered.
When writing this post I could not help thinking about the history of template meta-programming and how crazy it is that a language is capable of doing more than was perhaps ever intended.
What's next?
I have written a stateful meta-templating library called smeta that will be published, explained, and discussed, in upcoming posts - the topics that will be covered include:
- How to implement a compile-time counter
- How to implement a compile-time container for meta-types
- How to inspect and govern overload resolution
- How to add reflection to C++
Note: Entities in the above list will be made clickable as soon as the relevant post has been published.
Appendix
Workaround for clang
Because of Bug 15481 (and
related) bug(s) in clang, the solution posted earlier will in-correctly yield
the wrong behavior. Below is an alternative implementation of the solution,
written specifically for clang (though still legal C++).
namespace detail {
struct A {
constexpr A () { }
friend constexpr int adl_flag (A);
};
template<class Tag>
struct writer {
friend constexpr int adl_flag (Tag) {
return 0;
}
};
}
template<class Tag, int = adl_flag (Tag {})>
constexpr bool is_flag_usable (int) {
return true;
}
template<class Tag>
constexpr bool is_flag_usable (...) {
return false;
}
template<bool B, class Tag = detail::A>
struct dependent_writer : detail::writer<Tag> { };
template<
class Tag = detail::A,
bool B = is_flag_usable<Tag> (0),
int = sizeof (dependent_writer<B>)
>
constexpr int f () {
return B;
}
int main () {
constexpr int a = f ();
constexpr int b = f ();
static_assert (a != b, "fail");
}
Note: I'm currently writing the relevant bug reports, which will effectively show why we need the above workaround for
clang. The above will be updated with links as soon as they have been filed.
Additional Credit
This posts would not have happend if it wasn't for a number of people, but a special thanks goes out to:
-
- Helping me with the wording of this post.
- Providing me with funds to buy contact lenses so that I wasn't (literally) writing in the blind.
- Not getting fed up with me constantly bugging her for opinions.
-
- Proof-reading, as well as coming up with interesting thoughts on how to clearify the contents of this post.
-
- Unsuccessfully attempting to prove that the technique described in this post is ill-formed (by throwing sections of the ISO Standard in my face), and jftr: I would do the same for him.