[cxx-abi-dev] Mangling of string literals versus variadic templates
David Vandevoorde
daveed at edg.com
Tue Dec 17 19:10:50 UTC 2013
On Dec 16, 2013, at 8:10 PM, Richard Smith <richardsmith at google.com> wrote:
> Hi,
>
> Consider:
>
> void g(...);
> template<int...N> inline const char *f() { g("foo" + N ...); return "bar"; }
> const char *p = f();
> const char *q = f<0>();
> const char *r = f<0, 1>();
>
> In f<>, what is the mangling of the "bar" string literal? Is this the first string literal or the second?
I think it has to be the second.
> The ABI document says "In all cases the numbering order is strictly lexical order based on the original token sequence", but it's not obvious what that would mean for a template instantiation that discards tokens, as this one does.
Why would it mean something different if you discard tokens? The internal representation probably still knows there was a string literal involved and could associate a discriminating sequence number with it, no?
> Now consider:
>
> template<int...N> inline const char *h() { g([]{return "foo";}() + N ...); return "bar"; }
> const char *p = h();
> const char *q = h<0>();
> const char *r = h<0, 1>();
>
> What happens here? Does the string literal inside the lambda get a number in the context of the outer function as well as a number within the lambda? It appears within the original token sequence...
I agree the spec misses this, but I think numbering should not include literals (or other entities) that are in the body of a nested function. I.e., "bar" would be the first string literal in all instances of h.
> EDG is the only vendor I can find that provides manglings for string literals at all, and its results here are surprising. It provides these manglings for "bar":
>
> f<>: _ZZ1fIJEEPKcvEs_0
> f<0>: _ZZ1fIJLi0EEEPKcvEs_0
> f<0, 1>: _ZZ1fIJLi0ELi1EEEPKcvEs_0
>
> These suggest that EDG includes "foo" in the numbering, even though it is not actually part of f<>. But then:
>
> h<>: _ZZ1hIJEEPKcvEs_0
> h<0>: _ZZ1hIJLi0EEEPKcvEs
> h<0, 1>: _ZZ1hIJLi0ELi1EEEPKcvEs
>
> These seem very surprising. "foo" is included in the numbering *only* in the case where it doesn't actually appear in the instantiated function body.
I suspect that's just a bug.
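
For reference, the suffixes in the manglings quoted above follow the ABI's <discriminator> grammar: the first entity gets no discriminator, the second gets _0, and numbers of 10 or more are written __<number>_. A minimal sketch of that encoding follows. The helper names string_literal_suffix and mangle_local_string are hypothetical, chosen for illustration only, and are not taken from any compiler.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: suffix for the Nth string literal (0-based) in a
// function body. The first literal is plain "s", the second "s_0", and
// discriminator numbers >= 10 use the "__<number>_" form.
std::string string_literal_suffix(std::size_t index) {
    std::string out = "s";
    if (index == 0) return out;            // first occurrence: no discriminator
    std::size_t n = index - 1;             // discriminators start at 0 for the second
    if (n < 10) return out + "_" + std::to_string(n);
    return out + "__" + std::to_string(n) + "_";
}

// Hypothetical helper: wraps a function encoding (the part after "_Z")
// into the local name of a string literal: _Z Z <encoding> E s [<discriminator>].
std::string mangle_local_string(const std::string& function_encoding,
                                std::size_t index) {
    return "_ZZ" + function_encoding + "E" + string_literal_suffix(index);
}
```

With these helpers, mangle_local_string("1fIJEEPKcv", 1) gives _ZZ1fIJEEPKcvEs_0 (a second literal), and mangle_local_string("1hIJEEPKcv", 0) gives _ZZ1hIJEEPKcvEs (a first literal).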
>
>
> Suggestion: change in 5.1.6:
>
> "In all cases the numbering order is strictly lexical order based on the original token sequence<ins>, excluding any tokens that are part of the body of a nested entity</ins>. All entities occurring in that sequence are to be numbered, even if subsequent optimization <ins>or (in the case of a string literal) expansion of an empty parameter pack</ins> makes some of them unnecessary."
That sounds good.
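
The proposed rule can be sketched as a single pass over the original token sequence that skips the bodies of nested entities. This is only an illustration under simplifying assumptions: the Tok type, the TokKind markers for nested bodies, and numbered_literals are all hypothetical and far coarser than a real front end's representation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, highly simplified token model.
enum class TokKind { StringLiteral, NestedBodyBegin, NestedBodyEnd, Other };

struct Tok {
    TokKind kind;
    std::string text;
};

// Returns the string literals that are numbered in the context of the
// outer function, in lexical order. Literals inside a nested body (for
// example a lambda body) are skipped; pack expansions are not consulted,
// so a literal is numbered even if an empty pack later discards it.
std::vector<std::string> numbered_literals(const std::vector<Tok>& body) {
    std::vector<std::string> result;
    std::size_t depth = 0;  // nesting depth of nested-entity bodies
    for (const Tok& t : body) {
        switch (t.kind) {
        case TokKind::NestedBodyBegin: ++depth; break;
        case TokKind::NestedBodyEnd:   if (depth > 0) --depth; break;
        case TokKind::StringLiteral:
            if (depth == 0) result.push_back(t.text);
            break;
        case TokKind::Other: break;
        }
    }
    return result;
}
```

Applied to the body of f, this numbers "foo" and then "bar", so "bar" is the second literal in every instantiation. Applied to the body of h, "foo" sits inside the lambda body and is skipped, so "bar" is the first.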
>
> This would make EDG correct, except for the h<> case, where the mangling would be _ZZ1hIJEEPKcvEs
>
> Another related issue is with user-defined literals. If 123_x appears in an inline function, and implicitly calls operator""_x("123"), the implicit string literal should (presumably) be assigned a mangling number. If it calls operator""_x(123ULL), a mangling number should presumably not be assigned.
Ah yes, nice catch.
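
To illustrate the two cases, here is a small sketch with a raw literal operator, which receives the spelling of the literal as a string, and a cooked one, which receives the value. The suffixes _raw and _cooked are made up for this example so that both forms can coexist; they stand in for the _x above.

```cpp
#include <cstddef>
#include <cstring>

// Raw literal operator: 123_raw is treated as operator""_raw("123"),
// so an implicit string literal is involved.
std::size_t operator""_raw(const char* spelling) {
    return std::strlen(spelling);
}

// Cooked literal operator: 123_cooked is treated as
// operator""_cooked(123ULL); no string literal is involved.
unsigned long long operator""_cooked(unsigned long long value) {
    return value;
}
```

Here 123_raw evaluates to 3 (the length of "123") and 123_cooked evaluates to 123.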
>
> Suggestion: insert after previously-quoted text from 5.1.6:
>
> If a user-defined-literal implicitly passes a string literal to a literal operator, the user-defined-literal token is numbered as if it were a string literal token.
Hmmm, would it be better to number all user-defined literals without having to worry about how they'll be transformed?
Daveed